So, last time we saw the statistical model for linear regression, which is of this type: the values x_i are the control variable, also called the input or independent variable, and given a value of x_i, the response y_i takes the value alpha + beta x_i plus an error term. The relationship between y and x is approximately linear, given by alpha + beta x_i, but there is some error because of random causes. So, if we plot paired data (x_i, y_i), we may be able to observe some sort of linear relationship, and the first task of the linear regression model is to calibrate this regression equation, that is, to determine the values of alpha and beta. The difficulty is that you do not know the error terms: the e_i are assumed to have mean 0 and variance sigma squared, and you do not know that variance either. What you observe are the values of the x_i and y_i. So, you can take it that, given x_i, y_i is as per this model, and we want to find the values of alpha and beta which best fit the equation. What we do is take the criterion of minimizing the sum of squared errors: given x_i, the quantity y_i - (alpha + beta x_i) is the error term, since e_i has mean 0. This is the pictorial representation: there is an underlying line alpha + beta x, and superimposed on it are the error terms; alpha is the y-intercept in this diagram and beta is the slope of the line, and those are what have to be determined. If the relationship between y and x were exactly linear, all values y_i would lie on the straight line alpha + beta x_i; but because of the error term, for a given x_i the value y_i is a normally distributed random variable with mean alpha + beta x_i and variance sigma squared, so it could be a little more or a little less than alpha + beta x_i. That is what we actually observe, and given all those observed values we have to fit a linear expression. That is one way of looking at it, more in the spirit of curve fitting. There is also a statistical basis, on the principle of maximum likelihood estimation, which is given in Ross's book and which you can read. In the case of normally distributed errors, the two estimates of the so-called regression line turn out to be the same: the least squares criterion and the maximum likelihood estimate of the line alpha + beta x coincide. So, we have derived estimates a and b of alpha and beta using the principle of least squared errors. The way that goes is: the observed value is y_i, and the predicted value as per the fitted equation is a + b x_i, so the difference between observed and predicted values is y_i - a - b x_i. We take the sum of the squares of all these errors, because we want to penalize both positive and negative deviations. The values of a and b obtained by minimizing this expression as a function of a and b are called the least squares estimators of alpha and beta.
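To make the model concrete, here is a minimal sketch in Python; the particular values of alpha_true, beta_true and sigma are made up for illustration, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# True (unknown) parameters of the model y_i = alpha + beta*x_i + e_i
alpha_true, beta_true, sigma = 2.0, 0.5, 1.0

# Fixed inputs x_i and normally distributed errors e_i ~ N(0, sigma^2)
x = np.linspace(0.0, 10.0, 50)
e = rng.normal(0.0, sigma, size=x.size)
y = alpha_true + beta_true * x + e      # the observed responses

def sse(a, b):
    """Sum of squared errors for candidate estimates (a, b)."""
    return np.sum((y - a - b * x) ** 2)

# Even at the true parameters the SSE is not zero, because of the e_i
print(sse(alpha_true, beta_true))
```

Minimizing sse over a and b is exactly the least squares problem described above.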
So, in this sum of squared errors, the x_i values and the corresponding y_i values are known, which gives you a quadratic expression in a and b that can be minimized. If you set the partial derivatives with respect to a and b equal to 0, you get two linear equations in a and b; those are called the normal equations, 1 and 2 in this slide. They can be solved, and you get these expressions. Now, the point is that for given values of x_i, since the error term e_i is a normal random variable with mean 0 and variance sigma squared, it is b, which involves the y terms, that inherits the randomness. In the expression for b, the x_i values are all known: each x_i in the summation is known; x-bar, the sum of the x_i divided by n for n observations, is known; and the denominator, the sum of (x_i - x-bar) squared, is also known. It is like the variance of the x_i, treated not as random variables but just as a set of observed values, so it is like the standard deviation squared. But for given values of x_i, the y_i are all random variables: each y_i equals alpha + beta x_i + e_i. So, b is a sum of normal random variables, and hence b is itself a random variable; it has got some mean, variance and distribution. It can be shown that the expected value of a is actually alpha, the parameter we are trying to estimate, and the expected value of b is actually beta. So, in the expressions for a and b, if we treat the x_i as known and the y_i as random variables, then a and b are random variables with expectations equal to alpha and beta. You can treat the whole set of observations that you see as the outcome of an experiment. If the underlying model is y_i = alpha + beta x_i + e_i, then for a given x_i, y_i is a random variable because it has this e_i term, and that is what you observe. So, for a given set of values, the b you compute is an instance of the random variable. It is like throwing a die: before you throw it, the outcome is a random variable with values 1 to 6 with probability 1/6 each; after you throw it, the outcome is a known thing. Here, when you compute b as per this expression for given x_i and observed y_i, it is the realization of a random variable; but before you observe, y_i is a random function of x_i because of the error term, so it has a certain expectation, a certain variance, and so on. Now, suppose I think of doing a regression analysis of a large population of people, looking at height versus weight. This is clearly an example where these are two characteristics of a certain population; it would be difficult to argue that height causes weight or weight causes height, that would be stretching things a bit. Maybe there is some common genetic characteristic which causes both height and weight.
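Continuing the sketch above, the least squares formulas from the normal equations come out as follows; the variable names are mine, not the slide's.

```python
# Normal-equation solution:
#   b = sum((x_i - xbar) * (y_i - ybar)) / sum((x_i - xbar)^2)
#   a = ybar - b * xbar
xbar, ybar = x.mean(), y.mean()
b = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
a = ybar - b * xbar
print(a, b)   # close to alpha_true and beta_true, but not equal
```

If you regenerate the data with a different seed, a and b change a little each time, which is exactly the point that they are realizations of random variables.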
So, that may be; but in any case we are able to observe two attributes of the population, and we are trying to see if there is any relationship between the height and weight of an individual. If we sample several individuals, we get several readings. Now, it is sort of commonsensical that the taller the person, the heavier he or she may be; but you have all seen tall thin people and short fat people, and also short thin people, tall fat people, and plenty of medium people. So, you will see a scatter; but if you look at the cloud of points on a two-dimensional graph of the two attributes, you can say that there is some sort of correlation. In fact, if you compute the correlation coefficient between height and weight, I believe it is approximately 0.4 or so. So it is positively correlated, which means you expect a taller person to be heavier; but it is not that if one person is twice as tall as another, then he or she is going to be twice as heavy. It is not an exact linear relationship; there is some correlation, but it is not exact, and in any case, for a given height you see a certain spread of weight. If you take any particular height, you see that there are some lighter people and some heavier people, a certain spread, because of the randomness inherent in the growth of living beings. So, we can do the following: plot the data so that height is in some unit, say centimetres, but the scale is such that one standard deviation of height takes the same length on the page as one standard deviation of weight. If I move one centimetre on this page, it corresponds to some number of standard deviations of the height variable, and the same on the weight axis. Then I plot a 45-degree line, the standard deviation line. There will be a certain mean height and a certain mean weight, and there will be people who are one standard deviation taller than average and people who are one standard deviation heavier than average. The line through the points (mean height + k sigma, mean weight + k sigma), for different values of k, will be a 45-degree line because of the scaling of the x-axis and the y-axis; that is what I have schematically represented here. Now, the question we want to answer is the following: given a certain height, what is the average weight of people of that height?
So, let us say, in centimetres, does anyone know their height? 170, OK. Suppose that is the average height of this population, and the standard deviation of this group is, say, 10 centimetres; we can compute the standard deviation. So, we plot this thing: the 45-degree line passes through 170 plus 10 centimetres on the x-axis, and on the y-axis the average weight, say 62 kilos, plus a standard deviation of, say, 4 kilos. Now the question is: if I take a height like 180 centimetres, what is the average weight of people who are 180 centimetres tall? There will be a certain spread. So, somewhere here, at the midpoint of this range, is 170, and this is 180, and I would have a certain range. What is the average weight of people in this range? Similarly, for people over here, what is the average in that range? What you will typically see is a cloud shaped like an American football, something like an ellipse. And what you would see is that if I take a band here and ask for the average weight of such people, then going by this 45-degree line is misleading: if I am below the mean height, this line gives a slight underestimate, and if I go to the right of the mean, that is, if I look at the average weight of people in a band over here, the line gives a slight overestimate. Remember, every person has a discrete value of height and weight; so if I ask for the average weight of people who are 180 centimetres tall, I round off and create a small vertical strip of people who are approximately 180 centimetres, say plus or minus 0.5 centimetres, and there will be some people in that band. What is the average weight of such people? If I plot that curve of band averages, then because of the nature of this data it will be something like this, and the regression line is actually a smoothed version of this line of band averages. So, if you look at this scatter-plot type of data, you can see what the regression line will be: if I see data like this, I suspect a linear relationship, meaning that as one variable increases, the other increases on average. I may have some outliers here and there, but on average this is the cloud of data. This is the 45-degree line; if I look at this strip, the average of these people is likely to be above the line, and if I look at this strip, the average is likely to be below it. So, the regression line is actually something like this, tilted away from the 45-degree line.
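A small simulation makes the band-average effect visible. The bivariate-normal parameters below are assumptions chosen to match the numbers used in the lecture (mean height 170 cm, sd 10 cm; mean weight 62 kg, sd 4 kg; correlation 0.4), not real population data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
r = 0.4                                    # assumed height-weight correlation
mu_h, sd_h = 170.0, 10.0                   # height in cm
mu_w, sd_w = 62.0, 4.0                     # weight in kg

# Sample heights and weights from a bivariate normal with correlation r
cov = [[sd_h**2, r * sd_h * sd_w],
       [r * sd_h * sd_w, sd_w**2]]
h, w = rng.multivariate_normal([mu_h, mu_w], cov, size=n).T

# Average weight in the strip of people about 180 cm tall (+/- 0.5 cm)
band = np.abs(h - 180.0) < 0.5
print(w[band].mean())            # observed band average, about 63.6 kg
print(mu_w + 1.0 * sd_w)         # SD line prediction: 66 kg (overestimate)
print(mu_w + r * 1.0 * sd_w)     # regression prediction: 63.6 kg
```

Since 180 cm is one standard deviation above the mean height, the 45-degree line predicts one standard deviation of weight above the mean, but the band average is only r times that far above, which is exactly what the regression line gives.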
So, if you look at a set of data and sketch scatter plots of related quantities, you will see that the regression line behaves like this. In the cases of correlation 0, correlation 1 and correlation minus 1 you can see directly that this is so, but it holds for other values too. The actual statement is something like this; let me go back to this for a minute. This is the typical shape of linearly related data, the American-football-type cluster, which is what we typically see. If you plot the line through the points (x-bar + k sigma_x, y-bar + k sigma_y), where sigma_x and sigma_y are the standard deviations of the x and y variables, you call this the standard deviation line; if you scale the axes properly, it will be a 45-degree line, just for convenience. It passes through the centre of the data, the point (x-bar, y-bar), because that is k = 0. So, can we use this line to predict the average value of y given x? What you see is that when the correlation is positive, for x below the mean, the average value of y lies above the standard deviation line, and for x above the mean, the average value of y lies below the standard deviation line. So, this is what was observed. Now, there was a scientist, Francis Galton, who is responsible for the term regression. The English meaning of regression is to go back; so why is this term used in the context of curve fitting and statistical analysis of correlated data? Galton conducted a series of measurements of fathers' heights versus sons' heights; I think he took many hundreds of observations, and this is the summary of what he apparently concluded. The average height of the fathers was around 68 inches; the average height of the sons was around 69 inches. So, in one generation there was a small increase in height, because of better nutrition or whatever the conditions were in that society. The standard deviation for both was around 2.7 inches, so sigma_f, the standard deviation of the fathers' heights, equals sigma_s, that of the sons'; and the correlation coefficient was positive, 0.5, which means that on average taller fathers have taller sons, maybe because of genetics. But the key point concerned fathers who are k sigma away from the mean. The scatter-plot data for many hundreds of pairs was gathered, and in this cloud of data Galton saw some sort of linear relationship; or rather, he hypothesized that a linear relationship is a useful explanation of the connection between the two, because it is simple to state, simple to manipulate, and so on.
And so, this type of data shows that taller fathers on average have taller sons; but consider fathers who are k sigma above the mean, say one sigma above. The mean is 68 inches, so these are fathers who are approximately 70.7 inches, taller than the mean by one sigma. Their sons were not one sigma taller than the sons' mean; on average they were only r sigma taller than that mean. Similarly for shorter fathers: if a set of fathers was one standard deviation below the mean height for fathers, their sons were shorter than average, but relatively speaking not as short as the fathers; they were only r sigma below the mean of the sons' population. In this case the standard deviation is the same for both, because you are measuring height only, and the standard deviation did not change much from generation to generation; the average height changed a little, but the standard deviation did not. But even if two different things were measured, say height versus weight, the right way to think about it is this: for correlated values of any paired data, the regression equation shows that if the x variable is one standard deviation above the mean of the x values, the average of the corresponding y values will be only r standard deviations above the mean of the y values, where r is the coefficient of correlation. And this coefficient of correlation, as you know from earlier, is a number between minus 1 and 1; its absolute value is at most 1. So, what it says is that taller fathers have taller sons, but the sons are not as far above their mean as the fathers were above theirs. So Galton, not Dalton, Galton, termed this phenomenon regression to mediocrity. Sons of tall fathers were taller than average, but not as much above average as their fathers; sons of short fathers were shorter than average, but not as much below the mean as their fathers. They tended towards the average. So Galton called this going back to the mean: regression to mediocrity, or regression to the mean. He was an aristocrat, and he had some sociological type of explanation, that society pulls you back to mediocrity. Is the phenomenon clear? You can read the explanation in the book as well. Now, the other question, which we touched on briefly in the last class and which I want to explain here, is what to regress on what. We called x the independent variable and y the dependent variable. So, if we are talking about height and weight data, which is independent and which is dependent? Actually, both are probably dependent on some gene or environmental factor, on nutrition, or some combination of unknown things; we can make a guess, but right now we just want to look at the relationship between height and weight. We cannot say height causes weight or weight causes height; we can perfectly well regress either one on the other.
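With Galton's figures, the regression prediction is a one-line computation. A sketch, using only the numbers quoted above:

```python
# Galton's summary: fathers' mean 68 in, sons' mean 69 in,
# both standard deviations 2.7 in, correlation r = 0.5.
mu_f, mu_s, sd, r = 68.0, 69.0, 2.7, 0.5

father = 70.7                    # one sd above the fathers' mean
k = (father - mu_f) / sd         # = 1.0 standard deviation
son_avg = mu_s + r * k * sd      # sons average only r*k sd above their mean
print(son_avg)                   # 70.35, not 69 + 2.7 = 71.7
```

So the sons of one-sigma-tall fathers are, on average, only half a sigma above the sons' mean: regression to the mean.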
Even in the example of linking the son's height to the father's height, maybe you can say the son's height is caused by the father's height, because of genetics; but you can regress the other way too, and there you certainly cannot argue that the father's height is caused by the son's height, that would be stretching it a little. So, remember that regression does not imply causality. If one of the variables does in fact cause the other, then we might be tempted to try that relationship first, taking the causal variable as the independent variable and the caused variable as the dependent variable; but otherwise you can regress anything on anything. The other reason we may want to regress one variable on the other is that we want to predict one of them. If I want to predict the average weight of a group of people based on height measurements, then I take weight as the dependent variable y and height as the independent variable x. So, the one I want to predict I take as the dependent variable; but I can do regression either way, y on x or x on y. If we follow this argument, then with this as y and this as x, this is the SD line and this is the regression of y on x, which gives you the average values of y for a given x; you can do it the other way too, and that will be this other sort of line. That gives the other type of prediction: supposing for a given y I want to find the value of x, then you will see that this second line is a better explanation of the midpoint of the horizontal strip than the standard deviation line. So, you can take any typical exercise in Ross's book, or anywhere else, and actually try this: regress y on x and x on y. A suggested exercise is exercise 7 in chapter 9, the chapter on regression in Ross's book. On any set of regression data you can construct a regression line for y given x, which is called regressing y on x, or the other way round. It is a paired set of data; in this example, as in several others, the data are clearly two related attributes from the same population, and there is no implication of causality. So, you can try both regressions, and you will get coefficients alpha and beta in both cases; you can think about it and derive a simple relationship between the coefficients you get when you regress y on x and when you regress x on y, something like a reciprocal of each other. You can actually work it out, as in the sketch below. Now, let me complete the technical part of this. These are the sections which I would like you to read. The basic regression model is this: for a given x_i, y_i is a normally distributed random variable with unknown parameters alpha, beta and sigma squared. So, now you can think about it in terms of hypothesis testing, you can give confidence intervals, you can go back to the whole statistical framework.
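Here is that check numerically; the product of the two slopes turns out to be the squared correlation coefficient. This continues the synthetic x, y, xbar, ybar from the earlier sketches.

```python
# Slope of the regression of y on x, and of x on y
b_yx = np.sum((x - xbar) * (y - ybar)) / np.sum((x - xbar) ** 2)
b_xy = np.sum((x - xbar) * (y - ybar)) / np.sum((y - ybar) ** 2)

r = np.corrcoef(x, y)[0, 1]      # sample correlation coefficient
print(b_yx * b_xy, r ** 2)       # equal: b_yx * b_xy = r^2
```

So the two slopes are reciprocals of each other exactly when r squared is 1, that is, when the data lie on a perfect straight line; otherwise their product falls short of 1 by the factor r squared.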
So, if you have a statistical model like this, then from the given data you can compute all the statistical quantities of interest. Sigma squared must be estimated from the given data, and alpha and beta are chosen to minimize the sum of squared errors. As for the distributions: b, the least squares estimator for which we have given a formula, is a linear combination of independent normal random variables, with expected value beta and variance equal to this expression; if you look at the variance of b, it is sigma squared divided by a quantity that depends only on the x values and n. Similarly, a is also a normally distributed random variable, with mean alpha and the variance shown; this can be derived. The part that remains is the estimate of sigma. See, if sigma is large, what do you expect? That the y_i will be quite scattered around alpha + beta x_i. So, if I look at y_i - a - b x_i, that gives me an estimate of sigma: the residual, for fixed values of a and b, is y_i - a - b x_i, and once I find a and b using those formulas, the sum of squares of the residuals is an estimate of sigma squared. The only thing is that I have used the same data to estimate the two mean parameters, so I have to divide by n - 2. The sum of squares of residuals divided by n - 2 turns out to be an unbiased estimator of sigma squared, the variance of the error term; this also can be derived. So, in summary: a and b come from the normal equations, the two linear equations got by setting the derivatives of the error term with respect to a and b equal to 0. It turns out that you first compute b, and based on that you compute a; it is all a function of the data y and x, and if you treat x as given and y as a random variable given that x, then each estimator is also a random variable, and what you observe is a realization of it. In summary, a is an unbiased estimator of alpha, and b is an unbiased estimator of beta, where a and b are the quantities derived from the normal equations; and once a and b are determined, the sum of squares of residuals, SS_R, divided by n - 2, is an unbiased estimator of sigma squared. That is the basic summary of the linear regression model and the associated statistics. In fact, we know the variances of these estimators as well, so we can give confidence intervals for a and b and things like that. Note that we can write a regression equation for anything, because the formulas are there; you can construct a regression equation for any data. But how good is it for prediction? If I use it for prediction I will get an estimate, but the spread of that estimate may be so large that it is almost useless for prediction. That depends on the data, of course; you can compute the confidence intervals for the associated predictions.
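Continuing the same sketch, the unbiased estimate of sigma squared from the residuals is:

```python
# Residuals and the unbiased estimator of sigma^2
residuals = y - a - b * x
SSR = np.sum(residuals ** 2)     # sum of squares of residuals, SS_R
n = x.size
sigma2_hat = SSR / (n - 2)       # unbiased for sigma^2
print(sigma2_hat)                # close to sigma**2 = 1.0 for this data
```

Dividing by n - 2 rather than n accounts for the two parameters, a and b, already estimated from the same data.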
So, just to complete this: the notation is that if you write the covariance-like term S_xy and the variance-like terms S_xx and S_yy, then we can actually do inference about the regression. For example, one of the first questions you would ask is whether the dependent variable is really related to the independent variable. I see this cloud; does it really mean anything, or are x and y just moving around randomly as independent random quantities? Is there really a connection? The hypothesis is: is beta equal to 0? If beta equals 0, then y is just alpha plus some noise, whatever x is. I may just see some values which are just noise: if I tell you x is this value, then y has a certain spread, and if I tell you x is that value, then y has a spread which looks almost the same. So, I get no more information about the likely values of y by being told what x is, and maybe vice versa; in that case there is no relationship between x and y, and if I fit a linear regression I am going to get a line like this as the best fit, with beta equal to 0, slope equal to 0. So, that is one of the fundamental hypotheses regarding regression: is beta equal to 0? We can pose this question in the language of hypothesis testing, take that as the null hypothesis, and work it through. Going through what I said about the estimator b, it turns out that this quantity here, which seems complicated, is explicitly computable: the square root of S_xx, times b minus beta, where beta is 0 under the hypothesis, divided by the estimate of sigma, which is also a known quantity once you compute it. This quantity, treated as a random variable, has a t distribution with n - 2 degrees of freedom. So, from the regression data, if we treat the y_i as random variables, one can construct this new random variable. It looks complicated, but it is a function of known quantities and has a known distribution; using that known distribution, we know what values it can take with what probabilities, so we can test hypotheses. See, in hypothesis testing, what we did to answer the question of interest was to find a random variable with a known distribution. Such a variable can take on extreme values only with small probability, and therefore we were able to talk about type 1 errors, type 2 errors and the level of significance. Because of randomness, it is possible for this random variable to take on some extreme value, very large or very small, but that is very unlikely. So, we put a bound on the value: if it lies in a certain interval, we come to one conclusion about the hypothesis, and if it lies outside, we reject the hypothesis. That is what we do here: if H_0 is true, that is, the null hypothesis that beta equals 0, then this random variable has the t distribution with n - 2 degrees of freedom.
So, that gives rise to this test: compute the quantity square root of (n - 2) S_xx / SS_R, times the absolute value of b. Given the data, this left-hand side can be computed. On the right-hand side you have the critical value of the test, from the t distribution with n - 2 degrees of freedom: if gamma is the level of significance, you take the gamma/2 point, and if the test statistic lies beyond it, we reject the null hypothesis; otherwise we accept it. It is the same logic as before, only the expressions are a little more complicated. I would not expect you to derive this, though it is actually quite straightforward: it follows the same logic as before and just deals with normal random variables and the t distribution, so it can be derived. The point is that there is a well-defined statistical test for accepting or rejecting the hypothesis that there is a relationship between the input and output variables. The last thing which I would like to say, and I think this is the last one, is that there is a value called the R squared value, used in regression models, which measures the extent of fit. See, I can fit a regression line to anything; if these are the values, then my regression line is like this. Look at these two cases: in both, increasing x values indicate increasing y values, and the correlation is positive. In the first case, the increase in x explains the variation in y: the y value increases, and to a very great extent the variation is fully captured by the x variable. In the second case, or let me exaggerate it a little, if I increase x, the y value also increases, but the increase in y is not explained by the increase in x to the same extent. So obviously, this is a better regression fit than that; what is the quantitative way of saying so? It is the sum of squares of deviations of the y values from their mean, minus the sum of squares of the residuals, divided by the former, just to get a standardized quantity: (S_yy - SS_R) / S_yy. That is called the R squared value, the coefficient of determination, and you can show it is between 0 and 1. If R squared is high, close to 1, we say the regression fit is good. This is just a quantitative measure of how well the regression line explains the data, no causality and all that, just how well it explains the data. Basically, y increases as x increases, but how much of the variation of y is explained by x? The R squared value tells you that; it is a value between 0 and 1, as can be shown. A value close to 1 indicates that the regression model provides a good explanation for the variability in the y values; a value close to 0 indicates that the model does not really explain the variability in y. In the example of the father and son heights worked in the book, this R squared value is 0.961, which is quite good. Of course, you would suspect that this R squared has something to do with the correlation coefficient r.
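As a sketch of the test, reusing x, xbar, b, SSR and n from the snippets above; the critical value comes from scipy's t distribution.

```python
from scipy.stats import t

S_xx = np.sum((x - xbar) ** 2)
gamma = 0.05                                # level of significance

# Test statistic for H0: beta = 0
T = np.sqrt((n - 2) * S_xx / SSR) * abs(b)
t_crit = t.ppf(1 - gamma / 2, df=n - 2)     # critical value t_{gamma/2, n-2}

print(T, t_crit)
print("reject H0" if T > t_crit else "accept H0")
```

For the synthetic data above, where the true beta is 0.5, the statistic comes out far beyond the critical value and the test rejects H_0, as it should.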
So, forget about the random variable business for a moment: if I just look at one set of values, say heights, and another set of values, say weights, as two data sets, and compute the correlation between these two sets of data, I get the correlation coefficient r; and that is actually the square root of this R squared, with the sign put in appropriately, because the data could be positively or negatively correlated. The correlation coefficient r is between minus 1 and 1, and the R squared value is its square, so it is between 0 and 1. A high value of R squared indicates a good fit of the regression model, and a low value indicates that the regression model is not a very good one. Of course, there are many caveats. This r is the standard sample coefficient of correlation, so, except for the sign indicating positive or negative correlation, the sample correlation coefficient is equal to the square root of the coefficient of determination. And the sign of r, if you want to derive it from the data, is the same as that of b, the slope coefficient: we have calibrated the regression model as y = a + b x, so if b is positive it means a positive correlation, and if b is negative, a negative one. So, the sign of r is the same as that of b, and the magnitude of r is the square root of R squared. Now, where should you use this regression model? Well, whenever you see data like this, you can use a regression model. If you see data like this, the regression model is very good; probably somebody has cooked up the data, it is too good to be true, but you can use it. This one also seems OK. But if I see data like this, then I would be a bit foolish to use a linear regression model to explain the relationship of y and x; it is clearly not linear, it is some nice quadratic or something like that, so a linear regression model is not really useful. But the thing is, you can still do it. What will you get if you use a linear regression model here? You will get some line like this, the best-fit straight line for this data, but that is not very useful. For high values of x and for low values of x it just says that the values of y are clustered around the line in some way; it does not tell you the qualitative nature of the behaviour, that for low values of x and for high values of x the values of y dip down, that it is a quadratic relationship. The regression model will not be able to capture that. Technically you can still do this, compute a and b and R squared and all that, and you will probably get a low value of R squared, which should warn you; but the point is that when you are using these models in practice, please plot the data and look at it first, before doing anything.
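Both claims are easy to check numerically; again a sketch continuing the synthetic data, not something from the lecture slides.

```python
S_yy = np.sum((y - ybar) ** 2)

R2 = (S_yy - SSR) / S_yy             # coefficient of determination
r = np.corrcoef(x, y)[0, 1]          # sample correlation coefficient

print(R2, r ** 2)                    # equal: R^2 = r^2
print(r, np.sign(b) * np.sqrt(R2))   # equal: r = sign(b) * sqrt(R^2)
```

A curved, say quadratic, cloud will still give you numbers here, but a low R squared, which is the warning sign mentioned above.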
So, later on, if you use regression models anywhere, the first step is to plot the data and have a look, because you can always put the data into spreadsheets and software, get some coefficients and keep going, but you should not lose the basic sense of what it is you are trying to do. All this data analysis is fine, but please do not discard your common sense: if something does not agree with your common sense, chances are something is wrong. So, please take a look at it; in a few cases your intuition may be wrong, but it is definitely worth a look. Do not disregard your common sense, no matter what the software or the program tells you. And no matter how many marks you get or do not get in this course, one thing you have is your common sense: use it, and do not lose it.