Hey everyone, it's MJ, and welcome to this video on regression. I'm very excited about this one, because regression is one of the most powerful subjects in statistics, and I want this to be the video on the internet when it comes to regression, so we are going to explain everything. Since it's going to be a long video, we might as well start with a proper introduction to why it's so useful. The purpose of statistics is to take data and turn it into information. We've also looked at hypothesis testing, where we make some presumptions and then go looking for data to provide evidence, and for a lot of us that is statistics: it's what we use in science to make observations and progress with research. But there is another extension of the subject, known as regression, and regression is so powerful because it lets us take information and find knowledge. That's a game changer. Many jobs pay big money for people with this skill, because regression helps you optimize and make predictions, so you can make better decisions and get a view of what the future might hold, and both of those are critical to achieving our goals. So in this video I'm going to focus on the ideas behind regression: I want you to understand what the subject is all about. Because the video is long enough on its own, I'll be skipping a lot of the maths. I'll give hints and I will use maths, but the actual proofs I'm going to leave out so we can focus on the ideas, and also because computers today do most of the maths for us. I will be relying a little on prior statistical knowledge, so I'm assuming that you have some background in stats. If not, I have that whole Teachable account where these videos will also appear in a more condensed format, with a link in the description below, and it contains all the previous stats videos you need as background theory.

So, regression fully explained: this video is divided into nine parts. Part one: correlation. Part two: sample correlation. Part three: the actual linear regression model; make sure you don't just skip ahead to it, because you do need to understand correlation first. Part four: measuring the goodness of fit, one of the checks we do on how good the model actually is. Part five: a closer look at the slope parameter, given the symbol beta. Part six is where things get a lot of fun: making predictions, looking especially at the mean response and the individual response. Part seven is another test of how good our model is: residual analysis. Part eight is where the maths gets a little bit insane and you do need to be a bit of a genius, but we're only going to have a very brief introduction to what transformation is.
Then of course part nine is where computers come in: multiple linear regression. That stuff can get insanely complicated, and you can build an entire career out of it, but what we're doing in these nine parts is giving you the foundation, so that afterwards you can go on and do weird and wonderful things. Just as an example of what you can do after going through this course: you could use it to build a model that predicts what the bitcoin price is going to be tomorrow, or, if you're not into making money and you're more into the optimization side, you can even use it in go-karting like me, using regression modelling to work out the best strategy for your kart to help you win races. There's a whole bunch of things you can do with this skill, and that's why I think it's important that it's out there on the internet so people can embrace it. But without further ado, let's jump into part one.

Okay, part one. After each of the parts you can take a little break, or you can just keep watching it in its entirety. Part one looks at this idea of correlation. What do we mean by correlation? Correlation essentially means the strength and direction of a relationship. So what does that mean? I'm going to draw out five different graphs, each with an x and y axis. If all the points are just randomly scattered about, we say the data is uncorrelated. If we can see some sort of pattern where y increases as x increases, we say we have a weak positive relationship; if that pattern gets a bit stronger, we use the terminology strong positive; and if the points follow a perfect straight line, we say it is perfect positive. Remember the data can also go the other way: points falling along a perfect downward line are perfect negative. Now you might be saying, hold on, what are this y and this x? Well, y could be the bitcoin price and x could be the interest rate, or if we're talking karting, y could be our lap time and x could be our weight. Weight and lap time would be a positive correlation, because as our weight increases, our lap time also increases; bitcoin price against interest rates could be an example of a negative correlation. So that's what we have, but that's all good in theory: where do we get the numbers from? We can't just eyeball something and call it strongly correlated; we need numbers to back it up. If we look back to the beginning of stats, we looked at the variance, which is simply how spread out a random variable is. We then also looked at something called covariance, which is the joint spread of two random variables, and covariance is roughly what we're after: we want to see how our y and our x spread out together. The thing about covariance is that on its own it's a bit of a meaningless number: what do we mean if our covariance is 10, or 100? It's a very difficult number to compare, and that's where correlation comes in. Correlation, which is denoted by the symbol rho and written corr(x, y), is equal to the following formula: the covariance of x and y divided by the square root of the product of the variance of x and the variance of y, in other words by the product of the two standard deviations. What this division does is remove the magnitude distortion that comes from the scale of the data, and it allows us to compare correlations across various data sets, because it puts bounds on the correlation: it can never be greater than one and it can never be less than negative one. Now, why are we so interested in correlation when it comes to regression analysis? Because if our correlation is equal to zero, then x and y are uncorrelated and it's pretty pointless to do regression. If we see that the interest rate and the bitcoin price are uncorrelated, there's no point in building a model with interest rate as our explanatory variable and bitcoin price as our response variable; it would be an absolute waste of time. So before we go ahead and do regression analysis, we check the correlation. The problem, though, is that this is the population correlation, and very rarely will we have access to the population data. Instead we'll have access to sample data, and that's where part two comes in. In part two we're going to look at the sample correlation, and the reason we have to is that the population correlation is usually beyond us, so we need to use a sample to come up with an estimate, a guess, for what the population correlation is supposed to be.
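As a quick numerical illustration of that scaling idea (a sketch with simulated, made-up numbers, not data from the video), we can generate two related variables and compute the correlation as covariance divided by the product of the standard deviations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: x might be an interest rate and y a price that
# tends to move against it (both series are simulated, not real data).
x = rng.normal(5.0, 1.0, size=100_000)
y = -2.0 * x + rng.normal(0.0, 1.0, size=100_000)

cov_xy = np.cov(x, y, ddof=0)[0, 1]        # covariance of x and y
rho = cov_xy / (np.std(x) * np.std(y))     # scale by both standard deviations

# Correlation is bounded: it can never exceed 1 or drop below -1.
assert -1.0 <= rho <= 1.0
print(rho)   # strongly negative here, roughly -0.89
```

The covariance here is around minus two, a number whose size depends entirely on the units chosen; after dividing by the standard deviations we get a unit-free value near minus 0.89 that can be compared across data sets.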
So let's look at the sample correlation. What is it? It's denoted by the lowercase letter r and is equal to S_xy divided by the square root of S_xx times S_yy. Let's explain what these S_xy things actually are, because we're going to be using these formulas throughout the rest of the course, so it's important that you pay attention. S_xy is the sum over i of (x_i minus x-bar) times (y_i minus y-bar); think of it very much as the covariance of x and y. S_yy is the sum of (y_i minus y-bar) squared, and similarly S_xx is the sum of (x_i minus x-bar) squared; you can see these play the role of the variance of the y's and the variance of the x's. This formula looks like the population correlation formula from part one, but remember that here we're using sample data, whereas that earlier one used population data. Like I said, go check out the Teachable course where we explain what sample and population are in some of the earliest background material. Now, because r is a function of the sample data, r is a random variable, and because it's a random variable we can do a whole bunch of lovely things with it. But it also means we could have a situation where the population correlation is equal to zero and yet, due to random fluctuations, the sample correlation is not. Which means we do need to do a bit of a hypothesis test, and the hypothesis test is as follows: H0 is that the population correlation equals zero, and H1, the alternative hypothesis, is that it does not. The test statistic is r times the square root of n minus 2, divided by the square root of 1 minus r squared, and it follows the t distribution with n minus 2 degrees of freedom. Like I said, I do want to skip the maths, but if you want to prove this for yourself you'll need to rely a little on the central limit theorem and look at the situation where sigma squared is unknown; you could make an entire video just on that, so take it as one of the standing results of this course. With this test statistic we can calculate a 95 percent confidence interval, which gives us two numbers, and what we know is that if zero is not included in that confidence interval, we can reject H0 and continue with our regression model. Just a little recap of why we're doing all this before we get into linear regression models: we want to check whether there is some sort of relationship in the data; to measure that relationship we need the correlation, which is the covariance between the two variables scaled by their standard deviations; we might not have access to the population data, in which case we take a sample; from the sample we calculate the sample correlation; and because that's a random variable, we do a hypothesis test to see whether zero falls outside the 95 percent confidence interval. Once we have all of that, we can go on to part three, linear regression models.

Okay, so now we're coming up to part three. A quick recap: before we look into linear regression models, we had to look at the correlation to see whether the modelling is actually worth doing, to make sure the data is not uncorrelated. But that's for the population, and we might not have access to the population data, so we look at the sample correlation, which is a random variable, which means we have to do a hypothesis test around it.
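That part-two recipe can be sketched in a few lines of plain Python; the x and y values below are made up purely for illustration:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]   # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)
s_yy = sum((yi - ybar) ** 2 for yi in y)

r = s_xy / math.sqrt(s_xx * s_yy)                # sample correlation
t = r * math.sqrt(n - 2) / math.sqrt(1 - r**2)   # ~ t with n-2 df under H0

# With n-2 = 4 degrees of freedom the two-sided 5% critical value is
# about 2.776, so a |t| far above that rejects H0: rho = 0.
print(r, t)
```

On this data r comes out around 0.999 and the test statistic is enormous, so zero falls well outside the 95 percent confidence interval and we would happily carry on to fit a regression.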
That's the backstory; now we're actually getting into the whole thing, because once we know, with 95 percent confidence, that our sample correlation is not equal to zero, we can fit our model. Now, this becomes a whole question in itself: what type of model do we fit? When this is your day job, when you're a data scientist, deciding which model to use takes up a big chunk of the process. For now we're going to use the baby model, known as the linear model. It's the simplest, and it lets us build up a lot of the theory and talk about a lot of the ideas, which then extend to the other models you might have chosen. The linear model, in its very raw sense, is this: given data that we've tested for some sort of correlation, we draw a straight line through it, and that is our linear model. Written out mathematically, y_i = alpha + beta x_i + e_i. Let's quickly explain what each of these things is: y_i is our response variable; alpha is our intercept parameter; beta is our slope parameter; x_i is our explanatory variable; and e_i is our error variable. A little fun thing about the error variable is that it is normally distributed with mean zero and another parameter, sigma squared, as its variance. What this means is that in order to build our linear regression model we have three parameters to estimate: alpha, beta, and sigma squared. The whole process of finding these three parameters is known as fitting the model, so let's end off this part by looking at how we actually fit it. The estimate of the slope parameter is beta-hat = S_xy divided by S_xx. The estimate of the intercept parameter is alpha-hat = y-bar minus beta-hat times x-bar, where beta-hat is the slope estimator. And sigma-squared-hat = 1 over (n minus 2) times the sum of (y_i minus y_i-hat) squared. Now, you're probably thinking: where on earth did these come from? Essentially the mathematics behind it is a pair of simultaneous equations, the normal equations. One equation is that the sum of the y_i equals alpha times n plus beta times the sum of the x_i, and there is a second one like it involving the sums of x_i y_i and x_i squared. You solve them simultaneously; it's not that difficult, it just takes a bit of time, and out come those two estimators. We're not going to go through that derivation either; like I said, we want to stay away from the heavy maths and focus on the ideas. But what we will start to see is that these estimators are once again random variables. Beta-hat is a random variable: it has an expected value, which equals beta, showing that it is unbiased, and it has a variance, an important result that we're going to use a lot later on. One thing to always keep in mind is that the fitted line will always pass through the point (x-bar, y-bar); it's one of the little quirks of linear regression models. Anyway, that concludes part three, which was an introduction to fitting the model. Next, part four: measuring the goodness of fit.
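The three estimators from part three translate directly into code (same made-up data as before; this is a sketch of the formulas, not a library implementation):

```python
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]   # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)

beta_hat = s_xy / s_xx                # slope estimate: S_xy / S_xx
alpha_hat = ybar - beta_hat * xbar    # intercept estimate

y_fit = [alpha_hat + beta_hat * xi for xi in x]          # fitted values
sigma2_hat = sum((yi - yf) ** 2 for yi, yf in zip(y, y_fit)) / (n - 2)

# Quirk from the video: the fitted line always passes through (xbar, ybar).
assert abs((alpha_hat + beta_hat * xbar) - ybar) < 1e-9
print(beta_hat, alpha_hat, sigma2_hat)
```

On this particular data beta-hat works out to 2.02, alpha-hat to minus 0.02, and sigma-squared-hat to 0.032, so the fitted line is roughly y = 2x.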
To illustrate what I mean, picture two fitted lines: in model A, all the data points sit very close to the line, which we'd say is quite a good fit, whereas in model B they're scattered far from it, so A has a much better fit than B. However, we don't want to just rely on our own eyes; we want some mathematics behind it, a number that states how good the fit is. Another way of saying it is: how much is explained by the model? In scenario A a lot is explained by the model, and in B not a lot. So we need to study S_yy in more detail, which, as we remember, equals the sum of (y_i minus y-bar) squared. And recall from the previous part that our linear regression model is y_i = alpha + beta x_i + e_i. There are two sources that the y_i come from: the x_i, which we could call the good source, and the e_i, which I guess you could call the bad source. Now here comes a bit of a jump in the mathematics; once again it's a result we're just going to accept, though you can look into it more. Take y_i minus y-bar and write it as (y_i minus y_i-hat) plus (y_i-hat minus y-bar): whatever we subtract we also add, so those two extra terms cancel each other out. What we're going to see, and we're not going to prove this, is quite a magical result: when we sum and square both sides, the cross term vanishes and the sums of squares split cleanly. This doesn't happen in all cases; it's a special case here, and if you do want to try to prove it, I'll give you a hint: you need the relation beta-hat = S_xy / S_xx that we met in the earlier part. What this does is partition the variability of the response. The left-hand side is S_yy, which we can also call the total sum of squares, SS_tot. It breaks into two pieces: one involving the residuals e_i-hat = y_i minus y_i-hat, which is the residual sum of squares, SS_res (res being short for residuals), plus the regression sum of squares, SS_reg. It's simple but unfortunate that res and reg look a little similar, but essentially we're taking the variability and splitting it into two components: SS_res, represented by the e's, is the unexplained part of the model, and SS_reg, coming from the x's, is how much is explained by the model. We want the majority of SS_tot to come from SS_reg, and we want SS_res to be as small as possible. Getting into the maths a little, the estimator for sigma squared is SS_res divided by (n minus 2), and SS_res itself equals S_yy minus S_xy squared over S_xx. This is where the maths does get a little tricky, and in the exam they can ask you to use these puzzle pieces to solve certain problems, so it is important that you go through this yourself to get a better understanding. What is cool is that we can take expected values of these things: the expected value of SS_tot, the total sum of squares, is (n minus 1) sigma squared plus beta squared S_xx, and the expected value of SS_reg is sigma squared plus beta squared S_xx. Like I said, we're not going into the actual proofs, but you can do them yourself if you want to double-check. Essentially we're saying the model is a good fit if SS_reg, the explained part, is high and SS_res, the residual part given by the error term, is low. We can also express this in a nice way with something called the coefficient of determination, denoted capital R squared, which equals SS_reg divided by SS_tot: basically we're asking what proportion of the total variability is explained. There's also an alternative formula in terms of those S_xx and S_yy quantities, which shows how useful knowing them actually is. We quote the coefficient of determination as a percentage, anywhere between 0 and 100 percent. What is interesting is that if we compare that formula to our sample correlation, R squared is exactly the square of the sample correlation, and this makes sense: if we have a very strong correlation, we expect our linear model to explain a lot; if the sample correlation is very weak, we expect the model not to explain much. It's very interesting to see how these two things are connected. Okay, now we're moving on to, what part is this, the slope parameter, which is part number five. Let's get into part five, where we look at the slope parameter in a little more detail. In order to understand the slope parameter, we first need to make a few assumptions about our error terms.
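Before we do, here's a quick numeric check, on the same hypothetical data as before, that the coefficient of determination really is the square of the sample correlation:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]   # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)
s_yy = sum((yi - ybar) ** 2 for yi in y)

ss_tot = s_yy                      # total sum of squares
ss_res = s_yy - s_xy**2 / s_xx     # residual sum of squares (alt. formula)
ss_reg = ss_tot - ss_res           # regression (explained) sum of squares

r2 = ss_reg / ss_tot               # coefficient of determination
r = s_xy / math.sqrt(s_xx * s_yy)  # sample correlation

assert abs(r2 - r**2) < 1e-12      # R^2 equals r squared
print(r2)   # about 0.998, i.e. roughly 99.8% of the variability explained
```

So on this toy data almost all of SS_tot comes from SS_reg, which is exactly the "good fit" situation described above.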
We're going to assume that the errors are all independent of one another and that they are normally distributed. What is interesting is that when we assume this, we can say the following: the response variable y_i is normally distributed with mean alpha plus beta x_i and variance sigma squared, and beta-hat, the one we're interested in, is itself normally distributed with mean beta and variance sigma squared over S_xx. In order to do anything more we need two further results, and again we're not going to prove them. One: the estimators beta-hat and sigma-squared-hat are independent. Two: (n minus 2) times sigma-squared-hat over sigma squared is distributed according to the chi-squared distribution with n minus 2 degrees of freedom. Again, this is something you should be comfortable with if you've done the previous courses; otherwise go check out those videos if you're thinking, whoa, where on earth did sigma-squared-hat come from. Once we have all of this, the maths is a little tricky, but we can derive the maximum likelihood estimators of alpha, beta, and sigma squared, and that might very well be an exam question, especially if you're studying actuarial science, so you do want to be comfortable with that maths. Because we have these distributional results, we know the following is true of the slope parameter: standardizing it, (beta-hat minus beta) divided by sigma over the square root of S_xx is standard normal; call this result M. Since M and the chi-squared quantity are independent, from our understanding of probability distributions we get the following: (beta-hat minus beta) divided by its estimated standard error follows the t distribution with n minus 2 degrees of freedom. That basically lets us write the slope parameter in a usable form. You might be saying, hold on, what is this SE thing? It's the estimated standard error, a standard-deviation kind of idea: SE(beta-hat) = sigma-hat over the square root of S_xx, where we use the estimator sigma-hat instead of the actual population parameter. The whole idea is that we'd use the standard normal when sigma is known, but very likely it's unknown, in which case we use the t distribution instead. And this is great, because it gives us a test statistic for confidence intervals and hypothesis tests: the confidence interval is beta-hat plus or minus t at alpha over 2 with n minus 2 degrees of freedom, times SE(beta-hat), and bam, we have our confidence interval. What is interesting, if we look a little closer, is this: beta-hat equals S_xy over S_xx, and the sample correlation r equals S_xy over the square root of S_xx S_yy. So if beta-hat equals zero, the only way that can happen is if S_xy equals zero, and if S_xy equals zero then r has to equal zero. What we're seeing is that a zero slope means the data is uncorrelated, so this is another useful check of whether the model is worth fitting. But the reason it's a little superior to the sample correlation test is that we can now test other hypotheses too, say beta equals one, or two, or three, or four; it extends our testing abilities. Okay, let's use orange for the next part. This is part six, let me keep count, yes, part six, where things start getting very interesting.
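Here's a sketch of that part-five machinery on the same made-up data; the critical value 2.776 is t at 0.025 with 4 degrees of freedom, taken from standard tables:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]   # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)

beta_hat = s_xy / s_xx
alpha_hat = ybar - beta_hat * xbar
sigma2_hat = sum((yi - (alpha_hat + beta_hat * xi)) ** 2
                 for xi, yi in zip(x, y)) / (n - 2)

se_beta = math.sqrt(sigma2_hat / s_xx)   # estimated standard error of beta-hat
t_crit = 2.776                           # t_{0.025} with n-2 = 4 df, from tables

ci = (beta_hat - t_crit * se_beta, beta_hat + t_crit * se_beta)
t_stat = beta_hat / se_beta              # test of H0: beta = 0

print(ci, t_stat)
```

Notice that, unlike the correlation test, we could just as easily test H0: beta = 2 here by computing (beta_hat - 2) / se_beta instead, which is exactly the extra flexibility described above.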
In part six we look at the mean response and at predicting individual responses, and I guess this is the whole point of regression analysis: what is y going to be when we have a given x? Let me explain the distinction right at the get-go with karting: in karting I do one lap for qualifying and 15 laps for the race. When we're looking at the average lap time over the race, we use the mean response; when we're looking at one lap on its own, we use an individual response y. I think it's very important we get that out of the way, just in case you get confused about where this mu-zero thing is coming from: mu_0 is the mean response of y given x_0, and once again it equals alpha plus beta x_0. The big thing to notice is that there are no e_i's in this formula, because it's a mean response, which means we can estimate it with our intercept and slope estimators: mu_0-hat = alpha-hat plus beta-hat times x_0, given what our x_0 is. Now it's important to remember that this estimator is still a random variable, and we have the following set of results. The first result should not scare anyone: the estimator is unbiased. The second result is where things get a little interesting: the variance of this estimator is sigma squared times (1 over n plus (x_0 minus x-bar) squared over S_xx). If you want to be very brave and prove this, I'll give you two hints. One: the variance of alpha-hat, the intercept estimator, is sigma squared times (1 over n plus x-bar squared over S_xx). Two: the covariance of alpha-hat and beta-hat is minus x-bar sigma squared over S_xx. You need these two results for the proof, but note that if, in an exam or some unfortunate situation, you're told to prove the variance formula, you can't just state these hints; you'd have to prove each of them as well. So this is where the maths can get a little insane, but the important thing to realize is the result. Just like before, we can take the standard error of it. The important thing to note here is that not only are we taking the square root, but, and it feels like that game spot-the-difference, there's another big change in there: we use the estimator sigma-hat, since this is the situation where the population variance is unknown. What's nice is that this gives us a lovely test statistic that follows the t distribution with n minus 2 degrees of freedom, which once again we can use for confidence intervals and hypothesis tests. But that's all for the mean response; what if we want to do it for an individual response? Like I said, the mean response was my average lap time over the 15-lap race; what happens in qualifying, where it's just one lap and we're only getting one chance? In that case we use y_0-hat, and at first it looks very similar to what we had before; in fact the expected value is the same. But what about the variance? Let's think about it logically: if you do something 15 times versus doing it once, which is more variable? The answer is when you only do it once, because you don't have multiple attempts cancelling each other out and tending towards an average. So how do we handle the extra variability? We add in another sigma squared, which means the variance of y_0-hat is (1 plus 1 over n plus (x_0 minus x-bar) squared over S_xx) times sigma squared; the extra sigma squared compensates for the fact that the variance should be bigger.
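The two part-six variance formulas can be put side by side on the same hypothetical data; note the extra "1 +" in the individual-response case:

```python
import math

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]   # hypothetical data
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)

beta_hat = s_xy / s_xx
alpha_hat = ybar - beta_hat * xbar
sigma2_hat = sum((yi - (alpha_hat + beta_hat * xi)) ** 2
                 for xi, yi in zip(x, y)) / (n - 2)

x0 = 4.5                                # the x value we want to predict at
mu0_hat = alpha_hat + beta_hat * x0     # estimated mean response at x0

se_mean = math.sqrt(sigma2_hat * (1/n + (x0 - xbar)**2 / s_xx))
se_pred = math.sqrt(sigma2_hat * (1 + 1/n + (x0 - xbar)**2 / s_xx))

t_crit = 2.776                          # t_{0.025} with n-2 = 4 df, from tables
mean_ci = (mu0_hat - t_crit * se_mean, mu0_hat + t_crit * se_mean)
pred_ci = (mu0_hat - t_crit * se_pred, mu0_hat + t_crit * se_pred)

# The individual-response interval is always wider than the mean-response one.
assert se_pred > se_mean
print(mean_ci, pred_ci)
```

This is exactly the race-versus-qualifying intuition: the interval for a single lap (the prediction interval) is noticeably wider than the interval for the average over many laps.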
And once again, you should start seeing the pattern by now and seeing that this stuff actually isn't that difficult: we swap in the estimator sigma-hat, and we can use the result as our test statistic for confidence intervals and hypothesis testing, which is very, very nice. So that was part six. When we come to part seven, what we're going to be doing is checking the model. What do we mean by that? Checking the model means observing something known as scatter plots of the residuals. Now you might be saying, okay, hold on, what is a residual? The residual at x_i is e_i-hat, which equals y_i minus y_i-hat. We want to observe these scatter plots for two reasons. One: to check our assumptions, namely whether the true errors in our model, our little e's, are independent and identically distributed, normal with a mean of zero and a variance of sigma squared. Two: to investigate the nature of the relationship between our response and explanatory variables. In a perfect situation the scatter plot of residuals is just a random scatter, and we can say: fantastic, that's what we want, we can be quite happy with the model. However, suppose the data has a bit of exponential growth and we fitted a linear model. Then we can expect our errors to grow as x increases: they get worse and worse, and the model deteriorates as we extend it. We don't want that; we want our model to behave the same whether we're at the beginning or at the end. What this shows is a pattern forming in our residuals, and that is panic mode.
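Here's a sketch of exactly what part seven warns about: fit a straight line to data that is actually exponential, and the residuals stop looking like a random scatter (all values below are made up):

```python
x = list(range(1, 11))
y = [2.0 * 1.3 ** xi for xi in x]     # exponential growth, not linear
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

s_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
s_xx = sum((xi - xbar) ** 2 for xi in x)
beta_hat = s_xy / s_xx
alpha_hat = ybar - beta_hat * xbar

residuals = [yi - (alpha_hat + beta_hat * xi) for xi, yi in zip(x, y)]

# A least-squares line through convex (curving-up) data leaves positive
# residuals at both ends and negative ones in the middle: a clear pattern,
# i.e. panic mode, rather than a random scatter.
assert residuals[0] > 0 and residuals[-1] > 0
assert min(residuals) < 0
print([round(e, 2) for e in residuals])
```

In a real analysis you'd plot these residuals against x (for example with matplotlib) and look for exactly this kind of systematic sign pattern rather than eyeballing the raw list.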
we're like, no, we don't want patterns in our residuals, this is very bad, and it shows that our model is inadequate if there are patterns in the residuals. However, there is a way to deal with those patterns when the residuals aren't behaving well, and that's what we're going to look into here in part eight. So in part eight we're going to just briefly talk about something called transformation. Transformation comes in when, as we said, we have a pattern in our residual analysis. We might have a situation where, like I said, we have a growth model, an exponential growth, which means as soon as we fit a linear regression the errors are going to be increasing as x grows. How we would write this is that the expected value of our response variable given our explanatory variable is E(Yi | xi) = α·β^xi. Now what do we do? Because we need something in a linear format if we're going to be doing a lot of our testing. Fortunately, we can use a mathematical concept known as transformation, and this is the transformation: we set up a new variable, wi = log yi, which is a transformation of our response variable. If you understand how logs work, we now get wi = log α + (log β)·xi + εi, which is a linear model. So we have transformed an exponential into a linear model, which is lovely because it's going to allow us to do all of our wonderful tests. Although one thing you'll have noticed is that I've added the error term. Now, this is implying that the error structure is additive on the log scale, which means it was multiplicative in the original situation over here, and there are potential problems with this. I'm getting a little bit down the rabbit hole; this is something we look at more in generalized linear models, which is a much later course. So I just wanted to briefly mention transformation to you; it is something you'll deal with in the future, and it is where things do get a little
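Here's a small sketch of that log transformation in Python, again with made-up numbers chosen just for illustration. We simulate data where E(Y | x) = α·β^x with multiplicative error, take logs, and recover α and β from an ordinary linear fit:

```python
import numpy as np

# Hypothetical exponential-growth data: E(Y | x) = alpha * beta**x,
# with multiplicative error (so the error becomes additive after logging)
rng = np.random.default_rng(1)
alpha, beta = 2.0, 1.3
x = np.linspace(0, 10, 40)
y = alpha * beta**x * np.exp(rng.normal(0, 0.05, size=x.size))

# Transform: w = log y = log(alpha) + (log beta)*x + error  -> linear in x
w = np.log(y)
slope, intercept = np.polyfit(x, w, deg=1)

# Back-transform the fitted coefficients to the original scale
alpha_hat = np.exp(intercept)
beta_hat = np.exp(slope)
```

With the noise this small, `alpha_hat` and `beta_hat` land very close to the true 2.0 and 1.3, which is the whole point: a model that was hopeless for linear regression becomes a perfectly ordinary straight-line fit after the transformation.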
bit tricky. But I do want to end off the subject of regression with part nine, where we're looking at something called multiple linear regression models, and I think this is where the real fun actually begins, because what we're going to be doing here is looking at more than one explanatory variable. So if it's bitcoin, we might want to look at interest rates, and we might also want to look at USD debt, and we might want to look at, I don't know, inflation, and maybe even trading volumes, etc. When it comes to karting, we might want to look at not only the weight of the kart but also the tyre pressure, as well as the track conditions. We can start bringing in more than one variable, and the idea behind this is that more information means our prediction power is going to be greater; our predictions are going to become more accurate. So how does this look mathematically? Well, we have it in the following form: E(Y | x1, x2, …, xk). We're always going to have just one intercept parameter, but look at this, we now have a whole bunch of these betas, which we call the multiple regression coefficients. So once again this Y here is a random variable, and all these betas are what we call the multiple regression coefficients. How we would then write it out is the following: yi = α + β1·x1i + … + βk·xki + εi, our little error term. But like I said, this is very much just the tip of the iceberg. We can start playing around with many different models, sometimes we can combine variables, and the maths does start getting a little bit heavy, and so this is where computers are used. But I thought I'd just give you a very brief introduction to multiple linear models. So yeah, if you've made it this far, welcome, well done, you've made it to the conclusion. Look, in this video we did talk a lot about confidence
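Since the video says computers do the heavy lifting here, this is roughly what that looks like in Python. Again, the data and coefficient values are invented for the example; the point is just the mechanics of fitting yi = α + β1·x1i + β2·x2i + εi with k = 2 variables:

```python
import numpy as np

# Hypothetical data with two explanatory variables (k = 2)
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 0.1, size=n)

# Design matrix: a column of ones for the intercept alpha,
# then one column per explanatory variable
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares fit: coeffs = [alpha_hat, beta1_hat, beta2_hat]
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The same line of code handles any number of variables: add more columns to `X` (interest rates, inflation, trading volumes, whatever you like) and `lstsq` still hands back one coefficient per column.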
intervals, hypothesis testing, things known as test stats, estimators, etc. So if you're new to data science and you came out of this video thinking, what on earth was he talking about, I do have a course on teachable. There are actually a few other videos on this channel as well, but the majority of them are going to be on teachable, with some exam help and some other goodies to help you really understand these concepts. Like I said, the link is going to be in the description below and you can follow that, and also feel free to ask me questions on that platform; all the videos are nicely put together, so it just makes for a much better learning environment. Anyway, I hope you guys enjoyed this course on regression analysis. Yeah, the maths is a little bit intimidating, but I hope we were able to communicate the big ideas, and we can see what a powerful form of stats this is. As always, thanks so much for watching, cheers.