Okay, the session is in progress, so we can start with simple linear regression. Ignore the times on the slide; we're going to do the basic concepts of linear regression first and then the calculations. By the end of the session, you should know how to use regression analysis to predict the value of a dependent variable based on the value of an independent variable. You should know the meaning of the regression coefficients, which are your slope and your intercept. You should be able to make inferences about the slope, and to calculate and interpret both the correlation coefficient and the coefficient of determination. So let's look at how we do all this in the next hour and a half.

When we talk about correlation and regression, we're talking about the relationship between two numerical variables. The easy way of visualizing that relationship is a scatter plot, which shows you whether your data are related and, if they are, whether they are negatively or positively related. So a scatter plot is used to show the relationship between two variables, and the correlation is a measure of the strength of the association between those two variables. Correlation is only concerned with the strength of the relationship; no causal effect is implied. Let me put it this way: with correlation, you can say the quantity relates to the price, but you can never say the quantity influences the price. That would be a causal effect. There are different types of relationships when we look at correlation.
So if we look at the scatter plots in front of us, some of them follow the same kind of pattern, going up or going down in a straight line, and some of them follow a curved pattern. When your scatter plot shows points moving together along a straight line, we call that a linear relationship. The first two graphs show a straight-line, or linear, relationship, because when we draw a line you will see that the dots follow that line and stay close to it. When the line goes up, like the first one, we say x and y are positively related. If it goes down, like the second one, we say x and y are negatively related. If it's a curve, we say it is a curved relationship. With a curve, if it looks like this one, with a lowest point, it has a minimum, and that is a quadratic relationship. Sometimes it goes the other way and looks like a mountain shape. It's still a quadratic relationship, because it is still a curve. Sometimes it's not a complete curve but looks like the one at the bottom, which we call an exponential graph. Think of the COVID-19 cases, which were growing exponentially. When the cases rose and then dropped, the pattern looked quadratic. If the cases were just going up and up at a steady rate, that would be a linear relationship, not an exponential one, so it depends on the case.

Sometimes there can be no relationship. If you look at the two graphs we have here, in the first one all the points are scattered everywhere. There is no pattern in that scatter plot, so there is no relationship. Things just happen. And if you look at the one at the bottom, everything looks constant: even though x increases, the value of y stays almost constant.
It doesn't change, it doesn't fluctuate, it's almost constant. These are the scatter plots that show no relationship, so x and y are not related in these two graphs. And that is all about the types of relationship.

Before I move away from correlation: at a later stage we're going to learn how to calculate the correlation, which tells you the strength of the relationship, especially for a linear relationship. We can calculate the strength and say, for example, that from the way it looks it is about 80% correlated, and then we say there is a positive correlation of 0.8 between the value of x and the value of y. That's coming up a little later. For now, we're just talking about the relationship.

What's coming next is this line. This line is what we call a linear regression line. We can calculate it, because it's the same as y equals mx plus c, which we call the equation of a straight line, and we use it for regression analysis. Regression is used to predict the value of your dependent variable based on at least one value of your independent variable. So we have values plotted like that, where this is our x and this is our y. If we put a value of x here, what will the value of y be? We use regression to predict that y will be two, or will be one, and so on. We will learn that at a later stage. Regression also helps us explain the impact of changes in the value of x on the value of y. So how does our independent variable, x, affect our dependent variable, y? We use regression to answer that. I already explained this.
So our dependent variable is the variable that we want to predict, and our independent variable is the one whose value we use to predict what y will be. The simple linear regression equation, like I said, looks familiar to those who did maths: y equals mx plus c. Your c will be b0. Today I've been having challenges with my pen; it doesn't talk to my computer. Let me see if it registers, because I took it out for too long. So this is mx plus c. If you did maths, this should not be new to you. The value of c is the same as b0, the value of m is b1, and x stays the same. So y hat is the value that we are estimating: the intercept plus the slope times your x observation. We are going to be able to calculate the intercept and the slope, and we're going to use our x value to predict the value of y hat. If you have a financial or a scientific calculator in the exam and they give you the values of x and y, it's easy to get b0 and b1 from your calculator, and I'm going to show you that at a later stage. But for now, we do things manually.

From this equation you are able to see the relationship between the value of x, your independent variable, and the value of y, your dependent variable. With the slope, you see the change in y as it relates to the change in x. The slope, which was m and in this instance is b1, is given by y2 minus y1 divided by x2 minus x1. Remember, the change in the values of y divided by the change in the values of x gives you the slope. And we can interpret the slope: for every additional unit of x, there will be an increase or a decrease in the value of y. So that is the slope.
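Written out, the line and its slope described above look like this. The subscript i for the i-th observation and the two points (x1, y1) and (x2, y2) taken on the line are notational assumptions added here, not something read from the slides.

```latex
\hat{y}_i = b_0 + b_1 x_i,
\qquad
b_1 = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}
```

Here b0 plays the role of c and b1 plays the role of m in y = mx + c.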
How do we interpret the slope and the intercept? To start with, the intercept is where x is equal to 0. It is just the estimated value of y when x is equal to 0. Remember our formula: y hat is equal to b0 plus b1 x. If x is 0, then b1 times 0 is 0, and we end up with y hat equal to b0. So that is our intercept. The slope we interpret as the estimated change in the average value of y as a result of a one-unit increase in the value of x. So one additional unit of x results in either a positive or a negative change in the value of y. That is the slope. If we look at this graph, this is a positive slope. If we move from this point to that point, we are adding one unit of x, and wherever I make that move along the line, from here to here, I find the same change in the value of y. So for every additional unit you get an increase or a decrease. In this instance, for every additional unit of x there is an increase in the value of y. If it looks like this instead, where this is my x and this is my y, then for every additional unit of x there is a decrease in the value of y. Of course, here the highest x value is paired with the smallest y value.

So let's look at an example. A real estate agency wishes to examine the relationship between the selling price of a house and its size measured in square feet. A random sample of 10 houses is selected. Let me count them: 2, 4, 6, 8, 10. Yes, 10 houses. Our y, the price, is our dependent variable, and it is measured in thousands. So this is 245,000 and this is 405,000. This first house is 1,400 square feet for a price of 245,000. The square feet are our x, our independent variable. We can take these houses and put them on a scatter plot to see the relationship. So if I draw the line, maybe I'm not drawing it right, so let's start from here.
If I draw the line, I can clearly see that this is almost a linear relationship: the points follow the same direction. And that is the scatter plot of all these points. How do we draw it? 245 links up with 1,400. So if these are the square feet, go and look for 1,400. Let me remove the ink. We look for 1,400 on this axis and we look for 245 on the other, somewhere there. I'm just going to assume that is the one. And if I look at the next one, it's 199 and 1,100. So 1,100 is somewhere there, and 199 is somewhere there, just before 200. So this one relates to that one, and that is that point. That's how you plot the scatter plot. I was just demonstrating how you plot the two numerical values.

This is an example of a regression output using Excel. There are so many things happening here, but what I want to draw your attention to is what we were talking about: your intercept and your slope. Your intercept is on the output here, so this is the value of b0. Your b1, which is your slope, is the coefficient next to square feet. We can take these values and substitute them back into the formula. Remember, our formula was y hat is equal to b0 plus b1 x, and our x is square feet. So for b0 we take the intercept coefficient and substitute it there, and for b1 we take the square feet coefficient and substitute it there.

The other thing I need to draw your attention to is that Excel also gives you the R output. Let's write it here: this is your coefficient of correlation. If I look at this coefficient, it says 0.76, or 76%. It is positive, so it's a positive relationship, and it's a strong positive relationship because the value is high. I will explain the different strengths of correlation just now.
The other measure that you need to be interested in is R square. This is what we call the coefficient of determination, and I will expand on it at a later stage. So Excel gives you many outputs: it gives you your coefficient of correlation and your coefficient of determination, and it gives you your slope and your intercept. But how do we calculate all this? We'll go into the technicalities of calculating, but before that, let's learn how to interpret.

Interpreting the intercept is not difficult; it's straightforward, because the intercept is where our x is 0. If we put x equal to 0, and I think we've done this before, then the average price will be 98. So when x is 0, the average price for this data is about 98,000 rand. But because we cannot have a square footage of 0, there is no practical application of this; it does not make sense here. That is why we do not encourage interpreting the y-intercept: it might lead to misinformation.

The slope we can interpret. If you remember, the slope is the estimated change in the value of y as a result of one additional unit of x. Remember that. So how do we interpret this 0.10977? What does it mean? Remember, our values are in thousands, so we must multiply by 1,000. It means the average value of a house increases by about 109 rand and 77 cents for each additional square foot of size. So if I add one square foot, starting from the intercept, the estimate is 98,000 plus about 109.
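As a rough sketch of how those two coefficients are used, the snippet below plugs in rounded values b0 = 98.25 and b1 = 0.1098, with prices in thousands. These rounded figures and the names `B0`, `B1` and `predict_price` are assumptions made for illustration; the Excel output itself carries more decimals.

```python
# Rounded coefficients, assumed for illustration (prices are in thousands).
B0 = 98.25    # intercept: estimated price when square feet = 0
B1 = 0.1098   # slope: change in price (in thousands) per extra square foot


def predict_price(square_feet: float) -> float:
    """Return the estimated price in thousands: y_hat = b0 + b1 * x."""
    return B0 + B1 * square_feet


# One extra square foot changes the estimate by the slope, whatever the size.
increase_per_sq_ft = predict_price(1501) - predict_price(1500)
increase_in_currency = increase_per_sq_ft * 1000  # convert thousands to units

print(f"Estimate at x = 0: {predict_price(0):.2f} thousand")
print(f"Increase per extra square foot: about {increase_in_currency:.0f}")
print(f"Estimate for 2,000 square feet: {predict_price(2000):.2f} thousand")
```

The same function gives the estimate for any house size, but it is only sensible for sizes inside the range of the sampled houses, which is why the intercept at zero square feet has no practical meaning.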
And that gives you the average increase for every house if we increase the size by one additional unit. How do we then estimate a value? Remember, this is our linear regression model. If we need to predict the price of a house with 2,000 square feet, we just substitute the 2,000 into the formula and calculate the average price. So for a 2,000 square foot house, it will be 317,850 rand, because you just substitute your 2,000 and solve the equation. Remember to multiply by 1,000 so that 317.85 becomes 317,850. That is the basic knowledge you need in terms of linear regression. Now let's go and do some calculation, and I hope that's the next slide.

At other times you might be asked to calculate what we call the total variation, the sum of squares of the errors, or the sum of squares of the regression, and these are the formulas. To calculate the total variation, we take each y observation minus the mean of y, square it, and sum. That gives you your total variation, or what we call the total sum of squares. To calculate the regression sum of squares, we take each estimated value minus the mean of y, square it, and sum. So you have to calculate the regression line, estimate the points, and then subtract the mean from every estimate before you can find your regression sum of squares. To calculate the error sum of squares, we take each observation minus its estimated value, square it, and sum. So you take your real observation and subtract the estimate, and that gives you the errors of your regression line. I'm not going to dwell too much on those formulas, but I'm going to show you how to calculate y hat; the rest is just summation. If you have your y observations, you can use the mean and your y hat estimates to calculate all of these in a straightforward way.
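For reference, the three sums just described can be written as follows. The abbreviations SST, SSR and SSE are the conventional names and are assumed here; the last line is the standard identity for a least-squares line with an intercept.

```latex
\begin{aligned}
SST &= \sum_{i=1}^{n} (y_i - \bar{y})^2 && \text{total variation} \\
SSR &= \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 && \text{variation explained by the regression} \\
SSE &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 && \text{unexplained variation (errors)} \\
SST &= SSR + SSE
\end{aligned}
```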
So what does the total variation mean? It measures the variation of your y values around their mean. That is what it does. Your regression sum of squares is the part of the total variation that is explained, that is, attributed to the relationship between your x values and your y values. When you have the regression line, we can calculate it. But sometimes there are errors in between, and we need to account for them. If the regression line captures all the points, so that they fit perfectly, then you will not have errors. But usually you will. Those errors come from additional factors, other than those attributed to the x values, and we can calculate them too. They are the unexplained variation.

OK, so how do we interpret the coefficient of correlation? Remember, correlation measures the strength and the direction of the relationship. Its value lies between minus one and one; it cannot go outside that. What do the values mean? If it is more than zero, so positive, we say it is a positive correlation: when the value of x increases, the value of y increases. If it is less than zero, it is a negative correlation: when the value of x increases, the value of y decreases. Now I'm going to explain in detail almost every value that you can get. If it's minus one, we say it is a perfect negative relationship. If r is between minus one and minus 0.6, we say it is a strong negative relationship. So a value of size 0.8 is a strong relationship.
Remember the Excel output we looked at: it was 0.76. So that is a strong relationship, and a strong positive relationship because the value is positive. Going back to the negative side: if r is between minus 0.6 and minus 0.3, we say it is a moderate negative relationship. If it is between minus 0.3 and 0, we say it is a weak negative relationship. If it's 0, we say there is no correlation; there is no relationship. The positive side is the same. It starts with a weak positive relationship if r is between 0 and 0.3, then it is moderately positively related between 0.3 and 0.6, and above 0.6 it is strongly related, a strong positive relationship between your x and your y variable. When r is equal to 1, we say it is a perfect positive relationship. And that is how you interpret your correlation value.

The other measure is r squared. If I have the value of r, I can just press the x squared button and calculate my coefficient of determination. Remember that. On your calculator, you should be able to calculate r and then use x squared to get your coefficient of determination. Sometimes they might ask you to calculate the coefficient of determination and give you the formula for it, which is quite complex. But if you have the value of r, you just take the square of that value. And if they give you the coefficient of determination and ask you for the correlation, you just take the square root, and that gives you the value of r. So you just need to know how to use your calculator to move between the two. So what is the coefficient of determination? It is the portion of the total variation in y that is explained by the variation in your independent variable. We'll go into explaining it further.
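The ranges above can be sketched as a small helper. The function name `describe_correlation` is hypothetical, and how to treat a value that falls exactly on 0.3 or 0.6 is an assumption, since the session does not say which band the boundaries belong to.

```python
def describe_correlation(r: float) -> str:
    """Label a correlation coefficient using the 0.3 and 0.6 cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no correlation"

    direction = "positive" if r > 0 else "negative"
    size = abs(r)

    # Boundary values (exactly 0.3 or 0.6) go to the stronger band here;
    # that choice is an assumption, not something stated in the session.
    if size == 1:
        strength = "perfect"
    elif size >= 0.6:
        strength = "strong"
    elif size >= 0.3:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(describe_correlation(0.76))   # the Excel example
print(describe_correlation(-0.45))
```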
So it accounts for the part of the total variation that is explained by the independent variable. Like I said, r squared is your coefficient of determination: if you have the coefficient of correlation, you just square it. And since it is a portion of the total variation, as a formula we take the regression sum of squares divided by the total sum of squares, the total variation. What you also need to remember is that the coefficient of determination lies between 0 and 1. It can never be negative.

How do we interpret it? If r squared is equal to 1, it shows that the relationship is perfect. Whether it's a negative or a positive relationship, the interpretation is the same, because r squared can never be negative: squaring a negative r or a positive r gives a positive value. So regardless of the direction, you always interpret it this way: since it is equal to 1, 100% of the variation in the values of y is explained by the variation in the values of x. You will not always have a 100% relationship. If we look at this graph, we can see that the relationship is weaker, whether positive or negative. If we calculate r squared here, we might find that it is 0.38. Then we say that some, but not all, of the variation in the values of y is explained by the variation in x. With r squared equal to 0.38, only 38% of the variation in y is explained by the variation in x. If r squared is equal to 0, then r is also equal to 0 and there is no relationship. The value of y does not depend on x, and none of the variation in y is explained by the variation in x.
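In symbols, the ratio described above is the following, where SSR stands for the regression sum of squares and SST for the total sum of squares (conventional abbreviations, assumed here):

```latex
r^2 = \frac{SSR}{SST}
    = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2},
\qquad 0 \le r^2 \le 1
```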
For example, let's look at this. Suppose that the correlation coefficient between a person's salary and his or her educational attainment is equal to 0.6395. That is the coefficient of correlation. To determine the coefficient of determination, we just take this and square it: 0.6395 squared. On your calculator, you enter 0.6395, press the square button, press equals, and it gives you 0.40896. If we round it off, it is 0.4090. That's how you calculate the coefficient of determination. To interpret it, we say approximately 41% of the variation in a person's salary can be explained by the variation in his or her educational attainment, if we use the best-fit linear regression model. Or we could just say 41% of the variation in a person's salary is explained by the variation in their educational attainment.

And that, my good people, concludes what I needed to share with you on the basic concepts of regression. I'm going to show you how to do the calculations just now. What have we discussed so far? How to use regression to predict a value, and the meaning of the regression coefficients. We interpreted our b0 and our b1. We made inferences about the slope, which is our b1. And we looked at how to calculate and interpret the coefficient of correlation and the coefficient of determination.

Now the journey begins. Let's see how to answer linear regression questions in the exam using your scientific calculator. Let's say this is our data, our x values and our y values, and they ask us to choose one of the following options. Since we all have our calculators, I'm just going to show you. Let's see what follows next. Oh, sorry, I must explain something first.
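The calculator steps for moving between r and r squared amount to the few lines below, using the 0.6395 from the salary example. The variable names are illustrative only.

```python
import math

r = 0.6395                 # coefficient of correlation from the example
r_squared = r ** 2         # the "x squared" key on the calculator

print(f"r squared = {r_squared:.4f}")
print(f"About {r_squared:.0%} of the variation in y is explained by x")

# Going back: the square root recovers only the size of r.
# The sign has to come from the direction of the slope.
r_size = math.sqrt(r_squared)
print(f"|r| = {r_size:.4f}")
```

Because squaring throws away the sign, a question that gives r squared and asks for r also needs the sign of the slope (or the scatter plot) to decide between the positive and negative root.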
So in the exam, you will actually see questions like this. The first option asks what the relationship between x and y is. You can only find the relationship if you have calculated the correlation. Remember that. Or, if you have calculated the slope, its sign will tell you whether the relationship is negative or positive. Here they're asking you to confirm whether this is the slope, so you need to know your formula: y hat is b0, which is your intercept, plus b1 x. Once you calculate b0, is this the value of b0? Once you calculate b1, is this the value of b1, which is your slope? Then the next option asks, is this the regression line? Check the slope and the intercept against it. If I go here, it uses 0.9 as b0, but our b0 is not 0.9. And it uses 1.6 as the slope, b1, but our b1 is not 1.6. Is any of this true? You need to choose the correct one. Option number five says that for each unit increase in the value of x, the value of y will increase by 1.6. So if we increase x by 1, what is the average increase in the value of y? It will mean that plus that, because x will be 1: we put one there and say 1.659 plus 0.932. If it gives you this value, then the option is correct. If it doesn't, then it is not right. So this is predicting the value of y hat for the new x.

OK, let's look at how we do this manually, because sometimes you will have to. To do it manually, we first calculate the summations, which means adding up all the values. We add our x values and find the total, and we add our y values and find the total there. Then the next step: remember, this is our regression line, so we need to calculate b0 and b1. To calculate b1, which is the slope, we use this formula.
It's not as straightforward as the one we used to get in maths. We use the summation of x times y, minus the summation of x multiplied by the summation of y divided by n, and everything is divided by the summation of x squared minus the summation of x, squared, over n. That will calculate the slope. Then to calculate the intercept, we take the mean of y minus b1, which is the slope, times the mean of x. So we calculate the slope first, multiply it by the mean of x, and subtract that from the mean of y. To find all this information, we need to do something with these values. To calculate the mean of y and the mean of x, the formula is the sum of the values divided by how many there are. And we need several sums, like the sum of x and the sum of x squared. The sum of x is these values added up, and the sum of y is those values added up, so they are easy to calculate: the sum of x is 19 and the sum of y is 26. Our n is how many there are: one, two, three, four, five. So n is easy to find.

To calculate the sum of x squared, we quickly square each x. So four times four is 16, two times two is four, and so on. Then we add all of them. That is our sum of x squared, which we substitute into the formula there. We also need the sum of x times y, which means multiplying each x by its y. Four times five is 20, two times three is six, and so on until we get to the end, and then we add all of the values. That total becomes our sum of xy, and we substitute it into the formula. Then we are able to calculate the slope. For the means, remember it's the sum divided by how many there are: the sum of x is 19, divided by 5, and the sum of y is 26, divided by 5.
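The manual steps above can be sketched in a few lines. The five x and y values are the ones keyed into the calculator later in the session; the variable names are illustrative.

```python
# Data from the worked example (the same values entered on the calculator).
x = [4, 2, 6, 4, 3]
y = [5, 3, 7, 6, 5]
n = len(x)

# The sums needed by the slope formula.
sum_x = sum(x)                               # 19
sum_y = sum(y)                               # 26
sum_x2 = sum(v * v for v in x)               # 81
sum_xy = sum(a * b for a, b in zip(x, y))    # 107

# Slope: (sum xy - sum x * sum y / n) / (sum x^2 - (sum x)^2 / n)
b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)

# Intercept: mean of y minus slope times mean of x
mean_x = sum_x / n
mean_y = sum_y / n
b0 = mean_y - b1 * mean_x

print(f"b1 (slope)     = {b1:.4f}")
print(f"b0 (intercept) = {b0:.4f}")
```

Working from the sums rather than from the raw list mirrors what the calculator stores, so the intermediate values can be compared with its sum menu one by one.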
And then we go back to the formula and calculate. The other complex formula that also uses those summations is the one for your correlation. This is the formula for the correlation: n times the sum of xy, which means n times 107, minus the sum of x times the sum of y, which is 19 times 26, divided by the square root of n times the sum of x squared, and remember the sum of x squared is 81, minus the sum of x, which is 19, squared, times n times the sum of y squared, which is this 144, minus the sum of y, which is 26, squared. That gives you the coefficient of correlation.

Then to calculate the coefficient of determination, you can use this formula, or, if you have already calculated the coefficient of correlation, you just square it. If you do use this formula, we need the estimated values, so we need to go back to the regression line and estimate each value of y from it. After we have calculated b0 and b1, we substitute them back into the formula. Then we take our x value: you take four, substitute it into the formula, and you get your y hat. I've already calculated this, which is why I know my value of b1 and my value of b0, so don't ask me yet how I got the y hat of 5.39. I'll show you how to calculate the slope and the intercept; this is just for demonstration purposes. So you calculate your b0 and your b1 and substitute them back into the regression line. Then you take your four and substitute it for x, so that you can estimate the value of y hat, which is about 5.39. Then you substitute two for x and it gives you three point something. And those are the y hat values you use for your coefficient of determination.
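A minimal sketch of the correlation formula just read out, using only the sums quoted in the session (n = 5, 19, 26, 107, 81 and 144). The variable names are illustrative.

```python
import math

# Sums quoted for the five-point example.
n = 5
sum_x, sum_y = 19, 26
sum_xy = 107
sum_x2, sum_y2 = 81, 144

# r = (n*Sxy - Sx*Sy) / sqrt((n*Sx2 - Sx^2) * (n*Sy2 - Sy^2))
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
r = numerator / denominator

print(f"numerator   = {numerator}")
print(f"denominator = {denominator:.4f}")
print(f"r           = {r:.4f}")
```

For this particular data set the x values and the y values happen to have the same spread, so r comes out equal to the slope; that is a coincidence of the numbers, not a general rule.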
I know I'm talking too much. So you substitute into this formula, and remember it is a summation. You would have calculated the mean of y. For the top part, you take the first estimate minus the mean of y, plus the next estimate minus the mean of y, plus the next one minus the mean of y, squaring each answer as you go along. For the bottom part, which is your y observation minus the mean, you do each y minus the mean, squared. Remember to square each answer. That gives you the total variation, and the ratio of the two gives the coefficient of determination. It's a long process if you go through it manually, and that is why I want to show you the calculator instead.

So let's use the calculator to answer this question. These are the Sharp scientific calculator steps. If you have a Sharp calculator, you follow these steps. You press the mode key and take your calculator to stat mode by pressing one, and then you press one again for the regression line, which the calculator calls Line. So when you select Line, that is the regression line, and your calculator will show Stat 1 on the screen. Now you can capture the data. You need the STO key and the M plus key on your calculator. You enter four, press STO, enter five, and then press M plus, and the calculator shows data set equals one. Then you continue like that for the rest of the values. Since those of you who are here today are not using the Sharp calculator, I don't have to go through each step in detail.
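Before going further into the calculator keys, the long manual route to the coefficient of determination described at the start of this part can be sketched as follows. It uses the five data points from the example; the names `sst`, `ssr` and `sse` are conventional abbreviations assumed here.

```python
x = [4, 2, 6, 4, 3]
y = [5, 3, 7, 6, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least-squares slope and intercept (deviation form of the usual formulas).
s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
s_xx = sum((a - mean_x) ** 2 for a in x)
b1 = s_xy / s_xx
b0 = mean_y - b1 * mean_x

# Estimated value for every observed x.
y_hat = [b0 + b1 * a for a in x]

# Top part: estimates minus the mean of y, squared and summed.
ssr = sum((est - mean_y) ** 2 for est in y_hat)
# Bottom part: observations minus the mean of y, squared and summed.
sst = sum((obs - mean_y) ** 2 for obs in y)
# Errors: observations minus their estimates, squared and summed.
sse = sum((obs - est) ** 2 for obs, est in zip(y, y_hat))

r_squared = ssr / sst

print("y hat:", [round(est, 3) for est in y_hat])
print(f"SSR = {ssr:.4f}, SSE = {sse:.4f}, SST = {sst:.4f}")
print(f"r squared = {r_squared:.4f}")
```

The result agrees with simply squaring the correlation coefficient, which is why the calculator shortcut is safe to use when only the final value is asked for.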
But for your Sharp calculator, let me see if the next slide shows the steps. The heading here does not say Sharp; I will change the heading just now. You will be using the alpha button for all the green labels that are written here. On the division sign there is the r, the coefficient of correlation. It's linked to the division sign, so after you have stored your values, you press alpha and division and it gives you your r. The open bracket has the value of a and the closed bracket has the value of b. Now, remember our formula is y hat equals b0 plus b1 x. On your calculator, this is what you will see instead: y equals a plus bx. So alpha open bracket will give you a, your intercept, and alpha closed bracket will give you b, your slope. That is for the Sharp calculator.

So let's go and look at the Casio calculator. Please note the heading there: this one is for the Casio. For the Casio, you do the same. I expect you to follow me on your calculator, since you have your calculators with you. You press mode, and it shows you a screen that looks like this. You select stat mode, and it shows you a screen like this. You press number two, which is that one: a plus bx. You must write it down: y equals a plus bx. And next to it write y equals b0 plus b1 x, so that when you go back to the question you don't get confused. When you get the value of b, you know that it relates to b1. When you get the value of a, it relates to b0.
So we press two for A plus BX, and your calculator is ready to capture the data, because it will show a table that looks like this, which has the X and the Y values. Now to capture the values, you can start capturing one value at a time. Let's say you want to capture the X first. You must be very careful, since each X value relates to a Y value. So when you capture the X values first, you're going to say four equal, two equal, six equal. Then you continue and say four equal, and then three equal. Let's say we've captured all the values of X. Now we want to capture the values of Y. All you need to do is use the arrow and go right to the top. So you will scroll with your arrow. You will move your arrow to the left, and then you will go right to the top. You will go up, up, up until you get to number one. And when you get to number one, you start capturing as well. Five equal, three equal, seven equal, six equal, five equal. And you will have captured all your values. Okay, so this is me explaining it the long way. Once you are done, you can press the AC button on your calculator, sorry, not the on and off, the AC button. Once you press the AC button, you can press the shift button and press one. And you will have this menu. We are interested in the Reg. But before we go to the Reg, there are two things. Remember, there were those sums in the formulas. The sums you will find on number three. And on number five, you will find the regression information. So if you press five, you will find A on number one, which is your intercept, and your B on number two, which is your slope. Number three will be your r, the coefficient of correlation, number four will be your estimate of the value of X and number five your estimate of the value of Y. I will show you just now. I will do a practical exercise with you.
So let's go back to the question. I've shown you how to use your calculator now. So I'm going to go back to the original question, which is this one. I don't think we will have time for exercises, but I just wanted to show you how to use your calculator. So I must end this slide mode so that we can have the calculator here with us. Come on, what did I do now? Sorry, it's opened here. I didn't see it. Oh, come on. How do I do this now? I want to drag it. It's wasting my time now. There we go. I think it's going to be very clear now. Okay, so I hope you are able to see the calculator, and I've moved the calculator to the top of the screen. Now let's capture the data. We first need to take our calculator to that mode. So we press the mode, and on this calculator stat is not on two, it's on three. So I will press three for the stat mode, and there is A plus BX, which is two. And I have my table there. Remember those steps that we needed to follow. Now I'm ready to capture the data. So I will say four equal and carry on, two equal, six equal, four equal, three equal. So there were five. One, two, three, four, five. There are five. Now I must go up, so I must come to this side. You press the arrow to the right and then we go up, up, up, up, up until we get to number one, and then we start: five equal, three equal, seven equal. You must make sure that all the lines are aligned. Six is aligned with four. And five equal. Now I have my data stored. So I can press the AC button. Then I need to press shift and go to stat, which is on button number one. And I have my menu. Before I go to the regression, I want to show you those sums. If I press three, you can see them here. So I can use these to substitute into the formula if they gave me the formula, or if they want me to validate some of the questions that have those formulas.
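As a cross-check on the sums the calculator stores, here is a minimal Python sketch. It assumes the five (x, y) pairs keyed in during the session; the variable names are illustrative, and the slope and intercept formulas are the standard computational ones, not necessarily the ones on the slide.

```python
# (x, y) pairs as captured on the calculator in the session.
x = [4, 2, 6, 4, 3]
y = [5, 3, 7, 6, 5]
n = len(x)

# The five sums that the calculator keeps under the "sum" menu.
sum_x = sum(x)
sum_y = sum(y)
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 19 26 81 144 107

# Substituting the sums into the standard computational formulas
# (assumed here; the slide's formula is not visible in the recording).
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
b0 = sum_y / n - b1 * sum_x / n
print(round(b1, 3), round(b0, 3))  # 0.932 1.659
```

If a question gives only the sums rather than the table, the last two lines are all that is needed.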
So for example, I have a lot of other exercises. I'm just going to scroll down to the exercises to see if there is one where you need to look at the sums. Let's see. No, on this one you are given all the sums, and the question is on the linear regression. But sometimes they might give you the table, or give the answers as sums. So you will just come here and look up the sums for those questions. So let's go back to our question. Okay. I'm going to go back to shift, stat. Then I will press five for regression, and you will see there my A, which is my intercept, my B, which is my slope, and my r. I'm not going to use the X estimate, but I'm going to use the Y estimate to answer question number five. I cannot answer question number one yet, because it says what is the relationship? I'm going to wait a bit before I answer that, because the value of my slope or the value of my r will tell me whether this is positive or negative. So for now let's calculate the intercept. Our intercept is B zero, and on here it's A. Remember that I cannot write when it's in the view mode, so you will just stay with me on this one. My intercept is A, so I'm going to press one and press equal, and it says my intercept is 1.659. So therefore it means number two is incorrect. Then I must calculate my slope, to see if that is the value of the slope. I can already see that this is not the value of the slope, because this is the value of my intercept, but let's just calculate it. On your piece of paper you can write this value down, or you can even start substituting into the formula. So you will put 1.659 into the formula there. Then we go and calculate the slope, which is shift, one, and we go back to Reg and we press two and we press equal, and this will give us 0.932 if we round it off. So this is not the slope, it's not that. And we can substitute it into the formula, so this will be 0.931, or 0.932, x.
So if I come to number four, it says the regression line is this, and I know that B0 was 1.659 and B1 was 0.932, so number four will be the right one. To estimate the Y, I'm going to go out and go back to shift, stat and the regression menu. I need to estimate the new value of Y, which is this value. To do that I have to tell the calculator what value I am estimating for. So I'm getting out of the menu, because you first need to put in the value you want to estimate. Because I'm estimating at the value of one, I am going to press one first, then shift and one, then five for regression, then five again for the estimate of Y, and press equal, and that will estimate the value at one. If you want to test that this is the answer, you can just substitute one where you see X and solve the equation. You will see that 1.659 plus 0.932 times one will give you 2.591, and that's what you will see when you calculate it manually. So we've answered number two, number three, number four and number five. We are left with what is the relationship? Is it negative or positive? It is positive, and we can use the value we found for the slope, because if the relationship was negative your slope would also be negative. But for argument's sake, let's go check it with the coefficient of correlation. So we go back, shift, and we press one and we press five and we go to the r, which is our coefficient of correlation, on number three, and press equal. That gives us 0.932. It just so happens here that the value of our slope is the same as the value of our coefficient of correlation. And if we take this value and press the X squared, we will be calculating what we call the coefficient of determination. And it means 87% of the total variation that is seen in the values of Y is explained by the variation in the values of X.
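The calculator results above can be reproduced with a short Python sketch. It assumes the same five pairs captured earlier; the function name `simple_regression` is made up for illustration and is not something the tutor uses.

```python
from math import sqrt


def simple_regression(x, y):
    """Return (b0, b1, r) for the least-squares line y_hat = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b1 = sxy / sxx              # slope (B on the calculator)
    b0 = mean_y - b1 * mean_x   # intercept (A on the calculator)
    r = sxy / sqrt(sxx * syy)   # coefficient of correlation
    return b0, b1, r


x = [4, 2, 6, 4, 3]
y = [5, 3, 7, 6, 5]
b0, b1, r = simple_regression(x, y)
y_hat_at_1 = b0 + b1 * 1   # estimate of Y when X is one
r_squared = r ** 2         # coefficient of determination

print(round(b0, 3))          # 1.659
print(round(b1, 3))          # 0.932
print(round(y_hat_at_1, 3))  # 2.591
print(round(r, 3))           # 0.932
print(round(r_squared, 2))   # 0.87
```

The slope and r coincide for this data only because the spread of the x values and the spread of the y values happen to be equal; in general they differ.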
And that is how you interpret the coefficient of determination. And if I want to go back to my coefficient of correlation, I just press the square root of the answer, equal, and it takes me back. And this tells me there is a strong positive relationship between the value of X and the value of Y. Since we have 15 more minutes, I want to do another exercise. To do another exercise I need to clear my calculator of any stored values that are in here. Since I'm still trying to figure out how to clear the values on this calculator, because it doesn't work well, I'm just going to go out of the stat mode, go back to comp, and come back to the stat mode again. So let's go look for another calculation. Not this, not that, not this, not that, not this, not that. We need to go to the exercises. Okay, so let's look at this. This is a good example of an exercise that we can do. If we look at this question, they don't even expect us to do a lot of calculations. The question has the X and Y values. We're not going to put them into the calculator for now, but they have the values of your X and your Y in decimals. So this is 2.6, 2.6, 3.2, which relate to the Y values that are 5.6, 5.1, 5.4. And they calculated the sum of X and the sum of Y, so that you are able to calculate your mean of X and your mean of Y easily, because they don't want you to spend more time doing a lot of calculation. The other thing that they did as well is to calculate the coefficient of correlation for you. And the question they're asking is to find which one of these statements is incorrect. So how do you answer this question? The first one talks about a positive relationship between X and Y. You're going to look at the coefficient of correlation. Is this value positive? If it's positive, then it's a positive relationship. Not whether it is strong, because they are just asking whether it is a positive relationship. So this will be correct.
So you don't have to do any calculations to find that. If you look at the value, is it positive? Then it's a positive relationship, because remember this is a coefficient of correlation. The second statement says the mean of Y is 5.043. So you have to take 35.3 and divide by one, two, three, four, five, six, seven. So we go 35.3, divide that by seven. Did I say seven? Two, four, six, seven. Yes, divide by seven, and that gives us 5.043 if I round it. So therefore it means this number three is also correct. Then it also gives us that the coefficient of determination is this. Now the coefficient of determination, remember, is R squared. So we're going to take that value. For the coefficient of determination we're going to say 0.37 and we take the square of this answer and say equal, and that is the coefficient of determination. And if I look at the answer that they have, which is 0.5719, it is not the same as 0.16, therefore that will be incorrect. And it also says the regression coefficient B1, which is our slope, is positive. Since our correlation coefficient is positive, we can also conclude that our slope, which is B1, is positive, because they are related. If the slope is negative, the coefficient of correlation will also be negative. Then the other statement says only 10% of the variation in Y is explained by the variation in X. Is it true? Yes, it is true, because this is your coefficient of determination, and to interpret the coefficient of determination I'm going to multiply it by 100, because the answer there is in percentage. So if I multiply the answer by 100 it becomes 10.79, which is 10.7, so this explanation is also correct, because the coefficient of determination is 10.7%, which says that much of the variation in Y is explained by the variation in X. And that's how you answer the questions. Let's look at another example in the last 10 minutes. Okay, so this one you will have to go and answer on your own. They just want to find out which one of the following statements is correct.
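The two quick checks in this exercise can be sketched in Python. This uses only the figures spoken in the session (the sum of Y, the count of seven, and the 10.79 read off the calculator); the value of r on the slide is not clearly recoverable from the recording, so it is left out rather than guessed.

```python
# Mean of Y from the sum given in the question.
sum_y = 35.3
n = 7
mean_y = sum_y / n
print(round(mean_y, 3))  # 5.043

# Coefficient of determination as read off the calculator in the session
# (the calculator value times 100 was 10.79), expressed as a percentage.
r_squared = 0.1079
percent_explained = r_squared * 100
print(round(percent_explained, 2))  # 10.79
```

Roughly 10% of the variation in Y is explained by X, which is why the "only 10%" statement is accepted as correct.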
So you will need to go and understand the basic concepts of your linear regression. Statement number one talks about which variable is your dependent. Remember, going back to how I explained it, your X will be your independent and your Y will be your dependent, because it's something that we want to predict. And you're also going to look at the values of your R. What does R close to zero mean? What does R close to one mean? What does a positive relationship mean? What does R close to minus one mean? So you will have to go and look at every statement and see which one is correct. And the last one, I hope it will be the calculation one. Yes, it is the calculation. With this one, they have given you the value of B zero and the value of B one. If they didn't give them to you, you would have to calculate them, but at this point they gave you those values. So there's no need for me to show you on the calculator. But for interest's sake, we can do that, and we can check if those values are correct. It's time consuming to capture the information, so let's see. Stat is three, then two for A plus BX. And I will do this quickly. It's five equal, four equal, three equal, six equal, nine equal, eight equal, 10 equal. And we go right, and then we go back to the top, back to the top. Seven equal, eight equal, 10 equal, five equal, two equal, three equal, one equal. So all the values should be captured. I can just go back up to see if all the values relate: nine and two, six and five, three and 10, four and eight, and five and seven. So I've captured all the values. And then I press AC, I go shift, one. Regression is five. And I'm going to check my B0, which is the value of A. I press one, equal. And that is 13.223. And I can see that that is correct. So that is my value of B0. So let's look at the regression lines that we have here.
So I expect the value without an X next to it to be my A. So if I look at number three, this will be incorrect. That one will be correct, because that is the value of B0. To calculate the value of B1, which is the one with the X, we go shift, one, five. And we press two and press equal. And that is minus 1.257. And I can see that that was correct, and this is correct because it's a negative. Therefore the correct answer we were looking for will be number two. But apart from that, the first one said the relationship is linear and it's positive. I can see that this is not true because my slope is negative. Therefore the relationship will not be positive. It will be negative. But just to confirm that, we press shift and press one and press five. And we're going to press button number three for the r and press equal. And you can see that our coefficient of correlation is minus 0.99. This tells me it is close to minus one, and it is a strong negative relationship between the value of X and the value of Y. The relationship is negative. Then number four, you see it says if X is equal to two, the estimated value from your regression line will be minus 25. So we can calculate that. I will have to go out, because I am going to be doing the estimation. So I'm going to press two first. You can also go to this formula, because we said number two is the right one. So you can use number two, and where there is X, you put two and calculate it. And you will see that you get the same answer as I do. So shift, one, five for regression. And we go and press five and press the equal sign. And that gives us 10.70. And you can see that minus 25 is not correct. This should have been the answer. And you can also see that it couldn't be minus 25, because the slope is 1.25. If I multiply by two, it will be about 2.5. And if we subtract that from 13, it wouldn't be minus 25.
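This exercise can also be checked in Python. The seven (x, y) pairs below are an assumption: they are reconstructed from the values keyed in and the pairs read back during the recording, and they do reproduce the slope of minus 1.257 and the estimate at X equals two. The helper name `predict` is illustrative.

```python
from math import sqrt

# Pairs reconstructed from the recording (an assumption, not the slide itself).
x = [5, 4, 3, 6, 9, 8, 10]
y = [7, 8, 10, 5, 2, 3, 1]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)

b1 = sxy / sxx              # slope
b0 = mean_y - b1 * mean_x   # intercept
r = sxy / sqrt(sxx * syy)   # coefficient of correlation


def predict(x_new):
    """Estimated Y on the fitted line for a given X."""
    return b0 + b1 * x_new


print(round(b0, 3), round(b1, 3), round(r, 2))  # 13.223 -1.257 -0.99
print(round(predict(2), 2))                     # 10.71
```

The calculator display read out as 10.70 is the same estimate before rounding, and it is nowhere near the minus 25 claimed in the statement.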
So this will be the right one. The last one says an X value of 11 resulted in an estimated value of minus 1. So you can do the same. Now we're going to estimate again, but using 11. You replace X by 11 and calculate and see if you get the same answer. So I'm going to do the same. 11, shift, 1, 5, and 3, no, not 3. So I must start again. 11, shift, stat, number 1, 5 for regression and 5 for the estimate. So it should say 11 Y hat, and I press equal. And the answer I get is minus 0.60. There they have minus 1.8. So that is also incorrect. As we said, we know that number two was the correct one. And that is how you are going to answer the questions in the exam if they give you a table like this and they expect you to answer quickly. But you will see that in the exam, sometimes they save you time. You don't have to capture all the information, because if they have given the values to you, you just substitute them into the formula and calculate. But in case they give you the table, just make sure that you know how to use your calculator to calculate the values and answer. And if I close this and go back to my presentation, the point is that instead of doing all these complex calculations in the exam, you use your calculator to minimize the time you spend calculating all the sums. OK, so let's conclude. Do you have any questions? Because today, there are no exercises for you to do. Any question? Because time is up. If there are no questions. Just one from me, ma'am. Yes, OK? If you don't know how to use your calculator, do they give you enough time to do the manual calculation? No, you do not get enough time. If your exam is two hours, it's two hours. So what will happen in the exam, actually? We still need to get the structure of the exam. I'm going to give you a scenario of how last year's exam went. Last year's exam was two hours as well.
But because they had a lot of difficulties, they gave extra time at some point, not hours, minutes, 30 more minutes. The exam was split into two. So there was the first part of the exam and then the second part of the exam. What they did was to split the exam into a manageable number of questions, so that if something happens, you don't lose your first part of the exam. You only have an incomplete second part. The first six chapters were part of session number one, which was almost an hour and a half, or an hour. I think it was an hour. So you have an hour to write those questions from chapter one to chapter six. It's half of the exam questions. Maybe there were 11. Let's say, roughly, 11 questions in that part. So you have an hour to answer those 11 questions. When the time is up, you move to the second part. Immediately when the first one closes, it opens up the new one. Then you move into the new one and you start your one hour with the next part. And you must remember that the next part is from chapter eight onwards, which has the most complex questions that you have to deal with. There are a lot of calculations there. So it will be from chapter eight, which also has about 11 or 12 questions. Then you have an hour to answer those ones as well. But you do not get an extra minute. So you just need to make sure that you work smart. That is why we show you how to use the calculator, to minimize where you have to do a lot of manual calculation. Your exam is only coming up in November. If you start practicing now, most of the exam questions are almost the same. It's just that instead of Seaport, they will use Paul. Instead of Paul, they will use Mary. Instead of Mary, they might say Shoprite. So don't get caught up in the context or the naming in the question. Look at what you are given. What do you need to calculate?
Oh, now I'm already in the exam preparation mode, because we're doing it tomorrow. Tomorrow we'll go into detail on that. But make sure that you prepare and you practice. That is why, from tomorrow onwards, we'll start with the exam preparation. You will have to practice how we write the online sessions. I will make them timed sessions. But I won't give you a lot of questions. Maybe I will give you five questions per assessment. So maybe it will cover chapter one till chapter six, but there will be five questions, maybe one from each chapter, so that you have enough time and you can time yourself and see if you are able to complete that. But not now. At a later stage, we will do the timed ones. At the moment, I will give you enough questions from chapter one till chapter six to do for that weekend, Friday and Saturday. And then the following week, we do the chapters from chapter seven onwards for that weekend, and we discuss those questions as well. And then closer to the time when you go and write the exam, I will do the timed exam. The other thing I need to tell you is that, even though you are so many, you will not all be writing the exact same question paper. But the questions will be of the same kind. So you might be getting a question that says Paul, and somebody else might be getting a question that says Pira. This is to prevent people from cheating or from asking other people to complete their exam questions. So you will have randomized questions, but they will try as much as possible to keep the standard of the questions the same. You will be writing from chapter one, chapter two, chapter three, chapter four, and that will be the format of the paper. Your questions will be asking the same thing but in a different manner. So your option number one on question one might not be the same as option number one on question one for the other person.
So you must be very careful. Don't think that you're going to hire someone, or get someone to write for you, or compare your answers with another person. It might not be the same. You might both be writing about Paul and Mary, but maybe they change the numbers. Instead of giving 100 on this question, they give 200, and on the other person's paper the value might be 50 while on yours the value might be 57. So you will not be writing the same paper. It also depends on which group you are allocated to, because they might split you into different sessions as well. You might have group A, group B, group C, group D, like you have with tutorials, where there are other e-tutorial groups for other students. So you will also be clustered into exam groups. I hope I'm not scaring you, I'm just preparing you. As much as we're going to prepare for the exam and go through all the past exam papers that I have access to, it is your responsibility to make sure that you do the work, so that when we come and do the discussions on a Friday or Saturday, you can raise any concerns that you have with the questions and we can help you solve those problems. Every week, once I've created those assessments, I will download the results of every individual. But you can see that we are only 12 in this group. How many are we today? Today we are only one, two, three, four, five, six, seven. There are only seven of you, and I'm not sure how many people will be writing those assessments. But if the seven of you can complete those assessments, I will have the results and see where you went wrong, and then we're going to discuss those questions. Even if everybody has answered the questions correctly and got 100%, we're still going to come back, and you're going to tell me where your challenges are. Where did you pick up challenges? And we will find more exercises and do them. There won't be a day where we just sit and there is nothing to do.
So there will be lots and lots of questions that we can go through, or I can also open up an exam paper and we go through that exam paper together, discuss it and see how we answered some of the questions. We will have something to do until you go and write the exam. I am here to make sure that you get 100% in your exam if possible. Okay guys, and that concludes today's session. So any other question? Maybe before the other question I can stop the recording because then it makes my life.