is the link in the chat for the register. Please make sure that you complete the register, and then we can start with today's session. Today we're going to be discussing measuring the relationship between two numerical variables, which is correlation and regression. We're going to do both of them today. Welcome to your session seven. I think this part we already covered, but if you have any question or comment relating to content, ask; if there are none, then we can start with this session.

So in terms of the session, we're going to look at regression, but also more specifically at correlation. The requirements: you don't need the statistical tables so much, but if you are required to use a statistical table to do a regression test, you need that; you need the formulas and you need your calculator. By the end of the session you should be able to build a regression model and estimate the regression coefficients, and more specifically you should be able to calculate your slope and your intercept, calculate the correlation, and also interpret the correlation.

To start off with: here we're talking about the relationship between two numerical variables, and correlation is one of the measures of relationship. This you should have learned in 1501; in 1502 you just need to know how to calculate it, not so much how to interpret it. But some students are doing 1501 and 1502 at the same time, and I know that with 1501 you wouldn't by now have read through correlation and regression, because they are the last chapters of your study unit. So I'm going to start off with an introduction of the concepts at a high level, going through them as quickly as possible, and then we're going to get into how we do the calculations as well.
So in terms of the relationship, we can visualize the two numerical variables in a scatter plot so that we can look at their relationship. A scatter plot can be used to show the relationship between two numerical variables, and correlation analysis measures the strength of that relationship or association. So the scatter plot shows you visually how the relationship looks. The correlation coefficient gives you a numerical value that tells you the strength of that relationship, and it also gives you the direction of the relationship. Correlation is only concerned with the strength and direction of the relationship; no causal effect is implied when we look at correlation.

There are different types of relationships. They can be linear relationships: when the values of x increase the values of y increase, or when the values of x decrease the values of y increase, things like that. Those we call linear relationships, and they can be positive or negative. They can also be curvilinear relationships, which are what we call quadratic relationships, or exponential relationships. During COVID we used to see more graphs displaying an exponential relationship.

There can also be no relationship between the two measures that you are looking at. That is when the data is just scattered around everywhere, or when it has a constant spread where one variable stays the same: maybe y stays flat, just around 3 and 4, 3 and 4, so even when the values of x are increasing the values of y stay constant. Sometimes it can be that the value of x stays constant but the value of y increases.
So if the value of x stays constant at one point while the values of y keep increasing, there is no relationship between your x and your y values.

In terms of the calculation of the correlation coefficient: previously we looked at the scatter plot, where the dots represent the points in terms of the values of x and y. We can calculate the correlation coefficient, which will tell us the strength of that relationship, and we use this formula. Later on we're going to look at some examples of how we apply it. You need to know that this formula calculates your coefficient of correlation, which is r. The numerator is n times the sum of xy (because we're looking at paired x and y observations here) minus the sum of x times the sum of y. That is divided by the square root of two brackets multiplied together. The first bracket is n times the sum of x squared minus the square of the sum of x. Those two are different. If I have the x values 1, 2 and 3, the sum of x says I must add them, which gives me 6, so the square of the sum of x is 6 squared, which is 36. The sum of x squared is 1 squared plus 2 squared plus 3 squared, which is 1 plus 4 plus 9, which gives me 14. So those two are different and you need to treat them differently. The second bracket is n times the sum of y squared minus the square of the sum of y, so you do the same with the y. Later on we will get into more detail on how we calculate the coefficient of correlation.

The value of r is always going to be between negative 1 and 1, in decimal format, so it can be 0.99, 0.98, minus 0.99, minus 0.98, minus 0.75 and so forth.
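The formula described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the module material: the function name `pearson_r` and the small y lists are assumptions made up for the example, while the x values 1, 2 and 3 are the ones used in the explanation. It shows that the sum of x squared and the square of the sum of x are different quantities, and how both enter the formula for r.

```python
import math


def pearson_r(xs, ys):
    """Coefficient of correlation r computed from the raw sums."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)  # sum of x squared
    sum_y2 = sum(y * y for y in ys)  # sum of y squared
    numerator = n * sum_xy - sum_x * sum_y
    # sum_x ** 2 is the square of the sum, which is NOT the same as sum_x2
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


xs = [1, 2, 3]
square_of_sum = sum(xs) ** 2             # (1 + 2 + 3) squared
sum_of_squares = sum(x * x for x in xs)  # 1 + 4 + 9
print(square_of_sum, sum_of_squares)     # 36 14

# Hypothetical y values: one set rising with x, one set falling with x.
r_up = pearson_r(xs, [2, 4, 6])
r_down = pearson_r(xs, [6, 4, 2])
print(r_up, r_down)                      # 1.0 -1.0
```

Note that if every x value (or every y value) is the same, the denominator is zero and the division fails, so r cannot be calculated for those flat or vertical patterns.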
If the value of r is bigger than 0, then we say the correlation coefficient is positive, and it means the relationship is positive: when the value of x increases, the value of y also increases. When your coefficient of correlation is less than 0, then the relationship is a negative one, which means that when the values of x are increasing the values of y are decreasing, so the relationship will look like that. If I look at this point and this point, you can see that where the value of x is small the value of y is big, and where the value of x is big the value of y is small, so when the values of x are increasing the values of y are decreasing. The first one, the positive one, looks like this: when the values of x are increasing the values of y are also increasing.

We also said there can be no relationship, and that is when the value of r is equal to 0: then there is no correlation, no linear relationship between x and y. However, because there are many other values between negative 1 and 1, we also need to be able to classify that relationship in terms of its strength. That is why the coefficient of correlation measures both the strength and the direction. For the direction: if it's greater than zero the direction is positive, and if it's less than zero the direction is negative. For the strength, we look at the actual value of the coefficient of correlation. If your coefficient of correlation is minus 1, we say it is a perfect negative relationship. Some books use different cut-offs, so you must also check whether your module defines these relationships. So here it says that if the relationship
is between, that is, if your coefficient of correlation r is between minus 1 and minus 0.79, then it is a strong negative relationship. If it is between minus 0.79 and minus 0.39, we say it is a moderate negative relationship, and if it's between minus 0.39 and zero, we say it is a weak negative relationship. We know that when it's zero there is no relationship. The same happens with the positive relationships: between zero and 0.39 it is weak, between 0.39 and 0.79 it is moderate, as it goes higher, between 0.79 and 1, it is a strong positive relationship, and when r is 1 we say it is a perfect positive relationship.

So in terms of how you will find the questions, or how you interpret them as you see them, let's look at some scatter plots with the calculated coefficient of correlation. Say this is the relationship: you can see that r is equal to 1, therefore this is a perfect positive relationship. If you look at the points, they are right on a straight line, and when the values of x increase the values of y also increase, so it's a positive relationship, and it is perfect because your r is equal to 1. The next one has points scattered all over, and r is 0.18 when we calculate it. This is a weak positive relationship: I can see that when the x values are going up the values of y are also going up, even though they are scattered everywhere. The strength of this relationship is very weak, but it is still a positive relationship. Looking at this one, with an r of 0.85, we say it's got a strong positive relationship. And the last one, at minus 0.92, is a strong negative relationship, as we can see that when the values of x are increasing the values of y are decreasing. The closer r comes to 1 or minus 1, the more strongly x and y are
related, and you can see that from the graphs that we have as well.

When we have the coefficient of correlation, r, we can calculate what we call the coefficient of determination. The coefficient of determination tells us what proportion of the total variation in the dependent variable, which is our y variable, is explained by the variation in our independent variable x. We're going to look at how we interpret it. For example, if my coefficient of correlation r is 0.92, then to calculate r squared I just square 0.92, and that gives me 0.8464, or about 0.85. What it means is that about 85 percent of the total variation in my dependent variable y is explained by the variation in the x variable. Instead of the percentage you can give the proportion: 0.85 of the total variation in the dependent variable is explained by the variation in the independent variable. That's how you interpret your r squared, which is just the square of your coefficient of correlation.

You can also calculate r squared by using the formula r squared equals the regression sum of squares divided by the total sum of squares, which is your SSR divided by your SST, and later on we're going to expand more on that. You also need to remember that the coefficient of determination is always between zero and one. It will not be negative, because we are squaring the value.

So let's say, for example, we were finding the relationship, and here is the relationship between x and y, and you would have
noticed that sometimes I like to draw a line like that. You can fit a regression line on top of your scatter plot in order to check how related your x and y are, what the influence of x is on your y. You can use that regression line to predict what a new number could be if it's not there in the data: you find the estimated value of y by just substituting the value of x. That's where the regression line comes in. We're going to use the regression line to estimate the value of our dependent variable.

We use regression analysis to predict the value of the dependent variable based on the value of at least one independent variable. In your module you're not doing multivariate analysis, so there will only be one independent variable, which is your x. Regression analysis will also help explain the impact of changes in the independent variable on the dependent variable.

In school we used to use the formula y equals mx plus c. In your module your textbook is Keller, and I think it's called the economics of statistics or something like that, and there we use the formula y equals b1 times x, where b1 is the slope, plus b0, which is our intercept. So that is the formula we're going to be using: y equals b0 plus b1 x, plus an error term, but we're not going to worry about the error term. We're just going to use b0 plus b1 x to describe the regression line. Your dependent variable y is the variable that we wish to explain or predict, and your independent variable is the variable that we use to predict or explain the dependent variable. So this is our dependent
variable, sorry, our independent variable, and this is our dependent variable.

So let's look at how we do this regression line. A regression line is given by this equation: y estimate equals your intercept b0 plus b1, which is your slope, times your x. The intercept is the value where x is equal to zero: if x is zero, then zero times b1 is zero, so your y estimate will be the same as your intercept. So b0 is the estimated average value of y when x is equal to zero. The linear function describes the relationship between x and y, and the slope tells us how changes in the values of y are assumed to be related to changes in the values of x. When we have the value of b1, we can interpret it as the estimated change in the average value of y as a result of a one-unit increase in the value of x. So we can interpret b1, we can interpret the coefficient of correlation, and we can interpret the coefficient of determination. You can also calculate this equation in Excel: there is a regression option under the Data Analysis tool and you can use that.

So let's look at an example. A firm's personnel department hired employees for a given job primarily on the basis of the result of an aptitude test administered to job applicants. The performance of those hired was rated on the same scale by their supervisor a year after they were hired. A sample of the test grades and the supervisor-assigned grades is as follows. So we have the test grade and the supervisor's grade. Our independent variable x is the test grade; x is always
independent. So the test grade is our independent variable x and the supervisor's grade is our dependent variable y. We can plot this onto our scatter plot. Where the value of x is 1, the value of y is 4, and that will be the point. When the value of x is 3, the value of y is 6. When the value of x is 5, the value of y is 10. When the value of x is 5 again, the value of y is 12, and that is the point there. And when the value of x is 1, the value of y is 13, and that is the point there.

By looking at this graph we can deduce a lot of other information. This last point is an outlier, because it's a value far away from the rest of the values, what we call an extreme value. Looking at the rest, when the values of x are increasing the values of y are also increasing, so we can see that this is a positive relationship. We don't know the strength, but all we know is that this is a positive relationship. Based on this information we can plot our regression line, and that is the line that we have, and we can also describe this line by calculating its values.

So now I'm going to show you an Excel output for the same data, the very same test and supervisor grades, your x and y values. I plugged them into Excel (I can show you later on if we have time) and ran the regression, and this is the output that I got. There are several things that you can take from this output, already calculated. Your Multiple R is just your r, and your R Square is your coefficient of determination. The others you can ignore. The other things that you need to be aware of are these two: your Intercept, which is b0, and your test grade coefficient, which is your b1. With b1 we can also write x as the test grade. So, instead of
writing x, we can write b1 followed by test grade in brackets, and those are the values that we can use. So the first ones we know are the coefficient of correlation and the coefficient of determination, and we can write out our regression line. Remember, our regression line is y hat equals b0 plus b1 x. Always remember that your slope is multiplied by the x, and our x is just the test grade. And that is our regression line. As you can see, I have covered everything that you needed to know in two minutes, but sometimes it's not as easy and straightforward as this, because you need to do some calculations.

Okay, let's go into that. Let's use the same information, the test grades and the supervisor's grades, and determine the least squares regression line manually, without using Excel, and find the coefficient of correlation and the coefficient of determination manually, meaning using the formulas. So we have the table. As you can see, there are my x values, from 1 down to 1, and my y values, from 4 to 13. I can calculate the totals. The total is the summation, so these totals represent the sum of x and the sum of y. I can also calculate the mean. The mean is the sum of x divided by n: you add all of them, which gives 15, and divide by how many there are, 1, 2, 3, 4, 5, so 15 divided by 5 gives us 3. The same here: the mean of y is the sum of y divided by how many there are, which is 45 divided by 5, which gives us 9. So I've got the means and the totals. I can also calculate the sum of x times y: 1 times 4 is 4, 3 times 6 is 18, and so forth, and I add all of them, and that will be the sum of xy. I can do the same with x squared, so I'm just
going to come here and say 1 squared is 1, 3 squared is 9, 5 squared is 25, and so on, and I sum all of them, and that will give us the sum of x squared. The same goes for the sum of y squared. You remember the correlation formula: r equals n times the sum of xy minus the sum of x times the sum of y, divided by the square root of n times the sum of x squared, and so on, that whole formula. We'll get to that. I'm not going to write it out, because I didn't memorize it to that extent.

So how do we then find the regression line? We know that this is the regression line, but it means we need to know the formulas, and the formulas are given to you. We need the value of our intercept b0 and the value of our slope b1. The x and y in red we don't have to substitute; what we need to calculate is b1 and b0. The intercept b0 is calculated as the mean of y, y bar (not the estimate, the mean), minus the slope times the mean of x, x bar. So it means we need the mean of y, the mean of x and the slope. How do we calculate the slope? We use this formula: the sum of xy minus the sum of x times the sum of y divided by n, and everything is divided by the sum of x squared minus the square of the sum of x divided by n. The mean of y, we know, is the sum of y divided by n, and the mean of x is the sum of x divided by n.

We also know that we had this table already calculated with all our values. You can for now ignore these two columns; we'll get to them later on. From our previous exercise we had these three columns, so those are the ones that we can use. So let's start substituting. Let's start with b1, the slope, because we cannot calculate b0 until we calculate the
slope. So let's substitute the values. The sum of xy, remember, we calculated as 145; the sum of x is 15; the sum of y is 45; the sum of x squared is 61; and the sum of x, 15, is squared and divided by n. Everywhere n appears, n is how many observations there are: 1, 2, 3, 4, 5, so n is 5. Then you calculate and you find your b1 as 0.625. We calculated the means previously, 9 and 3, so we can just substitute them into our intercept: the mean of y is 9, minus b1, which is 0.625, times the mean of x, which is 3, and the answer we get is 7.125. Now that we have the intercept and the slope, we can substitute into the regression line: b0 is 7.125 and b1 is 0.625. As you can see (I'm going to go quickly back one slide) the 7.125 plus 0.625 x from Excel is the same as what we get from the formulas manually. You get the same answer; it's just that one is the shortcut for the other.

That was to answer (a), finding the least squares regression line. Part (b) said: calculate the coefficient of correlation and the coefficient of determination. To calculate the coefficient of correlation we use the formula. I'm going to bring back our data (these columns you can still ignore). We have n times the sum of xy minus the sum of x times the sum of y, divided by the square root of the values underneath the square root, so we just substitute into this formula as well. n is 5, the sum of xy is 145, the sum of x is 15 and the sum of y is 45. That is divided by the square root of 5 times the sum of x squared, which is 61, minus the sum of x, 15, squared, times 5 times the sum of y squared, which is 465, minus the sum of y, 45, squared. Solving the equation we get 0.3227, and if we round it to two decimals we get
the answer of 0.32. From here you can calculate the coefficient of determination, which is r squared. Rather than squaring 0.32, let's take all four digits: 0.3227 squared is 0.10, if I leave it at two decimals. Interpreting r, we can start with the strength: it is a weak positive relationship. We can also interpret the r squared: 0.10 of the total variation in y is explained by the variation in x, or you can say 10 percent of the total variation in y is explained by the variation in x. And for r you can just say there is a weak positive relationship between x and y.

How do we interpret the regression line? Remember, the regression line was: the estimated supervisor's grade is given by 7.125 plus 0.625 times your test grade. This is very important as well: if you calculate your coefficient of correlation and you get it as negative, then the sign of the slope will be negative as well. A plus is an increase and a negative is a decrease. So in interpreting the slope, because we know that the slope is 0.625, it tells us that the mean supervisor grade will increase by 0.625, on average, for each additional one-unit increase in the test grade. If it was negative, we would have said the supervisor grade will decrease by 0.625. We don't have to interpret the value of 7.125, but you can say that the intercept of 7.125 is the estimated supervisor grade if the test grade is equal to zero. So if the test grade is equal to zero, then
the supervisor grade will be, on average, 7.125. That's how you interpret the regression line and its coefficients.

We spoke about the sum of squares measures. Sometimes you might be given the sum of squares measures, or be asked to calculate them, so you just need to know how to use them. Remember when we were doing the regression calculations and I said ignore those columns: those are the things that you can use. If I go back here to this table of ours with the sum of squares measures, this column is x minus the mean, so 1 minus 3 gives us minus 2, 3 minus 3 is 0, 5 minus 3 is 2, and the same applies to y. You can use that to answer any of the sum of squares questions. So with regression you might be expected to use the sum of squares measures to answer some of the questions, or you might just be given the data in a table to use. So now I've gone through using Excel, I've gone through the manual calculations, and later on we can go through how to use a calculator. With the sum of squares measures you can calculate your coefficient of correlation, you can calculate the slope, and you can also calculate the covariance.

You can also be asked to interpret what the sum of squares measures mean. The total sum of squares, SST, is your total variation, which is the measure of the variation of the y values around their mean. Your SSR, the regression sum of squares, is the variation attributable to the relationship between x and y, which we can also refer to as the explained variation. Then we have the SSE, the error sum of squares, which is the unexplained variation: the variation in y attributable to factors other than x, which are those unexplained
factors, where you don't know which ones they are. In terms of the formula, your SST, your total variation, is made up of two parts: your SSR plus your SSE. So if they give you SSR and SST and ask you to calculate SSE, you know that SSE will be SST minus SSR. And we know that the coefficient of determination is SSR over SST.

You are also expected to know some concepts in terms of the diagnostic tests of the regression. These are things that are in your prescribed book or in your study guide. Sometimes you test for normality, which means the residuals should have a bell-shaped curve, and you can draw a histogram of the residuals to check. You can test for heteroscedasticity, which is when the variance is not constant, so you have varying variances, and you can see this when you plot the residuals. In your module you are not expected to plot the residuals and interpret them, but you need to know the type of test you would do for the different diagnostics. For example, testing for outliers: as I did by putting the data on the scatter plot, you are able to identify those extreme values from the scatter diagram. The most influential observations will also look like outliers: those are values far away from everything else that also influence the measures you get, for example your measure of correlation or of determination.

Okay, so any questions? We have 30 minutes. I'm not sure if I will be able to demonstrate this, because I do have another example that I also want to share with you, which I've already shared with you in the notes folder. So in terms of the calculator, you can use a Casio calculator, or you can use
a Sharp calculator to calculate the regression. So if you have the data, this is my data, and I have a Casio calculator, I need to put my calculator into stat mode, and that is the first step. You press the mode button, which is the button with MODE written on it at the top. You will see a view that looks like this, with your stat and your table options, and you press 2 for stat. Then you choose number 2 again, for A plus BX. Now you need to be very careful here. Remember, our regression line is y equals b0 plus b1 x, and always remember that the slope is the value that is multiplied by x. So if you look at A plus BX, B is our slope and A is our intercept. You need to be careful about that, because on your calculator it's A and B, not b0 and b1.

Once you press 2, your calculator will show STAT at the top and come up with a table, which means you are ready to capture your data. The table has your x and y columns at the top. To put in the values of x, we press 4 and equals, then go on to the next ones: 2 equals, 6 equals, 4 equals, 3 equals, and you have all the values of x. To capture the values of y, you use your arrows to navigate: you go right to get to the y column, and then you go up to the beginning of the table, until you get to where it corresponds with 4. Then you start capturing your y values: 5 equals, 3 equals, 7 equals, 6 equals, 5 equals, until you've captured all of them. Sorry, there is a step missing there. Once you have captured all your data, you have your table with your x and y, and they should correspond: 4 should correspond
with 5, 2 should correspond with 3, 6 should correspond with 7, 4 should correspond with 6, and 3 should correspond with 5. Once you are done, you press the AC button, and it removes the table from your screen, but the data is stored in the memory of your calculator. You can get back to the stat menu by pressing shift and then STAT, which is written in orange on button number 1. When you press that, you get this menu, which reads 1, 2, 3, 4 and so on. Number 1 gives you the types of data. Number 2 gives you the table that we captured, the same table. Number 3 gives you all the sums: the sum of x, the sum of xy, the sum of y, and so on. Number 4 gives you the mean of x, the mean of y, and the standard deviations of x and of y, for the population and for the sample; it gives you all those values under Var. But that is not what we are looking for. We are calculating the regression line, and that is on button number 5.

When you press 5, you get a menu which looks like this: A is your intercept, B is your slope, r is your coefficient of correlation, and x hat and y hat are your estimates (we haven't spoken about the estimates yet). So if you need any of those, you press shift, 1, then 5 to open this menu, then you choose A if you want the value of your intercept and press the equals sign. Then you go shift, 1, 5 again, choose number 2, and that gives you the value of B, your slope, when you press the equals sign, and you will have your regression line. I'm not going to go into too much detail on that.

On your Sharp calculator the same steps apply. You need to put your calculator into stat mode by pressing the mode function. So the
mode button will be somewhere there next to the clear. It will say MODE, and you press it. It might show you REG or LIN; I'm not sure what your Sharp calculator shows you. So you press mode, and you press button number 1 for stat mode, because in the beginning it will give you the options, I think it's COMP and STAT. Once you press 1 for STAT, there will be SD, there will be REG, and I think some other functions. You just need to press the one for the regression line; it might say REG or it might say LIN. On the screen your calculator will write STAT 1 in black, and you know that your calculator is in stat mode and you are ready to capture the data. To capture the data, whichever calculator you're using, we say 4, and there is an (x,y) button that looks like that. No, it's STO; I'm talking about the financial one now. That button there is called store, and this button is M+. So we're going to press 4 STO 5 M+, then 2 STO 3 M+, then 6 STO 7 M+, until we've captured all of them. Every time you capture the data it will say data set 1, data set 2, data set 3, and all the data will be captured onto your calculator. To calculate the stats functions, all your values are written in green, or I'm not sure if it's called green or blue, and you get them with the alpha button. The values a and b will be on your open bracket and close bracket, and there will also be some values on the multiplication and the division. I think your r and your y
estimate, they will be on the division and multiplication, and you use those to calculate your values. If we have time, I will show you on the actual calculator. The same method will happen with your Sharp calculator. On the Sharp, instead of the STO on the financial calculator, you will have the (x,y) and the ENT. So where, on the previous slide, we were pressing STO and M+, you are going to press (x,y) and ENT in the place of those. So you will say 4, the (x,y) button, 5, ENT, and it will say data set 1, and you continue. And also the same: your a and b will be on the multiplication and division, and your r will be on the open bracket. And you can look for x squared to calculate the coefficient of determination. Okay, so I know that I didn't give you time to do any of the exercises. We've got 20 minutes left. Let me ask you: do you have a calculator? What kind of calculator do you have? Maybe we can do one example on the calculator. I'm using Sharp. Sharp. Sharp. So there are no other calculators? I don't have a Casio. Yeah, I don't have the HP steps at the moment. HP is the one complex calculator that I don't have the steps for immediately, but I can look for them and share them with you. Oh yes, the other thing I wanted to share with you as well: Adele shared the link to the schedule and where you can find the resources. You just come to that link, and if you scroll to where it says Statistical Inference, there are the notes and recordings. You click on that, and the recordings and your class notes will be there. You click on the class notes. I just want to draw your attention to this, which is the regression model template that I was referring to. You need to download it. Don't use it on here, because if you change it, you change it for everyone. Download it so that it's on your machine and you can open it from your machine. I'm going to show you just now. I can just
minimize this. We're going to use the supervisor example, which we know we have the answers to. I'll use this one, I'll use the table. There we go, we can use this, it's fine. Here I'll just use the x and y values. Okay, so it will open an Excel sheet like this, so you can save it, and you just scroll to there. Right, I've got too many values here. The instructions are very clear: to add a row, click on column B, row 8, and drag right until y squared, then right click and insert. Okay, so I can do the same. I must go this way and I must just delete, because these are the rows I don't want. I'm not sure if I did the right thing. It seemed as if I did the right thing, but I didn't, and now I forgot about my own notes and it's taking me long. When deleting a row, do the same: click on column B of the row that you want to delete. There we go, I deleted the row. I need only five rows, 1, 2, 3, 4, 5, so I still have an additional one, and I must just go there and delete. Okay, so let's capture the data as we saw it on here. We have 1, 3, 5, 5, and 1, and then on this side we have 4, 6, 10, and so on. Don't worry, those who are doing 1501, you will have a chance again to do the same on Saturday when we do it. Okay, so don't worry about that; that is to calculate something else. Okay, so we have our data. We know that the totals were 15 and 45, and if I scroll to the side you will see the other measures that we had. We had the sum of xy, which was 145, the sum of x squared, 61, and the sum of y squared, 465. You can see that is the same table as that. The
in-between ones are those that I said you can just ignore for now. So in terms of this template, what I did was to create all the calculations, and I show you the formula that I used to calculate each one of them. You can see there is our b1, this is our mean, and b0, and you can see it automatically completes the regression line, and our coefficient of determination and the coefficient of correlation; it calculates them, and I give you the formulas as I have done it. Now, I also said that you can calculate the coefficient of determination as SSR over SST, and I spoke about the sum of squares measures that you can use. There I'm just showing you how to calculate them, and how to calculate the SSR and the SST. You can just use these measures here at the bottom to calculate SSR divided by SST. If we remember, the value is 0.10. So let's see: equals SSR divided by SST, equal. No, something is wrong with my calculations. Oh yes, it's because I'm not using the right values. You also need to adjust your values on there; that's why it's not calculating correctly. So we'll just adjust them here and delete the values that we don't need, and there we go. You can see that they are the same amount, so you can use this to check as well if you want. Now let's look at how we calculate using your calculator. Let's not go back to that one; let's use the same data that we have here. We know that this will be our a and this will be our b. I'm going to use both calculators. You can follow me, or you can come back and watch the recording later on. We're going to finish not exactly at half past, because we took some 10 minutes when we were discussing, so just bear with me. I'll start with the Casio one. We're going to first press mode, and then we press 2 for STAT, so mode, 2 for STAT, and we press 2 again for the table, and then we capture the data. Remember,
you just press the values as you see them. On this we're going to use only those values; we're only interested in those values, right? So let's capture the values as we have seen them. I'm first going to do the x values, so it's 1 equal, 3 equal, and so on, and you need to make sure that you capture them as you see them on the table. Don't mix and match. Now I must use my arrows. You can see there I moved to the right, then I'm going to use my up arrow to navigate to the beginning, where it's row 1 and it corresponds to 1. Then I can put 4 equal, 6 equal, 10 equal, 12 equal, 13 equal, and you can see that it's on row number 5, and I press the AC button. Now my calculator is in stat mode. I press shift, STAT, which is button number 1, and I'm interested in number 5, which is Reg, and there are my values. So let's go back to the slide. We know that b0 is our A. Let's see if we get 7.125. A is on number 1, and we press equal: 7.125. You see, I used the calculator, I used Excel, I used manual calculation, and I still get the same. Now let's get b1. b1 is on B, so I press AC, shift, STAT, Reg is 5, and we're looking for number 2, which is B. You press 2 and you press equal: 0.625, and that is your regression slope. Remember the r was 0.3227, so let's go find r: shift, STAT, Reg, and r is on button number 3, and we press that: 0.3227. To find the coefficient of determination, I just press x squared and equal: 0.10. And let's calculate the mean. Remember, it's the same shift, STAT, but it's under Var, not under Reg, so we press 4, and there are your mean x and mean y. I'm looking for the mean of x, which is on button number 2, and it's equal to 3. And you can go this way as well: 4, and we look for 5, mean y, and mean y is 9. Easy. Okay, we are done with that one. Let's look
at the Sharp calculator now. I don't have the financial calculator, so I'm going to show you using this calculator. The challenge I have with this calculator is that it's so big and I cannot minimize it or make the screen small, so it will hide some of the numbers, but I'll put it here. So I press the mode button and I press 1 for STAT. This is easy because everything is written out. On this one it's written LINE, not REG, and we need the line, so we press that, and our calculator shows STAT 1. Now our calculator is ready to capture. Remember, we use STO and M+; on the other calculator you use the (x,y) and ENT. So let's go: 1 STO 4 and then M+, then we continue, 3 STO 6 and then M+, 5 STO 10 and then M+, 5 STO 12 and then M+, and 1 STO 13 and M+. I've stored all my data; there are five of them. Then you press the on and off button. Now I'm ready to calculate any of the values. Remember, the values that we're looking for are these values here. I didn't show you all of the sum of squares measures. On this calculator the values are just in front, as you can see; you press the alpha button to get any of them. We're not going to test them all because we've run out of time. So let's find our a and our b. They are a and b here, written in small letters, not big letters. a is b0, so we press alpha, then the open bracket, then equal: 7.125. To find b, you can just move on from there: alpha, close bracket, equal: 0.625. To find r: alpha, divide, equal: 0.3227. If I need r squared, which is the coefficient of determination, I just press the x squared button and equal, and that will give me my coefficient of determination. The mean is there too: the mean
of x, the mean of y, the standard deviation, the sum of x and all that, so you can use your calculator to calculate those. How do I get the table that I showed you previously? Let's copy these two columns of values; I'm going to copy them onto a new sheet. You need to make sure that your Excel has the Data Analysis tool. If it doesn't have it, you go to File, Options (you can always rewind this and watch and do it step by step) and go to Add-ins. If Analysis ToolPak doesn't appear here, you just go and look for Excel Add-ins and press Go, and it should pick up this menu, and then you just tick Analysis ToolPak. I always also include the VBA one, which is the programming one. Make sure that both of them are ticked. Let me show you: if I include the Euro Currency Tools, it will also appear on here. Sometimes it doesn't appear automatically, and then you need to close your machine and come back in. As you can see, it didn't put the Solver here, so that's fine. What I'm interested in is the Data Analysis panel. You just click on Data Analysis, it will come up there, and we're interested in Regression. There is our Regression, and you click OK. It's going to ask you for the values of your x and the values of your y that you want to input. I'm going to click inside the box where the cursor is flickering, then go to the y values and drag, and it will put all the values in there. For x, I will also make sure that the cursor is flickering inside the box, then go and highlight the values of x. I'm going to say there are labels, because I included the x and y labels on there. You don't have to worry about the confidence level and the Constant is Zero option. I want the output to be on the same sheet as here. Don't worry about the rest of these other things; all I need is my output. For the output, I click on that, and I must go inside the block because
I want to say where I want to place it. It needs to start from there. It doesn't matter, it will be a big table, but I want it to start there, and when I click OK it will generate a table. As you can see, this is the same table as we started with, and my values are the same: my r, my r squared, and my b0 and my b1. I've shown you how to navigate this. Okay, and that is the end of the session for today. I had some exercises, but we only had one hour 10 minutes, and this is very time-consuming in terms of all the calculations and all the different things that I needed to show you. So far we have learned how to use the regression line to predict the value of your dependent variable. Oh, we didn't use the other thing. Let's go back, because I need to also show you this. For example, I'm going to go back to the example that we used all along. Let's say the supervisor's test; remember, these are the values. Let's say this is our test grade and this is the supervisor's one, right? So let's assume that the value of the test grade was 6, and we want to estimate what the supervisor's grade will be. We can come here and say, if the test grade is 6: 7.125 plus 0.25 times 6, and we can calculate this. Actually, I'm going to use my Casio, because the Casio is easier than the Sharp. So I'm going to say 7.125 plus... why do I have 0.25? I wrote this number wrong; this should be 0.625. So 7.125 plus 0.625, open bracket, 6, close bracket, equal. I know now that the answer here will be 10.875. So if the new person gets a test grade of 6, therefore it means the supervisor's
grade will be 10.875. Now, we have stored the values on our calculator, so we can estimate the same thing there. I'm going to cancel this. Since our data is still stored, let me just show you: I didn't clear my calculator, so my data is still there. I'm going to press 2 to show you my data; there is still my data. So I'm going to estimate this value. To estimate, we're going to use shift, STAT, Reg is 5, and we're going to use number 5, y hat. Now, before I estimate, I need to enter the value first. So I need to say it's 6, then I must go to shift, STAT, 5, 5, and that will look like this on the calculator, and then when I press equal I should get 10.875. So that's instead of calculating manually; you can still calculate manually by using the formula. On this calculator you can do the same. You go alpha... oh, you don't need alpha now. The estimate is written differently here; it is y with a cap. It's on the open bracket, but it's written in orange, so we're going to press the second function. Oh, before we press second function, I forgot as well: you press 6, then second function, then your y estimate, and it will look like this, and there is the answer. I didn't even press equal; it gives you your answer. What if it's 4? You do the same: 4, second function, open bracket, and you will get the answer of 9.625. What if it's 20? You do the same: 20, second function, open bracket, and it will give you your estimate for 20. What if we are given the value of y and we need to estimate the value of x? It's the same. To estimate x, there is your x with a cap. So 5, and I'm not estimating y, I'm estimating x: 5, second function, and there is your estimate of x. If my y is 5, that will be the value of your test grade; you will get a negative test grade. If my value of y is 20, then to estimate the value of x
it will be 20.6. And you can do the same on this calculator as well, estimating for 20: shift, STAT, which is 1, then 5, and 4 for x hat, and equal: 20.6. You can see the answers are the same, and that's how you can estimate. I just want to also bring to your attention that in your exam these are the types of questions that you will get. You will need to tell them about the scatter plot. You will need to tell them about the relationship. You will need to calculate the regression line the same way as we have calculated it, and you need to be able to interpret the slope. You will need to tell them if this line fits perfectly, and also the strength of the relationship: is it a strong relationship or not? So those are the types of questions that you will get. Other questions might ask you, in a simple regression, to determine the coefficient of determination. We know that your r squared is given by SSR over SST, and here you are given SSE and SST. We know that SST is the same as SSR plus SSE, so you just use this to calculate the value of SSR and substitute it into the formula. That's all you need to do for this kind of question. The other thing: remember I told you that you can also calculate the sum of squares measures. If you put this data into your calculator, you don't even have to worry too much, because this is your x and this is your y. You can get the sum of xy on your calculator. Remember, you go to Sum, which is 3, and there are the answers: the sum of xy, the sum of x squared, the sum of y squared, if you want them. Now, here at the bottom, the formula has your variances, so we need the standard deviations. Remember where you find your standard deviation: it's on button number 4, and it is sx and sy. You will get the value there and
you're going to square the answer, because this is sx squared, and you will multiply that by your sy squared, which you will also find on your calculator. This value you will have to calculate, because you will have to go and find your sum of x, your sum of y, and the sum of xy, and your n is how many there are. There are 10, so this will be 10 minus 1. So this is the same as that value, and if you have that value you should be able to get the coefficient of correlation; you would have calculated the coefficient of correlation using the covariance and your standard deviations. And that concludes today's session. Are there any questions? You will need to do the exercises and also go through the templates, if you want to use the template, whichever one you feel comfortable with. But you will also need lots and lots of practice in order to know how to use all these things that I've just shared with you. I am glad that by the end of the session we have about 1, 2, 3, 4, 5, 6, 7 people on here. It's encouraging. Then probably we will have a session next week, but I am not sure if by June you will have sessions, because you were not attending the sessions; there will be some engagement about that. Otherwise you will have to send an email with a motivation to say why you need the sessions not to be cancelled. Other than that, thank you very much for coming. Are there any questions, comments, anything from anyone? Please make sure that you complete the register. I'm going to share the register again on the chat. I'm going to also share the link, but if you joined, then it means you are on the WhatsApp group; the link is shared on the WhatsApp group on how you access the notes and the recordings. Other than that, thank you guys, see you next week Tuesday, same time, same place. Yes, good evening. Yes, good evening. I was asking if I can please be added to this WhatsApp group, um,
someone sent a link in the Telegram group; that's how I ended up here. Okay, the WhatsApp group. How do you guys join the WhatsApp group? Okay, let me see the link on the chat. I think the link should be there on the chat. I must just go to the history to see if we haven't shared it before; it has been long. Okay. Who's the admin? I don't know. Okay, there is the information on the chat. I am the admin of the group. Okay, yeah, so I've sent two links there. You can copy them and keep them safe if you want. The first one is where you will find the notes; I created a short link for the notes and the recordings. The other link is for the WhatsApp group for STA 1501, and the other WhatsApp group link is for 1502. If you're doing 1501, we have sessions every Saturday, half past eight till half past ten. I'm doing 1502 only. Okay, so then you will just use the 1502 one. And next week is our last session. Madoda, we have covered everything needed for 1502 since we started the sessions in April. Next week we are covering the last session, so that will be time series and forecasting, and that will be the last chapter of 1502. Then the next sessions will start in second semester. I think it might be a repeat of every session, or there might not be any; I don't know what the outcome will be. We'll have a discussion on Friday with Unisa. But yeah, all the recordings are there. Thank you. Okay, thank you very much.
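For anyone who wants to check the calculator and Excel answers from this session without either tool, here is a minimal sketch in Python. It is not part of the session materials: the function name `fit_line` is made up for illustration, and the data are the five (x, y) pairs captured in the worked example (x = 1, 3, 5, 5, 1 and y = 4, 6, 10, 12, 13). It uses the same sums the template uses (sum of x, sum of y, sum of xy, sum of x squared, sum of y squared).

```python
import math


def fit_line(xs, ys):
    """Return (b0, b1, r) for the line y = b0 + b1*x (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Building blocks shared by the slope and the correlation formulas.
    sxy = n * sum_xy - sum_x * sum_y
    sxx = n * sum_x2 - sum_x ** 2
    syy = n * sum_y2 - sum_y ** 2

    b1 = sxy / sxx                         # slope (B on the Casio)
    b0 = sum_y / n - b1 * (sum_x / n)      # intercept (A on the Casio)
    r = sxy / math.sqrt(sxx * syy)         # coefficient of correlation
    return b0, b1, r


xs = [1, 3, 5, 5, 1]      # test grade
ys = [4, 6, 10, 12, 13]   # supervisor's grade

b0, b1, r = fit_line(xs, ys)
r_squared = r ** 2                 # coefficient of determination

y_hat_6 = b0 + b1 * 6              # estimate y when x = 6
x_hat_20 = (20 - b0) / b1          # estimate x when y = 20

print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
print(f"y hat at x = 6: {y_hat_6:.3f}")
print(f"x hat at y = 20: {x_hat_20:.1f}")
```

The printed intercept, slope, r and the two estimates should agree with the values read off the calculators in the session (7.125, 0.625, 0.3227, 10.875 and 20.6).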