Good afternoon. Good morning, everyone. Good morning. How are you all doing? You are trying under this difficult situation, so how are you doing? Oh no, me, I'm good. That's great. Are we going to start now? Has the recording started? Yes. Okay, so welcome to another session where we learn skills on how to answer questions relating to your basic statistics modules. In today's session we're going to do a little bit of content, and then I'm going to show you several ways that you can use to answer the questions. Bear with me and let me know if you're getting lost in all the things that we will be doing. I will show you how to calculate manually, as in how to use the formulas and do the calculations. I will show you how to use your scientific calculator: how to put your calculator into STAT mode, capture the data and do some calculations to answer the questions. I will also give you a template, which I've already shared with you in the notes section, that you can use as well to do some calculations. So all these are just different ways that you can use. All I can say to you is find the one that you feel most comfortable following or using, because not all three are going to be complex because we're using complex calculations as well, but you need to practice and practice and practice in order to determine which method you're going to use to answer or solve your regression questions. Can you please also make sure that your cameras are off and your mics are muted? Justice, your camera is on. Thank you. Okay, so please also make sure to complete the register. The link is shared in the chat. Shantel, the notes for today. Sorry, I just realized I didn't share them yesterday, but they are uploaded on the notes section where we share all the notes for every Saturday. So you should have all the notes up to today. On the team shape or inside? No. So for Saturday, yeah, I'm not sure which one you're referring to. 
On Teams, on the Saturday classes, where you joined the session, there is always a link that says notes and recordings. That's where you will find today's notes. If you are in my e-tutor group, that is a totally separate thing. So for today's session, the notes are shared on the UNISA Western Cape Role Platform. All right. Okay. The last one I'm getting is the Chi-Square under the schedule. So I'm not sure if I'm looking at the correct place, but I am checking under that link on the myUnisa link where you joined the session for Teams, and the last notes that were loaded were Chi-Square. I'm not sure if it's just me or if nobody else sees it. It's telling me it's not loaded, even now. Huh, my bad. I thought I loaded it this morning. It should be there. I don't know why you're not seeing it, because now when I try to upload it again, it says replace or keep both. It should have today's date. It should read as basic statistics skill session, or probably it says session eight. The last one here is September 2nd. It is there. Could you just rename it? Okay. I'll grab it. Thank you. This is Lizzie. Yes. You said under notes and recordings, and if you're under tutor... is it your tutor? What is this? No, don't worry about that one. You will. Okay. Let me do this. Let me stop sharing, for everybody, especially those who don't know where to find the notes, and I'm going to share my entire screen. So when you join the session, you go to join session there. If you click on notes and recordings, it will open up a page like this. You scroll to numeracy center. You go to basic statistics and statistical matrices. When you click on it, it will open like this. There it says open class notes, and there will be recordings for all the sessions. You can see all the recordings there. If you click where it says open class notes, it will open a folder or a site with all the notes. All the notes will be there. 
There is the notes for today. Please make sure that you download it. There is a download button there. There is something wrong with my notes, get them. They look almost exactly the same. Let me just double-check something. It might be my error on my site. I'll see if I can re-upload it. Okay. Let me just double-check. So it's my badge. It should have the latest version. Apologies for that. So there are the notes for today that we're going to go through. Got it. Thank you. Yes, that should be. Then you can download that. And also, when you scroll down, there is a regression model example template. Please make sure also you just download. By clicking on those three dots, there is a download. You can download that. We're going to use that template today as well. Okay. Without wasting more time, let's go back to our presentation. So today we're going to learn the basic skills to conduct the regression and also a correlation analysis. See my data. The plan for the following week is to look at a revision of everything we have covered from semester one up until now. So we'll just do it at a high level. Looking at, I will just pick one, past exam paper from one of the modules because if you realize you've got different modules, but I will bring all three modules past exam papers and then we will look at different ways that the questions are asked and the format that they have been asked. But since you must also bear in mind that since everything is written online for the past two years, I would not have some of the past exam papers that were written online. But it will also, the ones that I'm going to use will also give you a guidance in terms of how your exams looks like. Okay. So we'll see how we do that. And then from October, based on when you start writing your exams, we will have to start scheduling individual module exam preparation sessions. So it means not all of you will meet in the same room. 
So we will have on an alternate days different modules looking at their exam preparation content. Okay. So let's start with today's session. I'm not going to ask you any comments and queries or questions, but because we wasted a lot of time while I was showing you where to find the notes. So today we're going to look at, like I said, regression and correlation, they are linked to one another. In some modules, you do measures of relationship in chapter three, but we didn't cover it when we were looking at measures of relationship in semester one, right? We deferred that portion to today to say we're going to deal with the measures of relationship when we deal with regression because they are linked. And since they are linked, we tend to also cover them in the same session. So regression and correlation, we're going to explain in more detail what those are. But in order for you to be able to answer any of these questions, you require actually the calculator and the formula. You don't require any statistical table because they are not going to ask you to do any hypothesis testing. So you don't require any statistical table, just the calculator and the formulas. By the end of the session today, you should be able to make inferences about the correlation coefficient, which means you should be able to say, what does that mean, interpret it. You should be able to interpret both the coefficient of correlation and the coefficient of determination. You should be able to know how to use your regression, especially how to build the regression line, and how to use that regression analysis or the regression line to predict the value of your dependent variable based on your independent variables. You should be able to interpret or know the meaning of some of the regression coefficients, such as the slope and the intercept. Okay, so when we talk about correlation, we're talking about finding out a relationship between two numerical variables. 
And we can do that by visualizing the two numerical variables on a Cartesian plane. And by visualization of the two numerical values, we will be creating a scatter plot. And from that scatter plot, it can tell us what is the relationship between numerical value one to numerical value two, where our first numerical value will be of the independent variable. And your numerical value two will be from your dependent variable, which is the variable that at the later stage, we will want to predict what that new value would be. So a scatter plot, we use it to show those relationships of the two variables, sorry, and from that scatter plot, we are able to tell what is the relationship and we can calculate what we call a correlation coefficient. But in terms of that relationship, it will also tell us what is the strength of that relationship and what is the direction of that relationship. So what do I mean by the strength and the strength and the direction? So in terms of the strength, we talk about whether they, if there is a strong relationship, or if there is a weak relationship, or if there is a moderate relationship, when we talk about the direction, we're talking about whether there is a negative relationship or there is a positive relationship, or otherwise, there is no relationship, because yeah, we're talking about the relationship between two numerical values. Let's look at the types of relationship that can exist using the scatter plot. The line that you see that passes through all the dots on your scatter plot, we call that the regression line. Ignore that for now, because we're not going to talk about it now, we're only going to talk about it at the later stage. 
But in terms of the relationship that you see, there can be different relationships. Like I already explained, there can be a relationship or there can be no relationship, but on top of that, there are different types of relationship that can exist. For example, there can be a linear relationship. With a linear relationship, the dots on your scatter plot follow a straight line, like the straight lines that you see here: a straight line that is going up or a straight line that is going down. We can interpret the first one by saying that when the value of your x, which is your independent variable, increases, the value of your dependent variable increases. So when this increases, that increases: that is a positive linear relationship. When it's a negative relationship, we say that when the values of x are increasing, which means the values of your independent variable increase, the values of your dependent variable are decreasing, and that is called a negative relationship. So the direction in that case will just be a negative relationship. We'll talk about the strength just now. There are also some types of curvilinear relationships. You would have noticed this when they were presenting the COVID-19 data: usually they would use an exponential graph to show how the number of infections, or people who are getting COVID, is increasing exponentially. They would use something like the second graph at the bottom, which is what we call an exponential relationship. It means that as the values of your independent variable increase, the values of your dependent variable increase exponentially. Sometimes they can be exponentially decreasing, but this one shows an exponential increase. 
Otherwise, if you have ever bounced a ball, you will see that when you bounce a ball it goes up, but it also has to come down. The shape that the ball traces is what we call a curvilinear relationship between whatever you have put in, for example the pressure you applied, and the time it took for the ball to go up and come down. We call that relationship a quadratic relationship, because it goes up and then it has to come down. Sometimes there can be no relationship between your dependent and independent variable, and you can visualize that type of relationship in views like these, where all the dots are scattered everywhere and do not form any pattern. You cannot tell whether, when the values of x are increasing, the values of y are decreasing or increasing. You cannot make sense of that because the data points are scattered everywhere. And sometimes the values might stay constant. So for example, here we can see that with the value of x increasing, the values of y stay almost constant. Sometimes it doesn't have to be like that. It can be that when the values of y are increasing or decreasing, your x value stays constant, and you have these vertical dots that are just going up, up, up, up, up. Right. So those are the types of relationship. Now let's talk about the strength. So we spoke about the direction. Now let's talk about the strength of those relationships. To calculate the strength of the relationship, we normally use what we call a coefficient of correlation, which is r. This measures the strength, and it can also give you the direction, but we already know how to interpret the direction based on the negative and the positive. 
But with the coefficient of correlation, the most important thing is that it will give you a number. And that number will always be between zero and one, and it will tell you the strength of that relationship, whether its direction is negative or positive. And like I said, it's between zero and one. So, sorry, my bad. The coefficient of correlation r has a value between negative one and positive one. So it takes any value between negative one and positive one. It can be one, 0.9, 0.3, 0.4, or it can even be in a percentage format, where it can be 100% or negative 100%. Negative 0.99, negative 0.98, sorry, negative 98%, negative 99%, negative 68%, or 0.68, or 0.5. You can refer to your coefficient r in that manner. How do we then interpret the value that we get? If your r, which is your coefficient of correlation, is bigger than zero, we say the correlation is positive. So for any value that is bigger than zero, we would say the correlation is positive. When x increases, y increases as well, and we could see it from the scatter plot as well. If the value of r is less than zero, we say it is a negative relationship. When the value of x increases, the value of y decreases. And if the value of r is equal to zero, there is no relationship. So how do we then describe the strength in terms of the actual value that we have? Let's assume that we have an r value of minus one. How do we interpret that? We would say your dependent and independent variable have a perfect negative correlation, or perfect negative relationship. If it's between minus one and minus 0.79, we say it has a strong negative relationship. If it is between negative 0.79 and negative 0.39, we say it has a moderate negative relationship. If it has a score between negative 0.39 and zero, we say it has a weak negative relationship. And if it has a score of zero, we say it has no correlation. So this is your question. 
Anyone can answer this. If I get an r of 16%, who can tell me the direction and the strength of that relationship? How do you describe it? Anyone? When it is 16%, or we can say when it's 0.16. Anyone? Is it a perfect negative relationship, or is it a strong negative relationship, or is it positive? How do you define that? It's Lizzy. Yes. I think it is a weak positive correlation because it lies between zero and 0.39. Yes, definitely. Because I only gave you the negative ranges, you should already see this: if the value here is positive, it cannot be negative. It will be positive. So it is a similar thing for when it is positive. When it's between zero and 0.39, we say it has a weak positive correlation. So that is a weak positive correlation, or a weak positive relationship, between your independent and your dependent variable. What about when it is an r of negative 0.84, or we can say it is negative 84%? What type of relationship will that be? A strong negative correlation? It will be a strong negative correlation because it is between negative 1 and negative 0.79. Okay. So on the positive side, you define the same criteria. If it lies between 0.39 and 0.79, it will be a moderate positive relationship. But if it is between 0.79 and 1, it will be a strong positive correlation. And when it is equal to 1, we'll say it has a perfect positive relationship. So you need to know how to interpret the number that you get from your coefficient of correlation. We will do lots of other exercises just now. So I'm not going to go through all of this, because we spoke about it. We looked at them when we were looking at the scatter plot. But if, for example, you have a scatter plot that looks like the second one, where everything is scattered all over, when you calculate your r you might find that it comes out as 0.18, which will tell you that this is a weak positive relationship. 
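The strength and direction bands described above can be sketched in a few lines of Python. This is only an illustrative helper, not something from the module notes: the function name `describe_correlation` is made up, and since the session does not say which band a value of exactly 0.39 or 0.79 falls into, the code assumes such values go to the stronger band.

```python
def describe_correlation(r):
    """Describe a correlation coefficient using the bands from the session."""
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= 0.79:   # assumption: exactly 0.79 counts as strong
        strength = "strong"
    elif size >= 0.39:   # assumption: exactly 0.39 counts as moderate
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction} correlation"


# The two class questions: r = 16% and r = -84%.
print(describe_correlation(0.16))
print(describe_correlation(-0.84))
```

The two calls reproduce the answers given in class: a weak positive correlation for 0.16 and a strong negative correlation for negative 0.84.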
When you look at a scatter plot like that one, it is not clear from the plot itself whether the relationship is positive or negative. But when you calculate your r, it can give you that indication of whether this is a positive relationship or a negative relationship, or even whether some sort of relationship exists at all. And sometimes when the points look like this, you might find that r, when you calculate it, is equal to 0.85. And you can say that is a strong positive relationship. But when they look like the first one, where everything is forming an almost perfect straight line, and you calculate your r and find that it is 1, you can say this is a perfect positive relationship, and so on. So sometimes when you do the calculations, especially to calculate the coefficient of correlation, you might be asked to use the sum of squares measures. You need to know how to calculate your r by using the sum of squares measures. And later on, when we calculate the slope, you will also be able to use the sum of squares measures. But they will give you the formulas and ask you to calculate them. So you need to know that the sum of squares for your x values is given by the summation of x minus the mean of x, in brackets, squared. Or we can say it is the same as the sum of x squared minus the sum of x, in brackets, squared, divided by n. It's easy to calculate all these sum of squares measures on your calculator when you are given your values for x and y. We will look at more exercises on the sum of squares measures, but I was just highlighting this so that you are familiar with some of these formulas as well. With the sum of squares measures, and especially when we calculate the relationship, we sometimes also get what we call the variation, because when you do your correlation and your regression analysis, there will be times when there are some errors that are not accounted for in your regression line. 
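To make the two forms of the sum of squares concrete, here is a minimal Python sketch. It uses the five test-grade and supervisor-grade pairs from the worked example that comes up later in this session. The formulas for the y and cross-product sums of squares and for r are the standard textbook ones and are assumed here to match the module's formula sheet. All variable names are illustrative.

```python
import math

# Test grades (x) and supervisor grades (y) from the later worked example.
x = [1, 3, 5, 5, 1]
y = [4, 6, 10, 12, 13]
n = len(x)

mean_x = sum(x) / n

# Definition form: sum of squared deviations of x from its mean.
ss_x_definition = sum((xi - mean_x) ** 2 for xi in x)

# Shortcut form: sum of x squared minus (sum of x) squared, divided by n.
ss_x_shortcut = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n

# Same shortcut idea for y and for the cross products (assumed standard formulas).
ss_y = sum(yi ** 2 for yi in y) - sum(y) ** 2 / n
ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n

# Coefficient of correlation from the sum of squares measures.
r = ss_xy / math.sqrt(ss_x_shortcut * ss_y)

print(ss_x_definition, ss_x_shortcut)  # the two forms agree
print(round(r, 4))
```

Both forms give the same sum of squares for x, and r comes out at about 0.32, which is the value quoted from the Excel output for this data later in the session.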
And later on we will discuss what regression is all about. But in terms of the correlation and the regression, there is also what we call the measure of variation. We get the measure of variation by calculating the total variation. That total variation is given by the sum of squares of your regression, which relates to the regression line, plus the sum of squares of your errors. Those are the errors that we cannot account for. So your SST is given by your SSR plus your SSE. This is the formula to calculate your SST. This is the formula to calculate your SSR. And to calculate your SSE, you use that formula, where your y-hat is your estimated value from your regression. Here I'm just pointing out the formulas to you. You do not have to memorize the formulas, because most of the time in the exam they will give you the formulas if they want you to use them in your calculation. But you need to know, for each and every one of them, how you calculate it and how you get it. So for example, SST is easy, because it is the sum of your observed value of y minus the mean of y, squared. So for every observed value, you subtract the mean of your y values, square the answer, and add all the answers, and that will give you your SST, which is the total sum of squares. Your SSR, which is the regression sum of squares, is based on your estimated values. You calculate each estimated value using your regression line. I will show you how to find the estimate of y. So you will have your estimated value of y. For every estimated value of y, you subtract the mean of y and square the answer, and the sum of those will give you SSR. SSE is your observed minus your estimated, because for each observed value you get an estimated value from the regression line. 
So for every observed value, there will be an estimated value. You take the difference and square the answer, and the sum of them will give you your sum of squared errors. We will look at an example of how to answer some of this if it is given to you as a question. So your SST is a measure of the variation of your y values around their mean. Your SSR is the variation attributable to the relationship between your x and your y, which is why it is called the explained variation. And your SSE, which is your error sum of squares, is the variation in y attributable to factors other than x, and hence it is called the unexplained variation. So with that, we can use the SSR and the SST to calculate what we call a coefficient of determination. Otherwise, if you have your r, which is your coefficient of correlation, you can calculate the coefficient of determination, which is R squared. So if we use the formula, R squared is given by SSR divided by SST, or you can say it is the same as your coefficient of correlation r, in brackets, squared. So if you already have your coefficient of correlation, you can just square it to find your coefficient of determination. Otherwise, if they've given you your SSR and SST and they ask you to calculate R squared, you can calculate R squared, and if you take the square root of your R squared, you will get your r, with its sign taken from the direction of the relationship. So it means if you have R squared, you can find your coefficient of correlation. Right. What is the coefficient of determination? The coefficient of determination is the proportion of the total variation in the dependent variable that is explained by the variation in your independent variable. And that's how you're going to interpret the answer that you get. 
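As a rough sketch of how SST, SSR, SSE and R squared fit together, the Python snippet below uses the five test-grade and supervisor-grade pairs and the regression line (intercept 7.125, slope 0.625) from the worked example later in this session. The variable names are illustrative assumptions, not part of the module notes.

```python
# Test grades (x) and supervisor grades (y) from the later worked example.
x = [1, 3, 5, 5, 1]
y = [4, 6, 10, 12, 13]

# Regression line quoted later in the session: y-hat = 7.125 + 0.625x.
b0, b1 = 7.125, 0.625

mean_y = sum(y) / len(y)
y_hat = [b0 + b1 * xi for xi in x]  # estimated value for every observed x

# Total variation: observed y minus the mean of y, squared and summed.
sst = sum((yi - mean_y) ** 2 for yi in y)

# Explained variation: estimated y minus the mean of y, squared and summed.
ssr = sum((yh - mean_y) ** 2 for yh in y_hat)

# Unexplained variation: observed y minus estimated y, squared and summed.
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

# Coefficient of determination.
r_squared = ssr / sst

print(sst, ssr, sse)
print(round(r_squared, 4))
```

For this data the explained and unexplained parts add back up to the total, and R squared is about 0.10, matching the value read from the Excel summary output in the worked example.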
So you're going to say, for example, if I assume that we had an R squared of 64 percent, or 0.64, you would say 64 percent of the total variation in the dependent variable is explained by the variation in the independent variable, or 0.64 of the total variation in Y is explained by the variation in X. Let's look at more examples of this. I've already explained that. So let's look at our R squared. Here we know that our R squared is equal to 1. We know whether the relationship is positive or negative. We know that there is a perfect relationship here because your R will also be equal to 1. The square root of 1 is 1. So in terms of how we interpret this R squared, we will say 100 percent of the variation in Y is explained by the variation in X. For the next plots, let's assume an R squared that is closer to 0. It is a weaker relationship, and therefore we can say it is 0.17. This relationship looks negative, so r will be negative, but this value is R squared, and R squared can never be negative. It will always be zero or positive because it's a square. So for this one we can also say it is 0.18. Let's assume that the two of them are different. Let's make this one 0.27, because it looks much better than the one below. So let's assume that those are the R squared values that we have. We can see clearly that this is a weak relationship, but how do we explain the R squared? We can say 0.27 of the variation in Y is attributed to the variation in X, and in the same way we can say 0.18 of the variation in Y is attributed to the variation in X, or we can also say some but not all of the variation in Y can be explained by the variation in X. You can use that terminology as well. If your R is 0, we say there is no relationship, therefore the value of Y does not depend on the value of X, and we can also say it in this way: none of the variation in Y is explained by the variation in X. So always remember how we interpret your R squared. 
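The interpretation sentence for R squared follows a fixed pattern, which can be sketched as a small Python helper. The function name and its parameters are hypothetical and only serve to illustrate the wording used above.

```python
import math


def interpret_r_squared(r_squared, y_name="Y", x_name="X"):
    """Build the standard interpretation sentence for a coefficient of determination."""
    if not 0 <= r_squared <= 1:
        raise ValueError("R squared must lie between 0 and 1")
    percent = r_squared * 100
    return (f"{percent:.0f}% of the total variation in {y_name} "
            f"is explained by the variation in {x_name}.")


# The 64 percent example from the session.
print(interpret_r_squared(0.64))

# Going back from R squared to r gives only the size of r.
# The sign has to come from the direction of the relationship.
print(math.sqrt(0.64))
```

Note that the square root recovers only the magnitude of r, so a negative relationship with an R squared of 0.64 would have an r of negative 0.8.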
Suppose that the correlation coefficient between a person's salary and their educational attainment is r equal to 0.6395. How do we interpret this? We interpret this through R squared, which is your coefficient of determination, because this value is your r and we want to find the coefficient of determination. We take r and we square it, so it will be 0.6395 times 0.6395, or you can say it's 0.6395 squared, which will give you 0.4090. And how do we interpret that? We can say that approximately 41% of the variation in the person's salary can be explained by the variation in their educational attainment. So from a coefficient of correlation we can just calculate the coefficient of determination to determine what percentage of the total variation in the person's salary can be explained by the variation in their educational attainment. Before I move on, are there any questions or comments? Okay. Yes, Justice. When we interpret this, we say this 41% is from the... can you just elaborate on this 41%, please? So you calculate the 41% because you are given your coefficient of correlation and you are asked to calculate the coefficient of determination and interpret it. You take your coefficient of correlation and square it, because R squared is R times R, right? There are two of them. R squared is the same as R times R, which means there are two of the R's. So I take R, which is 0.6395, and I square that, and I get 41%. And when you interpret the R squared, you always state it in this manner and say 0.41, if we round it to 2 decimals, or 41%, of the variation in the person's salary can be explained by the variation in their educational attainment. Maybe I have done something wrong here, I don't know, because when I square this 0.6395 I got 0.79, when I put it in the square root. 
0.6395 squared equals... So you don't take the square root, you take the square, because it is 0.6395 times 0.6395. Not this one, not the square root, but the square, the power. Yes, like that. Okay, thank you. Okay, if there are no other questions then we move on to the regression analysis. Regression analysis is used to predict the value of a dependent variable based on the value of at least one independent variable. In your module we only work with one independent variable, so we don't do multiple regression analysis. Your regression analysis is also used to explain the impact of changes in the independent variable on the dependent variable, and we use the slope to measure the impact of those changes. So your dependent variable will be your outcome variable, which is the variable that you wish to explain or predict, and your independent variable is your input variable, which is the variable you're going to use to predict or explain your dependent variable. The variable that we write with a cap on top, y hat, will be your dependent variable, because it's the value that we want to estimate. So this is your regression line. Those who did maths or physics in high school, you remember y equals mx plus c, which was the equation of a straight line. Some of you would have learned the same concept as y equals ax plus b, or as y equals a plus bx, something like that. Especially those who did CAPS recently, you used y equals a plus bx. Those who learned in my era, you learned the equation of a straight line as y equals mx plus c, and those who did physics probably also learned it that way. Okay, so they mean one and the same thing. So in terms of statistics we use y
hat equals b0 plus b1 x, where your y hat is your estimate of y, your b0 is your intercept, which we call the estimate of the regression intercept, your b1 is your slope, which we call the estimate of the regression slope, and your x is your observed value. That is your regression equation, or what we call the regression line or the least squares regression line. In the regression equation, these are the two values that we need to calculate and substitute: b0 and b1. The others stay as they are, as variables: y hat and x stay. The only values that change will be those two. So we will end up with a regression line like y hat equals 2 plus 3x. Let's assume that this is our regression line: we will have to write it in that manner. Always remember that the slope is the value that multiplies the x, right, and your intercept is the value that is standing alone. So on that note, what is a regression line? A simple linear regression line provides an estimate of your population regression line, which is the equation that we have just given. We should be able to use the regression line to estimate the relationship between your x value and your y value, and we should also be able to interpret the intercept and the slope: what do they mean in terms of the values that we are seeing? Your slope will always tell us the change in your y value in relation to the change in your x value. And your b0, which is your intercept, applies where your x value is zero: there your estimated value is the same as b0. So your b0 is the estimated average value of y when your x value is equal to zero. If x is zero, then your estimate will be the same as your y-intercept. Your b1, which I've already explained, is the estimated change in the average value of y as a result of a one-unit increase in the value of your x, and
that's how you will interpret the two numbers. So those two numbers, b0 and b1, we're going to interpret in this manner. Let's look at the next example. Right. A firm's personnel department hired employees for a given job primarily on the basis of the results of an aptitude test administered to the job applicants, and the performance of those hired was rated on the same scale by their supervisors a year after they had been hired. A sample of the test grades and the supervisors' assigned grades is as follows. So here is an HR issue where they want to see whether the employees that they have hired are doing the job well, based on the score that they got when they came in as job applicants and the score that they got when they did their performance management with their supervisor. So we want to see if there is any relationship between how the job applicants did in their aptitude test and their performance in their job right now, as their supervisor scored them. Is there a relationship? We need to put the test grades and the supervisor grades on a Cartesian plane and do a scatter plot to visualize them. So here are the test scores and here are the supervisor grades. One employee scored a test grade of one and the supervisor gave them a four, and we can go and plot that: one and four, and that is the employee that we were talking about. The other one scored three and got a supervisor grade of six, so three and six, that is that one. Five and ten, five and ten. The next one, five and twelve, the same, five and twelve. The last one, one and thirteen, one and thirteen. And there is our scatter plot. I also drew a line just to see how close the points are to this line. We'll talk about this line in a short while, because remember, the equation of a straight line is the same as your regression line, which means this is the regression line that we have drawn. So looking at this, how do you interpret it based on what we have
learned in terms of the relationship? When the value of x increases, the value of y increases, so we can interpret it in that way, and it means there is a positive relationship. Here we are only talking about the direction, and we can see that there is a positive relationship, with this one point being an outlier. You all know what an outlier is, yes?
So if we need to write this regression line in the form y hat equals b0 plus b1 x, we need to calculate it. To calculate the regression line we can use Excel. In Excel we can take the test grades and the supervisor scores and create a regression output from the Data Analysis panel, and I will show you how you can do that. That will produce a summary output. On the summary output there are a couple of things to take note of. The first is Multiple R and R Square; you can ignore Adjusted R Square and the rest, and look only at those two. Multiple R is your coefficient of correlation, r, and R Square is your coefficient of determination, r squared. So you are able to see the values: 0.32 for r and 0.10 for r squared. The other thing that is very important is these two rows: we know that we have our intercept, and the test grade row will be our slope. The intercept will always be labelled as intercept. Your slope will not be labelled as slope; it will be labelled with the name of the x variable, but you just need to remember that that is your slope. The only other thing you need to worry about when you look at your output is the coefficients column, which gives you your coefficients, 7.125 and 0.625. So we have already spoken about the coefficient of correlation and the coefficient of determination, and taking the coefficients we can then go and describe our line. Remember,
our equation says y hat equals b0 plus b1 x. Our x is the test grade, our b1 is the coefficient 0.625, our supervisor grade is our y hat and our intercept is 7.125, and that is your regression line. I will show you just now how we can do this in Excel.
Okay, so let's assume that now we are not given the Excel output, but we are asked these questions: determine the least squares regression line, which is that equation; find the coefficient of correlation; and determine the coefficient of determination. Remember, now we're going to use the manual calculation. We have our two columns, x and y, the test grade and the supervisor grade. You need to make sure that you capture them the way they appear in your question. You also need to add some totals, because the sum square measures that we use need summations, and summations are totals. So if I add this column I get 15, which is the sum of x, and the sum of y equals 45. If I take x times y, one times four gives me four, three times six is 18, and so on; when I add all the answers I get the sum of x times y, which equals 145. If I take x and square every value, one squared is one times one, which is one, three squared is three times three, which is nine, and five squared is five times five, which is 25. If I add all of them I get the sum of x squared, which is 61. I do the same with y: four squared is 16, six squared is 36 and so on, and this gives me the sum of y squared, 465, as you can see. Then I can use my sum square measures to answer any of the questions that we have. I can also calculate the means: x bar equals 3 and the mean of y equals 9. So why did I do this in a table? To answer the questions, either for the
coefficient of correlation, the coefficient of determination or the regression line, you need to create a table that looks almost exactly like this one, because the equations you need to calculate use these totals. We know that the regression line is y hat equals b0 plus b1 x. We first need to calculate our slope, and we calculate it using this equation: b1 equals the sum of xy minus the sum of x times the sum of y divided by n, all divided by the sum of x squared minus the square of the sum of x divided by n. As you can see, you need the sum of xy, the sum of x and the sum of y, which are those values in the table. You need n, which is how many observations there are; here n equals five, because there were one, two, three, four, five of them. We need the sum of x squared, which we calculated here, and we also need the sum of x, squared. The sum of x squared and the square of the sum of x are different: the first is 61 and the second is 15 squared. From the slope we calculate the intercept, b0. To calculate the intercept you need the mean of y minus the slope times the mean of x; we calculated x bar and y bar.
So now let's do the calculations. First we substitute the values into the slope equation: the sum of xy is 145, the sum of x is 15, the sum of y is 45 and n is 5, divided by the sum of x squared, 61, minus 15 squared divided by 5. That is 10 divided by 16, and the answer is the same as the one we got previously, 0.625. Then we can calculate the mean of y, which is 45 divided by 5, which gives us 9, and the mean of x, 15 divided by 5, which gives us 3. We substitute all these values into the equation for the intercept: b0 is the mean of y, 9, minus b1, 0.625, times the mean of x, 3, which is 7.125. That's the same as
what we got from the Excel output. Now we need to substitute b0 and b1 into the equation, so we take only those two values and substitute them back, and our equation of the straight line based on the test and the supervisor grade is y hat equals 7.125 plus 0.625x.
Let's assume now that we need to estimate the supervisor grade if someone got a test grade of 2. Our y hat will be 7.125 plus 0.625 times x, and where we see x we put the 2. On the calculator that is 7.125 plus, open bracket, 0.625 times 2, close bracket, equals, and the answer is 8.375. The grades are in whole numbers, so if I round it off it will just be 8. The same happens if they ask about a score of 6: you just substitute 6, and the answer is 10.875, which rounds to 11. So I've already created two more points. These are my estimates, my y hat, and you should be able to use the equation of the straight line to estimate a new value of y.
You can also estimate y for the x values you already have. Remember the SSE, where you take the observed value minus your estimate: for that you have to create a y hat column here. For example, if I need to estimate what x of 1 gives, I substitute 1 and that gives about 8. So for x of 1 the estimate is 8; that is your estimate, not the actual value. If I estimate for 3 you do the same, and the estimate for 3 is 9. You estimate for 5, and for both fives the estimate is about 10, and the estimate for the last 1 we already calculated as 8. So this becomes your column of estimates. If we go back five or more slides, remember that all of these you are able
to calculate, because that is how you will have calculated your y hat. You know what your mean is, since we've calculated the mean of y, so you can just substitute and calculate your SSR. If you are calculating SSE, you know what your y hat values are and what your observed values are. For x of 1 the observed y value was 4 and the estimate was 8, so you take 4 minus 8 and square the answer, and you go to the next one. For x of 5 the observed value was 10 and the estimate was 10, so it is 10 minus 10, squared. You square each one and add all of them, and you use this to answer those questions. So I've already shown you two ways: I've shown you the Excel output and I've shown you the manual calculations.
Now let's continue. Question b said calculate r, the coefficient of correlation. This is the formula to calculate r if you don't have your Excel summary output, which means you are expected to do some calculation. You just need to substitute the values from the table into this formula as well. We know what the sum square measures were: the sum of xy was 145, the sum of x was 15, the sum of y was 45, n was 5, and so on. We substitute into this formula and calculate, and you can see that r is the same as the one we got previously, 0.32. From your r you can calculate your coefficient of determination, r squared. You take 0.3227 and square it; use the long value, don't use the already rounded one. So 0.3227 squared gives you 0.10, or we can call it 10 percent. Now we can interpret this. For your coefficient of correlation we say there is a moderate positive correlation or
relationship, and for your coefficient of determination we say 10 percent of the variation in the supervisor grade can be explained by the variation in the test grade. Sorry, my bad, I need to write it correctly: our y is the supervisor grade, so it's the supervisor grade first, explained by the test grade. That is how you interpret the two.
For the regression line, we can interpret our b0, which is our intercept, although normally we don't even interpret the intercept. You would say the estimated average value of your supervisor grade is 7.125 if the test grade equals zero, because if we put zero in for x, the slope term is zero, so the estimate is just 7.125. For the slope, our b1, because it's 0.625 it tells us that the mean supervisor grade will increase by 0.625 on average for every additional one-unit increase in the test grade. Because it's positive we say increase; when it is negative we say decrease. So if it were negative, we would have said the supervisor grade will decrease by 0.625 for every one-unit increase in the test grade. You need to pay attention to that; increase or decrease is the key word, and it's based on the sign in front of the slope. The sign in front of the slope should also correspond to the sign in front of your correlation coefficient.
Okay, in summary, since we are left with 30 minutes and I haven't even gone through the Excel output and the calculator, what you have learned is the following: how to use
regression analysis to predict the value of your dependent variable based on the value of your independent variable. We have learned the meaning of the regression coefficients b0 and b1, which are the intercept and the slope. We've learned how to make inferences about the slope and the coefficient of correlation, and how to interpret the coefficient of correlation and the coefficient of determination.
Now I want to go into how we use a calculator. We might not be able to do a lot of exercises, but this should give you some idea of how to use your calculator instead of those formulas that I've shown you to solve the same questions. So let's assume that this is the new data we are given, and we need to calculate the regression line, the coefficient of correlation and so on. I suggest that if you have a Casio calculator, you open it, and when I say press this, you do that and follow as I do so that you don't get lost. I will also open mine and do the same, so I expect you to mimic what I do and follow what the slide is saying.
Okay, so we have our data, our x and y. We first need to put the calculator in STAT mode. You press the MODE button and this menu will come up on your calculator. Then from that menu you press 2 for STAT. It depends on your calculator, because some Casio calculators are different; the one that I have looks exactly the same as what I am displaying on the PowerPoint, so I'm going to press 2. Make sure you check what you see on your side and press the option that says STAT. Then the next menu that comes up will look exactly the same as what you see in front of you on the slide and
on my calculator. On mine, it's number 2 that I'm interested in; that's where we do the regression calculations. The first option we would use for anything dealing with measures of variation and measures of central tendency, the ones that we did in term one. For now we're going to use 2, for A plus BX; as you can see, it looks almost like our regression line. So I'm going to press 2 again, and the table will appear, which means we're ready to capture our data.
You have your x and your y; you first capture the x values and then the y values. To capture the x values we say four equals, two equals, six equals, four equals, three equals. Once you are done and you are at the end, you use your arrows to scroll up: up, up, up, up, until you get to the first row, where x is four, and then you move to the right to get to the y column. Your Casio cursor will be flickering on the y block, and then you start capturing: five equals, three equals, seven equals and so on.
So let's do that on my calculator. I'll start: four equals, two equals, six equals, four equals and three equals. You can see that there are five of them, one, two, three, four, five, so my observations are five and my cursor is on row six. I'm going to go up, up, up, up until I get to row one, where x is four, then I use my right arrow to go to the right, on the same row. You need to make sure that you capture the values exactly as you see them: four needs to correspond with five, two with three, six with seven and four with six. So I'm going to say five equals, three equals, seven equals, six equals and five equals, and that is the last one. I can go back and double-check that my values correspond, and I can see four and five, two and three, six and seven, four and six, and three and five. They are
exactly the same. Then you press the AC button; your data is stored on your Casio calculator. Now, to get to the next step we need to press SHIFT and then 1, so that we get a menu that looks like that. Any questions? Okay.
So now you should have a menu that looks like this. You can ignore number 1, where it says Type. Number 2 says Data; that is the table that we captured, and you will be able to see the same data. Number 3 gives you the sum square measures, the summations that you saw earlier. If you press 3 you will see all those summations, so if you are given a formula for the coefficient of correlation, you can use the summations to substitute into your formula. To get one, you press the number of the value that you want. Let's say we want the summation of xy: you press 5 and then equals, and it tells you that it equals 107. Once you have the answer, you press the AC button, go back with SHIFT and 1, and it takes you back to that menu. If you want the next summation value, you press 3 again, then let's say 2, then equals, and it gives you that summation, and so on and so forth. But we're not interested in the summations for now.
Yes? You say you press SHIFT and you press what? One. SHIFT, 1, and it goes back to that menu, and then you press whichever option you want to go into. For now I was just demonstrating the sums, but I want to move to the one that
talks to the regression. So let's go to regression, which is 5. We press 5 for regression and you will see that 1 corresponds to A, 2 corresponds to B, 3 to r, 4 to x with a cap, the estimate of x, and 5 to y with a cap, the estimate of y. Now let me explain each one of them. Where you see A, it is your intercept. What I suggest you do is write down the equation y equals b0 plus b1 x, so that you know that A corresponds with b0 and B corresponds with b1, because the equation you see on the calculator, A plus BX, looks exactly the same. Remember your intercept and your slope; this is very important, especially when you go to answer the question. Pressing 1 for A gives you the answer for the intercept, and pressing 2 for B gives you the answer for the slope. That is one thing you need to remember. r is your coefficient of correlation. If you need the coefficient of determination, you have to press the x squared button: 3 gives you the coefficient of correlation, and pressing x squared after 3 gives you your coefficient of determination. If you need to estimate the value of y, I will show you how to do that.
First let's find the intercept for these values. I pressed the wrong number; SHIFT, 1, 5. So A is 1, equals, and our intercept is 1.659. I hope everyone has that. So y equals 1.659; I'm just going to keep two decimals, which is 1.66. Now let's find B. Before I write the sign, I need to find the value of B. You do the same: SHIFT, 1, 5, B is 2, and you press equals. It's positive; if it were negative it would show a negative sign. It's 0.93, so that will be plus 0.93, and I'm not done, I must put an x there. So y equals 1.66 plus 0.93x.
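As a cross-check on the calculator steps, here is a minimal Python sketch that applies the same sum-based formulas from the manual calculation to the five data pairs captured on the calculator. It is not part of the session materials, and the variable names are illustrative assumptions. If the keystrokes were followed correctly, the sum of xy should come out as 107, the intercept A as about 1.66 and the slope B as about 0.93.

```python
# Data pairs captured on the calculator in this example.
x = [4, 2, 6, 4, 3]
y = [5, 3, 7, 6, 5]
n = len(x)

# Summations (the values under option 3, Sum, on the calculator).
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

# Sum square measures used in the slope and correlation formulas.
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

# Slope (B on the calculator, b1 in the notes).
b1 = ss_xy / ss_xx
# Intercept (A on the calculator, b0 in the notes): mean of y minus slope times mean of x.
b0 = sum_y / n - b1 * (sum_x / n)
# Coefficient of correlation (option 3, r, on the calculator).
r = ss_xy / (ss_xx * ss_yy) ** 0.5

print("sum of xy:", sum_xy)
print("intercept b0:", round(b0, 3))
print("slope b1:", round(b1, 3))
print("r:", round(r, 3))
```

The sketch uses only built-in Python so that each line maps onto one row of the hand-calculation table; it is meant for checking your own working, not as a replacement for the calculator or formula methods.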
So let's say, for example, I need to find the coefficient of correlation, r. SHIFT, 1, 5, r is 3, equals: 0.93. So our r is 0.93. Let's find r squared: we just press the square button on the answer and equals, 0.87. So we know that it is a strong positive relationship, and we know that 87 percent of the variation in y is explained by the variation in x.
What if I want to find a new estimate, let's say where x is 5? To find the estimate, you first press the x value, 5, and then you press SHIFT and 1, then 5 for Reg, and then 5 again for the y estimate. When you press equals, the new estimate is 6. So where x is 5, y is 6.
What if they give us the y value, let's say 4, and ask us to find the x estimate? Same procedure: you start with the value that you are estimating from, which is 4, and then you press SHIFT, 1, 5 for Reg, and because now we are estimating a value of x, we use the x estimate, which is on button number 4. It just happens that the value I used is also 4; if it were 10 you would start with 10. The new estimate is 3, so where y is 4, x will be 3. Let's say our y is 10 and we need to estimate the x value. Similar thing: you press 10 first, then SHIFT, 1, 5 and 4, and equals, and it equals 9. That's how you do the estimates. On your Casio you always go back to SHIFT and STAT, because that tells the calculator that you want your STAT functions. All the
time, you continue with SHIFT and 1 and then select whichever option you want. If you want to find the mean of y and the mean of x, you will find them under Var: there are your means, and there are your s values, which are your standard deviations for x and for y. So that's how you use your Casio calculator.
For those who have a Sharp calculator, or a financial calculator, these are the steps. You also need to put your calculator into STAT mode. I'm just going to open my calculator so that you can follow the same steps. On your calculator the values are visible in green in front of you: the mean of y, the mean of x, the standard deviation of x, the standard deviation of y, the sum of xy, the sum of y, the sum of y squared, the sum of x squared and the sum of x are all visible here. Don't look at the A, B, C, D keys that are there; look at the a and the b that are here, in small letters. It's the same thing: y equals a plus bx is the same as b0 plus b1 x.
So let's look at how you follow this. You go into MODE and press 1 for STAT. Then, instead of pressing 0, which is what we used in study unit three, we're going to use LINE, so you press 1 for LINE, and your calculator will be in STAT mode. Now, there is a small difference between a financial calculator and this one, which is a normal Sharp calculator. On your financial calculator you have the data change key and other things, so you need to make sure that you use the two keys that are next to one another there, the Enter and the data change. So what you will do is: your
calculator is ready; it's in STAT mode 1, which means it's ready to do regression calculations. To capture your data, because there are two variables, we capture x and y together. It's not the same as the Casio, where you press the equals sign and capture one column first. I will be using the STO and the M+ keys; on your financial calculator you will be using the two buttons that are next to one another, the Enter and the data change. I forget which one is which. You first press the x value, which is 4, then you press STO, which will be the data change on yours, then you press 5, and then M+, or Enter on yours. When you do that it will say data set 1, or n equals 1, whichever way it shows. I should have data set 5 when I get to 3 and 5. So let's continue: 2, STO, 3, M+; 6, STO, 7, M+; then 4, STO, 6, M+; and 3, STO, 5, M+. Now I've got all of them. It says data set 5, so I know that there are five records and I've stored all of them.
Then I can press the ON/C button, and from there I can select whichever calculation I need. Because I'm calculating the values that are in green, I first need to press the ALPHA button. Let's say I press ALPHA, a, and then equals: you can see that you get the same answer, 1.659, so 1.66. You do the same if you need b: you press ON/C, then ALPHA, b, equals, and it's 0.93, so plus 0.93x. The difference with this calculator is that your r is right there, so you can do the same, ALPHA, r, equals, and r is 0.93. For r squared you just press x squared; it will square that and give you your r squared.
So let's say we want to estimate the same thing that we did with the previous calculator. Let's say we want to
estimate the value of y where x is five. You do the same: you press 5. Now on this calculator the two estimate functions are written as y with a cap and x with a cap, and those are the ones we use to do the estimates. Because they are written in orange, above the open and close bracket keys, you press 5, then the second function key, and then the bracket key with the y estimate above it, and it is 6. You will notice that it is the same as the one we got on the previous calculator. The same thing if I need to do the one with nine and ten: it will give you the same. So let's assume that y is 10 and we need to estimate the value of x. You do the same: ON/C, 10, second function, the x estimate, and it's 9.
For those who are using the financial calculator, I do also have the steps, and you can see that they are almost exactly the same. You have your Enter and your x,y data change key, or the comma, on that one; the two buttons that are next to one another are the ones you use. The steps are here. Your calculator also looks almost exactly the same as the normal Sharp calculator, especially when I was referring to the r, the a, the b and the summations; they are there in front, exactly the same, so you should feel at ease if you use those types.
We only have 15 minutes, and in these 15 minutes I wanted to do an activity, but instead of doing it with a calculator, I'm going to do this activity in Excel, because that's what I said I would show you. You will have this Excel sheet, which is a template. You will notice that there are a couple of things on this Excel sheet that are already pre-calculated and automated. The blue area you do not change; you do not do anything to it. It will automatically calculate all the values that you need: the value of your
b1, the values of your means of x and y, the value of your b0, and it will also calculate your regression line. You can see next to it there are some formulas. The other thing it calculates without you doing anything is your coefficient of correlation and coefficient of determination. It also shows you which equation we are using, and I also show the functions that I used to do those calculations, so you can see them there if you want to replicate them on your own. The other thing I've included here is the calculation of your sum square measures, your SSE and your SST, in case they give you questions that relate to them. These are the answers for each one of them: for the blue one the answer is 263, and for the green one the answer is that.
However, the challenge with using this template is knowing how to use it so that you do not affect the calculations. I've added a note here and you need to read it. It says that if you have more or fewer values, to add or delete a row you need to click in the cell in column B, row eight, and drag along that row to highlight it until you get to the y squared column. Let me demonstrate what I mean by that. If, for example, I need to delete this one, I need to highlight it like that, as you can see there. I'm not clicking on the row number, because if you do that you will delete the rest of the other calculations that we have. You need to select only up to where you need to stop. Maybe I should change this area to gray, so that the white area is the only area where you make any changes. So this is the part that you need to change. Let's go to our question; you can minimize this. Our question has one, two, three,
four, five, six, seven values. So I go to our table and check: one, two, three, four, five, six, seven. I don't need the rest of these rows, so from 12 to 10 I highlight across until I get to the y squared column, column K, and then I right-click. Those are the instructions as written: right-click, then insert or delete the rows that you don't need. You can repeat the step until you have enough rows to complete your x and y. If you have more values that you want to include, you do the same: right-click and insert more rows. Now I do not need these other values, so I'm going to delete, and it asks whether I want to delete the entire row or shift the cells; I just want to shift the cells up. If I were inserting, they would shift down. Oh, I forgot, sorry, my bad: I said column B. You need to start from column B because there are some totals there that we need to calculate as well. So you start from column B, you delete, the cells shift up, and there is my total. I'm also going to delete all the old values because I don't need them, and you will see that my table now has meaningless results in it.
Then I'm going to capture the information as I see it, starting with all the x values and then the y values: two, four, five, six, seven, nine, nine. It automatically calculates all the other measures, and there are my sum square measures: my summation of xy, my summation of x squared, my summation of y squared and my n. If I scroll to this side I get my slope, my mean of y, my mean of x, the intercept and the equation of the straight line. I can also double-check: 28 divided by 7 equals 4, and 42 divided by 7 equals 6, so I've done my due diligence there and the calculations are correct. So I can go to the question. Let's look at
our question and use the data that we have. We can also double-check the totals, because they did give us the sum square measures here. This is the sum of xy, this is the sum of x; if I go up you should be able to see all of them. The sum of x squared is 140, and there is 140; the sum of y squared is 292, and there is 292. So already I can see that we've done the right things. It's difficult when you are doing a presentation and you have to open this thing. Okay, so let's go on.
Here are the calculations. The mean of x we said is 4; it's correct. The mean of y is 6; it's correct. b1 is 1.1786; it's correct. Because this is shown to only two or three decimals, we can increase the decimals, and you can see it's the same. For the regression line, remember we're looking for the incorrect statement. Here is the answer for our regression line from the template, so we can compare the two to see if that statement is the incorrect one. The slope: we know that the slope is positive because the answer we got there was positive, so that statement is correct. Then it says that when we estimate for x equal to 8, the value of y will be 10.978. We can check that, because we can take the intercept plus the slope times 8, which is our estimated value, and it should give us 10.71. Very disappointing, my calculation doesn't want to work on there, but it's fine, we can use our calculators. I just want to double-check something here as well. My fault here was that I used 8.6 instead of 8, and now that is the answer that is correct. If we use our calculator we can also get the same answer. We closed the Casio;
you want the Sharp, or let me use the Casio? It's easy with the Casio. So we have our equation of a straight line as our slope... sorry, our intercept, 1.286, plus the slope of 1.1786 times 8, equals... and that is the same; it's just that I've rounded off the values as well, so that is why my answer is slightly higher. We can increase the decimals here and we should be getting the same answer, more or less, because I have rounded off. Yes?

In a question like this, if this was one of the questions in the exam, they will obviously give us the intercept and the slope in order to answer a question like that, right?

Nope, nope, they won't give you those. Like you see here, you have to calculate them in order for you to get them. In answering this question, the only information they gave you was the table and the sum square measures. You need to be able to know how to calculate the mean of x, the mean of y, and b1. You can use the formulas, like the formulas that I shared previously; you should be able to calculate all these formulas based on that information. And if they ask you about r, you should be able to use the r formula, or you can use the Excel sheet that I've just shared with you and just demonstrated, or you can use your calculator. We've used the calculator, right, to calculate all the values. In order for you to answer this you can use your calculator, or the Excel, or the formulas.

Now the last one that I also wanted to show you: I'm going to take the same data that we have, which is that x and y. Some of you might not have this: if you go to Data, there is a Data Analysis panel. If you don't have that, you go to File (you can come back to this recording and watch it again), you go to Account... sorry, not Account, Options, and you go to Add-ins, and it will come to this menu. There is a thing called Analysis ToolPak. You click on
the Analysis ToolPak and you say Go at the end, and this menu will pop up. You're going to tick the Analysis ToolPak and click OK. Once you've done that, you close your Excel and reopen it, and the next time you reopen it the Data Analysis panel will be there. Once you have that, then you capture your data. Let's assume that I've captured this data; I'm going to open a new sheet. You capture the data, you type your data: there is your x and there is your y. Then you answer the question by going to Data, Data Analysis. You don't have to highlight it; you will see why you don't have to highlight it. You go to Data Analysis and this menu will pop up. You scroll until you get to Regression, you click on Regression and you click OK. Here it will ask which values are your y values. You need to be very careful with this in terms of the data that you are given. For our y value, you're going to select only the y values. You go to the x value, you click inside the box, and then you select your x values. You also need to state whether you included the labels. You see that I've included the label every time, because I want the label to be part of the report as well, so you will tick Labels. You ignore the rest of the other things that are here; everything that is on here you ignore. You can leave it on New Worksheet Ply, or you can say you want it on the same sheet that you are on: you click on Output Range, you click inside the box where it's got the arrow, and then you just select where you want to put the output. You just click on one of the cells and then you press OK. It will generate this output, and you can just make it a little bit bigger so that everything is clear and you are able to read every bit of it. And there are the things on there. The only thing you need is that, because remember that is your coefficient of correlation and that is your coefficient of determination. Now let's go
back and answer the question. Already you can see from here, if I go this way, there is my b1. You must always remember that the intercept is b0; the slope is the actual value that you will see there, at your x. The only thing that you don't have with the output is your mean and these other things, right? So you are only able to answer one and two, but you cannot answer those other ones.

To answer the rest of the other ones, you go to the same data that you have: you go back to Data, Data Analysis, you go to Descriptive Statistics and you click OK. You select the data, both columns of it, and you say how it is grouped: because it looks like this, it's by column, so you say it's grouped by columns, and the labels are in the first row. You're also going to say you need the summary statistics, because you want to know what is the mean and what is the standard deviation. The output you can also put right here; maybe you can go just a little bit away from this one. Now, it also doesn't give you all the answers that you need, because this one will just give you your mean. That is the only thing that you will have, your mean for x and y, because that is what you would have calculated, right? So you are only able to answer those two, that's it. For the slope, you're only going to look at whether it's positive or negative, based on the answer that you got for the slope, which is the answer here. Positive means the value is positive; negative, the value would have been negative. This is not a slope, this is an intercept, so you cannot use this. You will use the value that is sitting in front of the x, the value that is multiplying with an x. The estimate you will calculate based on the regression line. So let's see... yeah, there is no other thing, other than the mean, that you will get from here; it will not give you the other things, as you can see. To calculate the other measures that you might need, like your sum of x squared and all that, you will have to
calculate them manually by multiplying this by that to get the answer there. So let's say you want to create an xy column, which is x multiplied by y: you will just say that value multiplied by that value and you will get the answer, and you just drag to the end. You will also need to do the totals; remember it's very important to always have totals. You get the total by just going to AutoSum and adding all the values, and that is 201, which is the same as what you have here, 201. And you can do the rest of the others. You won't be able to calculate if you don't know the formulas. So that is one of the exercises.

The other exercise looks like this. They have also given you some of the sum of square measures, so you can use the formulas. You can see here they're asking you to calculate the coefficient of correlation and the coefficient of determination. On this one we have only seven rows. This question has one, two, three, four, five, six, seven, eight: you have eight observations. So it means on our Excel sheet (I'm just going to minimize it on the side and make this smaller) we just need to add one row. So you just go to that and highlight, and say Insert, and we need it to shift down, and you just enter the values. What I like to do is to delete that. Now the other thing, oh sorry, I forgot to mention: you will need to adjust the calculations as well, because when you add the row there are no calculations added on those rows, the empty rows that you have there. So you go to the top one that you have here and you just drag; it will do the same calculations as the previous one. Then you just add the values. So I'm going to start with x: five, three, seven, nine, three, four, six and eight. And you do the same with the y: 20, 23, 15, 11, 27, 21, 17 and 14. And you will see, once I'm done with that, then I can just go to the left, oh sorry, to the right, and there are my answers. Okay, so let's see if I have all the answers that they need and
make it bigger so it's much better and readable. So the first one says the coefficient of correlation is zero comma... it's minus 0.99. Where is my coefficient of correlation? There is my answer. You see how easy it will be; otherwise you will need to go and do that on your calculator, enter the data and then go find the coefficient of correlation. So with the template it can be easy, but you need to know how to use the template. Then, what does it mean? The coefficient of correlation is negative 0.99. The coefficient of determination is positive; the coefficient of determination will always be positive because it's a square. That should be correct as well, because also when you look at the answer there, it is positive. The best-fit line is given by this, so let's see our best-fit line. We know that the 2.119 must multiply; our slope must multiply with an x. If you look at this, the slope is not multiplying with an x. That is the other thing that you need to pay attention to when you answer your regression questions: they might swap around the slope and the intercept. So this is incorrect; it should be minus 2.119x plus eight, as you can see, there is the answer, because b1 is minus 2.1 and b1 always multiplies with an x. There is a strong negative relationship; looking at the coefficient of correlation, you should be able to state that. And the next one says the estimates in the regression of the above variables are reliable. How do you state that they are reliable? Is it based on the coefficient of determination or the coefficient of... ah, you can see that it's almost close to one, which is perfect, so the estimates are reliable. The option that is incorrect is number three.

And you will notice with the other activities, I'm not going to do all of them for you; you will have to go and practice and try, following what we have covered: formulas, Excel, two ways
in Excel, because I've shown you the two ways that you can do it. So this is the other way: using the template or using the Data Analysis, you should be able to get the same answers, or using your calculator. Now you've got four ways that you can answer correlation questions. Also, using your calculators, you should be able to know how to answer this question: you should be able to calculate the slope, the intercept, and the coefficient of correlation, so that you can calculate the coefficient of determination. There, you can see that you should be able to estimate. Here they give you 10 years; 10 years of driving experience is x, and you need to estimate your monthly car insurance money, how much it will be. You should also be able to answer content-related questions where they ask you to interpret your r. So you need to know your coefficient of correlation: what is it, how do you interpret the values, and what does it refer to.

And this is one of those exercises where you are not given the data. Remember, you can only use the template if you are given your x and y values; you can use your calculator if you are given x and y values, especially the functions. Otherwise, if they give you things like this, they expect you to use the formulas. So you need to go and find what is the formula for SSxy and SSxx so that you can calculate that. You need to be able to find the regression line based on the sum square measures; we have done that, the formula for b1 and the formula for b0, to calculate this, because you have your sum square measures, so you need to use the formulas. You need to be able to take your regression line and estimate, where x is 1.5, the value of your y. You need to be able to use the sum square measures to calculate the coefficient of correlation. So this is very important: you are not only going to be given the x and y values, or two columns with data, and asked to just
substitute them into the templates; you need to be able to know how to calculate using the formulas themselves. Okay, you need to be able to know how to calculate SST; the formulas are given to you. It makes it easier when they have given you x and y; then you can use the templates. And this is just the same, I'm not going to repeat that. And this is another one where they just ask you to estimate. So on this last one you don't have to go and do any other calculation except taking the score, substituting it into the formula and finding the answer. They've given you the formula; they just want you to give them the answer: what will be the predicted value? You need to know how to interpret r and r squared in order for you to answer this question. The same applies to the next one, which is this question, and the last one, which is exercise 11, and I think exercise 12 is repeating. But that concludes today's session. We took longer. Please make sure that you complete the register; I'm going to place the register in the chat. The register is in the chat. Are there any questions or comments?

There, regarding the register, this is Lizzie. Remember, I'm not in the statistics group now. Can I get it through WhatsApp? Or can you also add me there in the statistics group, because my phone was lost?

Okay, I will check, but I think you are on the group; I want to check. Okay, so if there are no questions or comments, I'll be happy to see you next week Saturday. Next week we no longer have anything content-related, but I know that some of you are busy with your assignments, and I think it should be your last assignment. If you have any questions, if you are still uncertain about anything, bring them, because we are now done with content; we are right at the end, and it's now preparation for exams. So bring those questions, bring those uncertainties, the things that you are still struggling with, and let's
have that discussion for the two hours that we have. Otherwise, enjoy the rest of your Saturday. Bye.

Thank you. Thank you. Thank you, enjoy as well, everybody.
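
As a rough check of the first exercise worked through in this session, here is a minimal Python sketch (not something shown in the session itself) that applies the standard least-squares formulas to the sum measures quoted there: n = 7, sum of x = 28, sum of y = 42, sum of xy = 201, sum of x squared = 140 and sum of y squared = 292. The function name and variable names are illustrative assumptions, and the rounding is only for display.

```python
import math


def regression_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Return (b0, b1, r) for simple linear regression from summary sums."""
    # Sum-of-squares measures: SSxy, SSxx and SSyy.
    ss_xy = sum_xy - sum_x * sum_y / n
    ss_xx = sum_x2 - sum_x ** 2 / n
    ss_yy = sum_y2 - sum_y ** 2 / n

    b1 = ss_xy / ss_xx                     # slope: always multiplies x
    b0 = sum_y / n - b1 * (sum_x / n)      # intercept: mean(y) - b1 * mean(x)
    r = ss_xy / math.sqrt(ss_xx * ss_yy)   # coefficient of correlation
    return b0, b1, r


# Sum measures from the first exercise (x = 1..7, y = 2, 4, 5, 6, 7, 9, 9).
b0, b1, r = regression_from_sums(n=7, sum_x=28, sum_y=42,
                                 sum_xy=201, sum_x2=140, sum_y2=292)

# Estimate y where x = 8 using the fitted line y = b0 + b1 * x.
y_hat = b0 + b1 * 8

print(f"b1 (slope)     = {b1:.4f}")    # 1.1786
print(f"b0 (intercept) = {b0:.4f}")    # 1.2857
print(f"y when x = 8   = {y_hat:.2f}") # 10.71
print(f"r              = {r:.3f}")     # 0.986
print(f"r squared      = {r * r:.3f}") # 0.972
```

Working from unrounded values of b0 and b1 gives 10.71 for x = 8; substituting the rounded 1.286 and 1.1786 by hand gives a very slightly higher figure, which is the rounding effect mentioned during the calculator check.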