While we are getting ready, welcome to your exam preparation session. I've scheduled two sessions, today and next week, to go through your past exam papers and to answer some of the questions that you might have regarding your module. Another disclaimer that I need to make, because I know that some of you might be new to the platform and didn't attend the couple of sessions we had before today: I am just a statistician, I'm not a psychological researcher. So some of the concepts I am not too familiar with. But we can still have the whole conversation, because you know your material, you have studied, and you can help one another as we prepare for the exam. So don't expect me to tell you whether some of the things are right or wrong, except if it's an area that I am familiar with, which is statistics. When you discuss things like moderators and all that, I don't know any of those, but when it comes to probabilities and the differences between the different types of statistics, I should be able to help you and guide you properly.

Since we're going to be looking at your past exam paper, someone shared the 20... I just need to make sure that I'm looking at the right one. The May/June 2020 exam, that is the exam paper that we're going to go through together. We can do question one up until question 70 in the two hours that we have, right? There will be sections in this question paper where I will not be able to guide you, but I expect you to have an answer to those questions because you've gone through your module as well. And anyway, it's another process of learning. The more we engage, the more you learn as well. Okay, so are there any questions, comments? I don't know anything about the structure of your exam, how you're writing, on what platform and all that; I don't have any idea or any clue about it.
My job today is to help you get prepared with the content to go and write the exam, that's it. Are there any questions? Anything you want clarity on before we start? For those who don't know, my name is Elizabeth. During the course of the session, you can call me Elizabeth, you can call me Lizzie or Liz, whichever name you are comfortable calling me. The other thing: I am not a lecturer. I don't lecture at UNISA; I just facilitate the classes for UNISA parallel. So we should start on the right footing as well. And also, with this session, especially the exam preparation, I feel that because I've spoken a lot for a couple of weeks, this is the chance for you to start engaging and talking and answering the questions. I will see how the session goes, but I expect you to participate fully during our conversation so that it doesn't become me alone asking the questions and responding to them as well. So I need you to engage with the questions. Have a discussion. If you don't agree with something, say it: you don't agree, it should be this. Let's have that conversation. And if I see that we are stuck between the two, then I can come in to make sure that we move on and clarify certain things, so that we get everyone on the same path again. So I expect full participation. You can unmute your mics and ask or talk. Feel free today.

So I'm going to share the exam paper that we are going to go through. This is the exam paper. For the last time again, are there any questions, comments, queries, anything you want me to clarify before we even start? This session is going to be a very long one because nobody responds to me. Nobody talks to me. So it's just going to be me alone in this for two hours. I need to rest my voice, really.

Miss Lizzie, I think people are waiting for you to start.
Then if we can start, then the questions will come, because at the moment you are busy talking and elaborating on everything that we want to know. So we don't have questions. Let's start, and then questions will be asked.

Okay, I thought that in terms of processes, people might want to know something before we even start. But anyway, you just want to dig in. Okay, cool. Let's dig in, as you say. Question number one, that's how I'm going to start, because there is no other way that I've prepared for this session other than just going through the questions.

So there is your question number one. The term inference in psychological research refers to ..., and there are three statements. One, the process of setting up a hypothesis as a relationship among variables. Two, making a prediction or generalization based on existing information. Three, the procedure for making a construct visible so that a measurement can be made. Which one of the three statements is inference? It's number two, generalization. Yes, which is number two.

So is that how we want to continue? I think I'm going to ask if anyone wants to take over and read the questions, because I am actually not feeling very well. My throat is sore and I'm coming down with flu and all that. Can I ask someone to read the questions for us so that I can rest my voice? Because with 70 questions, by the time we get to the end my voice will be gone. Question number two. No one? Okay, I will just read them, because the challenge I have is that these are recordings that we share with other students. We know that at UNISA we have diverse students. There are those who are visually impaired. They learn by listening. So we need to read the entire question so that they are able to understand which one we're referring to. I can't just say question two and stop right there, and then you give me the answer.
I need to read the whole question so that someone would understand what the question was and how we answer it.

Sorry, Lizzie. Question number two. Lizzie, it's Adele. The students need to help you out. I'm sure somebody can read the questions on your behalf now that you don't feel well. Melania. Melania, hello. Don't you want to read the questions? Melania, speak to us.

I know that there was another person who was speaking with me just now. I don't know who that was. Anyone? Please come. Now, Adele, let me die for the UNISA students. By the time we get to 70 questions on this question paper, my voice will be gone. They won't even hear me. It's their own fault because they don't want to help me. Okay, so question number two.

Lizzie, can I just ask at this point that all the students tonight put their student numbers in the chat? This is the calling of the register. Put your student numbers in the chat, please. And I just want to check whether all the students who logged on as guests have access to the chat. Not getting a response, Adele. I don't have access. Who's speaking? Tia. Did you log on with your work email address or Gmail address? I'm not sure, it automatically logged me on. But when it asked you to fill in something, did you just fill in your name? Yes. You should have used your UNISA credentials, because it's important to have access to the chat if we post links. I want to post the link to the recordings for you, and now you won't be able to access that, you see, which is a pity. So remember with the next session to log on with your UNISA credentials and not just your name, because you are now logged on as a guest. Okay, Lizzie, I won't bother you again. Thank you very, very much. No problem, Adele, thank you. We can continue.

Okay, let's start with question two. The mean, the range, the variance and the standard deviation are examples of ... Number one, is it variables? Number two, descriptive statistics. Number three, inferential statistics. Which option?
Number two. Number two: the mean, the range, the variance and the standard deviation are examples of descriptive statistics. A variable is just a characteristic that defines an object or an individual. Inferential statistics are your hypothesis testing and your generalization statements.

Okay, question three. Which of the options below provides the best description of the main purpose of quantitative research in psychology? Its purpose is to: number one, develop theories that explain the relationship among observed aspects of human behavior and mental processes. Number two, develop predictions about human behavior which can be applied with absolute certainty. Number three, develop hypotheses about the relationships that may exist among the constructs. Which one of these best describes the main purpose of quantitative research? That's number one. It will be number one. Psychological research is about testing theories of human behavior and mental processes against observation. The main purpose of quantitative research is to develop theories that explain the relationship among observed aspects of human behavior and mental processes. Do we all agree? This is where I go and hide under the tree and not show my face.

Question four. The process of selecting a subset of a population for a survey is known as ... Is it one, triangulation? Two, sampling? Three, operationalization? It's two, sampling. It's two, sampling, because we are taking a sample out of your population, a subset. A small group.

Question five. Empirical, sorry. Empirical knowledge is knowledge that is based on: one, careful reasoning; two, appropriate theories; three, observation of events. Empirical means based on observation, so that is number three. Do you all agree? Yes. Yes. You guys, you didn't even need me to be here. I could go and have a nice rest if you are on track. I don't hear anyone disagreeing with anything right now.

Question six.
In the context of psychological research, to perform a measurement is to ..., and this is one of those questions where I need to go and hide under the tree. Number one, find a way to observe a specific construct or phenomenon which is hidden. Number two, allocate a number on a scale to classify or indicate the magnitude of a phenomenon or a construct. Number three, calculate a summary value which describes an aspect of a specific construct or phenomenon. It's number two: allocate a number on a scale to classify or indicate the magnitude of a phenomenon or construct. That's number two. Others? We agree, it's number two.

And there is another one. Question seven. Which of the following best describes a latent variable? One, observable. Two, hidden. Three, independent. Two, hidden. This one I know: it is hidden. A latent variable is a hidden variable.

Question eight. Abstract concepts such as anxiety, hyperactivity and intelligence, which are used in psychological explanations, are referred to as: number one, parameters. Two, measurements. Three, constructs. Three. Three. Parameters are measures that come from a population. So it means the mean and the standard deviation that you calculate from the population. Measurements are your data units, the values that you use to calculate those means and the other things. Okay. Some of the psychological research concepts I know a little bit about. Not a lot.

Question nine. A psychologist has a theory that visual perceptual ability influences the marks that learners will get in a mathematics test. In this example, the learners' marks in the mathematics test are referred to as the ... variable. Is it a dependent variable, an independent variable or a manifest variable? A dependent variable is an outcome variable. An independent variable is an input variable. A manifest variable is a visible variable. It's a visible variable, and it can be either dependent or independent, right? One.
Because a latent variable is invisible, whereas the manifest is visible. But then what will a learner's mark be? Number one, dependent. Number one, it's dependent. It is a dependent variable. I don't know if I may ask, is it the variable that is influenced? It's the effect one. The effect one. Yeah. So remember, X influences Y. Hence X is your independent variable, which is an input variable, and Y is your dependent variable. Y is the variable that is getting influenced, and it is what we call an output or an outcome. Or an outcome.

So the theory says that visual perceptual ability influences the marks, so the marks will be your outcome. It depends on, say, whether pictures are shown beforehand: how would the student perform in their mathematics? Let's say we show them a visualization of two apples and then take away one apple. Will they understand, and will that visualization help to improve the marks? That is what this statement is saying. So your input variable will be the visual perception that you have of something, and the outcome will be whether your test mark improves, doesn't improve, or you even score less. That will be the outcome. So the learners' marks in the mathematics test in this instance become your outcome variable, which is then a dependent variable. In a way the marks could also have been called a manifest variable, because visual perception is your latent variable, right? It is hidden. It's not something that is out there physically that you can see.

So, question 10. Oh, before we move to question 10, is there any question relating to question nine? Are we good with question nine? Yes, ma'am. Yes, ma'am.

Question 10. A psychologist is interested in studying the interaction between small groups of four to five people in each group. He suspects that the interaction between such groups can be described in similar terms to the interaction between individual persons.
In order to be able to do a scientific study of this ... question, he would have to provide a or an ... definition of the ... called interaction. Is it one: for a scientific study of this research question, he would have to provide an operational definition of the construct called interaction? Or will it be number two, which states: for a scientific study of this experimental question, he would have to provide a research definition of the statistic called interaction? Or will it be option three: for a scientific study of this hypothetical question, he would have to provide an empirical definition of the parameter called interaction? Is it one, two, or three? Number one. It will be number one. It's number one. Yes. Research, operational and construct. Because we must first start with the research and check the operational, and then it will... Yeah.

Oh, and if you were stuck between the three, because it can also be experimental research as well, right? You can look at the last bit, because it says that interaction would be either a construct, or a statistic, or a parameter. Remember what I explained previously about what a parameter is: a parameter is a measure that you calculate from a population. So it has to be the mean, the standard deviation or the mode or something like that. A statistic is a measure as well, the mean or the standard deviation and so on, but coming from a sample. And interaction is a construct; it is not a measure, right? So you could also have used the process of eliminating those two.

Question 11. The term population refers to: number one, the entire group about which the data is to be collected. Number two, a subset of cases selected to represent the entire group. Number three, the entire set of variables which will be considered in the research. Number one. Number one. The entire group. Are we good? Are we all happy?
Number one. Yes. Yes, it is the entire group. A population is the entire group. Number two talks about a subset of cases. That is a sample. And number three talks about the entire set of variables. I don't know what that would be defining, but those will be the characteristics that define the population or the sample items that you have.

Question 12. A measurement that summarizes an aspect of a population is called a parameter, while a measurement that describes the same aspect of a sample is called a statistic. That's how we do it. That's how we do it. And it's number three. When it talks about a population, it's a parameter. Samples, it's a statistic. So you already gave the answer. So it's option three. Sorry, sorry. So we all agree with the statement that she just made? Yes. Question 12: the same aspect of a sample is called a statistic, which is what we have been defining so far.

Now we're moving into calculations. Question 13. A jar contains five red, eight blue, three green and four yellow marbles. What is the probability that a blindfolded person would choose a green marble purely by chance? So here you will have to go and find your calculator and calculate the probability of choosing a green marble. Five plus eight plus three plus four equals 20. Okay. And three divided by 20 is 0.15. Number one. Yes, your probability of choosing a green marble is given by the number of outcomes satisfying the event, the green marbles, divided by the grand total. How many green marbles? There were three out of 20, and that will give you 0.15. Are we good before we move on? So this means you are good.

Question 14. A list of all outcomes that are possible in a specific statistical experiment is called the ... of that experiment. Is it one, probability distribution? Two, is it frequency distribution? Three, is it the sample space? Number three. Because it talks about a list of all outcomes. That will be your sample space.
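The marble calculation from question 13 can be checked with a few lines of Python. This is only a sketch: the dictionary layout and the function name `probability_of` are made up for illustration, and the counts are the ones read out in the question.

```python
from fractions import Fraction

# Marble counts as read out in question 13.
marbles = {"red": 5, "blue": 8, "green": 3, "yellow": 4}

def probability_of(colour, jar):
    """Favourable outcomes divided by the total number of outcomes."""
    total = sum(jar.values())          # 5 + 8 + 3 + 4 = 20
    return Fraction(jar[colour], total)

p_green = probability_of("green", marbles)
print(p_green, float(p_green))  # 3/20 0.15
```

Using `Fraction` keeps the exact value 3/20 visible next to the decimal 0.15 that appears in the answer options.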
A sample space is a collection of all the outcomes in an experiment.

Question 15. As the sample size increases, the distribution of the variable is more likely to resemble the theoretical distribution of the variable as you expect to find it in the population. And this is due to: number one, the law of large numbers. Number two, the central limit theorem. Number three, the effect size. It's number two, the central limit theorem. What is the central limit theorem? Do we all agree on that? That it's number two?

I think it's number one, the law of large numbers. Because the central limit theorem says that you need to have a larger sample for your sample means to be approximately normally distributed. And when we talk about the law of large numbers, it says that if an experiment is done repeatedly and the outcomes are independent of one another, then as the number of times the experiment is repeated increases, the empirical probability will approach the theoretical probability. So here we are talking about the sample size, which is increasing. Yeah. So it can be the law of large numbers.

Okay, read that statement on the law of large numbers. Here in the notes that I made, I just wrote that as the number of times the experiment is repeated increases, the empirical probability will approach the theoretical probability. Yeah. And also... But then isn't what you are reading the same as what they are saying there? Because as the sample size increases... Okay. It will more likely resemble the theoretical distribution of the variable. Right? That's what it says. That's this definition. As the sample size increases, the distribution of that variable will be more likely to resemble the theoretical distribution of that variable. Hence I'm asking you to read that law of large numbers. What you are reading is the same, unless it's me. Yeah, it's number one. Number one is correct. It's the law of large numbers. Because as the sample size grows, the distribution will come to resemble the theoretical one, which may well be a normal distribution.
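The law of large numbers statement read out above (the empirical probability approaches the theoretical probability as the experiment is repeated) can be illustrated with a small simulation. This is a sketch under assumptions that are not in the exam paper: the experiment is rolling a fair die and counting sixes, and the function name, seed and sample sizes are arbitrary choices.

```python
import random

def proportion_of_sixes(n_rolls, rng):
    """Empirical probability of a six in n_rolls independent rolls of a fair die."""
    sixes = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == 6)
    return sixes / n_rolls

rng = random.Random(42)   # fixed seed so the run is repeatable
theoretical = 1 / 6       # theoretical probability of a six

# Hypothetical sample sizes, chosen only to show the trend.
for n in (10, 100, 10_000, 100_000):
    observed = proportion_of_sixes(n, rng)
    print(f"n={n:>7}  observed={observed:.4f}  theoretical={theoretical:.4f}")
```

With small n the observed proportion can be far from 1/6; with large n it tends to sit close to it, which is the behaviour the question describes.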
In your module, do you guys do the... So this is where it becomes a little bit tricky as well. Do you also do the central limit theorem? Yes, we do, ma'am. Because with the central limit theorem, it means you will have multiple samples, not just one sample, but multiple samples. And when you look at the means of those samples, they will approximate a normal distribution. And here we're not even talking about the... The question is not asking about the normal distribution. It's talking about the theoretical distribution. And what is the theoretical distribution in this instance? First one. These are the kind of questions that are very tricky for me.

Law of large numbers. Let's go and find it. Law. And this is something that is about your experiment: if an experiment is done repeatedly, and if the outcomes are independent of one another, the observed proportion of favorable occurrences of an event will eventually approach the theoretical probability.

Let's go and find the central limit theorem. Central limit theorem. I have it here. I like it. And I think I remember that even last year we had a long discussion about this. If a simple random sample of size n is selected from a population with a given mean and standard deviation, the distribution of all the means obtained from all possible samples is approximately normal, and its standard deviation will be given by the standard error. And there is nothing about the theoretical distribution. The central limit theorem gives a precise description of the distribution that you would obtain if you selected every possible sample, calculated every sample mean and constructed the distribution of those means. Nothing here talks about the distribution converging. The theorem gives the sampling distribution of the sample mean for any population, irrespective of the shape, the mean or the standard deviation of the original population. So nope, it's not the central limit theorem. The distribution becomes normal, more normal.
As the sample increases, so the larger the sample... nope, it is not. So the answer here is number one, which is the law of large numbers. And that is based on what is given to you in the text. Let's go back there. There, it's very clear. You can even make a note of it in your study guide; it's on page 38. Because it could also easily have looked like the effect size as well. The law of large numbers and effect size are almost related to one another, because your effect size is linked to the power of the statistical test that you will have. But it is not the effect size; it is the law of large numbers.

Okay, question, just give me a second. Question 16. A test for short-term memory capacity is normally distributed with a mean of 100 and a standard deviation, sigma, of 10. What is the probability that any person chosen at random will have a score of X equals 125 or more on this test? So here they're expecting you to do some calculations. What is the probability of X being more than or equal to 125? Therefore, we need to go and use our formula for Z, which is X minus the mean, divided by the standard deviation. Because they gave you the mean and the standard deviation, just substitute and then calculate. Your X is what is given in the question, so X is 125, minus your mean, which is 100, divided by your standard deviation of 10. On the calculator you will say 125 minus 100, equals, divide by 10, equals. What is the answer? The answer is? 2.5. 2.5, yes.

So, since we now want the probability that Z is greater than or equal to 2.5, we need to go to the table. Are we going to the bigger side or the smaller side? That's the one thing that you always have to remember. The smaller side. I'd say it's greater; it's "greater than". Greater than, so that means you have to go to the bigger side, because it says "or more" on this test. It says 125 or more on this test and the... Yes. So it's the bigger side. And it's positive, right? So just remember this.
On your normal distribution, it's a bell-shaped curve. In the middle is your zero. Where is your 2.5? Your 2.5 will be on this side, and because the sign is "greater than", you're going to shade the greater side, the bigger side, which is the positive side. If it was "less than", it would be this side: we would be shading the less-than side, which is everything on that side. These are the less than or equal. These are the greater than or equal. Greater means bigger, right? So every time you answer questions like this, draw yourself this picture. It will guide you and help you not to get lost.

Always calculate your Z first. I didn't make it explicit that this is the Z formula, so: we're calculating a Z here using that formula. So you calculate your Z, and this is your Z. We're standardizing 125, and this is the standardized value. When you standardize, your mean is zero, because any normally distributed value, once standardized, follows a distribution with a mean of zero and a standard deviation of one. So the mean in the middle of our bell shape is zero. Then we look at where our 2.5 is. It's positive, so it will be on the positive side. You can choose where you want to put it. I could also draw it not from here, but this way. I could say that here is my 125, something like that. It's up to you where you want to put it. It doesn't have to be big or small; you just need to decide where you draw the line to show where your 2.5 is. And then look at the sign. The sign will tell you which area to shade. The sign is "greater than", so it means I'm going to shade everything which is bigger than, not zero, bigger than 2.5. If it was "less than", we would be shading the bigger area. So this area that we shaded is a small area. So we're going to go to the smaller side on the table.
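The two steps described for question 16 (standardize, then read the shaded tail) can be sketched in Python. The function names are illustrative assumptions; instead of a printed z-table, the tail area is computed with the complementary error function from the standard library, which should agree with the table's smaller-portion column to four decimals.

```python
import math

def z_score(x, mean, sd):
    """Standardize x: subtract the mean, divide by the standard deviation."""
    return (x - mean) / sd

def upper_tail(z):
    """P(Z >= z) for the standard normal distribution (area shaded to the right of z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

z = z_score(125, 100, 10)   # (125 - 100) / 10
p = upper_tail(z)
print(z, round(p, 4))       # 2.5 0.0062
```

For a "less than" question with the same z, the shaded area would be `1 - upper_tail(z)`, which is the larger portion.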
I'm going to scroll through this question paper, and hopefully there is a table here. Yes, there is. Remember we're looking for 2.5, and this table has two decimals, so it will be 2.50. So let's go and find 2.5. It's 0.0062, but let's get there first. There is 2.5, and we're looking for the smaller portion, which is the first column, which is 0.0062. Go back to the question. I can't believe that you have 70 questions and it's out of 70, so it's one mark for all this hard work. Option three. Okay, moving on to question 17. Do you understand how I got to the answer? Is it clear? Does it make sense to you? Did I make it easier for you to understand? Yes, all good for me, thank you. Okay, question 17.

Yes, but I don't know, Lizzie, can you go back there? What I want to understand is that the shaded part is the one that we calculate. It's the one that we take: we check either the larger portion or the smaller portion. On this one, because we shaded the smaller part, it means that we look at the smaller portion. But if we had shaded the large part, it means we would look on the larger portion side for the answer. Yes. Okay, now I'm sorted, thanks.

So let me give you another example based on the same information. Let's swap the values around and say the mean is 125 and X is 100. Now, instead of getting positive 2.5, because X is 100, it will be 100 minus 125, divided by 10. We would get an answer of minus, what do we get? Minus 2.5, right? So how will you get the value? Still the same principle. You will draw yourself a plot like that, put your zero in the middle, and go and find where minus 2.5 is. Minus 2.5 is here.
And then you're going to say: minus 2.5 is on this side of zero, because this side is negative and that side is positive. So if minus 2.5 is there, you go and look at the sign. The sign hasn't changed, because I didn't change the part that says "or more": it's equal to or more than. So it's still going to be "greater than". And we know that when it's "greater than", we need to shade the right-hand side. So it means I'm going to draw the line here and shade this side. Can you see now, when I go to the table, on which side I am going to find my probability? Can you see? The larger portion. I'm going to go to the larger portion. So drawing a diagram will help you visualize and make things clearer. Just looking at the value you got for Z and the sign you are using will tell you whether you need to go to the larger side or the smaller side. That's all I was trying to explain with this.

Okay, so let's move to question 17. Suppose that over the years, 10,000 students wrote the examination in Psych 3704 and 6,000 of them passed, of which 300 obtained exactly 50%. This means that for a randomly selected student, the probability of obtaining 50% is ..., while the probability of obtaining more than 50% is ... So you also need to calculate the probabilities. There are 10,000 students who wrote the exam. Of those 10,000, 6,000 passed. What is a pass mark? The pass mark is 50%, right? When you write an exam, to know that you have passed, you need to obtain the 50% pass mark. So we know that of those 10,000, 6,000 passed, of which 300 obtained exactly 60, sorry, 50%. Now, among the 6,000 who passed, there will be those who got 50, 60, 70, 80, 90 and those who got 100%. So these are those who got 50 and above, and 300 passed with exactly 50. So if we need to find the probability that a student passed with exactly 50%, what will that probability be? What is the probability of exactly?
This is just normal probability: exactly 50%. 300 divided by 10,000. Divide by 10,000 and that is how much? What is the proportion? 0.03, that will be 0.03. And then we need to find the probability of those who got more than 50%. Probability of more than? It's those with 50% and more. It's going to be 6,000 divided by... 6,000 divided by 10,000. 10,000, which is 0.6. Which then makes it option three. Yes, the answer is number three.

Question 18. Two six-sided dice are thrown together. What is the probability that both will fall showing a six? The options below are rounded off to three decimal places. I think it will be number two, because they are saying two six-sided dice are thrown together, so it's two. Then it means that it's going to be one over six times one over six, which is going to give us one times one, which is one, over six times six, which is going to give us 36. I don't know if I'm mistaken. And then it means one divided by 36 is equal to... One divided by 36, which is going to give us 0.0277, and when we round it off, it's going to be 0.028. Yes, that is 100% correct. Because the two dice are independent of one another. Yes. When they are independent of one another, you use the multiplication rule: the probability for the first die times the probability for the second, which is the same as one times one divided by 36, which is 0.028.

Question 19, oh, question 19. The standard normal distribution is also referred to as the ... If you can remember, we did that just now. Is it one, the X distribution; two, the Z distribution; three, the frequency distribution? Number two, the Z distribution. Yes, the standard normal distribution is also known as the Z distribution. Yes.

Question 20. A variable is normally distributed with a mean of 50 and a standard deviation of 10.
If this variable is transformed to a standardized normal distribution, what would the values of the mean and the standard deviation of this distribution be? And I just said it when I was explaining something. Is it one, a mean of zero and a standard deviation of one? Two, a mean of one and a standard deviation of zero? Or three, a mean of 50 and a standard deviation of 10? Number one. Yes. It will be number one, because any value that is normally distributed, once it is standardized, will be distributed with a mean of zero and a standard deviation of one.

Question 21. Suppose the height of military recruits is distributed normally with a mean of 1,750 millimeters and a standard deviation of 50 millimeters. Drawing repeated samples of size 25 recruits each, we expect the standard deviation of the sample means to be about ... Now, you need to go to sampling distributions to understand this, because it talks about the sample means. Therefore, it means it's a sampling distribution. Therefore, the standard error, which is also known as the standard deviation of the sample means, is given by the population standard deviation divided by the square root of n. So we just need to go and substitute. Our standard deviation is our sigma, and our n is our number of recruits. So we just substitute into this formula. The standard deviation is 50, divided by the square root of 25. And for those who don't know what the square root of 25 is, it is five. So you can also say 50 divided by five, or you can use your calculator. On your calculator, there is a square root function. The answer here is two. It's number two, it's 10. It's 10, which is number two, because when you say two, I'm thinking you're saying the answer is two. It's not; it's option number two, which is 10. Okay, let me also write it here. So next time, if they ask about the standard deviation of the sample means, you know that this is the same as the standard error.
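The standard error calculation from question 21 is short enough to verify directly. A minimal sketch; the function name `standard_error` is an assumption for illustration, and the numbers are the ones given in the question.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample means: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

# Population standard deviation 50 mm, samples of 25 recruits each.
se = standard_error(50, 25)
print(se)  # 10.0
```

Because the sample size sits under a square root, quadrupling n only halves the standard error.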
Question 22. By convention, the total area under the standard normal curve is said to be equal to? Okay. What they are asking you: the total area is your probability. They are asking you, what is this area underneath the curve? If I divide this curve into two halves, this side has 0.5 and this side has 0.5. So what will be the total area underneath the curve? It's number one. Yes, it will be one.
Question 23. What is the principal advantage of transforming measurements to z-scores? They enable one: one, to determine whether the scores are normally distributed around the mean; two, to compare a person's scores on tests with different means and standard deviations; three, to determine the frequency distribution of the test. Why are we transforming the scores? Number two. We transform measurements to see the difference between two things, like the means and the standard deviations, in this manner. We transform, maybe I'm checking the wrong thing. I must just go to z-score. Do you have z-scores as a topic in your module? That one is a very tricky one. Why is this thing on edit mode? How many z-scores are there? There we go. There is our answer. A z-score is the original measurement transformed into a point on the standard normal distribution. And that doesn't answer the question of why we transform. Z transformation: any variable x that comes from a normal distribution can be transformed to its representation on the standard normal distribution, provided that we know the mean and the standard deviation of those scores, and then that is the formula. But it doesn't tell us why we are transforming them. In the main, transforming a score from a normal distribution to its associated z-score has additional benefits.
Transforming a set of measurements, each with a different mean and a different standard deviation, into z-scores can be used to compare an individual's scores across different distributions. After transformation, all the scores will fall on a common standard normal distribution with a mean of zero and a standard deviation of one, which makes it possible to compare them directly. So, number one, this enables us to determine whether the scores are normally distributed around the mean; or two, to compare a person's scores on tests with different means and standard deviations; three, to determine the frequency distribution of the test. Number three is out. It's so that you can compare the scores. Transforming a set of measurements, each with a different mean and a different standard deviation, to compare an individual across different distributions. But you see, that doesn't talk about comparing them across different distributions. A z-score is the original measurement transformed into a point on the standard normal distribution. All the characteristics of a normal distribution apply. The size of a z-score is, can we also check the meaning of z-score, maybe, if we can check what this means. The formula for the transformation is z equals X minus mu, divided by sigma, where X represents the variable, mu the mean of the population, and sigma the standard deviation of the population from which X is obtained. But this whole paragraph talks about the transformation. Any X value that comes from a normal distribution can be transformed to its representation on the standard normal distribution, provided that we know the mean and the standard deviation of those variables. They enable one to determine whether the scores are normally distributed, or to compare scores with different means. You see, that's the other problem that we have here. The z-score is the original measurement transformed into a point on the standard normal distribution. Therefore, all the characteristics of a standard normal distribution apply.
For example, the size of the z-score always reflects the number of standard deviations that a particular score lies above or below the mean. I'm leaning towards number one. I'm going to explain now why I'm not leaning towards number two. On number two, if you read the guide, it says "additional": transforming a set of measurements, each with a different mean and a different standard deviation, into z-scores can be used to compare an individual's scores across distributions, not tests with different means and so on. And I'm leaning towards number one as the correct one because it relates directly to the statement. It says to enable one to determine whether the scores are normally distributed around the mean. And that is what the guide says: the size of the z-score always reflects the number of standard deviations that a particular score lies above or below the mean, so around the mean. Number two talks about comparing different persons' scores. And the guide's statement doesn't talk about different scores, it talks about the same scores but across different distributions. So the answer is number one.
Question 24. We can determine the extent to which any sample mean approximates the mean of the population from which it was drawn due to the: one, Z distribution; two, central limit theorem; three, statistical inference. Number two. It is number two. And we read about this.
Question 25. Study the histogram below of the exam marks of a group of students in the same class. Note that the values on the horizontal axis are the class category limits, and they give you the histogram. If I'm looking at this, this is 10. This is 20. These are the frequencies that I'm reading off the lines. I'm just writing each of them. So this is 20. This is 40. And this is 10. And probably this is also 20. So all in all, they are about 80, 90, 100. So there were 100 students in this class, okay. Assume we use the histogram as a basis for making probability predictions.
What is the probability that a student's score will be between 40 and 60? 40 and 60. How many? We're going to use 30. The reason I'm saying that is that we check between 40 and 60, which is going to be 10 plus 20. That gives us 30, and we divide it by 100, because now we are checking between 40% and 60%. And then when you check our graph here, for 40 to 50, it's 10. And when you check 50 to 60, it says 20. I don't know if I'm making sense. You are. I've already written it down. And then it will be 30 divided by 100, which is going to give us 0.3, which is 0.30. Which is option three. And that's how you answer it. So when you get questions like this, you just look at the frequency and look at where the bar ends, because that corresponds to the number. So this means those who scored between 10 and 20, they were 10. 20 and 30, they were 20. 30 and 40, they were 40. 40 and 50, they were 10. 50 and 60, they were 20. And so on. Okay. We have 25 minutes and we are on question 26. And there is a big paragraph coming our way.
Use the scenario below to answer questions 26 to 31. A researcher suspects that the addition of certain food supplements to the diet of elderly people will reduce the decline in cognitive functioning that comes about because of aging. She decides to test this using a neuropsychological test that measures the speed with which objects are identified. The test is standardized in such a way that a higher score implies a better rate of object recognition. It is known that the distribution of the scores on the test is approximately normal and that a mean of 80 and a standard deviation of 20 were found in the population of persons older than 65. To investigate her hypothesis, she obtains a random sample of N equal to 100 persons older than 65. Each member of this sample is given a daily dose of supplements over a period of six months.
At the end of this time, each person is tested on the NPS test, and a mean, X bar, of 86 is found for this sample. The researcher plans to test the hypothesis at an alpha, or level of significance, of 0.05. So we need to answer questions 26 to 31 using the same statement.
So let's start with 26. The appropriate research hypothesis suggested by the scenario above is as follows. One, cognitive functioning declines with age. Two, the cognitive functioning of elderly persons is related to their perceptual speed. Three, the rate of object recognition will be better for elderly persons who take the dietary supplement than for those who do not. Which one will be the appropriate hypothesis? Number three. I also agree with number three. Yes, I also agree with number three, because in the statement they say that the test is standardized in such a way that a higher score implies a better rate of object recognition. But also remember that there is a statement at the beginning that the supplements will reduce the decline in cognitive functioning that comes about because of aging. So number three will be appropriate for this test, because then we can set up a proper hypothesis on this one.
So based on the information that you just selected as our research hypothesis, how would we state the alternative in this instance? Number one. Number one, which, wait, let's go back there because the signs here are not right, right? So number two actually is wrong because it uses the sample size. So number two can't be correct. So we'll end up with only two of them. And because we talk about "better", and better or higher means greater than, therefore number one will be the correct one. Yes. Okay, the mean of the sampling distribution of this will be, let's go back. I'm going to give you an example.
So with a sampling distribution, the mean of the sampling distribution is the same as your population mean. So now they're asking, what is the mean of the sampling distribution? Go back to the question, what is the mean? 86. Nope. What is the mean of the sampling distribution? I just gave you a hint. I'll get my formula. The mean of the sampling distribution is the same as your population mean. What is the population mean? It's 80. 80. It's 80. The answer is 80, which is option two.
What is the standard error? Remember, the standard error is your population standard deviation divided by the square root of n. 50 minutes. So let's go find sigma. Let's go find n and then substitute into the formula. It will be 20 divided by the square root of 100, and then do the calculation. If you don't know how to use your calculator to calculate the square root, please let us know so that I can show you on my laptop how to use a calculator. Don't be shy, because you will be losing one mark if you don't know how to answer this question. And what I've noticed with most of your past exam papers is that they look almost exactly the same. Do we have an answer? No, we don't know how to find the square root of 100. 20 over 10, which equals two. The square root of 100 is 10. So this will be 20 divided by 10, which is equal to two, option two. Okay.
Now, with the information as given in this scenario, what would be the most appropriate statistical test to evaluate the hypothesis? Is it one, a one-sample Z test; two, a two-sample T test; or three, a one-sample T test? Now, think about it, go back to what you said. If for your alternative you said it is option one, then anything that has two samples is out. So we eliminate that, because we said our alternative hypothesis was "greater than", right? That's what we said. Now, based on the statement that we have, is this a Z test or a T test?
It's a Z test, because sigma is known. Because our population standard deviation is known. Yes. Thank you. And that will be a one-sample Z test, based on that statement.
Now, the appropriate test statistic is calculated, and based on this number, a computer program is used to determine that the one-sided p-value is 0.022. What conclusion can be drawn from this? Now, remember that to make the decision, we can use different things. The decision rule using the p-value and the level of significance, alpha, says that if your p-value is less than the level of significance, we reject the null hypothesis. Remember that, that is the decision rule. If your p-value is small, the null must go. P is small, must go. If your p-value is less than the level of significance, we reject the null hypothesis. Now, going back, what was our level of significance? 0.05. 0.05, okay. So if our level of significance is 0.05 and our p-value is 0.022, what is the sign? It's less, right? Because our p-value is less than our level of significance, we're going to reject the null hypothesis. So if we're rejecting the null hypothesis, then what do we say? It means number three is out, because the decision is to reject the null hypothesis. So the null hypothesis can be rejected, which implies that the supplement, now you need to go back to your null hypothesis statement or your alternative statement. So let's go back to our statement. We said it is better, right? So better means more than, improved, right? So if we're rejecting the null in favour of that "better" statement, then what do we say? Number one. Yes, it's number one. Because the null hypothesis can be rejected, which implies that the supplement improves cognitive functioning. Okay.
Question 32. A type one error occurs when: one, the null hypothesis is rejected when it should not be rejected; two, the null hypothesis is rejected when it should be rejected; three, the alternative hypothesis is not accepted.
One, when the null hypothesis is rejected when it should not be rejected, that is a type one error. Yes, I agree. Number two is not an error at all; rejecting the null hypothesis when it should be rejected is the correct decision, and a type two error is failing to reject a null hypothesis that is false. I was trying to blow my nose. Yes. So a type one error occurs when we reject a null hypothesis that is true. And that is number one.
Question 33. A statistical hypothesis is a formal statement about, and we did this. When we were stating the null hypothesis, right? I made a comment about it, right? So pay close attention to things that were said before. A statistical hypothesis is a formal statement about: one, population parameters; two, sample statistics; three, care statistics. One, two, or three? Number one. It's always about the population parameters. Population parameters are like your mean, your standard deviation, and your proportion. Things like that. Those are your population parameters. For population parameters, we use Greek letters. Sample statistics are also like your mean, your standard deviation, and your proportion, but for those we use Roman letters. And sometimes they do ask you this kind of question, I think. You must always remember that.
Question 34. A sampling distribution of a statistic, for example of the sample mean, can be calculated if we assume that the (blank) hypothesis is true, but not if we assume that the (blank) hypothesis is true. It's number two: if we assume that the null hypothesis is true, but not if we assume that the alternative hypothesis is true. The sampling distribution. Why do they like to trick you guys like this? But I agree with what you just said, because that's what I would also think. And let's go find out from your DS study guide. No, it's popping up "please complete the register". I don't want to complete the register. I want to go find something. Something weird happened right now. Is it in presentation mode?
How do I get out of this mode? Because now I can't even find something. I can't even find the answer we're looking for. There we go. This talks about type one errors and type two errors. We're back. All right, the sampling distribution of a statistic can be calculated if we assume that the null hypothesis is true, but not if we assume that the alternative is true.
Question 35. When a statistical test yields a large p-value, which of the following statements is most likely to be correct? Number one, the alternative hypothesis is probably true. Number two, the null hypothesis is probably false. Number three, the null hypothesis is probably true. Number three. I was going to go and explain, but you already gave the answer. It will be number three, because remember, if your p-value is large, you are not rejecting the null hypothesis. So it means the null hypothesis is probably true. If it was small, you would be rejecting the null hypothesis, and you would probably choose number two, because your null hypothesis would probably be false. Never refer to your alternative hypothesis statement when you're stating the decision. You always use your null hypothesis for stating decisions. We use the alternative when we're doing the calculation and working towards a decision, but when we are concluding, we refer back only to the null hypothesis.
Question 36. A hypothesis test with an alternative hypothesis that states that the mean is less than 50 is a (blank) hypothesis test and requires a (blank) statistical test. When your alternative is like this, this is a... Directional. A directional hypothesis, and the test is... One-tailed. A one-tailed statistical test, which will be number three.
Question 37. When applying a Z test to compare a sample mean to a known population mean, I can't believe that my voice also managed to stay like this. The p-value represents the probability of:
When applying a Z test to compare a sample mean to a known population mean, the p-value represents the probability of: one, rejecting the null hypothesis if it is false; two, obtaining the mean found in the sample data under the alternative hypothesis; three, obtaining the mean found in the sample data under the null hypothesis. Which one? I think number three is the right one. Obtaining the mean found in the sample data under the null hypothesis. I don't actually know what they're asking in this question. But number one is out. Number one is not right. The p-value, let's see. That one, I don't have an answer for you. If you are able to see something, I can't even use this thing. So I broke the thing. I can't even search. I tried to find the p-value and it doesn't find the p-value. That one, I don't have an answer for you. I can't even search now. Let's see. Obtaining the mean found in the sample data under the alternative. We use the alternative sign for decisions. But the null hypothesis also has the mean. But is it under the null hypothesis or is it under the alternative? Are you able to find the right answer? Are we all happy with number three? The p-value represents the probability of obtaining results at least as extreme as the observed results of a statistical hypothesis test, assuming that the null hypothesis is correct. So let's assume that it will be the same here: obtaining the mean found in the sample data under the null hypothesis, which is option three.
Question 38, and we are five minutes away from knock-off time. When two population means are compared, the p-value is calculated to represent the probability of observing a specific difference between the sample means given that? And we just did this. One, H-nought, which is the null hypothesis, is true; two, your alternative is true; three, your null hypothesis is false.
When two population means are compared, the p-value is calculated to represent the probability of observing a specific difference between the sample means given that? Number one. Given that the null hypothesis is true, or correct, which is number one. I just explained that now.
Question 39. The lower we set the level of significance, the greater the probability of: one, rejecting the null hypothesis; two, a type two error; three, a type one error. Are you still there? The lower we set the level of significance, the greater the probability of what? Here's the answer to that. It's number two, a type two error. Because on page 84 of our study guide, it talks about the level of significance being chosen. It's a long passage, but what I can say is that it says this kind of mistake, rejecting the null hypothesis when it is in fact true, is referred to as a type one error. But at the top there, while I'm busy getting to it, it talks about comparing the p-value with the level of significance that we chose before we ran the test, and it also talks about the p-value being smaller than the level of significance. So it's a type two. It can't be type one. Okay. Oh, again, on page 85 it says: what if the p-value is not smaller than the level of significance and we decide not to reject the null hypothesis? Now we run the risk of not rejecting H0 when in fact H0 is false and H1 is true. This is referred to as a type two error. Yeah. So a lower level of significance would suggest that you run the risk of rejecting your null hypothesis. The higher your level of significance, the more you run the risk of a type two error. So looking at your question, the lower we set the level of significance, the greater the probability of which one? Is it rejecting the null hypothesis, a type two error, or a type one error? Those are the three options that we have.
I think it will be number one, because a lower level of significance would indicate a stronger chance that you will be rejecting your null hypothesis. Think of it this way: if your p-value is smaller than your level of significance, you reject your null hypothesis, right? That is the other way that you can get it. But here you have a lower level of significance. So it means you will need an even lower p-value in order to be able to reject the null hypothesis. Do you agree? Do you agree, or am I talking Greek? I just don't know. The lower your significance level, the lower the power of the test would be, right? If you reduce your significance level, then your region of acceptance gets bigger as a result. So you are less likely to reject your null hypothesis when your level of significance is lower. So then it can't be number one? Then it can't be number one. So it's between a type one error and a type two error. If we set the level of significance lower, we are less likely to reject the null hypothesis even when it is false, so we are more likely to make a type two error. So then the answer will be a type two error. Because if you are less likely to reject the null hypothesis even though it is false, then you are committing what we call a type two error. So the answer here is type two. I agree.
Okay, so that ends our two-hour discussion, which means our journey for today ends right here until I see you again next week, when we continue from question 40 to 70. If you have any questions, any comments, or anything you want assistance with, my email address is eboyem. at unisa.ac.za That's my email address. You can send me an email and copy CT and chat at unisa.ac.za.
Otherwise, there is a WhatsApp group that you can always use to get hold of me and have a discussion with fellow students. Are there any questions or comments? Thank you very much for your time and your help. No problem. Thank you so very much. Thank you. Have a lovely evening and see you next week, same time, same place, when we continue. Bye. Thank you, Lizzie.