10 minutes in. I don't even know if you know me, so I'm going to do the wise thing and start by introducing myself. My name is Elizabeth Boy, and I'm going to be your facilitator for all the sessions. Just to give you a little bit of history about me: for 13 years or more I worked for UNISA as an e-tutor, and as a skills and quantitative literacy facilitator for Cape Town. Then 2023 came and my contract was not renewed, which is sad for me, but good for me in another way. It's not a problem, because what I do is help people succeed, especially students. I give you the support that I didn't have when I was studying. So I am still going to continue assisting students who want support in understanding statistics, even though these things can sometimes be very challenging. I also work full-time for the University of the Western Cape. I don't lecture there. I work in institutional planning, where I am responsible for the business intelligence and analytics of the University. So what I'm going to do with the sessions going forward is nothing new, and it will not change from how I used to offer the sessions when I was still working for UNISA. I will follow the same pattern: do a concept recap and then do some activities. At the end of the session, we can discuss how we're going to share the notes and how you're going to access the recording as well. The purpose of today's session is to start with the introduction to statistics, especially for students who are doing Psych 3704, which is psychological research. You don't need to be a statistician to understand your module. All you need is to make sure that you understand the basic things, which we will take care of today.
Everything that you learn today will form a building block that you always need to remember in order to get through your entire module. I'm also going to assist you with every section of the module that is statistics related, not research related, but mostly in the statistical space. That's what I'm going to be doing with you throughout the sessions that we have. We have already pre-booked some of these sessions, so you just need to register for them and attend. These sessions are not going to be repeated, so make sure that you are part of the online sessions, because these are the only free sessions that you will get from me. Later on, I will tell you how you can access some of the other benefits. Okay, so the structure of today's lesson: like I said, I'm here to explain the basic concepts of statistics as well as hypothesis testing. I'm not going to go into detailed explanations. It's just to scratch the surface, to make you aware of certain concepts that you always need to keep in mind. We're going to cover those concepts, and then we're going to do some activities, and I hope we can engage with one another when we do them. So with research, or with statistics, there is somewhere we start. Everything that you do, you do because you want to make decisions. Usually, those decisions need to be backed up by scientific proof or facts. All right? That's where everything starts. So with research as well, you need to follow a scientific process in order to reach that decision.
When you want to reach that decision, or you are a researcher and you want to provide some input to whoever employed you to run the research, or to your executive or your line managers, you need to start from some original assumptions, which we mostly call theories. Before those assumptions, there needs to be a research question, because someone wants to find out why certain phenomena are happening. So you need to make sure that you understand exactly what they want you to research or find out. Then you need to create some assumptions that you can test. Sometimes you just want to test a relationship, or you want to test whether there are differences. Those assumptions are what we call the hypothesis. So you need to have a hypothesis statement that you want to prove, and in the end you are either going to reject or accept the assumption that you are making. But before you can prove those assumptions, you need to go and collect the data. So you also need to understand the kinds of data that you're going to be collecting. That is mostly what statistics is about, because once you have collected those data points, or we can call them the variables that you're going to be using, from a population or from a sample, you need to analyze them. You need to summarize that information. You need to put it in tables and charts. You also need to analyze it with mathematical calculations, like adding up and subtracting, creating new variables, and calculating the average or the mean, the mode, which is the most frequent number that appears, the median and so on. That is part of the data analysis. So as you can see, this is a scientific process that you need to follow in order to make those decisions.
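To make the summary measures just mentioned concrete, here is a minimal sketch in Python. The test scores are made up purely for illustration, and the variable names are my own; the three functions come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical test scores for ten students (made-up data, for illustration only).
scores = [55, 62, 62, 70, 48, 62, 75, 70, 58, 66]

mean_score = statistics.mean(scores)      # the average: sum of scores / number of scores
median_score = statistics.median(scores)  # the middle value once the scores are sorted
mode_score = statistics.mode(scores)      # the most frequently occurring score

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
```

With ten scores the median is the average of the two middle values after sorting, which is why it can land between two observed scores when those two differ.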
But it doesn't end there. The data analysis part also has another component, where you analyze the data by finding the relationship, which is the correlation, or where you use regression, the t-test or the z-score to predict or test other phenomena. If, while you are analyzing, part of your data analysis does not agree with the assumptions, you always have to go back to your assumption and validate it. If it doesn't agree, you can change your assumption, or go back and collect the data that will support the assumption that you want to prove. Or you can come up with a new assumption, because the data does not support the first one that you came up with. So it's an iterative process as well. So why am I telling you all this? It's because you need to understand what you do as part of your research process. Remember, at the moment you are studying and you are panicking and you're thinking, this is difficult, I don't understand some of these concepts. We do not want to turn you into a statistician. If, for example, you don't feel comfortable doing the calculations yourself once you are employed, you can get a statistician to help you do the analysis. However, you yourself need to be aware of certain things, and these are the basic things that you need to understand. Then, when people give you the outputs and the reports, you are able to interrogate that report and say, hold on, I've studied this in my Psych 3704, this is what I know, and what you are telling me does not support it. That way you can question all this and not accept everything that other people give you to read.
So my role, just to give you another perspective on why I'm doing this: I want to make sure that I equip as many people in South Africa as possible with statistical skills, so that they are able to use that information in their day-to-day environment or their work environment. Hence, I'm telling you all this. Right. Let's bring it back to your module. In terms of your module, there are key concepts that you always need to remember, like a construct. A construct is a concept that is an explanation for a situation, a phenomenon, an event or a behavior. For example, when I am stressed, that is a construct, because nobody else can see that I am stressed unless they test that or evaluate that. Right. So it is a construct. It's something that you cannot observe. We already touched on certain concepts like theories. Like I told you, in terms of the scientific process, there will be theories that develop. Right. Theories are just a frame of reference for facts that attempts to account for why things are happening the way they are happening. Mostly, a theory is a claim about how constructs are related to a phenomenon, which has been validated by previous research. So theories come from other people's research: they come up with a theory and say, when I was doing my research, this is what I observed, and this is the theory on which we can base what is happening. Then you use those theories and you create assumptions to test whether the theories are correct, based on the research that you are trying to do. Take the construct of stress that we spoke about. You cannot just look at a person and say you are stressed. You have to create an instrument that measures that. When you do that, you are operationalizing the construct. Right.
Operationalizing is the process, when you are carrying out a research project, by which you make a construct measurable. This can happen in two ways. It can happen theoretically, which is by looking at terms that are well defined. Right. Those are just the constructs that already exist. Or it can be operational, which is an observable instance, something you are able to observe. This is what the researcher must do to measure the construct: create an instrument and create measures that will tell you that this person has a high stress level, a low stress level or a moderate stress level. That is operationalizing. Measurement is the process of allocating numbers to quantify constructs. Right. Like I said, in order for us to quantify your stress level, we will assign a number, maybe zero to one, or 10 to 20, based on the scale that we are using. That way, we are creating what we call a measurement. From doing all this, you also create what we call a variable. A variable is a characteristic that defines the item or the subject of study that you are looking at. Let's use another example. A variable can be something like gender. Gender is a variable, and it can be observed, because I can look at a person and say this person is a male or a female. Sometimes you get it right. Or a variable can be something like height. You cannot look at a person and say they are 1.4 meters tall. You have to take a measuring tape and measure them. So a variable can either be observed or it can be measured.
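As a rough illustration of operationalizing a construct like stress, the sketch below turns a questionnaire total into a low, moderate or high stress level. The 0 to 40 scale, the cut-off points and the function name `stress_level` are all assumptions made up for this example; a real instrument defines its own scale and cut-offs.

```python
def stress_level(score: int) -> str:
    """Map a questionnaire total to a stress category.

    Assumes a hypothetical 0-40 scale with illustrative cut-offs;
    these numbers do not come from any real instrument.
    """
    if not 0 <= score <= 40:
        raise ValueError("score must be between 0 and 40")
    if score <= 13:
        return "low"
    if score <= 26:
        return "moderate"
    return "high"


# Made-up totals for three students.
responses = {"Student A": 9, "Student B": 21, "Student C": 33}
levels = {name: stress_level(total) for name, total in responses.items()}
print(levels)
```

The construct (stress) stays invisible; what the instrument produces is a number, and that number is the value of a variable you can analyze.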
I am going to pause here and stop my video, and I'm going to ask if you have any questions so far before I move to data. If there are no questions, let's carry on and continue with the variable. There are two types of variable. It can be a manifest variable or it can be a latent variable. A manifest variable is one that is visible. We can see it, like the score of your test. When you write your exam, you get a score, and we can see it. A latent variable would be stress, because it is invisible. Nobody can see it unless we measure the stress levels and are then able to see that you are stressed. So anything that is invisible is latent. Everything that is visible is manifest. What is data? Data are the values associated with a variable. What do I mean by that? Gender is a variable. The value female is data. The value male is data. We usually use variable and data interchangeably, because a variable contains data values. These data values are meaningless unless the variables have been operationalized. Especially if you're going to create a new variable, the meaning of the values you have created must be accepted before you can use them as data values. In analyzing the data, there are two branches that you need to be aware of. There is descriptive analysis and there is what we call inferential analysis. Descriptive analysis is just the way we summarize the data in terms of tables and charts, and by calculating the mean, the median and so on, which are your measures of central location. Inferential analysis is where we do estimation, prediction and hypothesis testing, where we want to determine what is happening in the population from a sample.
I'm going to tell you a little bit about how we create a sample from a population. Those will be some of the definitions that we go through. When you do hypothesis testing, you need to define your population of study, which is the set of elements or individuals that you are interested in studying. For example, say I want to study the study behavior of students doing Psych 3704. I will create a questionnaire that asks them about their study behavior and who they are in terms of demographics. My population of study is all Psych 3704 students, and usually the population is too big, because it's all the elements I'm interested in studying. It is time consuming and it costs money, and I need to take those into consideration when I'm doing research. So because of the factors that might prevent me from collecting information from everyone, I can create what we call a sample, which is just a subset of the population. Now, say I'm going to calculate some measures, like the mean, the standard deviation or the proportion. If I'm calculating them from the population, we call those measures parameters. If I'm calculating them from a sample, we call those measures statistics. So what is the process of selecting a sample? There are different methods that you need to learn for creating your sample from your population. Remember, when we spoke about inferential statistics, we said we want to learn about the population from the sample. In order for us to infer the results we got from the sample back to the population, your sampling method needs to be correct.
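The difference between a parameter and a statistic can be sketched in a few lines of Python. The population here is simulated (1000 made-up marks), and the names `mu`, `sigma`, `x_bar` and `s` are my own labels for the usual symbols; nothing in this block comes from the study guide.

```python
import random
import statistics

random.seed(42)  # fixed seed so the example is repeatable

# Hypothetical population: marks for 1000 students (simulated, not real data).
population = [random.gauss(60, 10) for _ in range(1000)]

# Calculated from the whole population, so these are PARAMETERS.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population standard deviation

# Draw a sample of 50 without replacement.
sample = random.sample(population, 50)

# Calculated from the sample, so these are STATISTICS.
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

print("parameters:", round(mu, 2), round(sigma, 2))
print("statistics:", round(x_bar, 2), round(s, 2))
```

The sample values will usually be close to, but not exactly equal to, the population values; that gap is what inferential statistics has to account for.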
In your study guide, I think you are taught about probability sampling methods, and the reason you need to use those is that they support your inferential analysis when you want to infer your results back. If you use methods like convenience sampling, you cannot infer the results back. It will just be an analysis of the group of people that you have interviewed or analyzed. But if you use probability sampling methods, and there are several that you can choose from, then you can infer your results back to your population and say, for example, that you are 95% confident that the true average height of the population lies within a certain range, or something like that. So what are those sampling methods? We're going to discuss them later on. We spoke about the data and the variable. In order to do the analysis, you also need to understand the levels of measurement. There are four levels of measurement, two of them for categorical variables and two of them for numerical variables. Nominal measurements are those that you can put into categories. Ordinal measurements are those that you can also put into categories, but there is an order. Like on a scale of one to five, when you are rating a service, with one being poor and five being excellent, there is an order, because it goes from one up to five, from poor to excellent. So there is an order in how you rate that service. With nominal, there is no order. Male and female: there is no logical order in the values. Ratio and interval are both numerical or continuous variables. Now with interval, there is no true meaning of zero. So always remember that when you see an interval value, it's a value that can also go into the negative, for example temperature in degrees Celsius. With ratio, there is a true meaning of zero.
Therefore, when you're working with the ratio level of measurement, those will be data values that take a value of zero and above. They go only from zero into the positive, because zero has meaning. Zero means it doesn't exist. So you need to understand the levels of measurement. Now, the types of sampling method that I spoke about previously. We've got simple random sampling, which is a method where everybody you are interested in studying has an equal chance of being included when you create the sample. A good example of this is a raffle. With a raffle, you buy a ticket, the names go into a fishbowl, and someone puts their hand in and picks one name out of the hat or out of the fishbowl, and that one name has been selected. If there were 100 names in there, all 100 of them had an equal chance of being included in that sample. That is simple random sampling. Stratified random sampling is when you put your population into groups with mutually exclusive characteristics. What do we mean by mutually exclusive? It means one person cannot be in both groups. Let's say you want to group them by gender. You're going to put all females together and all males together, so there won't be anyone who is in the female group and the male group at the same time. You put them into those mutually exclusive groups, and within each group you apply simple random sampling, so that everybody in each group has an equal chance of being included in the sample. Cluster sampling is almost similar to stratified sampling, but with cluster sampling you put people into groups, usually groups that already exist, and you randomly select some of those groups to create your sample.
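Here is a small sketch of the first two methods, simple random sampling and stratified random sampling, using Python's standard `random` module. The population of 100 people, the 60/40 gender split and the helper names `simple_random_sample` and `stratified_sample` are all invented for the illustration.

```python
import random

random.seed(1)  # fixed seed so the example is repeatable

# Hypothetical population of 100 people, each belonging to exactly one group.
population = [
    {"id": i, "gender": "female" if i < 60 else "male"} for i in range(100)
]


def simple_random_sample(units, n):
    """Every unit has the same chance of being picked (like drawing raffle tickets)."""
    return random.sample(units, n)


def stratified_sample(units, key, n_per_stratum):
    """Split units into mutually exclusive groups, then draw a simple
    random sample of the same size inside each group."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    chosen = []
    for members in strata.values():
        chosen.extend(random.sample(members, n_per_stratum))
    return chosen


srs = simple_random_sample(population, 10)
strat = stratified_sample(population, "gender", 5)

print([p["id"] for p in srs])
print([(p["id"], p["gender"]) for p in strat])
```

Drawing the same number from each stratum is only one design choice; a proportional design would instead draw more people from the larger group.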
So these are some of the probability sampling methods that you might want to use if you want to infer your sample results back to the population. The last thing that I want to talk to you about is hypothesis testing, whose concepts you need to understand in your module. With hypothesis testing, there are four steps that you always need to remember. The first step is to know how to state your null hypothesis and your alternative hypothesis. We state these because the alternative hypothesis is what the researcher wants to prove, and the null hypothesis is the statement of no difference or no relationship that you test it against, right? But you need to be careful when you state your null and alternative hypotheses. You are always going to use the population parameter. Whether you're testing whether two variables are related or correlated, or you want to test the difference between two variables, always use the population parameter, which means you will use the mu, or the p, or the standard deviation, which is the sigma. Once you have stated your null hypothesis and your alternative hypothesis, the second step is to define the kind of method you're going to use for your decision. This requires you to understand several things, and we're going to get back to it in more detail when I show you the decision tree. You need to decide which method you're going to use. Are you going to use the t-test or the z-test with a critical value, or are you going to use the p-value to make a decision? The third step is to do some calculation. You need to calculate the test statistic, either the t statistic or the z statistic, and we're going to talk about those just now. And once you have all these values sorted, the fourth step is to make a decision. Based on the decision method that you have chosen, you either reject the null hypothesis or fail to reject it.
When you get to the decision point and make your conclusion at the end, you always have to refer it back to your hypothesis statement. Right? So how do you do this hypothesis testing? In order not to get overwhelmed, you need something to guide you in terms of what you need to be aware of. For example, say your hypothesis is about one population group, with Psych 3704 students as that one group. You need to ask yourself: am I given the population standard deviation, or is the population standard deviation unknown? If the population standard deviation is known, then you're going to use the z-test. That means in your decision method you have already decided whether you are doing a z-test or a t-test, so that you can calculate the test statistic, which is step number three. The test statistic is z equals your sample mean minus your population mean, divided by the standard error, where the standard error is the population standard deviation, sigma, divided by the square root of your sample size: z = (sample mean - population mean) / (sigma / sqrt(n)). If the population standard deviation is unknown, then you're going to use a t-test. Remember, you're doing it for one population group, so it is a one-sample t-test, and you again calculate the test statistic, this time using the sample standard deviation. Now, if your hypothesis is to test two groups, let's assume that now I've got two groups.
I've got Psych 3704 and I've got Stats 1610, right? There are two groups and I want to test these two groups. You need to ask yourself, are these two groups independent or dependent? What do we mean by that? Let's talk about the difference between the two. Independent means they have no bearing on one another. One does not depend on the other. It has nothing to do with the other. Psych 3704 has nothing to do with Stats 1610, so those are two independent groups. Then your decision method will be based on the t-test, and the test statistic will be the difference between the two sample means divided by the standard error, which is based on the pooled sample standard deviation. If you've got two dependent groups, what does that mean? Two dependent groups affect one another: the before and the after. You do an assessment and we get the results, we teach you something, and then you do the assessment again. We take the score of the first assessment and the score of the second one and we compare them, because these come from the same people. So think about it: two dependent groups will be before and after. When it's before and after, we work with the difference. Your t-test will be a t-test of the differences, and your test statistic will be the mean difference: you take the difference of the two values for each person, calculate the average, and divide by the standard error of the differences. If you are looking at the relationship between two variables, you also need to ask yourself: are my variables ratio or interval, or are my variables nominal? If they are numerical variables, then you're going to do a Pearson test, which is the test for correlation. You take your independent and dependent variables, which are your x and your y, and you do your correlation. We're going to do this in detail later, but you need to ask yourself all of those questions.
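The test statistics described so far can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function names and all the numbers are made up, the independent-groups version assumes equal variances (the pooled form), and the critical value 1.96 is the usual two-tailed cut-off for a z-test at the 5% level.

```python
import math
import statistics


def z_one_sample(sample, mu0, sigma):
    """One group, population standard deviation KNOWN."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (sigma / math.sqrt(n))


def t_one_sample(sample, mu0):
    """One group, population standard deviation UNKNOWN (use the sample one)."""
    n = len(sample)
    return (statistics.mean(sample) - mu0) / (statistics.stdev(sample) / math.sqrt(n))


def t_independent(a, b):
    """Two independent groups, pooled variance (assumes equal variances)."""
    n1, n2 = len(a), len(b)
    pooled_var = (
        (n1 - 1) * statistics.variance(a) + (n2 - 1) * statistics.variance(b)
    ) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / std_error


def t_paired(before, after):
    """Two dependent groups: a one-sample t-test on the differences."""
    diffs = [y - x for x, y in zip(before, after)]
    return t_one_sample(diffs, 0)


# Made-up data, for illustration only.
marks = [62, 58, 65, 61, 64]
z = z_one_sample(marks, mu0=60, sigma=5)

# Step 4, the decision, using the critical-value method (two-tailed, 5% level).
reject_null = abs(z) > 1.96

before = [50, 55, 60, 65, 70]
after = [54, 58, 65, 68, 75]
t_dep = t_paired(before, after)

t_ind = t_independent([1, 2, 3], [2, 3, 4])

print("z =", round(z, 3), "reject H0?", reject_null)
print("paired t =", round(t_dep, 3))
print("independent t =", round(t_ind, 3))
```

For a t statistic the critical value is not fixed at 1.96; it depends on the degrees of freedom and has to be looked up in a t table or computed with statistical software.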
If you are testing two categorical variables, which are nominal variables, then you're going to use a chi-square test, which tests two categorical variables, and you're going to use that formula to calculate your test statistic. And that concludes my introduction to statistics. We have barely touched the surface.
What types of questions are you going to get in your assignment and in your exam? There will be something like this, based on you understanding the concepts from the introduction. In psychological research, a construct may be considered as which of the following? They're asking you to find out which of these four statements defines what a construct is. Remember, from the beginning, if we go back to the slide where we had construct, we said a construct is just an explanation of why something or a phenomenon is happening. If you think about that, which one is the correct answer, number one, number two, number three or number four? That is for you to answer. Number one says a measurement based on the careful observation of an aspect of humans or human behavior. Number two, an observation of an aspect of humans or human behavior which was operationalized in some way. Number three, a hypothetical aspect of humans or human behavior which we wish to investigate. And number four, an explanation of empirical observations based on the measurement of certain variables. What is a construct? Is it a measurement? Is it an observation? Is it a hypothetical aspect? Or is it an explanation of empirical observations? Think about it. What have we been talking about? The answer is that it is a hypothetical aspect. Because remember, what is a construct? You cannot touch it, feel it, see it or smell it. It is invisible to the naked eye. Right?
So always think about it that way: it's something that you can hypothesize about and investigate. So that will be number three.
Question two: which one of the following definitions is false? Here they also want you to think about what you know about psychological research, but if you look at it, it talks more about measurement, constructs and so on. Number one says the term construct is used to refer to an aspect of human behavior or experience which is abstracted from observation for study in psychological research. Number two, measurement is a process whereby numbers are allocated to constructs according to a rule. Number three, when a psychological variable is measured, the result is referred to as a statistic. Number four, when a construct is measured, the resulting quantity is referred to as a variable. Which one of these is false? Now, I can tell you for sure that number one and number two are correct, because those are the definitions we just went through. Remember, a construct is something abstracted from an observation. Right? It's something that cannot be seen and cannot be touched, but it's part of human behavior, which can be abstracted. Right? Measurement is a process whereby numbers are allocated. Remember, we said that with measurement we allocate quantities to the constructs so that we are able to measure them. Right? So that is correct and this is correct, which leaves us with two options. We spoke about a population and a sample. Remember that? We said measures that come from a population are called parameters, and measures that come from a sample are called statistics. And when a construct is measured, the resulting quantity is called a variable; we also said that when we measure the quantity, we create what we call a variable. Right? Remember that. So number four is also correct.
The only one that is not correct is number three. If they had said that a measure calculated from a sample is a statistic, that would be right, but here they don't even talk about a sample or a population. It just says that when a psychological variable is measured, the result is a statistic. It could also be a parameter, because if the data that we summarize and measure come from a population, the measures are parameters. So number three is the false one.
So what are the other types of questions? Oh, sorry, I didn't want to click there. I just wanted to check the time. This one comes from one of your past exam papers. It says inferential statistics refers to dot dot dot, and you get four statements. One says calculating statistics which summarize the data. Two says using probability theory to make conclusions based on the observation of data. Three says the process of converting a general research question into a formal hypothesis. Four says the process of finding a way to measure an abstract construct. We're talking about inferential statistics. Remember, we spoke about two branches. Descriptive is a way of summarizing the data. Inferential, we said, is more about explaining something about the population from a sample, which means we are inferring: we are testing a hypothesis or making an estimate to say what the population is like from the sample. So based on that, after I've just explained all this, is it one, two, three or four? So you decided that today you're not going to talk to me. I hope in future you will be able to talk to me. The answer is number two, because inferential statistics is about using probability theory to draw conclusions about the population from the data that you observed in the sample. Number one is descriptive statistics. Converting your general questions into a formal hypothesis, number three, is a step you take so that you can test it, but it is not the definition of inferential statistics. And number four is about measurement.
Next question. When doing research, the term operationalization is used to refer to the process of: one, calculating a test statistic to test a particular hypothesis; two, converting a general research question into a formal statistical hypothesis; three, determining a way to get a numeric measurement of a construct which is being measured; four, converting a calculated statistic into a probability value called the p-value. Number one, number two, number three or number four? If all these questions came from the same exam paper, by now I would have noticed the pattern of this lecturer, because it seems as if this lecturer likes one number. I will take number two. Nope. Number two, converting a general research question into a formal hypothesis, is a separate step that has nothing to do with measurement. Operationalization has to do with measurement, right? So that will be option three. If we go back to the definition of operationalization, remember: if you want to carry out the research project, then you need to be able to measure your constructs. So operationalization has everything to do with the measurement of your construct. Always remember that.
We are left with three minutes, so I think this might be the last question. In social science research, the total collection, remember, everything, the total collection of measurements across a group of research participants is referred to as: one, descriptive statistics; two, sample parameters; three, sample statistics; four, data. This one is a little bit tricky. In social science research, the total collection of, oh yes, the total collection of all measurements will just be data, yes. Okay, last question. In social science, a researcher is told by a grade one teacher that some children are terribly shy, while other children seem to be quite comfortable in their social groups.
The researcher decides to investigate using a test of shyness which was developed especially for young children. In this study, shyness will be a blank, while the measurement of it is referred to as a blank. Like I said, if I was writing an exam, I would say this lecturer likes one number. What is the answer? Number three. It will be number three, because shyness is a construct: you cannot see it, you cannot smell it, you cannot touch it. And when you do the measurement with this instrument, you are creating what we call a variable. So, number three.
Okay, and that concludes our session, right on time. But it doesn't stop you from communicating and conversing about how we help each other unpack your module. So I will see you next week. Before you leave, remember to go to the YouTube channel and subscribe, and when you subscribe, click on the bell to receive notifications. This is just to let you know when the recordings are made available. But remember, here is the catch. All the online sessions are free of charge. You don't pay a cent to attend. All I want is for you to be here and engage with me. From now on, we're going to work through all your content, make sure that you are ready to go and write the exam, take all the difficult concepts that you have, find more activities, and work through them together online. Free of charge: my time, and my effort of preparing, curating, finding all this, packaging it and delivering it to you. But I need to be compensated somehow for all this effort. Someone needs to appreciate the work that I do for you for free, right? So you need to join as a member, because the recordings are not going to be free of charge. I am so sorry. I'm going to make them available to members. The question is: why do we pay course fees, then, if all of the services are free? Ah, that's the other thing.
If you're joining these sessions, I am doing them through Pambili Analytics. There is no UNISA included in here, right? I think you joined the session late. I don't do this as an employee of UNISA or as someone contracted by UNISA. I used to be. Now I'm not. I don't have any contract with UNISA at this point. I'm doing this on my own time. Previously I would do this and publish the recordings free of charge, right? Because UNISA was compensating me. Okay, I will get to your question right now. Let me also finish off the presentation so that it's part of the recording. You can get hold of me via this. I'm doing this through my company, Pambili Analytics. When I did the introduction, I think you joined late. I did already introduce myself and gave a little bit of perspective on this. So you are more than welcome not to be part of this. You are more than welcome not to join the YouTube channel as well. Like I said, it's not even a lot of money that I'm asking there, right? I'm also going to give you some. Let me stop there. Thank you for coming. I'll stop the recording and stop the sharing. Okay, so one time