So, welcome to today's presentation on chi-square statistics. It is a very simple test to measure goodness of fit and association between nominal variables. We will have a presentation on the chi-square test and also demonstrate it in Microsoft Excel, and one part of it in SPSS as well. So, let's start with the presentation. Can someone confirm whether you can see the screen? Yes, sir. Very good. Thank you very much. Before I begin the chi-square statistics, I will talk a little bit about parametric and non-parametric tests and, you know, the same old thing about p-values and alpha values. Let's take it as something that has to be understood conceptually before we get into the calculations. First of all, there is a distinction between parametric tests and non-parametric tests. Parametric tests assume certain distributions in the data, especially a normal distribution, and there are other conditions that have to be satisfied. By parameters, we mean terms like the mean and standard deviation, etc. When we have these kinds of things in a distribution, we are doing a parametric test. Basically, parametric tests are about numerical variables, variables which are at least interval or ratio variables. Non-parametric tests do not rely on any distribution; they can be applied even if the parametric conditions of validity are not met. That is why non-parametric tests are much simpler and more elegant compared to parametric tests. But parametric tests are more powerful and can detect an effect even if it is very, very small. So, the first choice is parametric, and if the parametric conditions are not met, we go for non-parametric tests. These are the four assumptions for parametric tests. Number one, the observations have to be independent. That means every item that is there has to be independent.
It is not dependent on any other item for its selection. Two, the distribution is normal. Three, the variance is homogeneously distributed across the sample. And four, the scale is at least interval; it is a numerical scale. So, these are the four assumptions we make for parametric tests. If these assumptions are not satisfied, for example, if it is a nominal scale or an ordinal scale (I'm sure we remember that nominal scales are the ones which are just categories, and ordinal scales are where we use rankings), so when we have either categorical variables or ordinal variables and these conditions are not satisfied, we go for non-parametric tests. The most important thing even in a non-parametric test is the p-value. So, let's very quickly recall what a p-value is. Strictly speaking, the p-value is the probability of getting data at least as extreme as what we observed if the null hypothesis were true. So, a small p-value tells us that our data are very unlikely under the null hypothesis. The threshold probability we choose in advance is known as the alpha value. As we know, we start off with a null hypothesis: we begin with the conjecture that there is no effect between the variables. Usually we take the alpha value as 5%, and this is a very important value to remember. If the p-value is less than 5%, then we reject the null hypothesis. So, we keep the alpha value as a kind of benchmark. If my p-value is less than 5%, we will reject the null hypothesis and we will have more reason to accept the alternate hypothesis. The alternate hypothesis is what we are interested in.
That is roughly what our research hypothesis is most of the time. So, this is a very important indicator even for non-parametric tests, and we will keep coming back to it as I carry on. So, if the p-value is less than 5%, then we will reject the null hypothesis. Now, there are two kinds of errors possible there; I am just repeating it. This matters more in parametric tests, but if we reject the null hypothesis where the null hypothesis is actually false, then we are doing the correct thing. But if we reject the null hypothesis when the null hypothesis is true, then that's an error. It's known as a type 1 error; it's also known as a false positive error. The other error is known as the type 2 error, which is the false negative error: failing to reject a null hypothesis that is actually false. Anyway, this is not strictly required for chi-square statistics, but since we will be dealing with probability values here, I wanted to mention type 1 and type 2 errors as background. So, what is chi-square? It's written as the Greek letter χ (chi), and we square it. It's pronounced 'kai', with a hard k; that pronunciation is important. So, what is chi-square? It's a simple non-parametric test of significance suitable for nominal data, where observations can be classified into frequencies. It is done to find out whether the observed frequencies are any different from the hypothesized frequencies. When we come to the examples, you will see that it is a very simple test: if we are expecting certain values and my observed values are very different from what I'm expecting, then we will have to do a test of significance, and there we will be using this probability value. I'll discuss that in a moment's time. Basically, it is about whether the observed frequencies and the ones we expect are different or similar. We'll find that out in a moment's time.
So, the observed frequency could be the frequency of anything. It could be the frequency of phone calls in a month. It could be the frequency of people with a certain blood group. It could be the frequency of fours hit by Rohit Sharma in matches across the IPL. It could be anything. So, we are concerned with two different frequencies: O is the observed frequency, the frequency of any of these categorical values, and E is the expected frequency. What frequency do we expect, or what is the hypothesized frequency? The term we use in chi-square is "expected"; basically, we are dealing with hypothesized frequencies. So, first, chi-square is used as a goodness-of-fit test: it tells us how well an observed distribution fits a theoretical distribution. For example, we might want to find out whether students choose to answer short questions more than long questions, or things like that. Second, we could be dealing with cross-tabulation across categories. So, these are two different uses. In the first, we are trying to find out whether the frequency is what we expect it to be. In the second, we are talking about cross-tabulation between two categories. Say, for example, we look at Chennai Super Kings, Kolkata Knight Riders, Royal Challengers Bangalore and Delhi Capitals, and ask what the preference for these teams is according to gender: whether a male student prefers CSK or KKR or RCB or DC. We will put that in a table which is known as a contingency table, and we will try to find out from that. As I said, we will have a demonstration of all this, and it will become clearer. One special contingency table is the 2 × 2, and we will explain what all this means.
So, say, for example, this is the information I have about the number of phone calls one friend made to another in these months. In January, the person made 44 phone calls. In February, 56. In March, 59. In April, 89. In May, 22. June, 24. July, 12. August, 42. September, 99. October, 28. November, 49. And December, 35. So, this is the information I have: in these 12 months, these were the numbers of phone calls made. These are frequencies, as you can understand. If there are no variations across the months, then we expect the number of phone calls to be similar in each month. How do we get this figure? This is the total number of phone calls made over the year. If I divide that by 12, this is the expected frequency I get. So, this is the observed frequency, and this is the expected frequency. Now, chi-square will be able to tell me whether these are different, whether we can safely say that the numbers of phone calls across the different months were different. As we can see, they look very different, but we must have an objective criterion for deciding, and we will have to prove it statistically as well. So, for that, I'll go straight to the Excel sheet and demonstrate it to you. When I return, we'll have a discussion on this again. Can you see the Excel screen? Can someone confirm? Yes, sir. Thank you. So, let's see. This is the same thing we just saw, the number of phone calls and the expected frequency. How do we get the expected frequency? We did the summation, and the sum is 559; if I divide it by 12, I get about 46.58 for every month. Excel has a very simple way of finding out whether this is statistically different. If you follow along, you'll see that very quickly.
I'll just enter a formula. To enter a formula in an Excel sheet, we have to start with an equals sign, and after that I type "chi". Here I don't have to go to Data Analysis; this can be done from the function itself. I will zoom in to show it to you. This is the command I'm giving: =CHISQ.TEST(actual_range, expected_range). I have to provide the actual range and the expected range. The actual range is my observed values, so I select the values from here to here. Then, after a comma, I provide the expected range: from here, shift till here. The moment I enter that, I get a significance value. This value is the p-value: the probability of seeing frequencies this different from the expected ones if the null hypothesis were true. What is the null hypothesis in this particular situation? The null hypothesis is that there is no variation in phone calls across the months. The p-value here is 2.6502 × 10⁻²⁹, which means a decimal point, then 28 zeros, and then 26502. As you can imagine, the probability is almost nil. So, we can safely say that the null hypothesis is rejected. What is the null hypothesis? That there is no variation in phone calls over the months. Statistically, using the chi-square statistic, we are saying that no, the null hypothesis is not tenable; we will have to accept the alternate hypothesis, which means there is strong variation in the number of phone calls made across the different months. So, this is a very, very simple way of testing the null hypothesis, and it gives me the chance to go back to our discussion of the p-value.
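The same test that Excel's CHISQ.TEST performs can be sketched in Python; this assumes SciPy is available, which is not part of the original demonstration. Conveniently, `scipy.stats.chisquare` defaults to a uniform expected frequency (the mean of the observed counts), which is exactly our "equal calls every month" null hypothesis:

```python
from scipy.stats import chisquare

# Monthly phone-call counts from the example (Jan..Dec)
calls = [44, 56, 59, 89, 22, 24, 12, 42, 99, 28, 49, 35]

# With no expected frequencies given, chisquare() tests against the
# mean count (total / 12), i.e. "no variation across months".
result = chisquare(calls)

print(result.statistic)  # chi-square statistic, about 163.43
print(result.pvalue)     # about 2.65e-29, far below 0.05 -> reject H0
```

The p-value matches the one Excel reported, which is a handy cross-check between the two tools.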
So, we have seen that if we have these phone-call numbers and we get an expected frequency by just dividing the total by the number of months, then I can safely say that the variation in phone calls across the months is statistically significant. How do we find that out? Because the p-value is very, very small: in our situation it was 0, then 28 zeros, and then 2-something. We only require it to be less than 0.05. So, to remind you what we did just now: chi-square begins with the null hypothesis that the observed frequencies differ from the expected frequencies through chance alone; they are not different because of some inherent reason. We are able to discard the null hypothesis in this case because the p-value is less than 0.05. So, we have to remember this alpha level of 0.05, and if my p-value is lower than the alpha level, we reject the null hypothesis, as we just did in this particular case. This is a very simple illustration of trying to see whether goodness of fit is established or not. But it is not just limited to the p-value; there are other quantities, which I will explain in a moment's time. The chi-square statistic is a single number, and we will have to find out what that single number is at times. Whether it is significant depends on the degrees of freedom and on the alpha value. What is the alpha value? The alpha value is the threshold we set in advance: the risk of a false positive that we are willing to accept, usually 5%. The degrees of freedom here are very simply defined as the number of observations minus 1. So, if I go back to the previous example, the degrees of freedom would be 11, because we are dealing with 12 observations. For 12 months, the degrees of freedom will be n minus 1.
So, it is 11. In many other cases, we'll see that the degrees of freedom are obtained simply by subtracting 1 from the number of observations available to you. Since we have these 12 observations, subtracting 1 gives the degrees of freedom. These are two terms we are supposed to remember when we talk about the chi-square value. In the example I just showed, we used only the p-value to conclude that the result was statistically significant. But we can actually find the value of the chi-square statistic itself. That value is found using this formula: we subtract the expected frequency from the observed frequency, square it, divide it by the expected frequency, and sum over all the categories. In the next example, I will show you how to find this chi-square measure. Even here, if we wanted to, we could easily have found the chi-square measure, but I will do it for the next example. So, the chi-square statistic is a single number that tells you how much difference there is between the observed and the expected values, or whether there is any relation at all; if there is no relation, the chi-square statistic will be very small. And we have to keep in mind the degrees of freedom and the p-value, because we will be using them in our examples again and again. Now, this is a sample of the student population at a university in America. They have been shown to have these percentages of blood types: O type, 38%; A type, 38%; B type, 20%; and AB, 4%. This is what we are observing. The expected frequencies come from the entire population, where it is statistically established that O type is 44%, A type is 41%, B type is 10%, and AB type is 5%. These are the four blood types. As you can understand, these are all nominal values, and these are frequencies.
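The formula just described is easy to compute directly. Here is a minimal pure-Python sketch (the function name is mine, not from the lecture) applied to the phone-call data, reproducing the statistic behind the tiny p-value we saw:

```python
def chi_square_stat(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

calls = [44, 56, 59, 89, 22, 24, 12, 42, 99, 28, 49, 35]
mean = sum(calls) / len(calls)          # 559 / 12, about 46.58 expected per month
stat = chi_square_stat(calls, [mean] * len(calls))

print(round(stat, 2))   # about 163.43, with df = 12 - 1 = 11
```

A statistic this large, at 11 degrees of freedom, is far beyond any tabled critical value, which is why the p-value came out essentially zero.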
So, we are trying to find out whether the blood-group distribution of this particular student population is significantly different from that of the national population. For that, again, we can use the chi-square statistic, because we have the observed values. And where do we get the expected values? Here, the expected frequencies are not equal across categories; they are whatever the entire population shows. In the earlier example, the expected values were equal because we expected the calls to be the same across all months. But here, we expect the values to match the population values: we expect blood type O to be 44%, blood type A to be 41%, blood type B to be 10%, and AB to be 5%. But this is what we are actually observing. What are the degrees of freedom here? Since we have four different categories, the degrees of freedom will be 4 minus 1 = 3. So, here the degrees of freedom are 3, and these are the observed and expected ranges. We will have to find out whether there is any significant difference between the blood groups of the student population and the entire population. So, what do we do? We will have to find the chi-square value, and we will have to find the critical value of chi-square at that particular degree of freedom and at that particular alpha value. There is a table for this; that table exists. If the chi-square value I calculate is more than that critical value... so there are two ways of deciding the same thing. Even with the t-statistic and many other statistics, it is the same: we either go by the p-value and check whether it is less than 0.05, or we check whether the value we are getting is more than the critical value, more than the threshold value, if I can use that term.
So, if the value of chi-square, which I am calculating by taking observed minus expected, squaring it, dividing by expected, and summing, is more than the critical value, then I will say the null hypothesis is rejected. Let me demonstrate it once again. This is the table I was talking about. Let me first show you the table with the critical values, and then it will be easier when we do it again; I will show you a number of ways of doing it. Here on the left side is the degrees of freedom, and on the top is the level of significance, which is alpha. Alpha is the significance level we choose in advance: 5% means 0.05; if I divide 5 by 100, I get 0.05. So, at 0.05 significance and at 3 degrees of freedom, the critical value is 7.815. If the value of chi-square that I calculate is more than this critical value, then I will reject the null hypothesis. So, I have to calculate my chi-square value at a particular level of significance and a particular degree of freedom; these two pieces of information have to be provided, and if my value of chi-square is more than this, I will reject the null hypothesis and accept the alternate hypothesis: that the difference in blood groups from the original population is statistically significant. These tables exist, but you don't always have to go and look up tables, because all of this can be done in Excel or by SPSS itself. This is just to show you the concept behind finding the critical value and the chi-square value we deal with. From the table, at a particular degree of freedom, we find the critical value; if my significance or alpha value changes, you can see this critical value will also keep changing. So, these are my critical values.
So, at a 5% level of significance, let's find out whether it works or not. I will go back to the sheet. This is the observed; this is the expected. Expected means the blood-group distribution seen in the original population, and this is what we are observing. What is the formula for chi-square? I have already entered it here, so I will just show you. You subtract the expected frequency from the observed. Here, the observed frequency is at B2 and the expected frequency is at C2. We square the difference, divide it by the expected value, and then sum all four values. Excel is doing it for me; I am using the formula to find the chi-square value for my particular observations. The chi-square value I get here is 11.24. As you can understand, this is much greater than 7.81. In the table I just showed you, what was the critical value? It was 7.815. The value of chi-square I get after calculating is more than the critical value. Since it is more than the critical value, I can safely say that the difference between the observed and expected frequencies is significant; that means the blood-group distribution of the student population is significantly different from the original population. So, this is another very important test that we can prove very simply, without any major formulas or sophisticated calculations. This is one way of doing it. I will discuss at least two other ways of doing it with the same example, so that this part is clearer to whoever is watching. So, we have seen that the critical value was 7.81 and the value I got was 11.24. Since my chi-square value is more than the critical value, I am able to reject the null hypothesis.
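To double-check the 11.24 computed in Excel, the same goodness-of-fit test can be run in Python (again assuming SciPy, which is my addition, not part of the lecture). Note that the observed and expected percentages each sum to 100, which `scipy.stats.chisquare` requires when `f_exp` is supplied:

```python
from scipy.stats import chisquare

observed = [38, 38, 20, 4]   # student sample: O, A, B, AB (%)
expected = [44, 41, 10, 5]   # national population: O, A, B, AB (%)

result = chisquare(f_obs=observed, f_exp=expected)
print(result.statistic)      # about 11.24, above the critical value 7.815
print(result.pvalue)         # about 0.01, below 0.05 -> reject H0
```

Both decision rules agree here: the statistic exceeds the tabled critical value, and equivalently the p-value falls below 0.05.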
This is basically the chi-square distribution. The alpha value sits in the tail; when we talk of that 5%, it is a right-tail test, as we say: if the p-value is less than 5%, then we reject the null hypothesis. This is the chi-square distribution we get. As you can see, it is different for different degrees of freedom, and it is different from the normal curve we know, which is exactly symmetrical. As the degrees of freedom increase, it comes to resemble a normal curve. So, this is the chi-square distribution, and this is the distribution we are testing against. Now, I will use the same example to find all these values without using the table. I have already calculated all these things using these functions, CHISQ.INV.RT and the other functions, to find the value of chi-square. Let me just show it to you. In the same sheet, see, we calculated this earlier using a slightly laborious method: we had to subtract, square, and divide. You can see that on the formula bar; this is what we had to do. If I do not want to do that, and I want to do it in a simpler manner, what do I do? I will again go to the same test I used last time, which was CHISQ.TEST. What do I do in CHISQ.TEST? I provide the actual range and the expected range, and this is the value I'm getting. What is this? This is the p-value for this particular test. Since this p-value is less than 0.05, we reject the null hypothesis; that is what we showed you earlier with the phone calls. So, this is the p-value I'm getting here.
Now, if I want to find the chi-square value straight away, we can use this p-value with a function in Excel itself. As you can see, this is the value I got by calculating with the formula; now I want to get it directly. For that, the function is CHISQ.INV.RT. What is the probability? The p-value I just obtained. What are the degrees of freedom? We have just calculated them. If you see this, it is exactly what we got before, to four decimal places; if I adjust the decimal places, I get it exactly. This is the chi-square value I get using the built-in Excel function. So, what I did here was take the p-value from CHISQ.TEST, and then, using CHISQ.INV.RT (RT means right tail), I can recover the chi-square value from that probability. If I want the critical value, the value I used the table for, I can also get that with a function. This is in the presentation itself, so if you go back and look at the presentation, you won't have to remember all these terms; I will repeat it in a moment's time. The plain inverse function, CHISQ.INV, uses the left-tail probability, so for the critical value we have to subtract alpha from one. What is alpha? Alpha was 0.05, so 1 minus 0.05 is 0.95. I don't want to confuse you with left tail and right tail, but if I want to get the critical value, this is how I go about it: CHISQ.INV with probability 1 minus alpha.
And what are the degrees of freedom here? The degrees of freedom are 3. If I compute that, I get 7.8147. What is this? This is the critical chi-square value for this particular distribution. If you remember, this is the same critical value we got in the table. So, instead of using the table, we are using the Microsoft Excel function to find the critical value for my distribution. The critical value here is 7.8147; if I round it off to 2 decimals, it is 7.81. The chi-square value I'm getting is 11.24, which is obviously greater than this value. So, instead of finding this value from a table, we have a very simple function, CHISQ.INV; using it is a very simple way of finding the critical value. For the inverse, I have to use 1 minus alpha, where alpha is the significance level, the false-positive risk we are willing to accept. Once I know the critical value, I then find the chi-square value of my distribution, and if that value is more than the critical value, I reject the null hypothesis. And if you are able to reject the null hypothesis, you have to feel happy: the null hypothesis being rejected means my research hypothesis is supported. We are able to say that the blood-group distribution of the student population here is significantly different from that of the entire population. So, these are three different ways of doing chi-square statistics using Microsoft Excel. I will also show you one SPSS example very quickly as I go ahead, and then we will finish. To go back: what CHISQ.INV.RT does is calculate the chi-square value from the right-tail probability, so instead of using the formula, which formula? The formula we spoke of; we are not using that formula.
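The Excel functions CHISQ.INV (left tail) and CHISQ.INV.RT (right tail) have direct SciPy counterparts, `chi2.ppf` and `chi2.isf`; SciPy is my assumption here, as the lecture itself uses Excel. A quick sketch for our df = 3, alpha = 0.05 case:

```python
from scipy.stats import chi2

alpha, df = 0.05, 3

# Left-tail inverse, like Excel's CHISQ.INV(1 - alpha, df)
crit_left = chi2.ppf(1 - alpha, df)

# Right-tail inverse, like Excel's CHISQ.INV.RT(alpha, df)
crit_right = chi2.isf(alpha, df)

print(round(crit_left, 4))   # 7.8147 -- the same critical value as the table
print(round(crit_right, 4))  # identical: the two calls are equivalent
```

Either call reproduces the 7.815 we read off the printed table, so the table lookup and the function call are interchangeable.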
Instead of using that formula, if you want the value of chi-square straight away, you use this function in Excel, CHISQ.INV.RT, where the probability is the p-value you get from CHISQ.TEST and the degrees of freedom you get by subtracting one from the number of observations. So, it takes two arguments: the probability and the degrees of freedom. The same goes for the critical value. I just showed you how to find the critical value using a probability; what is the probability here? 1 minus alpha. You subtract alpha from 1 to get the probability, and you use the degrees of freedom to get the critical value of chi-square. So, to find the critical value of chi-square, I do not always have to use the table; I can do it straight away in Microsoft Excel. This, then, is my result: since my critical value of chi-square was 7.81 and the chi-square value I got was 11.24, I am rejecting H0. H0 means the null hypothesis. What do I infer from that? I infer that the distribution of blood types in the student population differs from that claimed for the national population. So, this is a very useful test I am able to do there. Another test that we do with chi-square is known as the test of independence. Basically, it is run on two variables, as I told you. It could be male or female, and then you are also trying to summarize their views or their association on some other variable. For example, it could be IPL teams, as I just showed you, or which kind of sport they like, or which kind of food they like, or whatever. When we have this kind of frequency table, these frequency tables are known as contingency tables.
So, the first chi-square test I showed you was the goodness-of-fit test; when we have data in this form, it is known as the test of independence. Here we try to find out whether there is a relationship, say, for example, between the socio-economic background of a child and his or her preference for extracurricular activities. So, we have the socio-economic background of a child, say rich, poor, or middle class, and his or her preference for extracurricular activities, and we have those frequencies. Or we can try to find out whether there is a difference in social media use among people of different political backgrounds: people may be leftist or rightist or centrist, and we look at the amount of time they spend on different social media, on Twitter, Facebook, or Instagram. This is another test we can do using chi-square. Or, is preference for a sport influenced by gender: do male and female respondents have different preferences for lawn tennis, cricket, football, rugby, etc.? These are the kinds of things we can find out with the chi-square test. These tables are known as contingency tables. Here, for females, the choice for archery is 35, for boxing 15, for cycling 50; for males, archery is 10, boxing is 30, and cycling is 60. We try to find out whether, across these sports, the difference between the genders is significant or not. So, we use chi-square for these kinds of things as well. As you can understand, there are two different nominal variables, and we are working on a contingency table like this. This kind of table is known as a 2 × 3, because there are two rows and three columns.
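A sketch of this test of independence in Python (assuming SciPy, my addition): `scipy.stats.chi2_contingency` takes the 2 × 3 table directly and returns the statistic, the p-value, the degrees of freedom, and the expected counts:

```python
from scipy.stats import chi2_contingency

# Rows: female, male; columns: archery, boxing, cycling
table = [[35, 15, 50],
         [10, 30, 60]]

chi2, p, dof, expected = chi2_contingency(table)

print(dof)              # (2 - 1) * (3 - 1) = 2
print(round(chi2, 2))   # about 19.80
print(p < 0.05)         # True -> sport preference is associated with gender
```

Notice that the function also hands back the expected counts it computed from the row and column totals, the same numbers SPSS shows when you tick "Expected" in the Cells dialog.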
If we have just two rows and two columns, it will be a 2 × 2 contingency table. What are the degrees of freedom for something like that? We subtract one from the number of rows, subtract one from the number of columns, and multiply the two: (rows − 1) × (columns − 1). So, here, with two rows, subtracting one gives 1; with three columns, subtracting one gives 2; and 1 × 2 is 2. So, the degrees of freedom here would be 2. These are very simple things; many of you might be hearing them for the first time, which is why they might appear a little difficult, but they are not. At times we have to use the Yates correction, when the expected frequency in any of these cells is smaller than five. We expect the expected frequency to be at least five; if it is less than five, we have to use a correction for chi-square known as the Yates correction. I will come back to this after I show it to you in a moment. The last thing I am showing you is not in Excel but in SPSS: I am showing you the data view in SPSS. There are lots of variables about age, gender, family income, media use, and so forth, and we are trying to find out how to run a chi-square test on something like this. In SPSS, the most important menu is Analyze. You go to Analyze, then Descriptive Statistics, and then Crosstabs. This is how you do a chi-square test in SPSS: first the Analyze drop-down menu, from there Descriptive Statistics, and from Descriptive Statistics, Crosstabs. If you click on Crosstabs, this is what you get. Say, for example, I want to find out whether the gender of the respondent has any effect on the purpose for which they use the internet. I will zoom in again. What we are doing is building a cross tab, using a contingency table.
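The Yates correction mentioned above can also be seen in SciPy (my assumption; the lecture demonstrates it in SPSS): for a 2 × 2 table, `chi2_contingency` applies the correction by default, and `correction=False` turns it off. A sketch with a made-up, purely illustrative 2 × 2 table:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2 x 2 table (counts are illustrative only)
table = [[10, 20],
         [30, 40]]

chi2_yates, p_yates, dof, _ = chi2_contingency(table)                  # Yates on (default)
chi2_plain, p_plain, _, _ = chi2_contingency(table, correction=False)  # Yates off

print(dof)                    # (2 - 1) * (2 - 1) = 1
print(round(chi2_yates, 4))   # about 0.4464 -- corrected, more conservative
print(round(chi2_plain, 4))   # about 0.7937 -- uncorrected
```

The corrected statistic is always smaller, which is the point of the correction: it guards against chi-square overstating significance in small 2 × 2 tables.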
As you can understand, this is also a chi-square test, but slightly different from what we did before. This is what we have to do: from here, we click Cells and tick Expected. That is all we have to do for the chi-square itself. One more thing we have to do in this SPSS test is to see the impact, the effect size, of the chi-square. The chi-square only tells us whether there is a difference or not. If you remember correlation, we also had to find out how big the effect is. To find the effect size, we have to use phi and Cramér's V, and at times we have to use lambda. If I select these and continue, I will get an output. I just want to show you the output very quickly. If you can see here, it says 5 cells have expected count less than 5. So there are cells with an expected count of less than 5. Now look at the chi-square. This is a cross-tabulation table; you don't have to go into the details of it, I am just trying to show you how to read the result in SPSS. For 9 degrees of freedom, the p-value is 0.278. Since the p-value is more than 0.05, I can safely suggest that there is no association between the gender of the respondent and the reason why they use the internet. There is no difference there. We use this cross tabulation and, as you can see, we are getting the expected count and the observed count. From there, the chi-square test is done by SPSS itself. I don't have to do anything; I just have to read off the value. If this p-value had been less than 0.05, I would have said that this is significant, and then I would have gone on to look at the effect-size values, which would have told me how strong that relation is. This is again a very simple way of working with SPSS to find the chi-square statistic. As I told you, you just have to go to Analyze and use Descriptive Statistics and Crosstabs.
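The effect sizes just mentioned, phi for 2 × 2 tables and Cramér's V in general, are simple functions of the chi-square statistic. A rough sketch, using the statistic from the earlier sports table as an illustrative input (chi-square of about 19.8 on 200 respondents):

```python
import math

# Illustrative inputs: chi2 ~ 19.8 from the 2 x 3 sports table, n = 200.
chi2, n = 19.8, 200
rows, cols = 2, 3

# Phi (appropriate for 2 x 2 tables): sqrt(chi2 / n)
phi = math.sqrt(chi2 / n)

# Cramér's V (general case): sqrt(chi2 / (n * (min(rows, cols) - 1)))
cramers_v = math.sqrt(chi2 / (n * (min(rows, cols) - 1)))

print(round(cramers_v, 2))  # about 0.31, read much like an r value
```

For a table with only two rows, min(rows, cols) − 1 = 1, so Cramér's V and phi coincide, which is why SPSS reports them side by side under "nominal by nominal".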
From Crosstabs, you choose whichever variables you want to test: the reason for being on Facebook, for example, or the topic of the post, which is what I do here. I showed you one example where the result was not significant; here I am getting the p-value, as you can see, and this is the value which is important. Here we are trying to find out whether there is any difference between males and females in the topic of their Facebook posts. We can see that there actually is a difference, because the p-value we are getting is below 0.05, which is exactly what we want. If there had been cells with expected counts less than 5, we would have had to make the Yates correction; here there are no such cells, which is better. That is what is required. Now, if we look at the "nominal by nominal" section, this is the effect-size value, and it is very small. So although the chi-square test tells me that there is an effect, the size of the effect is found through this value, and the size of the effect is very small, as I told you. If you look at this phi value, it is, as you can see, at a medium kind of level. If you remember the r value in correlation, where we find out whether things are correlating or not, phi and Cramér's V are very analogous to those r values: they are the effect size, very similar to r. This tells me how big the effect is. So finding chi-square in SPSS is much simpler: I just have to follow the steps and put my nominal variables in either the rows or the columns. Either of them can go in rows or columns; there is no hard and fast rule about that. The chi-square reflects the size of the discrepancy, as we have told you. We might have to find the critical value of chi-square; we can either find it through the table or we can use the Excel formula, as I told you.
Then I find the chi-square value for my observations and see whether it is greater than that critical value or not. If it is greater than the critical value, then I will say that there is an association, that the null hypothesis is false. This is the chi-square distribution according to the degrees of freedom: here the degrees of freedom is 1, here it is 2, here 3, and so on. The higher the degrees of freedom, the more the distribution resembles a normal curve. So these are chi-square distributions. The chi-square test can also be used as a parametric test, to answer whether a variance is equal to, less than, or greater than some predetermined value. So chi-square can also serve as a parametric test; we generally use it as a non-parametric test, but it can be used as a parametric test as well. I have just shown you cross tabs in SPSS. So that's all for today's presentation, and once we start working on it we will have many more things to realize about chi-square distributions. So that's all for today.