We are doing descriptive statistics, and we have discussed that the functions of descriptive statistics are primarily organizing the data, presenting the data, and summarizing the data. In the last lecture, we organized the data through frequency distributions and presented it graphically using histograms, polygons, and other graphs. Today, we will move to the third function of descriptive statistics, summarizing the data, which we can achieve through measures of central tendency, meaning that we describe the data with one single value. To give an example, if I have 300 students appearing for an entry test and I have to report the average performance of students on that test, I can summarize all 300 data points into one single value by describing the average of the data, for instance the mean.
Measures of central tendency allow us to find the central value or central location in our data set. So central tendency summarizes the data, identifies the average or single score that serves as a representative value for the data, and defines a center for the distribution.
There are many examples around us where we need to communicate in averages. For example, my professor asked me what the average temperature is in Pakistan during summer and winter. We can take many more examples: if we conduct SAT exams in different private schools in Lahore and other cities of Pakistan, what is the average SAT score for female and male students? Similarly, what is the average rainfall during the four seasons in different areas of Pakistan? So in everyday life we often communicate in averages, in central values, summarizing the data into one single value.
Summarizing the data into one single value, or finding the central position in the data set, is not always an easy job, because with many data sets we struggle to communicate the center without losing the real meaning of the data. As you can see, there are three data sets and distributions in front of us. One of them, data set A, is normally distributed (we will talk about that shortly), and there we can state the average value, which is 5: that is our center, the point of maximum frequency. In data set B or C, you can see that it is not really easy to define the center of the distribution.
That is why we have three measures of central tendency: mean, median, and mode. Each of the three has its own utility, its own place, and its own benefits for communicating results well. We will talk about all three and then compare them: how, when, and where to use which measure of central tendency.
The mean is the center of gravity of the distribution. In most parametric statistics, when we go ahead and study hypothesis testing, usually when we compare groups or draw inferences, our main unit is the mean. In simple words, it is the arithmetic average of the distribution. It is summation X over n (ΣX / n), as you can see in the formula here. Summation X means that we add all the data points in the distribution, and then we divide by the number of data points, which is n. For example, my sample has 4 scores: 3, 7, 4, 6. To calculate the mean, I add all four points and then divide by 4, because n is 4. So 3 plus 7 plus 4 plus 6 equals 20, and 20 divided by 4 is 5. So the mean, or arithmetic average, of my distribution is 5.
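The arithmetic mean described above can be sketched in a few lines of Python. This is only an illustration of summation X over n using the four scores from the example; the function name `arithmetic_mean` is an assumption made for this sketch and is not something shown in the lecture.

```python
def arithmetic_mean(scores):
    """Return summation X over n: the total of the scores divided by their count."""
    if not scores:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(scores) / len(scores)


sample = [3, 7, 4, 6]           # the four scores from the lecture example
print(sum(sample))              # 20
print(arithmetic_mean(sample))  # 5.0
```

Python's standard library also offers `statistics.mean`, which gives the same result for this sample.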
Similarly, we can draw a weighted mean. A weighted mean is used when I have more than one distribution and I want to communicate the results together by combining the means of two or more groups. For instance, I have a male sample and a female sample, or samples from different distributions. For example, suppose I have assigned 10 students from my class to find out the motivation level of students; each student will draw a sample of 50 students and then report one single value to me. To draw a weighted mean, I collect all 10 students' samples and compute a weighted mean from those 10 reported means. The formula is on the screen in front of you: we take the summation X of sample 1, add the summation X of sample 2, and divide by n1 plus n2. Each sample's summation X is simply its mean multiplied by its n, so the formula also works when only the means and sample sizes are reported.
In this example, if we add 4, 5, 6, 7, and 8, we get 30. Similarly, if we add the other sample, 4, 6, 8, 10, and 12, we get a total of 40. So the numerator is 30 plus 40. Our n1 is 5 and n2 is also 5, so 5 plus 5 is 10. That gives 70 divided by 10, which equals 7, our weighted mean. So when we give different projects to students, or when we collect data by drawing different samples on one topic, the weighted mean is useful because it lets you communicate all the samples as one single value, the average of all the samples. This is how we calculate the weighted mean.
Next is the mean from grouped data. We made a frequency distribution for grouped data: when we have a lot of data points, for example 100, 200, or 300, we group them by making class intervals.
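Before moving on to grouped data, the weighted mean calculation from the two-sample example can be sketched in Python. The function names below are assumptions made for this illustration. The second function covers the case from the class project, where each student reports only a mean and a sample size; it relies on the fact that a sample's total equals its mean times its n.

```python
def weighted_mean(samples):
    """Combine raw samples: (sum X1 + sum X2 + ...) / (n1 + n2 + ...)."""
    total = sum(sum(sample) for sample in samples)
    count = sum(len(sample) for sample in samples)
    if count == 0:
        raise ValueError("at least one data point is required")
    return total / count


def weighted_mean_from_summaries(means, sizes):
    """Combine reported means, weighting each mean by its sample size."""
    count = sum(sizes)
    if count == 0:
        raise ValueError("at least one data point is required")
    return sum(m * n for m, n in zip(means, sizes)) / count


sample_1 = [4, 5, 6, 7, 8]     # sums to 30, n1 = 5
sample_2 = [4, 6, 8, 10, 12]   # sums to 40, n2 = 5

print(weighted_mean([sample_1, sample_2]))                # 7.0
print(weighted_mean_from_summaries([6.0, 8.0], [5, 5]))   # 7.0
```

When all samples have the same size, as here, the weighted mean equals the plain average of the reported means; the weighting only changes the answer when the sample sizes differ.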
On the screen we have data with 80 observations. These are scores on a test on which, for instance, scores from 1 to 10 are possible. We have made the classes and written down their frequencies, as we learned to do in the last lecture when we built frequency distributions. To get the mean, we need an extra column, the X column, which is the midpoint column. Remember that to get a midpoint, we add the lower limit and the upper limit of the interval and then divide by 2 to find the center of that interval. So for the first interval, 1 plus 3 is 4, and 4 divided by 2 equals 2. We have found the midpoints for every class in this way.
To get the mean from grouped data, the formula is summation fX over n. fX means that you multiply f by X and then take the summation, that is, the total. So 12 multiplied by 2 is 24. Similarly, I multiply each f value by its X value and then get the total, summation fX, which is 426. I plug the values into the formula, which is summation fX divided by summation f, or n, because they are one and the same thing. So 426 divided by 80 equals 5.325. If someone asks me what the average of those 80 students on that test was, I can report that the mean value in my data is about 5.3. This is how we calculate the mean for grouped data.
Now, we use the mean in most cases, but it may not be the appropriate measure under all conditions. For instance, suppose I have data with extreme values: mostly low scores, such as 2, 3, 4, and 5, and then one score of 50. If I want to compute the mean of this data, I know that the majority of the scores are 5 or below, but one student has a record-breaking score of 50. When I have a distribution like this, with a few extreme scores in the data, the mean might not be a very good choice for reporting the central tendency of the data.
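Returning briefly to the grouped-data calculation, the midpoint column and the summation fX over n formula can be sketched in Python. The transcript gives only the first class (1 to 3, with frequency 12) and the totals (summation fX = 426, n = 80), so the table below is hypothetical: its remaining classes and frequencies are assumptions chosen only so that the frequencies add up to 80, and its result does not reproduce the lecture's 5.325. The function name `grouped_mean` is likewise an assumption.

```python
def grouped_mean(classes):
    """Mean of grouped data: summation fX divided by summation f.

    `classes` is a list of (lower_limit, upper_limit, frequency) tuples.
    X is the midpoint of each class interval.
    """
    total_fx = 0.0
    total_f = 0
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2   # e.g. (1 + 3) / 2 = 2
        total_fx += freq * midpoint      # e.g. 12 * 2 = 24
        total_f += freq
    if total_f == 0:
        raise ValueError("the frequencies must add up to more than zero")
    return total_fx / total_f


# Hypothetical table: only the first row (1-3, f = 12) comes from the lecture.
hypothetical_table = [
    (1, 3, 12),
    (4, 6, 30),
    (7, 9, 28),
    (10, 12, 10),
]

print(grouped_mean(hypothetical_table))  # 6.35 (508 / 80 for this made-up table)
print(426 / 80)                          # 5.325, using the totals stated in the lecture
```

Because every score in a class is treated as if it sat at the midpoint, the grouped mean is an approximation of the mean of the raw scores.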
The mean is a weak choice here because I know that the majority of scores are 5 or below, so the average should also fall somewhere in that range. But if I add these 5 values, the total is 61, and 61 divided by 5 is a little over 12 (12.2). If I report that the average score on the test is about 12, that is not an accurate picture; it is misleading. The same thing happens with annual income: if the data include Bill Gates along with ordinary people, where will the average go? The average income per annum will jump to a very high value just because one extreme value was added to the data. So for this type of data we have to use other measures of central tendency, and we will talk about the next measure in the next part.
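The pull of one extreme score on the mean can be seen in a short Python sketch. The lecture states only that the five values add up to 61 and include one score of 50, so the four low scores used below are an assumption chosen to match that total; they are not necessarily the values shown on the screen.

```python
def arithmetic_mean(scores):
    """Return summation X over n."""
    if not scores:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(scores) / len(scores)


# Assumed scores: four low values plus the extreme score of 50 (total 61, n = 5).
scores = [2, 3, 2, 4, 50]
typical = [s for s in scores if s != 50]   # the same data without the extreme score

print(arithmetic_mean(scores))    # 12.2
print(arithmetic_mean(typical))   # 2.75
```

A single score moves the mean from 2.75 to 12.2, a value that none of the four typical students came close to.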