Let's talk about some interesting questions. How is the poverty line decided in a country? Or how are exam cut-offs decided? We often wonder why somebody just over a particular line is considered, say, not poor, and somebody just below that line is considered poor. Or, for that matter, when you apply to some college, just by one mark you may lose the seat or you might get the seat.

Just consider the data set that I've drawn here. I've shown it using different dots, but that's not the point. The idea is, let's say this is some college and we have got so many applications, and these dots are showing marks. The college wants to divide the data into two with a vertical line like this. Everybody on the left of that line doesn't get the seat; everybody on the right of the line gets the seat, or probably is considered for the next round. So where do you choose to draw the line? That's where we want to study the median. The median is also a measure of central tendency, and it is usually used to divide the data into two groups, mostly where the mean or the mode might not be the best choice.

Just consider: these are marks, and I decide that I'll use the mean of the marks. But let's say this student here at the top scored way more than everybody else. So the mean would move upwards; the mean will probably be somewhere here. And if I decide that everybody above the mean gets to the next round and everybody below doesn't, then there is a risk of losing out on many good candidates. And if the same thing happens at the lower end, that this student has scored way less, then the mean might move to the left, and this could be the mean. Then we might let through a lot of students, and we will have to consider a lot of students in the next round. So my objective is to select just half of the students
about some median mark, so that I can limit the applications. Let's see how we can do this.

So, to formally define the median, we can say: the median is the observation in the data set where half of the observations are above it and the other half of the observations are below it. This is basically the formal definition of the median. So quite clearly, now we will just have a data set, and our objective will be to decide the observation which separates the data into two parts.

So let's just have a data set. You see a data set like this; we have different numbers. Now the question is, can I just draw a line somewhere and say, okay, these are half the observations and these are the other half, and I'll just choose the value in the middle? No, that's not a correct way to do this, and we'll have to modify the definition of the median that we just wrote. Basically, what we want is that the half of the observations on one side of the median should be less than the median, and the half of the observations on the other side should be greater than the median. So what we'll have to do is arrange the data in increasing or decreasing order.

So let's arrange the data in increasing order. Let's write two, and then we will have the rest of the twos. So we have one, two, three, four twos. That's how we'll write the twos, which is the least value in the data set. Then we'll go for three. There are, I guess, two threes. So yes, two threes. Then we'll go for five, because we don't have a four: five, and then another five, and another five. So there are three fives. Then we'll go for six: there are two sixes. Then sevens: I think there are two sevens. Perfect. Then I don't think there is an eight, but there is a nine: one nine, and the second one.
So there are two nines, and two tens. This is how we have arranged the data in increasing order, and now we can easily select an observation which separates the data into two parts. If you notice, we are working with an odd number of observations, because it's easier to spot the middle value. So let's count how many observations we have: one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen. Because we have seventeen observations, if you choose the middle value, there should be eight values to the left of it and eight values to the right of it. So let's go for the ninth value from the left, and that will be this five here. So this is the median, and it's easy to see that these are eight observations and these are eight observations here. The eight observations on the left are less than or equal to five (two of them are themselves fives), the eight observations on the right are greater than five, and this is the median that we have found.

The median was the ninth number from the left. In general, when we have n observations where n is odd, the position of the median value is (n + 1) / 2. Here that is (17 + 1) / 2 = 9.

I'll encourage you to find out: what is the position of the median when we have an even number of observations? Just a hint: the median is a value that is available in the data set when the number of observations is odd. But when the number of observations is even, the median may not be available in the data set, and you will have to find the median from the two middle values in the data set. So I'll encourage you to find out how we can do that.
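
The procedure just described (arrange the data in increasing order, then take the value at position (n + 1) / 2) can be sketched in Python. This is only a minimal sketch for the odd case; the even case is left as the exercise. The function name median_odd is made up for illustration, and the unsorted order of the marks is an assumption, since only the sorted values are read out in the example. The seventeen values themselves are the ones from the example.

```python
def median_odd(values):
    """Return the median of a data set with an odd number of observations."""
    n = len(values)
    if n % 2 == 0:
        # The even case is deliberately not handled in this sketch.
        raise ValueError("this sketch only covers an odd number of observations")
    ordered = sorted(values)       # arrange the data in increasing order
    position = (n + 1) // 2        # 1-based position of the median
    return ordered[position - 1]   # Python lists are 0-based


# The 17 marks from the example; the unsorted order here is assumed.
marks = [5, 2, 10, 7, 2, 3, 9, 6, 2, 5, 7, 3, 10, 2, 6, 9, 5]

median = median_odd(marks)

# Split the sorted data around the 9th value to check the two halves.
ordered = sorted(marks)
left, right = ordered[:8], ordered[9:]

print(median)  # prints 5
```

Here left holds the eight observations before the median and right holds the eight after it, so you can check for yourself that nothing on the left exceeds the median and nothing on the right falls below it.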