What are the two approaches that we have? There are two approaches here, numerical and graphical, and then we would like to know certain relationships between known descriptive statistics such as mean, median and mode. The primary objective of descriptive statistics is to describe your data to a totally unknown person.

You do not see it very often in the general literature, but the box and whisker plot is very common in the scientific literature for representing data. So, let us move on. First we would like to discuss different numerical methods to describe the data, or to explore the data.
The first measure of central tendency is the mean: it is the sum of all the values xi divided by the number of data points n. If you look at it carefully, it draws an equivalence with what physics calls the center of gravity of matter. Just as any matter has a center of gravity, any data is centered around its mean value, and so there is a certain similarity between the center of gravity in physics and the mean value of the data in statistics.

There is another such value, and that is called the median. The median is another central tendency value, one that divides the data into two halves. So if x1, x2, x3, ..., xn are your n data points, we order these values. Remember, when you get the values x1, x2, ..., xn they need not be in ascending or descending order. So suppose you order them and call them y1 ≤ y2 ≤ ... ≤ yn. Basically x1, x2, ..., xn and y1, y2, ..., yn are the same values, but y1, y2, ..., yn are ordered. Once you have ordered them, if you have an odd number of data points, that is, if n is odd, you pick exactly the middle value: you pick the ((n + 1)/2)th value. That value is called the median of the data. In other words, it divides the data: on this side 50% of the data will be there, and on this side also
50% of the data will be there. This is called the median value.

Now suppose the number of points is even. Then you have to take an average, because there is no single middle value; there are two middle values. When there are two middle values, you take the average of the value sitting at position n/2 and the value sitting at position n/2 + 1. For example, if we have x1, x2, x3, x4 and x5, all ordered because they lie on the real line, then x3 is your median value. But if you have one more point, x6, then the median value is the average of x3 and x4.

Now the question is: mean or median, which to choose? It is a very common question, because both of them in a way provide a middle value of the data. One divides the data 50-50, while the other sits at the center of the data. When you compare their values, the difference is that the median is robust against extreme values in the data, while the mean is affected by them.

Here we have shown it through a very simple example. Let us take 5 data points: 8, 9, 10, 11 and 12. What is the mean value? The average is 10. And the median, because there are 5 data points, is the middle value, so the median is also 10. Now just replace the 12 by 18, an extreme value. An extreme value is one that is either very large or very small; in this example we are changing the value on the larger side to an even larger value. Now the mean becomes 11.2, while the median is always the middle value and therefore remains 10. So the median is robust against the extreme value, while the mean tends to get affected by it. This kind of difference needs to be used very selectively, very thoughtfully, when you apply the central
tendency value. For example, consider a case in which a single individual is graded by 10 or 12 people; say 12 people have each given a different grade. Someone who wants to favor the individual may give a very high value, and someone who does not want to favor the person may give a very small value. If you take the average as your central tendency, that person's biased perception will affect the mean value, but if you take the median in such cases, it will not. So in situations that involve examining a person, a student, a candidate for a post or a candidate for a promotion, it is very common to use the median as the measure rather than the mean.

There is one more measure, which is called the mode. The mode is very simple. You must have seen frequency plots; we are going to talk about them at a later time. You come across a bar chart in which the frequencies are plotted. In that case the mode is the value where the highest frequency is measured. It is called the most probable value of the data, because it has the highest probability of occurrence. It is possible to have more than one modal value in a data set, and such data is called multimodal. Later in this course, when we come to frequency plots, we will show you one plot from materials data where we do see a bimodal distribution, and this bimodal distribution tells us how to take the analysis further and how to segregate the data.

There are other statistics as well, such as percentiles. Once again you order the data in ascending order. As I said, suppose you order them as y1 less than or equal to y2, less than or
equal to y3, and so on up to yn. Then the first percentile, P1, is the value below which 1% of the values lie. For example, if n had been 100, only one data point should lie below the first percentile, and therefore y2 would have been the first percentile. Likewise you can have a kth percentile, which I call Pk, where the number of values below Pk is about k% of n.

There are special names for some of these percentiles. P25 is called the first quartile, Q1, because 25% of the data lie below it. P75 is the third quartile, Q3, and of course P50 is what we call the median. So if the data lie on a straight line and a data point divides them 50-50, then that point is the median, or P50, or Q2. When we call a point Q1, 25% of the data lie on one side of it and 75% on the other. When we call a point Q3, 75% of the data lie below it and 25% above it.

So these are some of the measures of central tendency. There is a relationship between the 3 measures that we have learned, that is, mean, median and mode. Before we go into it, and before going to the graphical methods, we have to cover measures of dispersion. In central tendency we have covered the measures that decide the center point where the data is located, where the data is centered. We have the mean, which is the average value; we have the median, which divides the data into two parts, 50-50; and we have the mode, which gives you the value with the maximum frequency.

Now we would also like to know how the data is spread. Why is it important? Because knowing the mean value alone is insufficient. We also want to know the maximum and the minimum, the area within which the data is spread: is it narrow like this, or is it wide like this? You will see that sometimes you have the same mean value but the data
could be spread over a larger scale. So we must know what is called the measure of dispersion. There are 3 measures of dispersion we are going to discuss here: the range, the variance or standard deviation, and the interquartile range. Let us go through them one after the other.

The range: if capital M is the maximum of your data values and small m is the minimum, then maximum minus minimum, M − m, gives you the total range of the data. This is one measure of dispersion. So if you have data on a straight line, x1, x2, up to xn, and please note that I am writing it as ordered data, then xn − x1 is the range of the data. Please note that I have assumed that the maximum is xn and the minimum is x1, which need not be true all the time; I have shown it this way for simplicity.

Another measure, the most commonly used one, is the variance. The variance is defined as the mean squared distance of the data from the mean, x̄. You can see that it is a squared distance, (xi − x̄)², and then we have taken a mean value by dividing the sum by n − 1: s² = Σ (xi − x̄)² / (n − 1). Now why are we taking n − 1? You can roughly understand that out of the n degrees of freedom that we had, one has already been used in calculating the parameter x̄, and therefore we divide by n − 1. For the time being we take it as a formula. This is the formula for s², and the standard deviation is its square root, which is s.

The next measure is the interquartile range. In the previous slide we defined the third and the first quartile; the distance between the two, Q3 − Q1, is called the interquartile range. Roughly speaking, the variance gives you the dispersion around the mean value and the interquartile range gives you the dispersion around the median value. The standard deviation, as I said, is the most commonly used measure of dispersion, and in this course we would like to emphasize it because the data representation with an error bar
actually represents the standard deviation that you found in the data. Under the assumption of normality, that is, when the data is distributed very nicely as a bell-shaped curve with the mean value right in the center, if you take x̄ + s and x̄ − s, about 68% of your data will lie in this area. So when you measure any value in your experimental results, you find its standard deviation, and when plotting your data points you can show each one with an error bar. The error bar shows that your value lies right in the center with a spread of plus or minus 1 sigma, and that way about 68% of your data is covered. Therefore it is very common to show an error bar against every data point, and this error bar is generally one standard deviation on either side.

Now we move on to graphical methods. We will cover mainly 4 kinds of graphical methods which are most commonly used: the histogram or bar chart, which is also known as a frequency plot; the pie chart, which you must have seen a number of times; the cumulative frequency plot; and finally the box and whisker plot. As I said, the histogram and the pie chart are very commonly seen in any general literature, such as a newspaper, a general scientific journal or a business journal. The cumulative frequency plot you get to see more frequently in financial matters. The box and whisker plot is very commonly used in scientific data representation.

So let us move on. Throughout this graphical representation we are going to use this data. This shows only part of the data. Some codes are given, code 1 and code 2; there is a temperature; and then there are the material's yield strength, ultimate tensile strength, percentage elongation and percentage reduction of area. So this is typical materials data, and we will cover the graphical
methods using this particular set of data.

Before going further, I would like to summarize what we have done. We are discussing the use of descriptive statistics to describe the data to a common person. There are two methods of doing it, and so far we have covered the numerical method; in other words, you can calculate certain values numerically to describe the data. The first was central tendency, in which we covered the mean or arithmetic average, the median and the mode. Then we also covered how the data is spread, through the measures of dispersion: the range, the standard deviation and the interquartile range. In the next session we will cover the graphical methods, which will be histogram, frequency plots, pie chart, and box and whisker plot. Thank you.
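The numerical measures summarized above can be tried out on the small example from the lecture (8, 9, 10, 11, 12, and the same data with 12 replaced by 18). What follows is only a minimal sketch, assuming Python's standard `statistics` module. The function name `describe`, the dictionary keys and the choice of the "inclusive" quartile convention are illustrative assumptions and not part of the lecture; other quartile conventions can give slightly different values for Q1 and Q3.

```python
import statistics


def describe(data):
    """Return the numerical summaries from the lecture for a list of numbers.

    Hypothetical helper for illustration only.
    """
    ordered = sorted(data)  # y1 <= y2 <= ... <= yn
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    # The "inclusive" method is one convention among several (an assumption).
    q1, _q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        # Measures of central tendency
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "modes": statistics.multimode(ordered),  # list: data may be multimodal
        # Measures of dispersion
        "range": ordered[-1] - ordered[0],           # maximum minus minimum
        "variance": statistics.variance(ordered),    # sample variance, divides by n - 1
        "std_dev": statistics.stdev(ordered),        # square root of the variance
        "iqr": q3 - q1,                              # interquartile range
    }


# The lecture's example: five points, then the largest replaced by an extreme value.
original = describe([8, 9, 10, 11, 12])
with_extreme = describe([8, 9, 10, 11, 18])

for key in ("mean", "median", "range", "variance", "std_dev", "iqr"):
    print(f"{key:>9}: {original[key]!r:>20} -> {with_extreme[key]!r}")

# A small made-up data set with two equally frequent values (bimodal).
bimodal_modes = statistics.multimode([1, 2, 2, 3, 3, 4])
print("modes of the bimodal example:", bimodal_modes)
```

Running this shows the point made in the lecture: the mean moves from 10 to 11.2 when the extreme value is introduced, while the median stays at 10. The range and the variance also react strongly to the extreme value, whereas the interquartile range, which is built around the median, does not change for this particular example.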