Okay, so let us restart our discussion of the basics of statistics. In this session I am going to continue with random variables, and, given that there is variation in whatever it is we want to measure, we are going to have to figure out how to deal with that given the few attempts we have at doing experiments. Going forward, what we need to understand is this: remember what a random variable was. It takes on a set of predefined values; there is a whole range of values it can take, and you know what that range is expected to be, but you do not know which outcome you will see in the next trial. Given that there is uncertainty about what you are about to see, the question comes up: what is the expected value of this variable? Let us use the example of heights of people again. Say an alien comes down from Mars and is trying to figure out the average height of a human being. The alien is going to have to do a few experiments: he is going to catch a few humans, measure their heights, and then, using this collected set of measurements, figure out the average human height, the expected height. So we are going to have to discuss what it is that the alien ultimately wants to measure and report back to Mars, what it is that is actually being seen as individual measurements, and therefore how much information each individual measurement provides. The point being: if an individual measurement is associated with a random experiment, then there is error in it, and if there is error in it, it becomes important to ask what the expected value of x is, given that we saw a particular outcome in a given trial of that experiment.
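The alien's procedure can be sketched in a few lines of Python. This is a minimal illustration, not anything from the lecture slides: the population mean and spread used here (165 cm and 7 cm) are invented numbers, and the "population" is simulated with a Gaussian so that we can pretend to catch humans at random.

```python
import random

random.seed(42)

# Hypothetical population of human heights (cm); these parameters are
# made up purely for illustration.
POP_MEAN, POP_SD = 165.0, 7.0

def measure_one_human():
    """One trial of the random experiment: the height of one randomly caught human."""
    return random.gauss(POP_MEAN, POP_SD)

# The alien catches a small sample and averages the measurements.
sample = [measure_one_human() for _ in range(30)]
sample_mean = sum(sample) / len(sample)

print(f"sample mean from 30 humans: {sample_mean:.1f} cm "
      f"(true population mean: {POP_MEAN} cm)")
```

Run this a few times with different seeds and the sample mean moves around: each individual measurement, and anything computed from the sample, carries the error the lecture is talking about.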
So what is the expected value of x? That leads us to the concept of an expectation, and we need a notation for it: E(x) is the expected value of x. It can also be represented, and in fact most books would have it, as mu, and that is often referred to as the population mean. What we mean by a population mean is: what is the average height of the entire human population? That, after all, is what the alien is required to report back to Mars. Of course, what the alien is doing is looking at individual measurements of a few people, and the hope is that by looking at those few people you can comment on the average height of the entire human race. So there is a population, and there is a sample of people whose heights are being measured, but remember, what we want to report is not what is going on with the sample; what we want to report is what is going on with the entire population. So what is the population mean? And of course, if you are going to talk of the population mean of heights, given that heights are expected to vary across the entire population, what is the expected deviation of an individual's height from this mean? In other words, what is the variance, the population variance? If you had the ability to measure the heights of everybody in the whole population, what is the average height, and how much variation do you expect to see? The standard deviation, which I mentioned briefly before the break, is the square root of the variance. So an expectation refers to the expected value of a measurement, but the moment you talk of an expected value for a measurement, what you are actually implying is: if you had had the ability to look at this measurement across an entire population, if you had looked at the complete set of measurements, what would that average, or that average deviation, have turned out to be? So it is a population measurement.

Now, it turns out that if you want to compute an expectation, we have to go back to asking, for each possible value of the variable that we expect to see, how often is that value likely to arise. If my variable takes the value x i, how often is x i likely to arise? Our notation has been that the probability of seeing x i is p(x i), and therefore the expected value of x is the summation, over all values, of a particular value x i times how often that value is likely to occur: E(x) is the summation of x i times p(x i). It is easier to see how to use the concept of an expectation if you go back to the coin toss experiment. In most of statistics we are actually tossing coins or rolling dice, because it is easier to visualize those as examples and then map them back onto the variable or the process that you are looking at. So look at a single fair coin being tossed, and remember that heads and tails are not random variables; you need to define a mapping from heads and tails to 1 and 0. Let us say heads implies x is 1 and tails implies x is 0. Given that mapping, since it is a fair coin, the probability that you get x equals 1, heads, is the same as the probability that you get x equals 0, tails, and that is half. In which case, what is the expected value of x? If you go back to the formula on top, you are basically saying that you will see the value 1, heads, half the time and the value 0, tails, the other half of the time, and so the expected value of x is 0.5. So the expectation, the expected value of x, is 0.5, while the actual outcomes of an individual experiment will be 1 and 0. If you actually toss a coin you will see a heads or a tails, in which case x is 1 or 0, but the average value you see is 0.5. In a sense you should be slightly amused by the fact that the average value can never be an individual outcome by itself; you will never see, in an individual experiment, x take on the
value 0.5. x is only 1 or 0, but its average value takes on a fractional value. The same thing applies if you roll a die instead of tossing a coin. Take an unbiased die and roll it: you expect to see the values 1, 2, 3, 4, 5, 6 with equal probability, because I have just said it is unbiased. If I then ask you to figure out the expected value of x, where x is the value that comes up when you roll the die, you look at each outcome, go back to the formula, the summation of x i times p(x i), and ask how often that outcome is going to occur. In this case each outcome comes up with probability 1 over 6, and if you go through the formula it will turn out that the expected value of x is 3.5. Once again, 3.5 is not an outcome you will ever see when you roll a die once. So why do expectations matter to you? I have put down two scenarios on my slide: a casino and a lottery. Expectations matter if you talk of probabilities, and probabilities matter if you talk of entire populations, which in turn matters if you think of doing an experiment an infinite number of times.
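Before moving to the casino, the two expectations just computed by hand can be checked directly from the formula above, E(x) = sum of x i times p(x i). A minimal sketch:

```python
def expectation(outcomes):
    """E(x) = sum of x_i * p(x_i) over all possible values x_i."""
    return sum(x * p for x, p in outcomes)

# Fair coin: heads -> 1, tails -> 0, each with probability 1/2.
coin = [(1, 0.5), (0, 0.5)]

# Unbiased die: faces 1..6, each with probability 1/6.
die = [(face, 1 / 6) for face in range(1, 7)]

print(expectation(coin))  # exactly 0.5
print(expectation(die))   # 3.5, up to floating-point rounding
```

Note that neither 0.5 nor 3.5 is a value the random variable itself can ever take in a single trial, which is exactly the lecture's point.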
So when you are at a casino, for example, gambling at a game, what happens to you? You go in with a limited amount of money and you start betting at a particular game, so your ability to play is limited by how much money you have, by how much you can tolerate losing. But then why does the casino play this game? Why do the owners of the casino play it, given that they also run the risk of losing money? What ends up happening at a casino is that the odds of a win at a game are, obviously, rigged, and they are rigged against you winning, such that for every 90 rupees, let us say, that the casino expects to give out to customers as their winnings on a game, it makes sure there are enough losers that 100 rupees are coming in. Let me repeat that: if the casino owners rig up a game such that on average they have 100 rupees coming in and on average they are expected to give out 90 rupees to the people playing, then in effect they have made a net profit of 10 rupees, again in an expectation sense. So the casino owners do not necessarily worry about whether they are about to give out 1 lakh rupees to an individual. They may indeed end up giving 1 lakh to an individual, but there are so many other people losing their money at the game that effectively the casino takes in more money on average, even though it has to give out large sums to individuals. And exactly the same thing happens with a lottery.
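The casino's arithmetic can be made concrete with a small simulation of a hypothetical rigged game; the stake, prize, and win probability below are invented numbers chosen to give the 90-in, 100-out flavour described above. Each bet of 10 rupees wins a 90-rupee prize with probability 1/10, so per bet the casino expects to keep 10 - 0.10 × 90 = 1 rupee.

```python
import random

random.seed(0)

STAKE, PRIZE, P_WIN = 10, 90, 0.10  # invented numbers: a 10% house edge

def play_once():
    """Casino's net result on one bet: stake comes in, the prize may go out."""
    return STAKE - (PRIZE if random.random() < P_WIN else 0)

# An individual player cannot rely on the expectation (limited purse, few bets),
# but the casino sees a huge number of bets, so the expectation is what it earns.
n_bets = 100_000
casino_profit = sum(play_once() for _ in range(n_bets))
print(f"casino profit per bet over {n_bets} bets: "
      f"{casino_profit / n_bets:.2f} rupees")
# analytically: STAKE - P_WIN * PRIZE = 10 - 9 = 1 rupee per bet
```

Over a hundred thousand bets the observed profit per bet sits close to the expected 1 rupee, even though any single bet can cost the casino 80 rupees.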
So with lotteries: you might have a limited ability to play the lottery, because you do not wish to spend too much of your money, so you might play it now and then. But the owners of a lottery, a state government for example, are not too concerned about whether they are giving out lakhs or even crores to an individual, as long as enough people have played the game and lost that the expected net profit is positive. As long as they make a net profit, they will continue with the lottery scheme. So why does the expectation matter here, and to whom? As an individual playing this game, you do not care to worry about expectations, because you are not going to gamble an infinite number of times; you have a limited amount of money, and as the sum in your pocket starts dropping you are going to lose your nerve and walk away from the game. But the casino owners or the lottery owners are in it for the expected profit. They are not in it for an individual event or an individual trial; they are in it for the long term, so it is the expectation, the expectation of a net profit, which matters to the casino.

And this is typical of our research problems, in the sense that we wish to make comments about what would happen if we had the ability to repeat an experiment a large number of times; we wish to make comments about model parameters taking on particular values. But the fact is that we are unable to do these experiments a large number of times, so we are unable to calculate that population average ourselves. Instead we do the experiment a few times, and then we ask: what is the expected value that seems to emerge from doing the experiment a few times? So in our research we always wish to make a comment about population values, values under the assumption of infinite experimentation, but our limitation is that we can do the experiment only a few times, and therefore we resort to talking about expected values given the few samples that we have seen. And of course the immediate headache is that our samples might be sheer occurrences of luck. For example, if a coin is fair, the expected probability of heads is 0.5, but if I ask you what proportion of heads you see when you toss a coin 100 times, you may end up with a sampled experiment where you see 40 heads out of 100, and you will come back and say: I saw a proportion of 0.4 as the proportion of heads. What is important is not necessarily the 0.4; what is important is the expected value that the 0.4 is telling us about. Is that 0.4 ultimately reflecting a true value of 0.5, in which case it is okay that we saw a measurement slightly away from 0.5? And it is the same for this alien who has come down from Mars and is supposed to report a height back to his planet. He is going to catch a few human beings, as we said, measure their heights, and assume that the average he gets out of the sample somehow reflects the entire population of human beings, because the alien does not have the ability to capture the entire human race. The alien wants to talk about the expected height of the entire human race, but he does that by looking at individual measurements and then trying to come up with some logical measure of an average from those measurements. So what is really important is the expected value, not the individual measurement, and that is a theme I will come back to again and again when we talk about experimentation. In all of this discussion I have been forced to talk about a population and a sample. The moment we said expectation, we said: this is what happens if you looked at the entire population; that would have been the expected value of something if I had had the ability to do an experiment an
infinite number of times; that would have been the population value of a particular model parameter. So invariably we want to make comments about population parameters, and to do that we seem to be sampling. The alien sampled a few human beings and found their heights, but we sample all the time. When we carry out an exit poll after an election and want to find out which party is likely to have won, it is not as if every single person who voted is asked: who did you vote for? By sampling a few people as they leave the polling booth, the TV station is going to try to find out who has likely won the election; that is the exit poll. A clinical trial is a similar situation. For example, somebody has come up with a new drug and wants to prove that it is a good drug, that it cures, let us say, a certain type of cancer. Now there is no way you can do this experiment systematically by testing it on every single patient who has that form of cancer; you cannot do that. Instead, common sense tells us to find a set of people who have that particular type of cancer and experiment with the drug on just them, and if their health improves, you probably have yourself a good drug. In other words, carry out a small clinical trial. The point is that the clinical trial, hopefully, is letting you know how this drug will behave for the entire population of people who have that particular disease. So you sample with the intention of learning something about how your process might behave across the entire population. And the moment we talk about a population, we need to know how often each member of that population is likely to be seen, which of course is a probability distribution, and the probability distribution, as I said before, is a model based on an infinite sample. The samples we are talking about are measurements coming out of this entire population and therefore following this distribution, and therefore we will have to start asking questions like: what does this distribution look like? Where are our samples relative to the shape of this distribution? Are we seeing something normal, or something unusual, in our measurements?

By the way, it is important to recap something we said about the random experiment: the samples must be randomly chosen. If the alien comes down to earth and happens to sit in one of the local centres for a workshop, that is probably a bad estimate of the average human height, because, as you can see around you, you do not have children in your auditorium, so you are probably not getting a true estimate of the average human height. You really want a random sampling of people: the alien is expected to roam around randomly and capture human beings, young, old, short, tall, at random, without bias, and then sample, estimate an average, and report that average back to Mars. So random is a word that has already come up several times: the random experiment, the random variable, and now the need to sample randomly, without the possibility of bias. Which basically means you need to be careful when you keep using this word random. Remember, random does not mean that we have no control over an outcome; random simply means that we are sampling without bias from a population of measurements. If I have a population of people, I will randomly sample a group of them, measure their heights, and report an average. If my experiment can give me a range of measurements, I will randomly try a set of conditions which allow me to span that range, and then report an average measurement. So heading forward, and in particular heading into hypothesis testing, where you have some phenomenon and some model with parameters in it, where we are going is that we want to start making comments about the expected value of a parameter of a model. That model by definition is a
population model, meaning that whether you are testing acceleration due to gravity in one centre or another, all of you should, for example, be getting the value 9.8 metres per second squared. It is a population value, but given the experimental infrastructure in each of your centres, it is possible that you will end up with slightly different values, and that is okay. What we really want, however, is not the individual measurement; what we want is the expected measurement, the expected value of that parameter. Typically, the parameters we have looked at so far, for example when we talked about heights of people, are: what is the average height, and what is the variance in heights? So mean and variance. The other kind of parameter we have talked about is the proportion of heads you might see when you toss a coin, and whether that proportion is 0.5 for a fair coin, and so on. But these could also be model parameters, for example to do with linear models. When you talk about f equals m times g, with g the acceleration due to gravity, it is actually a linear model that you are fitting between f and m, where you are saying the proportionality constant is g. So remember, it is always a population parameter you want to make a comment about, but you, because you are the one doing the experiment, are going to see the individual measurements. The alien will see individuals whose heights the alien will measure, and those individual measurements are random variables. On this slide itself, then, you can see the use of the word parameter and the use of the phrase random variable. So what is the difference? A parameter is like the standard value of gravity, 9.8 metres per second squared: regardless of who does this experiment or where, this value should not change; it is a true universal constant. If I look at people doing this experiment across several places, several times at each place, and I do the experiment as often as I can, that average value is 9.8, and at that point it is a parameter which does not change on you. But when we do an experiment once in a while, that individual measurement is a random variable, because the individual measurement can change on you.

It is very easy to see this, not with gravity, but with the coin toss, which is why we go back once more to the coin toss. Go back to that 100-coin-toss experiment; it was supposed to be a fair coin. What is the expected number of heads you should have seen; what is the parameter in that model? For a coin toss the parameter is the proportion of heads you see in a number of tosses, and the proportion we expect for heads is 0.5. So you expect to see a proportion of 0.5, but when you carry out an individual experiment, you happen to see 34 heads in 100 tosses, which suggests a proportion of 0.34. So what is the difference? 0.34 is a measurement, an individual measurement, reflecting a random variable. Why? Because the next time you do this experiment, instead of seeing 34 out of 100 you might see 55 out of 100. Each time you do the experiment this value might change, so it is a random variable. On the other hand, if it is a fair coin, once and for all you should be seeing 0.5 as the proportion of heads; that should not change. The parameter cannot change; what you are seeing is changing. And if you bother to do the experiment many times, you will quickly realize that you are probably near about 0.5, sometimes less, sometimes more, but basically somewhere near the parameter value. So the challenge going ahead is: if individual measurements are random variables which can change on you when you decide to do the experiment again and again, how do we know that we are getting close to the true underlying value of this parameter? For example, how does that alien
know that he has found the true height of the human race, given that he has sampled only a few people, and everything may depend on how he has sampled? Now there is an interesting insight in statistics, in terms of theory, which is: do not pay attention to single measurements. This of course assumes that you have the ability to repeat an experiment. I have already talked about that climate change example; you cannot repeat climate change experiments, but the coin toss experiment we could repeat, we could keep tossing away. So if we can keep tossing away, and we keep getting values like 34 out of 100, or 55 out of 100, or 70 out of 100, do you want to look at each of these experiments in isolation, or do you want to collect all your experimental data together and look at it in one shot? It turns out, intuitively, that you want to look at all your data together, which is where in fact the word statistic comes into the picture. A statistic is any quantity that you derive by looking at all your sample data: it is a value whose calculation depends on the collection of samples that you have, and not on the entire population.

So now the number of players in this drama suddenly increases. First we had a population: when we talked about heights of people there was an average height across the entire population, and a variance reflecting the variation of heights in the entire population, so there is a population mean and a population variance, once and for all. But the moment I work with a group of people as a set of samples, the set of samples can change on me, which means that any time I work with a set of samples I will end up with a sample mean and a sample variance, describing the average value that I see and the spread in the values that I see. And the whole headache now is that the sample mean and the sample variance are not constants: they are a strong function of the members of my sample set, and they can change on me. If I compute them from centre to centre, or from day to day, in terms of heights of people, they will change. So where the population mean and the population variance were constants, parameters once and for all, because they reflect the whole population and cannot change, the sample mean and the sample variance are not constants; they are derived from individual measurements which are not fixed and which will change if you try the experiments again. Anything derived from measurements that change will in turn change, which means the sample mean and the sample variance can change. It turns out, curiously enough, that there are different ways to combine the measurements I have to come up with a measure of, for example, an average height. Look around at the people in the hall where you are and ask: in how many different ways can you come up with an estimate of the average height using the sampled heights you have access to? I can think of many. The most intuitive thing we would all do to find an average height is simply to take the arithmetic mean of all the measurements in front of us: take all the measurements, average them, and claim that is the average height. But are there other ways to do it? Yes. For example, in school you were taught the median and the mode. What is the median? If you take all your heights and sort them, the median is the middle-most height, so it is a reflection of an average, an average in a sorted list. What is the mode? The most frequent height you would have been likely to see. Because we expect the distribution of heights to be symmetric around the average, the most frequently seen height was probably also the same as the arithmetic mean; in other words, the mode likely coincides with the arithmetic mean. But there are other funny
ways in which you can define an average height, so let me just put this out there. Look around you, find the shortest person in your room, find the tallest person in your room, take the average of the shortest and the tallest, and call that a measure of the average height. Now intuitively you will all dislike doing that. Why? Because you feel that if one additional, even shorter, person walks into your room, your estimate of the average height will change a lot on you. Whatever average you come up with is a strong function of the measurements you have, and if you happen to see an extreme measurement, either a very short person or a very tall person, any calculation which depends only on the shortest and the tallest is going to be full of error. So instead you can do something else: if it is dangerous to look only at the shortest and the tallest, you can do the exact opposite. You can throw out of your data set the shortest and the tallest, and then average the rest. That way, in case you have got an extremely short person, or for that matter an extremely tall person, who you are afraid might bias your calculation, you are throwing them out of the picture. The point of it is: if I give you a set of measurements, there are different ways to estimate the parameter of interest, the parameter in this case being the population mean. You want to measure the population mean, and you can do it using the mean, the median, the mode, by averaging the min value and the max value, or by leaving out the min and the max and averaging everything else: just different ways to estimate the same population parameter. And if you think about it, if you have got a symmetric distribution of heights in your room, it does not matter which of these approaches you use; you will always, more or less, end up at the same average. It matters to you only if you have got some extremely short person while the rest are all of reasonably average height, or an extremely tall person while the rest are of average height; in other words, it matters to you only when there is a skew.

Remember also that a statistic, because it has been derived from individual measurements, is itself a random variable. Our individual measurements are not fixed; they change, we allow them to change, they all come from a random experiment, they are random variables, and anything calculated from a collection of random variables must itself be random. So a statistic is itself a random variable, and if it is a random variable we are back to square one: what is its distribution, what is its average, what is its deviation? For any random variable we go back to asking: what is its distribution, its mean, its variance? At this point we can start with an interactive question for you. What I have here is a question: in a simple study of gravity, where you measure the force as you change m, can you list the random variables involved in the study, and why do you think they are random variables? We will shift to accessing one of the centres now; this is SVP Engineering College, Vishakhapatnam. "What is the functional difference between the binomial distribution and the Gaussian distribution? Can you give a practical example?" So, the binomial distribution, first of all, relates to a variable which takes on discrete values; it takes on fixed integral values, which is why we keep talking about this coin toss experiment. What happens if you toss a coin 100 times? You cannot take on fractional values with a coin toss, so you talk of the values 0, 1, all the way up to 100 when you toss a coin 100 times. So any phenomenon where you have got a discrete set of numbers, which typically involves counting
some events in a process. If you are counting, for example, the number of accidents happening on the roadside, that has got to be a discrete number; if you are counting the number of people accessing an ATM, that has got to be a discrete number. So such phenomena typically need to be modelled using a discrete variable, which is a possibility with the binomial distribution. The binomial also has an upper cap on the number of events which can happen: for example, with the coin toss you cannot get more than 100 heads, because you toss the coin only 100 times. So where you have a discrete event happening, you are interested in the count of how many events occur, and there is an upper bound on that count, that is a good candidate for a binomial random variable. The Gaussian, on the other hand, is a continuous distribution, where we are talking about a continuous variable which takes on any real-numbered range of values; when we talked about the distribution of heights, we were looking at between 0 and infinity as the range of values for the heights. So the binomial cannot be used to describe heights, because it is restricted to integral values; we need a continuous distribution, and a bell-shaped distribution intuitively makes sense for heights, because we expect the distribution of heights to be symmetrical about an average height. If you collect enough measurements and set up a histogram, you will quickly realize that it is more or less a symmetric bell-shaped distribution, and therefore a good candidate to be modelled with a Gaussian, or normal, distribution.

So I have this question: if you are carrying out a study where you are trying to look at how force changes as a function of the mass applied, where the proportionality term according to your theory is g, what do you think are the random variables involved in the study? This is Sirna Engineering College. Nehru: "According to me, small m has to be a random variable, because we are changing m, so we are performing an experiment on f and m." There are three terms in that formula: f, m, and g. So are you therefore saying that there is only one random variable? "No, force is also a random variable, sir, because it is a function of a random variable. Force depends on the acceleration also, but acceleration is constant, so if the mass varies the force will also vary; so force is a function of one random variable, that is, mass." Okay, so what you are identifying is basically the independent variable and the dependent variable in an experiment where you are trying to manipulate one thing, observe something else, and from all of this calculate yet another quantity. You are trying to manipulate m, observe f, and from there compute a value for g. You are right in saying that m and f are, for you, variables which you will control, manipulate, and observe in experiments, but the correct answer is that all three of them are random variables, and the reason is, if I go back to this business about a population parameter: this g that you are about to see in your experiment, the g that you are about to calculate, and which you will calculate as f over m, is unfortunately not going to be a constant. Since there are several of you in this room, if each of you were to individually do this experiment, I am sure you would all get slightly different answers. I go back to what I said a little earlier: I would be very, very concerned and very, very worried if all of you got absolutely the same answer when you repeated an experiment again and again, because it is actually highly improbable to see the same answer. Therefore, while you might be focusing your attention on manipulating m and then observing f, the fact of the matter is that the value of g that you are reporting
back to me is not necessarily a true unshakable fixed constant that g also happens to vary and in that sense the g also now is a random variable because we are working not with the true population value of g but with an estimate of it so we have an estimate of g which therefore means that we have a random variable okay because remember what we said any statistic itself coming from a small sample of experiments of measurements must in turn also be a random variable so thank you Klerna so f and m were both variable of course the experiment was designed with manipulating m and observing f but then g which is this missing term or the proportionality term in there it should have been a population parameter but what we are measuring is not g directly what we are measuring is an estimate of g which I am calling g hat and I am making again this important distinction between a population value and a sampled value a population value can be claimed to be a parameter which should not change but its estimate is a variable and that will change now this actually matters to us because almost every time you do an undergraduate lab experiment and you are fitting a straight line you have got to be asking the question were you working with constants were you being asked to estimate some fundamental constant in your straight line fit okay and do you really think you have an estimate of a population parameter so very fundamentally it is this when you do a regression experiment or straight line experiment you are trying to fit y and x in other words come up with some model which says y is alpha times x plus beta so this is alpha and the beta are model parameters if you are looking at f equals mg then alpha is the equivalent of g and beta is 0 now this is true who does this experiment and whichever centre does this experiment it cannot change so alpha and beta cannot change inherently there should be one true value for acceleration due to gravity at the surface so the problem arises 
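To make that point concrete, here is a minimal sketch of the classroom experiment — the numbers (a true g of 9.81, a 2 kg mass, a Gaussian noise level of 0.05 N on the force reading) are all invented for illustration, not from the lecture. Each student's g hat = f over m comes out slightly different, which is exactly why g hat is a random variable:

```python
import random

# Hypothetical illustration; the "true" population parameter g, which the
# experimenter never sees directly, and all numeric values are assumptions.
TRUE_G = 9.81      # m/s^2
MASS = 2.0         # kg, the manipulated variable m

random.seed(0)

def measure_force(m):
    """One noisy observation of f = m*g; the 0.05 N noise level is invented."""
    return m * TRUE_G + random.gauss(0.0, 0.05)

# Each student computes g_hat = f / m from a single measurement.
estimates = [measure_force(MASS) / MASS for _ in range(5)]
print(estimates)   # five slightly different g_hat values, none exactly 9.81
```

The five printed estimates all hover around the true value but disagree with each other, just as the students in the room would.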
Because we are working with a few attempts at an experiment, where we were trying to get the values of alpha and beta we end up getting measurements a and b as substitutes, or estimates, of alpha and beta: a is an estimate of alpha, b is an estimate of beta. You have got to remember that every time you are doing a straight-line fit. We are quick to take whatever few data points we have, fit a line, claim that we have found a slope and an intercept, and then try to claim that these are the true underlying parameter values — but the fact of the matter is that what we have seen has error in it, and in that sense the a and the b are random variables. So, just to summarize again: you are looking to evaluate a model f equals m g, but when you carry out the experiment, what you actually end up with is g hat instead of g, which is your estimate of g. Now, the distribution of human height, we have said, is a Gaussian with parameters mu and sigma square. So it is reasonable to ask: for the entire set of humans in the whole population, if the average height of a human being is mu, how do I come up with an estimate of mu? We have already talked about an experiment where you look at the people in your room, collect the heights, and come up with an estimate of mu, and similarly you will want to come up with an estimate of the variation in heights — you will estimate the true variation for the entire population by looking at what is going on in your room, with your collection of heights. You want the population value mu, but the fact of the matter is that what you have is the mean of a sample, not of the whole population. So let me now give you a kind of formal definition, side by side, and try to connect all this terminology I have introduced so far. This mu is the population average height of the entire human population. We have already talked about an expectation: when we look at an individual, who is x i, the expected value of x i is mu, so any time I look at an individual, that individual's height tells me something about mu, and on average, if I look at all individuals across the entire population, I will be able to back-calculate mu. In fact that is what I am writing as a summation on the right-hand side: if capital N is the size of my entire population and I measure the heights of everybody in that population, then the arithmetic mean of heights in the whole population is mu. That is what we should have been trying to do. Of course, we cannot work with the entire population, so instead of capital N we end up with a small sample, lower-case n. We cannot work with the whole population; we work with a smaller group, and the moment we work with a smaller group we have a sample mean, and because I have not sampled the whole population, the sample mean itself can change, depending on which n people you decide to choose in your calculation of the sample mean. So mu cannot change; x bar can change. Similarly, if I want to talk about the variation of heights in the whole population, sigma square cannot change, because it is, after all, talking about the entire population once and for all; but any time I work with a collection of measurements I will compute, in its place, an estimate of the variance, which I am calling s square here. If you look at the two formulae side by side, I have on the left a formula for sigma square, which is the variance associated with the entire population of heights; you can notice that I am looking at the deviation of an individual's height x i from mu, the average height of a person in the entire population. That is on the left-hand side. On the right-hand side, first of all, I do not have capital N, I have lower-case n, because I have a subset of people whose heights I am working with; and similarly I no longer have mu, because I do not know the average height of the whole population.
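The side-by-side formulas being described on the slide can be written out in a few lines. One caveat: the lecture does not spell out the divisor in s square, so the conventional choice of n minus 1 (Bessel's correction) is an assumption here, as are the invented heights:

```python
# Population quantities (capital N, true mu known) versus sample estimates
# (lower-case n, x_bar used in place of mu). The n-1 divisor in s^2 is the
# usual convention; the lecture leaves the divisor unstated.

population = [160.0, 172.5, 168.0, 181.0, 175.5, 169.0]   # invented heights, cm
N = len(population)
mu = sum(population) / N                                   # mu = (1/N) * sum of x_i
sigma_sq = sum((x - mu) ** 2 for x in population) / N      # sigma^2 uses mu and N

sample = population[:3]                                    # a small sample, n < N
n = len(sample)
x_bar = sum(sample) / n                                    # x_bar estimates mu
s_sq = sum((x - x_bar) ** 2 for x in sample) / (n - 1)     # s^2 uses x_bar, not mu

print(mu, sigma_sq)    # fixed population parameters
print(x_bar, s_sq)     # estimates: these change with the sample chosen
```

Pick a different three people for `sample` and x bar and s square change, while mu and sigma square do not — which is the whole distinction being drawn.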
So the best I can do is come up with a deviation from the average I see in my current sample, in my collected set of measurements: my current set of measurements gives me an average x bar, so I end up with x i minus x bar. What that forces you to recognize is that while you are trying to study a phenomenon which involves population parameters mu and sigma square, you have ended up with x bar and s square, and these, unfortunately, are random variables. Just to recap, because we are getting to some fairly important concepts now: whenever we carry out some analysis of a model, that model will have parameters which are important to us. If you are talking about a distribution of heights, the parameters which are important to us are mu and sigma square; if my model is about a study of gravity, the parameter of interest will be g. So depending on the phenomenon you are studying, you have a parameter that you wish to learn about. What you have access to, because you can run a set of experiments, is a collection of measurements, which we will call x i. This is a subset of measurements — you do not have the ability to do the experiment infinite times; you have a small number of measurements — and your headache is: how do you use this small number of measurements to comment on the model parameters that you really want to study? Now, each measurement, if I look at it in isolation, does give me a feel for what is going on with the population mean. If I look at an individual in a room, that individual, after all, reflects the human race, and therefore I am learning, in some sense, about what the true average height of an individual ought to be in the entire population. Now it is worth asking whether every member of that population reflects the average in some way or the other — that actually is a slightly tricky concept, so let me spend a minute more on it. What that equation basically suggests is that measurements may vary: x i can change from measurement to measurement, but we are all individuals coming out of a particular population, the population of humans, so our heights, whether they are the same or not, are ultimately giving us some information about the average human height. Looking at any human being should give you an estimate of what an average human height should be — as opposed to looking at, let us say, a cow or a dog. So with x i we are looking at members of a particular population, and each member of the population gives you a feel for what the average height might be. But that is not what we do by intuition: we combine our measurements. Instead of looking at individuals one at a time, we combine our measurements and then try to get an estimate of that average human height, and that seems to be a slightly better thing to do than to look at our samples, our individuals, one at a time. So you have got to be asking the question now: why should we look at an average of a number of attempts at an experiment? Why should I try to get an x bar if I want to learn about mu? We want to learn about mu, and x bar is an estimate of mu — but why can I not learn about mu just by looking at an individual x i, and why do I instead have to go to the arithmetic mean of a sample to estimate mu? I can summarize this as a set of equations and then with a plot. If I am looking at an individual measurement, think of what an individual measurement tells you. If I am again talking about heights, at best what an individual will tell me is that that individual comes from the human race, which in turn tells me that the expected value of height for that individual is mu; and we have said that the human race has a variance sigma square, so the variation I can expect in measurements of individual heights is sigma square. This is what happens if I look at one measurement at a time. Fundamentally, if I look at one individual, I learn about the average height of a human being — that is an estimate, all right — but then if I forget about the first individual and go look at another individual, that gives me a different estimate of the average human height, and the problem is that my two different estimates of the average human height do not collectively help me improve my understanding of the average. Hopefully that was clear: looking at one measurement gives you some idea of what is going on, and if you look at another measurement separately, without using the knowledge you already have, then you will continue to have practically the same amount of information about the average human height; it is not improving your understanding of what the average is. The only way you will improve your understanding is by working with an arithmetic mean — collecting all the information you have and then averaging it. Because it turns out that if I look at x bar — remember, x i is a random variable, a measurement which changes on you, and because x i is random, x bar also turns out to be random, since it depends on which measurements I have in front of me at a time — so x bar is a random variable, and it is important to ask what average value x bar itself might take. What does the expected value of x bar mean? It means that if I go to each centre, each centre can calculate an average height; I am no longer looking at individuals, I am looking now only at averages per centre. For each centre I have an average human height, and I can ask each centre to report back its average human height, and then I will average these averages. We are now going one level further: we are not looking at individual measurements and averaging them — that we do inside one centre — but asking what happens if I come up with an estimate of an average per centre. That is my x bar: each centre gives me an x bar, and if each centre reports back an x bar, I have a
collection of these x bar measurements, and the question will then come up: what is the average of all these x bars? That is the expectation of x bar — because each centre is not expected to give me the same value for x bar, I will have to ask what the average of x bar is, and that will turn out to be mu anyway. So now that leaves us in a position of either looking at individual measurements, because individual measurements tell us about mu, or looking at average measurements, because average measurements are also telling us about mu. The difference comes about in what is going on with the variance of the average. Where the variance of an individual measurement is sigma square, it turns out the variance of an average depends on how many samples I had in my average. Intuitively, again, you will agree with me: if I sample more and more and work with an arithmetic mean, you will feel that you are converging onto the true underlying value of the population height — that is, in fact, something most individuals intuitively have an impression of. So let me try to show you this on a chart. What does this dependence on n that x bar has mean? If I come up with one measurement, what is my estimate of the average human height? It follows a distribution centred around the true average height, which is mu — mu is at the centre of my curve — and there is a large variance sigma square, which is why this curve is relatively fat around the centre line. But as I go from one measurement to averaging my measurements — in other words, as I start doing replicates and then start averaging my replicates — what should happen? The mean does not change: my expected value of the mean does not change; whether I am looking at the expectation of x i or the expectation of x bar, I continue to learn about mu. But the fact is, if I look at measurements one at a time, the variance of x i is just sigma square, whereas the variance of x bar has this beautiful property of depending on the number of samples you have — that is the n in the denominator of the variance of x bar — which means that as I increase the number of measurements, the variance of x bar should drop, since n is in the denominator. So as n goes to 2, this curve changes: basically sigma square becomes sigma square over 2, which means the variance of my distribution has decreased, so I have got a thinner distribution. The distribution is still centred around mu, and as I increase the number of samples my curves get thinner and thinner. What is the interpretation of the curves getting thinner and thinner? What you learn is that your estimate x bar is converging to mu, and that is particularly important to us, because we want to end up in a situation where we can say our measurements have given us an estimate of mu within some error bar. We wish to say: I think I have learned about mu plus or minus, let us say, 0.1 — or, if we are talking about heights, plus or minus 1 inch; I think I have found the average human height plus or minus 1 inch. So we want an error bar, and we want as much of the area under the curve as possible within this error bar. Why? Because the area under the curve is a measure of the probability of having found the true underlying value. We therefore want to make these distribution curves as thin as possible, because the moment we make them thin, we end up saying that most of the measurements we are seeing are clustered around mu, and the arithmetic mean x bar that we are seeing is very close to the true underlying value of mu. And what does it take to get a thin distribution? You have to increase n — keep increasing n — which basically tells you: if you want to learn about something, keep sampling, keep testing that particular measurement again and again and again, average it, and in doing so you will gain accuracy in your estimate of the underlying parameter value.
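The sigma square over n behaviour just described can be checked with a small simulation. The Gaussian parameters here (mu of 170 cm, sigma of 10 cm) and the number of simulated centres are invented for illustration: many "centres" each report an x bar, and the spread of those x bars shrinks as each centre averages more people.

```python
import random
from statistics import mean, pvariance

random.seed(1)
MU, SIGMA = 170.0, 10.0   # assumed population height parameters, cm

def sample_mean(n):
    """One x_bar computed from n independent height measurements."""
    return mean(random.gauss(MU, SIGMA) for _ in range(n))

# Each of 20000 simulated centres reports an x_bar; the variance of those
# x_bars is an empirical Var(x_bar), which should track SIGMA**2 / n.
for n in (1, 4, 16):
    xbars = [sample_mean(n) for _ in range(20000)]
    print(n, pvariance(xbars))   # close to 100, 25, 6.25 respectively
```

The distribution of x bar stays centred at mu while its variance drops by the factor n — the "curves getting thinner" on the chart.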
So this is, of course, something we are very guilty of in our experimentation: we tend not to do replicates. Here is one very simple reason why you must do as many replicates as you can: the more replicates you work with, and the more you average them, the more accuracy you gain in the estimate of that parameter. Of course, that depends on how expensive the experiment is, how time-consuming it is, and therefore whether you are in a position to do replicates or not. But it turns out there is another reason why we must sample. The previous point was that we sample, and keep sampling, because we gain more and more accurate estimates of whatever it is we are trying to find — that is one reason to repeat an experiment. But there is a second reason, and while I will give you the formal definition, the graphical interpretation I will show you next is easier to understand. The formal definition is this: take any random variable; whatever that random variable's distribution is, the moment you start working with an arithmetic mean of measurements, something remarkable happens. For example, let us start with a random variable which is binomial. The binomial, as I showed you long back, is not necessarily a symmetric distribution: depending on whether the proportion of heads was 0.1 or 0.9, you would end up getting more heads or more tails; it was not symmetric around 50. So a binomial need not be a symmetric distribution, and it is also a discrete distribution. But let us say that I repeat my coin-toss experiment many, many times — each experiment is, say, 100 tosses — and I then average all the outcomes I see across all these experiments. That averaged outcome, of course, continues to be a random variable, because it depends on what we see in the individual experiments, but that averaged outcome actually starts following a normal distribution. That is what is called the central limit theorem in statistics. It is one of the more important theorems, and fundamentally it tells you that if you work with arithmetic means, you will probably end up with a normal distribution, regardless of what distribution the individual measurement was following. The individual measurement may follow its own distribution — it could be any distribution with finite variance, it does not matter — but the arithmetic mean, as long as you are sampling enough times and repeating your measurements enough times, will start following a normal. Here is a very simple illustration of that. What I am doing here is rolling a die — a single die. What values do you expect to see when you roll a die? You expect to see the values 1, 2, 3, 4, 5, 6 — in fact, you expect to see each of them one-sixth of the time, since these are rolls of a fair die, so I am showing you a probability of 0.167 associated with each of the values 1 through 6. What I have here are the values seen on rolling a die, and on the y-axis I have the probability of each particular outcome; it starts from 1 and goes all the way to 6, and each outcome is seen with probability 1 over 6. That is with one die. Now let me ask you what will happen if, instead of rolling one die, we roll a pair of dice, and of course we are interested in the total of what we see across the two dice — you see the value on each die and add them up. What should happen? First of all, the range of values we can see increases: from 1 to 6 it goes to 2 to 12. But would you now agree, intuitively, before I show you the plot, that if I am looking at a pair of dice it is much more probable that I will get a total of 7 than a total of 2? There are many different ways I can get a sum of 7, but there is only one way I can get a value of 2 when I roll a pair of dice, and that is by getting a 1 on each die. If you look at this distribution as a plot, you get a curve which tells you that 7 is the most probable outcome and that the other outcomes are less probable, decreasing as you go to the extreme outcomes, the extremes being 2 and 12. This is a triangular distribution: we went from a flat distribution with one die to a triangular distribution when we roll a pair of dice. So what happens if I go to three dice? The range of outcomes now goes from 3 to 18 — with three dice the minimum value I can see is a 3 and the maximum is an 18 — and if I roll three dice, I get this. Notice how the shape of the distribution is changing as I increase the number of dice: it went from flat, to triangular, to now something reasonably symmetric — not quite bell-shaped yet, but as I increase the number of dice it is heading towards a bell-shaped distribution. And what do we call that distribution? The Gaussian distribution. So what this simple example about rolling dice tells us is that while an individual attempt at an experiment may follow its own distribution (that is the blue dots here) — which you may or may not know; do not worry about it — try to do the experiment as many times as you can and then average your measurements, because the averaged value of your measurements is more likely than not following a Gaussian distribution, regardless of what the original distribution was. That is another really serious reason why you should make every effort in your experiments to repeat your samples and then average them into an arithmetic mean. So where does that basically leave us? We wanted, for instance, to find out the average human height mu; we have said that we have a few measurements, and those will give us a sample mean x bar.
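The dice progression described above — flat, then triangular, then nearly bell-shaped — can be reproduced in a few lines; this is a sketch, with the number of simulated rolls chosen arbitrarily:

```python
import random
from collections import Counter

random.seed(2)

def sum_distribution(num_dice, rolls=100000):
    """Empirical distribution of the total shown by num_dice fair dice."""
    totals = Counter(sum(random.randint(1, 6) for _ in range(num_dice))
                     for _ in range(rolls))
    return {total: count / rolls for total, count in sorted(totals.items())}

# One die: flat at about 1/6 each; two dice: triangular, peaking at 7;
# three dice: range 3..18, already heading towards a bell shape.
for k in (1, 2, 3):
    print(k, sum_distribution(k))
```

Increasing `num_dice` further makes the shape ever closer to a Gaussian — the central limit theorem in miniature.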
The sample mean x bar is a random variable, and that immediately raises a concern: do we really think we have mu, or do we potentially have some error because of sampling? That fundamentally means we are talking about x bar being equal to mu plus or minus something, which I am simply reporting as some deviation delta from mu. What we really want is for delta to be very small, because a small delta implies that our sampling has given us a very good estimate of mu. So, going ahead, what we are required to do is figure out what it takes to push delta down to as small a value as possible, because we want the best possible estimate of mu we can come up with. What is delta? What does delta depend on? It inherently depends on sigma square — on how much variation you see in the measurements in the first place; it possibly depends on the least count of any instrument you are using to make the measurements; but, as you have just seen, it also depends on the number of samples, because we had that sigma square over n for the variance, and as we increase the number of samples, sigma square over n decreases, which basically means x bar gets closer and closer to mu within a particular tolerance. Therefore, in addition to x bar being the preferred point estimate of mu, what we want is to work out an interval estimate for any parameter in a model, where we ask the question: do we think we have found an estimate of the true model parameter within a particular interval? It is okay if you do not get the precise value once and for all, but can you claim to be there within a particular interval? That interval is a strong function of the physics of your process; it depends on your domain, on how much error you are willing to tolerate in the claim of a parameter — for example, if you are talking about acceleration due to gravity, are we interested in the third decimal place or the second decimal place, and so on.
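As a preview of where this is heading, here is a rough sketch of an interval estimate x bar plus or minus delta built from a sample. The data are invented, and the multiplier of 2 (roughly a normal-theory error bar) is an assumption; choosing that multiplier properly is exactly what a fuller treatment of interval estimates works out:

```python
from statistics import mean, stdev
from math import sqrt

heights = [168.0, 172.5, 165.0, 177.0, 170.5, 174.0]  # invented sample, cm
n = len(heights)

x_bar = mean(heights)        # point estimate of mu
s = stdev(heights)           # sample standard deviation (n-1 divisor)
delta = 2 * s / sqrt(n)      # rough error bar; the factor 2 is an assumption

print(f"mu is estimated as {x_bar:.1f} +/- {delta:.1f} cm")
```

Note that delta has s over the square root of n inside it, so doubling the amount of data you collect shrinks the reported interval — the same sigma square over n effect seen earlier.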