Now that we have covered most of the standard distributions and written them compactly as an exponential family, which gives a very compact PMF or PDF representation, let us start looking into the data itself: how to infer the parameters associated with these distributions. To get at these underlying parameters, the data has to satisfy certain properties, or the way the data is generated has to be done in a particular way; that is what we are going to study now under sampling. When I say sampling, it is about the data points. Suppose there is a black box generating data and you do not know the underlying distribution according to which it generates. You ask it for one data point and it gives you one; that is like sampling, you got one sample. You can ask for more samples, and that whole process is called sampling. Now, a little bit of motivation: you can roughly think of sampling as a data collection mechanism. For the time being forget about the underlying distribution; data is collected in various forms for various purposes, and nowadays especially, when there is a large population and you want to understand something about it, you cannot go and talk to everybody. You may only talk to a small fraction, and that process of going to people and collecting information can itself be thought of as a sampling process. One way to look at all this: let us say x is a random variable denoting the score obtained by a particular student in this IE 605 course. All of you are realizations of that random variable; I can see the score obtained by each of you, and each of those is a realization of x.
If there are about 60 students in this class and I get the scores of all 60, I can think of those as 60 samples of this random variable x. Similarly, as a country we are so many people. Look at one particular issue, say the health status of us Indians; suppose health status is some value denoted by x. Then all of us are realizations of that: assume we as Indians have some inherent properties governing a medical condition, and each of us is a manifestation of it, which we could measure by going to each individual and asking. Extending this argument, take the classic example of polls; we see state-level polls, national-level polls everywhere. All Indians have some tendency, some way of forming opinions about which party to support, and during opinion polls the pollsters cannot go and ask everybody; that massive exercise is what the Election Commission does in the actual election. Pollsters will only talk to certain people, which is a sampling process: instead of asking everybody, they ask a certain fraction of the people, hoping that fraction is a representative sample of the whole population. Here x is a random variable denoting which party you are going to vote for, and all of us are realizations of it. You can apply the same argument elsewhere: suppose there is a new car in the market, and each of us has some inclination to buy it or not.
The company launching that car wants to know how well this brand new car will be accepted in the market. They cannot ask everybody; they will only ask a certain fraction of the population, that is, collect certain samples. The point is that we can imagine all of us as samples when we look at the big population, and each of us is a realization. We cannot ask everyone; we will only be sampling a certain fraction. The question that now comes up is how to do better sampling. Take the voting issue itself: when the Election Commission conducts voting, assume essentially the whole population votes. But in an opinion poll you do not have the resources to ask everybody; you will only ask a certain fraction. How to choose that fraction is the question, so that whatever the fraction tells you actually represents the true outcome. With this motivation, the question is how to do better sampling, and on that front we refer to one important term called random sampling, so that it results in an unbiased result. What is unbiased? We use the words biased and unbiased very frequently; when something is not in our favor we say that person is biased, he is not thinking the way I want. But we will make the mathematical meaning precise. We say that the random variables x1, x2, up to xn are called random samples from a population f(x) if they are IID with common distribution f(x). Notice that I am using the word population for f(x); f(x) is actually a PDF, but now I am calling it a population.
So simply, random samples of size n from population f(x) means they are IID samples drawn from distribution f(x); here when I say distribution f(x), it could be a probability mass function or a probability density function depending on whether you are dealing with the discrete or the continuous case. Now, if you have obtained n samples x1, x2, ..., xn, and they are indeed a random sample from population f(x), then their joint distribution can be written as the product of their marginal PDFs: f(x1, x2, ..., xn) = f(x1) f(x2) ... f(xn). This is just the definition we have already seen; I am using independence here, and 'identically distributed' enters because they all follow the same distribution, the common distribution, which henceforth I am also calling the population. Now, when doing sampling, I have to use the words 'with' and 'without replacement'. In sampling with replacement, after you draw a sample, the outcome you observed is put back before the next sample is drawn randomly, so each draw comes from a fresh experiment: once you put it back, all the possible outcomes are available again when you repeat the experiment. That is why each draw looks like it comes from a fresh experiment, and sampling with replacement gives IID samples. In contrast, in sampling without replacement, the sample is not put back, so your possible outcomes get restricted when you do the experiment again. One more point: suppose you have samples x1, x2, up to xn, and from them you get y1, y2, ..., yn using some relation on them (say yi = g(xi)); this is your new sample, and the xi are coming from some underlying distribution.
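The factorization above can be checked numerically. As a minimal sketch (my own construction, not from the lecture), take IID samples from an Exponential(lam) population and verify that the joint density is the product of the marginals:

```python
import math

# Marginal PDF of an Exponential(lam) population: f(x) = lam * exp(-lam * x)
def exp_pdf(x, lam):
    return lam * math.exp(-lam * x)

# Joint density of IID samples: f(x1, ..., xn) = f(x1) * f(x2) * ... * f(xn)
def joint_pdf_iid(xs, lam):
    p = 1.0
    for x in xs:
        p *= exp_pdf(x, lam)
    return p

xs = [0.5, 1.2, 0.3]   # hypothetical realized samples
lam = 2.0
# for this population the product collapses to lam**n * exp(-lam * sum(xs))
print(joint_pdf_iid(xs, lam))
```

For the exponential case the product has a closed form, which is exactly the kind of simplification that makes IID samples convenient when we later do inference.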
When you apply yi = g(xi), this induces another distribution on y, and all these new samples are coming from that induced distribution; now you have to say these samples come from that population. So the population is always whichever PDF or PMF you are looking at. Now, sampling with replacement can give identically distributed samples, but they need not be independent. Let us quickly look at an example. Say the experiment is the throw of a die, which gives 6 possible outcomes, and x records the outcome. Say I repeat this experiment 10 times. The first time, x1, suppose I got the value 5; what is the probability of getting 5? It is 1/6, because all six faces are there. I put that 5 back and used the same die to get the second value; say I got the value 4; this probability is also 1/6. I keep using the same die again and again. Now instead suppose that after x1 = 5 I remove that face, so my die now consists of only five faces, with the face 5 gone. Now when I got x2 = 4, what is the probability of getting 4? It is 1/5. That is the difference. With replacement, the PMF is always uniform, 1/6 for every face; without replacement, the probability need not be 1/6. If you remove whichever face you observed and do not consider it again, then the probabilities need not be 1/6 all the time.
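The die example can be sketched in a few lines of code. This is my own illustration, not from the lecture: with replacement every draw sees all six faces, so each face keeps probability 1/6; without replacement the observed face is removed, so the next draw is from a smaller pool.

```python
import random

def draw_with_replacement(n, rng):
    faces = [1, 2, 3, 4, 5, 6]
    # every draw is from the full set of faces -> P(any face) = 1/6 each time
    return [rng.choice(faces) for _ in range(n)]

def draw_without_replacement(n, rng):
    pool = [1, 2, 3, 4, 5, 6]
    out = []
    for _ in range(n):
        x = rng.choice(pool)
        out.append(x)
        pool.remove(x)  # face is not put back; next draw has fewer outcomes
    return out

rng = random.Random(0)
print(draw_with_replacement(10, rng))    # length 10, repeats allowed
print(draw_without_replacement(6, rng))  # a permutation of 1..6
```

Note that without replacement you can draw at most six values, and the distribution of each draw depends on what came before, which is exactly why the samples stop being identically distributed.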
So that is what we said: sampling with replacement results in identically distributed samples all the time, but if you do sampling without replacement, the samples are no longer identically distributed. One thing you need to notice is that even sampling with replacement is not necessarily always IID: it may happen that the samples are not independent, because if you do not perform the experiments in an independent fashion there can be dependency. So whenever you are sampling, be careful about whether it is with or without replacement; and going forward we will always assume sampling with replacement, so that we always get IID samples. Is the distinction between with and without replacement clear? Let us read the statement again: 'sampling with replacement can give identical samples.' This statement is a little ambiguous, so let me say what I wanted to convey. Sampling with replacement is definitely going to give identically distributed samples, because the distribution remains the same. But whether they are independent or not depends on how you construct your experiment. If you ensure independence, then they are not only identically distributed but also independent samples. I was just trying to highlight that even if you sample with replacement, the underlying distribution can remain identical, yet the samples may not remain independent if you bring in some dependency.
Even with replacement, if you do not do the experiments in an independent fashion, the IID property may not hold. So henceforth, whenever I say I am doing random sampling, I mean it is with replacement and also done independently, so that I get IID samples. Now, you have random samples; the next question is what you are going to do with them. With these samples you are trying to extract some information; you want some summary, and whenever we do a summary we are looking at some function T of the samples. Say I apply a function T to the samples and get a value y = T(x1, ..., xn); a quantity obtained by processing the data sample like this is called a statistic. Notice that you are summarizing x1 to xn into one random variable y through the function T; that y is the statistic, and the distribution of the resulting y is called the sampling distribution of y. What are the possible operations? You can pick any operation you want; if you have n samples, T can be any function. But we will focus on some often-used statistics. One is simply the average of all the samples, x_bar = (x1 + x2 + ... + xn)/n, called the sample mean. Another: first compute the sample mean, subtract it from each sample, square the differences, add them up, and divide by n - 1: s^2 = sum over i of (xi - x_bar)^2 / (n - 1). This we call the sample variance, denoted s^2, and the square root of the sample variance we call the sample standard deviation. I am highlighting these because they are the most common statistics we will use: sample mean, sample variance, and sample standard deviation.
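These three statistics can be written down directly. A minimal sketch in plain Python (the data values are hypothetical, chosen only for illustration):

```python
import math

def sample_mean(xs):
    # x_bar = (x1 + ... + xn) / n
    return sum(xs) / len(xs)

def sample_variance(xs):
    # s^2 = sum_i (xi - x_bar)^2 / (n - 1); note the n - 1 divisor
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    # s = sqrt(s^2)
    return math.sqrt(sample_variance(xs))

scores = [5, 4, 6, 2, 3]        # hypothetical realized samples
print(sample_mean(scores))      # 4.0
print(sample_variance(scores))  # 2.5
print(sample_std(scores))       # approx 1.5811
```

Each function maps the n samples to a single number, which is exactly what a statistic T does.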
The realized values of the sample mean, sample variance, and sample standard deviation we denote by the small letters x_bar, s^2, and s. Notice that so far I am representing the statistics as random variables, not particular realizations. Is the sample mean x_bar a random quantity? Yes, because it is obtained by averaging random variables. So x_bar can take a particular realized value, denoted by the small x_bar; s^2 is also a random variable whose realization we denote similarly; and s can also take a particular realized value denoted s. Now let us look at some properties of the sample mean and sample variance. Let x1, x2, ..., xn be random samples from a population with mean mu and variance sigma^2. The expected value of the sample mean is always the underlying mean of that population: E[x_bar] = mu. Why is that? x_bar is the average (x1 + ... + xn)/n; take the expectation, and by linearity of expectation E[x_bar] = (1/n) times the sum of E[xi]; each E[xi] is mu, and adding mu n times and dividing by n gives simply mu. Similarly you can compute the variance of the sample mean; it is actually easier to compute by viewing variance as a covariance, since, as we already discussed, Var(x) is nothing but Cov(x, x). If you expand that and simplify, using the fact that these are IID samples, you will see that Var(x_bar) = sigma^2 / n. Now let us look at the next one, the sample variance.
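The two facts E[x_bar] = mu and Var(x_bar) = sigma^2 / n can be checked numerically. Here is a sketch of my own (not from the lecture): repeat the whole sampling experiment many times with a fair die as the population, for which mu = 3.5 and sigma^2 = 35/12, and look at the distribution of x_bar across repetitions.

```python
import random
import statistics

rng = random.Random(42)
n, reps = 10, 20000   # sample size and number of repeated experiments

# each repetition draws n IID die throws and records the sample mean
means = [statistics.mean(rng.randint(1, 6) for _ in range(n))
         for _ in range(reps)]

print(statistics.mean(means))       # close to mu = 3.5
print(statistics.pvariance(means))  # close to sigma^2 / n = 35/120, about 0.292
```

The spread of x_bar shrinks with n, which is the sigma^2 / n factor at work: averaging more samples gives a tighter estimate of mu.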
If you take the sample variance and compute its expected value, it needs some manipulation, because x_bar here is itself the average of the xi, x_bar = (sum of xi)/n. If you plug that in appropriately and work through the definition, you will see that E[s^2] is exactly sigma^2. Notice that while for the sample mean I divided by n, for the sample variance I did not divide by n even though I added n terms; I divided by n - 1, and with that n - 1 the expected value comes out to be sigma^2. Had I divided by n instead, this would not have been sigma^2; an extra factor would have appeared. Now the definition: whenever the expected value of the sample mean is mu, the underlying mean, the statistic x_bar is called an unbiased estimator of mu. Notice that I am now calling x_bar an estimator, and whenever E[x_bar] = mu we call it an unbiased estimator. Similarly, the expectation of the sample variance is sigma^2, and in that case the statistic s^2 is called an unbiased estimator of the variance. So the sample mean and sample variance are estimators for the mean and variance, and further, they are unbiased. Let us stop here.
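The role of the n - 1 divisor can also be seen numerically. This is my own sketch, not from the lecture: average many realizations of the sum of squared deviations divided by n versus by n - 1, against a standard normal population with known sigma^2 = 1. The n - 1 version averages to sigma^2, while the n version comes out low by the factor (n - 1)/n.

```python
import random

rng = random.Random(7)
n, reps = 5, 50000
div_n, div_n_minus_1 = 0.0, 0.0

for _ in range(reps):
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # IID N(0, 1) samples
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)         # sum of squared deviations
    div_n += ss / n                               # biased version
    div_n_minus_1 += ss / (n - 1)                 # sample variance s^2

print(div_n / reps)           # close to sigma^2 * (n-1)/n = 0.8
print(div_n_minus_1 / reps)   # close to sigma^2 = 1.0
```

The intuition for the deficit is that the deviations are measured from x_bar rather than from the true mean mu, and x_bar is fitted to the same data, so the squared deviations are systematically a little too small; dividing by n - 1 compensates exactly.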