So, students, as we saw in the last lectures, different phenomena exhibit different distributions. We represent them through random experiments constructed in a defined manner to arrive at the underlying distributions. In this way we arrived at the binomial distribution, the Poisson distribution, the Bernoulli distribution, the interval (exponential) distribution, etcetera. Some of them were discrete and some were continuous. Apart from these, there are several distributions obeyed by natural systems which we have not mentioned in the previous lectures. For example, there is a distribution called the log-normal distribution. The log-normal distribution is relevant when the values of the random variable spread over several decades of scale. For example, the size of dust particles suspended in the atmosphere can vary from about 1 nanometre to almost a millimetre, a range of about 10^6. The kinds of distributions we have discussed, such as the gamma distribution or the Gaussian distribution, are often useful for describing fluctuations within a few standard deviations of the mean, say a factor of 3 or so, but they decay very rapidly as the scale increases. A distribution such as the log-normal, on the other hand, can describe the occurrence of very large values with significant probability; that is why it is said to have a long tail.

The log-normal distribution is obtained by using the logarithm of the variable in the Gaussian distribution. If x is the variable, the log-normal density has the structure

\[ f(x) \;=\; \frac{1}{x\,\ln\sigma_g\,\sqrt{2\pi}}\;\exp\!\left[-\,\frac{\big(\ln(x/x_g)\big)^2}{2\,(\ln\sigma_g)^2}\right], \]

where x_g is the geometric mean and σ_g the so-called geometric standard deviation; note that wherever the ordinary standard deviation σ appears in the Gaussian, it is replaced here by ln σ_g. This is just one example, and as you can see it retains significant probability at large x, because its tail falls off only like x^(−ln x) up to constants, far more slowly than a Gaussian. If you compare it with the kinds of distributions we have discussed so far, the others taper off very quickly. There are in nature several distributions which fall off even more slowly than the log-normal.
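To make the long-tail idea concrete, here is a minimal numerical sketch (not from the lecture; the parameter choices are arbitrary) comparing how often a log-normal and a Gaussian of the same mean and standard deviation produce values far above the mean:

```python
# A minimal sketch comparing tail probabilities of a log-normal and a
# Gaussian with matched mean and standard deviation (assumed parameters).
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Log-normal: ln(x) is Gaussian.  mean=1, sigma=1 means the geometric
# mean is e^1 and the geometric standard deviation is e^1 (arbitrary).
lognormal = rng.lognormal(mean=1.0, sigma=1.0, size=n)

# Gaussian with the same mean and standard deviation, for a fair comparison.
gaussian = rng.normal(lognormal.mean(), lognormal.std(), size=n)

# Fraction of samples exceeding 10 times the mean: the log-normal's
# long tail makes such large values vastly more probable.
threshold = 10 * lognormal.mean()
print("P(X > 10*mean), log-normal:", (lognormal > threshold).mean())
print("P(X > 10*mean), Gaussian  :", (gaussian > threshold).mean())
```

For the Gaussian, essentially no sample out of a million exceeds ten times the mean, while the log-normal produces such values with probability of the order of 10^(-3).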
An important aspect of interest in understanding underlying distributions is the estimation of their parameters. For example, in the problem that we discussed in connection with the Poisson distribution, the expected number of particles in a certain given volume is a parameter of interest. So the mean value, what we call μ, is one important parameter of a distribution. The other quantity of interest is the standard deviation. Whether the distribution is narrow or has a long tail, we assume that it at least has a mean, a representative value. In the absence of detailed information about the distribution, knowledge of the mean is useful on many occasions. A challenge of statistics is how to estimate this mean value, and this is accomplished by methods of sampling. There are many problems in sampling. It should be unbiased, meaning there should be no sampling bias: one should not preferentially select a few values from the data a priori; the values must occur with the probabilities governed by the distribution in question. The second important thing is that sampling is always limited in number: one has the limitation of not being able to perform experiments over the entire population in question, but can only take a limited number of samples, say n samples, from the underlying population.

So then the question arises: how close will the mean generated from this finite sample be to the mean of the universal population to which I want to attribute it? There are several cases where this theory has crucial value, especially when the parameters are of particular significance and sampling involves costs: one cannot use very large samples, because the cost would rise significantly. So in this process of optimization, finding out how close we are to the true mean assumes significant importance. In studies connected with diseases, for example epidemiology, it is often required to provide an estimate of the risk of a certain agent, let us say the risk of some pollutant present in the atmosphere, on the health of populations. This risk could be small at the individual level, but taken over the whole society it has important implications. But how does one estimate this risk? How many people have to be sampled? How long does one have to pursue these sampling experiments? All of this implies a lot of cost. Hence we wish to determine theoretically how close one is to the intended mean of the population and how many samples one would require to arrive at this closeness. That is the theory we will discuss today, and it brings us to what is called the central limit theorem.

The central limit theorem is a statement that gives us a certain universal behaviour of the deviation of a sample mean from the universal mean, regardless of the underlying distribution from which the sample mean has come. As I mentioned, it is of very deep value in many statistical programmes. Let us formally state what the central limit theorem means, then explain it, and then try to arrive at a proof. First we must have the concept of identically and independently distributed random variables. As the words suggest, the random variables one is selecting should be distributed identically by an underlying distribution: all of them basically come from the same population, they are random, and the selection of one variable is not influenced by the selection of another. So "identically" refers to the distribution from which they arise, and "independently" relates to the sampling method, in which there is no mutual influence; it is about unbiased sampling.

For example, going back to the example of the Poisson distribution, I had taken a sample of volume V from a reservoir of water containing some bacteria. The question there was: what is the expected number of bacteria in this volume V? This comes from an underlying distribution with some average concentration ρ, so the expected number is μ = ρV. When one does an actual experiment, we assume that the occurrences of the bacteria happen independently of each other, not influenced by one another, and that the probability distribution for the fluctuations around this mean is satisfied identically by every particle which enters this volume. This is a very important understanding that we must have about the meaning of identically and independently distributed random variables.
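As a small numerical sketch of this picture (the concentration ρ and volume V below are assumed values, not from the lecture), we can draw i.i.d. Poisson counts and watch the sample mean approach the true mean μ = ρV as the number of samples grows:

```python
# A minimal sketch of i.i.d. sampling from a Poisson population:
# counts in a volume V are Poisson with mean mu = rho * V.
import numpy as np

rng = np.random.default_rng(1)

rho = 2.0       # assumed average concentration (bacteria per unit volume)
V = 1.5         # assumed sampled volume
mu = rho * V    # true population mean of the count

# Each call draws n independent, identically distributed counts.
for n in (10, 100, 10_000):
    counts = rng.poisson(mu, size=n)
    print(f"n = {n:6d}: sample mean = {counts.mean():.3f}  (true mu = {mu})")
```

With only 10 samples the estimate can be noticeably off; with 10,000 it sits very close to μ. How close, and with what distribution of error, is exactly what the central limit theorem quantifies.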
The next concept we must have is that the distribution in question must possess moments at least up to order 2, if not more; that is, it must have a mean and a standard deviation. One might wonder why such a restriction has to be made: once there is a distribution, would there not always be a mean and a standard deviation? But that is not really true. There are distributions which are normalized, whose area under the curve is one, but whose moments may not exist. For example, take a distribution function of the form f(x) = 1/(1 + x²). If you plot it, its peak is at x = 0, and for small values of x it deceptively looks like a Gaussian. However, as you go to larger and larger values, the function tends to 1/x² as x tends to infinity; that is, there is a significant probability of occurrence of very large values. If you integrate it from minus infinity to infinity, ∫ dx/(1 + x²) = tan⁻¹x evaluated between −∞ and ∞, which gives π/2 − (−π/2) = π. So it is an integrable function, and I can therefore redefine f(x) with 1/π as the prefactor, f(x) = 1/(π(1 + x²)), so that the area under the curve becomes unity.

However, consider the mean, ∫ x f(x) dx. The integrand is not absolutely integrable, but it is odd, so by symmetry considerations we can take the mean to be 0. If you then try to establish the standard deviation by integrating the second moment, ∫ x² f(x) dx, this has the form ∫ x² dx/(π(1 + x²)), and as x goes to infinity the integrand tends to the constant 1/π. The integral therefore grows like x, which means it diverges as the upper limit is taken to infinity. And such distributions are not purely hypothetical; there are natural phenomena where such very slowly falling distributions occur. The distribution we just wrote down actually describes the intensity profile of a spectral line in spectroscopy, the intensity of photons as a function of frequency for a specified line width; it is called the Lorentzian (in statistics, the Cauchy distribution).

There are many such distributions, called power-law distributions, obeyed by extreme values, where the function f(x) goes as 1/x^α for x much greater than, let us say, 1. If it goes as 1/x^α it may still be integrable provided α is greater than 1, but its moments may not exist: if α lies between 1 and 2, the mean itself diverges, and the second moment diverges whenever α is less than or equal to 3. Hence, in order that we may apply the central limit theorem to an entire family of distributions, the restriction is made that the distribution must have at least a mean and a second moment, or variance.
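The practical consequence of a missing mean is easy to see numerically. The following is a minimal sketch (not from the lecture): the running sample mean of Cauchy/Lorentzian draws never settles down, while that of a Gaussian does.

```python
# A minimal sketch of why finite moments matter: the sample mean of
# Cauchy (Lorentzian) draws keeps wandering, while the Gaussian's converges.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

cauchy = rng.standard_cauchy(n)
gaussian = rng.standard_normal(n)

for m in (100, 10_000, 100_000):
    print(f"first {m:6d} samples: "
          f"Cauchy mean = {cauchy[:m].mean():9.3f}, "
          f"Gaussian mean = {gaussian[:m].mean():9.3f}")
```

No matter how many Cauchy samples are taken, a single enormous value can drag the running mean anywhere, so there is no mean for it to converge to; the central limit theorem simply does not apply.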
With these broad assumptions, let us now state the central limit theorem; for short we write it as CLT. Historically, a great deal of work has been done on the central limit theorem since the 18th century. The formulation we are going to state is essentially the Lindeberg–Lévy formulation; Lévy was a very well known mathematician who did a lot of work on distributions with long tails. The statement goes like this: for n independently and identically distributed (i.i.d.) random variables (r.v.'s) drawn from a universal or underlying distribution having finite values of at least the first two moments, the distribution of the sample mean approaches a normal distribution, also called a Gaussian distribution, with mean μ of the underlying distribution and standard deviation σ/√n, where μ is the universal population mean, σ the population standard deviation, and n the sample size. The normal distribution with mean μ and some standard deviation s is often briefly written N(μ, s); in this case the sample mean is distributed as N(μ, σ/√n). This notation is often used.

Let us understand this statement. Suppose I am sampling from an interval distribution: I want to find the mean time between the arrivals of two vehicles, so I perform a random experiment on a road, noting the arrival time of each vehicle, and from these I want to estimate the mean time between successive arrivals. That is a characteristic, or parameter, of the distribution, and we know that the inter-arrival times of such a process, also called a Poisson process, follow the exponential distribution f(t) = λ e^(−λt), where λ is the reciprocal of the mean time. This could be an actual experiment: I cannot wait for an infinitely long time, so I run the experiment for a certain period and collect data for some number of vehicles, say n vehicles, and from that I obtain a certain mean. If λ is the quantity of interest, the corresponding mean time τ = 1/λ is the population mean of the inter-arrival times. From the experiment, however, I would have obtained a sample average of the various times t_i, namely τ_sample = (1/n) Σ t_i, i = 1, …, n. Now the central limit theorem says that the difference between the true value and the sample estimate eventually satisfies a Gaussian distribution, even though my underlying distribution was an exponential distribution. This is one example.

Similarly, I could have considered the Poisson distribution for the number of particles contained in a droplet. I take a droplet in the first experiment and see how many particles are in it; say the first time I get 0. I take another sample and get 5 particles; the third time I get 3; and so on, up to some limit, say 100 such samples in total, each giving some number. I would then estimate an average out of all these numbers: if each count is x_i, I sum all of them and calculate the average, and estimate that as the mean number x̄ of particles occurring in the specific volume V, which is supposed to represent the underlying contamination of the reservoir from which the droplet was taken. And we know this is actually a Poisson-distributed process. So one can conceive of various types of underlying distributions.
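Here is a minimal sketch of the vehicle-arrival example (the rate λ, sample size n, and number of repeated experiments are assumed values): each experiment averages n exponential inter-arrival times, and across many repetitions the sample means behave like N(μ, σ/√n), just as the theorem asserts.

```python
# A minimal sketch of the CLT for exponential inter-arrival times:
# the sample mean over n arrivals is approximately N(mu, sigma/sqrt(n)),
# where mu = sigma = 1/lambda for the exponential distribution.
import numpy as np

rng = np.random.default_rng(3)

lam = 0.5         # assumed arrival rate; mean time tau = 1/lam = 2
n = 50            # vehicles observed in one experiment
trials = 20_000   # number of repetitions of the whole experiment

# Each row is one experiment; each row mean is one realization of tau_sample.
sample_means = rng.exponential(1 / lam, size=(trials, n)).mean(axis=1)

print("mean of sample means:", sample_means.mean(), "(CLT predicts", 1 / lam, ")")
print("std  of sample means:", sample_means.std(),
      "(CLT predicts sigma/sqrt(n) =", (1 / lam) / np.sqrt(n), ")")
```

A histogram of these sample means would look Gaussian even though each individual inter-arrival time is exponentially distributed; that is precisely the content of the theorem summarized next.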
The central limit theorem thus says that, regardless of the original distribution from which these random variables were sampled, the difference between the sample mean and the population mean eventually goes over to a Gaussian distribution with a specific standard deviation for that sample size, and that standard deviation decreases as n increases. This implies that if you take a very large sample, you can hit the mean with an accuracy as close as you please. This is a theorem of far-reaching significance and importance, and it is very important to understand its basis and proof; that is what we will carry out in the next lecture. Thank you.