Let us continue with the proof of the central limit theorem. In the last class, we had a variable x with population mean $\mu$ and population standard deviation $\sigma$. Through a random experiment, we obtained n random realizations of this variable and constructed the mean of this sample: y was the mean of the n values. Then we standardized this mean through a quantity z, which has mean 0 and unit variance. This z was, of course, expressed as a sum, scaled by $1/\sqrt{n}$, of the individual standardized deviations from the mean, called $z_i$.
We then went about constructing the distribution of z from the distribution $f(z)$ of the individual standardized variables. Under the assumption that the measurements are IID, that is, independent and identically distributed, we constructed the characteristic function of z via Fourier transforms and expressed it in terms of the characteristic function of the original distribution. We arrived at the result $\hat{P}_n(k) = [\hat{f}(k/\sqrt{n})]^n$, where the subscript n denotes the number of elements in the sample. (If we are using capital N, we can subscript it with capital N.) Interpreted verbally, this relationship simply means that the characteristic function of the distribution of z is the nth power of the characteristic function of the individual distribution, evaluated at the argument $k/\sqrt{n}$.
Let us investigate this relationship in the limit of n going to infinity. As n goes to infinity, $k/\sqrt{n}$ tends to 0, so formally we get $\hat{f}(0)$ raised to the power infinity, and it is not easy to visualize this limit. We have to construct the limit carefully, and we proceed as follows. Our next step is to examine the behaviour of the right-hand side as n tends to infinity, for which let us construct a Taylor expansion of the characteristic function $\hat{f}$.
The Taylor expansion of a function of any variable u is $\hat{f}(u) = \hat{f}(0) + u\,\hat{f}'(0) + \frac{u^2}{2}\hat{f}''(0) + O(u^3)$. In the same spirit we can write $\hat{f}(k/\sqrt{n}) = \hat{f}(0) + \frac{k}{\sqrt{n}}\hat{f}'(0) + \frac{k^2}{2n}\hat{f}''(0)$ plus higher-order terms. Let us evaluate each of these coefficients.
By definition, $\hat{f}(k) = \int f(z)\,e^{ikz}\,dz$, integrated over the space on which the function is defined. If you put k equal to 0, $e^0$ is 1, so $\hat{f}(0) = \int f(z)\,dz$, and since we are using normalized individual distributions, it is 1.
Let us see what $\hat{f}'$ at k equal to 0 is. By definition it is the derivative with respect to k evaluated at k equal to 0, that is, $\hat{f}'(0) = \frac{d}{dk}\int e^{ikz} f(z)\,dz$ evaluated at k = 0. It easily turns out to be $i\int z\,f(z)\,dz$, which by the definition of the average is $i\,\bar{z}$. But z here is one of those $z_i$ variables, a standardized variable with mean 0 and unit variance. Therefore $\bar{z}$ is 0, and hence $\hat{f}'(0) = i\,\bar{z} = 0$.
Similarly, if we now take the second derivative at 0, a factor $i^2 z^2$ comes down, so $\hat{f}''(0) = i^2\,\overline{z^2} = -\overline{z^2}$. We know that $\overline{z^2}$ can always be written as $\mathrm{Var}(z) + \bar{z}^2$, because the definition of the variance is $\overline{z^2} - \bar{z}^2$. Now $\bar{z}$ is 0 and the variance is unity for the distribution f, so $\hat{f}''(0) = -1$. We will keep the expansion up to this second-order term.
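The three coefficients just derived can be checked numerically for one concrete standardized distribution. The following is a minimal sketch, assuming as an illustration (this choice is not from the lecture) a uniform density on the interval from minus root 3 to root 3, which has mean 0 and unit variance; the function names are hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)


def char_fn_uniform(k):
    """Characteristic function of the uniform density on [-sqrt(3), sqrt(3)].

    Assumed example distribution: it has mean 0 and variance 1, so it can
    play the role of the standardized individual distribution f(z).
    """
    if k == 0.0:
        return 1.0
    return math.sin(SQRT3 * k) / (SQRT3 * k)


def derivatives_at_zero(phi, h=1e-3):
    """Central finite-difference estimates of phi'(0) and phi''(0)."""
    first = (phi(h) - phi(-h)) / (2.0 * h)
    second = (phi(h) - 2.0 * phi(0.0) + phi(-h)) / (h * h)
    return first, second


if __name__ == "__main__":
    first, second = derivatives_at_zero(char_fn_uniform)
    # Expect a value of 1 at the origin, a first derivative of 0,
    # and a second derivative close to -1.
    print(char_fn_uniform(0.0), first, second)
```

Any other density with zero mean and unit variance should give the same three numbers, which is exactly why only these two moments enter the argument.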
So we need only these two derivatives, and the Taylor expansion takes the form $\hat{f}(k/\sqrt{n}) = 1 - \frac{k^2}{2n} + O(k^3 n^{-3/2})$. The first term, the value itself, is 1; the second term is 0; the third term carries the factor minus 1 and is $-k^2/(2n)$; and then there is a higher-order term, which for any fixed k is of the order $n^{-3/2}$, because it involves the cube of $k/\sqrt{n}$.
We therefore have an approximation up to the second-order term for the characteristic function of the variable z. As we have shown, just to repeat, $\hat{P}_n(k) = [\hat{f}(k/\sqrt{n})]^n$, and this turns out to be $\left[1 - \frac{k^2}{2n}\{1 + O(k\,n^{-1/2})\}\right]^n$, if one wishes to retain the correction. Now, for a minute, let us ignore the term of order $k\,n^{-1/2}$ as compared to 1. As n tends to infinity, $k\,n^{-1/2}$ tends to 0, and hence the term in the curly bracket is just unity. So in the next step we can write that, as n tends to infinity, $\hat{P}_n(k)$ behaves as $\left(1 - \frac{k^2}{2n}\right)^n$.
We easily see that this has the structure of the standard definition of the exponential function. To recapitulate, $(1 + x/n)^n$ tends to $e^x$ as n tends to infinity. Using this property with $x = -k^2/2$, we see that the quantity of interest, $\hat{P}_n(k)$, tends to $e^{-k^2/2}$ as n tends to infinity.
So we have now achieved a kind of universality in the Fourier transform of the distribution function: universality with respect to n. That is, for sufficiently large n, the Fourier transform in the k variable is independent of n.
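The approach to the Gaussian limit can be watched numerically. This is a minimal sketch, again assuming for illustration a uniform density with zero mean and unit variance as the individual distribution; the names are hypothetical. It compares the exact nth power, the truncated form, and the limiting Gaussian at one value of k.

```python
import math

SQRT3 = math.sqrt(3.0)


def char_fn_uniform(k):
    """Characteristic function of the uniform density on [-sqrt(3), sqrt(3)]
    (assumed example with mean 0 and variance 1)."""
    if k == 0.0:
        return 1.0
    return math.sin(SQRT3 * k) / (SQRT3 * k)


def char_fn_of_z(k, n):
    """Exact relation: nth power of f_hat evaluated at k / sqrt(n)."""
    return char_fn_uniform(k / math.sqrt(n)) ** n


def truncated_form(k, n):
    """Second-order truncation (1 - k^2 / (2 n))^n."""
    return (1.0 - k * k / (2.0 * n)) ** n


def gaussian_limit(k):
    """Limiting characteristic function exp(-k^2 / 2)."""
    return math.exp(-0.5 * k * k)


if __name__ == "__main__":
    k = 1.5
    for n in (10, 100, 10000):
        print(n, char_fn_of_z(k, n), truncated_form(k, n), gaussian_limit(k))
```

Both the exact power and the truncated form settle onto the same value as n grows, which is the statement that the neglected higher-order terms do not affect the limit.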
Now, of course, there was always an n built into our definition of the standardized variate z, because z was standardized to have unit variance: the standard deviation of the mean, $\sigma/\sqrt{n}$, was used for dividing the deviation from the mean.
From this, the next step is the actual distribution function of the variate z, $P_n(z)$. It is the Fourier inverse of $\hat{P}_n(k)$, which by definition is $P_n(z) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat{P}_n(k)\,e^{-ikz}\,dk$. We know that the Fourier inverse of a Gaussian is also a Gaussian. In our particular case the Gaussian is $e^{-k^2/2}$, and its Fourier inverse is proportional to $e^{-z^2/2}$, so the result is $\frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$, where the factor $1/\sqrt{2\pi}$ is what normalizes the Gaussian when you transform from k space to z space. So we arrive at a universal distribution for the probability density of the z variable, $P(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}$. This is, in a way, a compact proof of the central limit theorem.
We can make it more explicit by going back to our y variable. To recapitulate, y was the sample mean, $y = \frac{1}{n}\sum_i x_i$, where we had n values $x_i$, and z was the suitably scaled deviation, $z = \frac{y - \mu}{\sigma/\sqrt{n}}$. So we can go backwards from z to y and obtain the distribution of the sample mean, which turns out to be $P_n(y) = \frac{1}{\sqrt{2\pi}\,SE}\exp\left[-\frac{(y - \mu)^2}{2\,SE^2}\right]$, where SE is the standard error, defined as $\sigma/\sqrt{n}$.
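The result for the sample mean can be illustrated by simulation. The following is a minimal sketch, assuming as an example an exponential population with rate 1 (so that mu and sigma are both 1); the sample size, number of repetitions, seed, and function names are arbitrary choices for illustration.

```python
import math
import random


def standardized_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from an exponential population and
    return the standardized sample means z = (y - mu) / (sigma / sqrt(n))."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0           # exponential with rate 1: mean 1, std dev 1
    se = sigma / math.sqrt(n)      # standard error of the mean
    zs = []
    for _ in range(trials):
        y = sum(rng.expovariate(1.0) for _ in range(n)) / n
        zs.append((y - mu) / se)
    return zs


def summarize(zs):
    """Return the mean, the variance, and the fraction of values with |z| < 1."""
    m = sum(zs) / len(zs)
    var = sum((z - m) ** 2 for z in zs) / len(zs)
    inside = sum(1 for z in zs if abs(z) < 1.0) / len(zs)
    return m, var, inside


if __name__ == "__main__":
    mean_z, var_z, frac = summarize(standardized_sample_means(n=50, trials=10000))
    # A standard Gaussian has mean 0, variance 1, and
    # P(|z| < 1) = erf(1 / sqrt(2)), roughly 0.683.
    print(mean_z, var_z, frac, math.erf(1.0 / math.sqrt(2.0)))
```

The exponential population is strongly skewed, yet the standardized means should already look close to a standard Gaussian at a moderate sample size.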
Let us try to interpret this result. Suppose I have a sample mean y evaluated by a random experiment on a population whose original distribution has two moments. Two moments, because, as we saw, we needed the second derivative of the characteristic function, for which we needed the second moment, basically the variance; it did not matter whether the higher moments existed or not. Then we used the asymptotic property of the function leading to the exponential. So, for sufficiently large n, the difference of that average from the mean value is Gaussian distributed, regardless of the original distribution from which it came, with the variance given by $SE^2 = \sigma^2/n$.
Represented graphically, it means the following. The theoretical mean is $\mu$, and y is a sample mean: I have taken a sample of n measurements, or n experiments. If I had repeated this set of n experiments several times, the means I would have got each time, $y_1, y_2, y_3$, etc., would have been distributed in a Gaussian manner around the true population mean, with standard deviation $\sigma/\sqrt{n}$. This says that if I increase n, say n = 1000 as opposed to n = 100, then the distribution becomes narrower and narrower, almost coinciding with the mean value, regardless of the original distribution. That is where the power lies. As we mentioned, most of the distributions that we discussed in our study of probability distributions had a mean as well as a standard deviation. Hence we expect that whether we are sampling from a binomial distribution, a Poisson distribution, a gamma distribution, an exponential distribution, or a uniform distribution on the interval between 0 and 1, the means of the samples would show a Gaussian distribution around the population mean, with standard deviation $\sigma/\sqrt{n}$.
If, of course, the population has no dispersity at all, that is, it is totally monodisperse, then even one measurement should give you the mean, provided you do not have measurement errors, because $\sigma$ is 0. There cannot be any other value coming out of it; that is what the result says. So long as your population is dispersed, the mean measured in a random experiment is going to deviate from the population mean, but the central limit theorem assures you that by increasing the sample size you will eventually approach the population mean indefinitely closely.
The impression we get, that the central limit theorem is universal, is not fully correct, at least mathematically. There are some exceptions to the central limit theorem. Of course, these exceptions come from the exceptional nature of the distributions, which are somewhat mathematical, somewhat idealized. One such distribution is the so-called Cauchy distribution. Suppose x is a quantity distributed according to the Cauchy distribution function. Mathematically it has the form $f(x) = \frac{b}{\pi}\,\frac{1}{b^2 + x^2}$. It is also called the Lorentzian at times in the physics literature, because this function describes the shape of spectral lines. Theoretically a line should have just a single photon frequency, but due to certain quantum mechanical effects it develops a broadening, and that broadening is described via a Lorentzian. So there is some physical basis for the choice of the Cauchy distribution.
There could be other distributions which also have a certain pathological character like this (I will tell you what the pathological character is) and which could be exceptions to the central limit theorem, also arising as solutions of some mathematical models. For example, in the problem of particle evolution, under certain scenarios of particle interaction dynamics, one can get a size distribution which is similar to the Cauchy distribution. So let us discuss the aspect which distinguishes it from other distributions.
First, of course, we know that $f(x)$ is normalized: its integral is 1. It is very easy to show this, because $\int \frac{dx}{b^2 + x^2} = \frac{1}{b}\tan^{-1}\frac{x}{b}$, and the $1/b$ cancels the b in the numerator. The inverse tangent at the end points is $\pi/2$ and $-\pi/2$, so the difference is $\pi$, and multiplied by $1/\pi$ it gives 1.
But, interestingly, consider its mean. By symmetry arguments you can say it is 0, so we will not dwell on that case. As drawn now, $f(x)$ is centred at x = 0. But if the same function were shifted to peak at some value $\mu$, so that the denominator contains $(x - \mu)^2$, the symmetry about the origin no longer helps, and using the same formula for the mean one arrives at a divergent integral. The reason it is divergent is that, when you multiply by x, the integral from minus infinity to infinity takes the form $\int \frac{x\,dx}{b^2 + x^2}$. If you look at the behaviour for large x, the integrand has the form $dx/x$. When we integrate a function like $dx/x$ we get $\ln x$, and when you put in the limits, at least at plus infinity it tends to infinity. So the mean strictly does not exist, and the value 0 is assigned only by the symmetry argument.
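The normalization and the logarithmic divergence can both be made concrete with the closed-form antiderivatives. This is a minimal sketch, with hypothetical function names and a simple midpoint rule added only as a cross-check; the cutoff L is an illustrative truncation of the infinite range.

```python
import math


def cauchy_pdf(x, b):
    """Cauchy (Lorentzian) density with width parameter b, centred at 0."""
    return (b / math.pi) / (b * b + x * x)


def cauchy_mass(L, b):
    """Integral of the density over [-L, L]: (2 / pi) * atan(L / b)."""
    return (2.0 / math.pi) * math.atan(L / b)


def half_line_first_moment(L, b):
    """Integral of x f(x) over [0, L]: (b / (2 pi)) * ln(1 + L^2 / b^2)."""
    return (b / (2.0 * math.pi)) * math.log1p((L / b) ** 2)


def midpoint_integral(g, a, c, steps):
    """Plain midpoint rule, used here only to cross-check the closed forms."""
    h = (c - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


if __name__ == "__main__":
    b = 1.0
    for L in (1e1, 1e3, 1e6, 1e9):
        # The mass approaches 1, while the one-sided first moment keeps
        # growing like ln(L) and never settles.
        print(L, cauchy_mass(L, b), half_line_first_moment(L, b))
```

Each factor of 1000 in the cutoff adds the same fixed amount to the one-sided first moment, which is the signature of a logarithmic divergence.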
Similarly, its second moment, $\int \frac{x^2\,dx}{b^2 + x^2}$, also goes to infinity, even more strongly, because as x goes to infinity the $x^2$ in the numerator cancels the $x^2$ in the denominator, and what remains is just an integral of dx, which is unbounded. Hence for the Cauchy distribution we do not have a second moment; that is, $\sigma^2$ does not exist. In fact, strictly, $\mu$ does not exist either.
Since these functions do not have a standard deviation, people often describe the spread by the so-called full width at half maximum. The concept of full width at half maximum (FWHM) is used in place of the standard deviation; it is simply the width at which the value becomes half the maximum. That is the reason why spectral lines are often discussed via this width. Here b is half the width corresponding to half the peak value, so the FWHM is 2b.
Now, suppose one samples from such a distribution. Would it satisfy the central limit theorem? Let us see what actually happens if we follow through the mathematical steps that we followed before. First we have to find the Fourier transform, or the characteristic function, of this distribution, which by definition is $\hat{f}(k) = \frac{b}{\pi}\int_{-\infty}^{\infty} \frac{e^{ikx}}{b^2 + x^2}\,dx$. It is a somewhat daunting exercise to evaluate this integral. Because of the symmetry around 0, only the cosine part contributes, since $e^{ikx} = \cos kx + i\sin kx$, but still it is usually evaluated through contour integration methods. We will defer that detailed exercise to some future occasion. For now, please take it from me that when you carry out the contour integration, you obtain the Fourier transform of this function as $\hat{f}(k) = e^{-b|k|}$. It is very important to note that it is a function of $|k|$. Let me argue physically why it is so.
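Since the contour integration is deferred, the quoted result can at least be checked numerically. The following is a rough sketch rather than a derivation: it estimates the average of cos(kx) over Cauchy-distributed draws by Monte Carlo and compares it with exp(-b|k|). The sampling method (inverting the cumulative distribution), the number of draws, the seed, and the function names are assumptions made for illustration. The half-maximum property of b is checked alongside.

```python
import math
import random


def cauchy_pdf(x, b):
    """Cauchy (Lorentzian) density with width parameter b, centred at 0."""
    return (b / math.pi) / (b * b + x * x)


def empirical_char_fn(k, b, draws=200000, seed=0):
    """Monte Carlo estimate of E[cos(k x)] for Cauchy-distributed x.

    The sine part averages to zero by symmetry, so this estimates f_hat(k).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # Inverse-CDF sampling: x = b * tan(pi * (u - 1/2)), u uniform on [0, 1)
        x = b * math.tan(math.pi * (rng.random() - 0.5))
        total += math.cos(k * x)
    return total / draws


if __name__ == "__main__":
    b = 1.5
    for k in (-1.0, 0.5, 2.0):
        print(k, empirical_char_fn(k, b), math.exp(-b * abs(k)))
    # At x = b the density is half its peak value, so the FWHM is 2 b.
    print(cauchy_pdf(b, b) / cauchy_pdf(0.0, b))
```

The estimate for k and for minus k should agree within sampling noise, consistent with the transform depending only on the modulus of k.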
First of all, at k equal to 0 the integral is 1, and you can see that if you put k equal to 0, $|0|$ is 0 and $e^0$ is 1, so that is satisfied. What we cannot easily visualize is why it should be $|k|$, but we can see that it should be an even function of k, because of the symmetry. We might have guessed that it is a function of $k^2$, but it does not turn out to be so. Instead of $k^2$, the lowest-order function that is even with respect to k turns out to be $|k|$. Hence the Fourier transform of the Lorentzian, or Cauchy, distribution is associated with the function $|k|$, and $|k|$ itself has the following structure. The function is continuous at k equal to 0 but not differentiable there: if you approach from the right you get a slope of plus 1, and if you approach from the left you get a slope of minus 1. So there is a discontinuity in the derivative, and hence it is not a differentiable function. This basically means that the method we used in deriving the central limit theorem, a Taylor expansion around k equal to 0, which presupposes the existence of the derivatives, is not valid. We cannot expand it in a Taylor series.
However, we do not need to do that. We can evaluate the Fourier transform of the distribution of z exactly, because we had proved that it is $[\hat{f}(k/\sqrt{n})]^n$, and up to that point we had made no assumption about moments. So we can now substitute the result we obtained, and it is $\left[e^{-b|k|/\sqrt{n}}\right]^n$, because the argument is $k/\sqrt{n}$. This leads to $e^{-b\sqrt{n}\,|k|}$; since $\sqrt{n}$ is positive, it does not matter whether I put it inside or outside the modulus. Now the Fourier inverse of this is very easy. We do not need any approximation, because this is again of the form $e^{-b|k|}$, whose inverse we know.
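Both points made here, the kink at k = 0 and the exact nth power, are easy to check numerically. This is a minimal sketch with hypothetical function names. For the characteristic function exp(-b|k|) itself, the one-sided slopes at the origin are plus b from the left and minus b from the right, the mirror image of the slopes of |k| because of the minus sign in the exponent.

```python
import math


def cauchy_char_fn(k, b):
    """Characteristic function of the Cauchy distribution: exp(-b |k|)."""
    return math.exp(-b * abs(k))


def one_sided_slopes(phi, h=1e-6):
    """Finite-difference slopes of phi at 0 from the left and from the right."""
    right = (phi(h) - phi(0.0)) / h
    left = (phi(0.0) - phi(-h)) / h
    return left, right


def char_fn_of_z(k, n, b):
    """Exact relation: nth power of f_hat evaluated at k / sqrt(n)."""
    return cauchy_char_fn(k / math.sqrt(n), b) ** n


if __name__ == "__main__":
    b = 2.0
    # The two slopes differ, so no Taylor expansion about k = 0 exists.
    print(one_sided_slopes(lambda k: cauchy_char_fn(k, b)))
    # The nth power is again of Cauchy form, with b replaced by b * sqrt(n).
    n, k = 25, 0.3
    print(char_fn_of_z(k, n, b), math.exp(-b * math.sqrt(n) * abs(k)))
```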
So I can write it as $e^{-b\sqrt{n}\,|k|}$. It is therefore again a Cauchy distribution, with the width parameter $b\sqrt{n}$ instead of b, and hence $P_n(z)$, which is the Fourier inverse, is $P_n(z) = \frac{b\sqrt{n}}{\pi}\,\frac{1}{b^2 n + z^2}$.
To conclude, what we see is that if you start sampling from a Cauchy distribution, it does not lead to a Gaussian distribution, but leads to a Cauchy distribution again, with the width parameter now becoming $b\sqrt{n}$ instead of b, so that the full width at half maximum grows by the same factor $\sqrt{n}$. The width becomes broader and broader. This exception calls for a certain caution whenever we speak of the CLT. It should not be missed that not every conceivable probability distribution obeys the central limit theorem with respect to the mean.
In the subsequent talks we will see how this tendency, of sampling from an arbitrary distribution leading to a universal Gaussian distribution with a well-defined mean, is connected to stochastic processes. Thank you.
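The broadening can be seen in a simulation. The following is a minimal sketch under stated assumptions: Cauchy draws are generated by inverting the cumulative distribution, and the spread is measured by the interquartile range, which for a Cauchy distribution equals its full width at half maximum (twice the width parameter). The function names, sample sizes, and seed are illustrative. The sample mean y, obtained by dividing z by root n, is included as an added illustration; by the same formula it keeps the original width b.

```python
import math
import random


def cauchy_draw(rng, b):
    """One Cauchy variate with width parameter b, by inverting the CDF."""
    return b * math.tan(math.pi * (rng.random() - 0.5))


def interquartile_range(values):
    """Distance between the upper and lower quartiles of the data."""
    s = sorted(values)
    return s[(3 * len(s)) // 4] - s[len(s) // 4]


def spread_of_cauchy_samples(n, b=1.0, trials=4000, seed=0):
    """Return the interquartile ranges of z = sum / sqrt(n) and y = sum / n."""
    rng = random.Random(seed)
    totals = [sum(cauchy_draw(rng, b) for _ in range(n)) for _ in range(trials)]
    iqr_z = interquartile_range([t / math.sqrt(n) for t in totals])
    iqr_y = interquartile_range([t / n for t in totals])
    return iqr_z, iqr_y


if __name__ == "__main__":
    for n in (1, 25, 100):
        iqr_z, iqr_y = spread_of_cauchy_samples(n)
        # Expected: iqr_z near 2 * b * sqrt(n); iqr_y stays near 2 * b.
        print(n, iqr_z, iqr_y)
```

Unlike the Gaussian case, taking more measurements does not sharpen the sample mean here; its spread stays at the width of a single draw.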