Alright, so we have reached the stage where we were discussing the moment generating function of a probability distribution, and I mentioned something about cumulants. Let us carry on with that. The idea is the following. Suppose you have a probability distribution for some random variable; from now on let me use the symbol $X$ for a random variable in general. It could be continuous or discrete. We have looked at discrete, integer-valued random variables, but we are going to extend this to continuous random variables as well. You define a moment generating function

$$M(u) = \langle e^{uX} \rangle = \sum_{k=0}^{\infty} \frac{u^k}{k!}\,\langle X^k \rangle,$$

where the $\langle X^k \rangle$ are the moments of the random variable, discrete or continuous, the averages being taken with respect to the normalized probability distribution. It follows immediately that $M(0) = 1$: put $u = 0$, and all terms vanish except the $k = 0$ contribution, and the expectation of 1 is just 1. That is simply the normalization of the probability.

One can then ask: can $M(u)$ be written as $e^{K(u)}$ for some function $K(u)$? What does that mean? You are taking this quantity, which is in general some power series in $u$, and writing it as the exponential of another function; we will see the advantage of doing so very shortly. Of course, this immediately implies $K(u) = \ln M(u)$, and $K(0)$ must be 0 so that $M(0)$ is unity, as we have seen. In general,

$$K(u) = \sum_{r=1}^{\infty} \kappa_r\,\frac{u^r}{r!},$$

and these $u$-independent constants $\kappa_r$ are the cumulants of the random variable; $\kappa_r$ is the so-called $r$th cumulant.

Now, it is not hard to see that these cumulants have interesting properties. First, $\kappa_1 = \langle X \rangle$, the mean value of $X$. That is trivial to check: expand $\ln M(u)$, pick out the coefficient of $u$ in the power series, and you discover immediately that it is just the first moment. The second cumulant turns out to be $\kappa_2 = \langle (X - \langle X \rangle)^2 \rangle$, the variance of the random variable. As you know, the second moment itself was a very inconvenient quantity to use; you needed to subtract the square of the mean from it, and then you got a quantity with physical significance, the scatter of the variable about the mean. That is $\kappa_2$, the second cumulant. The third cumulant $\kappa_3$ turns out to be $\langle (X - \langle X \rangle)^3 \rangle$, identically the third central moment. If you write $K(u)$ as a power series, exponentiate it, expand each factor, and compare with the moment expansion of $M(u)$, you discover that the leading term in the $r$th cumulant is the $r$th moment, but then lower moments get subtracted after that.
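These moment-to-cumulant relations can be generated mechanically by expanding $\ln M(u)$. Here is a minimal Python sketch (assuming SymPy is available; the symbols `m1`...`m4` are stand-ins for the first four moments, not notation from the lecture) that recovers the low-order relations worked out below:

```python
import sympy as sp

u = sp.symbols('u')
m1, m2, m3, m4 = sp.symbols('m1 m2 m3 m4')   # stand-ins for <X>, <X^2>, <X^3>, <X^4>

# M(u) = sum_k u^k <X^k> / k!, truncated at k = 4
M = 1 + m1*u + m2*u**2/2 + m3*u**3/6 + m4*u**4/24

# K(u) = ln M(u); the r-th cumulant is r! times the coefficient of u^r
K = sp.expand(sp.series(sp.log(M), u, 0, 4).removeO())
for r in (1, 2, 3):
    print(r, sp.expand(sp.factorial(r) * K.coeff(u, r)))
# prints: m1,   m2 - m1**2,   m3 - 3*m1*m2 + 2*m1**3
```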
For instance, $\kappa_2 = \langle X^2 \rangle - \langle X \rangle^2$. For $\kappa_3$, expand

$$(X - \langle X \rangle)^3 = X^3 - 3X^2\langle X \rangle + 3X\langle X \rangle^2 - \langle X \rangle^3;$$

when you take the average, the last two terms give $+3\langle X \rangle^3$ and $-\langle X \rangle^3$, so

$$\kappa_3 = \langle X^3 \rangle - 3\langle X^2 \rangle\langle X \rangle + 2\langle X \rangle^3.$$

The fourth cumulant turns out to be the fourth central moment with a correction, namely

$$\kappa_4 = \langle (X - \langle X \rangle)^4 \rangle - 3\,\langle (X - \langle X \rangle)^2 \rangle^2,$$

three times the variance squared being subtracted; if you expand this out, it is $\langle X^4 \rangle$ plus a set of corrections in which moments lower than the fourth appear. So there is a systematic way of writing the $r$th cumulant in terms of the $r$th moment and lower moments, and vice versa, and the coefficients are standard.

In a sense, what is happening? You are subtracting out the lower moments in appropriate combinations. You have seen this happen in many places: for instance, in the quadrupole moment of a charge distribution you subtract out the lower multipole moments in exactly the same way, so that the whole thing has definite transformation properties under rotations. By doing this subtraction, a very important property emerges, among other things: if you shift the random variable by a constant, the moments change, but the cumulants beyond the first do not. If $X$ is replaced by $Y = X + a$ for some sure (non-random) value $a$, the first cumulant of course changes, because $\langle Y \rangle = \langle X \rangle + a$, but all the higher cumulants remain exactly the same: $\kappa_r(X) = \kappa_r(Y)$ for $r \geq 2$. That is trivial to see: whenever a cumulant is written in terms of central moments, the shift in the variable cancels out. So the cumulants of a random variable are invariant under translations of that variable; shift it by a constant and they do not change at all. That is one very crucial property.
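The translation invariance is easy to check numerically. A minimal sketch (assuming NumPy and SciPy; `scipy.stats.kstat` gives unbiased sample estimates of the first four cumulants):

```python
import numpy as np
from scipy.stats import kstat   # k-statistics: unbiased cumulant estimates for r <= 4

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=200_000)   # any convenient (skewed) distribution
y = x + 7.5                                    # the shifted variable Y = X + a

for r in (1, 2, 3, 4):
    print(r, round(kstat(x, r), 3), round(kstat(y, r), 3))
# r = 1 differs by exactly the shift a = 7.5; r = 2, 3, 4 agree up to sampling noise
```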
Let us write down the cumulants for the various distributions you already know about, for which we have closed-form answers. Just to recall: I defined a generating function $f(z)$, and we found that the moment generating function is $M(u) = f(e^u)$; wherever $z$ appears, I just replace it by $e^u$. The cumulant generating function is $K(u) = \ln M(u)$. Now let us see what this turns out to be for various distributions. For the binomial distribution, with $N$ Bernoulli trials and probability of success $p$ in any given trial, the generating function was of the form $f(z) = (pz + q)^N$, so $M(u) = (p e^u + q)^N$. Remember that the mean was $\mu = Np$, so instead of $N$ let us write $\mu/p$; replacing $z$ by $e^u$ and taking logs, we end up with

$$K(u) = \frac{\mu}{p}\,\ln\!\left(p e^u + q\right).$$

That is the cumulant generating function for the binomial. Remember also that the $r$th cumulant is

$$\kappa_r = \left.\frac{d^r K(u)}{du^r}\right|_{u=0},$$

so once you have this expression, it is a very simple matter to write down all the cumulants of the binomial distribution.

What happens in the case of the Poisson distribution? For the Poisson, $f(z) = e^{\mu(z - 1)}$, so I replace $z$ by $e^u$ and take the log to get

$$K(u) = \mu\,(e^u - 1),$$

and that is it. What can we now say about all the cumulants? All you have to do is differentiate and set $u = 0$. Each differentiation gives back $\mu e^u$, which is $\mu$ at $u = 0$. So for the Poisson, $\kappa_r = \mu$ for all $r \geq 1$: the mean, the variance, and all the higher cumulants are the same single number. This is a very special property of the Poisson distribution, not shared by others in general; there are other distributions which display it, and you can create them, but it is special. So the statement that the variance equals the mean for a Poisson distribution is a special case of the more general statement that all the cumulants are equal to the mean in this case.
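The statement that every Poisson cumulant equals $\mu$ can be verified symbolically. A minimal sketch (assuming SymPy):

```python
import sympy as sp

u, mu = sp.symbols('u mu', positive=True)
K = mu * (sp.exp(u) - 1)     # the Poisson cumulant generating function

for r in range(1, 6):
    print(r, sp.diff(K, u, r).subs(u, 0))   # every cumulant comes out equal to mu
```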
What other distribution did we look at? The geometric distribution. For the geometric distribution with mean $\mu$, the probability distribution of the random variable $n$, which takes the values $0, 1, 2, 3, \ldots$, was

$$P(n) = \frac{1}{1 + \mu}\left(\frac{\mu}{1 + \mu}\right)^{n}.$$

Forming $f(z)$ means summing this geometric series, each term multiplied by $z^n$, from 0 to infinity; the factors of $1 + \mu$ combine and you get

$$f(z) = \frac{1}{1 + \mu - \mu z}.$$

Putting $e^u$ in place of $z$ gives $M(u)$, so

$$K(u) = -\ln\!\left(1 + \mu - \mu e^{u}\right),$$

and that is it; we can write down all the cumulants from this directly. A sum of Poisson random variables is again Poisson, so nothing new happens there. But what happens if you have a difference of two? Take the Skellam distribution, the difference of two Poisson random variables with means $\mu$ and $\nu$. The generating function for the first part is as before; for the second, because of the minus sign, you had $1/z$ in place of $z$ when we generated it, so it contributes with $e^{-u}$ in place of $e^{u}$:

$$K(u) = \mu\,(e^{u} - 1) + \nu\,(e^{-u} - 1).$$

The first cumulant is the first derivative of this at $u = 0$, which immediately gives $\mu - \nu$, the mean, as we know. For the second cumulant you differentiate twice, and the term in $\nu$ comes back with a plus sign, and so on. So

$$\kappa_r = \mu + (-1)^r\,\nu:$$

the even cumulants are all $\mu + \nu$ and the odd ones are all $\mu - \nu$, as you would expect. We will write down the cumulants of some continuous distributions as we go.

So the first important property of the cumulants is that they are translation invariant. The other property, obvious by inspection, is a scaling property: the $r$th cumulant is a homogeneous function of the random variable of degree $r$. If you multiply the random variable by a constant, the $r$th cumulant gets multiplied by that constant to the power $r$: if $X \to cX$, then $\kappa_r \to c^r \kappa_r$. That is fairly straightforward to see. The cumulants have another very crucial property. We saw that the variance of the sum of two independent random variables is simply the sum of the individual variances. This is going to happen for all the cumulants; the additivity of cumulants is a very crucial property, and we can see it in many ways. One way is to note that for independent random variables the moment generating function of the sum is the product of the individual moment generating functions, and taking the log turns that product into a sum. So if you have several statistically independent random variables, the $r$th cumulant of the sum is equal to the sum of the $r$th cumulants. The crucial proviso is independence. This is an absolutely crucial, very important property; we will make use of it, at least in part, when we talk about limit theorems.
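The Skellam result $\kappa_r = \mu + (-1)^r \nu$ can be checked by sampling, and it also illustrates the additivity just described, since the difference is a sum of two independent variables. A minimal sketch (assuming NumPy and SciPy):

```python
import numpy as np
from scipy.stats import kstat

rng = np.random.default_rng(1)
mu, nu = 3.0, 1.5
n = rng.poisson(mu, 10**6) - rng.poisson(nu, 10**6)   # Skellam: difference of Poissons

for r in (1, 2, 3, 4):
    print(r, round(kstat(n, r), 3), mu + (-1)**r * nu)   # sample estimate vs. formula
```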
Now, I have mentioned continuous random variables off and on, so let us say a few words about them. Let $X$ be a random variable taking values in some continuous interval of the real axis, or over all real numbers, say. A value specified with infinite precision is a point in a continuum, a set of measure zero, so you cannot talk about the probability of $X$ taking any particular value; all you can talk about is a probability density: the probability that $X$ has a value in $(x, x + dx)$ is $p(x)\,dx$, where $p(x)$ is the probability density function, which I will denote by PDF. It cannot be negative, though it could become unbounded; all you need is normalization,

$$\int dx\;p(x) = 1$$

over whatever the range of the variable is, in general $-\infty$ to $\infty$, say, together with $p(x) \geq 0$. Now, we are going to be rather loose in our mathematics. If, for example, there is one particular point in the continuum at which the variable takes a value with finite probability, we will include it here by putting a delta-function spike in the PDF at that point, with an appropriate weight factor, so that it gives the probability of taking on that particular value. So we will be a little casual about this: I will continue to write integrals, but the integrands could contain delta functions.

Once you have a continuous random variable of this kind, with an integrable probability density, it is convenient to define the Fourier transform

$$\tilde{p}(k) = \int_{-\infty}^{\infty} dx\;e^{-ikx}\,p(x).$$

This is called the characteristic function of the random variable. But it is nothing new; we have already introduced this quantity in a different way. It is just the expectation value of $e^{-ikX}$ with respect to the weight $p(x)$, and since $\langle e^{uX} \rangle = M(u)$, this is nothing but $M(-ik)$. The characteristic function is just another way of saying that you have a moment generating function. There are analytic properties of these objects which I am not emphasizing at the moment, but note the following: given an arbitrary function $\tilde{p}(k)$, I cannot immediately claim that it is the characteristic function of a random variable until a certain set of conditions is satisfied. For instance, if I want the normalization to hold, I put $k = 0$ and require the integral to equal the total probability, 1; this immediately implies $\tilde{p}(0) = 1$. Even more strongly, a given function $\tilde{p}(k)$ can be a characteristic function only if its inverse Fourier transform gives a function $p(x)$ that is real and non-negative. That is a very strong constraint; not all functions of $k$, even integrable ones, are going to be characteristic functions, and that is an important test in what follows.

Now the additivity of cumulants follows trivially once the characteristic function is introduced. Suppose I have a random variable $X_1$ and another random variable $X_2$, with moment generating functions $M_1, M_2$ and cumulant generating functions $K_1, K_2$, and suppose they are statistically independent. Then the probability density function of the sum $X = X_1 + X_2$ is

$$p(x) = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dx_2\;p_1(x_1)\,p_2(x_2)\,\delta(x_1 + x_2 - x),$$

where $x_1, x_2$ are the points in the sample spaces of the two random variables and $p_1$ and $p_2$ are the corresponding probability density functions,
the delta function enforcing the constraint that $x_1 + x_2$ must equal $x$. (A student points out that the constraint written on the board should indeed read $x_1 + x_2 = x$; yes, quite right, thank you.) And where does that get us? Using the delta function to do the $x_2$ integral,

$$p(x) = \int_{-\infty}^{\infty} dx_1\;p_1(x_1)\,p_2(x - x_1),$$

and that is all it is. Now, in what form is this quantity? It is a convolution. So it immediately follows, by the convolution theorem for Fourier transforms, that

$$\tilde{p}(k) = \tilde{p}_1(k)\,\tilde{p}_2(k).$$

But that is the same as $M(-ik) = M_1(-ik)\,M_2(-ik)$, and taking logs immediately implies $K = K_1 + K_2$; whether the argument is written as $u$ or $-ik$ does not matter, these are all power series. This implies that the cumulants add up, because the $r$th cumulant is the coefficient of $u^r/r!$ in this expansion. So the additivity of cumulants is a trivial consequence of the convolution structure. If they are discrete-valued random variables over some finite range or something like that, you have to work a little harder, but it is pretty much the same. The generalization to a linear combination, some constant times $X_1$ plus some other constant times $X_2$: similar things happen, but you can check that out directly.

Now we could go back and look at various continuous distributions, or even the discrete ones, and ask for sums of random variables. It is going to become very important to understand how sums of random variables behave when you have a large number of terms, and we are going to spend some time on that. But let us do this in the context of a very famous example. The most ubiquitous of continuous distributions is the Gaussian distribution, also called the normal distribution, so let us write down the answer for the Gaussian: what happens to its cumulants? A very important property will emerge. The Gaussian is parameterized by two quantities, the mean and the variance. The random variable $X$ takes values $x \in (-\infty, \infty)$, and the corresponding normalized PDF is

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\;e^{-(x - \mu)^2/2\sigma^2},$$

with mean $\langle X \rangle = \mu$ and variance $\sigma^2$: a two-parameter distribution, parameterized by its mean and variance. This distribution is going to appear ubiquitously, everywhere; when we look at the central limit theorem, we will see how it emerges in a very, very general context. But right now let us look at some of its properties. First, the shape: it is quite straightforward, bell-shaped and unimodal, with the peak at the mean, $x = \mu$, where the value of $p(x)$ is $1/\sqrt{2\pi\sigma^2}$; when you go down to half that value, the width of the curve is proportional to $\sigma$, the standard deviation.
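The convolution structure underlying the additivity argument above can be checked on a grid. A minimal sketch (assuming NumPy and SciPy), using two standard Gaussians, whose sum should again be Gaussian with the variances added:

```python
import numpy as np
from scipy.signal import fftconvolve

x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]
p1 = np.exp(-x**2 / 2) / np.sqrt(2*np.pi)       # N(0,1) density sampled on the grid
p2 = p1.copy()                                  # an independent copy

p_sum = fftconvolve(p1, p2, mode='full') * dx   # density of X1 + X2: the convolution
y = np.linspace(-20, 20, 4001)                  # grid carrying the full convolution
exact = np.exp(-y**2 / 4) / np.sqrt(4*np.pi)    # N(0,2): the variances add

print(np.max(np.abs(p_sum - exact)))            # essentially zero
```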
You could ask what the cumulative distribution function is. There are various notations in use; let us call it $F(x)$:

$$F(x) = \Pr(X \leq x) = \int_{-\infty}^{x} dx'\;p(x'),$$

the probability that the random variable $X$ is less than or equal to some specified number $x$, which is the area under the curve up to that point. It is clear that this cannot be a decreasing function; it has to be non-decreasing, and for a distribution like this one, as you move to the right the area keeps getting added to, so it is an increasing function. Also $F(-\infty) = 0$ and $F(\infty) = 1$. By the symmetry of this distribution it is quite clear that $F(\mu) = 1/2$: at the mean, exactly half the area has been accumulated.

Can we write this $F(x)$ in terms of known functions? There is a famous integral, the error function,

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}} \int_{0}^{x} dt\;e^{-t^2},$$

where the prefactor normalizes it: since $\int_0^\infty dt\,e^{-t^2} = \sqrt{\pi}/2$, we have $\operatorname{erf}(\infty) = 1$ and $\operatorname{erf}(0) = 0$; it is an odd function, so $\operatorname{erf}(-\infty) = -1$, and it runs from $-1$ to $+1$. In terms of it,

$$F(x) = \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x - \mu}{\sqrt{2\sigma^2}}\right)\right],$$

because I have shifted the origin to $\mu$, and the argument must be scaled to be dimensionless: in the probability density we had $e^{-(x - \mu)^2/2\sigma^2}$, so the length scale is fixed by $\sqrt{2\sigma^2}$, and dividing by it kills the dimensions. The graph of $\operatorname{erf}(x)$ rises monotonically from $-1$, through 0 at the origin, to $+1$; suitably shifted and scaled, that is what the cumulative distribution function of this random variable looks like.

Statisticians like to use the cumulative distribution function rather than the probability density function itself, for various reasons. For one, when you have atomic probabilities, a given point carrying finite measure, the density needs delta functions and similar rather singular objects; but when you integrate, things get smooth. One does not mind step functions, whereas delta functions are singular and have to be defined more carefully. So in many cases it is convenient to use the CDF; physicists generally work with probability density functions.
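In code, the Gaussian CDF is one line once the error function is available. A minimal sketch (using Python's standard-library `math.erf`):

```python
from math import erf, sqrt

def gaussian_cdf(x, mu=0.0, sigma=1.0):
    """F(x) = (1/2) * [1 + erf((x - mu) / sqrt(2 sigma^2))]."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

print(gaussian_cdf(0.0))            # 0.5: half the area lies below the mean
print(gaussian_cdf(1.0))            # ~0.8413, the familiar one-sigma value
print(gaussian_cdf(float('inf')))   # 1.0: total probability
```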
Once we have this, we can ask about the various quantities associated with this Gaussian distribution, with its two parameters $\mu$ and $\sigma^2$: all its cumulants, for instance, and all its moments. We can already predict what is going to happen. Consider the central moments $\langle (X - \mu)^k \rangle$. The first one is of course 0 by definition, because $\mu$ is just the expectation of $X$. What about the odd ones, $\langle (X - \mu)^{2k+1} \rangle$? Well, the PDF is a symmetric function of $x - \mu$, and I am asking for the average of $x - \mu$ raised to an odd power, so by symmetry this is 0. (The integrals exist for all positive $k$, as you can see, because the $e^{-x^2}$ factor takes care of convergence; all the moments definitely exist.) So the odd central moments vanish identically. And what are the even moments $\langle (X - \mu)^{2k} \rangle$ like? When $k = 1$ it is just the variance; when $k = 0$ it has to be 1, the average of 1. Now it is clear that the answer depends only on $\sigma$, because once you shift to $\mu$, the only parameter left is $\sigma$, and the only quantity with the dimensions of a length in the problem is $\sigma$; so the result must be proportional to $\sigma^{2k}$ on purely dimensional grounds, multiplied by some numerical factor. That factor is not hard to find; write down the Gaussian integral with an even power inserted, and it turns out that

$$\langle (X - \mu)^{2k} \rangle = (2k - 1)!!\;\sigma^{2k},$$

where the double factorial stands for $1 \times 3 \times 5 \times \cdots \times (2k - 1)$; this symbol is very often used for that product, or you could write it as $(2k)!/(k!\,2^k)$, and so on.

There is a nice combinatorial interpretation of this, useful in places like field theory when you use what is called Wick's theorem. Write $(X - \mu)^{2k}$ out as $2k$ factors, each equal to $X - \mu$, and ask: in how many distinct ways can I pair these factors, two at a time, into independent pairs? The answer is precisely $(2k - 1)!!$. So this is really a combinatorial factor, the number of ways of pairing $2k$ objects two at a time. And a very important property emerges: all the central moments depend only on powers of $\sigma^2$, nothing more.
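The even-moment formula is easy to confirm by direct numerical integration against the Gaussian density. A minimal sketch (assuming NumPy and SciPy's `quad`):

```python
import numpy as np
from scipy.integrate import quad

mu, sigma = 1.0, 2.0
pdf = lambda x: np.exp(-(x - mu)**2 / (2*sigma**2)) / np.sqrt(2*np.pi*sigma**2)

def double_factorial(n):             # (2k-1)!! = 1 * 3 * 5 * ... * (2k-1)
    return 1 if n <= 0 else n * double_factorial(n - 2)

for k in (1, 2, 3):
    moment, _ = quad(lambda x: (x - mu)**(2*k) * pdf(x), -np.inf, np.inf)
    print(2*k, round(moment, 6), double_factorial(2*k - 1) * sigma**(2*k))
```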
Now you can ask what the cumulant generating function of this distribution is, and a very interesting fact emerges. First you want the moment generating function; remember $K(-ik) = \ln \tilde{p}(k)$, and

$$\tilde{p}(k) = \int_{-\infty}^{\infty} dx\;\frac{1}{\sqrt{2\pi\sigma^2}}\;e^{-(x - \mu)^2/2\sigma^2}\,e^{-ikx}$$

is the Fourier transform of a Gaussian. What is that going to be? Also a Gaussian: all you need to do is pull out the $1/2\sigma^2$ and complete the square in the exponent. But what does the width of that Gaussian depend on? Here the width was $\sigma^2$; there it is $1/\sigma^2$. This is the very interesting property of Fourier transforms that the more compact a function is, the more spread out its Fourier transform is, and vice versa, with very profound implications. Carrying out the calculation,

$$K(u) = \mu u + \tfrac{1}{2}\sigma^2 u^2, \qquad K(-ik) = \ln \tilde{p}(k) = -i\mu k - \tfrac{1}{2}\sigma^2 k^2.$$

So what does that tell us about the Gaussian? It is immediately true that $\kappa_1 = \mu$ and $\kappa_2 = \sigma^2$; that is very clear. But what does it say about $\kappa_r$ for $r \geq 3$? It is 0, identically, because remember that $\kappa_r$ is the coefficient of $u^r/r!$ in the power series expansion of $K(u)$ about the origin, and this $K(u)$ is a polynomial: just the first two terms, and everything else goes away. So an incredible property of the Gaussian is that all cumulants higher than the second vanish identically. There is a mean and a variance, and all the higher cumulants are identically 0. That is a very basic property of the Gaussian: the third cumulant, the fourth cumulant, and all the rest are identically 0.

There is a physical meaning to these cumulants. The mean is exactly the average value. The variance tells you the scatter about the mean. The third cumulant gives you what is called the skewness; it is related to how asymmetric the distribution is, and it vanishes for a symmetric distribution like the Gaussian. The fourth cumulant gives you information about how much a distribution departs from Gaussianity. The relation above immediately implies that for the Gaussian,

$$\kappa_4 = \langle (\delta X)^4 \rangle - 3\,\langle (\delta X)^2 \rangle^2 = 0,$$

where $\delta X = X - \mu$ denotes the deviation from the mean. In general, this quantity, made dimensionless by dividing by the variance squared,

$$\frac{\langle (\delta X)^4 \rangle - 3\,\langle (\delta X)^2 \rangle^2}{\langle (\delta X)^2 \rangle^2},$$

is called the excess of kurtosis. For a Gaussian it is identically 0; in general it could be positive, negative, or 0. If it is positive, what does that imply? It says, in some sense, that the fourth moment is dominating: the larger deviations of $X$ from the mean are more significant than the smaller ones, compared with a Gaussian. It says something about the shape of the distribution. Similarly, if it is negative, the large values do not dominate; the smaller values do. So in one case you have something fatter than the Gaussian, in the other something leaner than a Gaussian. These are important indicators of deviations from Gaussianity, and such deviations matter because Gaussianity is what you expect, as you will see, when a lot of random variables are added up in an incoherent sort of fashion: in the limit, after suitable rescaling of the sum, the distribution turns out to be Gaussian under very robust conditions.
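The sign of the excess of kurtosis is easy to see in samples. A minimal sketch (assuming NumPy and SciPy; `scipy.stats.kurtosis` returns the excess by default):

```python
import numpy as np
from scipy.stats import kurtosis   # default fisher=True: the *excess* of kurtosis

rng = np.random.default_rng(3)
n = 10**6
print('gaussian', round(kurtosis(rng.normal(size=n)), 3))   # ~ 0
print('laplace ', round(kurtosis(rng.laplace(size=n)), 3))  # ~ +3: fatter tails
print('uniform ', round(kurtosis(rng.uniform(size=n)), 3))  # ~ -1.2: leaner
```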
So whenever you see a deviation of this kind, it says something very important about the underlying physics of the problem. Keep in mind that the excess of kurtosis is identically 0 for a Gaussian, but there are distributions for which this is not so. Now you can ask: does this go on forever? After all, to define the distribution completely, I need information about all the moments. Is it that I need an infinite set of numbers, and only then can I reconstruct the distribution? For instance, suppose I give you all the moments of a distribution: can you uniquely reconstruct the probability distribution or density function? This is an important problem in mathematical statistics, called the problem of moments, and certain answers are known under suitable conditions. We will not go into it, but let me simply say that for practical purposes, when you actually analyze data, the first four cumulants very often serve to describe the random variable more or less completely. The mean gives you crucial information about what the variable is typically likely to be, if it is a simple kind of distribution; the variance gives you the scatter; the third tells you about skewness or asymmetry; and the fourth tells you about departures from Gaussianity. Pretty much, for numerical purposes, these suffice in most cases, although of course from a theoretical point of view you need to know all the moments before you can make exact statements.

Here you could ask the following question, which is an interesting one; I am not going to prove the answer here. The Gaussian had a quadratic cumulant generating function, and all the cumulants from $\kappa_3$ onwards were identically 0. Are there continuous random variables, with well-defined probability density functions, whose cumulant generating function is a polynomial of some finite degree greater than 2, so that the distribution has cumulants up to some order $n$ and every cumulant beyond that is identically 0? The answer, under fairly general conditions, is no (this is a classical result, Marcinkiewicz's theorem): barring accidents in certain cases, either all the cumulants exist and the series goes on forever, or it stops at the quadratic, the Gaussian, and that is it, nothing more. There are other interesting properties of this kind which will emerge when we talk about stable distributions; we will see there, too, that these are the only two possibilities, and where this comes about.

So the next step is the following. I have some information about the Gaussian; there are several other such distributions, but we will talk about them when we come to stable distributions. First I would like to take a set of ordinary, very simple random variables, add them all up, and see what the distribution of the sum looks like. In particular, we will undertake a simple exercise: take $n$ random variables, each uniformly distributed between 0 and 1, that is, each taking values in $(0, 1)$ with constant probability density 1, add them all up, and ask for the distribution of the sum. We will work that out explicitly.
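A minimal sketch of that exercise (assuming NumPy and SciPy): sample the sum of $n$ uniforms, standardize it, and watch the skewness and excess kurtosis drift toward the Gaussian values of 0.

```python
import numpy as np
from scipy.stats import skew, kurtosis

rng = np.random.default_rng(4)
for n in (1, 2, 12):
    s = rng.uniform(size=(250_000, n)).sum(axis=1)   # sum of n uniform(0,1) variables
    z = (s - n/2) / np.sqrt(n/12)                    # the sum has mean n/2, variance n/12
    print(n, round(skew(z), 3), round(kurtosis(z), 3))
# the skewness stays ~0; the excess kurtosis, exactly -6/(5n), shrinks toward 0
```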
Meanwhile, one final point. If you have functions of random variables, their probability density functions can look very different from the density of the original random variable. Look at the Gaussian example, for instance: take a Gaussian with zero mean, for simplicity, so that

$$p(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\;e^{-x^2/2\sigma^2}.$$

Now I ask: what is the probability density function of the variable $\xi = X^2$, the square of this variable? Let us call its PDF $\rho(\xi)$. It is immediately obvious that $\xi \in (0, \infty)$, unlike the original random variable, which ran over $(-\infty, \infty)$. What is the PDF of $\xi$ going to be like? There are several ways of doing this. One of them is brute force: I write

$$\rho(\xi) = \int_{-\infty}^{\infty} dx\;p(x)\,\delta(\xi - x^2).$$

To do this integral, I have to convert the delta function in $\xi$ into delta functions in $x$. The first property of the delta function is that it is symmetric, so I can write it as $\delta(x^2 - \xi)$; and since $x^2 - \xi = (x - \sqrt{\xi}\,)(x + \sqrt{\xi}\,)$, the standard rule gives

$$\rho(\xi) = \int_{-\infty}^{\infty} dx\;p(x)\,\frac{\delta(x - \sqrt{\xi}\,) + \delta(x + \sqrt{\xi}\,)}{2\sqrt{\xi}},$$

the $2\sqrt{\xi}$ being the Jacobian factor, the derivative. Now I can do the integral by simply using the delta functions and plugging in.

But you can also write the answer down directly. If, when $X$ takes a value between $x$ and $x + dx$, $\xi$ takes a value between $\xi$ and $\xi + d\xi$, then $\rho(\xi)\,d\xi = p(x)\,dx$. In this case we are fortunate, because as $x$ increases (on the positive side), $\xi$ also increases; in general, when you are talking about probabilities, you have to make sure both sides are positive, so you write $\rho(\xi) = p(x)\,|dx/d\xi|$, being careful about the modulus sign. And I must express everything in terms of $\xi$, because that is what $\rho$ is a function of: $x = \sqrt{\xi}$ and $dx/d\xi = 1/(2\sqrt{\xi}\,)$, with no need for the modulus here since $\xi$ is not negative. Would that be the right answer? No, because $-x$ contributes exactly the same amount as $x$, so there is a factor of 2. That 2 cancels the 2 in the Jacobian, and we get

$$\rho(\xi) = \frac{e^{-\xi/2\sigma^2}}{\sqrt{2\pi\sigma^2\,\xi}}.$$

Of course, you should check normalization: verify that $\int_0^\infty \rho(\xi)\,d\xi = 1$, the integral running from 0 to infinity. So it is an exponential, but with this extra $1/\sqrt{\xi}$ factor, and the distribution looks rather different from that of the original variable.
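The change-of-variables result can be checked against a histogram of squared Gaussian samples. A minimal sketch (assuming NumPy); the bins nearest $\xi = 0$ are skipped in the printout because the density diverges there:

```python
import numpy as np

rng = np.random.default_rng(5)
sigma = 1.5
xi = rng.normal(scale=sigma, size=10**6)**2      # xi = X^2 with X ~ N(0, sigma^2)

counts, edges = np.histogram(xi, bins=40, range=(0.0, 8.0))
width = edges[1] - edges[0]
hist = counts / (len(xi) * width)                # empirical density (normalized by ALL samples)
centers = 0.5 * (edges[:-1] + edges[1:])
rho = np.exp(-centers / (2*sigma**2)) / np.sqrt(2*np.pi*sigma**2*centers)

for i in (2, 10, 20, 30):                        # the two columns agree to a percent or so
    print(f"xi = {centers[i]:4.2f}  histogram = {hist[i]:.4f}  formula = {rho[i]:.4f}")
```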
For instance, for one-dimensional motion, the Maxwellian distribution of a velocity component is a Gaussian, proportional to $e^{-mv^2/2kT}$ or something like that; but if you ask for the distribution of the energy, which is $\frac{1}{2}mv^2$, it is proportional to $e^{-E/kT}$ divided by the square root of the energy. This factor in the denominator, which came from the derivative, the Jacobian sitting there, is crucial. What do you call it in the example I just talked about? In the energy variable it is precisely the density of states for one-dimensional motion. We will see where that comes in when we talk a little about the canonical ensemble; you will see that this density of states plays a crucial role, and we will return to it when I discuss the characteristic function for this distribution. Alright, so the next exercise is to take a set of identical random variables and add them up, and see how the Gaussian emerges. We will do that next time.