We start today by looking at sums of random variables; as I said, we are heading towards deducing, or at least understanding, some limit laws, such as how the Gaussian emerges in so many cases. So I start by asking a very simple question. Suppose I have a set of Gaussian random variables, and let us denote them by x_1, x_2, ..., x_n, each of them a Gaussian random variable and all of them identically distributed and statistically independent — independent, identically distributed random variables, abbreviated i.i.d. RVs. Each has a normal distribution with mean μ and variance σ², generally denoted N(μ, σ²); that is just notation to say that the random variable is a Gaussian with the specified mean and variance, and each of them has exactly the same distribution. Then the question is: what is the distribution of the variable Z_n = (x_1 + ⋯ + x_n − nμ)/√(nσ²)? I have subtracted n times the mean and, to make it dimensionless, divided by √(nσ²); remember σ is the standard deviation of each variable, and nσ² is just the sum of the variances, because the variances of independent random variables add. So what is the distribution of this Z_n? That is the question. Of course, you can do this the hard way: write the probability distribution as a product of the distributions of all these variables, integrate over the sample spaces of each of them, and impose a delta-function constraint that this combination equals z. That is a painful way of doing it in this case.
On the other hand, I argue as follows. Consider the variables x_i − μ, with the mean subtracted out. These are all independently distributed, each with variance σ², so their sum has variance nσ², and I can immediately write down the characteristic function, or the moment generating function, or better still the cumulant generating function, of the whole thing. So take the cumulant generating function of Z_n: there is no mean term, because the mean is 0 — I have subtracted it out. For a single Gaussian variable the formula is ½σ²u²; for n of them it is ½nσ²u². But remember I have rescaled each variable: x_i − μ is divided by √(nσ²), i.e. multiplied by the constant 1/√(nσ²), and in the variance this constant appears squared. So the cumulant generating function equals ½nσ²u² × 1/(nσ²) = ½u². So we have the statement that the cumulant generating function of this sum — rescaled by removing the mean and dividing by √(nσ²) — is just ½u², which immediately implies that Z_n has the normal distribution with 0 mean and unit variance, which is called the standard Gaussian distribution, N(0, 1). So this is our first interesting result: if you have n of these Gaussian random variables, subtract out the mean value, and suitably rescale by √(n × variance), then this combination is normally distributed, and this is true for every n. You can generalize this result immediately. Consider, for example, the quantity Σ_{i=1}^n a_i x_i.
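This result is easy to verify numerically: for Gaussian x_i, the rescaled sum is exactly N(0, 1) for every n, not just in a limit. A minimal sketch — the values of μ, σ, n and the sample count are illustrative choices:

```python
import numpy as np

# Draw many realizations of n i.i.d. Gaussian variables with mean mu and
# variance sigma^2, and form Z_n = (x_1 + ... + x_n - n*mu) / sqrt(n*sigma^2).
rng = np.random.default_rng(0)
mu, sigma, n, samples = 3.0, 2.0, 50, 100_000

x = rng.normal(mu, sigma, size=(samples, n))
z = (x.sum(axis=1) - n * mu) / np.sqrt(n * sigma**2)

# For Gaussian inputs this is exact for every n, not only asymptotically:
mean_z, var_z = z.mean(), z.var()
```

The sample mean and variance of z come out close to 0 and 1 for any choice of n, which is the sense in which the Gaussian reproduces itself.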
So the x's are all Gaussian random variables, and let us suppose that the mean of x_i is some μ_i and its variance is some σ_i². What is the distribution of this combination? As it stands the combination is not a very good one — we need to scale out the mean value, and so on. So let us call it Z_n for convenience, and define Z_n′ = (Z_n − Σ a_i μ_i)/√(Σ_{i=1}^n a_i² σ_i²). What should I divide by? Since x_i has mean μ_i and variance σ_i², the variance of Σ a_i x_i is Σ a_i² σ_i², and I divide by its square root. What is the distribution of Z_n′? The standard normal distribution, N(0, 1), and that is true for every n. So we see that, with a suitable translation (subtracting a constant) and rescaling, a sum of Gaussian random variables — or, more generally, a linear combination of Gaussian random variables — has a Gaussian distribution. In this sense the Gaussian is extremely robust: take two independent Gaussian random variables, form a linear combination, and the suitably scaled version of this sum has a Gaussian distribution once again — it reproduces itself. But as you can see, the theorem is actually more general than that, and the statement is the following; this is one statement of a famous theorem called the central limit theorem.
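The same check works for a general linear combination: with weights a_i, means μ_i and standard deviations σ_i (the particular values below are arbitrary illustrative choices), the combination (Σ a_i x_i − Σ a_i μ_i)/√(Σ a_i² σ_i²) should again come out standard normal. A sketch:

```python
import numpy as np

# Independent Gaussians x_i with different means mu_i and variances sigma_i^2;
# check that the recentred, rescaled linear combination is N(0, 1).
rng = np.random.default_rng(1)
samples = 200_000
a = np.array([1.0, -2.0, 0.5, 3.0, 1.5])      # arbitrary illustrative weights
mu = np.array([0.0, 1.0, -1.0, 2.0, 0.5])
sig = np.array([1.0, 0.5, 2.0, 1.0, 1.5])

x = rng.normal(mu, sig, size=(samples, a.size))
zn = (x @ a - a @ mu) / np.sqrt(np.sum(a**2 * sig**2))
mean_zn, var_zn = zn.mean(), zn.var()
```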
I am saying this very loosely — there is a rigorous way of stating this theorem in all its generality, but what I am going to do here is simply state it in very simple terms. Give me a set of n random variables x_1, x_2, ..., x_n; to start with, let us look at i.i.d. RVs, although in fact they need not be identically distributed. Each x_i has a mean μ_i and a variance σ_i², which are finite: we want the first two moments — the mean and the variance — of every one of these variables to be some finite value. Then one can ask what happens to (Σ_{i=1}^n x_i − Σ_{i=1}^n μ_i)/√(Σ_{i=1}^n σ_i²). The statement is that, no matter what distribution the x's are drawn from — any arbitrary distribution with a finite mean and a finite variance — the limit as n tends to infinity of this combination has a normal distribution. That is the statement of the central limit theorem: it says that no matter what distribution you start with, you end up with the Gaussian distribution. What we would like to do is look at a couple of toy models and see if it is really true and how it comes about, rather than give a rigorous proof of this theorem and its generalizations; we will look at the simplest possible example, a toy model, and see exactly how it becomes Gaussian and so on, and that is an instructive lesson. So let us do the following: look at a variable x whose sample space is the interval from 0 to 1, with p(x) = 1. So it is uniformly distributed in the unit interval between 0 and 1.
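Before working through the calculation, the claim can be previewed numerically: the uniform density on [0, 1] has mean 1/2 and variance 1/12, and the rescaled sum of n such variables should look standard normal for large n. A minimal sketch — n and the sample count are illustrative choices:

```python
import numpy as np

# Rescaled sums of uniform variables on [0, 1]: subtract the mean n/2 and
# divide by sqrt(n/12), the square root of the total variance.
rng = np.random.default_rng(2)
n, samples = 100, 50_000

x = rng.random(size=(samples, n))
z = (x.sum(axis=1) - n * 0.5) / np.sqrt(n / 12.0)

mean_z, var_z = z.mean(), z.var()
frac_within_2 = np.mean(np.abs(z) < 2.0)   # standard normal gives about 0.954
```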
So if I plot this p(x), it is just a flat line at height 1. What is the average value? Clearly ⟨x⟩ = 1/2, because x is uniformly distributed. The mean square is the integral of x² from 0 to 1, which is 1/3, and you subtract (1/2)², so the variance of x is 1/3 − 1/4 = 1/12. Now I ask: what happens if I add two of these variables — what does the distribution look like? Let us have another such variable y, also in [0, 1] and distributed in exactly the same way; then the question is, what is the distribution of the variable z = x + y? Now, what is the sample space of z? It is 0 to 2 in this case. We immediately write down the probability density function of this variable z — I do not want to confuse it with the symbol p, so let me call it ρ: ρ(z) = ∫₀¹ dx ∫₀¹ dy p(x) p(y) δ(x + y − z), where p(x) and p(y) are both 1 in this case. So we have to do this integral — it is as simple as that — but we have to be a little cautious doing this delta-function integral, and that is done as follows. I draw a picture: here is x, here is y, here is the origin, and the integral is over the unit square — that is the region in which each of these variables takes values. Now I impose the condition that x + y is some number z between 0 and 2. The line x + y = z is a line at 135 degrees, and as long as z is less than 1 it cuts across the square near the origin. And what is the value of the integral? Well, look at it this way: I do the y integral first
and then the x integral. When I do the y integral, I must see whether the delta function fires. As I move up in y, for every value of x there exists a legitimate value of y such that x + y = z, provided x is less than z; then I can just finish off the y integral completely, but as soon as x exceeds z the delta function no longer fires. (As soon as z exceeds 1, the line starts cutting the square at two other points; but till z hits 1 you are fine.) So you can get rid of the delta-function constraint and write ρ(z) = ∫₀ᶻ dx: the y integration is gone — the delta function has taken care of it — but the region of integration over x has been curtailed, not 0 to 1 but 0 to z. So ρ(z) = z for 0 ≤ z ≤ 1. What happens beyond that? For 1 ≤ z ≤ 2 the line x + y = z cuts the square differently. What is the range of integration in x now? As I move up in y, I have a contribution only if x is bigger than a certain value, right up to 1. And what is that value? On the line x + y = z at y = 1, x = z − 1. So ρ(z) = ∫_{z−1}¹ dx = 2 − z. So you see how the shape changes as soon as z crosses 1: the distribution is sharply peaked at that value. If you plot what the distribution looks like, ρ(z) rises linearly to the value 1 at z = 1 and falls linearly back to 0 at z = 2. What was flat has, as soon as you add two of them, become a triangular distribution; and it is normalized, because the area under this curve is half the base times the height — the height
is unity and the base is 2, so the area is 1. This is what happens as I successively add more and more of these variables. But you see, if I had one more variable — x, y, and a third, with w the sum, for example — then in the first instance the integral is going to constrain things to 1 minus something, then 1 minus x minus whatever: it gets more and more complicated as I go on, and this is a very foolish way of proceeding when you have more than two variables. In fact, at the next stage — and you should do this numerically to see how the shape changes — the density becomes piecewise quadratic, extending right up to 3, and so on. I can get rid of that spreading by subtracting the means each time, but you see the shape changing all the time: the range is getting wider and wider and the shape keeps changing. So we would like to see what happens when you add a very large number of these variables and then take the limit as that number goes to infinity. The way to do this is as follows. Let us define Z_n = (x_1 + ⋯ + x_n − n/2)/√(n/12): the mean of each variable is 1/2, so I subtract n/2; the variance of each is 1/12, as we saw, so for n of them it is n/12, and I divide by the square root of the total variance. Now I ask: what is the probability density function of this Z_n? Let us call its argument z; then ρ(z) = ∫₀¹ dx_1 ⋯ ∫₀¹ dx_n δ(z − z_n), where the single-variable densities are all unity — everything is uniformly distributed over the unit interval — and z_n = (Σ_{i=1}^n x_i − n/2)/√(n/12). That is the integral I have to do; but it is foolish to try to do it as it stands, because as you can see the constraints are going to keep multiplying. So the
way to do it is to factor this into a product of integrals. We do that by writing the Fourier representation of the delta function, δ(u) = (1/2π) ∫_{−∞}^{∞} dk e^{iku}. Replacing the delta function by this, ρ(z) = (1/2π) ∫_{−∞}^{∞} dk e^{ikz} ∫₀¹ dx_1 ⋯ ∫₀¹ dx_n e^{−ik z_n}. Now, what happened to the n/2? It is sitting there — it is very much there, so we have to take care of it: z_n contains −(n/2)√(12/n) and, with the minus in front of z_n, this produces a factor e^{+ik(n/2)√(12/n)}. Note √(12/n) = 2√(3/n), so (n/2)√(12/n) = √(3n); that factor e^{ik√(3n)} is there and we cannot avoid it. What remains is a product of n identical integrals ∫₀¹ dx e^{−ik√(12/n) x} — the same integral gets repeated — so let us just write it as I raised to the power n. And it is a trivial definite integral to do: I = ∫₀¹ dx e^{−2ik√(3/n) x} = (e^{−2ik√(3/n)} − 1)/(−2ik√(3/n)). Pull out the factor e^{−ik√(3/n)}; the numerator becomes e^{−ik√(3/n)} − e^{+ik√(3/n)} = −2i sin(k√(3/n)), so I = e^{−ik√(3/n)} · sin(k√(3/n))/(k√(3/n)) — watch all the minus signs; the −2i cancels between numerator and denominator. And that
is I; but what appears is I^n, so let us raise this whole thing to the power n. The prefactor becomes (e^{−ik√(3/n)})^n = e^{−ik√(3n)}, which neatly cancels against the e^{ik√(3n)} outside, and we are left with ρ(z) as the inverse Fourier transform of [sin(k√(3/n))/(k√(3/n))]^n. Now we are in good shape, because all we have to do is look at the behavior of this as n becomes larger and larger. As n grows, the argument k√(3/n) goes to 0; it is like sin x/x as x → 0, whose limit is 1, and the next term is a correction of order 1/n, because sin x = x − x³/6 + ⋯. So the ratio becomes 1 − (1/6)(3k²/n) = 1 − k²/(2n), raised to the power n, and the limit as n → ∞ is e^{−k²/2}. So we are home, because it immediately says that in the limit n → ∞, ρ(z) → (1/2π) ∫_{−∞}^{∞} dk e^{ikz} e^{−k²/2}. But e^{−k²/2} is a Gaussian with variance equal to 1, and the inverse transform of a Gaussian is also a Gaussian, which implies ρ(z) = (1/√(2π)) e^{−z²/2}: Z_n is distributed as N(0, 1), zero mean and unit variance, because that is precisely the inverse Fourier transform of this quantity. By the way, you can do this integral by completing the square and so on, and you will end up with this result: the integral gives a √π which, combined with the prefactor, leaves the √(2π) in the denominator. So you see, we started with a distribution which looked nothing like a Gaussian — a flat distribution whose
variable ranged over 0 to 1 — and yet, as you kept adding variables, we ended up with a Gaussian distribution. The random variable can now take on values all the way from −∞ to ∞: we made sure negative values could be reached because we subtracted out the mean, so the distribution got centered at the origin. It is very important to note that this is the only scaling in which a limiting distribution is possible: we subtracted something linear in n, because the means all add, but divided by something going like √n, and that is the right scaling to end up with a Gaussian distribution. Actually the theorem is a little more general than that: we could have variables with different distributions and still have similar results, but that gets a little more intricate; generally you encounter cases where they are all i.i.d. RVs of some kind. So although I have done this for the uniform distribution, the same thing is actually true for any distribution which has a finite mean and variance, with suitable rescaling. The statement is: if you have i.i.d. RVs of this kind, then there exist constants A_n (a positive quantity) and B_n (a real number) such that the distribution of (Σ_{i=1}^n x_i − B_n)/A_n tends to a Gaussian in the limit n → ∞. So that is the statement of the central limit theorem. We did this in a very simple scalar way — one-dimensional variables and so on — but it does not matter at all what the dimensionality is, and this brings us to the famous random walk problem, about which we are going to say a great deal. So let me do that step by step, and then we will see how the central limit theorem emerges.
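To illustrate that the starting distribution really does not matter, here is the same experiment with a distribution that looks nothing like the uniform or the Gaussian: exponential variables with mean 1 and variance 1, for which the constants above can be taken as B_n = n and A_n = √n. The parameters n and the sample count are illustrative choices:

```python
import numpy as np

# Exponential variables: mean 1, variance 1, strongly skewed -- yet the
# rescaled sum (sum x_i - n) / sqrt(n) approaches N(0, 1) for large n.
rng = np.random.default_rng(3)
n, samples = 200, 20_000

x = rng.exponential(1.0, size=(samples, n))
z = (x.sum(axis=1) - n) / np.sqrt(n)

mean_z, var_z = z.mean(), z.var()
frac_within_1 = np.mean(np.abs(z) < 1.0)   # standard normal gives about 0.683
```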
We are going to talk about random walks on discrete lattices later, but right now, in connection with the central limit theorem, let us look at the problem of random flights. The rule is the following. I start in three-dimensional space — actually the argument is independent of the dimensionality of the space as well — at some origin, and I take a step of fixed length l in some arbitrary direction: I fly from this point to that point, and let us call this vector R_1. I do this in discrete time: at the end of every time step I take a step whose magnitude is l but whose direction can be anywhere in space. So the next time around I go here, and then there — I can cross my earlier path, it does not matter, there are no constraints at all. These steps are the vectors R_2, R_3, and so on, and I do this for some n steps. Suppose at the end of the n-th step I am at a point whose end-to-end displacement from the origin is R_n = R_1 + R_2 + ⋯ + R_n — I should put a subscript n on it, because it depends on the number of time steps. I ask: what is the probability density function of this vector R_n? It is a three-dimensional vector, so I need to know its distribution in direction as well as in magnitude. Let me denote the PDF by P(R, n), and I would like to look at the limit n → ∞. I should really also put in a time step, so that time is included in a dimensional way — some time step τ, for example; we can put it to use later, but let us put a τ in now just to remind ourselves, in anticipation of the fact that later we are going to look at
things in continuous time so I need to have a quantity of dimensions time right now for this problem it is only the n that is relevant it is only the number of steps that is relevant I want to know what this quantity is it should clearly be normalized so it is clear that this integrated over all space d3 R should be equal to 1 at every tau for at every n this should be the case so the whole thing should be normalized and the random variable is a sum of n random variables all of which are identically distributed except each variable is a vector in three dimensions only the length is fixed but the direction is arbitrary completely so let us go about it in the following way what would be the average value of this R sub n intuitively what would be the average value it should be 0 because it is a vector so it is as likely to point in one direction as an inter opposite direction with equal weight so I expect this should be 0 when we of course find the distribution exactly we should verify this is so but it is to be really true that this is equal to 0 what would be the mean square value that means the dot product of this with itself what would that be so what we now asking for is the expectation value of Rn dotted with itself so summation i equal to 1 to n summation j equal to 1 to n Ri dot rj this is what we are asking for but this summation this commutes with the operation of taking averages so it is this what is this angular bracket over average over what over the collection or ensemble of what all possible realizations all possible random walks random flights of n steps all possible steps that is what I have to take the arithmetic average over but what we are trying to do is to find the probability distribution itself so that these ensemble averages can be replaced by averages over the weighted averages over the probability densities but now this guy here is equal to let us suppose the length of each of these is l length then this dot product has n diagonal terms 
where i = j; there, of course, R_i · R_i is just the square of the magnitude of the vector, which is l², and there are n of these, giving nl². Then there are the off-diagonal terms Σ_{i≠j} ⟨R_i · R_j⟩. Now, if R_i is a vector in some direction and R_j is a vector in some other direction, bring them down to a common tail and let θ_ij be the angle between them; since the magnitudes of these vectors are fixed, all we have to do is multiply l² by the average value of cos θ_ij — that is what the dot product means. But what is this average equal to? It has to be 0, because for every such configuration there is an equally probable configuration in which the other vector points the opposite way, and cos(π − θ) = −cos θ, so these two contributions cancel each other. So this term goes away — it is identically 0 — and you end up with ⟨R_n²⟩ = nl²; and since ⟨R_n⟩ = 0, this is also the variance. So if you take a random flight and ask for the end-to-end displacement vector, the variance of that vector is proportional to n — not n², but n — and the root-mean-square displacement is turning out to be proportional to √n. Now, if n is proportional to time, you can see that the root-mean-square distance covered goes like the square root of the time, which is the famous random-walk behavior: if I walk purposefully in one direction at constant speed, the distance I cover is proportional to the time; but if I meander in random directions, my mean displacement is 0, while the root-mean-square end-to-end distance is proportional to the square root of the elapsed time. This is typical of diffusive
behavior, as we will see; there are exceptions when certain physical conditions are met, but in general this is what is going to happen: the root-mean-square displacement becomes proportional to the square root of the time — equivalently, the mean square distance goes like the first power of the time. The proportionality constant is just a dimensional constant, but the scaling is crucial. So we already have some valuable information; what we need, however, is the full probability distribution function, so let us write it out and see what happens. We integrate over all the steps with a delta-function constraint: P(R, n; τ) = ∫ d³r_1 ⋯ ∫ d³r_n p_1(r_1) ⋯ p_1(r_n) δ³(R − r_1 − ⋯ − r_n), where p_1 is the single-step probability density function — we still have to write that down; it is going to be my input — and δ³ is a three-dimensional delta function. The standard trick, of course, is to write this delta function in its Fourier representation, which gives P(R, n; τ) = (1/(2π)³) ∫ d³k e^{−ik·R} [∫ d³r e^{ik·r} p_1(r)]^n — it is the same single-step integral repeated for all of them (so the index on r is not needed), and it appears raised to the power n. Now, what is p_1? First of all, what are its physical dimensions? Because I want p_1(r), integrated over all of space,
to equal 1, so it has to have dimensions of 1/(length)³. But what do we know about it? We are asking for the distribution of a vector whose length is l and whose direction can be anything: its tip can be anywhere on the surface of a sphere of radius l. So p_1 has to be proportional to δ(r − l). And what is the constant of proportionality — how do you discover what that is? You normalize: the total solid angle is 4π, and the volume element carries an r² dr, with r forced to equal l (otherwise the delta function vanishes); dividing these out gives p_1(r) = δ(r − l)/(4πl²), and that is all it is. It has dimensions of 1/(length)³, because the 1/l² supplies 1/(length)² and the delta function has physical dimensions of 1/(its argument), which is 1/(length). So it is a sharp spike at r = l, and it is trivial to verify that ∫ d³r p_1(r) = 1. So let us put this into the integral and see what happens: we need ∫ d³r e^{ik·r} δ(r − l)/(4πl²). Obviously I should do this in spherical polar coordinates — it is simplest, because the constraint involves the magnitude r — but I have to take care of the angular integrations. What should I do? I should choose my polar axis along the vector that is sticking out: there is a vector k sitting here that is not being integrated over. The region of integration — all space — is spherically symmetric, the rest of the integrand is spherically symmetric, and the result is a scalar, so it
is invariant under rotations of the coordinate axes, which implies that I can choose the orientation of my axes as I please and the integral does not change. The convenient thing to do is to choose it along the direction of whatever vector is sticking out, because then k·r just becomes |k| |r| cos θ, with θ the polar angle in spherical polar coordinates. Otherwise you are in for needless complication. Just to show you what the complication can be: if I choose the polar axis in some random direction, and the vector r has spherical polar coordinates (θ, φ) while k has (θ′, φ′), then the angle γ between these two vectors is given by the law of cosines of spherical trigonometry, cos γ = cos θ cos θ′ + sin θ sin θ′ cos(φ − φ′), which one used to learn in high school or first-year college — though, it seems, the internet generation does not. It is the simplest instance of the addition theorem for the Legendre polynomials — the addition theorem for P_1. So if you have two vectors sticking out in space and you know the polar and azimuthal angles of each, you can tell what the cosine of the angle between them is. So what you would have to do for this
integral, if you chose the polar axis in some arbitrary direction, is to write k·r = kr cos γ, substitute that expression for cos γ in the exponent, and then try to do the integrals — obviously a horrible mess. Much simpler to exploit the spherical symmetry and the fact that there is a vector k sticking out: without loss of generality, choose the polar axis along k, which means θ′ = 0, the φ′-dependence drops out, and cos γ reduces to cos θ. This immediately gives ∫ d³r e^{ik·r} p_1(r) = (1/(4πl²)) ∫₀^∞ dr r² δ(r − l) ∫₀^{2π} dφ ∫_{−1}^{1} d(cos θ) e^{ikr cos θ}, where I have written the polar-angle integral — θ running from 0 to π with measure sin θ dθ — as an integral over cos θ from −1 to 1. The φ integral is trivial and gives a factor 2π. Since cos θ is the integration variable, the angular integral gives (e^{ikr} − e^{−ikr})/(ikr) = 2 sin(kr)/(kr) — watch all the factors: the 2π goes away with the 4π, the 2 and the i cancel. You are left with (1/l²) ∫₀^∞ dr r δ(r − l) sin(kr)/k; setting r = l, the factors of l cancel, and this whole thing finally becomes something extremely simple: sin(kl)/(kl). That is it. So P(R, n) is the inverse Fourier transform of [sin(kl)/(kl)]^n. That is well behaved: as k → 0 this factor goes to 1, so there is no singularity; otherwise you
Otherwise you would be in trouble: you have got to do an integration over k, and if something in the denominator blew up faster than 1/k you would be in trouble. But that does not happen here; even raised to the power n, this quantity has a well-defined limit as k goes to 0. You know how this sinc function behaves, and now you raise it to the nth power. As you increase n, the factor sin(kl)/(kl) is always less than 1 in magnitude, so its nth power essentially goes to 0, except for the contribution from the region near k = 0, where the factor goes to 1. So the dominant portion is near k = 0. In what manner should you take a limit? You should let l go to 0 and n go to infinity in such a way that nl² tends to a finite limit; then there is a respectable limit here. For small values of k, sin(kl)/(kl) goes like 1 − k²l²/6. So call the finite limit α: if you let n → ∞ and l → 0 such that nl² → α, I can replace l² by α/n, and raising it to the power n gives

(1 − αk²/(6n))ⁿ → e^{−αk²/6} as n → ∞.

That is a Gaussian, and its inverse Fourier transform is immediately another Gaussian. That is an interesting exercise to do: how should you do that integral, how do you actually establish that it is a Gaussian? So suppose we have done all this, and we finally want to find the inverse transform of e^{−αk²/6}. What should I do now? Of course I know for sure that the inverse transform of a Gaussian is a Gaussian, even in high dimensions. But in what coordinate system should you evaluate it?
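This limit can be seen numerically. The sketch below (NumPy; α = 1 and the grid of wavenumbers are arbitrary choices) keeps nl² = α fixed while n grows and compares [sin(kl)/(kl)]ⁿ with e^{−αk²/6}:

```python
import numpy as np

alpha = 1.0                       # fixed value of n * l^2 (arbitrary choice)
k = np.linspace(0.0, 5.0, 201)    # grid of wavenumbers

errs = []
for n in (10, 100, 10_000):
    l = np.sqrt(alpha / n)        # shrink the step length as n grows
    # np.sinc(x) = sin(pi x)/(pi x), so sin(kl)/(kl) = np.sinc(k*l/pi)
    phi_n = np.sinc(k * l / np.pi) ** n
    gauss = np.exp(-alpha * k**2 / 6.0)
    err = np.max(np.abs(phi_n - gauss))
    errs.append(err)
    print(n, err)                 # the maximum deviation shrinks as n grows
```

The deviation falls off roughly like 1/n, coming from the k⁴l⁴ term neglected in the small-k expansion of the sinc function.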
Should you choose polar coordinates? No; there is no need to, and it would make things very hard. You should choose Cartesian coordinates, because k · r then becomes k₁x + k₂y + k₃z and the exponent factors, while e^{−αk²/6} is already in factored form; so the whole thing becomes the product of three one-dimensional Gaussian integrals, which you can write down easily. So at this stage go back and choose Cartesian coordinates and evaluate in those. (You could do it in other coordinate systems too, but this is the simple way.) You will end up with something of the form p(r) = (constant) × e^{−(constant) r²}. The crucial point: where will α sit, upstairs or downstairs? Downstairs: there is some constant divided by α multiplying r², because the width of the Gaussian in r goes like the reciprocal of the width in k, which here is of order 1/√α. We will see what that width is when we do the diffusion equation: α will turn out to be proportional to time, because we take the limit n → ∞, τ → 0 such that nτ goes to the time t, and then this constant becomes proportional to Dt, the diffusion constant times the time. We will see this explicitly. So this is how random flights, in the limit of zero time step and zero step length, tend to the solution of the diffusion equation in continuous space and time. But it is really a consequence of the central limit theorem: the Gaussian here arose because you add up all these steps, and remember that each individual step was not distributed in a Gaussian fashion at all. Now, I also mentioned that the individual random variables could have other distributions. We took the steps to have fixed length, but the result turns out to be of incredible generality: the individual steps need not have the same distribution at all, and they need not have the same fixed length. The step lengths could themselves be drawn from some distribution; all you need is that this distribution of step lengths has a finite variance.
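One can also watch the limit emerge directly in a simulation. The sketch below (NumPy; the values of n, l and the number of walks are arbitrary illustrative choices) generates many random flights with fixed step length l and checks that the mean squared end-to-end distance is nl² = α, and that each Cartesian component of the endpoint has variance close to α/3, exactly as the characteristic function e^{−αk²/6} predicts:

```python
import numpy as np

rng = np.random.default_rng(1)
n, l, walks = 100, 0.1, 20_000   # arbitrary choices; alpha = n * l^2 = 1
alpha = n * l**2

# Uniform random directions on the sphere: cos(theta) uniform, phi uniform.
cos_t = rng.uniform(-1.0, 1.0, (walks, n))
phi = rng.uniform(0.0, 2.0 * np.pi, (walks, n))
sin_t = np.sqrt(1.0 - cos_t**2)
steps = l * np.stack([sin_t * np.cos(phi), sin_t * np.sin(phi), cos_t], axis=-1)

r_end = steps.sum(axis=1)                      # end-to-end vector of each walk
mean_r2 = np.mean(np.sum(r_end**2, axis=1))    # should be close to alpha
var_x = np.var(r_end[:, 0])                    # should be close to alpha / 3

print(mean_r2, var_x)
```

Note that the mean squared distance equals nl² exactly for every n (the cross terms between independent steps average to zero); it is the Gaussian shape, with variance α/3 per component, that only emerges in the limit of many steps.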
It is also independent of the dimensionality of space: it does not matter how many spatial dimensions you are in, apart from some numerical factors you would still get a Gaussian here. You could have distributions in time, you could have distributions in space; as long as they are respectable distributions with finite variances and so on, this emergence of the Gaussian form is completely robust. It breaks down only when something very fundamental is changed. If, for example, I took a problem of random flights in which the different steps have different lengths, themselves drawn from some distribution, and that distribution does not have a finite variance, meaning that steps of arbitrarily large length do occur and the mean squared step length is not finite, then you have a very different behavior altogether: the Gaussian gets destroyed. This is what happens in the so-called Lévy flights, and we will talk a little bit about them later on. It is like the strategy used by bacteria in a nutrient solution: they wander around taking small random steps, but after a while they take one big hop over a long distance and then start doing it again. And if you have clusters of clusters of clusters in a self-similar way, it is possible to have random walks and flights with non-Gaussian behavior in the limit; these are also of practical importance, and we will come back to them. But otherwise, the lesson here is that the Gaussian is very robust, with a huge number of generalizations. I will stop here and resume next time.
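To illustrate the breakdown, here is a sketch (NumPy; the tail exponents and sizes are arbitrary illustrative choices) contrasting finite-variance step lengths with infinite-variance ones. With finite variance, no single step matters; with a Pareto tail exponent below 2, the single longest hop carries a sizable fraction of the whole path, which is the hallmark of a Lévy flight:

```python
import numpy as np

rng = np.random.default_rng(2)
n, walks = 1000, 5000            # steps per walk, number of walks (arbitrary)

def mean_max_fraction(a):
    """Average fraction of the total path length contributed by the single
    longest step, for step lengths Pareto-distributed with tail exponent a
    (the variance is infinite for a <= 2)."""
    lengths = 1.0 + rng.pareto(a, (walks, n))   # Pareto with minimum length 1
    return np.mean(lengths.max(axis=1) / lengths.sum(axis=1))

frac_finite = mean_max_fraction(3.0)   # finite variance: no step dominates
frac_levy = mean_max_fraction(1.5)     # infinite variance: big hops dominate
print(frac_finite, frac_levy)
```

For the finite-variance case this fraction vanishes as n grows, consistent with the central limit theorem; for the heavy-tailed case it does not, and the Gaussian limit is destroyed.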