We will begin this course on stochastic processes with a quick introduction to discrete probability distributions. I will initially talk about probability distributions both for variables which take on a discrete set of values and for continuous random variables, and eventually work our way to random processes, or stochastic processes, where you have a random variable which varies in time, in continuous time for instance. So let us start with discrete probability distributions. I assume you have some familiarity with the elementary notions of probability and statistics, but let us quickly recapitulate some salient features, the ones that are going to be of interest to us in our future work.

Let us start with the simplest of these distributions: take a pair of dice, toss them, and ask what kind of distribution you get for the score. We assume these are fair dice, and each of them shows a number on its top face running from 1 to 6. You toss the pair once and ask for the probability of the various scores. So let us call the random variable the score S, the sum of the numbers you see on the pair of dice; each die can give a value from 1 to 6 with equal probability. What is the sample space of this score S? Clearly the least value you can get is 2 and the largest is 12, so the sample space is the set of integers 2, 3, ..., 12. You can then ask for the probability of getting a particular score. If I denote this probability by p_S, it is clear that if you sum p_S from S = 2 to 12, the result must equal 1; that is the conservation of total probability. Each p_S is a positive number between 0 and 1, and S runs from 2 to 12.

Now, this is a very simple experiment. What is the most probable score? It is 7. And what is the probability of getting a 7? This is figured out by first asking for the probability of getting any particular outcome at all: what is the probability that the first die gives you, say, 3 and the second die gives you 4? It is 1/6 for the first one and 1/6 for the second, so 1/36. How many such distinct possibilities are there? Here is the die on the left and here is the die on the right: one can show 6 values and the other can show 6 values, so there are 36 possibilities in all. Every time you toss the pair there are 36 possible outcomes, assuming this die and that die are completely distinguishable. I am going to use the language of statistical physics here: there are 36 microstates in all, 6 for each die, and they are independent objects. A microstate is the detailed configuration of the system as a whole, what is happening to die one and what is happening to die two, completely specified. Because the dice are fair dice, all 36 microstates are equally probable, and the probability of any one of them is 1/36.
How many macrostates are there? By a macrostate I do not ask for the detailed information about the system; I only ask for the total score. So S is going to label the macrostates, and there are 11 of them. The most probable score is 7, and why is that? Because it has the largest number of accessible microstates, each of which is equally probable. This should remind you of the microcanonical ensemble in statistical mechanics, where you have a whole set of accessible microstates and you postulate that they are all equally probable. Then the most probable macrostate is going to be the one that gets contributions from the largest number of microstates.

What is p_7 equal to? It is 1/6, because the contributing microstates are (1,6), (2,5), (3,4), (4,3), (5,2) and (6,1): six of them in all, and 6 × (1/36) = 1/6. What does the formula for p_S look like? Well, if S takes on its extreme value 2, there is only one contributing microstate, (1,1), and similarly for 12 it is (6,6); in between, p_S increases and then decreases once the peak is reached. That is the very typical behavior of this quantity: p_S = (S − 1)/36 as long as S ≤ 7, and once you exceed 7 it comes down again, p_S = (13 − S)/36 for 7 ≤ S ≤ 12. You just have to check that both expressions give 1/6 at S = 7. So we have the complete distribution, and we can compute all sorts of things: the mean, the variance, the higher moments of this distribution, and so on.

This is an utterly trivial, completely elementary example. The only assumption is that all the microstates are equally probable, every one of them, and yet this uniform distribution over the microstates still leads to a non-uniform distribution in the macroscopic variable: p_S is not a constant, it goes up and comes down. This is at the root of what is happening around us in statistical mechanics, where you discover that systems sit in the most probable macrostate: even though the microstates are equally accessible, equally probable, there is a steady equilibrium state in many cases simply because the number of contributing microstates is the largest, exponentially so in many cases, as we will see in a simple coin-tossing example.
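If you want to check this counting for yourself, here is a minimal Python sketch of the enumeration above (the code and its organization are mine, not part of the lecture):

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 36 equally probable microstates (die 1, die 2)
# and tally the macrostate, the score S = die1 + die2.
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))

for s in range(2, 13):
    print(s, counts[s], Fraction(counts[s], 36))

# Check the closed form: p_S = (S - 1)/36 for S <= 7, (13 - S)/36 for S >= 7.
assert all(counts[s] == (s - 1 if s <= 7 else 13 - s) for s in range(2, 13))
```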
Now that we have this, let us go to a slightly less trivial example, but first just one incidental question. We assumed these two dice were distinguishable objects. I would like to constantly come back to the physical applications and make a connection with what you know from statistical mechanics. So suppose these two dice were made so perfectly that they are indistinguishable from each other. Incidentally, in this problem, what is the probability of getting a score of 3? It is 1/18, because you could get it with (1,2) or (2,1).

So it is 1/18. Now I make these two dice absolutely indistinguishable from each other. What is the probability of getting 3? Is it still 1/18? These two dice look exactly the same, absolutely alike; would you say it is still 1/18? After all, when you toss a pair of fair dice, they are supposed to be indistinguishable in any case. If you take these two classical dice and toss them, even if they look exactly alike, you can certainly see which one is which, and the probability is still 1/18.

Now suppose these two dice were quantum particles: identical particles, say nice identical spin-1 particles, bosons. Would the probability still be 1/18 if they were indistinguishable in the quantum sense? What do you think? They are completely indistinguishable, and they obey Bose statistics, and in the Bose way of counting this is certainly not 1/18, because you cannot distinguish a state in which one of them is in level one and the other is in level two from the state in which the situation is reversed; these two possibilities are completely part of one state, and you cannot distinguish them in this fashion. So what is the new feature in quantum physics here? At what stage would you say that these two are indistinguishable in the quantum sense, so that the counting of the number of independent states is different from what it is in classical physics? After all, we know from quantum statistics that this is true: you do not count in the same way as you do for classical objects. Even if classical objects are completely, perfectly indistinguishable, you still cannot put them in what is called an entangled quantum state. On the other hand, if two particles are absolutely identical quantum particles, you can put the system in a quantum mechanically entangled state, and once that possibility exists, you have to change the rules of counting; it is no longer what it was earlier. So the short answer to the question of when you use quantum statistics has in part to do with the fact that you can put indistinguishable quantum objects in an entangled state, which you cannot do with classical objects. That is the reason why, even if the dice are indistinguishable from each other to whatever degree of perfection, you still have exactly classical counting in the case of dice, but for quantum particles you have to change the rules. There is a very profound reason here, and we will come back to this when we talk a little bit about quantum statistics.
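The lecture does not work out the Bose-counted value for the dice, but as a purely illustrative aside: if each unordered pair of faces counted as a single state and, as an extra assumption not made above, all such states were weighted equally, the counting would go as follows.

```python
from itertools import combinations_with_replacement

# Hypothetical "bosonic dice": an unordered pair of faces is ONE state.
# Extra assumption, not from the lecture: all 21 such states equally weighted.
states = list(combinations_with_replacement(range(1, 7), 2))
score_3 = [s for s in states if sum(s) == 3]   # only the pair {1, 2}
print(len(states), len(score_3))               # 21 1
print(len(score_3) / len(states))              # 1/21 ~ 0.048, not 1/18 ~ 0.056
```

Under that hypothetical equal weighting the answer would be 1/21 rather than 1/18; the real point is only that the enumeration of independent states changes once unordered configurations count as single states.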
Let us go on now to the first non-trivial distribution, which again we will do with the help of an example. Take a coin and toss it, and look at the statistics of the coin tosses. In fact, let us make it a little more general and say the coin is such that the probability of heads is p and that of tails is q, where p is some number between 0 and 1. I am going to toss this coin N times, and that is my experiment: I take this coin and toss it N times and lay the outcomes out, or equivalently take N identical coins, toss them, and lay them out, and ask for the total number of heads. That is my random variable, the total number of heads n, and I must talk about its sample space.

What is its sample space? It is 0, 1, 2, ..., N: there are N + 1 possibilities, and any given value of n labels my macrostate. In a given toss, the probability of success, if I call heads a success, is p, and that of failure is q = 1 − p. I now ask for the probability that this random variable takes on one particular value from its sample space. I should be a little careful here: in mathematics you would distinguish between the symbol for the random variable per se and the symbol for one of its values from the sample space. Unless we need the distinction, I am going to use the same symbol for both; you have to be a little careful, but in context it is very clear what is meant.

So what is P(n)? The tosses are independent of each other and the coins are all exactly identical. To get exactly n heads, since the tosses are independent, the probability must be proportional to p^n, and the rest of the tosses must give me tails, because I want exactly n heads; that contributes q^(N−n). This is the probability of getting any one particular sequence of N letters such that n of them are H's and the rest are T's. But I do not care about the order in which these heads are achieved at all, so I must multiply by the number of such microstates, which is C(N, n), the binomial coefficient. So the distribution is P(n) = C(N, n) p^n q^(N−n), and for obvious reasons it is called the binomial distribution: C(N, n) is just the coefficient of p^n q^(N−n) in the binomial expansion of (p + q)^N.

The questions of interest are things like: what is the average value of n, what is the variance, what is the scatter, what are the various factorial moments and other properties of this distribution, and so on, all of which can be deduced from this formula. It is, by the way, a trivial matter to verify that the sum of P(n) from n = 0 to N equals 1: the sum is just (p + q)^N, and p + q is unity. So this probability distribution is normalized. Incidentally, a trial in which the probability of success is some p and that of failure is 1 − p is called a Bernoulli trial, and we are doing N Bernoulli trials.
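As a quick sanity check on this normalization, a minimal Python sketch (the function name and parameter values are mine):

```python
from math import comb

def binomial_pmf(n, N, p):
    """P(n) = C(N, n) p^n q^(N - n): exactly n heads in N independent tosses."""
    return comb(N, n) * p**n * (1 - p)**(N - n)

N, p = 10, 0.3                                   # illustrative values
pmf = [binomial_pmf(n, N, p) for n in range(N + 1)]
print(sum(pmf))                                  # 1.0 up to rounding: normalized
```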
The result for this random variable, the number of successes n, is a binomial distribution of this kind. What we need are quantities like the average or mean value, which I denote with angular brackets, ⟨n⟩, and things like the variance of n. How is the variance defined? It is the mean of the square minus the square of the mean: var(n) = ⟨n²⟩ − ⟨n⟩². Can the variance be negative? No. How do we know? Is there a way of rewriting the variance so that we know for sure it cannot be negative? Yes: start with n minus the mean value, the deviation from the mean, square it so that it becomes non-negative, and then take its average. It is a trivial matter to verify that ⟨(n − ⟨n⟩)²⟩ is equal to ⟨n²⟩ − ⟨n⟩². So the variance cannot be negative. Can it be zero? There is only one condition under which it can be zero, only one possibility: if the random variable always has the value equal to its mean, which means it is a sure variable, no longer a random variable. The moment you have a random variable which can take on more than one possible value, the variance cannot be zero; it has got to be positive.

The standard deviation is defined as the positive square root of the variance, and it is sometimes denoted by Δn. The standard deviation measures, in some specific sense, the scatter of the variable about its mean value. Then, of course, you would naturally like to know how big it is relative to the mean, so Δn/⟨n⟩ is called the relative fluctuation. It has other names, but I like this name. Incidentally, if n has physical dimensions, taking this ratio ensures that the relative fluctuation is dimensionless, an absolute number.

We would like to find out what all these quantities are for the binomial distribution. First we have to define them in terms of the distribution: ⟨n⟩ = Σ n P(n), a weighted sum over the sample space, the weights P(n) being positive numbers less than one, and similarly ⟨n^k⟩ = Σ n^k P(n) for any power k. What is the mean value for the binomial distribution? It is Np, as is not hard to show. The quickest way to do this is to use what is called a generating function. The moment you have a probability distribution, whether of a continuous or a discrete variable, we can define a suitable generating function, something which will generate all the moments for you. So let us define the generating function f(z) = Σ P(n) z^n, the sum running over the sample space. Here z is a continuous variable, which could in general be complex; that does not matter, and in fact the function then has nice analytic properties.
So you can identify P(n) as the coefficient of z^n in the expansion of f(z) in a power series in z. And this is utterly trivial for the problem we have at hand, because f(z) = Σ_{n=0}^{N} C(N, n) p^n q^(N−n) z^n, and this is just a binomial expansion: f(z) = (pz + q)^N. What should f(1) be? It should be unity, because if you set z = 1, f(1) is just the sum over all the probabilities. Is it equal to 1 here? Yes indeed, because p + q is 1; the normalization condition is satisfied.

What is f′(1) equal to? Go back to the definition: if I differentiate with respect to z, I get Σ n P(n) z^(n−1), and setting z = 1 gives Σ n P(n), which is just the average value ⟨n⟩. From the closed form, f′(1) = Np. What happens if I differentiate a second time and then set z = 1? The first differentiation pulls down an n, the second pulls down an n − 1, so f″(1) = ⟨n(n − 1)⟩. From the closed form, differentiating once gives Np(pz + q)^(N−1), and differentiating a second time gives N(N − 1)p²(pz + q)^(N−2), so f″(1) = N(N − 1)p².

So we have our first interesting result: ⟨n²⟩ − ⟨n⟩, which is what ⟨n(n − 1)⟩ is, equals N²p² − Np². The mean square ⟨n²⟩ is this plus the mean value Np, and if I subtract the square of the mean, N²p², what remains is Np − Np² = Np(1 − p) = Npq. That is the variance: it is just Npq, and the standard deviation is √(Npq). So what is the relative fluctuation? It is √(Npq)/(Np) = √(q/(Np)), and the important thing is that this is proportional to 1/√N: the larger the number, the less the scatter about the mean, relative to the mean. This is a very, very crucial fact, and we will look at a simple physical example shortly to see where it gets us.

By the way, you can generalize this a little bit: ⟨n(n − 1)⋯(n − k + 1)⟩ = d^k f/dz^k evaluated at z = 1. This is called the k-th factorial moment, and once you have the k-th factorial moment you can, with simple algebra, write down the k-th moment itself. The higher moments themselves are not that useful. Just as in the case of the second moment, where what was much more physically relevant was ⟨n²⟩ minus ⟨n⟩², namely the variance, we need similar subtractions for the higher moments. What are these quantities called, where you subtract out all the, quote unquote, irrelevant parts of the lower moments from the k-th moment, the higher analogs of the variance? These are called the cumulants.
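Here is a short symbolic check of these generating-function manipulations, assuming the sympy library is available (the variable names are mine):

```python
import sympy as sp

z, p, N = sp.symbols('z p N', positive=True)
q = 1 - p
f = (p*z + q)**N                        # generating function of the binomial

mean = sp.diff(f, z).subs(z, 1)         # f'(1) = <n>
fact2 = sp.diff(f, z, 2).subs(z, 1)     # f''(1) = <n(n - 1)>
var = sp.simplify(fact2 + mean - mean**2)

print(mean)                             # N*p
print(var)                              # N*p - N*p**2, i.e. Np(1 - p) = Npq
```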
The cumulants of the distribution are a very important notion, and we will come to them; there is a very important reason why the cumulants are more useful than the moments themselves, a significant property of the cumulants which is not shared by the moments. We will see what that is.

So let us give a physical example. Assume we have a classical ideal gas of N particles in a big volume V, like this room; let us assume it is closed and in thermal equilibrium, with classical particles. Then there is an average number density ρ = N/V. So you have these particles moving about, with this average number density. Now I ask the following question: suppose I take a small volume v, smaller than V, and focus on just this sub-volume. Of course, the number of particles in this sub-volume is changing, and changing extremely rapidly; there are rapid fluctuations. But suppose I take an instantaneous snapshot and ask: at any given instant of time, what is the probability P(n) that there are exactly n particles inside this sub-volume v? We are going to assume the whole thing is in equilibrium, so the statistical properties do not change with time, and all the particles are assumed to move completely independently of each other.

So it is like a whole lot of Bernoulli trials. The probability that any given molecule is in this sub-volume is just the ratio of this volume to the total volume, v/V, since the distribution in space is completely unbiased and uniform. You want n of them inside, which gives the factor (v/V)^n, multiplied by the requirement that the rest of them be outside, which gives (1 − v/V)^(N−n); and you do not care which of the molecules is inside and which is outside, so there is a factor C(N, n). And n runs from 0 to N, exactly as in the binomial distribution. This is therefore a binomial distribution, parameterized by N and by the ratio v/V: a set of Bernoulli trials in which the probability of success is v/V and that of failure is 1 − v/V.

What is the average value going to be? It is Np, which is N times v/V, but that is precisely ρv, and that is exactly what you would expect: if ρ is the average number density, then the average number of particles in the little volume v is just the number density multiplied by the volume.

So now you can ask for the probability of a fluctuation of a given magnitude from the mean, and we already saw that the relative fluctuation, the standard deviation relative to the mean, is proportional to 1/√N. If you have 10^24 particles in this room, that is of order 1 part in 10^12, which is exceedingly small. This is the reason why we do not all suffocate: there is a finite probability that all the molecules will spontaneously congregate in one small sub-volume at some instant of time, causing us to suffocate, but that is extremely unlikely; the probability is vanishingly small.
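A minimal simulation sketch of these number fluctuations, assuming numpy and using illustrative parameter values of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
N, frac = 100_000, 0.1                  # N particles, sub-volume fraction v/V

# Each particle is in the sub-volume independently with probability v/V,
# so the count n over repeated snapshots is binomial(N, v/V).
snapshots = rng.binomial(N, frac, size=10_000)

print(snapshots.mean(), N * frac)       # sample mean vs rho*v = N v/V
print(snapshots.std() / snapshots.mean(),
      np.sqrt((1 - frac) / (N * frac)))  # relative fluctuation ~ 1/sqrt(N)
```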
The probability of a significant scatter about the mean becomes smaller as the number N increases, and when N reaches Avogadro's number it is just out of this world completely. That is why thermodynamics works: thermodynamics is the science of averages. We are very often interested in fluctuations about this average, and generally, in a thermodynamically stable state, the relative fluctuation about the mean is of the order of 1 over the square root of the number of degrees of freedom, which is already of the order of Avogadro's number. That is the reason why it works, why you can get away with just the averages without looking at fluctuations, except when the system goes near a phase transition, or when critical phenomena start playing a role, and the fluctuations get bigger and bigger until they become as big as the system itself in some sense. Then the relative fluctuation, the standard deviation of physical quantities, can become as big as the mean itself, and then you are in trouble: you have a failure of this very robust 1/√N formula, and of course you have to look at things afresh.

So that was our simplest example of a distribution, the binomial distribution. We are going to look at number fluctuations in other kinds of gases also; this was just one kind of classical gas. But before that, let us look at another problem with a related distribution, and this is the geometric distribution. I go back to my coin toss. I take a single coin, with probability p of a head and 1 − p of a tail, and I keep tossing it, and I ask: what is the probability that I get a head for the first time on the (n + 1)-th toss? I have made it n + 1 so that n can start running from 0, 1, 2, 3, and so on; the first toss corresponds to n = 0.

So what would this probability be? Clearly, the first n tosses must all be failures, with probability q each time, which contributes q^n, and the last toss must be a success. So P(n) = q^n p. Is this probability normalized? We need to check that. What is the sample space of n? It is 0 to infinity, because you could have really bad luck and keep going indefinitely. So we need Σ_{n=0}^{∞} P(n) = 1: the p comes out of the sum, Σ_{n=0}^{∞} q^n is of course the geometric series, equal to 1/(1 − q) = 1/p, the p cancels, and you get 1. By the way, you can see why this is called the geometric distribution: the probabilities form a geometric sequence. Now, what is the mean value of n? As before, the obvious thing to do is to find the generating function; then everything becomes completely simple.
So let us do that. The generating function is f(z) = Σ_{n=0}^{∞} P(n) z^n, and in this problem it is simplicity itself: multiply q^n p by z^n and sum over n, and the geometric series gives f(z) = p/(1 − qz). The first thing to do is to check whether f(1) = 1: set z = 1 and you get p/(1 − q) = p/p = 1, so it is normalized, no problem. And what is the derivative at z = 1? Differentiate with respect to z: f′(z) = pq/(1 − qz)², and setting z = 1 gives pq/p², so the average number is ⟨n⟩ = q/p. Let us just be careful that the derivation is right: the distribution is normalized, and differentiating p/(1 − qz) does give a (1 − qz) squared downstairs and a pq upstairs. Yes, that is fine.

You can see that this is a reasonable answer: as the probability p of a head becomes smaller and smaller, the average number of tosses you wait is going to increase, until, as p goes to 0, it goes to infinity; and in the converse limit, when p = 1 and q = 0, you get a head straight away on the first toss, which corresponds to n = 0. So this is perfectly all right as it stands. What about the variance and so on? You can write these down explicitly.

One way to do this is to parameterize the distribution slightly differently. Instead of ⟨n⟩, let me call the mean μ, the usual symbol for the mean of a random variable. Then μ = (1 − p)/p, or p = 1/(1 + μ), and q = 1 − p = μ/(1 + μ). So another way of writing the geometric distribution is P(n) = [1/(1 + μ)] [μ/(1 + μ)]^n. Once you identify this form, you can read off what μ, the average value, is. It is then a trivial exercise, by differentiating a second time and so on, to find out what the variance is; I leave that to you.
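A small simulation sketch of the geometric distribution as a waiting time, with an illustrative value of p chosen by me:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.25                                # probability of heads on each toss

def tails_before_first_head():
    """Toss until the first head; return n, the number of tails before it."""
    n = 0
    while rng.random() >= p:            # a tail occurs with probability q = 1 - p
        n += 1
    return n

samples = [tails_before_first_head() for _ in range(100_000)]
print(np.mean(samples), (1 - p) / p)    # sample mean vs mu = q/p = 3
```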
Now, again, let us ask for a physical problem where a geometric distribution comes up. For the binomial distribution we saw a very simple example: number fluctuations in a classical ideal gas in thermal equilibrium. Is there a similar kind of example for the geometric distribution? You have all seen one, but in a slightly different language, so you may not recognize it as such. Instead of a gas of particles, think of a gas of photons inside a cavity, a black-body cavity; think of black-body radiation. What happens there? It is a collection of photons sitting inside a cavity whose walls are held at a fixed temperature, and this gas equilibrates. Unlike a gas of massive particles, which comes to thermal equilibrium because of all the elastic collisions going on, so that if any one particle has too much energy the other fellows soon knock it down towards the average, in a photon gas there is no direct interaction between photons. The direct photon-photon interaction is extremely small; there is a small correction due to quantum electrodynamics, but the cross-section is negligible for these purposes.

So here is a gas of non-interacting particles: they do not interact with each other, they do not even undergo elastic collisions with each other; they are photons, massless particles. But they equilibrate because the photons are absorbed by the atoms in the walls of the cavity and re-emitted, and this process of interaction with the walls keeps the system in thermal equilibrium. One of the properties of a photon is that it has zero rest mass, so in principle you can create and destroy, absorb and emit, any number of photons. The number of photons is not conserved; there is no number conservation for a gas of photons. In fact the number fluctuates quite widely, and we would like to know the probability that, at any given instant of time, in a black-body cavity at a given temperature, you have a certain number of photons. It is a fluctuating random variable.

Now, of course, a photon is described by specifying the direction of its momentum and its wave number, which is equivalent to specifying its frequency, because the photon frequency and wave number satisfy the relation ω = ck, where ħk = p is the momentum of the photon and ω = 2πν is the angular frequency. Remember that the energy of a photon is ε = ħω = hν, depending on whether you like to use the angular frequency or the frequency itself, and ħ = h/2π, where h is Planck's constant. So for a photon you have to specify its wave number, or equivalently its wavelength or frequency; they are all equivalent in this language. In addition, you have to say what its state of polarization is. Every free photon has two possible states of polarization: it could be either left circularly polarized or right circularly polarized; correspondingly, its spin quantum number in the direction of its motion is either +1 or −1. These are the two helicity states of the photon, the two polarization states.

Let us simplify matters and not worry about all of these things: we are simply asking, for a given frequency ν and a given state of polarization, what is the probability that you have n photons in this cavity? How would we go about it? The system is in thermal equilibrium with the cavity, so I ask for the probability P(n) that there are n photons of the given frequency ν and state of polarization; and n has no limit on it, it can go from 0 to infinity, because, as I said, the number of photons is not conserved. What is this proportional to? If the system is in equilibrium with a heat bath at some temperature T, then P(n) has to be proportional to e raised to minus the energy of these n photons, which is nhν, divided by Boltzmann's constant multiplied by the absolute temperature. That is what equilibrium statistical mechanics prescribes for us in the canonical ensemble: the probability of a state is proportional to e raised to minus its energy over k_B T. There is a standard symbol β for 1/(k_B T), so P(n) ∝ e^(−βhνn). Here k_B is Boltzmann's constant, whose value in standard international units is about 1.38 × 10^(−23) joules per kelvin.
Boltzmann's constant is just a conversion factor from temperature to energy. So you have this formula, and already you see that this is a geometric distribution, because e^(−βhν) is a number less than 1, raised to the power n, exactly like the q^n that we had. So this is basically a geometric distribution; all we need to do is to multiply by the p that normalizes it. If I want to write an equality, I must multiply by p = 1 − q, which here is 1 − e^(−βhν), and that is it: P(n) = (1 − e^(−βhν)) e^(−βhνn), a geometric distribution.

What is the average number? We have a formula: it was q/p. So it is ⟨n⟩ = e^(−βhν)/(1 − e^(−βhν)), which is also equal to 1/(e^(βhν) − 1). Do you recognize this factor? It is the Bose factor from Bose statistics. It is right there: with a very simple argument, and with certain things pushed under the rug, this is exactly the Bose factor, and you begin to see where it comes from, just from the canonical ensemble, normalized.

Of course, there were assumptions made here, which happen to be correct in this case. The photons do not interact with each other; it is an ideal gas. And there was one more assumption: that it does not cost any energy to add one more photon to the statistical collection, in other words that the chemical potential is 0. Otherwise you would have β times (energy minus the chemical potential) in the exponent. But the chemical potential of a photon gas is 0, because the photon has zero rest mass; for bosons the chemical potential cannot be positive, it has got to be negative or zero, and in the extreme case of photons it is 0. So we have the Bose factor here: this is the average number for a given frequency ν and a given state of polarization. Of course, if you want the actual total photon number over all frequencies, you must integrate this with respect to frequency and multiply by 2 for the two polarization states; but we already have the Planck formula in some sense, and we now have an expression for the mean number, the variance, and so on.

What is the variance going to be? All you have to do is take the generating function, differentiate it, plug in, and find out. I leave it as a simple exercise for you to show that Δn/⟨n⟩ is greater than 1 in this problem. So it is a huge scatter: whereas for the classical ideal gas the relative fluctuation was not that large, going like 1/√N, here it is actually larger than unity. This shows that this is a tremendously fluctuating system, in which the photon number really fluctuates enormously; the relative fluctuation is huge compared to the classical ideal gas. The effects of quantum statistics show up here in this very indirect fashion. There are other examples of the geometric distribution that we will come across as we go along, but I thought it would be a good idea to have one physical problem here, black-body radiation. We are going to look at radiation in other states, for example inside a laser cavity, and then the statistics are going to change completely, as we will see.
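A short numerical sketch of the single-mode photon statistics just derived (the function name and the sample values of βhν are my own):

```python
import numpy as np

def mode_statistics(beta_h_nu):
    """Mean and relative fluctuation of the photon number in one mode."""
    q = np.exp(-beta_h_nu)              # e^(-beta h nu), the 'failure' probability
    mean = q / (1 - q)                  # <n> = 1/(e^(beta h nu) - 1): Bose factor
    var = q / (1 - q)**2                # variance of the geometric distribution
    return mean, np.sqrt(var) / mean

for x in (0.1, 1.0, 5.0):               # sample values of beta h nu
    print(x, *mode_statistics(x))       # relative fluctuation exceeds 1 throughout
```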
The next question to ask is: what happens in a set of Bernoulli trials, what happens to the binomial distribution, if the probability p goes to 0 and the number of trials N goes to infinity, in such a way that the product Np tends to some finite number μ in the limit? So the mean does not change, but the probability of success in every given trial becomes vanishingly small as the number of trials is increased, keeping the mean the same as before. What happens to the binomial distribution C(N, n) p^n (1 − p)^(N−n) in this limit? It is not a very hard question to answer. The first thing to recognize is that the sample space of n will now go all the way to infinity: earlier it ran from 0 to N, but N is going to infinity, so n now runs over all the non-negative integers, from 0 to infinity.

What would you do? Well, C(N, n) = N!/(n!(N − n)!), and when you have the factorial of a very large number, you use the so-called Stirling approximation. I would like to call it Stirling's formula, because it is not much of an approximation; it is an extremely good one. What does it say? It says that n! is of the order of n^n in the leading approximation; but replacing all the factors in n! by n is of course an overestimate, so you have to kill that overestimate, and the correction goes like e^(−n); and if you do it a little more carefully and normalize it, you get a factor √(2πn). So the leading term is n! ≈ n^n e^(−n) √(2πn), and what you actually have is this leading term times 1 plus corrections which die away as n increases: the first correction is of order 1/(12n), plus terms of order 1/n², and so on, a whole infinite series, an asymptotic series. But the first term is already extremely good, and as n becomes bigger and bigger it gets better and better, because the first correction goes like 1/(12n): already at n = 10 that is a correction of about 1 percent.

What happens when n = 1, where you might expect it to be a very bad thing to do? If the formula were exact it would say 1 ≈ e^(−1) √(2π); the question you are asking is whether √(2π) is approximately equal to e. But we know exactly how good it is going to be: it is correct to about 1 part in 12, so roughly 92 percent accuracy, an error of only about 8 percent, even at n = 1. At n = 10 the error is already of the order of 1 percent; at n = 1000 it is much better; and when n is Avogadro's number, forget about it, it is almost exact. So it is an extremely good formula. The reason, of course, is that you can write the factorial as an integral: n! = ∫_0^∞ dx x^n e^(−x). The factor x^n is an increasing function of x, rising more and more steeply as n becomes larger, while e^(−x) starts at 1 and comes down very rapidly, exponentially. So you have an integrand in which one factor goes up while the other comes down; the result is that the product of the two is essentially zero almost everywhere except in a small region.
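A quick numerical check of how good the leading term of Stirling's formula is (a sketch; the range of n is kept small to avoid floating-point overflow):

```python
from math import e, factorial, pi, sqrt

def stirling(n):
    """Leading term of Stirling's formula: n! ~ n^n e^(-n) sqrt(2 pi n)."""
    return (n / e)**n * sqrt(2 * pi * n)

for n in (1, 10, 100):
    print(n, 1 - stirling(n) / factorial(n))   # relative error ~ 1/(12 n)
```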
Now, going back to the integral: do a Gaussian approximation around that small region. It is essentially a Gaussian integral, and you get Stirling's formula, the leading term including the √(2πn). I leave that as an exercise to you, and the corrections are going to be as small as the Gaussian approximation to the integral is good; that is why they are so small. So this is a sort of Gaussian approximation to an integral, and it is very powerful.

So use Stirling's formula here, and with a little bit of algebra the result that emerges is the following. You can see where the exponential is going to come from: p can be written as μ/N, so you have a factor (1 − μ/N)^N, and when N goes to infinity that tends to e^(−μ). The remaining portions give μ^n/n!, so the distribution becomes P(n) = e^(−μ) μ^n/n!. What do you call this distribution? It is the Poisson distribution. It is a one-parameter distribution, specified by the value of μ, and μ is the mean value: that is where it started, and that is where it remains.

What is the generating function of the Poisson distribution? It is f(z) = Σ_{n=0}^{∞} P(n) z^n = e^(−μ) Σ (μz)^n/n!, and the sum is just the exponential e^(μz), so f(z) = e^(μ(z−1)). This exponential form of the generating function of the Poisson distribution has all sorts of miraculous properties, as we will see; the product of two exponentials is again an exponential, and that is going to play a big role. Is f(1) = 1? Yes indeed. And what is f′(1)? That is the mean value, μ. A small numerical sketch of this Poisson limit follows below. So we stop here, and I am going to take it up from this point next time.
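As mentioned above, here is a quick numerical sketch of the binomial-to-Poisson limit (parameter values are mine):

```python
from math import comb, exp, factorial

mu, n = 3.0, 3                          # illustrative values

def poisson(n, mu):
    """P(n) = e^(-mu) mu^n / n!"""
    return exp(-mu) * mu**n / factorial(n)

# Binomial(N, p = mu/N) approaches Poisson(mu) as N grows with Np = mu fixed.
for N in (10, 100, 10_000):
    p = mu / N
    print(N, comb(N, n) * p**n * (1 - p)**(N - n), poisson(n, mu))
```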