Welcome back. Last class we were discussing the probability generating function, which is the transform typically used for integer valued random variables. There is a small clarification I have to make about it: the regions of convergence of the PGF, as I said, are discs for non negative integer valued random variables, and if the random variable potentially takes negative values they are generally annular regions; but the region of convergence always contains the unit circle. So that is the clarification I wanted to make.

Today we will study the moment generating function, which is defined in general for any random variable, not necessarily discrete or continuous or otherwise. The reason it is called the moment generating function, as we will see, is that it generates the moments, the expected $n$th powers of the random variable. The MGF is a function $M_X : \mathbb{R} \to [0, \infty]$ given by $M_X(s) = \mathbb{E}[e^{sX}]$. The domain, or region of convergence, of $M_X$ is the set $D_X = \{ s : M_X(s) < \infty \}$. So the moment generating function of a random variable $X$ is like an exponential moment, the expectation of $e^{sX}$, where $s$ is the parameter. Obviously this may not exist everywhere, and the set on which it is finite is called the domain, or region of convergence, of the moment generating function.

Now, notice that I have defined it as a function that maps $\mathbb{R}$ to $[0,\infty]$; what I am essentially saying is that $s$ is some real number. Strictly speaking, the moment generating function ought to be considered a function of a complex variable; it is analogous to the Laplace transform, as we will see very soon, so strictly speaking we should allow $s$ to be complex. But throughout we will be a little sloppy about this. Part of the reason is that we have not really defined expectation for complex valued random variables: we have only spoken of $\mathbb{E}[X]$ for real valued $X$, and if $s$ is complex then $e^{sX}$ is complex, and we have not described complex random variables in any detail. So ideally we should allow $s$ to be complex, but I am sweeping that under the carpet; I am not going to be very serious about this. For the same reason, many of the theorems and results on the moment generating function really require a fair amount of complex analysis to prove and understand properly, so I will only be quoting results and using some properties from complex analysis, without being very serious about deriving or proving them.

Now, $M_X(s) = \mathbb{E}[e^{sX}]$, so if $X$ is discrete you just sum against the PMF: if $X$ is discrete with PMF $p_X$, then $M_X(s) = \sum_x e^{sx} p_X(x)$. And if $X$ is continuous with density $f_X$, then $M_X(s) = \int e^{sx} f_X(x)\, dx$; you can call the integrator $d\lambda$ or $dx$, whatever you want, you are basically integrating over $x$, so to be clear let me just write $dx$.
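To make the discrete formula concrete, here is a minimal Python sketch (my own illustration, not part of the lecture) that evaluates the sum $\sum_x e^{sx} p_X(x)$ directly; the Bernoulli example and the helper name `mgf_discrete` are my assumptions:

```python
import numpy as np

def mgf_discrete(s, support, pmf):
    """Evaluate M_X(s) = sum_x e^{s x} p_X(x) for a discrete random variable."""
    support = np.asarray(support, dtype=float)
    pmf = np.asarray(pmf, dtype=float)
    return float(np.sum(np.exp(s * support) * pmf))

# Example (mine, not from the lecture): Bernoulli(p) has M_X(s) = (1-p) + p e^s.
p = 0.3
for s in (-1.0, 0.0, 0.5):
    print(s, mgf_discrete(s, [0, 1], [1 - p, p]), (1 - p) + p * np.exp(s))
```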
It is in this sense that, for continuous random variables, the moment generating function is exactly analogous to the Laplace transform, except for the little minus sign you usually put in the Laplace transform; otherwise it is the same thing. And just as in the Laplace transform you consider $s$ a complex number, ideally that is what you should do here as well, to be precise. But we will not do that, just to keep things simple and to avoid contour integration and so on. Any questions on this?

Let me give you some examples. Take an exponential random variable with $f_X(x) = \mu e^{-\mu x}$ for $x \ge 0$. Then $M_X(s) = \int_0^\infty \mu e^{-\mu x} e^{sx}\, dx = \frac{\mu}{\mu - s}$, and this is valid for $s < \mu$. For now we are treating $s$ as a real number, so we just do this integral, and you can see that if $s < \mu$ the decay of $e^{-\mu x}$ wins and the integral simplifies to $\mu/(\mu - s)$. But if $s$ were bigger than $\mu$, then $e^{sx}$ grows faster than $e^{-\mu x}$ decays, so you get infinity; in that case the moment generating function is $+\infty$. There is one remark I want to make here: if you were doing this as a Laplace transform, it would become an integral of a complex function, and technically, in order to be rigorous, you would have to do it as a contour integral, as some of you who have studied complex analysis know. The thing is, you get the same answer, and the answer is valid for $\operatorname{Re}(s) < \mu$. I am ignoring that bit here; I am just treating $s$ as a real number and blindly integrating. Is this clear?

Second example, let us do a Gaussian, in fact a standard Gaussian: $f_X(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$ for $x \in \mathbb{R}$. So I have to write out $M_X(s) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^2/2} e^{sx}\, dx$. How do you do this integral? There is a very standard trick: you complete the square in the exponent; you remember that, it is an old trick. You multiply by $e^{s^2/2}$ and compensate by $e^{-s^2/2}$, and then you have a perfect square in the exponent, $e^{-(x-s)^2/2}$. Are you with me? This is just algebra, a standard trick, and the remaining integral is a standard Gaussian integral, so it becomes 1. The answer is therefore $M_X(s) = e^{s^2/2}$, and this is valid for all $s \in \mathbb{R}$; in fact this moment generating function is valid for all complex values of $s$ as well, in the whole complex plane. The exponential one, by contrast, is not valid in the whole complex plane; it is valid for $\operatorname{Re}(s) < \mu$ if $s$ were complex.
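As a quick numerical sanity check of the exponential example (my own sketch, not from the lecture; the rate $\mu = 2$ is an arbitrary choice), one can integrate numerically and compare with $\mu/(\mu - s)$:

```python
import numpy as np
from scipy.integrate import quad

mu = 2.0  # rate of the exponential; my arbitrary choice

def mgf_exponential_numeric(s):
    """Numerically integrate e^{sx} * mu e^{-mu x} over [0, infinity)."""
    val, _ = quad(lambda x: np.exp(s * x) * mu * np.exp(-mu * x), 0, np.inf)
    return val

for s in (-1.0, 0.0, 1.0, 1.9):     # all s < mu, so the integral converges
    print(s, mgf_exponential_numeric(s), mu / (mu - s))
# For s >= mu the integrand grows and the MGF is +infinity.
```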
So far I am doing all of this with real integrals, which is, technically speaking, not fully correct; when we do the Fourier transform, that is, characteristic functions, this issue will bite us even more.

That was the exponential and the Gaussian; finally, let me do the Cauchy. You have $f_X(x) = \frac{1}{\pi}\frac{1}{1+x^2}$ for $x \in \mathbb{R}$. Now try to do the integral $M_X(s) = \int_{-\infty}^{\infty} \frac{e^{sx}}{\pi(1+x^2)}\, dx$. What happens in this case? If $s = 0$ you get 1: you are just integrating the PDF itself. By the way, if $s = 0$ the whole thing is always 1; that is one of the properties I am going to state. But, again pretending $s$ is real, if $s > 0$ you have a growing exponential competing with a decaying $1/x^2$ type of term, so the integral is infinite. And if $s$ is negative you have the opposite problem, because on the negative side $e^{sx}$ is growing and $1/x^2$ is again only a polynomial decay. Are you with me? So you can conclude that if $s \ne 0$ the integral is always infinite. Restricting $s$ to real values, the MGF is finite only for $s = 0$; the region of convergence in this case is just the single point $s = 0$ on the real line. And $s = 0$ is always part of the region of convergence, because $M_X(0) = 1$ by definition.

So there are three different cases. For the exponential, there is a certain range where the MGF is finite and a certain range that is not in the region of convergence. For the Gaussian, $M_X(s)$ exists for all values of $s$, so the region of convergence is the entire real line. And for the Cauchy there is only one point in the region of convergence, the origin; the origin is always in the region of convergence, and in this case that is all there is, no further points. Is that clear?

That brings us to the question: is it enough to specify the moment generating function in order to specify the distribution? If I just tell you the moment generating function and its region of convergence, is that enough to specify the distribution of the random variable, the density in this case, completely? For the exponential, $\mu/(\mu - s)$ is the moment generating function, but if I only give you that, with its region of convergence, can you conclude uniquely that it is this PDF? Or, for that matter, for the Gaussian: if I give you $e^{s^2/2}$ for all $s$ as my moment generating function, can you tell me that the random variable is the standard Gaussian? More generally, if I give you the moment generating function and its region of convergence, can you tell me the distribution uniquely? What do you think, yes or no? Who says no?
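A small numerical illustration of why the Cauchy MGF diverges off the origin (again my own sketch, not from the lecture): truncate the integral at $\pm T$ and watch what happens as $T$ grows.

```python
import numpy as np
from scipy.integrate import quad

def truncated_cauchy_mgf(s, T):
    """Integral of e^{sx} / (pi (1 + x^2)) over [-T, T]."""
    val, _ = quad(lambda x: np.exp(s * x) / (np.pi * (1 + x**2)), -T, T)
    return val

for T in (5, 10, 20, 40):
    print(T, truncated_cauchy_mgf(0.0, T), truncated_cauchy_mgf(0.5, T))
# The s = 0 column settles near 1; the s = 0.5 column keeps growing with T.
```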
Suppose I give you this: I tell you that the MGF is equal to 1 for $s = 0$ and everywhere else it is undefined, infinity actually. Then what do you say: does it have to be Cauchy? In fact, I can take a density proportional to $1/(1 + |x|^3)$ instead and you get the same answer, correct? If it were not $\frac{1}{\pi}\frac{1}{1+x^2}$ but a density proportional to $1/(1 + |x|^3)$, with some other constant in front, by the same argument you can easily conclude that the moment generating function is again finite only for $s = 0$ and undefined, or $+\infty$, for $s \ne 0$; or one proportional to $1/(1 + x^4)$, for that matter. So what do you think? Generally the answer is no. I have just shown that if I give you the moment generating function only like this, at a single point, there is no way you can figure out the distribution. But it seems like, if I tell you something like $\mu/(\mu - s)$ or $e^{s^2/2}$, you can uniquely identify what the distribution is.

The complete answer to this turns out to be very interesting, but in order to appreciate it you have to know some tools from the theory of analytic functions. The answer is the following: if you give me the moment generating function not at one point, but on some tiny neighborhood of the origin $s = 0$, then that is enough to specify the density uniquely; the distribution is uniquely specified by specifying the moment generating function on any tiny interval around the origin. The interval can be very small or very big, it does not matter. You do not even have to tell me, for example, that $e^{s^2/2}$ is the moment generating function on all of the real line; if you just tell me this is the form on a tiny interval around the origin, then it has to be Gaussian in this case, and in the other case it has to be exponential, and so on. But if you give it to me at just one point, it is clearly not enough.

I will state this as a theorem without proof. This again goes back to the theory of analytic functions; you can think of analytic functions as these very nice functions which get completely specified once you specify them on a small interval. Suppose $M_X(s)$ is finite for all $s \in (-\epsilon, \epsilon)$, where $\epsilon > 0$; $\epsilon$ is any positive number, it can be as small as you like, but it cannot be 0, it has to be strictly positive. Then $M_X(s)$ uniquely determines the CDF of $X$: if $X$ and $Y$ are two random variables such that $M_X(s) = M_Y(s)$ for all $s$ in some tiny interval $(-\epsilon, \epsilon)$, again with $\epsilon$ positive, then $X$ and $Y$ have the same CDF. This is the uniqueness of the moment generating function. It is not enough to specify it at a point or a few points, but if you specify the moment generating function on an interval, no matter how tiny, the CDF is uniquely specified. Furthermore, if you have two random variables, and let us say you know nothing else about them, but you do know that they have the same moment generating function, not necessarily everywhere, just on some very tiny interval around the origin, then they have the same CDF.
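For reference, here is the statement written out cleanly in LaTeX (this is just a transcription of what is said above, not an additional result):

```latex
% Uniqueness of the MGF, as stated in the lecture.
\textbf{Theorem.}\ Suppose there exists $\epsilon > 0$ such that
$M_X(s) < \infty$ for all $s \in (-\epsilon, \epsilon)$. Then $M_X$
restricted to $(-\epsilon, \epsilon)$ uniquely determines the CDF of $X$.
In particular, if $X$ and $Y$ satisfy $M_X(s) = M_Y(s)$ for all
$s \in (-\epsilon, \epsilon)$, then $F_X(x) = F_Y(x)$ for all $x \in \mathbb{R}$.
```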
See, all moment generating functions agree at the origin, because $M_X(0) = 1$; but I am saying that $M_X(s)$ and $M_Y(s)$ do not just agree at the origin, they agree on a tiny interval around the origin. Then it is necessarily the case that the moment generating functions agree on their entire regions of convergence, and the random variables have the same CDF. This kind of result is possible because moment generating functions, whenever they exist in a neighborhood, are so called analytic functions of $s$ (of the complex variable $s$, actually), and analytic functions have some very tight properties. You cannot have two analytic functions agreeing on an interval and then differing elsewhere, because you cannot make analytic functions by stitching functions together; they have a very tight structure, and it never works that way. That is why this theorem holds. The proof is beyond our scope, as I said, because it requires machinery from complex analysis.

Note there are no further conditions; I have not said anything else here. If they agree on an interval, they agree everywhere. That actually has nothing specific to do with moment generating functions; it is true for all analytic functions. It just so happens that moment generating functions, whenever they exist on an interval, are analytic. At $s = 0$ the MGF is always defined, with value 1, but it is not clear that it will be defined on an interval around 0, as in the Cauchy case. But if it is defined on an interval, the distribution is uniquely specified, no matter how tiny $\epsilon$ is. And yes, you can have $(-\epsilon_1, \epsilon_2)$ if you like; there is nothing sacred about keeping the same $\epsilon$ on both sides. If the functions agree on $(-\epsilon_1, \epsilon_2)$ they necessarily agree on some $(-\epsilon, \epsilon)$, so there is no loss of generality in taking a symmetric interval; generally, if an analytic function is specified on any interval whatsoever, that is enough. Is that clear?

So, to go back to this: if you give me a function like that and specify an interval on which the formula holds, then I am sure that this one is an exponential and that one is a Gaussian, without even specifying it on, say, the whole real line. If you tell me that the MGF is $e^{s^2/2}$ on a tiny interval around the origin, then the random variable has to be Gaussian. But for the Cauchy you cannot specify it on an interval; it is only defined at one point, and in that case it is not unique.

So that means, at some level, if you give me some functional form like this on an interval, I must be able to invert the moment generating function and get back my CDF or PDF, as the case may be. There are actually explicit formulae, in terms of the moment generating function, to get back the density in this case; these formulae are basically similar to the inverse Laplace transform. But again we will not go into them, because of what they require.
So, what is the inverse Laplace transform like? It is a contour integral, an integral of a complex function over a certain contour. Because that is a little bit beyond our scope, we will not actually perform contour integration to invert any transforms. After all, if I tell you that the MGF is $1/(1 - s)$, you can just do some pattern matching and figure out what the random variable is. So the approach we will take is somewhat more, what I might call, practical rather than very mathematical: if you see a transform like $e^{s^2/2}$, you know it is Gaussian. We will just figure out the transforms of some popular random variables; keep them in your mind, or put them in your formula sheet for your exams, and then you just pattern match. We will not worry about contour integrals. Most of you must be experts on Laplace transforms; some of you have probably memorized whole tables of transforms and such, in which case you can pattern match from memory, and if you are not that type, put them on your formula sheet, no problem.

Now, properties of the MGF. Property 1: $M_X(0) = 1$. That is the most trivial property; it holds by definition. Property 2 is the moment generating property, which is why the function is called the moment generating function in the first place. I will state it as a theorem. Suppose $M_X(s)$ is finite for $s \in (-\epsilon, \epsilon)$, $\epsilon > 0$. Then
$$\frac{dM_X}{ds}\Big|_{s=0} = \mathbb{E}[X],$$
and more generally
$$\frac{d^m M_X}{ds^m}\Big|_{s=0} = \mathbb{E}[X^m].$$
What this is saying is that if you take the derivative of the moment generating function with respect to $s$ and set $s = 0$, you get the expectation, which is the first moment; and generally the $m$th moment is obtained by evaluating the $m$th order derivative at $s = 0$.

If you want to prove this theorem, at a very hand wavy level this is what is happening. You are taking $\frac{d}{ds}\mathbb{E}[e^{sX}]$. If you tolerate me taking the derivative inside the expectation, you get $\mathbb{E}\left[\frac{d}{ds} e^{sX}\right] = \mathbb{E}[X e^{sX}]$, since differentiating $e^{sX}$ with respect to $s$ brings down a factor of $X$. Then setting $s = 0$ gives $\mathbb{E}[X]$. I guess if you are an undergraduate student, this is the proof for you; if you are a graduate student, you will have to question that step. I am essentially taking the derivative inside the expectation, and the expectation is, after all, an integral: $\int e^{sX}\, dP$. So this requires a somewhat serious proof; not a difficult proof, but it will be in your homework, I will put it in your next tutorial. You have to invoke the dominated convergence theorem somewhere; that is your hint. And similarly for the $m$th derivative: if you are willing to buy this step, you can differentiate $m$ times, get a factor of $X^m$ inside, and so obtain the $m$th moment.
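As an illustration of the moment generating property (my own sketch, not from the lecture; the exponential MGF is the one computed earlier), here is a symbolic check with sympy that the derivatives at $s = 0$ produce the exponential's moments $\mathbb{E}[X^m] = m!/\mu^m$:

```python
import sympy as sp

s, mu = sp.symbols('s mu', positive=True)
M = mu / (mu - s)   # MGF of Exponential(mu), valid for s < mu

# m-th moment = m-th derivative of the MGF, evaluated at s = 0
for m in range(1, 4):
    print(m, sp.simplify(sp.diff(M, s, m).subs(s, 0)))
# Should print 1/mu, 2/mu**2, 6/mu**3, i.e. E[X^m] = m!/mu^m.
```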
The only real mathematical content is in that step: you are essentially differentiating inside the integral, and that requires justification; it is not always true. Let me give you one more conceptual hint. The derivative is some kind of a limit, a $\lim_{h \to 0}$ of something, and the expectation is an integral. So you have a limit of an integral, and to interchange them you have to turn it into an integral of a limit. You must invoke either MCT or DCT, the only two ways you know to do it, and in this case it turns out that DCT does the job. I will not say anything more about the proof, because you will supply this step, and then you are fine.

That is property number 2. Property number 3: if $Y = aX + b$, where $a$ and $b$ are real numbers, then $M_Y(s) = e^{bs} M_X(as)$. That is actually a fairly simple change of variable. In particular, as an example, if $X$ is your standard Gaussian and $Y = \sigma X + \mu$, then you can easily show that $Y$ has mean $\mu$ and variance $\sigma^2$, so $Y$ is $N(\mu, \sigma^2)$, and in that case $M_Y(s) = e^{\mu s}\, e^{\sigma^2 s^2/2}$. You know $M_X(s)$ for the standard Gaussian is $e^{s^2/2}$; I am just using this property to compute the moment generating function for the non standard Gaussian $N(\mu, \sigma^2)$ as well. (And yes, to the earlier question: I am saying the interchange of derivative and integral is valid here, but it is not always valid; you cannot arbitrarily interchange derivatives and integrals, you need to invoke the dominated convergence theorem. It requires maybe 4 steps, not 1 step; it is a guided proof that you will do in your homework.)

Finally, property number 4: if $Z = X + Y$ with $X$ and $Y$ independent, then you can show that $M_Z(s) = M_X(s)\, M_Y(s)$. This is because $\mathbb{E}[e^{sZ}] = \mathbb{E}[e^{s(X+Y)}] = \mathbb{E}[e^{sX} e^{sY}]$. Now, you know that $X$ and $Y$ are independent, and therefore any functions of $X$ and $Y$ are independent; so $e^{sX}$ and $e^{sY}$ are independent, and independent random variables are uncorrelated. So this becomes $\mathbb{E}[e^{sX}]\, \mathbb{E}[e^{sY}]$; here is where you are using independence. And similarly, if you have $n$ independent random variables, you just multiply all the moment generating functions to get the moment generating function of the sum. This is valid for discrete, continuous, any kind of random variables; we have not assumed anything in particular. It is exactly like in signals and systems: instead of convolving functions, you multiply Laplace transforms and invert back; that is exactly what you are doing here. If $X$ and $Y$ had densities, say, then you would convolve the densities, but convolution is an awkward operation, so it is better to take transforms, multiply, and transform back.
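Here is a quick Monte Carlo sanity check of property 4 (my own sketch; all parameter values are arbitrary choices), estimating $\mathbb{E}[e^{s(X+Y)}]$ for two independent exponentials and comparing with the product of the two MGFs:

```python
import numpy as np

rng = np.random.default_rng(0)
mu1, mu2, s = 2.0, 3.0, 0.8    # arbitrary choices with s < min(mu1, mu2)

x = rng.exponential(1 / mu1, size=1_000_000)  # numpy's scale parameter is 1/rate
y = rng.exponential(1 / mu2, size=1_000_000)

empirical = np.mean(np.exp(s * (x + y)))          # Monte Carlo E[e^{s(X+Y)}]
product = (mu1 / (mu1 - s)) * (mu2 / (mu2 - s))   # M_X(s) * M_Y(s)
print(empirical, product)                          # should be close
```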
For example, if $X_1$ is Exponential$(\mu_1)$ and $X_2$ is Exponential$(\mu_2)$, with $X_1$ and $X_2$ independent, and $Z = X_1 + X_2$, then $M_Z(s)$ is the product of the two moment generating functions: $M_Z(s) = \frac{\mu_1 \mu_2}{(\mu_1 - s)(\mu_2 - s)}$. I am just multiplying the two moment generating functions, and this is valid for $s$ less than the smaller of the two, $s < \min(\mu_1, \mu_2)$. So this is the moment generating function of the sum. Now, if you want to find the distribution of $Z$, you will have to invert this moment generating function, and here you can use the Laplace transform tricks you are used to: partial fractions. You write it as $\frac{a}{\mu_2 - s} + \frac{b}{\mu_1 - s}$, find $a$ and $b$, and then you know what to do. That is fairly standard; I think you can proceed from here. If $\mu_1$ were equal to $\mu_2 = \mu$, you would get $\left(\frac{\mu}{\mu - s}\right)^2$, and that is the transform of a second order Erlang, because you are adding two exponentials.

Good. Then finally, let me just state the final property before finishing. Suppose you have a sum of a random number of random variables: $Z = \sum_{i=1}^{N} X_i$, where the $X_i$ are i.i.d. with MGF $M_X$, and $N$ is independent of the $X_i$, with PGF $G_N$ and MGF $M_N$. This is a situation you considered before; I am just writing out its moment generating function: $M_Z(s) = G_N(M_X(s)) = M_N(\log M_X(s))$. Here $G_N$ is the PGF from the previous class, and the proof of this, using the law of iterated expectations, is very similar to the proof we derived in the previous class.

This formula is most useful in an example of the following kind: the $X_i$ are i.i.d. exponential with parameter $\mu$, say, and $N$ is geometric with parameter $p$. I think we considered this example earlier, did we not, a geometric sum of exponentials; and what we did there was just brute force, we wrote out the whole thing and figured out the answer. Here you get a very explicit answer. You have $M_X(s) = \frac{\mu}{\mu - s}$ for $s < \mu$, and from the previous class the PGF is $G_N(z) = \frac{pz}{1 - (1-p)z}$, valid whenever the argument satisfies $|z| < \frac{1}{1-p}$. So if I substitute, my geometric sum of exponentials has $M_Z(s) = G_N(M_X(s))$: wherever $G_N$ has $z$, I have to write the whole thing $\frac{\mu}{\mu - s}$. So $M_Z(s) = \dfrac{p\,\frac{\mu}{\mu - s}}{1 - (1-p)\frac{\mu}{\mu - s}}$. I have to take care of the regions of convergence, of course. And if you just simplify this, you should get $M_Z(s) = \frac{\mu p}{\mu p - s}$. Actually, I did not even need to compute that, because I know the answer: it has to be an exponential with parameter $\mu p$. Remember the taxi example, or was it the alpha particles; either example will do.
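A short symbolic check of this composition (my own sketch, not from the lecture) confirms the simplification:

```python
import sympy as sp

s, z, mu, p = sp.symbols('s z mu p', positive=True)

M_X = mu / (mu - s)              # MGF of Exponential(mu), s < mu
G_N = p * z / (1 - (1 - p) * z)  # PGF of Geometric(p), from the previous class

M_Z = sp.simplify(G_N.subs(z, M_X))  # compose: M_Z(s) = G_N(M_X(s))
print(M_Z)   # simplifies to mu*p/(mu*p - s), the MGF of Exponential(mu*p)
```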
You are waiting for taxis, which arrive with exponential inter-arrival durations, and each taxi is free with probability $p$ and occupied with probability $1 - p$. So you have to wait for a geometric number of taxis, each of which takes an exponential amount of time to come, and the total time you wait is exponential with parameter $\mu p$. So this way is very easy, like three steps, rather than doing the big summation as in the previous case. I will stop here.
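And to close the loop, a simulation sketch (mine, not from the lecture; the parameters are arbitrary) confirming that a geometric sum of Exponential$(\mu)$ waits behaves like Exponential$(\mu p)$:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, p, n = 2.0, 0.25, 1_000_000   # arbitrary illustration parameters

N = rng.geometric(p, size=n)          # number of taxis you end up waiting for
# A sum of N i.i.d. Exp(mu) waits is Gamma(N, 1/mu); sample it elementwise:
T = rng.gamma(shape=N, scale=1 / mu)

t = 1.5
print(T.mean(), 1 / (mu * p))               # mean of Exponential(mu*p)
print((T > t).mean(), np.exp(-mu * p * t))  # tail P(T > t) vs exp(-mu p t)
```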