Welcome back. We were discussing the characteristic function and its properties. Yesterday I was in a bit of a hurry to prove the result on uniform continuity, so I will just recap what we said and finish the proof. Here C_X(t) is the characteristic function of some random variable X, and the two properties in question are that C_X is uniformly continuous and that C_X is a non-negative kernel; I think I spelt out yesterday what these two really mean. Uniform continuity means: there exists a function phi(h), with phi(h) -> 0 as h -> 0, such that for all t, |C_X(t + h) - C_X(t)| <= phi(h). This is stronger than continuity but weaker than differentiability; it sits somewhere in between. Plain continuity says only that this difference goes to 0 as h -> 0 at each fixed t; uniform continuity says that, regardless of what t is, I can find an upper bound for this difference that does not depend on t. That is why it is called uniform continuity. So we were trying to prove this. We took the difference |E[e^{i(t+h)X}] - E[e^{itX}]| and showed it is at most E[|e^{ihX} - 1|]; that is as far as we got yesterday, just simplifying the left-hand side. Now I am going to claim that this bound is in fact my phi(h), and I have to show that it goes to 0 as h goes to 0.
So I want to take the limit inside the expectation. First, let me call the random variable inside Y(h), for lack of better notation. So Y(h) = |e^{ihX} - 1|, which is clearly a non-negative random variable. If you write e^{ihX} = cos(hX) + i sin(hX), you get Y(h) = sqrt((cos(hX) - 1)^2 + sin^2(hX)); just simplify this out. The cos^2 plus sin^2 gives a 1, there is already a 1, and that gives Y(h) = sqrt(2 - 2 cos(hX)). (This is cos of hX, not the hyperbolic cosine.) On further simplification, since 1 - cos(theta) = 2 sin^2(theta/2), you get Y(h) = 2 |sin(hX/2)|. So that is my random variable, and now I have to show that its expectation goes to 0 as h goes to 0. It is clear that the random variable itself goes to 0 as h goes to 0, and also |Y(h)| <= 2 for every h. So the dominated convergence theorem implies that you can take the limit inside, because the integrand is uniformly bounded by 2. One thing to note: the family of functions here is indexed by h, which is not a discrete sequence in n. But that does not really matter; h is going to 0, and the dominated convergence theorem applies along any sequence h -> 0. So by DCT, lim_{h -> 0} E[Y(h)] = 0, which is what we want. That proves uniform continuity.
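The identity Y(h) = 2|sin(hX/2)| and the fact that phi(h) = E[Y(h)] goes to 0 can be checked numerically. Here is a minimal sketch in Python, taking X to be a standard normal purely as an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)  # samples of X; standard normal is just an illustrative choice

def phi(h):
    # the t-independent bound phi(h) = E|e^{ihX} - 1| from the proof
    return np.mean(np.abs(np.exp(1j * h * x) - 1))

# pointwise identity |e^{ihX} - 1| = 2 |sin(hX/2)|
h = 0.3
assert np.allclose(np.abs(np.exp(1j * h * x) - 1), 2 * np.abs(np.sin(h * x / 2)))

# phi(h) shrinks as h -> 0, as dominated convergence guarantees (integrand bounded by 2)
for h in [1.0, 0.1, 0.01]:
    print(h, phi(h))
```

Since the bound does not involve t, the same phi(h) works at every point, which is exactly the uniform continuity statement.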
So this follows by the dominated convergence theorem. In particular, if you are given a function which is discontinuous, there is no way it can be the characteristic function of any random variable, because all characteristic functions are uniformly continuous. Next, the property that C_X(t) is a non-negative kernel means the following: for any n, any t_1, t_2, ..., t_n in R, and any complex z_1, z_2, ..., z_n, you must have sum_{j,k} z_j C_X(t_j - t_k) conj(z_k) >= 0. So all these complex quadratic forms are non-negative; that is what it means to say that C_X(t) is a non-negative kernel. You fix whatever t_j and whatever complex z_j you want, and in other words you are writing down the row vector (z_1, ..., z_n), then the matrix whose (j,k)-th entry is C_X(t_j - t_k), whose diagonal entries are all 1 as we discussed yesterday, and then the column vector (conj(z_1), ..., conj(z_n)). So you are looking at a complex matrix and basically a complex quadratic form, and all these quadratic forms have to be non-negative, for any characteristic function. The way you prove it: the left-hand side equals sum_{j,k} of the integral of z_j e^{i t_j x} conj(z_k) e^{-i t_k x} dP(x). You agree with that? What am I doing here? I am just writing the sum out as an integral.
So I have just written each C_X(t_j - t_k) out as the integral of e^{i (t_j - t_k) x} dP(x), which is what I have written down here; I have not assumed a density or anything, this is a completely general characteristic function. I skipped one step: I have taken the t_j with the z_j and the t_k with conj(z_k); so far you should have no problem. Now you have to realize that this is in fact a squared term. If I write it down you will see that the sum equals E[ |sum_j z_j e^{i t_j X}|^2 ], and therefore it is non-negative. It is easier to see that this equals that, rather than going the other way around. You know that for a complex number z, |z|^2 = z conj(z); so you multiply the sum inside the expectation by its complex conjugate, term by term, with a minor interchange of integral and summation, and you will actually get the double sum above. Just expand it out as z times conj(z) and you will get this expression, which is clearly non-negative. So characteristic functions are non-negative definite kernels. Now, there were a few other properties of the characteristic function that I listed, but these three are particularly core properties in some sense, so I will call them the defining properties of the CF. So far we have only proved that any characteristic function has to have these properties.
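As a quick numerical sanity check (not a proof), one can evaluate this quadratic form for a known characteristic function; below I use the standard normal CF C_X(t) = e^{-t^2/2} and random t_j, z_j, all assumptions made for illustration:

```python
import numpy as np

cf = lambda t: np.exp(-t**2 / 2)  # CF of a standard normal, a concrete example

rng = np.random.default_rng(1)
n = 6
t = rng.uniform(-3, 3, size=n)                            # arbitrary t_1, ..., t_n
z = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # arbitrary complex z_1, ..., z_n

M = cf(t[:, None] - t[None, :])   # M[j, k] = C_X(t_j - t_k); diagonal entries are C_X(0) = 1
quad_form = z @ M @ z.conj()      # sum_{j,k} z_j C_X(t_j - t_k) conj(z_k)

print(quad_form.real)             # non-negative, as the kernel property demands
```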
What is remarkable is that any function satisfying these three properties is also the characteristic function of some random variable. These three properties are enough; no further properties are needed. This converse result is a more involved theorem called Bochner's theorem, B-O-C-H-N-E-R, which you might have heard of. So maybe I will make a remark: according to Bochner's theorem, the three properties above are sufficient for a function to be the CF of a random variable. We have proved the forward result, that every characteristic function must have these three properties; what I am saying here is the opposite direction. If I give you some function f(t) and you manage to verify these three properties, then f(t) is necessarily the characteristic function of some random variable. So these three properties are in some sense defining properties of a characteristic function, precisely in the sense of Bochner's theorem. This may be a little hard to digest; you may be wondering why I put down, for example, this non-negative kernel business. It is actually a very core property, although it is a little awkward to state. So Bochner's theorem says that every uniformly continuous non-negative definite kernel (normalized to equal 1 at the origin) is the characteristic function of a random variable. Yes, about the modulus squared from before: the expression is correct; you expand the absolute value squared as z times conj(z) and you will get it. This remark is just for your information; we will not prove Bochner's theorem. There is also the inversion theorem, which I did not state in full generality, and I believe I made a small mistake.
So I will state the inversion theorem again. The inversion theorem for characteristic functions basically says that if you know the characteristic function, you know the distribution of the random variable. The precise statement is as follows; I am stating these as three results and I am not going to prove them. (1) If X is a continuous random variable with density f_X and CF C_X, then f_X(x) = lim_{T -> infinity} (1/2 pi) integral_{-T}^{T} C_X(t) e^{-i t x} dt, at every point at which f_X is differentiable. I think I said "continuous" yesterday; differentiable is what the textbook (Grimmett and Stirzaker, I believe) has, and this relation may not hold at points where f_X is discontinuous or not differentiable. Also, by looking at a characteristic function there is an easy way to identify whether it corresponds to a continuous random variable or not; remember that a characteristic function is defined for all random variables. (2) A sufficient, but not necessary, condition for C_X to be the characteristic function of a continuous random variable is absolute integrability: integral_{-infinity}^{infinity} |C_X(t)| dt < infinity. So if your characteristic function is absolutely integrable, then the random variable is necessarily a continuous random variable; there will be a density. But the converse need not be true: it may be that there is a density while the characteristic function is not absolutely integrable. If you look at the inverse transform relationship above, you can believe this statement: the absolute value of the integral is at most the integral of the absolute value.
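Statement (1) can be illustrated numerically. The sketch below, which assumes the standard normal CF C_X(t) = e^{-t^2/2} as a concrete example, approximates the limit by a finite integral and recovers the Gaussian density:

```python
import numpy as np

cf = lambda t: np.exp(-t**2 / 2)  # CF of a standard normal (absolutely integrable)

def density(x, T=20.0, n=100_001):
    # f_X(x) ~ (1/2pi) * integral_{-T}^{T} C_X(t) e^{-itx} dt, via a Riemann sum
    t = np.linspace(-T, T, n)
    dt = t[1] - t[0]
    return (np.sum(cf(t) * np.exp(-1j * t * x)) * dt).real / (2 * np.pi)

for x in [0.0, 1.0]:
    exact = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # the standard normal density
    print(x, density(x), exact)
```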
Therefore, if the integral of the absolute value is finite, you can expect uniform absolute convergence, and hence uniform convergence, of the inverse transform; the density will then exist as a uniformly continuous function. So statement (2) is believable in light of (1), though again I will not prove it. You can find examples of characteristic functions which are not absolutely integrable while X still has a density; see if you can think of such an example. (3) The most general inversion theorem is this. Let F_X denote the CDF of X and C_X its characteristic function. Define F̂_X : R -> [0, 1] by F̂_X(x) = (1/2) [ F_X(x) + lim_{y -> x-} F_X(y) ]. Then, for a < b, F̂_X(b) - F̂_X(a) = lim_{T -> infinity} (1/2 pi) integral_{-T}^{T} (e^{-i a t} - e^{-i b t}) / (i t) C_X(t) dt. This is again a hard theorem to prove. What this result says is that for any CDF, with no further structure needed (the random variable need not be continuous), if you are given its characteristic function you can effectively obtain the CDF. Now let me help you interpret F̂_X. It is a function that looks almost like your CDF F_X, except that at points of discontinuity it takes the middle value of the jump. You know that CDFs are not necessarily continuous, but they are right continuous; here you are averaging F_X(x) with the left limit of the CDF, which need not always equal the functional value.
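The general formula can also be checked numerically on a purely discrete example. Below I take X ~ Bernoulli(1/2), so C_X(t) = (1 + e^{it})/2, with a = -1 and b = 0; all of these choices are assumptions made for illustration. Since F_X(-1) = 0 and F̂_X(0) = (0 + 1/2)/2 = 1/4, the truncated integral should approach 1/4:

```python
import numpy as np

cf = lambda t: (1 + np.exp(1j * t)) / 2  # CF of Bernoulli(1/2): mass 1/2 at 0 and at 1

def f_hat_increment(a, b, T=400.0, n=2_000_001):
    # (1/2pi) * integral_{-T}^{T} (e^{-iat} - e^{-ibt}) / (it) * C_X(t) dt
    t = np.linspace(-T, T, n)
    t = t[t != 0]  # drop t = 0, a removable singularity of the integrand
    dt = 2 * T / (n - 1)
    integrand = (np.exp(-1j * a * t) - np.exp(-1j * b * t)) / (1j * t) * cf(t)
    return (np.sum(integrand) * dt).real / (2 * np.pi)

print(f_hat_increment(-1.0, 0.0))  # approaches 1/4 as T grows
```

Note the midpoint behaviour: the limit gives 1/4, not the functional value F_X(0) = 1/2, which is exactly the averaging of left and right limits described above.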
So in some sense you are averaging the right limit and the left limit, which means you are looking at the midpoint of any discontinuity of your CDF. Of course, if F_X is continuous at a point, these two are equal and you get back F_X. So F̂_X equals your CDF wherever the CDF is continuous; if F_X is not continuous at some point, F̂_X takes the middle value there. Note that F̂_X is not quite a CDF, because it is not right continuous; it takes all these middle values. It is like a proxy CDF that almost approximates yours. Now, what the theorem says is that if you perform this inverse transformation, you do not quite recover your CDF: you recover F̂_X, which agrees with your CDF everywhere except at points of discontinuity, where it converges to the middle value. In particular, if you send a to minus infinity, you recover F̂_X itself. But going from F̂_X to F_X is very easy. Why? At points of continuity they are equal; at points of discontinuity, where you converged to the middle value, you simply make the function right continuous, taking it up to the right limit, and you get back your CDF. So performing this inversion almost gives you the CDF, and since you know the CDF has to be right continuous, you can fix it up wherever it converged to the midpoint. These theorems require a lot of harmonic analysis and complex-variables machinery to prove; they are fairly non-trivial results. I put them down just to show that you can completely recover the CDF from the characteristic function. Any questions? What I am saying is: suppose you have a situation where the CDF has a jump, like that.
The functional value of the CDF at the jump will always be the upper value, because CDFs are necessarily right continuous. But if you blindly perform this inversion, which is like a Fourier transform inversion, the function F̂_X you recover will be fine at all points of continuity, while at a discontinuity you converge to the middle point; this is a feature of all these Fourier-type theorems. It is a standard feature that occurs even in Fourier series: if you write down the Fourier series of a square wave or something, you get convergence away from the jumps, but at the jumps you get convergence to the middle point. I think you may know that. All these Fourier-type results have this quirky property of converging to the middle point. But you know the function is a CDF after all, so it cannot sit at the midpoint; you just redefine it at the jumps and you have recovered your CDF. Note that F̂_X need not be right continuous; but if C_X is absolutely integrable, then, as I said, you have some very nice properties: there is in fact a density, and the distribution is absolutely continuous. So now I will talk about moments from characteristic functions. Exactly as you can recover moments from the moment generating function, you can also recover them from the characteristic function, with the added advantage that the characteristic function always exists, whereas the moment generating function need not. (a) Suppose the k-th derivative of C_X(t) exists at t = 0. Then E[|X|^k] < infinity for k even, and E[|X|^{k-1}] < infinity for k odd. (b) gives the converse direction.
(b) Suppose E[|X|^k] < infinity. Then the k-th derivative of C_X evaluated at 0 equals i^k times E[X^k]; that is, the k-th derivative at 0 is the k-th moment multiplied by i^k. With the moment generating function you do not get this i^k factor; for the characteristic function you have to put it in. Further, and this is going to be important later, you can actually Taylor expand the characteristic function in terms of its moments. (You will notice that I am not proving the great majority of results here; they are fairly involved.) So, further, C_X(t) admits the Taylor expansion C_X(t) = sum_{j=0}^{k} (E[X^j] / j!) (i t)^j + o(t^k). This is the moment generating property of the characteristic function. So you can look at the characteristic function and say whether certain moments are finite or not. What part (a) says is: if the k-th derivative exists at t = 0, then the k-th moment is finite for k even and the (k-1)-th moment is finite for k odd. And part (b) says: if E[|X|^k] is finite, then you can actually recover E[X^k] by differentiating the characteristic function k times, setting t = 0, and of course dividing by i^k. The last part of the result says that you can Taylor expand C_X(t) in terms of the moments. This should not be altogether surprising, because C_X(t) is nothing but E[e^{i t X}].
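Part (b) can be sketched numerically: differentiate the CF at 0 and divide by i^k. The example below assumes the standard normal CF and recovers the second moment E[X^2] = 1 by a central second difference:

```python
import numpy as np

cf = lambda t: np.exp(-t**2 / 2) + 0j  # CF of a standard normal (real here, complex in general)

h = 1e-4
# central second difference approximating C_X''(0)
d2 = (cf(h) - 2 * cf(0.0) + cf(-h)) / h**2

EX2 = (d2 / 1j**2).real  # E[X^2] = C_X''(0) / i^2
print(EX2)               # close to 1, the second moment of a standard normal
```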
So you can write e^{i t X} as 1 + i t X + (i t X)^2 / 2! + ..., i.e., you can Taylor expand it; the tricky part is in bringing the expectation inside, and that is what you can do when this moment condition is satisfied. That is what the theorem is saying; otherwise it is trivial. The trick is in bringing the expectation in. So you can sum up to the k-th moment, where you know it exists, and the correction term is o(t^k): essentially terms that go to 0 quicker than t^k as t goes to 0. So this will be like t^{k+1} or some bunch of terms that go to 0 faster than t^k. This is going to be useful later; in fact, this is the property that will help us prove some very important results such as the central limit theorem and the weak law of large numbers. Any questions? First of all, little o: if I say that some function is o(h), it means that the function divided by h goes to 0 as, in this case, h goes to 0. So o(t^k) means that, as t goes to 0, the term goes to 0 faster than t^k. Essentially the theorem gives you an expansion of C_X(t) up to the t^k term, plus a little bit of a correction which goes to 0 faster than t^k; what exactly that correction is depends on the characteristic function. So we are saying that as long as the k-th moment exists, the structure of the characteristic function up to the k-th term is completely determined by the moments; beyond that there is a little bit of a correction term. Computer scientists use this o-notation a lot, both big O and small o. Big O is used for upper bounds: when you say that some algorithm is of order n^2 or something, you mean its time complexity is at most like n^2.
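The o(t^k) claim can also be seen numerically: subtract the degree-k partial sum from C_X(t) and divide by t^k; the ratio should shrink as t -> 0. This sketch assumes the standard normal, whose moments E[X^j] for j = 0, ..., 4 are 1, 0, 1, 0, 3:

```python
import numpy as np
from math import factorial

cf = lambda t: np.exp(-t**2 / 2)     # CF of a standard normal
moments = [1.0, 0.0, 1.0, 0.0, 3.0]  # E[X^j] for j = 0, ..., 4

def remainder_ratio(t, k=4):
    # |C_X(t) - sum_{j<=k} E[X^j]/j! (it)^j| / |t|^k, which should vanish as t -> 0
    partial = sum(moments[j] * (1j * t)**j / factorial(j) for j in range(k + 1))
    return abs(cf(t) - partial) / abs(t)**k

for t in [0.5, 0.1, 0.02]:
    print(t, remainder_ratio(t))  # decreases roughly like t^2 / 48 here
```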
So with big O, the time complexity divided by n^2 stays bounded; but with little o the ratio has to go to 0, which means little o is strictly quicker than what is inside. So big O and small o are different, and what we have here is small o, not big O; that would be a very different statement. So I will stop here.