Welcome back. Today we will discuss a very important concept: independence of random variables. You will recall that we have already studied independence of events, and we have also studied independence of sigma algebras. Now, in elementary treatments of probability, such as you might have seen as an undergraduate, you would have studied independence of events and independence of random variables, and you get two different definitions. What we will do is adopt an approach which brings independence of random variables, independence of events, and independence of sigma algebras all within the same intellectual framework. I will define independence of random variables in terms of the independence of the sigma algebras generated by the random variables, and later we will verify that the definition I am going to give is consistent with the more elementary definition of independence that you are familiar with. So recall the usual picture: you have some (Omega, F, P), and let us say you have two random variables X and Y that map elementary outcomes to R^2. This is the picture we have, and we said that for any Borel set B in R^2 we can speak of its probability law, the probability that the pair maps into that Borel set. Also recall that we defined sigma algebras generated by random variables as follows: sigma(X) is the set of all subsets A of Omega of the form A = X^{-1}(B) for some B in B(R), and similarly sigma(Y) is the set of all C in Omega such that C = Y^{-1}(B) for some B in B(R). So sigma(X) consists of all those events whose occurrence is completely determined by the realization of X.
In particular, every event in sigma(X) is of the form {omega : X(omega) ∈ B}, and similarly every event in sigma(Y) is of the form {omega : Y(omega) ∈ B} for some Borel set B. Now the definition: random variables X and Y on (Omega, F, P) are said to be independent if sigma(X) and sigma(Y) are independent sigma algebras. This is the definition we will take. As you can already see, I have defined independence of two random variables X and Y in terms of independence of sigma algebras, which is something we understand already. After all, independence of sigma algebras just corresponds to independence of events: if you pick one event from each, they should be independent events, correct? So this definition brings independence of random variables, sigma algebras, and events into the same framework. In other words, for any Borel sets B1, B2 in B(R), the events {omega : X(omega) ∈ B1} and {omega : Y(omega) ∈ B2} are independent. Is that clear? Let us call the two generated sigma algebras F1 and F2. They are both sub-sigma-algebras of F, and the probability measure P is defined on them as well. Under this measure P we are saying that these two sigma algebras are independent, which means whichever set you pick from one and whichever set you pick from the other, they must be independent events. Now, events in sigma(X) are precisely the preimages of Borel sets under X, and similarly events in sigma(Y) are preimages of Borel sets under Y, which is exactly what I am considering here.
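To make the generated sigma algebra concrete, here is a minimal sketch on a hypothetical four-point sample space (the space and the random variable are my own toy choices, not from the lecture): on a finite Omega, sigma(X) is exactly the collection of preimages X^{-1}(S) over all subsets S of the range of X.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of the iterable s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(s, r) for r in range(len(s) + 1))]

def sigma_generated_by(omega, X):
    """sigma(X) on a finite sample space: the collection of all
    preimages X^{-1}(S) for S a subset of the range of X."""
    values = set(X(w) for w in omega)
    return {frozenset(w for w in omega if X(w) in S)
            for S in powerset(values)}

omega = {0, 1, 2, 3}
X = lambda w: w % 2            # hypothetical toy random variable
sigma_X = sigma_generated_by(omega, X)
# sigma(X) = { {}, {0,2}, {1,3}, {0,1,2,3} }: exactly the events
# whose occurrence is decided by the realization of X
```

Note that sigma(X) is much smaller than the full power set of Omega: it contains only the events that X can distinguish.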
The first is the preimage of some Borel set B1 under X, the second is the preimage of a Borel set B2 under Y, and for any two such Borel sets the events {X ∈ B1} and {Y ∈ B2} must be independent events under this measure P. Thus I can say, i.e. (abusing notation slightly), P(X ∈ B1, Y ∈ B2) = P(X ∈ B1) P(Y ∈ B2) for all B1, B2 in B(R). We can take this as the definition too; it is the same as saying that for any two Borel sets B1 and B2, the events {X ∈ B1} and {Y ∈ B2} are independent, so the probability factorizes like this, and this should be true for every pair of Borel sets B1 and B2. Are there any questions? Now, what is the next step? What do I often do? Once I make a statement for Borel sets, I make that statement for a generating class. In particular, if X and Y are independent random variables, take B1 = (-infinity, x] and B2 = (-infinity, y], the generating class of the Borel sigma algebra. So I can simply write: P(X ≤ x, Y ≤ y) = P(X ≤ x) P(Y ≤ y) for all x, y in R (all of these are omega-sets, which I am not writing out again and again). Now, what is the left-hand side? It is the joint CDF, and the right-hand side is the product of the marginal CDFs; i.e., F_{X,Y}(x, y) = F_X(x) F_Y(y) for all x, y in R. So if X and Y are independent random variables, then the joint CDF is given by the product of the marginal CDFs.
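As a sanity check of this factorization, here is a minimal sketch on a hypothetical product space of two independent fair dice (my own toy example, not from the lecture): with exact rational arithmetic, the joint CDF equals the product of the marginal CDFs at every grid point.

```python
from fractions import Fraction

# Hypothetical product probability space for two independent fair dice:
# Omega = {1..6}^2 with the uniform measure.
omega = [(i, j) for i in range(1, 7) for j in range(1, 7)]
P = lambda event: Fraction(sum(1 for w in omega if event(w)), len(omega))

def F_XY(x, y):            # joint CDF: P(X <= x, Y <= y)
    return P(lambda w: w[0] <= x and w[1] <= y)

def F_X(x):                # marginal CDF of X
    return P(lambda w: w[0] <= x)

def F_Y(y):                # marginal CDF of Y
    return P(lambda w: w[1] <= y)

# The joint CDF factorizes into the product of the marginals:
ok = all(F_XY(x, y) == F_X(x) * F_Y(y)
         for x in range(0, 8) for y in range(0, 8))
```

Here the independence is built into the uniform product measure; the check confirms that the CDF condition follows from it, which is the "only if" direction of the theorem below.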
Now, this is the definition that you are familiar with, is it not? How do you define independent random variables in more elementary treatments? You say X and Y are independent if the joint CDF factorizes into the product of the marginals. However, that is not quite what I said. I defined independence via sigma algebras, from that I wrote down the Borel set condition, and then I said that in particular this must hold when the Borel sets are semi-infinite intervals. So according to my definition, the CDF condition holds. But the elementary definition, saying that X and Y are independent if the CDF condition holds, would not be a complete match unless the converse were also true: if the CDF condition holds, is it true that the factorization holds for all Borel sets? If this definition and the elementary definition are indeed equivalent, each must imply the other. What I have shown is that the definition I have given implies the elementary definition you are familiar with. However, is it true that the elementary definition implies our current definition? The CDF condition only says that the probability law factorizes on a pi system of products; it may not hold for all Borel sets. That is something you have to prove. So, theorem: X and Y are independent if and only if F_{X,Y}(x, y) = F_X(x) F_Y(y) for all (x, y) in R^2. One direction of this "if and only if" we have already proved.
Which part have we proved? That if X and Y are independent then the CDF factorizes; that is, the "only if" part has been proved. I still have to prove the "if" part, which means that if the CDF factorization is given to me, I need to prove X and Y are independent in the sense that sigma(X) and sigma(Y) are independent sigma algebras, which is the same as saying the factorization holds for all Borel sets. So this requires a proof. This proof, again, I will not do in proper detail; I will only sketch it. I already stated the theorem on pi systems: if you specify a probability measure on a pi system, it gets uniquely specified on the sigma algebra generated by it. You can also show that if a probability measure factorizes on a pi system, then it has to factorize on the generated sigma algebra. That is how this comes out: all I am saying is that on the pi system of sets (-infinity, x] × (-infinity, y], the probability measure factorizes, and by the lemma I just stated, this translates to the entire Borel sigma algebra. There is also another approach to the "if" part using Carathéodory. So the "if" part follows from the following property of pi systems: if G and H are independent pi systems on Omega, then sigma(G) and sigma(H) are independent sigma algebras. In our case the pi system G is the collection of sets {omega : X(omega) ≤ x}, H is the collection of sets {omega : Y(omega) ≤ y}, and sigma(G) is sigma(X), the sigma algebra generated by G. What the CDF condition implies is exactly that G and H are independent pi systems.
That is, if G is the pi system of sets {X ≤ x} and H is the pi system of sets {Y ≤ y}, then these two are independent pi systems by the CDF condition, and from the property above the result follows. The property itself requires a proof, but the proof is in Williams if you want it. Is this clear? It may appear that I am complicating things; you may be very happy with the CDF definition, and indeed, once you prove this result, you can take the CDF factorization as the definition of independence, since it is equivalent. The reason I do it this way, as I said, is that in more advanced treatments one always looks at independence of random variables as independence of the generated sigma algebras. It is actually a more elegant treatment. It may be a little hard to digest in the beginning, but the nice thing is that independence of events and independence of random variables come into the same framework; they are not disparate concepts. In fact, independence of events can itself be defined in terms of independence of random variables. What we did was define independence of events, then define independence of sigma algebras, then define independence of random variables in terms of independence of sigma algebras. You can also do the opposite: just say that X and Y being independent means sigma(X) and sigma(Y) are independent, and then say events A and B are independent if the corresponding indicator random variables are independent, that is, if I_A and I_B are independent random variables. So it is all within the same framework, and you can actually prove that the two notions are equivalent: I_A and I_B are independent random variables if and only if A and B are independent events. You can show that as an exercise.
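The indicator exercise can be checked mechanically on a small space. Here is a minimal sketch, using a hypothetical uniform four-point space and events A, B of my own choosing: since an indicator takes only the values 0 and 1, its preimage events are just the event, its complement, and the trivial sets, so independence of I_A and I_B reduces to checking those pairs.

```python
from fractions import Fraction

omega = [0, 1, 2, 3]                       # hypothetical uniform 4-point space
P = lambda E: Fraction(len(set(E) & set(omega)), len(omega))

A = {0, 1}
B = {0, 2}                                 # here P(A & B) = P(A) P(B)

def indicator_events(A):
    """The nontrivial preimage events of I_A: {I_A = 1} and {I_A = 0},
    i.e. A and its complement (the trivial sets {} and Omega are
    automatically independent of everything)."""
    return [set(A), set(omega) - set(A)]

# I_A and I_B are independent random variables iff every preimage event
# of I_A is independent of every preimage event of I_B:
indep = all(P(E & F) == P(E) * P(F)
            for E in indicator_events(A)
            for F in indicator_events(B))
```

With these particular A and B the check succeeds, matching the fact that A and B are independent events under the uniform measure.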
So even if you never define independence of events directly, you can go the other way as well; that is what I am saying. It is all in the same framework. Are there any questions on this? Is this clear to everyone? Now, what if you have more than two random variables? It is the same story; you can just give the same definition. Shall I give it for n random variables and then state it for arbitrary families? Definition: random variables X1, X2, ..., Xn, all living on the same probability space (Omega, F, P), are said to be independent if sigma(X1), sigma(X2), ..., sigma(Xn) are independent sigma algebras. This is to be understood the same way: you pick any event from each of the n sigma algebras, so you end up with n events, n Borel preimages, and you should have independence among them. Actually, there is no sanctity in taking only n of them; you can take an arbitrary collection of random variables and define independence in terms of the independence of the corresponding collection of sigma algebras. In elementary treatments you would have seen the statement that X1, X2, ..., Xn are independent random variables if the joint CDF of all of them factorizes into the product of the individual marginals. Now I am going to say something a little more general. Definition: a family {X_i : i ∈ I} of random variables, indexed by some set I, is said to be independent if the family of sigma algebras {sigma(X_i) : i ∈ I} is independent.
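For the n-variable case, here is a minimal sketch on a hypothetical product space of three independent fair coin flips (my own toy example): taking the coordinates as X1, X2, X3, the joint probability factorizes for every choice of one preimage event per coordinate, which is exactly the independence of the three generated sigma algebras.

```python
from fractions import Fraction
from itertools import product

# Hypothetical product space for three independent fair coin flips:
omega = list(product([0, 1], repeat=3))      # 8 equally likely outcomes
P = lambda event: Fraction(sum(1 for w in omega if event(w)), len(omega))

# Since each coordinate takes values in {0, 1}, every preimage event
# corresponds to one of the four subsets of {0, 1}:
subsets = [set(), {0}, {1}, {0, 1}]

# Independence of X1, X2, X3: for every choice of one value-set per
# coordinate, the joint probability is the product of the marginals.
ok = all(
    P(lambda w: w[0] in S1 and w[1] in S2 and w[2] in S3)
    == P(lambda w: w[0] in S1) * P(lambda w: w[1] in S2) * P(lambda w: w[2] in S3)
    for S1 in subsets for S2 in subsets for S3 in subsets
)
```

Note that the check runs over all 4^3 = 64 triples of events, one from each sigma algebra, which is the "pick any event from each" requirement in the definition.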
So you have a family of random variables; this family may be infinite, and in fact can even be uncountable, since the index set I can be any set you want. In that case you say the X_i are independent random variables if the sigma algebras generated by them form an independent family of sigma algebras. What kind of thing is this? It is a family of sigma algebras, and we already know what it means to talk about independence of a family of sigma algebras; I think we gave the definition. Call them F_i: a family {F_i} of sigma algebras is independent if, when you pick any event you want from each of them, the resulting family of events is independent. That is something we already understand. Now, writing down a CDF-style definition for an arbitrary family would be a bit of a headache, whereas here it is effortless. This is a very elegant way of treating the whole matter: give me any collection of random variables, and I define it to be an independent collection if the corresponding collection of generated sigma algebras is independent. It has a certain aesthetic appeal; everything sits in the same framework, and you do not have to give different definitions for finite families, infinite families, and so on. Once you know what independence of sigma algebras is, you can finish it all in one framework. Are there any questions? Now, what does this mean intuitively? You understand the definition in terms of the earlier one, but recall that sigma(X_i) is the collection of all events whose occurrence or non-occurrence is completely decided by the realization of X_i.
So, by saying these sigma algebras are independent, I am saying that the set of all events you can decide by looking at X_i alone and the set of all events you can decide by looking at the realization of X_j are independent. Just take it for X and Y if you are confused: the set of all events whose occurrence or non-occurrence is decided by looking at X and the set of all events whose occurrence or non-occurrence is decided by looking at Y consist of mutually independent events. That is what it is ultimately saying. If you want to say it very colloquially, you can say: looking at X, I cannot really figure out much about Y. I do not want to make such a loose statement precise, but that is roughly what it means. So if you adopt this line of definition, you can completely avoid the CDF definition. Any questions? No more questions; I will move on to the next topic. Take some time to digest this; it is a very powerful way of looking at independence. You may resist it a little in the beginning because the CDF definition seems so much more familiar, but this is actually a more powerful and more general way of looking at it, and it is mathematically equivalent, as we know. Now I am going to switch gears completely. Recall that we studied the types of random variables: for a single random variable X, it is categorized based on the nature of the measure P_X induced on R, and we said it has to be one of three fundamental types, or mixtures thereof: discrete, continuous, and singular.
Now we are going to look at the possibilities for just two random variables. Actually, there are a lot of possibilities: along one axis you can have one type, along the other axis another, or some mixture thereof. But again, the interesting cases, since we will not worry about singular, as I said, correspond to X and Y both discrete, X and Y both continuous, and so on. In those nice cases, for example when X and Y are both discrete, we will talk about a joint PMF and then a conditional PMF. Similarly, we will see that even if X and Y are both continuous random variables, it may not be the case that (X, Y) is jointly continuous. These are the concepts we will study. The discrete case is easy: if X and Y are both discrete random variables, then with probability one each takes countably many values, which means the pair (X, Y) also takes countably many values with probability one. That is the simplest case. But when X and Y are both continuous, it may not be the case that the measure induced on R^2 is a continuous measure; that is not such an easy case, and it requires a definition of jointly continuous random variables, which you will get to know. The easiest case, as I said, is discrete random variables. So take two discrete random variables X and Y living on the same probability space: X takes only countably many values with probability one, and Y takes countably many values with probability one. Now look at the map into R^2; what happens?
Say this axis is X and this axis is Y. We know that X, being discrete, takes only a countable set of values, and Y, also being discrete, takes some other countable set of values. Which means that as a pair, the values (X, Y) takes lie in the Cartesian product of these two sets: if I mark those points, I get a grid of points. And you also know that if you take two countable sets and take their Cartesian product, you again get a countable set. That is something I think we did not prove, but it is true: list one countable set along one direction and the other along the other direction, and count the product along diagonals. If you go row by row, you may never finish the first row, but counting diagonally you get an enumeration. So you can show that the Cartesian product of two countable sets is a countable set. Therefore the range of (X, Y) in R^2 is countable with probability 1: on R^2, all the measure is sitting on some countable set. Essentially, what I have said is that if X and Y are discrete, the joint measure also sits on a countable set, so X and Y are automatically jointly discrete. We may not appreciate this so much now, but I will tell you later that if X and Y are separately continuous, (X, Y) may not be jointly continuous. So I can speak of a joint PMF, which is defined as p_{X,Y}(x, y) = P(X = x, Y = y). This is called the joint PMF, and it is defined for all values of x and y; however, it is non-zero only on a countable set, in particular only on the Cartesian product E1 × E2, where E1 and E2 are the countable sets of values that X and Y take.
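The diagonal counting argument can be written out explicitly. Here is a minimal sketch (function name and term count are my own choices) that enumerates N × N along anti-diagonals, which is exactly why the Cartesian product of two countable sets is countable: every pair (i, j) appears at the finite position given by its diagonal.

```python
def diagonal_enumeration(n_terms):
    """Enumerate N x N along anti-diagonals: first the pairs with
    i + j = 0, then i + j = 1, and so on. Going row by row would never
    finish the first row; going diagonally, every pair (i, j) is
    reached after finitely many steps."""
    out = []
    d = 0                          # current diagonal: all (i, j) with i + j = d
    while len(out) < n_terms:
        for i in range(d + 1):
            out.append((i, d - i))
            if len(out) == n_terms:
                break
        d += 1
    return out

pairs = diagonal_enumeration(10)
# pairs -> [(0,0), (0,1), (1,0), (0,2), (1,1), (2,0), (0,3), (1,2), (2,1), (3,0)]
```

Any pair (i, j) sits on diagonal d = i + j, so it appears no later than position (d + 1)(d + 2) / 2, which is the countability claim made in the lecture.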
And again, just as in the one-dimensional case, if you give me the joint PMF of these two discrete random variables, I can figure out the probability law of any Borel set in R^2. How will I do that? Give me whatever Borel set B you want in B(R^2); I will simply add up the PMF at the points in B. So if you ask me for the law of some Borel set B in B(R^2), I write it as P((X, Y) ∈ B) = the sum over (x, y) in B of P(X = x, Y = y). The summands are non-zero only at points of the Cartesian product of the two ranges of X and Y. Why is this true, by the way? Countable additivity: the measure is zero outside these points, so the measure of B is given by the sum of the measures of the single points inside it; everywhere else the measure is zero. So let me just note that the joint PMF is non-zero only on a countable subset of R^2. Thus, just as in one dimension, in two dimensions, if you give me the joint PMF, I can figure out everything there is to figure out about the discrete random variables X and Y. Any questions? Yes, I will get to independence soon. I have switched topics completely here: I finished independence and everything about defining it, and I am now talking about X and Y both being discrete. I am saying that the joint PMF is enough to characterize the joint law, and if I know the joint law, I know the joint CDF. So, next definition: let X and Y be discrete random variables on (Omega, F, P). The conditional PMF of X given Y = y is defined as P(X = x | Y = y), which equals the joint PMF divided by the marginal.
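This "sum the PMF over the points in B" computation can be sketched directly. Here is a minimal example using a hypothetical joint PMF of two independent fair dice and a hypothetical Borel set B (both my own choices for illustration): the law of B is just the countable sum of the point masses inside B.

```python
from fractions import Fraction

# Hypothetical joint PMF of two independent fair dice, stored as a dict
# on the countable set E1 x E2 = {1..6} x {1..6}; it is zero elsewhere.
pmf = {(i, j): Fraction(1, 36) for i in range(1, 7) for j in range(1, 7)}

def prob(B):
    """P((X, Y) in B): sum the joint PMF over the points in B.
    This is countable additivity; the measure is zero outside the
    countable support, so only support points in B contribute."""
    return sum((p for point, p in pmf.items() if point in B), Fraction(0))

# Example Borel set: the diagonal strip where the coordinates sum to 7.
B = {(x, y) for x in range(1, 7) for y in range(1, 7) if x + y == 7}
prob(B)   # 6 of the 36 support points lie in B, so this is 1/6
```

The same function handles any Borel set, since only the countably many support points ever contribute to the sum.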
The notation for this is p_{X|Y}(x | y), and it is defined as p_{X,Y}(x, y) divided by p_Y(y), wherever p_Y(y) > 0. This defines the concept of a conditional probability mass function, the conditional PMF. What you do here is take two discrete random variables and fix the value of Y. The value y must be one of the values that Y takes with positive probability; otherwise you cannot condition on it. So you condition on Y = y and look at the probability mass function: how is X distributed given Y = y? Obviously this is given by the probability of the intersection of the two events, which is the joint PMF, divided by the probability of the conditioning event, P(Y = y). So it is a very natural definition. Intuitively, going back to the figure: I am conditioning on a particular value y, so I fix that y, look at all the values x that X takes along that slice, and for each x define the conditional PMF as the joint PMF divided by P(Y = y). I am fixing Y = y and, in some sense, rescaling those masses to make a valid probability distribution; after all, if you sum the conditional PMF over all x, you get 1. That is what is intuitively happening: you fix a y and look at how X is distributed along that slice, rescaling the masses to make a valid PMF. It is a PMF in x, and it depends on which y you fix. Is this definition clear? Any questions on this definition? It is actually fairly elementary.
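The scaling step can be seen in a minimal sketch. The joint PMF below is a hypothetical one of my own choosing (four support points whose masses sum to 1); fixing y and dividing by p_Y(y) turns the slice into a valid PMF in x.

```python
from fractions import Fraction

# Hypothetical joint PMF on a small countable support (sums to 1).
pmf = {(0, 0): Fraction(1, 8), (0, 1): Fraction(1, 4),
       (1, 0): Fraction(1, 8), (1, 1): Fraction(1, 2)}

def p_Y(y):
    """Marginal PMF of Y: sum the joint PMF over all x."""
    return sum((p for (x_, y_), p in pmf.items() if y_ == y), Fraction(0))

def cond_pmf_X_given_Y(x, y):
    """p_{X|Y}(x | y) = p_{X,Y}(x, y) / p_Y(y), defined where p_Y(y) > 0."""
    py = p_Y(y)
    if py == 0:
        raise ValueError("cannot condition on a value Y never takes")
    return pmf.get((x, y), Fraction(0)) / py

# Fixing y = 1 rescales the masses along that slice into a valid PMF in x:
total = cond_pmf_X_given_Y(0, 1) + cond_pmf_X_given_Y(1, 1)   # sums to 1
```

Here p_Y(1) = 1/4 + 1/2 = 3/4, so the slice masses 1/4 and 1/2 rescale to 1/3 and 2/3, which sum to 1 as the lecture describes.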
In the continuous and singular cases this gets messy, because you cannot really condition on a singleton; it has probability zero. But in the discrete case there is no problem; this is very elementary. So we have defined the conditional PMF. Next we will move on to characterizing discrete random variables and looking at independence of discrete random variables. You can get characterizations of independence in terms of the PMFs and conditional PMFs of discrete random variables, which we will do next class.