We were discussing the sigma algebra generated by a random variable X. We defined it as follows. We took the probability space (Omega, F, P) and a random variable X, and said: take all the Borel sets and take their pre-images under X. That is, take all those subsets of Omega which are pre-images of Borel sets under X. In particular, we defined sigma(X) as the set of all subsets A of Omega such that A = X^{-1}(B) for some Borel set B. We can show that this is a sigma algebra; it is a very elementary proof, just definition chasing, and you have already proved it in your homework. Now, because X is a random variable, every one of these A's is also F-measurable, i.e. it is an event, because pre-images of Borel sets are events. Therefore sigma(X) is not only a sigma algebra; it is a sub sigma algebra of F: a sigma algebra contained in F. So sigma(X) consists of all those events whose occurrence or non-occurrence is determined by the realization X(omega). Any typical A in sigma(X) is of the form X^{-1}(B), and recall that X^{-1}(B) is simply the set of all omega in Omega such that X(omega) is in B; that is the definition of X^{-1}(B). So the moment the randomness omega realizes, I can determine X(omega), and for each Borel set I can determine whether X(omega) lies in that Borel set or not. Hence for each set A of the form X^{-1}(B), I can determine whether it occurred or not. That is why this sigma algebra represents all those events whose occurrence or non-occurrence is completely determined by X(omega). Is that clear? Any questions on this? Sometimes sigma(X) may be as big as F; sometimes it may be much smaller than F.
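On a finite sample space this definition can be checked mechanically. Here is a minimal sketch; the space and the variable are hypothetical choices of mine, not from the lecture. On a finite space, every subset of the range of X stands in for a Borel set, so sigma(X) is just the collection of all pre-images.

```python
from itertools import combinations

# Hypothetical finite example: Omega = {0, 1, 2, 3}, X(omega) = omega mod 2.
# sigma(X) = { X^{-1}(B) : B a subset of range(X) }.
Omega = [0, 1, 2, 3]
X = lambda w: w % 2

range_X = sorted({X(w) for w in Omega})
subsets_of_range = [frozenset(c) for r in range(len(range_X) + 1)
                    for c in combinations(range_X, r)]

# Pre-image of each subset of the range: the events decided by X.
sigma_X = {frozenset(w for w in Omega if X(w) in B) for B in subsets_of_range}
print(sorted(sorted(A) for A in sigma_X))
# -> [[], [0, 1, 2, 3], [0, 2], [1, 3]]: emptyset, Omega, {X = 0}, {X = 1}
```

Note that sigma(X) came out much smaller than the full power set of Omega (4 sets out of 16), illustrating the remark that sigma(X) can be much smaller than F.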
For example, take Omega = (0, 1) with the Borel sigma algebra B and Lebesgue measure lambda; this is the uniform probability measure on (0, 1). Consider the indicator random variable of whatever set you like: the Cantor set, some interval, whatever set you want. Let us say the interval (0, 1/3). This is the indicator that omega falls in (0, 1/3). So in this case let me call it I_A, where A is my event. This I_A takes only two values, 0 and 1: it equals 0 if omega is not in (0, 1/3), and 1 if omega is in (0, 1/3). The set could be any Borel set. Now, this is a random variable, an indicator random variable. What is the sigma algebra generated by it? You can verify that sigma(I_A) = {Omega, phi, A, A^c}: Omega and phi, which must always be there, plus A and A^c. It consists of only these four subsets. This is a sigma algebra, and it is in fact the sigma algebra generated by the indicator random variable; this is true for any Borel set A. And it is a much smaller sigma algebra than my F, which here is B. On the other hand, if I consider the random variable X(omega) = omega, a trivial random variable which just gives me the value of omega itself, then sigma(X) will be the Borel sigma algebra again, because this is the identity map. Is this clear? Just a very trivial example. In fact, what happens for a constant random variable, the random variable which takes a constant value with probability 1? A constant is a random variable: say X = c with probability 1, so all the omegas in the sample space map to some constant c.
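The four-set claim sigma(I_A) = {Omega, phi, A, A^c} can be verified directly on a finite stand-in for the interval. This is a hedged sketch: instead of (0, 1) I use ten equal cells indexed 0..9, with A as the first three cells playing the role of (0, 1/3).

```python
# Finite stand-in for Omega = (0, 1): ten cells; A plays the role of (0, 1/3).
Omega = frozenset(range(10))
A = frozenset({0, 1, 2})
I_A = lambda w: 1 if w in A else 0

# sigma(I_A): pre-images of all subsets of the indicator's range {0, 1}.
pre = lambda B: frozenset(w for w in Omega if I_A(w) in B)
sigma_IA = {pre(B) for B in (frozenset(), frozenset({0}),
                             frozenset({1}), frozenset({0, 1}))}

# Exactly the four sets from the lecture: emptyset, A^c, A, Omega.
assert sigma_IA == {frozenset(), A, Omega - A, Omega}
print("sigma(I_A) has", len(sigma_IA), "sets")   # -> 4
```

The pre-image of {1} is A, of {0} is A^c, of the empty set is phi, and of {0, 1} is Omega, which is why the answer is the same four sets no matter which Borel set A you pick.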
So, in that case the generated sigma algebra will be the trivial sigma algebra {Omega, phi}. You cannot look at a constant random variable and decide anything about the occurrence of any other subset: you know that Omega occurred and phi did not occur, and that is all you can say by looking at the realization of a constant random variable. Is this clear? Any questions on this? This is a concept we will use when defining independence and so on, so we will draw on this definition later; just keep it in mind. Now we move on to the next topic, which is several random variables. Whenever I say several random variables, I am considering more than one random variable, and all of them will live on the same probability space. They are all on (Omega, F, P); they are not random variables living on different probability spaces. They are all defined on the same (Omega, F, P), and each of them is a measurable function from Omega to R. Why? In terms of motivation: as I said, random variables capture some numerical function of your elementary outcome. If you are looking at some complicated phenomenon like the weather, you may not even be able to capture what omega is; it is some very complicated process, and you do not know what the probability space is. But you may be satisfied with knowing the temperature, which is a numerical function of what realized that day. And you may want to know more than one such numerical function. The weather is whatever complicated space it is; we may not even know what (Omega, F, P) is. But once the weather on a particular day realizes, one random variable may be the temperature. Then there may be another random variable that somebody else is interested in measuring, say the humidity, some other numerical function.
But the point is this: with both of these random variables you are measuring different things, yet it is the same underlying randomness that is feeding them. The probability space is the same and the elementary outcome, little omega, is the same. Once little omega realizes, one of them may capture the temperature and the other the humidity. Or, to make a much simpler example, if you are tossing 10 coins, X may measure the number of heads and Y may measure the number of tosses until the first head, or simply the number of tails. They measure two different things, but the underlying randomness feeding these two random variables is the same: they all live on the same probability space. Is that clear? Now, everything we said about a single random variable still applies: you can completely characterize X with its probability law P_X; similarly Y will have a P_Y; or you can just specify the CDFs, capital F_X and capital F_Y. All the properties we established hold for both of these random variables. So if we just want to capture the statistical properties of X separately and of Y separately, we know how to do it. However, with several random variables, the main issue is capturing the interdependence between them. For a single random variable, if you just look at the probability law P_X, you have a complete statistical characterization of X, and similarly for Y. But remember that these random variables come from the same probability space, so if I tell you that the temperature on a certain day is a certain value, it may tell you that the humidity has to be a certain way. It may give you information, because the underlying randomness is the same.
If X measured the number of heads and Y measured the number of tails, you can clearly see there is a dependence between the two: in this trivial case, whatever is not a head has to be a tail. But in general the structure of interdependence can be quite complicated. So the picture we will study is not two separate maps into R; it is this: you have the same (Omega, F, P), but we look at the pair (X, Y) as a map into R^2. Every time omega realizes, I get a point (X(omega), Y(omega)) in R^2, rather than two points on two separate real lines, and this will capture my interdependence structure. Is that clear intuitively? That is what we will study. Similarly, if you have n random variables X_1, X_2, ..., X_n, all defined on the same probability space, you look at them as a map from (Omega, F) to R^n. Everybody with me? Now, these are random variables, so we have to look at measurability. For a single real line, we said the sigma algebra we care about on R is the Borel sigma algebra, and pre-images of Borel sets must be F-measurable. We know that X and Y are separately random variables in the sense that if you give me a Borel set on R, its pre-image is F-measurable. The same idea holds here, but now we are living in R^2: the pair (X, Y) takes values in R^2. You would imagine that the probability measure on (Omega, F) is being pushed onto R^2 by the pair (X, Y), just as X pushes a measure onto R; you would imagine that (X, Y) induces a measure on R^2, the two-dimensional plane. Except that you first need a sigma algebra on R^2; on R we know it is the Borel sigma algebra.
If you want to push this measure onto R^2, you have to say what sigma algebra you are pushing it onto. The answer, which I will not develop in great detail, is that you can define a Borel sigma algebra on R^2. It can be generated in multiple ways, just as on the real line we can generate it with open intervals or semi-infinite intervals and so on. On R^2, what is the equivalent of an open interval? An open ball. So you can generate it with the collection of all open balls in R^2; equivalently, you can show that open rectangles generate the same sigma algebra. But the generating class we will consider is in fact the semi-infinite rectangles, meaning sets of the form (-infinity, x] x (-infinity, y]. So the Borel sigma algebra on R^2 can be viewed as the sigma algebra generated by a pi system on R^2, where this pi system is the set of all semi-infinite rectangles. You take sets of that form, take countable intersections, countable unions and complements, and whatever collection of sets you end up with is the Borel sigma algebra on R^2. The logic is exactly the same: just as you have intervals on the real line, here you use open rectangles or open balls, and you can show that the semi-infinite rectangles and the open balls generate the same sigma algebra.
It is a slightly non-trivial proof, because these are rectangles with edges, and if you want to generate balls from them you have to approximate: you construct a countable sequence of sets in R^2 whose countable union is the ball. And to go the other way around, to show that these kinds of sets can be generated from the open balls, you have to make some other argument; but it can be done, it is a standard proof. For now you can take it on faith; that is fine. So this is the Borel sigma algebra on R^2. You can also define Lebesgue measure on R^2, and Lebesgue measure on R^2 corresponds to area. What you do is this: you have these generating classes, which are essentially rectangles. You take a rectangle of sides a and b and declare the measure of that set to be a times b, the area. Then you take F_0, the collection of finite disjoint unions of these rectangles, and assign it the measure you want, namely the area. Then you invoke Caratheodory: verify countable additivity of the pre-measure mu_0 and you get a measure on the Borel sigma algebra. It is the same procedure; I am not repeating it because it is the same thing. In fact, you can do all of this in R^n; nothing changes. So you get a Lebesgue measure on R^2, defined on this Borel sigma algebra, which corresponds to area. So far so good: you have the Borel sigma algebra on R^2. Now, say you are given some complicated Borel set here; I want to talk about the probability that (X, Y) lies in it. Call my Borel set B; it need not be anything nice, it can be something quite crazy. In fact, on R^2 there are a lot more crazy sets than there are on R.
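Since Lebesgue measure on R^2 corresponds to area, a quick numerical illustration (a simulation sketch of the "measure = area" idea, not of the Caratheodory construction itself; the set and sample size are my choices) is to estimate the area of a set as the fraction of uniformly sampled points of the unit square that land in it.

```python
import math
import random

# Sketch: estimate the Lebesgue measure (area) of the quarter disc
# {(x, y) in [0,1]^2 : x^2 + y^2 <= 1} by uniform sampling of the unit square.
random.seed(1)
N = 200_000
hits = sum(1 for _ in range(N)
           if sum(c * c for c in (random.random(), random.random())) <= 1.0)

estimate = hits / N
print(estimate)                     # close to the true area pi/4 ~ 0.7854
assert abs(estimate - math.pi / 4) < 0.01
```

The uniform measure on the unit square is exactly Lebesgue measure restricted to it, which is why the hit fraction converges to the area.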
On R you had Cantor sets and so on; here you can do all sorts of things: there is Cantor dust, there are all sorts of fractals on R^2; you can create some totally crazy sets, and many of those are actually Borel. And you want to talk about the probability that (X, Y) lies in some Borel set. Now, is something like this an event? Ideally we want to talk about the set {omega in Omega : (X(omega), Y(omega)) in B}, where B is some Borel set in R^2; whenever I write (X, Y) in B, I mean B is a Borel set in R^2. Is this an event? I can talk of P of something only if it is F-measurable; so is what is inside this P F-measurable? If you give me a Borel set on R, I know that {X in B} and {Y in B'} are separately F-measurable. But if you give me a Borel set on R^2, do I know the pre-image is F-measurable? It is not obvious; you have to prove it. It is true: this set is in fact F-measurable, but it requires proof. Rather than prove it in class, I will give it to you as homework, as a guided proof. You have to show it for any Borel set B in R^2; it is a slightly involved proof, it would take five to eight minutes to do properly, so I thought we might as well do it in the homework. You know that the pre-images of Borel sets on R are F-measurable for both X and Y; given a Borel set on R^2, you want to prove that its pre-image under (X, Y) is F-measurable, because only then can you talk about its probability.
You will do this in the homework; it is not sophisticated, you just have to do it properly. There will be a guided proof: I will not just say "prove this", I will tell you how to do it in the problem. Good. Once you prove that this set is F-measurable, I have a probability law for Borel sets on R^2: this is the probability law, after all, and we denote it P_{X,Y}; that is, P_{X,Y}(B) is defined to be P({omega : (X(omega), Y(omega)) in B}). If you had not proved that this set is F-measurable, you could not speak of this event or of this probability. You do not need any further assumptions: if X and Y are separately random variables, in the sense that pre-images of Borel sets are F-measurable, you can prove that this is an event. With me? Good. So this is my probability law on R^2; it is called the joint probability law of X and Y. It specifies everything there is to capture about the interdependence of X and Y: you give me any Borel set you like on R^2, and I can tell you the probability of (X, Y) mapping into that Borel set. That is what the joint probability law tells me, and it is a complete statistical characterization of X and Y not just separately, but jointly. From the joint probability law of X and Y, can you generate the probability law of, say, X alone? How will you do that? You simply feed in the right set: for a Borel set B on R, take B x R, which is a Borel set on R^2, feed that in, and you get the marginal probability law of X alone.
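The push-forward definition P_{X,Y}(B) = P({omega : (X(omega), Y(omega)) in B}), and the B x R trick for the marginal, can be made concrete on a finite space. This is a hypothetical sketch of mine (two fair coin tosses), not an example from the lecture:

```python
from fractions import Fraction

# Hypothetical finite space: two fair coin tosses, omega = (t1, t2), 1 = heads.
# X = number of heads; Y = indicator that the first toss is heads.
Omega = [(a, b) for a in (0, 1) for b in (0, 1)]
P = {w: Fraction(1, 4) for w in Omega}
X = lambda w: w[0] + w[1]
Y = lambda w: w[0]

def joint_law(B):
    """P_{X,Y}(B) = P({omega : (X(omega), Y(omega)) in B})."""
    return sum(P[w] for w in Omega if (X(w), Y(w)) in B)

# Marginal of X on Bx: feed in Bx x (range of Y) -- the finite stand-in
# for the set Bx x R from the lecture.
def marginal_X(Bx):
    return joint_law({(x, y) for x in Bx for y in (0, 1)})

print(marginal_X({1}))              # P(X = 1) = 1/2 for two fair coins
assert marginal_X({1}) == Fraction(1, 2)
```

The same underlying omega feeds both coordinates, so joint_law automatically carries the interdependence: for instance, joint_law({(0, 1)}) is 0, since zero heads is incompatible with the first toss being heads.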
Similarly for Y: if you want the probability of Y lying in some Borel set B, the set you feed in is R x B; these are all Borel sets, as you can show. So from the joint law you can get the individual, marginal probability laws of X and Y separately. However, if you are given the probability laws of X and Y separately, you may not be able to get the joint probability law. Separately specifying P_X and P_Y for all Borel sets is not enough to obtain P_{X,Y} on R^2, because you lose the interdependence structure: you only say what the statistics of X separately and of Y separately are; you are not capturing the interdependence. Yes? [Student question.] No, that is not true; I did not say anything like that. They have to be defined on the same probability space. You have some underlying space of coin tosses or dice throws or weather or whatever, and X and Y are both defined on that same probability space. You cannot have X drawn from some experiment involving coin tosses and Y coming from something else, like picking a random student; that is not what we are talking about. All the random variables live on the same probability space. [Student question about sigma(X) and sigma(Y).] Yes: sigma(X) will be some sub sigma algebra of F, and sigma(Y) will be some other sub sigma algebra of F; they may or may not be the same, there is no constraint at all. And no, it is not at all true that if sigma(X) and sigma(Y) happen to be the same sigma algebra then X and Y are equal. The interdependence structure may be quite complicated: the set of all pre-images of Borel sets under X may coincide with the set of all pre-images of Borel sets under Y without the random variables being equal.
To give you a very simple example: take the interval (0, 1), and take X(omega) = omega and Y(omega) = 1 - omega, a very trivial example. Then both sigma(X) and sigma(Y) are in fact the Borel sigma algebra, but you cannot conclude anything like equality of the random variables. Good. So, where was I: that is the joint probability law. If you know the joint probability law, you know everything about the joint statistics of X and Y; but knowing the separate probability laws P_X and P_Y is not enough to conclude what the joint probability law is. Now, in particular, I know that the Borel sigma algebra on R^2 is generated by the semi-infinite rectangles. Since sets of the form (-infinity, x] x (-infinity, y] are Borel on R^2, P_{X,Y} of such a set is well defined: the law is defined for all Borel sets, so in particular it is defined on the generating class. I am repeating the same argument I made in the one-dimensional case. This gives what is called the joint CDF, the joint cumulative distribution function. We denote it F_{X,Y}(x, y): the probability law assigned to the semi-infinite rectangle. Written out fully, it is the probability of the set of omega for which X(omega) <= x, intersected with the set of omega for which Y(omega) <= y: P({omega : X(omega) <= x} ∩ {omega : Y(omega) <= y}). I will write it out properly just this once and abuse notation from now on: this is abbreviated as P(X <= x, Y <= y), a serious abuse of notation.
But this is what I will write from now on; whenever I write it, I actually mean the set-theoretic statement. This is called the joint CDF. With me? [Student: From the joint probability law you can get the individual probability laws; from the individual probability laws, can we get the joint probability law?] No, you cannot do that; I mentioned this earlier. If you are given P_{X,Y}, the joint probability law, you can clearly get the individual probability laws by putting one of the coordinates to be R and keeping whatever Borel set you want in the other. But if you are given the separate probability laws, P_X separately and P_Y separately, you may not be able to figure out the joint probability law; it cannot be done in general. In fact, in the example I gave, the omega and 1 - omega example: if you give me the separate probability laws, they will both be uniform, but the joint law is not the uniform measure on the unit square. Just think about what I said: take the sample space to be the (0, 1) interval, with X(omega) = omega and Y(omega) = 1 - omega. The separate probability laws will both be uniform on (0, 1), but giving P_X and P_Y separately will not give me P_{X,Y}. P_{X,Y} may be some arbitrary measure: even if P_X and P_Y are separately Lebesgue measure, P_{X,Y} need not be Lebesgue measure on the unit square; that is exactly the example I am giving you. So now, again, intellectually this is a complete repeat of what I said for R; in fact, you can do all of this on R^n once and for all. Now, what am I going to say: if I give you the probability law, I can specify the CDF, as I have just done. The question is whether the opposite is true.
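The omega and 1 - omega example can be checked by simulation. This sketch draws uniform samples and confirms both claims: the marginals look Uniform(0, 1), yet the joint law is nothing like the uniform (Lebesgue) measure on the unit square, since all the mass sits on the line x + y = 1.

```python
import random

# Simulation of the lecture's example: X(omega) = omega, Y(omega) = 1 - omega
# on (0, 1) with the uniform (Lebesgue) probability measure.
random.seed(0)
samples = [(w, 1 - w) for w in (random.random() for _ in range(100_000))]

# Both marginals look Uniform(0, 1): P(X <= 1/2) and P(Y <= 1/2) are ~1/2 ...
px = sum(x <= 0.5 for x, _ in samples) / len(samples)
py = sum(y <= 0.5 for _, y in samples) / len(samples)

# ... but the joint law gives mass 0 to [0, 1/2) x [0, 1/2), not the 1/4 that
# the uniform measure on the unit square would: X < 1/2 forces Y = 1 - X > 1/2.
pxy = sum(x < 0.5 and y < 0.5 for x, y in samples) / len(samples)
print(px, py, pxy)                  # roughly 0.5, 0.5, and exactly 0.0
assert pxy == 0.0
```

So the two marginal laws alone cannot distinguish this joint law from the independent one, which is precisely why specifying P_X and P_Y separately does not determine P_{X,Y}.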
If I give you the joint CDF, is it true that I can generate the joint probability law, which is what completely specifies the joint statistics of X and Y? What do you think? Yes, and the reason is that the semi-infinite rectangles form a pi system: the joint CDF defines my measure on the pi system of semi-infinite rectangles on R^2, and we know from the theorem we stated but did not prove that if a measure is specified on a pi system, it is uniquely determined on the generated sigma algebra as well. So the joint CDF uniquely specifies the probabilities of all Borel sets on R^2; this is because of the pi system theorem we derived. You can make a note of that; as I said, everything is a repeat; this class is a total repeat except for the first few things I said. Now I will start stating some properties of the joint CDF; most of these are along expected lines if you know the one-dimensional CDF. We are looking at two random variables, and we will always use this notation: capital F indexed by the random variables, F_{X,Y}; if these were X_1, X_2, X_3 you would write F_{X_1, X_2, X_3}, and the arguments are the thresholds x and y. As you would imagine, this is a function of two variables, so you should picture the plot in 3D. When x and y both go to infinity, the joint CDF goes to 1, and when x and y both go to minus infinity, the joint CDF goes to 0. You will prove this the same way, using continuity of probabilities, just as we showed in the one-dimensional case. One remark here: when I write a limit with x tending to infinity and y tending to infinity together, I mean that x and y can go to infinity along any trajectory, in any order. Normally I would write limit x to infinity, then limit y to infinity, or in some other order.
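One concrete first step toward the pi-system argument: the CDF values on semi-infinite rectangles already determine the probability of every half-open rectangle (a, b] x (c, d], by inclusion-exclusion over the four corners. A sketch with an assumed joint CDF (independent Uniform(0, 1) coordinates, my choice for illustration, not a distribution from the lecture):

```python
# Assumed joint CDF for illustration: X, Y independent Uniform(0, 1), so
# F_{X,Y}(x, y) = clip(x) * clip(y), where clip truncates to [0, 1].
def F(x, y):
    clip = lambda t: max(0.0, min(1.0, t))
    return clip(x) * clip(y)

# Inclusion-exclusion over the four corners recovers the probability of the
# rectangle (a, b] x (c, d] from semi-infinite rectangles alone:
# P = F(b, d) - F(a, d) - F(b, c) + F(a, c).
def rect_prob(a, b, c, d):
    return F(b, d) - F(a, d) - F(b, c) + F(a, c)

# For independent uniforms, P((0.2, 0.5] x (0.1, 0.4]) = 0.3 * 0.3 = 0.09.
print(rect_prob(0.2, 0.5, 0.1, 0.4))
assert abs(rect_prob(0.2, 0.5, 0.1, 0.4) - 0.09) < 1e-12
```

From rectangle probabilities, the pi-system theorem then extends uniqueness to the whole Borel sigma algebra on R^2, which is the content of the claim above.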
When I write limit x to infinity and then limit y to infinity, I mean the inner limit goes first. But if I write the joint limit, I mean it does not matter how x and y go to infinity: you can go along any trajectory, and no matter what the trajectory is, the limit equals 1. Similarly, no matter along what trajectory x and y go to minus infinity, the limit is always 0. Then you have monotonicity: F_{X,Y}(x_1, y_1) <= F_{X,Y}(x_2, y_2) whenever x_1 <= x_2 and y_1 <= y_2. This is again obvious, because the set of omegas that map into the semi-infinite rectangle at (x_1, y_1) is contained in the set of omegas that map into the semi-infinite rectangle at (x_2, y_2). Next, F_{X,Y} is continuous from above, i.e. the limit of F_{X,Y}(x + u, y + v), as u goes down to 0 and v goes down to 0, equals F_{X,Y}(x, y) for all x and y. Here you have to be a little careful, because this is a function on the plane. Let me explain what this means. Picture the x axis and the y axis on the board, with the CDF surface coming out of the board. Take any point (x, y) and approach it "from above", meaning u comes down to 0 and v comes down to 0. They can approach 0 in any order and at any rate, but they must come down to 0 from the positive side: you can let u go to 0 first and then v, or go along whatever trajectory you want, but you cannot approach from the negative side of either coordinate.
As long as (u, v) approaches (0, 0) from the positive quadrant, the limit will equal the functional value, which is what continuity from above means. This is the equivalent of right continuity in two dimensions: both u and v have to approach from the positive side, never crossing into the other quadrants; then the limit holds. Again, what result does this follow from? How will you prove it? Continuity of probabilities: you consider any sequences u_n and v_n going down to 0, write it out, and apply continuity of probabilities. Anybody with me? Finally, one property which has no corresponding equivalent in one dimension, because it concerns the marginal limit: if you take F_{X,Y}(x, y) and send y to infinity, you should get the marginal CDF of X, F_X(x), and this holds for all x; similarly, if you send x to infinity, you get F_Y(y) for all y. What this says is that if you send one of the arguments of the joint CDF to infinity, the function you get is a function of the other variable, and that function happens to be the marginal CDF of that variable. The marginal CDF is nothing but the CDF of X; it is just that when you have two variables, you call the CDF of one of them the marginal and call this the joint. It is your same old CDF of X, the probability that X <= x; similarly F_Y(y) is the probability that Y <= y. Proving this again involves a continuity of probability result: you fix an x and send y to infinity along some sequence, and you end up with the probability that X <= x. So, given the joint CDF, I can determine the marginal CDFs, the separate CDFs of X and Y; but the opposite is not true.
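The marginal-limit property can be checked numerically for any joint CDF in closed form. This is a hedged sketch with an assumed distribution of my choosing (X and Y independent Exponential(1), not from the lecture), for which F_{X,Y}(x, y) = (1 - e^{-x})(1 - e^{-y}) for x, y >= 0; sending y to infinity should recover F_X(x) = 1 - e^{-x}.

```python
import math

# Assumed joint CDF: X, Y independent Exponential(1).
def F_joint(x, y):
    if x < 0 or y < 0:
        return 0.0
    return (1 - math.exp(-x)) * (1 - math.exp(-y))

# The corresponding marginal CDF of X.
def F_X(x):
    return 1 - math.exp(-x) if x >= 0 else 0.0

# Fix x and push y upward: F_joint(x, y) climbs toward F_X(x).
x = 1.3
for y in (1.0, 10.0, 50.0):
    print(y, F_joint(x, y))
assert abs(F_joint(x, 50.0) - F_X(x)) < 1e-12
```

For a product-form CDF the limit is immediate, since the second factor rises to 1; the lecture's point is that the same limit statement holds for any joint CDF, product-form or not, by continuity of probabilities.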
Given the marginal CDFs separately, there is no way to figure out the joint CDF. In fact, you can show that by counterexample: you can exhibit the same pair of marginal CDFs corresponding to two different joint CDFs. If you can construct even one such example, you have proved that given the two marginals you cannot reconstruct the joint; of course, given the joint you can reconstruct the marginals. Essentially, the marginal only captures the statistical behavior of that one random variable, whereas the joint CDF, as the name says, captures the joint distribution: the interdependence structure between X and Y. Any questions? [Student: Can we define the conditional CDF?] We will define the conditional CDF later. Let us stop here.
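Here is a sketch of one such counterexample, with distributions chosen by me for illustration: two joint CDFs whose marginals are both Uniform(0, 1). Joint 1 takes X and Y independent, so F(x, y) = x * y on the unit square; Joint 2 takes Y = X (perfectly dependent), so F(x, y) = P(X <= x, X <= y) = min(x, y) there.

```python
# Two joint CDFs on [0, 1]^2 with identical Uniform(0, 1) marginals.
def F_indep(x, y):        # X, Y independent uniforms
    return x * y

def F_dependent(x, y):    # Y = X, comonotone uniforms
    return min(x, y)

# Same marginal CDF: evaluating at the top of the other coordinate's range
# (y = 1 here, standing in for y -> infinity) gives F_X(x) = x for both ...
assert F_indep(0.3, 1.0) == F_dependent(0.3, 1.0) == 0.3
# ... yet the joint CDFs disagree, so the marginals cannot pin down the joint:
print(F_indep(0.3, 0.3), F_dependent(0.3, 0.3))   # ~0.09 vs 0.3
assert F_indep(0.3, 0.3) != F_dependent(0.3, 0.3)
```

One such pair suffices to prove the claim: the map from joint CDFs to pairs of marginal CDFs is many-to-one.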