So, today we are going to discuss an important property called independence of a set of events. Then we are also going to talk about conditional probabilities, and then we will define what a random variable is. If you recall, in the last class we began with what a probability space is: we defined what we mean by a sigma algebra and what we mean by a probability space. That is the basic building block for us, and we keep building on these aspects. This notion of independence of two events is important in our study of probability, and for independence you have to go purely by the definition that I am going to give you. From now on, whenever I define these things, I am going to assume they are defined on some underlying probability space, with its own sample space, event space F, and an associated probability function P. Given this, let us take two events A1 and A2 from my event space. I am going to say they are independent if the probability of the joint event splits as the product of the probabilities of the individual events, that is, P(A1 ∩ A2) = P(A1) P(A2). This is just a definition. Independence has nothing to do with A1 and A2 being disjoint or anything of that sort; it is purely a property of my probability function. For example, let us take our die example. Let us construct an event space containing the events {1, 3}, {1, 2}, and {1}. Right now I am not worried about whether it is a sigma algebra; maybe it already is, but I am not worried about it. On this, let us define my probabilities like this: P({1, 3}) = 0.3, P({1, 2}) = 0.2, and P({1}) = 0.6.
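The definition can be checked mechanically. Here is a minimal sketch in Python using the numbers from the die example above (the helper name `independent` is mine, not from the lecture):

```python
# Independence check for the die example:
# A1 = {1, 3} with P(A1) = 0.3, A2 = {1, 2} with P(A2) = 0.2,
# and A1 ∩ A2 = {1}, whose probability we vary.

def independent(p_a1, p_a2, p_joint, tol=1e-12):
    """Two events are independent iff P(A1 ∩ A2) = P(A1) * P(A2)."""
    return abs(p_joint - p_a1 * p_a2) < tol

p_a1 = 0.3   # P({1, 3})
p_a2 = 0.2   # P({1, 2})

print(independent(p_a1, p_a2, 0.6))    # P({1}) = 0.6  -> False, not independent
print(independent(p_a1, p_a2, 0.06))   # P({1}) = 0.06 -> True, independent
```

Note the check uses only the three probability values, never the contents of the sets, which is exactly the point of the definition.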
So, according to this definition of my probability, are my events {1, 3} and {1, 2} independent? Let us take A1 = {1, 3} and A2 = {1, 2}. When I write A1 A2, I mean the intersection A1 ∩ A2. What is A1 ∩ A2? It is {1}, and the probability of that event is 0.6. Now, is this 0.6 going to be the product of the other two probabilities, 0.3 × 0.2 = 0.06? No, right? So under this probability, are you going to say A1 and A2 are independent? No. But suppose I redefine P({1}) to be 0.06; then they are independent under that probability. So as you see, independence has nothing to do with any commonality of the outcomes occurring in the two events. According to this definition, as long as the probability of the joint event splits as the product of the individual probabilities, we are simply going to call them independent. That is the notion for two events. Now, can we talk about independence of three events? When are we going to say A1, A2, and A3 are independent? We say they are independent if all of these conditions hold: P(A1 ∩ A2) = P(A1) P(A2), P(A1 ∩ A3) = P(A1) P(A3), P(A2 ∩ A3) = P(A2) P(A3), and P(A1 ∩ A2 ∩ A3) = P(A1) P(A2) P(A3). Now suppose I want to extend this to A1, A2, A3, all the way up to Ak. What do you think independence of these k events means? Here, for three events, I took all the pairs and also the triple; basically, I took all possible subsets, right? So for A1, ..., Ak, you are going to look at all possible subsets. But among all the subsets, you do not need to check the individual elements, right? You do not need to check P(A1) or P(A2) on their own, even though the subsets include the singletons, because those conditions are trivial. So overall, how many conditions will there be?
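The all-subsets requirement can be sketched with a small checker. As an illustration (my own construction, not from the lecture), two fair coin flips give three events that are pairwise independent but not mutually independent, which shows why the triple condition must be checked separately:

```python
from itertools import combinations
from fractions import Fraction

def mutually_independent(events, prob):
    """Check P(intersection of S) == product of P(A in S) for every
    subset S of the given events with at least two members."""
    def P(event):
        return sum(prob[w] for w in event)
    for r in range(2, len(events) + 1):
        for subset in combinations(events, r):
            inter = set.intersection(*subset)
            rhs = Fraction(1)
            for a in subset:
                rhs *= P(a)
            if P(inter) != rhs:
                return False
    return True

# Two fair coin flips: uniform probability on {HH, HT, TH, TT}.
prob = {w: Fraction(1, 4) for w in ["HH", "HT", "TH", "TT"]}
A1 = {"HH", "HT"}   # first flip is heads
A2 = {"HH", "TH"}   # second flip is heads
A3 = {"HH", "TT"}   # both flips agree

print(mutually_independent([A1, A2], prob))       # True
print(mutually_independent([A1, A3], prob))       # True (pairwise holds)
print(mutually_independent([A1, A2, A3], prob))   # False: the triple condition fails
```

Here every pair multiplies correctly, but P(A1 ∩ A2 ∩ A3) = 1/4 while the product of the three individual probabilities is 1/8.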
If I have to check the independence of k events, how many conditions do you think you need to check? 2 to the power k, minus k, minus 1. Why 2 to the power k first? That is the total number of subsets. And why do you have to remove k? The k singleton sets have to be removed, since there is nothing to check for an individual event. And the 2 to the power k subsets also contain what? The null set. I do not need to check anything for the null set either, so I remove 1 more. So if you want to check independence of k events, you need to check 2^k − k − 1 conditions: one for each subset of size at least 2. As you see, checking independence of k sets already requires a large number of conditions; it grows exponentially, like 2 to the power k. So often, instead of looking at this kind of joint independence, we will be interested in pairwise independence, which is a weaker condition. When I want to check only pairwise independence of these k events, we take two at a time and check whether they are independent. So to check pairwise independence, how many conditions do you need? k choose 2. Next we are going to define the notion of conditional independence, but before that we will define conditional probabilities. When you are going to model many things, you will often be faced with the situation: what happens if a particular event has happened? There are many random things that can affect the outcome, but you have observed that some event has already happened.
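The two counts above can be tabulated directly; a quick sketch (function names are mine):

```python
from math import comb

def n_mutual(k):
    """Conditions for mutual independence of k events:
    all 2**k subsets, minus the k singletons, minus the empty set."""
    return 2**k - k - 1

def n_pairwise(k):
    """Conditions for pairwise independence: one per pair, k choose 2."""
    return comb(k, 2)

for k in (2, 3, 5, 10):
    print(k, n_mutual(k), n_pairwise(k))
# k = 3 gives 4 vs 3; k = 10 already gives 1013 vs 45,
# showing the exponential vs quadratic growth mentioned above.
```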
So what happens when you analyze such a system? For example, suppose you have two dice and you throw them one after another: you throw the first one, and after that you throw the second one. Throwing the dice one after another is one experiment for me. What is the sample space for this; how many points are there? There are 36 sample points in the sample space. Now suppose that as soon as I threw the first die, I observed its outcome. There is still randomness, because I have not yet rolled my second die, but the randomness has significantly reduced, because I have already made an observation about the first one. If you know the first outcome, what does your sample space reduce to? It just reduces to 6 points, right? Now you only have to look at the randomness that can come from those 6 outcomes. So you see that in this case, after observing something, you want to condition on it and look at the new induced sample space, and that leads us to something called conditional probabilities. We are going to write P(A | B). What is the meaning of this? The probability that event A happens given that B has already happened, and we define it as P(A | B) = P(A ∩ B) / P(B). Now suppose I ask you: what is the probability that the sum of the outcomes of the two dice is 10, given that the outcome of the first die is 4? How are you going to compute this? I already told you the outcome of the first die is 4, so you only look at the possible outcomes (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), right? Your sample space has reduced, conditioned on this. Within it, you look at the event of interest, that the sum of the dice is 10. In what way is this possible?
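The two-dice computation can be written out explicitly. A minimal sketch, enumerating the 36 equally likely points:

```python
from fractions import Fraction
from itertools import product

# Sample space: ordered outcomes of two fair dice, 36 equally likely points.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event), len(omega))

B = {w for w in omega if w[0] == 4}       # first die shows 4
A = {w for w in omega if sum(w) == 10}    # the two dice sum to 10

cond = P(A & B) / P(B)                    # P(A | B) = P(A ∩ B) / P(B)
print(cond)   # 1/6 -- only (4, 6) survives in the reduced sample space
```

P(A ∩ B) = 1/36 and P(B) = 6/36, so the ratio is 1/6, matching the intuition that one of the six remaining outcomes works.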
This is possible only if, after I observed 4, I observe 6 on the second die. That means, out of the restricted set, I will now be looking at the occurrence of the pair (4, 6). This is exactly the idea: given that event B has happened, what is the relative frequency of the event of my interest? So let us say I am interested in some event A after event B has happened. For A to be of any interest, the relevant part of A has to lie inside B, right? Anything outside B can no longer occur. For example, "the sum of the two dice is 10" corresponds to the pair (4, 6), which is one of the 6 possible outcomes I had. So I look at the event of interest that is common to both A and B, namely A ∩ B, and ask for its relative frequency in the new, reduced sample space. This is how we define conditional probability. The event I have been told has happened occurs with probability P(B); that observation has its own probability. Within that, the relative frequency with which I also observe the event of my interest is P(A ∩ B) / P(B), and that is how we define conditional probability. Now, naturally, you want to define this only when the event on which you are conditioning happens with positive probability, right? For example, in the dice case we are not going to condition on the outcome of the first die being 0; that is not a possibility for you. So for that reason, we define conditional probabilities only conditioned on events whose probability is non-zero. Now, as you see, once I fix my B, I can change my event A and ask about whatever event I like.
For example, in the two-dice problem, let us say the event B is "the outcome of the first die is 4". After this I could ask: the sum of the outcomes of the dice is 6, the sum is 7, the sum is maybe 12. I can ask all these questions, right? Each corresponds to a different event A, and each of these events will have its own associated probability, conditioned on B. Now you can ask the question: P was a probability function which had to satisfy a certain set of axioms, right? We said P1, P2, P3 are the conditions it needs to satisfy. Now suppose I treat P(· | B) as a new probability function: I fix the underlying event B and ask for the probability of events A. All these events A are also coming from my script F. Is this again a probability function? If I need to check this, what do I need to verify? I basically need to verify the three properties. What was P1? Non-negativity, right? Is P(A | B) going to be greater than or equal to 0? Yes, by definition: P always gives non-negative values, so the ratio P(A ∩ B) / P(B) is non-negative. This holds naturally. And what is my second property? Can somebody tell me precisely what the second one is? P(Ω | B) = 1. Does this property hold? Why? Just plug in A = Ω here. If A is Ω, what is Ω ∩ B? It is B, right? So P(B) / P(B) gives you 1. What is the third one? Take a sequence of disjoint events A1, A2, ...; we want countable additivity, P(∪ Ai | B) = Σ P(Ai | B), to hold for countably many Ai. Is this true? Just verify this condition as well; it follows from the countable additivity of P applied to the disjoint sets Ai ∩ B.
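The three checks above can be run numerically on the two-dice example. A sketch (finite sample space, so P3 is illustrated with two disjoint events; the countable case works the same way):

```python
from fractions import Fraction
from itertools import product

# Two fair dice, 36 equally likely ordered pairs.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event), len(omega))

B = {w for w in omega if w[0] == 4}   # condition on: first die shows 4

def P_given_B(event):
    return P(event & B) / P(B)

# P1: non-negativity of P(A | B) for every event
assert all(P_given_B({w}) >= 0 for w in omega)
# P2: the whole sample space gets conditional probability 1
assert P_given_B(omega) == 1
# P3: additivity over disjoint events ("sum is 10" and "sum is 7" are disjoint)
A1 = {w for w in omega if sum(w) == 10}
A2 = {w for w in omega if sum(w) == 7}
assert P_given_B(A1 | A2) == P_given_B(A1) + P_given_B(A2)
print("P(. | B) passes the three checks")
```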
So, because of this, even though we just defined P(· | B) as a function of conditional events, we still call it conditional probability: it already satisfies the properties of a probability function. So P(· | B) is a probability function. Now, back to the independence property. I had given you the definition of what I mean by A1 and A2 being independent; suppose I denote independence by A1 ⊥ A2. What does this mean? Under the probability function, when I say A1 is independent of A2, that means P(A1 ∩ A2) = P(A1) P(A2); I am just going to use this shorthand notation for that. Now suppose A1 and A2 are independent. Does this imply that the complement of A1 is also independent of A2? Just apply your definition of probability and see whether it works. So what do I want to show? I need to show that P(A1ᶜ ∩ A2) = P(A1ᶜ) P(A2); this is what I need to show. And what do I know? I know that P(A1 ∩ A2) = P(A1) P(A2), assuming A1 and A2 are independent. Let us try to see why the claim is true. Say I draw event A1 here and event A2 there, with some overlap; that is fine. Now, what is the event A1ᶜ ∩ A2; what region does it include? It includes the part of A2 outside A1. How can I write A2 as a union of two disjoint regions? What is this region here? It is A1 ∩ A2, the part of A2 inside A1. What I want is A1ᶜ ∩ A2, the rest. So you can check this: I can write A2 as (A1 ∩ A2) ∪ (A1ᶜ ∩ A2), the first piece together with the piece I am interested in. Is this correct? And are these two pieces disjoint? Yes.
Since they are disjoint, I can write P(A2) as the sum of the probabilities of the two pieces: P(A2) = P(A1 ∩ A2) + P(A1ᶜ ∩ A2). Now, I have been told A1 and A2 are independent, so I can apply that condition here. What am I going to get? P(A2) = P(A1) P(A2) + P(A1ᶜ ∩ A2). Now simplify: take that term to the other side, and what do you get? P(A1ᶜ ∩ A2) = P(A2) (1 − P(A1)). Is this correct? And what is 1 − P(A1)? It is simply P(A1ᶜ). So P(A1ᶜ ∩ A2) = P(A1ᶜ) P(A2), which is what we wanted. So you can see that if A1 and A2 are independent, you can derive many such properties; in this case, I have derived one simple one, that A1ᶜ and A2 are also independent. You can keep extending this idea; let us see if you can understand this. Suppose E1, E2, ..., En are independent events; how many events are there in total? There are n events. Suppose I partition them into groups. I take the first n1 of them, E1, E2, all the way up to E_n1; say n = 50 and n1 = 10, so I take the first 10. Then I take the next batch, E_{n1+1}, E_{n1+2}, all the way up to E_n2; say n2 = 30, so this batch runs from E11 through E30. And like that I make whatever the remaining groups are, the last one running all the way up to En. So I basically group them: the first group consists of n1 elements, the second consists of n2 − n1 elements, and so on, with n elements in total. Now, within each group, take whatever unions, intersections, and complementations of the sets you like.
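The complement result can be checked on a concrete pair of independent events. A sketch using two fair dice (the particular events A1 and A2 are my choice for illustration):

```python
from fractions import Fraction
from itertools import product

# Two fair dice, 36 equally likely ordered pairs.
omega = set(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(len(event), len(omega))

A1 = {w for w in omega if w[0] % 2 == 0}   # first die is even,  P = 1/2
A2 = {w for w in omega if w[1] <= 2}       # second die is <= 2, P = 1/3
A1c = omega - A1                           # complement of A1

print(P(A1 & A2) == P(A1) * P(A2))    # True: A1 and A2 are independent
print(P(A1c & A2) == P(A1c) * P(A2))  # True: so are A1 complement and A2
```

This is exactly the derivation above in numbers: P(A1ᶜ ∩ A2) = (1/2)(1/3) = P(A2)(1 − P(A1)).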
For example, just as with two events I took A1 complement and A2, within a group you can take arbitrary combinations: maybe E1 complement, intersected with E2, intersected with E3 complement, any combination like this. Call the resulting event of the first group F1, the result of the second group F2, and so on up to the i-th group, Fi. Now you can argue that if E1, E2, up to En are independent, then F1, F2, ..., Fi are also independent. Do you get the sense of what I am trying to say? You take one group of the events, do arbitrary operations on them, complementation and intersection, and call the result F1; take another group and do the same to get F2. If you now look at these resulting sets, they are themselves independent. When I say operations, these are like any Boolean functions: you can do intersections, unions, and complementations of your sets to get F1; similarly you do this to get F2; and it is up to you how many sets you take for F1 and how many for F2, as long as the groups are disjoint. Then F1, F2, all the way up to Fi, will also be independent. So, as you see, independence is actually a pretty strong property, because to satisfy independence of k events you already need to check 2^k − k − 1 conditions. Because of that, when you have a set of independent events and you do operations within disjoint groups of them, you end up with sets that are still independent of each other.
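The group property can be illustrated on the smallest interesting case. A sketch with three independent events (three fair coin flips, each event depending on one coordinate), where F1 is built from the group {E1, E2} and F2 from the disjoint group {E3}:

```python
from fractions import Fraction
from itertools import product

# Three independent fair coin flips; E_i depends only on coordinate i.
omega = set(product([0, 1], repeat=3))

def P(event):
    return Fraction(len(event), len(omega))

E1 = {w for w in omega if w[0] == 1}
E2 = {w for w in omega if w[1] == 1}
E3 = {w for w in omega if w[2] == 1}

# Boolean functions of disjoint groups of the E_i:
F1 = (omega - E1) & E2    # E1 complement, intersected with E2
F2 = omega - E3           # E3 complement

print(P(F1 & F2) == P(F1) * P(F2))   # True: F1 and F2 are independent
```

Here P(F1) = 1/4, P(F2) = 1/2, and P(F1 ∩ F2) = 1/8, as the claim predicts. If F1 and F2 shared one of the E_i, the factorization would not be guaranteed.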