So, as of now I have simply defined the notions of independence and conditional probability. As we move on you will see that certain analyses of a system become simpler when the independence conditions hold. As we move along we will want to model somewhat complex cases involving many, many events. The event space, this script F, will consist of a great many events, and to define a probability function we would need to assign a number to each of them. But suppose those events turn out to be independent. Then I only need to define probabilities for the individual events; for any joint event built out of independent sets, I can just multiply the individual probabilities, otherwise I would have to define a probability for every joint event separately. So if there is independence, my job becomes simpler: I define probabilities for the individual events and from those I can derive the probabilities of all the joint events. In that way independence brings a lot of simplicity when I have to analyze a big collection of events. For example, if you want the probability that this happened, that happened, and that is going to happen, and you know these events are independent, all you need to know is the probability of each term. Okay, fine.

Now, moving on, we are going to study the concept of total probability. The law of total probability tells us that if I want to find the probability of an event, I can break it into disjoint pieces: I take some partition of my space and look at how much of the event falls in each piece. Let me make this clearer. First, what is a partition? We say E1, E2, ..., Ek partition Ω if they are disjoint and their union covers the entire sample space. For example, if this is your sample space Ω, one possible partition could be E1, E2, E3, E4: they are disjoint and together they cover all of Ω.

Now suppose inside Ω you have an event A, a subset of the sample space, and you want its probability. Knowing something about E1, E2, E3, E4 should help, because some portion of A falls in E4, some in E3, some in E2, and some in E1; if I can somehow compute each of these portions, I should be able to compute the probability of A itself. That is exactly what the law of total probability says. Can I write A as (A ∩ E1) ∪ (A ∩ E2) ∪ ... ∪ (A ∩ Ek)? Is this correct? Here A ∩ E1 means this portion and A ∩ E2 means that portion, and the decomposition is valid because E1 through Ek are disjoint; if they were not disjoint this would not necessarily be correct. And are the pieces A ∩ E1, A ∩ E2, ..., A ∩ Ek themselves disjoint? Yes: because E1, ..., Ek are disjoint, the pieces A ∩ E1 through A ∩ Ek are necessarily disjoint as well.
Now, what is the probability of A going to be? P(A) = P(A ∩ E1) + P(A ∩ E2) + ... + P(A ∩ Ek). Why is that, and which property of probability did I apply here? The third property: I have finitely many mutually exclusive events here, so the probability of their union is the sum of their probabilities.

Now I want to further use my definition of conditional probability. Can you tell me, in terms of conditional probability, how I can write each term, assuming P(Ei) ≠ 0? Write it as a term conditioned on the event Ei. Let me write it out for you: think of A as my A and Ei in the role of B, so each term P(A ∩ Ei) becomes P(A | Ei) P(Ei), and therefore

P(A) = P(A | E1) P(E1) + ... + P(A | Ek) P(Ek).

Does anybody see any usefulness of this formula? Fine, I have just manipulated the definition of conditional probability and expressed P(A) in this form; do you think expressing it this way is of any use anywhere? Can anybody imagine why it is useful at all? It is not given here, but suppose you know the probabilities of E1, E2, E3, E4; you have some prior information about them. You know that Ω can be partitioned like that, and you know the probability of each piece of the partition. Can you leverage that information to compute P(A)? Suppose I tell you that E1, E2, ..., Ek partition Ω, that I know the probabilities P(Ei), and further that I know the conditional probability of A given each piece. These are partitions I have defined at my convenience, and the portion of A falling in each piece, that is P(A | Ei), is something I can compute. Then using this formula you can compute the probability of A. What is this formula basically telling you? Take the portion of A falling in E1, the portion falling in E2, ..., the portion falling in Ek; but before I take the portion of A falling in E1, I also have to weigh it by the probability of E1 happening itself. So if I know the probability of each piece of the partition happening, and the likelihood of my event A happening conditioned on that piece, then summing all of these gives me the probability of A.

Let us take a simple example, but before I write the example, let me give you something called Bayes' formula. I have told you that to compute P(A) you need to know, conditioned on each Ei, the probability of A happening. Suppose that is what I have: the probabilities P(E1), P(E2), ..., which I have defined a priori, and I have also somehow computed P(A | E1), P(A | E2), and so on; these are like prior quantities, and from them I got P(A).
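To see the arithmetic concretely, here is a small sketch of the law of total probability over a three-part partition; the numbers are made up for illustration, not from the lecture.

```python
# Law of total probability over a partition E1, E2, E3 (hypothetical numbers).
p_E = [0.2, 0.5, 0.3]          # P(E1), P(E2), P(E3); they sum to 1 since the Ei partition Omega
p_A_given_E = [0.1, 0.4, 0.7]  # P(A | E1), P(A | E2), P(A | E3)

# P(A) = sum_i P(A | Ei) * P(Ei)
p_A = sum(pa * pe for pa, pe in zip(p_A_given_E, p_E))
print(p_A)  # 0.02 + 0.20 + 0.21 = 0.43
```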
Suppose now I ask you the reverse question: event A has happened; what is the probability that it was due to E1? That is, I want P(E1 | A). I am now asking for a posterior probability, whereas the P(Ei) are my prior probabilities based on my partition. So, event A has happened; what is the probability that this is due to E1? Can you express this posterior in terms of the priors? By the definition of conditional probability,

P(E1 | A) = P(E1 ∩ A) / P(A).

To the numerator I can again apply the definition of conditional probability and write it as P(A | E1) P(E1). For the denominator P(A), I use the law of total probability which I have already derived, expressed in terms of conditional probabilities. So

P(E1 | A) = P(A | E1) P(E1) / [ P(A | E1) P(E1) + ... + P(A | Ek) P(Ek) ],

and this is what we call Bayes' formula. It is indeed one of the important formulas we will come across in our analysis. Let us try to see the usefulness of this formula. What Bayes' formula gives is the posterior probability expressed in terms of my prior probabilities, quantities I compute a priori. How does that help in computing a posterior probability? Let us see an example.

Let us say you are working with a system that has some critical component. Say it is a satellite you want to launch, and that satellite has a critical component; you have built enough redundancy into your system that if this critical component works fine everything should go normally, and even if it fails there is still a possibility that your mission goes through. The prior information you have tried to compute is the failure or success of this critical component (there are only two possibilities), and, given that the component fails, the probability that your mission still completes successfully. You have computed these through the various lab experiments you do extensively in your offline test modes. So let F denote the failure of my critical component; this is some very high-precision unit, and when I put it into actual test mode in the lab environment I see that it fails with probability p. So P(F) = p and P(F^c) = 1 - p. I also have this: supposing the critical component fails, what is the probability that my mission still succeeds, in other words that my system still works? Let W denote this working condition. You have also computed this from your prior lab experiments; let us call it P(W | F) = y, so the probability that the system does not work given the failure is P(W^c | F) = 1 - y. Similarly, from the same experiments you know P(W | F^c), the probability that the mission succeeds when the critical component does not fail. This is all information you can compute before you actually launch your mission: you do various experiments and obtain these numbers.
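As a quick sketch of Bayes' formula over a partition, the following hypothetical snippet computes a posterior P(Ej | A) from assumed priors P(Ei) and likelihoods P(A | Ei); the function name and the numbers are illustrative only.

```python
# Bayes' formula over a partition E1..Ek (all probabilities are placeholders).
def bayes_posterior(j, p_E, p_A_given_E):
    """Return P(Ej | A) from priors P(Ei) and likelihoods P(A | Ei)."""
    # Denominator P(A) comes from the law of total probability.
    p_A = sum(pa * pe for pa, pe in zip(p_A_given_E, p_E))
    return p_A_given_E[j] * p_E[j] / p_A

# Posterior P(E1 | A) for a three-part partition (index 0 stands for E1).
print(bayes_posterior(0, [0.2, 0.5, 0.3], [0.1, 0.4, 0.7]))
```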
Now, these are all prior probabilities that you could compute. Based on them, suppose you observe that your mission worked well, you were successful, and you want the probability that it was successful in spite of the fact that the critical component failed. This is the posterior probability you want to compute: you observed that your mission actually went through, and that is the conditioning event; what is the probability that it went through in spite of your critical component failing? Can you compute this from your prior probabilities, and which formula, which form of Bayes' theorem, are you going to apply?

P(F | W) = P(W | F) P(F) / P(W) = P(W | F) P(F) / [ P(W | F) P(F) + P(W | F^c) P(F^c) ].

Everything is expressed in terms of quantities you already computed beforehand. Do you know P(W | F)? Yes. Do you know P(F)? Yes, I have already computed that, and I have already computed all the other terms. So from the prior probabilities I can go back and compute the probability that the mission actually succeeded in spite of there being a failure in my critical component.

Let us see another simple example, one you can maybe relate to a bit more. You have a question, and you either know the answer or you do not. With probability p you know the answer, and with probability 1 - p you do not. What do you do when you do not know the answer? You guess. And when you guess, you know you are not always going to be lucky; you are taking a risk, and that is where probability comes in: you need to see with what probability you are going to succeed. Let C be the event that your answer is correct and K the event that you know the answer. If you do not know the answer, say the probability that your guess ends up correct is 1/m, and the probability that your guess ends up wrong is 1 - 1/m. When you know the answer you always get it correct, so P(C | K) = 1 and P(C^c | K) = 0.

Suppose an exam setter, somebody setting multiple choice questions, wants to be fair: he wants to set the paper at such a level that a student who got the correct answer got it correct because he knew the answer, not because he just guessed it smartly. He wants to find this probability: the student answered correctly, but what is the probability that he got it correct because he knew the answer? This is the posterior probability I want to compute, P(K | C). Can I compute it from this information at all? How? Let us substitute the values:

P(K | C) = P(C | K) P(K) / [ P(C | K) P(K) + P(C | K^c) P(K^c) ] = 1 · p / [ p + (1/m)(1 - p) ].

Now let us try to understand this. Suppose you fix m and increase p; is this probability going to increase or decrease? Is this an increasing or a decreasing function of p? Increasing.
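Here is a minimal sketch of the exam calculation just derived, with p and m as hypothetical inputs; it assumes P(C | K) = 1 and P(C | K^c) = 1/m as in the lecture.

```python
# Exam example: K = "knows the answer", C = "answers correctly".
# P(K) = p, P(C | K) = 1, P(C | K^c) = 1/m, where m is the number of choices.
def prob_knew_given_correct(p, m):
    return (1.0 * p) / (1.0 * p + (1.0 / m) * (1 - p))

print(prob_knew_given_correct(0.6, 4))   # about 0.857
print(prob_knew_given_correct(0.9, 4))   # larger p -> larger posterior
print(prob_knew_given_correct(0.6, 10))  # larger m -> larger posterior
```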
So that is good, right: if you know the answer, the probability that you got it correct because you knew it increases. Now, m here denotes the number of choices in your multiple-choice question. Suppose you want to be fair to your students. When I say fair, you want the exam to be robust in the sense that if a student got it correct, then with high probability he got it correct because he knew it. If you want that, and m is your design choice, the number of options you can give, do you want to increase m or decrease m? What happens as m increases? If you increase m, this posterior also increases, which means you are making your exam paper more robust. So as you can see, I may not be able to model things exactly, but this gives me, in some sense, a way to model things and get some interpretation out of it. And you will see a plethora of applications where this Bayes formula kicks in; some of them we are going to see in the assignments.

Now after this I want to move on to the definition of random variables. What is a random variable? It sounds like some value which is random. In many experiments our outcomes are descriptive: for example, when you toss a coin the outcome is expressed descriptively as either heads or tails, or when you go to a casino your outcome is just that you won or lost. But a win or a loss may also be associated with numbers: you won because you got more than this many points, or you lost because you did not get that many points. So instead of keeping the outcomes purely descriptive, we may be better off assigning numbers to them. We actually did that already in the case of the dice: we threw it and said the outcome is 1, 2, 3, 4 and so on; those are the different possible outcomes, and we assigned numbers to them. In many, many experiments you do in life, the outcome is not necessarily a number; the outcome is some description, this happened or that happened among all the possibilities. If you are reporting the outcome of the weather, it could be sunny, rainy, cloudy, whatever the options are. Instead of explaining them in this descriptive way, you may want to assign numbers, just to make things uniform: below this temperature we call it one value, above it another, something of that sort. What random variables do, basically, is quantify the outcomes in terms of numbers so that they become more amenable to our analysis.

Let me first give a formal definition of what I mean by a random variable. To define a random variable I need the underlying probability space; in fact I basically only need the first two components, Ω and script F, and you will see why. A random variable is a function from your sample space to the real line R; that means for every possible outcome in your sample space it assigns a real number. But that by itself is not enough to make it a random variable.
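As a toy illustration of the idea of mapping descriptive outcomes to numbers (the labels and values below are illustrative, not from the lecture):

```python
# A toy sketch of a random variable on a descriptive sample space:
# a function that assigns a real number to every outcome.
omega = ["heads", "tails"]            # sample space with descriptive outcomes
X = {"heads": 1.0, "tails": 0.0}      # X maps each outcome to a real number

for w in omega:
    print(w, "->", X[w])
```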
If it is a function like this, it further has to be F-measurable. What do I mean by F-measurable? F-measurability says that for all c in R, the set of outcomes {ω : X(ω) ≤ c}, that is, the set of outcomes whose assigned value is less than or equal to c, has to belong to the event space. That is the definition. What is it basically asking for? I said a random variable X is a function which assigns numbers to the outcomes. Now that I have mapped all my outcomes to numbers, in such an experiment I want to ask questions like: what is the probability that the outcome of the experiment takes a value less than or equal to c? For example, in your dice throw I could ask: what is the probability that the outcome of the dice is less than or equal to 5? In that case this set is exactly the set of all outcomes that take a value less than or equal to 5, and that event, whatever it is, has to belong to script F. Why? Because only if it belongs to script F can I assign a probability to it; my probability space is such that I assign probabilities only to events that belong to the event space. And notice that this condition needs to be satisfied for every c: you may ask what the probability is that the outcome of my experiment is less than 10, or less than 10.003, whatever number you like, and for each such question I want to be able to assign a probability. I can do that only if that event belongs to my event space; that is why for every c I want this set to belong to script F. So basically what we are saying is: a random variable is a function that maps my outcomes to real numbers, and it is such that for every possible real number c, the set of outcomes taking values less than or equal to c is an event that lies in my event space. Okay, fine.

Let us look at an example of what is and what is not a random variable. Take the simple case of our dice problem. In my dice throw, what are the outcomes? On this sample space let me also define an event space, and let us take one special case: say F contains {1, 3} and its complement {2, 4, 5, 6}, and {1, 2, 3, 4} and its complement, along with ∅ and Ω. Let me first define a function on this which assigns a real number to every outcome: I set X(ω) = ω for every ω; the outcomes are 1, 2, 3, 4, 5, 6, and X simply returns the same number as the outcome. This is a trivial example because the outcomes are already numbers, but it is not necessary that outcomes be numbers; heads and tails, I already talked about that. Okay, fine, I have just defined a map from Ω to R. Before I can call it a random variable on this experiment, it has to be F-measurable. Is it F-measurable? How are you going to check F-measurability here? I have given you the definition, so let us apply it: take some c in R. This should be true for every c, so I have the freedom to choose c.
So let me choose c = 3. In that case what is this set going to be? {1, 2, 3}. Does the event consisting of the outcomes 1, 2, 3 belong to my F here? No. So this X is not a random variable on this F. But suppose we expand F: let me include {1, 2, 3}, and if I include {1, 2, 3} let me also include its complement {4, 5, 6}. Now the condition is satisfied for c = 3; is it now F-measurable? Let us take c = 1: the set is just {1}, and {1} does not belong to F, so let me include that as well. Now is it F-measurable? Still no; there is still some c for which it is violated. So let us now include all possible subsets; say F is the collection of all subsets of the sample space. Now is it F-measurable? Yes: whatever c you take, the set {ω : X(ω) ≤ c} will be one of these subsets, and that belongs to F. So it is not the case that every function you define on your sample space is a random variable; it has to be F-measurable. That is because the very purpose of defining a random variable, of assigning numbers to the outcomes, is that I want to measure events, and this condition is exactly what ensures I can do that; if the measurability condition fails, then the way I have defined my random variable is not an appropriate one. Okay, fine, let us stop here. As you see, maybe you just need to digest this fact: not every function on the sample space is a random variable; it has to satisfy my measurability condition appropriately.
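To make the measurability check concrete, here is a small sketch on a finite sample space; the event spaces below are stand-ins for the ones used in the dice example, and the helper name is hypothetical.

```python
from itertools import combinations

# Check F-measurability of X on a finite sample space:
# X is measurable w.r.t. F if {w : X(w) <= c} is in F for every threshold c.
# On a finite space it suffices to check c at the values X takes
# (any event space already contains the empty set and Omega).
def is_measurable(omega, X, F):
    F = {frozenset(e) for e in F}
    for c in sorted({X(w) for w in omega}):
        event = frozenset(w for w in omega if X(w) <= c)
        if event not in F:
            return False, c        # report the first c that fails
    return True, None

omega = [1, 2, 3, 4, 5, 6]
X = lambda w: w                    # the identity map from the dice example

# A small event space, as in the example above: fails (e.g. {1} is missing).
F_small = [set(), {1, 3}, {2, 4, 5, 6}, {1, 2, 3, 4}, {5, 6}, set(omega)]
print(is_measurable(omega, X, F_small))   # (False, 1)

# The collection of all subsets: measurability always holds.
F_power = [set(s) for r in range(7) for s in combinations(omega, r)]
print(is_measurable(omega, X, F_power))   # (True, None)
```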