You may also end up with an uncountable case; let us look at an example. Say this is my x axis and this is my y axis, and I am interested in the region 0 to 1 on each axis. Think of a geographical location shaped like a square, a 1-unit-square area, whose points are represented by x and y coordinates. So my omega is the set of all (x, y) with both coordinates between 0 and 1, and remember the second axiom says that the probability of omega has to be 1; only then is it a proper probability function, otherwise it is not. Now suppose that in this region you assigned a strictly positive number p(x, y) to every pair (x, y). Those of you who know real analysis, what does p(x, y) > 0 mean? It means that for each such point there exists an epsilon > 0 with p(x, y) >= epsilon; and since there are uncountably many points, for some fixed epsilon > 0 there must in fact be infinitely many points whose mass is at least that epsilon. If that is the case, what is the sum of p(x, y) over all (x, y) going to be? Will it be 1? See how many (x, y) points there are: infinitely many, and each of those points has probability larger than epsilon, so we are adding at least epsilon infinitely many times. If you want, take it as an integration instead of a summation; that is fine. But if you integrate p(x, y) when every such point is above epsilon, what is the value going to be? Is it 1? Can it be 1? Can it even be finite? Let us do the simple math.
So for the time being do not worry whether it is a summation or an integration; either will do. If this is the case, we know the total has to be greater than or equal to epsilon for every such point, and you are adding this over all (x, y). So this value is at least epsilon times the cardinality of that set of points, and that cardinality is infinite, in fact uncountable, because of how many (x, y) pairs there are between 0 and 1. So if you add like this, the summation becomes infinity; it cannot be 1. As you see, if the total has to be 1, that is, if this value has to be finite, many of the p(x, y) have to be 0; if they are not 0, you cannot guarantee P(omega) = 1. So now take the other extreme: p(x, y) = 0 for all points. If you add p(x, y) = 0 over all of them, the right-hand side is 0, since adding infinitely many 0s is still 0, but the left-hand side, P(omega), is supposed to be 1, and again you end up with a contradiction. So this extension of additivity from countable to uncountable does not work, simply because many points in an uncountable space must have zero likelihood: they cannot all have positive likelihood. This is just for your reference; we are not going to use it. It is only to make clear why finite additivity is extended to countable additivity but not to uncountable additivity. So, if none of them is 0, the value cannot be finite. But I can think of a simple countable case: instead of (x, y), just take x ranging over the positive integers and ask whether p(x) = 1/x works.
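The counting argument above can be sketched numerically. This is a minimal illustration, not part of the lecture: `eps` is a hypothetical fixed lower bound on each point's mass, and the point is that once the number of points exceeds 1/eps, the total mass is forced past 1, so P(omega) = 1 becomes impossible.

```python
# Hypothetical fixed lower bound on each point's probability mass.
eps = 1e-6

def total_mass_lower_bound(num_points, eps):
    """Lower bound on the summed probability of num_points points,
    each carrying mass at least eps."""
    return num_points * eps

# The bound grows without limit as the number of points grows:
for n in [10**5, 10**7, 10**9]:
    print(n, total_mass_lower_bound(n, eps))
# Once n > 1/eps (here, n > 10**6), the bound already exceeds 1,
# so the masses cannot sum to 1 -- let alone for uncountably many points.
```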
So we take the summation of that quantity and ask whether it is finite. Take x to be an integer, x = 1, 2, 3, and so on, and p(x) = 1/x: is the sum finite? No, it is not; that series diverges. But take p(x) proportional to 1/x^2 instead: that sum is finite, so you can normalize it and make it a proper probability. Now what is this example saying? Here, on a countably infinite set, every point has been assigned a positive mass; I am excluding infinity itself, but every other point has a mass. And the question we were asking is whether P(omega) = 1 can be satisfied, where P(omega) is nothing but the addition over all the points. In this case it could be 1: the x values are countable, and each value has some finite mass. So all the conditions look the same as before, right? The number of points is infinite and none of the masses are 0, and yet the total comes out to be 1. So how can we say what we said earlier? The claim has to hold in general, not just for a particular example; so if in this example every point has positive mass and the sum is still 1, would "some of the values have to be 0" be a wrong statement? No: that conclusion was for the uncountable case, where all but countably many points must have mass 0. In a countable space, all of them can carry some mass. That is exactly what we need to think about: we argued that if every point's mass is above a fixed positive epsilon, the summation cannot be finite, it will explode.
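The countable example can be checked directly. This is a small sketch assuming the normalization constant C = 6/pi^2, which makes p(x) = C/x^2 sum to exactly 1 over the positive integers (since the sum of 1/x^2 is pi^2/6):

```python
import math

# Countable distribution: p(x) = C / x**2 for x = 1, 2, 3, ...
# Normalized with C = 6 / pi**2 because sum(1/x**2) = pi**2 / 6.
C = 6 / math.pi**2

def partial_sum(n):
    """Sum of p(x) over x = 1..n; approaches 1 as n grows."""
    return sum(C / x**2 for x in range(1, n + 1))

print(partial_sum(100))      # close to 1, but not there yet
print(partial_sum(100_000))  # much closer to 1
```

Every point has strictly positive mass, yet the total converges to 1, which is exactly why countable additivity is fine while the uncountable version is not.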
So in your case of 1/x^2, the mass goes to 0 as x tends to infinity, and those points whose mass tends to 0 are included. Because of that, you will not be able to find any fixed positive epsilon below which the masses never go; the epsilon here is not uniform. What I was arguing is that if everybody is positive with some fixed epsilon, the condition is violated; but in the example you are giving, the epsilon itself is tending to 0, so that argument does not apply. So we have to be somewhat careful when we are dealing with uncountably many points. That is why most of our arguments will be made for the finite case; wherever it is easy to extend to the countably infinite case we will do so, but otherwise we will not get into it. That is also why people have built nice, sophisticated theories in terms of measures and so on; those things you will study in advanced classes. Next, how do we interpret probability? I talked about one interpretation, likelihood: if you have a fair coin, the likelihood of heads happening is the same as tails, so I assign equal values to them; that is based on your intuition. But is there any other interpretation of these probabilities? The one we talked about is based on description or preferences; the other interpretation is called the frequentist view. Let me ask a question to understand this frequentist view. Suppose you have a fair coin and you are going to toss it n times; out of these n tosses, say you see heads n1(n) times and tails n2(n) times, and naturally n1(n) + n2(n) = n, the total number of tosses you have made. Now let us look at the ratio n1(n)/n.
So what you are doing is: out of n trials, you compute what fraction of the n outcomes correspond to heads. Now, what do we expect this value to tend to as I increase n to larger and larger numbers? Tends to what? 0.5. Why is that? Your understanding is that if the coin is fair, then over a large number of trials roughly equal numbers of heads and tails must have happened, so the fraction should be almost half. So one could think of probability as the fraction of the time an event happens when the experiment is repeated again and again, where "repeatedly" means you continue toward infinity. The frequentist view is exactly that. How can I define p, the probability of heads? You toss the coin again and again, say 10,000 times, count how many times it came up heads, call that n1(n), and take n1(n)/n as your probability. And this frequentist view is very handy. Consider all the drugs that have come up and gone through various trials, where the makers say the efficacy of the drug is 86 percent, or 95 percent. What does that mean? How did they come up with that number? It is not that they just put down likelihoods by preference; they had to compute these numbers. How? By observing. Observing what? Data, from the trials.
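The coin-tossing argument above is easy to simulate. This is a small sketch, not from the lecture; it tosses a simulated fair coin and watches n1(n)/n settle toward 0.5 as n grows:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

def head_fraction(n):
    """Toss a fair coin n times; return the fraction of heads, n1(n)/n."""
    heads = sum(random.random() < 0.5 for _ in range(n))
    return heads / n

for n in [100, 10_000, 1_000_000]:
    print(n, head_fraction(n))
# The printed fractions drift toward 0.5 as n grows:
# the frequentist probability of heads.
```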
Yes: in a drug evaluation you ask for volunteers, or pay them, to undergo trials; you collect a good number of people, give them the drug, and see on how many of them the drug is effective. Then the number on whom it is effective, divided by the total number of people to whom you administered the drug, can be taken as the probability of the drug's effectiveness, of its curing the condition. That is why, whenever you have data, you will mostly go with this frequentist view; and when you are not dealing with data, when you simply have to model some probabilistic phenomenon and you only have beliefs about the probabilities, then you put in those beliefs and see how things go. Now, we are going to deal with three things. As I said, there is an underlying probability model; we observe that model through the data it generates; and we want to see whether we can imitate the underlying model so that we can generate data in the same way it does. If we can do that, then basically we have understood that system: we can see what it could potentially do, and infer from it. That is where probability, statistics, and data come into the picture, and the interplay between them is what is now called data science. So how are they used? Probability and statistics together provide us a framework to understand the underlying phenomenon and then make what we call consistent inferences, consistent reasoning, and predictions. How is that? Let us talk about a real-world problem, say the weather, where I am interested in predicting whether it is going to rain tomorrow or not.
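The efficacy calculation described above is just a frequentist ratio. A minimal sketch, with made-up trial counts purely for illustration:

```python
def efficacy(effective_count, administered_count):
    """Frequentist estimate of drug efficacy:
    patients on whom the drug worked / patients who received it."""
    if administered_count == 0:
        raise ValueError("no trials administered")
    return effective_count / administered_count

# Hypothetical trial: drug given to 1000 volunteers, effective on 860.
print(efficacy(860, 1000))  # read as "86 percent efficacy"
```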
If I knew how the weather works, how the rain comes and based on what factors, then I could build a probability model myself and predict whether the rain is going to come. But weather is very complex; I cannot simply understand everything on which a weather phenomenon depends. So what we actually do is this: there is the real-world weather, and we try to model it with some probability model. This is the reality and that is our model; the model is ours, the reality is the real-world problem. So we are trying to understand the real-world problem using a probability model, and when we build one, it will come with some associated parameters. Now assume the real-world problem is also governed by some probability model, but one using particular parameters that I do not know. If I knew those parameters, my probability model might be as good as the real world; it is just that I do not know the particular parameters it is using. But what is good is that I can get data from the real-world problem: I observe whether it rains tomorrow or not, and I have recorded whether it rained or not on each of these days over the past 10 years.
So I have this data, and I can use it to do inference: what could the parameters be that explain this data, and how do I effectively identify and estimate them? Statistics is going to tell me how to do all of that. Once you have those parameters, you go back and use them in your probability model, and see what your built model tells you, say whether it is going to rain or not. Then you observe what actually happened, and you check whether your prediction and the data agree. If they agree, good; if not, you know you made a mistake: your probability model is not good enough and needs to be improved. So you take that and use further statistical methods to improve the parameters, then go back and improve your probability model. You continuously try to do this so that your probability model gives better and better predictions, with smaller and smaller errors; that means you have understood the real-world problem and your probability model is capturing it well. So this is the interplay between all these things. If you think of weather forecasting, this is exactly what is happening: people have been collecting data by putting sensors at various places; based on that they estimate parameters, use those parameters in certain probability models, and based on those they predict what the weather is going to be; then they see what actually happened, get new data, improve the parameters with it, and this continues.
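The observe-estimate-predict loop above can be sketched in miniature. This is an illustrative toy, not the lecture's method: the "real world" is assumed to be a Bernoulli rain process with a hidden parameter `TRUE_P_RAIN`, and the "statistics step" is simply the frequentist estimate from the recorded history.

```python
import random

random.seed(1)
TRUE_P_RAIN = 0.3  # the real world's hidden parameter (assumed for this toy)

def estimate(history):
    """Statistics step: estimate the rain probability from past 0/1 records.
    With no data yet, fall back on a prior belief of 0.5."""
    return sum(history) / len(history) if history else 0.5

# Observe the real world day by day and record whether it rained.
history = []
for day in range(5000):
    rained = int(random.random() < TRUE_P_RAIN)  # today's observation
    history.append(rained)

# Re-estimate the model parameter from the accumulated data.
p_hat = estimate(history)
print(p_hat)  # should be close to the hidden 0.3
```

With more recorded days, the estimated parameter tracks the hidden one more closely, which is the sense in which the model "improves" as data accumulates.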
So that is why it is very important that we understand probability models, see how to use data to make inferences and obtain good parameters for them, and improve these probability models so that they capture real-world problems well. I think we have already exceeded the time; we will stop here.