So, what we will do next: so far we have just been defining random variables and the characteristics one should be interested in — expectation, variance, and the characteristic function — and before that we defined the notion of a probability distribution. Now, I can come up with any probability distribution I want, and for it I can compute everything from the expectation all the way up to the characteristic function and all the moments. Fine. Now we are going to look into a certain set of probability distributions that are useful for modelling. I do not want to use an arbitrary probability distribution, right? We know that anything satisfying the axioms is a valid probability distribution, but I am interested in modelling and analyzing a system, and when I model and analyze I also want things to be mathematically tractable, so that I can write expressions and derive some intuition about what is happening. So people often use a set of distributions which seem to reasonably capture the things we want to model but which, more importantly, are tractable: we can explain to each other clearly what we mean by each of them. Some of the standard distributions we are going to discuss next we have already used, but we did not explicitly name them. About the characteristic function: you will see exercise questions related to it in the assignment, where you will see where it is useful, and you will also see where it comes up when I discuss the various probability distributions here.
So, there is a set of distributions people use when you are trying to model something discrete, where the outcomes are discrete, and another set you use when you are trying to model an experiment which takes continuous-valued outcomes. The Z transform is just a mathematical convenience for us which helps us relate the different moments, and, as I said, it uniquely determines the probability distribution function. So now we are going to talk first about discrete distributions; I am talking about probability distributions of discrete random variables, so I will only give the probability mass functions. The first one is a distribution I have already used many times. Its PMF is defined as P(i) = P for i = 1, P(i) = 1 − P for i = 0, and 0 otherwise, and its Z transform is 1 − P + Pz. Can somebody quickly calculate, if I have a PMF like this, what the mean value is and what the variance is? So where is this distribution useful? Did we use it already? I am going to call something a Bernoulli distribution with parameter P when it takes value 1 with probability P, value 0 with probability 1 − P, and all other values with probability 0. This is already complete, right: P and 1 − P, so everything else is 0. I am going to call such a distribution a Bernoulli distribution. So where is this useful? A coin toss, right. If I toss a coin my outcomes are heads and tails, but on this I put a random variable and map head to, say, 1 and tail to 0; then head happens with probability P, tail with probability 1 − P, and this is exactly this distribution.
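A quick numerical check of the Bernoulli facts above — a minimal sketch, not from the lecture, assuming the illustrative parameter P = 0.3: simulate many tosses and compare the empirical mean and variance against the answers to the question posed in class, mean = P and variance = P(1 − P).

```python
import random

def bernoulli_samples(p, n_trials=100_000, seed=0):
    """Simulate Bernoulli(p): 1 with probability p, 0 with probability 1 - p."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n_trials)]

p = 0.3  # illustrative bias, not from the lecture
samples = bernoulli_samples(p)
mean = sum(samples) / len(samples)
var = sum((x - mean) ** 2 for x in samples) / len(samples)

# Theory: mean = p = 0.3, variance = p * (1 - p) = 0.21
print(mean, var)
```

Note also that the Z transform G(z) = 1 − P + Pz encodes the same information: its derivative at z = 1 is P, the mean.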
So, to characterize this distribution all I need is a single parameter P; that is what we were using when we did a coin toss. P is the bias of the coin, and its value can be anywhere in the interval [0, 1]: if it is a fair coin P is half, otherwise it can be any value between 0 and 1. And whenever you see somebody give you a Z transform of this form — for example, 0.5 + 0.5z — what understanding will you have? That it corresponds to a Bernoulli distribution with parameter 0.5. Or maybe somebody gives it to you in a different form; then you can manipulate it, relate it to this form, and realize that, oh, this is just a Bernoulli random variable with this parameter. The second distribution we call binomial, and it has two parameters (n, P), where n is an integer greater than or equal to 1 and P is in the interval [0, 1]. Its PMF is of the following form: the probability that it takes value i is (n choose i) P^i (1 − P)^(n − i), and any other value has probability 0. You can verify that these terms sum to 1, so this is a valid probability mass function. Now, can somebody quickly compute the mean for this? It is nP. Why is it nP? And the variance is nP(1 − P). Why nP? Because it is the sum of n independent Bernoulli trials. So what is this capturing? If you make n successive tosses of your coin, it is the probability that you see i heads out of these n trials, given that the probability of a head coming up in each trial is P.
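The binomial claims above can be verified directly from the PMF. This is a small sketch (parameters n = 10, P = 0.4 are illustrative choices, not from the lecture) checking that the PMF sums to 1 and that the mean and variance match nP and nP(1 − P):

```python
from math import comb

def binom_pmf(i, n, p):
    """P(X = i) = C(n, i) * p**i * (1 - p)**(n - i), as given in the lecture."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

n, p = 10, 0.4  # illustrative parameters
pmf = [binom_pmf(i, n, p) for i in range(n + 1)]

total = sum(pmf)                                          # should be 1
mean = sum(i * q for i, q in enumerate(pmf))              # should be n*p = 4.0
var = sum((i - mean) ** 2 * q for i, q in enumerate(pmf))  # n*p*(1-p) = 2.4

print(total, mean, var)
```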
So you see that if I have a probability mass function like this, there are many situations I can use it to model, and it is given a name: the binomial distribution. Another example: suppose there is a target and you want to bring it down. You have, say, missiles, and each missile hits the target with some probability P. Now you can ask: if I fire 10 missiles, what is the probability that 3 of them hit the target, or 5 of them? Alternatively, you can pose a design question as follows. Suppose there is a target and I want to make sure I hit it with probability 0.99, because, say, I am guarding a very secure area which has to be given a very high standard of security, and I need to decide how many missiles to deploy. The question is: if a target comes into this territory, how many missiles should I fire so that I hit the target with probability 0.99? You would pose it by saying: I want this probability to be greater than or equal to 0.99, I know P, the success probability of each missile, and I want to find the n I should use so that this probability becomes at least 0.99. So you can come up with many such cases where the binomial distribution helps you model things. Then comes the geometric distribution, which has the following PMF: for any i greater than or equal to 1, P(i) = (1 − P)^(i − 1) P. So if I ask what is the probability that i = 3, this says it is (1 − P)^2 P; any other values have probability 0, and here I have written i to be greater than or equal to 1.
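The design question above can be solved by brute force: the probability that at least one of n independent missiles hits is 1 − (1 − P)^n, so we search for the smallest n making this at least 0.99. A minimal sketch, with the per-missile hit probability 0.3 being an illustrative value I am assuming, not one from the lecture:

```python
def missiles_needed(p, target_prob=0.99):
    """Smallest n such that P(at least one of n independent missiles hits)
    = 1 - (1 - p)**n is at least target_prob."""
    n = 1
    while 1 - (1 - p) ** n < target_prob:
        n += 1
    return n

# With an assumed per-missile hit probability of 0.3:
print(missiles_needed(0.3))  # → 13, since 0.7**13 < 0.01 but 0.7**12 > 0.01
```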
So, this can go on: it can take values 1, 2, 3, all the way up to any value you like. The Bernoulli distribution was restricted to 0 and 1, and the binomial only went up to n, but here the support is unbounded. Now suppose you have a case where you want to model something like this: it is not necessary that every time I try I succeed, but I keep repeating the same thing again and again until I succeed. Your question could be: how many trials do I need, or what is the probability that I will succeed on the i-th attempt? That probability is given by this PMF, provided the probability of success in each attempt is P. For example, let us go back to our coin-toss problem. You keep tossing your coin; if you see a head you win, otherwise you lose, and every time you toss, the probability that a head shows up is P. Now, what is the probability that you need i trials to see a head? It is exactly this, right: you fail in the first i − 1 trials and then you succeed. So this gives the probability of succeeding for the first time on the i-th trial, and if you have a scenario like this you can ask what is the probability that you succeed on the 10th trial, the 100th trial, whatever it is. The geometric distribution, the way I have defined it here, has one nice property called the memoryless property. Let us discuss what that is. Suppose I want to ask the following question. Let X be a random variable capturing the number of trials before I succeed. You have been told that the number of trials it took to succeed is already more than j, and now you ask: what is the probability that it is more than i + j? You already have this information. Let us say somebody told you that a guy has already taken more than 4 attempts to clear an exam.
Now you want to ask: what is the probability that this guy is going to need more than 7 attempts to clear the exam, given that he has already taken more than 4 attempts? What will it be? Somebody says it is simply the probability of needing more than i attempts — because the process effectively starts fresh from a new point, which is what we will call the memoryless property. But let us not jump to that; first let us compute it. I have this expression, and all I know before doing anything is that it is a conditional probability, so I apply the definition of conditional probability and see what it looks like: P(X > i + j | X > j) = P({X > i + j} ∩ {X > j}) / P(X > j). Now, what is the probability in the numerator? It is just P(X > i + j), since the event X > i + j is contained in X > j. Next, if X is geometrically distributed, can you compute the probability that X is greater than j? You add up the PMF terms for all values greater than j, and if you do that you will see it is simply (1 − P)^j. So the numerator is (1 − P)^(i + j), the denominator is (1 − P)^j, and if you simplify you get (1 − P)^i. So, given that you have already taken more than j attempts, the probability of requiring more than i further attempts does not involve j at all: the fact that you have already taken more than j attempts does not help you at all in understanding whether you will need more than i additional attempts. In that sense, this is the memoryless property.
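The identity just derived can be checked numerically from the tail formula P(X > k) = (1 − P)^k. A small sketch, with P = 0.25, i = 3, j = 4 as illustrative values (matching the exam example's "more than 4, will it be more than 7?"):

```python
def geom_tail(k, p):
    """P(X > k) for the geometric distribution in the lecture:
    the sum over m > k of (1-p)**(m-1) * p, which telescopes to (1-p)**k."""
    return (1 - p) ** k

p, i, j = 0.25, 3, 4  # illustrative values
lhs = geom_tail(i + j, p) / geom_tail(j, p)  # P(X > i+j | X > j) by the definition
rhs = geom_tail(i, p)                        # P(X > i)
print(lhs, rhs)  # both equal (0.75)**3 = 0.421875
```

The conditioning cancels the (1 − p)^j factor exactly, which is the algebraic content of memorylessness.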
So your understanding of whether you will need more than i further attempts is unchanged; that is why we call this the memoryless property, and this conditional probability, (1 − P)^i, is exactly P(X > i). So we are saying that P(X > i + j | X > j) = P(X > i). Then we have the Poisson distribution. The geometric, as I said, takes all possible integer values 1, 2, 3, and so on; another such distribution is the Poisson. It is defined as Poisson with parameter lambda, where lambda > 0, and its PMF is P(i) = lambda^i e^(−lambda) / i! for integer i ≥ 0, and 0 otherwise. When I say i ≥ 0, I mean the integer values 0, 1, 2, 3, and so on; at any non-integer value the PMF is 0. That means it has mass only on the non-negative integers. Its Z transform is e^(lambda(z − 1)), its mean equals lambda, and that also happens to be its variance. If I have a probability mass function like this, I am going to call it a Poisson distribution. Now, where do you think such a probability mass function is useful? It is basically assigning some mass to each integer value, and this lambda is called the intensity. The Poisson distribution comes in very handy in situations where you basically want to count. Suppose you want to count — say you are at some traffic signal, imagine the IIT main gate — how many vehicles pass when the signal opens.
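The claim that the Poisson mean and variance both equal lambda can be checked directly from the PMF. A minimal sketch, truncating the infinite support (the tail is negligible for the illustrative choice lambda = 3.5):

```python
from math import exp, factorial

def poisson_pmf(i, lam):
    """P(X = i) = lam**i * exp(-lam) / i!, for integer i >= 0."""
    return lam**i * exp(-lam) / factorial(i)

lam = 3.5  # illustrative intensity
# Truncate the infinite sum; terms beyond i = 100 are negligible for this lam.
pmf = [poisson_pmf(i, lam) for i in range(100)]

mean = sum(i * q for i, q in enumerate(pmf))
var = sum((i - mean) ** 2 * q for i, q in enumerate(pmf))
print(mean, var)  # both close to 3.5
```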
So, maybe late at night, around 2, 3 or 4 am, only 1 or 2 vehicles will pass, but at peak time — well, actually nothing passes, right, everything is a traffic jam — but still you can count lots of vehicles moving, and you may want to model something like how many vehicles passed. The number of vehicles is an integer, and it can take any value 1, 2, 3, possibly a large value. To model such things the Poisson distribution comes in very handy. We had the geometric distribution, but that was more suitable for what? The number of trials until success. This Poisson distribution has some more properties which we will discuss later, but it comes up whenever you actually want to count how many things happened: for example, at a traffic signal you count how many vehicles passed, or whatever you want to count. I think it is popularly used in the case where you want to count how many calls arrived in a given duration. Say you are at a telephone exchange and want to manage your resources — you have a certain number of switches, or so much bandwidth — and people are calling from different locations; the number of calls arriving at any given time is something you may want to model as Poisson distributed.
So, it depends on the application. I have just listed a set of distributions here, and it is not that one of them, say the geometric, is always the best; it depends on the application. For something the geometric may not be perfect, but it is the nearest realistic approximation I can go for; for something else maybe the Poisson is the nearest realistic approximation. And these are the ones we have a better understanding of: we know how they look, what they mean, and we are also able to play with them well analytically. As somebody said, you can verify the following: take Binomial(n, P) and let n go to infinity, but in a controlled fashion such that the product nP converges to lambda. Then the limit equals Poisson(lambda). Do you understand this notation? If you have a binomial distribution like this and you let n go to infinity with P going to 0 — that is, a large population, but the success probability in each round is very, very small — while making sure the product nP converges to lambda, then in that limit the binomial distribution is nothing but the Poisson distribution. This is an exercise; please check that you are able to show this formula, and also verify the Z transforms, means, and variances I have written — they are correct. And there is something called the hypergeometric distribution. I have not given you the details of that; just go and look it up: what is the hypergeometric distribution, what are its parameters, and what are its mean and variance?
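Before attempting the exercise analytically, it can help to see the convergence numerically. This sketch (lambda = 2 and the evaluation point i = 3 are illustrative choices) compares Binomial(n, lambda/n) with Poisson(lambda) at a fixed point as n grows:

```python
from math import comb, exp, factorial

def binom_pmf(i, n, p):
    return comb(n, i) * p**i * (1 - p)**(n - i)

def poisson_pmf(i, lam):
    return lam**i * exp(-lam) / factorial(i)

lam, i = 2.0, 3  # illustrative: intensity 2, probability of counting exactly 3
target = poisson_pmf(i, lam)
for n in (10, 100, 10_000):
    # p = lam/n shrinks as n grows, keeping n*p = lam fixed
    print(n, binom_pmf(i, n, lam / n), target)
```

The binomial values should approach the Poisson value as n increases, which is the content of the limit stated above.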