In the previous lecture we discussed the various laws which govern the calculation of probabilities of events. One thing we notice is that when we work with the raw sample space, it is often quite difficult, or complicated, to list all the possibilities and then trace the manifestations of the various events. It is frequently more helpful to look only at the numerical value of a certain variable. For example, consider a basketball match. At the end of the match we are interested in the total number of baskets scored successfully by each team, since that decides the winner of the match. Similarly, in a game of tennis or badminton the outcome may simply be a number: in badminton the final scores of the best of three games, in tennis the final scores of the best of three or best of five sets, and so on. The entire duration of the match is not important; which point was won by which player is not important. Ultimately we are associating a numerical value with the phenomenon, or you can say the outcome. When this association is formalized for all points of the sample space, we call it a random variable. So let us formally define a random variable. Let (Ω, B, P) be a probability space. A real-valued measurable function defined on Ω is called a random variable. The usual notation for functions in mathematics is f, g, h, etc.; however, for random variables we use X, Y, Z, etc. So X is a function from Ω into the set of real numbers R, and at the same time we impose a measurability condition. By measurable we mean that if we consider a sigma-field of subsets of R, then the inverse image of any set in it should be an event, that is, a member of B. Now let us consider some typical examples.
Consider five shots fired at a target, and let X be the number of successful hits. The sample space contains possibilities such as all five successful, four successful and one missed, and so on, down to all missed. However, X can take only the values 0, 1, 2, 3, 4, 5, and you can see that working with these values is quite easy and convenient. Let us take some more examples. Suppose we are considering the number of bulbs operating after, say, 2 months: if there are, say, 100 bulbs in total and X is the number of them operating after 2 months, then X can take the values 0, 1, 2, up to 100, and we may look at the probability distribution of X. We may instead look at the life of a single bulb; in that case the value could be any number from 0 to infinity, or from 0 to 100, and so on, depending on how we set up the sample space. We can also consider the waiting time at a traffic signal: if this is Y and the timer is set up to at most 3 minutes, then the possible values of Y are all the numbers between 0 and 3. So in each of these examples a numerical value is obtained from the sample space, that is, each element of the sample space is allotted a numerical value; this is called a random variable. Now, based on the type of values taken (the values 0, 1, 2, 3, 4, 5 in the first example; 0, 1, 2 up to 100 in the second; values on an interval in the third) we arrive at a classification of random variables. The first class is the discrete random variables.
If a random variable takes a finite or countably infinite number of values, it is called a discrete random variable. For example, the number of successful hits on the target above is a discrete random variable. Or consider the number of attempts to clear a competitive examination: a person may be allowed a finite number of attempts, say n, so the random variable Y takes the values 1, 2, up to n. Next, the continuous random variables: if a random variable takes values over an interval, it is called a continuous random variable. The waiting time at a traffic signal, taking values in the interval from 0 to 3, and the life of a bulb, taking values in the interval from 0 to infinity, are examples of continuous random variables. There are also random variables which are partly discrete and partly continuous, but that distinction will become clear once I give the probability distribution of a random variable. So let us look at the probability distribution now. How do we define the probability distribution of a random variable? First of all, let us look at its origin. Whenever we have a random experiment, we have a sample space, and a certain assignment of probabilities to the various outcomes or events can be made, depending on the type of experiment. When we allocate real numbers to those outcomes, the corresponding probabilities are also transferred. In the case of a discrete distribution this assignment is quite natural, but in the case of a continuous distribution, since we are dealing with intervals, the assignment does not result from a direct point-by-point transfer. However, we can give a formal definition in the following fashion.
We have (Ω, B, P) as the probability space, and when we define a random variable X: Ω → R, the sigma-field B of events is transformed to another sigma-field, a sigma-field B' of subsets of R, and in place of P we obtain a new probability measure, say Q. The assignment is defined as follows: for any set C in B', Q(C) = P(X⁻¹(C)) = P(B), where B = X⁻¹(C) belongs to B. Here Q is called the probability distribution of the random variable X. Now let us see how this is done in real situations; when we are dealing with a discrete random variable and when we are dealing with a continuous random variable we will have different ways of making this assignment. Consider the discrete case first: here the probability distribution is given the name probability mass function. A discrete random variable takes values over a set which is either finite or countably infinite; let us name it by some script X, say 𝒳. So let X be a discrete random variable taking values in a set 𝒳, which is of course a subset of R. Since 𝒳 is finite or countable, we can arrange its values in a sequence x1, x2, and so on. The probability mass function of X is a function, call it p_X, which satisfies the following three conditions. First, p_X(x_i) is always greater than or equal to 0. Second, the sum of p_X(x_i) over all possible values in 𝒳 is equal to 1. Third, p_X(x_i) is the probability that the random variable X takes the value x_i, that is, p_X(x_i) = P(X = x_i). Now what is the meaning of a statement of this kind? Recall the assignment of values written above: the set B is a subset of Ω, the sample space. So when we write P(X = x_i), it actually means the probability of the set of all those sample points ω for which X(ω) = x_i.
So the full description is P({ω : X(ω) = x_i}), but for brevity we write P(X = x_i), because a probability statement is basically valid for a subset of the sample space. When the meaning is clear it does not matter which of the two statements we write. Let us explain through an example. Suppose a shop has 5 computers, out of which 2 are not working to full capacity, that is, not fully operative; you may say they have some defect. This is not known even to the owner. A customer buys 2 computers, selected randomly out of the given 5. Let X be the number of defectives in his or her purchase. Now, what are the possible values of X? Since there are 5 computers, of which 2 are defective, and this person buys 2, he may get both good, one good and one bad, or both bad; so X takes the values 0, 1, 2. What is P(X = 0)? It means both computers were purchased from the good ones, which can happen in C(3,2) ways, while the total number of ways of selecting 2 computers out of 5 is C(5,2); so P(X = 0) = C(3,2)/C(5,2) = 3/10 = p_X(0). Similarly, P(X = 1) means he purchased one good and one bad, so P(X = 1) = C(3,1)·C(2,1)/C(5,2) = 6/10 = p_X(1). And P(X = 2) means he purchased both bad, so P(X = 2) = C(2,2)/C(5,2) = 1/10 = p_X(2). This assignment of values, p_X(0), p_X(1), p_X(2), is called the probability mass function of this random variable X.
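The counting argument above is a hypergeometric calculation, and it can be checked mechanically. The following is a small Python sketch (Python and the function name hypergeom_pmf are my own illustrative choices, not part of the lecture) that reproduces p_X(0), p_X(1), p_X(2) with exact fractions.

```python
from math import comb
from fractions import Fraction

def hypergeom_pmf(k, defective=2, good=3, drawn=2):
    """P(X = k): exactly k defectives when `drawn` items are picked
    at random from `good` + `defective` items."""
    total = good + defective
    return Fraction(comb(defective, k) * comb(good, drawn - k),
                    comb(total, drawn))

pmf = {k: hypergeom_pmf(k) for k in range(3)}
for k, p in pmf.items():
    print(k, p)
```

As the definition of a probability mass function requires, the three values are non-negative and sum to 1.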
If we want a basic picture, we can consider the plot of this: at 0 the height is 3/10, at 1 it is 6/10, and at 2 it is 1/10; so the distribution can be depicted by a bar diagram. Let us consider one more example of a discrete distribution. Suppose a card is drawn randomly from a well-shuffled pack of 52 cards, and we define the random variable X as follows. If the number on the card is between 2 and 10, its score is that number. If the card drawn is a Jack, Queen or King, that is, a picture card, its score is 10. And if an Ace is drawn, its score is 20. So let X be the score, and we want the probability distribution of X. What are the possible values of X? If the card is between 2 and 10, the score is that number, that is, 2, 3, up to 10; if it is a Jack, Queen or King, the score is again 10; and if an Ace is drawn, the score is 20. So the possible values are 2, 3, 4 up to 10, and then 20. Now, to find the probability distribution, what is P(X = 2)? There are 4 cards with the value 2, and we are drawing one card out of 52, so it is simply C(4,1)/C(52,1) = 4/52 = 1/13. This argument is valid for 3, 4, up to 9 as well, so we can say P(X = j) = 1/13 for j = 2 up to 9. The case of 10 can be clubbed with Jack, Queen and King, because their score is also 10. There are 16 cards which are a 10, Jack, Queen or King, so P(X = 10) = 16/52 = 4/13. And what is P(X = 20)? That corresponds to drawing an Ace, which is again 4/52 = 1/13. So this value can be combined with the others, and ultimately we can write the distribution in the following form.
So the probability mass function of X can be written as p_X(j) = 1/13 for j = 2 up to 9 and for j = 20, and p_X(10) = 4/13. This is the complete probability distribution of the score defined in this particular problem. Now, if we have a continuous random variable, it is obvious that one may not be able to obtain the probability distribution in this fashion, because in the discrete case we are making a simple assignment, and since the elements of the sample space have a probability assignment, it is directly transferred to the probability distribution of the discrete random variable. In the case of a continuous random variable the values are taken over an interval, this facility is not available, and the probability distribution is obtained either from the rule which governs the process, or by looking at the probability histogram of the data, that is, the frequency distribution, and then approximating it by a frequency curve. So let us consider the probability distribution of a continuous random variable. In the continuous case we call it a density function, the probability density function of a continuous random variable. Let X be a continuous random variable; it takes values over an interval, which may be closed and bounded or may be infinite on either side, so in general we consider the set of values from minus infinity to infinity. Over the portion where there is no probability, the density function will be defined to be 0. So in general we define f_X to be the probability density function, in short the pdf, of X.
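The card-score distribution can also be verified by direct enumeration over the deck. Here is a hedged Python sketch (the names ranks and score are my own labels for illustration) that counts the 52 cards by score and confirms the mass function just stated.

```python
from fractions import Fraction
from collections import Counter

# Ranks of a standard deck; each rank occurs in 4 suits.
ranks = list(range(2, 11)) + ["J", "Q", "K", "A"]

def score(rank):
    """Scoring rule from the lecture: 2..10 score face value,
    picture cards score 10, the Ace scores 20."""
    if rank == "A":
        return 20
    if rank in ("J", "Q", "K"):
        return 10
    return rank

counts = Counter(score(r) for r in ranks for _ in range(4))  # 4 suits
pmf = {s: Fraction(n, 52) for s, n in counts.items()}
print(pmf)
```

Counting confirms that the scores 2 through 9 each carry mass 1/13, the score 10 carries 16/52 = 4/13, and the score 20 carries 1/13.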
The pdf must satisfy: f_X(x) is always greater than or equal to 0; the integral of f_X over the real line is equal to 1; and thirdly, for any interval, P(a < X < b) is given by the integral of f_X from a to b. In this statement, in place of a < X < b we may also consider a ≤ X < b, a < X ≤ b, or a ≤ X ≤ b; all of them are considered equivalent. In fact, when X is continuous, the probability of the random variable taking any fixed value c is always 0, P(X = c) = 0 for all c. Note the contrast with the discrete case: there P(X = x_i) is the probability mass function, but in the continuous case this probability is always 0; what you have at c is a density, which may be zero or non-zero. Let me explain through an example. Employees of an organization arrive daily at the office between 9 am and 11 am; let X denote the time of arrival. So we have a 2-hour period, and considering the starting time 9 am as 0, the time can go up to 2. It is found that the probability density function of X is given by f_X(x) = c(4x − 2x²) for 0 < x < 2, and 0 otherwise. By otherwise we mean x outside this interval, which is natural, because we are considering arrival timings between 9 and 11 only; it is a 2-hour period, the probability density function has a non-negative value in this region only, and outside it must be 0. Now, there is an unknown constant c written here, so the first thing is to determine this c. The function has to be non-negative, which we can see directly: 4x − 2x² is indeed positive between 0 and 2, so we need c > 0.
Secondly, to determine c, the integral of f_X must equal 1, which is equivalent to saying c ∫₀² (4x − 2x²) dx = 1. The integral is [2x² − (2/3)x³] evaluated from 0 to 2, that is, 8 − 16/3 = 8/3. So (8/3)c = 1, which means c = 3/8. So the probability density function is actually f_X(x) = (3/8)(4x − 2x²), or, taking 2 common, f_X(x) = (3/4)(2x − x²) for 0 < x < 2, and 0 otherwise. Now suppose we are interested in a certain probability, for example P(1/2 < X < 3/2); that means we want the probability that an employee arrives between 9:30 and 10:30. This is nothing but the integral of (3/4)(2x − x²) dx from 1/2 to 3/2, which can be evaluated easily: it is (3/4)[x² − x³/3] from 1/2 to 3/2, and substituting these values we get the answer 11/16. Or suppose we want P(X < 1/2), the probability that an employee arrives between 9:00 and 9:30: this is (3/4)∫₀^{1/2} (2x − x²) dx = (3/4)[x² − x³/3] from 0 to 1/2 = (3/4)(1/4 − 1/24) = (3/4)(5/24) = 5/32. You can see that the probability of arriving within the first half hour is much smaller; more probability is concentrated between 1/2 and 3/2, that is, more employees arrive between 9:30 and 10:30.
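Because the density is a polynomial, both the normalizing constant and these probabilities can be checked with exact rational arithmetic from the antiderivative used above. A minimal Python sketch (the helper names G and F are my own) follows.

```python
from fractions import Fraction

def G(x):
    """Antiderivative of 4x - 2x^2, without the constant c."""
    x = Fraction(x)
    return 2 * x**2 - Fraction(2, 3) * x**3

c = 1 / G(2)                 # normalization: c * integral over (0, 2) = 1

def F(x):
    """CDF on [0, 2]: c times the antiderivative evaluated from 0."""
    return c * G(x)

p_mid   = F(Fraction(3, 2)) - F(Fraction(1, 2))   # P(1/2 < X < 3/2)
p_early = F(Fraction(1, 2))                        # P(X < 1/2)
print(c, p_mid, p_early)
```

The computation reproduces c = 3/8, P(1/2 < X < 3/2) = 11/16 and P(X < 1/2) = 5/32 exactly.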
As I mentioned, there may be a random variable which takes values over an interval and which also takes values at a finite number of points in between with positive probability. What is the meaning of that? You can see that for a continuous random variable the probability of any single point is 0, whereas in the discrete case we have observed that the probability assignment is made at points only: P(X = x_i) = p_X(x_i). Now, we may have a random variable which carries positive probability at some points and, in addition, allocates probabilities over intervals. This is called a mixed random variable: a random variable which is partly discrete and partly continuous. Let us look at a physical example, the waiting time at a traffic signal. A vehicle is travelling and encounters a traffic signal, and we are looking at the probability distribution of the waiting time. One may say it is an interval depending on the setting of the signal timing; most signals are set up to, say, 1 minute, 2 minutes or 3 minutes, so one may say the time lies in the interval from 0 to 3 minutes. But it may happen that when the vehicle reaches the signal there is a green light, and then the waiting time is actually 0. Now, it may happen that, say, one third, one fourth or one tenth of the time you actually encounter a green signal, so there is a positive probability that X = 0. So for the waiting time X at a traffic signal we may have a positive value of P(X = 0).
Let us say we observe a green signal one tenth of the time, so P(X = 0) = 1/10, and thereafter we have a density over some interval, say from 0 to θ, where θ is the total length of time for which one may have to wait; take the density to be f(x) = 9/(10θ) for 0 < x < θ. You can see that if I integrate this density from 0 to θ, I get 9/10, and 9/10 + 1/10 = 1, so the total probability assignment is complete. But here you can see that over this range you have a density function, whereas at 0 you have a discrete point. So, for example, a question may be asked: what is the probability that the waiting time is less than or equal to θ/2? In that case it is P(X = 0) plus the integral of the density over the interval from 0 to θ/2, that is, 1/10 + ∫₀^{θ/2} 9/(10θ) dx = 1/10 + (9/10)(1/2) = 11/20. So this is an example of a mixed random variable. From this example it is clear that when you have a mixed random variable it may not be very convenient to work with the probability density function or the probability mass function separately. In that case we have a more general function, called the cumulative distribution function of a random variable, in short the CDF. The CDF of a random variable X is defined by F_X(x) = P(X ≤ x). As I mentioned, this is actually the probability of the set of all those sample points ω for which X(ω) ≤ x, but in short we write P(X ≤ x).
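The mixed waiting-time distribution can be captured in one CDF that handles both the atom at 0 and the continuous part. Here is a small Python sketch under the lecture's assumptions (I take θ = 3 minutes purely for illustration; the names P_GREEN and cdf_wait are my own).

```python
from fractions import Fraction

P_GREEN = Fraction(1, 10)   # atom at 0: the car meets a green light
theta = Fraction(3)         # assumed signal cycle length in minutes (illustrative)

def cdf_wait(t):
    """F(t) = P(X <= t): the atom at 0 plus the uniform
    density 9/(10*theta) integrated over (0, t]."""
    t = Fraction(t)
    if t < 0:
        return Fraction(0)
    if t >= theta:
        return Fraction(1)
    return P_GREEN + Fraction(9, 10) * t / theta

print(cdf_wait(theta / 2))   # the lecture's P(X <= theta/2)
```

Evaluating at θ/2 reproduces 1/10 + (9/10)(1/2) = 11/20, and the CDF correctly jumps to 1/10 already at t = 0 because of the point mass there.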
The name cumulative distribution has come about because you are considering the probability of all the points up to x. For example, at x = 0 you consider the total probability assignment up to 0; at x = 1 you further add the probabilities of all values assigned between 0 and 1, on top of the previous ones. That is why it is called the cumulative distribution function. We will see that this function is quite useful and has certain characterizing properties. Certain things can be observed easily. If x tends to plus infinity, then naturally all the points are eventually incorporated, so F_X(x) becomes the probability of the full sample space, which is 1. If x tends to minus infinity, then no values can be assigned below, so it becomes the probability of the impossible event, which is 0. So we can write the following properties. The CDF F_X satisfies: (i) the limit of F_X(x) as x tends to minus infinity is 0; (ii) the limit of F_X(x) as x tends to plus infinity is 1; (iii) if x1 < x2 then F_X(x1) ≤ F_X(x2), that is, F_X is a non-decreasing function; (iv) the limit of F_X(x + h) as h tends to 0 from the right is F_X(x) for all x, that is, F_X is continuous from the right at every point. Now let us see the relation of this F function with the probability mass function in the case of a discrete random variable, and with the probability density function in the case of a continuous random variable.
Further, we have a point to make: if a function F satisfies these four properties, then it is certainly the CDF of some random variable, which is a very striking property. That is, conversely, if F is a function defined from R to R satisfying the above four properties, then it is the CDF of a random variable X; we can actually construct a random variable for which this is the cumulative distribution function. Now let us look at the relationship with discrete and continuous random variables. When X is discrete, we can find the PMF from the CDF, and given the PMF we can find the CDF; both directions work, as we now see. Consider the CDF: F_X(x) = P(X ≤ x), the probability of all those sample points whose assigned value is at most x. Since X is a discrete random variable with probability mass function p_X, this is nothing but the sum of P(X = x_i) over all x_i ≤ x, that is, F_X(x) = Σ_{x_i ≤ x} p_X(x_i). Conversely, suppose the values of X are arranged in increasing order x1 < x2 < …. Then p_X(x_i) = P(X = x_i) = P(X ≤ x_i) − P(X ≤ x_{i−1}) = F_X(x_i) − F_X(x_{i−1}). So given a CDF one can find the PMF, and given a PMF one can find the CDF. Let us look at the example we did earlier: for the defective computers, for which I defined the probability mass function, let us work out the CDF.
From the nature of the distribution it is clear that the random variable takes no values before 0; the values taken are 0, 1, 2. The CDF, however, is defined on the whole real line. So naturally F_X(x) = P(X ≤ x) = 0 if x < 0. The first value is taken at x = 0, with probability 3/10, so F_X(0) = 3/10; but observe that from 0 to 1 there is no further addition of probability, so we can qualify the statement as F_X(x) = 3/10 for 0 ≤ x < 1. At 1 the probability 6/10 is added, so F_X(x) = 9/10 for 1 ≤ x < 2. At 2 a further 1/10 is added, and F_X(x) becomes 1 for x ≥ 2. If we plot this, it looks as follows: up to 0 the value is 0, from 0 to 1 it is 3/10, from 1 to 2 it is 9/10, and thereafter it is 1; with x on the horizontal axis, F_X is a step function. This is further revealing: the function is discontinuous at 0, 1 and 2, and the discontinuities are jumps. The value was 0, and suddenly at 0 it becomes 3/10; the size of the jump, 3/10, is precisely P(X = 0). Similarly, the jump size at 1 is 9/10 − 3/10 = 6/10, precisely P(X = 1); and the jump size at 2 is 1 − 9/10 = 1/10, precisely P(X = 2). So we can conclude that the CDF of a discrete random variable is a step function, with jumps at a finite or at most countably infinite number of points.
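This step-function CDF, and the recovery of the PMF from its jump sizes, can be sketched directly in Python (a minimal illustration; the names points and masses are my own).

```python
from fractions import Fraction
from bisect import bisect_right

points = [0, 1, 2]                                     # jump points of the CDF
masses = [Fraction(3, 10), Fraction(6, 10), Fraction(1, 10)]

def cdf(x):
    """Step-function CDF: accumulate the mass at every jump point <= x.
    bisect_right makes the function right-continuous, as a CDF must be."""
    return sum(masses[:bisect_right(points, x)], Fraction(0))

# the jump size at a point recovers its probability mass
jump_at_1 = cdf(1) - cdf(0.5)
print(cdf(-1), cdf(0), cdf(1.5), cdf(2), jump_at_1)
```

Evaluating between the jump points gives the flat steps 0, 3/10, 9/10, 1, and the difference across the jump at 1 recovers P(X = 1) = 6/10.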
Further, the size of the jump at a point is equal to the probability of that point. If only the CDF were given, we could calculate P(X = 0), P(X = 1) and P(X = 2) by taking these differences. Let us also consider the card example, where X was the score, and look at its CDF. From the assignment, the values run from 2 up to 9, then 10, then 20, so there will be jumps at these points. We can describe the CDF like this: it is 0 for x < 2; 1/13 for 2 ≤ x < 3; 2/13 for 3 ≤ x < 4; and so on, up to 7/13 for 8 ≤ x < 9. At 9 you again add the probability 1/13, giving 8/13 for 9 ≤ x < 10. But at 10 the probability is 4/13, so the CDF becomes 12/13 for 10 ≤ x < 20, because the next value assigned is 20, with probability 1/13; finally it becomes 1 for x ≥ 20. This is a slightly lengthy example, but it is also quite revealing. The jumps are at the points 2, 3, 4, 5, 6, 7, 8, 9, 10 and 20. For example, if I want P(X = 7), I look at the size of the jump at 7, which is 6/13 − 5/13 = 1/13. If I want P(X = 10), the jump size at 10 is 12/13 − 8/13 = 4/13, which is indeed P(X = 10). Now, if the random variable is continuous, you can see that there will be a slight difference in the description.
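The same jump-size bookkeeping can be checked for the card score. A short Python sketch (again my own illustration, built from the PMF derived earlier in the lecture):

```python
from fractions import Fraction

# PMF of the card score: 2..9 with mass 1/13 each, 10 with 4/13, 20 with 1/13
pmf = {j: Fraction(1, 13) for j in range(2, 10)}
pmf[10] = Fraction(4, 13)
pmf[20] = Fraction(1, 13)

def cdf(x):
    """F(x) = sum of the masses at all score values <= x."""
    return sum((p for s, p in pmf.items() if s <= x), Fraction(0))

# reading probabilities back off the CDF as jump sizes
p7  = cdf(7) - cdf(6)     # jump at 7
p10 = cdf(10) - cdf(9)    # jump at 10
print(p7, p10, cdf(20))
```

The jumps reproduce P(X = 7) = 1/13 and P(X = 10) = 4/13, and the CDF reaches 1 at the largest value 20.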
If X is continuous with pdf f_X, then by the definition of the pdf, F_X(x) = P(X ≤ x) = ∫ from minus infinity to x of f_X(t) dt, because we have seen that in the continuous case the probability of an interval is given by the integral of the density over that interval; here the interval is from minus infinity to x. If we look at this carefully, it is actually an indefinite integral of the function f_X, and we know that if F is defined as an indefinite integral, then it is an absolutely continuous function; further, dF_X(x)/dx = f_X(x) almost everywhere. Let us apply this to the example done earlier, the continuous distribution with f_X(x) = (3/4)(2x − x²) for 0 < x < 2, and 0 otherwise. To calculate F_X(x) = ∫ from minus infinity to x of f_X(t) dt, note that the integrand is non-zero only on the interval from 0 to 2; elsewhere you are simply integrating 0. So for x < 0, F_X(x) = 0. For 0 ≤ x < 2 you are actually integrating (3/4)(2t − t²) dt from 0 to x only, and this integral, as we can easily see, is (3/4)(x² − x³/3). And once we cross 2, the full integral is included, so F_X(x) = 1 for x ≥ 2. As you can see, whether we include the endpoint 0 in one piece or the other makes no difference, because this function is actually continuous.
The value at x = 0 is 0 from both sides, and at x = 2 the left-hand and right-hand limits are both 1, so this function is continuous and differentiable, and if you take its derivative you recover the density. This is the relationship between the CDF and the PDF in the case of a continuous random variable. In the next lecture I will explain how this concept of the cumulative distribution function is quite useful when we discuss mixed random variables, because in that case we do not have to talk separately about the density function and the mass function: the CDF itself gives the complete information. We will further look at characteristics of distributions such as expectations, moments, the moment generating function, mean, variance, median, quartiles, and measures of skewness and kurtosis. That we will cover in the following lecture.
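The piecewise CDF of the arrival-time example, and the fact that its derivative recovers the density inside (0, 2), can be sketched as follows (a minimal Python check of my own; the interior point x0 = 1.3 is an arbitrary choice).

```python
def pdf(x):
    """Density f_X(x) = (3/4)(2x - x^2) on (0, 2), 0 elsewhere."""
    return 0.75 * (2 * x - x * x) if 0 < x < 2 else 0.0

def cdf(x):
    """F_X(x) = (3/4)(x^2 - x^3/3) on [0, 2), with 0 below and 1 above."""
    if x <= 0:
        return 0.0
    if x >= 2:
        return 1.0
    return 0.75 * (x * x - x ** 3 / 3)

# F is continuous at the boundaries 0 and 2, and F'(x) = f(x) inside (0, 2);
# verify the derivative numerically by a central difference at one point
h = 1e-6
x0 = 1.3
numeric_derivative = (cdf(x0 + h) - cdf(x0 - h)) / (2 * h)
print(cdf(0), cdf(2), numeric_derivative, pdf(x0))
```

The boundary values confirm continuity (F rises from 0 to exactly 1 across the interval), and the central-difference slope at x0 agrees with the density there to within the step size.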