Hi, I'm Zor. Welcome to a new Unizor lecture. I would like to continue talking about random variables with continuous distributions. Primarily, I would like to explain how to describe such a distribution in more precise mathematical terms using the cumulative distribution function. This lecture is part of a course of advanced mathematics for high school students, for teenagers actually, and it's presented on unizor.com. Along with many other aspects of the educational process, such as notes for each lecture, there are exams for enrolled students. So I do recommend watching this lecture from the unizor.com website, even if you found it somewhere else like YouTube, primarily for these extra materials the website contains. In this particular case, it's definitely the detailed notes, which make sense to read before or after you listen to the lecture. All right, so we are talking about a function called the cumulative distribution function, first for continuous random variables, that is, random variables with a continuous distribution of probabilities, and then we will apply it to discrete random variables as well. Let me use the same example from the previous lecture, where I introduced continuously distributed random variables. Say you have a tennis ball and you are trying to determine its weight, and we will assume the instruments you have are absolutely precise. These are the same kind of tennis balls we considered in the previous lecture: the ones whose weight varies from a minimum of 50 to a maximum of 60 grams. So you are measuring the ball, and since the balls are manufactured by different manufacturers and so on, there is some distribution of probabilities of finding a ball of any particular weight or within a certain range of weights.
Now, what's important about random variables which are distributed continuously, like in this particular case from 50 to 60, is that the probability of any specific value is zero. The probability of our random variable c, which is the weight of the tennis ball, being exactly equal to 55 grams is zero, and the same holds for any other specific value. However, if we are talking about a range, say P(52 ≤ c ≤ 56), this will be greater than zero. So any range has a weight, a probability, greater than zero, but any specific value always has probability zero. In a way, it's absolutely analogous to measuring lengths on a segment: the length of any interval within the segment is definitely greater than zero, but the length of one particular point on the segment is equal to zero. And as I have said many times, probability, length, weight and so on are all related; they are all measures, and measures in mathematics have certain properties, for instance, additivity. Here it's exactly the same thing. For continuously distributed random variables, the probability of any specific value is zero, just as the length of an interval whose beginning and end coincide is zero. But if you take any interval whose beginning and end are different, you get a probability not equal to zero, just as you get a length not equal to zero. And if you widen the range, you increase the length and you increase the probability, up to the maximum: we have decided that our ball can only be within these boundaries, so for the whole interval the probability is equal to one, because there are no balls other than those within this weight interval. All right, so now the question is: how can I describe the distribution of probabilities of this particular random variable?
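To make this point concrete, here is a small Python sketch. It is not from the lecture itself: it assumes, purely hypothetically, that the probability of an interval is proportional to its length (an assumption the lecture only introduces later, under the name uniform distribution), just to show numerically that intervals get positive probability while a single point gets zero.

```python
# Hypothetical assumption for illustration only: the ball's weight is
# uniformly distributed on [50, 60], so P(a <= c <= b) = (b - a) / 10.

def interval_probability(a, b):
    """Probability that the weight falls in [a, b], clipped to [50, 60]."""
    lo, hi = max(a, 50.0), min(b, 60.0)
    return max(hi - lo, 0.0) / 10.0

# Any genuine interval has positive probability:
print(interval_probability(52, 56))   # 0.4

# Shrinking the interval around a single value drives the probability to 0:
for width in (1.0, 0.1, 0.001):
    print(interval_probability(55 - width / 2, 55 + width / 2))

# A "point interval" (beginning and end coincide) has probability exactly 0:
print(interval_probability(55, 55))   # 0.0
```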
Well, obviously it's not easy, because what I would like to know is the probability P(a ≤ c ≤ b) for any a and b between 50 and 60. For any interval of values, I would like to know the probability, because that's exactly what it means to know the distribution of probabilities. And that's obviously not easy, because there are infinitely many such pairs of numbers. I cannot really specify that for a equal to 55 and b equal to 56 the probability is this, for a equal to 57.5 and b equal to 58.77 it's something else, and so on. I cannot describe my distribution in those terms. So now we have a very important problem: we have to find out how to describe this probability in a simple and yet sufficient way, so that I can determine it for any a and b. Before doing that, let me make another analogy. Say you're driving a certain distance from point A to point B. Obviously you're driving at varying speed, right? Now, suppose there is a question: how far did you drive from time t1 to time t2? How can I answer that question in a relatively simple form, and using what? While driving, even with all the instruments on your panel, it's not such an easy thing; you cannot really mark for every time interval exactly what distance has been covered. To help you with this, you have an odometer. The odometer gives you your mileage, or kilometers, whatever units you are using, at any specific moment. Let's assume your odometer is absolutely precise, which means that whenever you look at it, it gives you the exact distance you have covered. More than that, let's put it into a functional form: you have a function whose value is your odometer reading at moment t.
Now, if you have this particular function, which you can basically put into a computer or even represent graphically, just one function of one argument, is it sufficient to answer the question of how much was covered from any moment t1 to any moment t2? Absolutely, and it's very easy. Assuming t2 is later than t1, it's the value of the function at t2, which is how much was covered by t2, minus the value at t1, which is how much was covered up to t1. That gives you how much was covered during this interval, and it's a very simple thing. So you have one function which basically describes completely the whole road you have driven along, and it's sufficient to answer how much you really drove from 1 o'clock to 2 o'clock, or from 1:30 to 2:30, or from 9 o'clock in the morning to 9 o'clock at night, whatever. Any question of that kind can be answered using just one relatively simple function of one argument: what is my odometer actually reading at any moment t? So, if this function is given, you can answer all these questions. In exactly the same way, we will solve the similar problem with probabilities. Namely, we will introduce a function which we will call the cumulative distribution function, a function of one particular argument. Let's use the argument x. Its value is the probability of our random variable c being less than x. Now, in our case with the tennis balls: if x is 50, what's the probability of our random variable, the weight of the ball, being less than 50? Zero. We have already agreed that all the balls are from 50 to 60, so less than 50 gives 0. What's the value at 60? It's 1, because the weight is definitely not greater than 60; everything is less than 60.
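The odometer analogy above can be sketched in a few lines of Python. The particular odometer function here is made up purely for illustration; any non-decreasing reading would do. The point is that one function of one argument answers every "how far between t1 and t2" question by a single subtraction.

```python
# Hypothetical odometer reading as a function of time (hours -> kilometers).
# The specific formula is invented for this sketch; only its monotonicity matters.
def odometer(t):
    """Total distance covered by time t, in kilometers."""
    return 60.0 * t + 5.0 * t * t

def distance_covered(t1, t2):
    """Distance driven between moments t1 and t2 (t2 later than t1):
    simply the difference of two odometer readings."""
    return odometer(t2) - odometer(t1)

print(distance_covered(1.0, 2.0))   # km driven during the second hour: 75.0
```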
Or equal, but it doesn't really matter, because we have agreed that the probability of getting any specific value like 60 is zero anyway. And now we can always say that the probability we are looking for, of c being between a and b, is F(b) minus F(a). Say a is 55 and b is 57; what's the probability of our tennis ball weighing between 55 and 57? Well, F(57) is the probability of weighing less than 57, and F(55) is the probability of weighing less than 55. We have to subtract the latter, because we are not interested in whatever is less than 55; we are interested only in values greater than or equal to 55, and that's why we subtract. Probability is a measure, and a measure is an additive function. So, with 50 and 60 as the endpoints and 55 and 57 marked in between, the probability of being within the interval from 55 to 57 is the probability of being below 57 minus the probability of being below 55. It's exactly like a length or a weight or any other measure. So this is how we can use our cumulative distribution function. That is the definition of this function, and it's defined for any x, by the way, for any real x: it can be less than 50 or greater than 60. For every x less than or equal to 50, this function is equal to zero. So let's draw a graph of this function. It's defined for every x; mark 50 here and 60 here (forget about the scaling, obviously). What happens? Our function is definitely equal to zero before 50, and it is definitely equal to one for every x greater than 60, because the probability of being less than 60 is 100%. In between, it's whatever it is, but what I would like to say is that it's a monotonically non-decreasing function, because the greater x is, obviously, the greater the probability is. Again, probability is a measure. If I increase x, let's say to x plus d.
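The subtraction rule P(a ≤ c < b) = F(b) − F(a) works for any cumulative distribution function, not just a particular one. As a sketch, here is a deliberately non-uniform, invented CDF for the ball's weight (a made-up example, not from the lecture) used to compute the probability for the interval from 55 to 57:

```python
# Hypothetical CDF, invented for illustration: F(x) = ((x - 50) / 10) ** 2
# on [50, 60], 0 below 50, 1 above 60. Any valid CDF would work the same way.
def F(x):
    if x <= 50:
        return 0.0
    if x >= 60:
        return 1.0
    return ((x - 50.0) / 10.0) ** 2

def prob_between(a, b):
    """P(a <= c < b), computed as F(b) - F(a)."""
    return F(b) - F(a)

# For this made-up F: F(57) - F(55) = 0.49 - 0.25, approximately 0.24.
print(prob_between(55, 57))
```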
Then the probability of c being less than x plus d would actually break into two parts. With x and x plus d marked between 50 and 60, the probability of being below x plus d is equal to the probability of being below x, which is F(x), plus the probability of being between x and x plus d, and that probability is always greater than or equal to zero. So by increasing x we are always adding something to the probability. It's easier to be less than 57 than less than 55, right? As we increase the boundary, more elementary events fall into this category. So the function is monotonic. I'm not saying it's increasing; I would say it's non-decreasing, and why, let me address a little later, when I talk about discrete distributions. So that's the function which is called the cumulative distribution function of our random variable c. Now, by the way, if I would like to know something a little more complex than this, I can also do it, because, again, probability is an additive measure. For instance, I would like to know the probability of c being in the interval from a1 to b1 or in the interval from a2 to b2, where these are non-intersecting intervals. Obviously, since probability is an additive function, and this "or" joins completely non-intersecting events with no elementary events in common, this probability is equal to the sum of the two probabilities. The first one is equal to F(b1) minus F(a1), where F is the cumulative distribution function, and the second, which we have to add, is F(b2) minus F(a2). So the function F, the cumulative distribution function, is completely sufficient to describe the behavior of our random variable. Now, what does it mean to describe the behavior? Well, it basically means that we can really find out the probability of this random variable being in any particular range of values.
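The additivity argument above can be sketched as follows. The CDF here is again a stand-in (the proportional-to-length one on [50, 60]); the point is only that for non-intersecting intervals the probabilities simply add.

```python
# Stand-in CDF for the sketch: proportional to length on [50, 60],
# clamped to 0 below and 1 above.
def F(x):
    return min(max((x - 50.0) / 10.0, 0.0), 1.0)

def prob_union(a1, b1, a2, b2):
    """P(c in [a1, b1) or [a2, b2)) for two non-intersecting intervals:
    additivity lets us sum the two interval probabilities."""
    return (F(b1) - F(a1)) + (F(b2) - F(a2))

# (F(53) - F(51)) + (F(59) - F(56)) = 0.2 + 0.3, approximately 0.5:
print(prob_union(51, 53, 56, 59))
```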
And this question is answered by the cumulative distribution function, right? So that's a little more complex, but we can always find it out. Now, one particular example of a distribution is called the uniform distribution. Let me talk about just this one little example. Assume, again, the tennis ball example: the weight is definitely from 50 to 60, but now the probability is proportional to the length of the range of values. The greater b minus a is, the greater the probability, with the same coefficient: the probability is k·(b − a), where k is some number and 50 ≤ a ≤ b ≤ 60. Now, obviously, if we want the entire interval from 50 to 60, whose length 60 minus 50 is 10, to have probability 1, the coefficient k should be 1/10. So let's assume our probability is proportional to the length of the interval. Is that unreasonable? It's perfectly reasonable: the wider the interval, obviously, the greater the probability should be, so assigning it proportionally is quite natural. This is called the uniform distribution of the values of the random variable between the minimum and maximum values it can take. From 50 to 60 it's uniformly distributed: there is the same probability, say, to find the value between 50 and 51 as between 57 and 58 as between 55 and 56, because the lengths of all these intervals are equal to 1, right? Or any other similar example. So, in this case, I can say that my F(x) is equal to (1/10)·(x − 50): here b is x and a is basically 50, because 50 is the minimum. That's my distribution function, and this formula applies when x is from 50 to 60. Now, if x is less than 50, the function equals 0, and the function equals 1 for all x greater than 60.
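The piecewise definition above can be written out directly as a short sketch, with the interval endpoints as parameters, and checked against the lecture's observation that equal-length subintervals get equal probability:

```python
# Uniform CDF on [50, 60], exactly as defined piecewise in the lecture:
# 0 for x <= 50, (x - 50)/10 in between, 1 for x >= 60.
def uniform_cdf(x, lo=50.0, hi=60.0):
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

# Equal-length subintervals get equal probability (both approximately 0.1):
print(uniform_cdf(51) - uniform_cdf(50))
print(uniform_cdf(58) - uniform_cdf(57))
```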
So, that's a complete definition of our function. Now, let's draw a graph of it. Again, 50 and 60 on the horizontal axis, 0 here and 1 here. Between 50 and 60 the function is linear: it equals 0 at x equal to 50, and at x equal to 60 it equals 60 minus 50, which is 10, divided by 10, which is 1, which is here. So it's a straight line, and that's the graph of the uniform distribution between the values 50 and 60 in this particular case. All right. So this is the uniform distribution of a random variable on a segment between certain values. There is no such thing as a uniform distribution on the entire line, because the line is infinite; we cannot really stretch this construction from negative infinity to positive infinity, it doesn't work. So a uniform distribution always lives on a finite segment. Now, there might be examples of cumulative distribution functions which are non-zero on the entire line. They look something like this: the graph asymptotically approaches 0 on the left, then at a certain point rises, and asymptotically approaches 1 on the right, still monotonically increasing. Something like the normal distribution would have a graph like this, because any value can be taken, but the more you deviate from the middle of the distribution, the smaller the probability of getting into that area. And now let me mention one more thing. You see, the cumulative distribution function is really a relatively universal tool for analyzing a random variable, and its universality is that it is applicable not only to random variables with continuous distributions, but also to discretely distributed random variables. Here's how we can do it. Consider an example: a die. A die is a model of a random variable which takes the values 1, 2, 3, 4, 5, 6, each with probability 1/6, right?
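The lecture only sketches the shape of a CDF that is non-zero on the whole line. For a concrete instance, one standard formula expresses the normal CDF through the error function, Φ(x) = (1 + erf(x/√2))/2; the sketch below uses that formula (the mean and standard deviation parameters are an addition for illustration):

```python
import math

# Normal CDF via the error function: rises from 0 toward 1 over the whole
# real line, steepest near the mean, asymptotic at both ends.
def normal_cdf(x, mean=0.0, std=1.0):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

print(normal_cdf(0.0))    # exactly 0.5 at the mean
print(normal_cdf(-4.0))   # very close to 0 far below the mean
print(normal_cdf(4.0))    # very close to 1 far above the mean
```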
Now, I would like to again use my apparatus of the cumulative distribution function to describe the probability of this c being less than x. Let's think about what this function is supposed to look like, marking 1, 2, 3, 4, 5, 6 on the horizontal axis. What's the probability of being less than 1? Obviously 0, right? So the function is equal to 0 up to this point, not including the endpoint. Now, what's the probability of being less than x for x greater than or equal to 1 but less than 2? Only the value 1 qualifies, because 2 has not been reached yet, so it's 1/6. So the function jumps to 1/6 at the point 1, and for every x between 1 and 2 it goes horizontally, not changing its value, not including the point 2. At the point 2 it jumps again: for x between 2 and 3 the probability includes the values 1 and 2, which gives 2/6, and it stays there up to the point 3. And so it goes, step by step: 3/6 between 3 and 4, 4/6 between 4 and 5, 5/6 between 5 and 6, and from 6 onward it equals 6/6, which is 1, and it stays at 1 as x goes to infinity. So that's the graph of our function, the cumulative distribution function of a random variable with discrete values: a staircase of horizontal steps. Now, in this case, as you see, the function is not monotonically increasing, it's monotonically non-decreasing, right? Because over each of these intervals of x it just stays constant. So my point is that the cumulative distribution function is really a universal function for describing the behavior of any random variable, discrete or continuous, and that's why it's very convenient. And it always exists, so to speak: for any random variable, we can always construct this type of function.
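The staircase just described can be sketched directly: the CDF of a fair die counts how many faces lie strictly below x, each contributing 1/6.

```python
# Step CDF for a fair six-sided die: F(x) = P(outcome < x),
# which jumps by 1/6 at each face value 1..6 and is flat in between.
def die_cdf(x):
    """P(outcome < x); note the strict inequality, matching the lecture."""
    faces_below = sum(1 for face in range(1, 7) if face < x)
    return faces_below / 6.0

print(die_cdf(1))     # 0.0: nothing lies below 1
print(die_cdf(2))     # 1/6: only the face 1 lies below 2
print(die_cdf(3.5))   # 0.5: faces 1, 2, 3 lie below 3.5
print(die_cdf(7))     # 1.0: every face lies below 7
```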
And the knowledge of this function is sufficient to basically describe the complete distribution of probabilities of any random variable. And complete means that we can actually answer the question: what's the probability of this particular random variable being between this value and that value, for any pair of values, et cetera. So all questions of this type, all events, if you wish, related to the behavior of our random variable, are answerable using one cumulative function, the cumulative distribution function. Okay. Basically, that's all I wanted to discuss right now. And I do suggest you go to unizor.com and read the notes for this lecture. It's always useful; it's like a textbook, basically. Whenever you have listened to a lecture, it's always good to read the same material again, just to refresh your memory. And, if you would like, I definitely recommend registering on the site, because it will allow you to take exams, which is always useful. You can take any exam any number of times, just as a self-checking procedure. That's it for today. Thank you very much and good luck.