Hi, I'm Zor. Welcome to a new Zor education lecture. I would like to continue talking about probabilities. In particular, this is a continuation of the previous lecture, where I was talking about the cumulative probability distribution; now I would like to talk about the probability density distribution. This is part of the advanced course of mathematics for high school students presented on Unisor.com, and I do suggest you go to this website to watch this lecture, because it also contains very detailed comments and notes for each lecture, which you can basically read like a textbook. In addition, for registered students there are exams and some other educational functionality, procedures like enrolling, etc. All right, so, probability density distribution. Well, let me start from where I finished the previous lecture, about the cumulative distribution. Just a reminder: the function called the cumulative probability distribution, F(x), is basically the probability of our random variable to be less than some value x. Now obviously, as x is increasing, this probability is also increasing, and as x goes to infinity, the probability goes to 1. So the function F(x) is monotonic; well, not decreasing, let's put it this way: not necessarily increasing, but definitely not decreasing. And if we go to the left, towards negative infinity, obviously this probability should go to 0. So the graph of this probability is something like this, asymptotically approaching 1. Or it may actually reach 1 at some point: if, for instance, our variable ξ is concentrated on a certain segment, and there is zero probability that it takes a value less than a or greater than b, then this cumulative probability would be something like this. It starts from 0 at a, ends with 1 at b, and then basically continues with 1 to the right and with 0 to the left. So whatever it is, this represents a cumulative probability.
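As a small illustration of these properties (my own sketch, not part of the lecture), here is a cumulative function F(x) for a variable concentrated on a segment [a, b]: it is 0 before a, 1 after b, and non-decreasing in between. The particular growth law inside [a, b] is an arbitrary choice for the example.

```python
import math

# A hypothetical cumulative probability F(x) = P(xi < x) for a random
# variable concentrated on the segment [a, b]. The square-root growth
# inside [a, b] is an arbitrary choice, purely for illustration.
def cumulative(x, a=0.0, b=4.0):
    if x <= a:
        return 0.0                       # no values below a
    if x >= b:
        return 1.0                       # all values are below b
    return math.sqrt((x - a) / (b - a))  # some non-decreasing growth

# Key properties: 0 to the left of a, 1 to the right of b,
# and never decreasing as x grows.
points = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
cdf_values = [cumulative(p) for p in points]
```

Whatever growth law you put inside [a, b], as long as it is non-decreasing and goes from 0 to 1, it is a legitimate cumulative distribution.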
Now, before going into probability density, I would like to make a certain analogy with mechanics, with mechanical motion. When I was explaining this cumulative probability, I was comparing it with movement from point A to point B. This axis is time, and we start at A; the function we are talking about is the distance as a function of time. We start at point A, which means we have covered 0 miles, kilometers, meters, whatever, and by the time we reach B, our distance has grown. Well, it can grow uniformly, if we are moving at a constant speed; or we can go slower, in which case less distance is covered here, and then faster here. But eventually we still reach the full distance between the two points, between capital A and capital B. So no matter how we move, the distance covered by our trip goes from 0 to the maximum, which is the distance between these two points. Now, as we are moving, we are covering distance; and the analogy is that as x is increasing, we are covering more and more of the values which the random variable ξ might take. So it's a kind of analogy. But now, when traveling from A to B, the distance covered, if you are going by car, is on the odometer, right? But there is also another very important characteristic: the speed. Speed is extremely important for our trip. And what I would like to show you right now is that there is an analogue of speed in probabilities: the probability density function is, in probability theory, the equivalent of speed in mechanical movement. And here is how I would like to present it. Let's talk about speed first; it's much more familiar territory, right? How do we determine speed? Well, let me start with something very simple and complex at the same time: for those who understand calculus, speed is basically the first derivative of this function, the distance covered, with respect to time t.
Now, for those not familiar with calculus, the explanation will be lengthier, but here is what it is. First of all, we can define the concept of average speed: average speed is distance divided by time. So, for instance, from A to B the distance is, let's say, d, and the time we spend, starting from 0, is t. Then d divided by t would be our average speed during the entire trip. That's obvious, right? Well, actually it's a definition, there is nothing to prove: the average speed is the distance divided by the time in which we have covered this distance in our trip. Provided, obviously, we are going in one direction only, with no movement back and forth: always driving from A to B during the time t, with distance covered d. So d divided by t is our average speed. Now, what if we would like to know our speed at certain moments a little more precisely? For instance, the first half of my trip I was going slowly, and the second half I was going faster, right? I can divide this time interval into many, many small intervals and determine my average speed on each interval. So my average speed during my first second is such and such, during my second second such and such, during my third second a little bit more, etc., etc. So I can define my average speed on any however small time period. Now, as I increase the number of intervals and decrease their lengths, I calculate my average speed more and more precisely on any however small interval. Basically, if this is the moment t, and the length of the next small interval is d, the next moment would be t + d. So if I calculate the distance covered from t to t + d, that is the distance at moment t + d minus the distance at moment t; that's how much I covered during this period of time.
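To make the average-speed idea concrete, here is a small numeric sketch (the distance function s(t) = t² and all the numbers are my own hypothetical example, not the lecturer's): the overall average speed d/t, and then the average speeds on equal sub-intervals of the trip.

```python
def distance(t):
    # Hypothetical distance function for illustration: s(t) = t**2,
    # i.e. an accelerating trip.
    return t ** 2

total_time = 2.0
# Average speed over the whole trip: total distance over total time.
avg_speed_whole_trip = distance(total_time) / total_time

def average_speeds(n):
    """Average speed on each of n equal sub-intervals of the trip."""
    dt = total_time / n
    return [(distance((k + 1) * dt) - distance(k * dt)) / dt
            for k in range(n)]

speeds = average_speeds(4)  # the car accelerates, so the speeds grow
```

The overall average hides the variation; splitting the trip into more and more sub-intervals reveals how the speed changed along the way.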
And divide it by the time from t to t + d, which is d, right? And if I take the limit of this as d goes down to zero, so my interval is getting smaller and smaller and smaller, finally this number will be my momentary speed, the speed at moment t, which is what the speed at moment t actually is, right? So that's basically the definition. And it allows you to know the speed absolutely precisely at any time, provided you have the function which gives the distance covered by the time t. Now, this is how speed is introduced, and I will do something similar in the theory of probabilities. So, I know that my random variable takes certain values; let's just, for simplicity, assume that my random variable is always between lowercase a and lowercase b, so its values are concentrated there. A natural continuation would be to put infinity on either or both sides, but that's actually not important. So let's just, for argument's sake, assume that ξ belongs to this particular interval. Now, I know that ξ can take values in this interval, and here is the graph of F(x), with a here and b here. The probability for our random variable ξ to take a value less than a is zero, right? Because all the values are concentrated between a and b. So F(x) is always zero prior to a. And ξ is always less than b, so past the value b the probability should always be one, right? It's definitely less than any number greater than b, since it belongs to this interval. Now, in between, from a to b, the cumulative probability function will grow somehow, like this; whatever way it grows doesn't really matter. Now, what does it mean? You see, here it grows a little slower, here it grows a little faster. And recall our trip: sometimes we're going slower and sometimes we're going faster. Now, what does it mean?
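The limit defining momentary speed can be sketched numerically (again with a hypothetical distance function s(t) = t², whose exact derivative, and hence exact momentary speed, is 2t): the average speed over [t, t + d] approaches the momentary speed as d shrinks.

```python
def distance(t):
    # Hypothetical distance function for illustration: s(t) = t**2,
    # so the exact momentary speed is s'(t) = 2 * t.
    return t ** 2

def average_speed(t, d):
    """Average speed over the interval [t, t + d]."""
    return (distance(t + d) - distance(t)) / d

# At t = 1 the momentary speed is 2. Algebraically the ratio equals
# 2*t + d, so as d shrinks it approaches 2 from above.
t0 = 1.0
approximations = [average_speed(t0, d) for d in (1.0, 0.1, 0.01, 0.001)]
```

Each halving or tenth-ing of d brings the average speed closer to the momentary speed; the limit as d → 0 is the derivative.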
Well, it means that the values within this interval are more probable than, let's say, the values within this earlier interval. Because the function grows faster here, which means I'm adding more probability on each small step for these values than for those. So, that's the difference. Now, we can actually make certain logical arguments analyzing the growth of this function, similar to the ones we just made for speed. Let's divide this segment from a to b into small intervals. Now, what is the probability to get from x to x + d, that is, the probability for our variable ξ to be between x and x + d? Well, we know the cumulative probability function, so it's basically F(x + d), the probability to take a value less than x + d, minus F(x), the probability to take a value less than x. That would be the probability to be between x and x + d, right? Now, obviously, if d is greater, this difference will also be greater, because this is a growing function. However, if I compare this difference for different intervals of the same length, some intervals will have a smaller difference, like this one, and some a bigger difference, like this one. So this expression divided by the length of the interval, (F(x + d) − F(x)) / d, is a characteristic of how fast the function grows on the interval from x to x + d. And if I decrease d towards zero, with x fixed and d smaller and smaller and smaller, I will get basically something which is the equivalent of speed in movement. And in the theory of probabilities, this is my density of probability at point x, exactly: this limit is the definition of the density of probability, or probability density. And again, if you know calculus, this limit is basically the first derivative of the cumulative function with respect to the argument x.
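This limit can also be sketched numerically. As an example (the distribution is my own choice, not the lecturer's), take a cumulative function F(x) = x² on [0, 1]; its density is then 2x, and the finite ratio (F(x + d) − F(x)) / d approaches it for small d.

```python
def F(x):
    """Cumulative probability P(xi < x) for a hypothetical variable on [0, 1]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return x ** 2  # example growth law; exact density is f(x) = 2 * x

def density_approx(x, d=1e-6):
    """Finite-difference approximation (F(x + d) - F(x)) / d of the density."""
    return (F(x + d) - F(x)) / d

# Near x = 0.5 the density should be close to 2 * 0.5 = 1;
# near x = 0.25 it should be close to 0.5.
```

Where the cumulative function climbs steeply, the approximated density is large; where it is flat (outside [0, 1]), the density is zero.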
But in any case, even if you don't know calculus (and we will address calculus in some future lectures), we already know what limits are; that was part of the material we covered a long time ago. So, this is the definition of the probability density at point x. And now you can definitely say that at certain points the function grows faster; in this interval, for instance, it grows faster than in this one, if this is my function. Okay, now let's do a very small example. Let's consider the so-called uniform distribution on the segment from a to b. Now, uniform means that the probability to fall within a certain interval is exactly the same as the probability to fall within this other interval, as long as the intervals are of the same length and lie between a and b: the probability is exactly the same. Now, what does it mean? Well, it basically means that F(x) is a linear function on [a, b]. And the graph would be: if this is a and this is b, and this is 1, then my cumulative probability function is 0 until a, then it grows linearly to 1 at b, and then stays at 1. That's my uniform cumulative probability. Now, what would be the probability density in this case? Well, obviously, if I do this calculation for x and x + d, with F(x) here and F(x + d) here, what is the increment of the probability divided by the increment of the value, (F(x + d) − F(x)) / d? Well, obviously, the increment of probability is just proportional to d, which means that divided by d it is just a constant. Now, for those who remember trigonometry, it's basically the tangent of this angle, right? You divide this leg of the triangle by this leg, and that's the tangent of the angle. And it's a constant, since this is a straight line. So, how would the graph of the probability density look? Well, very simple: it's a constant. And what is this constant?
What's the value? Well, we can take, for instance, this rise and divide it by this run: the rise is 1 and the run is b − a. So my probability density f(x) is equal to 0 if x is less than a, 1/(b − a) if a ≤ x ≤ b, and 0 again if x is greater than b. So the density would look like this: this is a, this is b, this is 1/(b − a); it is a constant here, 0 here and 0 here. This is the graph of the probability density for a random variable uniformly distributed on the segment [a, b]. By the way, I said that a and b might not be fixed numbers but infinities. So, for instance, we have normal variables, and normal variables can take any values; they're not restricted by anything. And you have definitely seen the bell curve. Now, what is this bell curve? Well, the bell curve is the probability density of a normally distributed random variable. It shows that the maximum probability is concentrated around the middle value, and as we go further and further from the middle value, the probability density decreases. So, basically, whenever you see something like this, you just have to understand that this is a probability density graph. It's another story that we never really observe an exact bell curve like this: we have, for instance, certain statistical data, we put it on some kind of graph, and it looks like the bell curve, but that's only because, if you remember what we said about sums of random variables, the more random variables you add together, the more their sum resembles the normal distribution. And that's why we see this bell curve. All right. So, that's all I wanted to cover. Yes, one more little thing. This cumulative probability is actually a universal function, in the respect that it fits the description of the probabilities of discrete random variables as well as continuous ones. So, let me just give you an example again. For a continuously distributed random variable, you just saw that the graph might look something like this, right?
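The constant density 1/(b − a) can be checked empirically (a small sketch with values of my own choosing): draw many uniform samples from [a, b] and compare the fraction landing in a small window of width d against d/(b − a).

```python
import random

random.seed(0)                      # fixed seed for reproducibility
a, b = 2.0, 5.0                     # the segment [a, b]; values are my choice
samples = [random.uniform(a, b) for _ in range(100_000)]

x, d = 3.0, 0.3                     # a small window [x, x + d] inside [a, b]
fraction = sum(1 for s in samples if x <= s < x + d) / len(samples)
expected = d / (b - a)              # density 1/(b - a) times window length d
# fraction should come out close to expected, i.e. close to 0.1
```

Because the density is constant, the window can be slid anywhere inside [a, b] and the fraction of samples in it stays roughly the same.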
For a discretely distributed random variable, the graph would look something like this, because these are the values where the probability mass is concentrated. In this case, my random variable takes only four different values. So, the probability to be less than the smallest value is zero. The probability to get a value between this point and this point is actually the probability to get only this one value, right? Because the variable doesn't take any values in between. That's why the cumulative probability jumps. So, this is a kind of universal function. How about the probability density function? Well, with continuous distributions that's easier, because in most cases, in smooth cases, the probability density always exists: the limit I was talking about, the limit of (F(x + d) − F(x)) / d, exists in smooth cases. In this discrete case it does not exist, quite frankly, because these are jumps. And whenever the cumulative function jumps, it looks like a whole finite probability is concentrated in one point, which is not really how smooth functions behave: the increment F(x + d) − F(x) stays finite while d goes to zero, so the ratio has no limit. So, in a case like this, the probability density function, generally speaking, does not exist: it's zero in between, but at these points, where the function jumps, it doesn't exist, right? So the probability density function does not exist for discrete random variables. I mean, you can actually talk about a density for them using certain mathematical tools, the so-called delta function, which we don't really want to cover, at least not right now. So, in the normal sense, the probability density function does not exist for discrete variables; it's better to use the mass distribution function, which is not really the same thing as probability density.
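A discrete cumulative distribution can be sketched as a step function (the four values and probabilities below are my own toy example, not the lecturer's): it jumps by pₖ at each value xₖ and is flat in between, which is exactly why the ordinary density limit fails at the jump points.

```python
# Toy discrete random variable: four values with given probabilities.
values = [1.0, 2.0, 3.0, 4.0]
probs  = [0.1, 0.4, 0.3, 0.2]   # probabilities, summing to 1

def F(x):
    """P(xi < x): sum of the probabilities of values strictly below x."""
    return sum(p for v, p in zip(values, probs) if v < x)

# F is flat between the jump points (e.g. F(2.5) == F(2.9)),
# jumps by probs[k] at each values[k], and reaches 1 past the last value.
```

Between jumps the increment F(x + d) − F(x) is exactly zero; at a jump it stays at a fixed positive size no matter how small d is, so the ratio (F(x + d) − F(x)) / d blows up instead of converging.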
The mass distribution function just gives you how much probability is concentrated in each of these points where a value can be taken. All right, fine. That's probably it; that's all I wanted to say about probability density. What I would suggest you do is go to theunisor.com and read the notes to this lecture. They are quite detailed, and I think they will bring your understanding to a higher level. It's like a good textbook, I would say; so it's always good to read after you have listened to something like this. All right, that's it. Thank you very much, and good luck.