Hi, I'm Zor. Welcome to Unizor Education. I would like to continue talking about the statistical approach to the unknown distribution of random variables. This lecture is part of the Advanced Mathematics course presented on Unizor.com. I suggest you watch this lecture from the website rather than from YouTube, where it is only linked, because there are notes for each lecture and the website has very important educational functionality: you can enroll in courses, take exams, and so on.

Alright, so back to statistical distributions. In the prior lectures we were addressing Bernoulli variables. Bernoulli variables are probably the simplest random variables in probability theory, because there are only two values they can take and only one parameter, the probability of taking one of these values; the probability of the other value is one minus that. So a single number actually defines the distribution. Now we will go to a slightly more complex situation: what if our random variable takes many different values, and how do we approach these cases statistically? I have decided to separate this whole problem of the statistical evaluation of a distribution into four sub-problems, or subtasks. They are similar in some ways, but there is an increasing complexity, if you wish.

The simplest one is the following; let's call it case A. In case A our random variable can take a certain number of fixed values, and we know exactly what these values are; what we don't know is the probabilities. For example, you are rolling a regular cubical die. It has six sides, so the results can only be one, two, three, four, five or six. That's it. Actually, a Bernoulli variable also belongs to this category, because there are only two values, usually denoted one and zero. And there are other cases where you definitely know the theoretically possible values our random variable can take, but you don't know the probabilities, which you would like to evaluate statistically. So that's the first task.

The second task, case B, is when we don't even know the values. Let's consider a discrete random variable, since most of the variables we deal with are actually discrete, but we don't know what values it might take. For instance, take the number of car accidents in the country during a certain month. It is certainly no less than zero, but you don't really know the upper bound, so the number can be basically anything: it can be 25, it can be 377, you never know. So this is the case when a discrete random variable takes unknown values with unknown probabilities.

The next two cases deal with continuous random variables, which can take a continuous range of values, and there are again two cases. Case C is when the range is bounded, say from a to b, so we have a lower bound and an upper bound for our variable. For instance, you are measuring water temperature: it is definitely above freezing and below boiling, right? So it is a continuous variable, but it definitely has certain limits. Finally, case D is a continuous variable that does not have any limits, or at least no reasonable limits, let's put it this way.
And we have to be able to address each of these cases with a certain procedure, a certain strategy, provided we can experiment with these variables; we have to evaluate the probabilities in each case. That's what I'm going to address right now, very briefly.

Okay, case A. What do we do? Well, we obviously start, in this case and in every other case, with N experiments. If you have N experiments and you know exactly which values our variable can in theory take, you can simply count: it took the value x1 n1 times, the value x2 n2 times, et cetera, and the value xk nk times. Obviously, the sum of these counts equals N, the number of experiments. Now, what is my approximation for the probability pi of taking the value xi? Well, I don't know it, but the empirical frequency is obviously ni divided by N. This is the frequency, and, as we know, one of the ways to define probability is as the limit of the frequencies of occurrence of a concrete event. So, if the event is that our variable took the value xi, then this event has frequency ni divided by N among our N experiments.

Some of these counts can be zero, simply because a value didn't happen. Obviously, if you roll the die, say, ten times, maybe the number two will not come up at all. Is it possible? Yes, it is; perhaps less probable than it coming up, but still possible. And if you roll the die fewer than six times, one of the numbers is definitely missing, right? So, in any case, some of these counts can be zero, and we should not be discouraged by this. It just means that our approximation of the corresponding probability is zero; as the number of experiments grows, this count will probably no longer be zero. In any case, this is the best we can do, and this is the main approach for solving these problems. Let's stop there. I'm not going to talk a lot about the quality of this approximation. It looks like a correct approximation because it is a frequency, and the frequency is supposed to tend to the probability; but that's as much as I can say right now.
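To make this concrete, here is a minimal Python sketch of the case A estimate; the sample of die rolls and all the names in it are purely illustrative assumptions, not something taken from the lecture.

```python
from collections import Counter

# Case A sketch: the possible values are known in advance (a die: 1..6),
# only the probabilities are unknown. Each p_i is estimated by the
# empirical frequency n_i / N. The sample of rolls is made up.
known_values = [1, 2, 3, 4, 5, 6]
rolls = [3, 5, 1, 6, 3, 2, 4, 6, 6, 1, 3, 5, 2, 4, 1, 6, 3, 5, 2, 6]

N = len(rolls)
counts = Counter(rolls)

for x in known_values:
    n_i = counts.get(x, 0)  # can be 0 if the value never came up
    print(f"value {x}: count {n_i}, estimated probability {n_i / N:.2f}")
```

A zero count simply produces an estimated probability of zero, exactly as discussed above.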
Now, let's go to case number two, case B. In this case we don't know the possible values at all; we don't know which values this particular random variable can in theory take. But we do know which values it actually took: it took the value x1 n1 times, the value x2 n2 times, et cetera, and the value xm nm times. That we do know, because we have conducted the experiments. So we have the values it took during our N experiments; it doesn't mean these are all the values this variable can theoretically take, it just took these. And obviously, the sum n1 + n2 + ... + nm equals capital N, the number of experiments.

Here is an example. Take, say, the number of car accidents during the month of February in the United States. Looking at the Februaries of the last century, in one of them we could have had 357 accidents, in another 753, in another 127, and so on. Obviously, we will not hit every possible number; we will hit certain numbers a certain number of times. Now, does that mean that our distribution is really this particular discrete distribution which takes only the values 357, 753 and 127, just to use the first three experiments? Obviously not. So, how can we approach this?

Well, the approach I can suggest, and that probably makes sense, is the following. You take the minimum and the maximum of these values; let's say the minimum is 100 and the maximum is 757. Then you divide this range into equal parts. How many equal parts? Well, it depends. A smaller number of parts will give you larger counts in each part, but it will not be representative enough as far as the shape of the distribution. For instance, you could take only two categories, say from 100 to 400 and from 400 to 800 (rounding the endpoints, obviously). Then every observed value falls into one of these two categories; let's say, out of 100 experiments, 67 fall into the first and 33 into the second. So you can say that with probability 67% your number of accidents is between 100 and 400, and with probability 33% it is between 400 and 800.

Now, you can divide it a little finer. It would probably be better to use round numbers, so let's take the range from 0 to 1,000 and divide it at 200, 400, 600 and 800; these are the intervals. Now you count how many observations fall into each of these five categories. Say you have 10 cases out of 100 in the first interval, 15 in the second, and so on, with the five counts summing to 100. Then, again using the frequency as an evaluation of the probability, you can say that the probability for your number of accidents to fall in the first interval is 10/100, one tenth, in the second 15/100, et cetera. So that's an approach you can suggest when you don't really know the possible values.

Now, what's interesting here? The finer you divide your range from minimum to maximum, the smaller the counts you will have in each particular category, and smaller counts mean less precision, obviously. If you divide it more crudely, say into only three categories, from 0 to 350, from 350 to 700 and from 700 to 1,000, then the counts will be greater, because more cases fall into each category; the categories are wider, so the number of cases in each is larger, and the precision of each particular probability estimate will be better. But then you have to ask yourself: is this a good division, is it not too crude? With a category from 0 to 350 we don't really know whether the value is closer to 0 or closer to 350. It depends on the practical purpose of whatever research you are doing; it is all a judgment call. But anyway, this is an approach you can take. I don't think it makes sense to say that our random variable takes only the observed values with probabilities n1 divided by capital N, et cetera, because if you don't really know which values the random variable can theoretically take, it most likely can take other values besides the ones it already took.
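Here is a small Python sketch of this binning procedure for case B; the sample of accident counts and the choice of five intervals are assumptions made only for illustration.

```python
# Case B sketch: a discrete variable whose possible values are not known
# in advance (say, monthly accident counts). Take the minimum and maximum
# of the sample, split that range into equal intervals, and estimate the
# probability of each interval by its relative frequency.
accidents = [357, 753, 127, 410, 295, 680, 512, 233, 601, 148,
             322, 475, 560, 199, 388, 704, 266, 431, 519, 350]

num_bins = 5
lo, hi = min(accidents), max(accidents)
width = (hi - lo) / num_bins
N = len(accidents)

counts = [0] * num_bins
for x in accidents:
    # index of the interval containing x; the maximum goes into the last one
    i = min(int((x - lo) / width), num_bins - 1)
    counts[i] += 1

for i, n_i in enumerate(counts):
    a, b = lo + i * width, lo + (i + 1) * width
    print(f"[{a:.0f}, {b:.0f}): count {n_i}, estimated probability {n_i / N:.2f}")
```

Changing num_bins is exactly the trade-off discussed above: fewer, wider intervals give larger counts and more stable frequencies, while finer intervals show more detail but with less precision in each one.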
Okay, let's go on. The next case is this one: we have a variable with a continuous range of values, and we don't know the distribution of probabilities, obviously. Since this is a continuous distribution, we really have to talk about the density of probability. It's like the normal distribution, for instance; well, the normal is not a good example because it doesn't have boundaries a and b, but imagine something similar on a bounded range, from, let's say, a to b. The density of probability is a curve such that the probability of falling between two points is the area under the curve between them, and the total area underneath the density curve is one, obviously.

So, if you have a case like this, you obviously don't have an infinite number of experiments. You still have N experiments and a certain set of values the variable took, and not necessarily multiple times, because for a continuously distributed variable all the observed values can be different, so the counts can actually all be ones. How can you evaluate the probability? Well, you do exactly the same as we did before. We know the maximum and the minimum, right? So we subdivide the range into parts and simply count how many of the observed values fall into each part. That's how we get new counts; let's call them m1, m2, et cetera, it doesn't matter. Knowing how many times your variable's value fell into each of these categories, you can say that the probability of my random variable falling between the boundaries of a given category is the corresponding count divided by capital N, the number of experiments.

So you are, how should I say it, artificially making the variable discrete, if you wish. You introduce a small step, a quantum of the value, and instead of exact values you work with intervals: between 1 and 2, between 2 and 3, between 3 and 4, et cetera. You are approximating the continuously distributed random variable with a discrete distribution. In this case you can build a graph, by the way, and a similar graph can also be built for the previous case: over each interval, with boundaries a, a1, a2, and so on up to b, you plot the number of cases falling into it divided by the total number of experiments. This graph resembles the density of probability function of the continuously distributed random variable.
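Here is a minimal Python sketch of this case C procedure; the bounds, the simulated measurements and the number of intervals are all illustrative assumptions.

```python
import random

# Case C sketch: a continuous variable with known bounds a and b
# (say, water temperature between freezing and boiling, 0 to 100 Celsius).
# Split [a, b] into intervals, estimate the probability of each interval
# by n_i / N, and divide by the interval width to approximate the density.
a, b = 0.0, 100.0
random.seed(0)
sample = [random.uniform(20, 80) for _ in range(200)]  # simulated measurements

num_bins = 10
width = (b - a) / num_bins
N = len(sample)

counts = [0] * num_bins
for x in sample:
    i = min(int((x - a) / width), num_bins - 1)
    counts[i] += 1

for i, n_i in enumerate(counts):
    left = a + i * width
    prob = n_i / N            # estimate of P(left <= X < left + width)
    density = prob / width    # approximate value of the density on that interval
    print(f"[{left:5.1f}, {left + width:5.1f}): prob {prob:.2f}, density {density:.4f}")
```

Printed interval by interval, the density column is exactly the step-shaped graph described above, the one that resembles the true density of probability.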
Okay, the next case, case D. Again, the variable is continuously distributed, but you have absolutely no idea about the range, or at least about one of the boundaries, lower or upper. For instance, take the distance between two different cities in Europe. Obviously the lower bound is zero, you know that. But as far as the upper bound, it doesn't really make sense to take the entire size of Europe, from Spain to the Ural Mountains, as your upper bound, because you definitely know that this is not practical: cities are much closer to each other than that distance. So it doesn't make sense to use this huge distance as an upper limit; you just have to agree that you don't have an upper limit, because there is no reasonable one. So, what do we do in this case?

Well, you do exactly the same as I suggested for the discrete random variable with unknown values: you take the minimum and the maximum of whatever you have. You measure, say, a hundred different distances, so you know your minimum and your maximum. In the case of the minimum you can probably push it down to zero, but the maximum should definitely not be increased up to the size of Europe. So you artificially assume the lower and upper boundaries to be the minimum and maximum of your sample, and then the problem is reduced to the same one as before. The difference is that we have chosen a and b, our range, from practical experimentation rather than from theoretical knowledge. In the previous case, for instance, I was talking about water temperature from freezing to boiling; those are very reasonable boundaries, since the value really can come as close to freezing and as close to boiling as you like. Here, where there are no reasonable boundaries, you just take the minimum and maximum of your samples.

The problem arises if your sample is not representative enough. If you have a hundred different distances across different European countries, that's probably good enough; but if you have only ten, and only from one country, say Italy, with no attention paid to the other countries, then most likely your sample is not representative enough. Again, statistics is not a precise science; or rather, it is a precise science only under the special circumstances of the pure theory of statistics. Practical statistics is not precise: there is a lot of intuition you have to bring into it, and that's very important. If you take too small a number of experiments, like ten distances and that's it, it will not be representative enough. The more you have, the better off you will be, and the greater the precision of your evaluation, because you will cover almost all the cases. But you probably want to analyze only a sample of the data rather than all possible distances among all the different cities in Europe; that's too big a task, right? So the size of the sample depends, again, on the precision you would like to achieve, and it is again a judgment call: what certainty level is my statistical evaluation supposed to have? That's the most important question: how precise should I be, and how certain should I be about this precision. If you remember, for the Bernoulli variable we had a margin of error and a certainty factor, the probability that your estimate is good within that margin of error. This is basically the same thing.
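And a short Python sketch of the case D adjustment: the only new step compared to case C is choosing the boundaries from the sample itself. The simulated distances are, again, only an illustrative assumption.

```python
import random

# Case D sketch: a continuous variable with no reasonable theoretical bounds
# (say, distances between European cities, in kilometres). The boundaries are
# taken from the sample itself; after that the procedure is the same as in
# case C.
random.seed(1)
distances = [random.uniform(30, 2500) for _ in range(100)]  # simulated sample

a = 0.0              # the lower bound can reasonably be pushed down to zero
b = max(distances)   # the upper bound is simply the largest observed distance

print(f"artificial range chosen from the sample: [{a:.1f}, {b:.1f}] km")
# from here on, split [a, b] into intervals and count, exactly as in case C
```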
Now, we will talk about the details of these cases, though probably not about each of them, because they all look alike; most likely we will concentrate on the simplest one, which is nevertheless representative enough to evaluate the quality of our approximation. We will do that in some time. Today I just wanted to introduce you to real practical cases where statistical evaluation is important, and to the approaches we can actually take depending on the case in question. Well, that's it for today. Thank you very much.

I suggest you go to the Unizor website; if you register, the whole functionality of the website will be open to you, including the exams, which I believe are very important. That's it. Thank you very much, and good luck.