Hi, I'm Zor. Welcome to Unizor Education. We continue talking about mathematical statistics, about the foundations of mathematical statistics, the main principles, which is a part of the course of advanced mathematics presented on Unizor.com. I suggest you watch this lecture from that website, because there are notes on the side, and the site provides some functionality: you can enroll into the course, you can take exams, etc. All it requires is a login name, and that's it.

Now, this is the lecture which should emphasize the importance of the volume of data for any kind of statistical analysis. Let me start with something I have repeated a couple of times before: the purpose of mathematical statistics. This is a subject which, in conjunction with the theory of probabilities, is supposed to serve the following purpose. Suppose you don't know anything about a certain process or about its results; the results can differ, and you would like to predict what this process will be doing in the future. We can talk about weather forecasting or election forecasting, etc. So the probability is a key to predicting the future behavior of random variables. But if you don't know the probability, you need the previous step: you have to gather the statistics and, using the apparatus of mathematical statistics, evaluate the probabilistic characteristics of your random variable. And based on these probabilities, the distribution of probabilities which you have derived from the statistical data, however imprecise it might be, you draw some kind of conclusions about the future.

So we have a lot of imprecision in this particular case. First of all, statistical data do not give you an exact picture of the distribution of probabilities. In the best case, it's an approximation. Maybe better, maybe worse, but it's still an approximation.
And then the probability, even a precise probability, gives you only the probable results of your experiments in the future, not concrete ones. So it's not really an exact science, so to speak, in this particular sense. But at the same time, we can always say that the future cannot be predicted precisely anyway; it's always some kind of range of values, etc. So with this presumption, let's continue talking about mathematical statistics and the importance of the volume of data to have before you make any kind of conclusions about the probabilities.

All right, so what do we deal with? Let's consider, as before, a discrete random variable which might take one of n different values with certain probabilities. This is what describes our random variable. And what can we say about this particular random variable if we don't know anything about it? We don't know the values and we don't know the probabilities. It's great if they are given, but in all practical situations they are not. Now, sometimes the values might be given. For instance, if you're tossing a coin, you have only two values, heads and tails, so in this particular case at least the values are known. If it's a perfect coin, then the two probabilities corresponding to these two values are supposed to be one half and one half. But what if it's not a perfect coin? So let's consider a situation where you would like to find out whether the coin is ideal or not.

What do we do? First of all, let's define a little more precisely, more mathematically, what we are dealing with. Our random variable will be able to take only two values; let's associate the value one with heads and the value zero with tails. And we have two probabilities, P1 and P2. Obviously P2 is one minus P1, so we don't have to evaluate P2; we have to evaluate only P1.
We presume that if it's an ideal coin, P1 is supposed to be one half. But how can we prove it? Well, we toss the coin. So let's think about the evaluation of P1. Now, what's the definition of probability? There are many different approaches to defining probability, but the way I preferred to do it with discrete variables like these is to relate it to the limit of the frequency of occurrence of certain events. So if we toss the coin an infinite number of times, then the frequency of heads would approach whatever this particular probability is; for an ideal coin it would be one half. But that requires infinity, which means I would have to perform my experiment an infinite number of times, which is impossible. So what do we do? We just do it a certain finite number of times.

All right. Let's consider that we are tossing the coin lowercase n times. Each toss of the coin results in either heads (one) or tails (zero). So we have basically lowercase n random variables. Why random variables? If we toss the coin n times, we will have concrete values; but if we toss it another n times, we will have different values. So my series of n experiments can be conducted in either of two ways: either you have one coin and you toss it n times, or you have n coins and toss each of them once. In both cases we can consider the cumulative result, of either a series of n tosses of one coin or one simultaneous toss of n coins, as the result of one big cumulative experiment. As one such experiment is conducted, we will have n values; if we conduct this cumulative experiment another time, we will have other values. So the series of these n values is one concrete result of one cumulative experiment.
Based on these data, we would like to evaluate our probability P1. How can it be done? Very simply. Since heads is one and tails is zero, the sum of these values is the number of times we got heads. And if we divide it by n, the new random variable eta represents the frequency of occurrence of heads among n experiments. So as n tends to infinity, this frequency will go to the real probability P1. But for any finite n, eta has one particular single value, and we are trying somehow to say that it's probably close to P1.

Now let's consider a slightly more general problem. You have a random variable, and you have conducted one experiment; in this case, it's a cumulative experiment, but anyway, you've got some value. What's the probability of this particular value being a good evaluation of the random variable itself? Well, it depends; there are different random variables and different distributions. If the distribution of our random variable is very narrow... let me give a slightly more practical example. Suppose you are measuring the length of a car with a ruler marked in centimeters and millimeters. You measure it once and you get some number. But if somebody else measures the length of the car, even with the same ruler, in millimeters, most likely the result will be different, because a millimeter is such a small unit. So one particular measurement does not really tell you the full picture about the distribution of the random variable. In this case, our measurements of the length of the car are random variables, but they are supposed to evaluate something real, something concrete and constant. In this particular case, though, the spread of our random variables around this real value is probably very, very small: a couple of millimeters here and there.
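The frequency estimate eta described above is easy to sketch in code. This is a hypothetical simulation, not from the lecture: the function name and the fair-coin default `true_p1 = 0.5` are my assumptions.

```python
import random

def estimate_p1(n, true_p1=0.5, seed=0):
    """Toss a (possibly biased) coin n times and return eta, the
    observed frequency of heads (heads = 1, tails = 0)."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n) if rng.random() < true_p1)
    return heads / n

# One cumulative experiment of n = 1000 tosses gives one concrete
# value of eta, which we hope is close to the real P1.
print(estimate_p1(1000))
```

Rerunning with a different seed corresponds to conducting the cumulative experiment another time: the series of tosses changes, and so does the concrete value of eta.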
And considering the car is pretty long relative to a millimeter, our error is really not very big. We can say that even one single measurement is a good enough approximation; we don't really need 100 measurements to average. Same thing here. We are measuring something real: the probability of our random variable taking the value one. This is something real which we don't know, like the length of the car. But this random variable eta gives some kind of measurement of it. And since the probability is by definition the limit of these frequencies, we intuitively think that eta should, as n goes to infinity, stay within a very, very narrow margin of the real value P1. That's our intuitive understanding. And again, intuitively it is obvious that the greater n is, the more precise this evaluation is. We will separately examine how precise it is; obviously we need some kind of quantitative measure of the quality of this evaluation. But at least intuitively we all understand that the volume of the statistics should be large, and the greater it is, the closer even a single value of eta would be to P1, because we are saying that it should have a limit, and as n increases, that's exactly the direction the limit should take us.

So, can we evaluate P1 absolutely precisely? Definitely not. But what can be done about our evaluation, and how can we quantify what is actually happening? The best thing we can say is that P1 lies in a certain range around a single value of the random variable which we have obtained by conducting n experiments.
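The intuition that a larger n gives a tighter estimate can be checked by brute force: repeat the cumulative experiment many times for several values of n and watch the spread of eta shrink. Again a hypothetical simulation (fair coin assumed, function name mine):

```python
import random

def spread_of_eta(n, repeats=200, seed=1):
    """Run the cumulative experiment `repeats` times with n tosses each
    and return the widest deviation of eta from the fair-coin value 0.5."""
    rng = random.Random(seed)
    etas = [sum(rng.random() < 0.5 for _ in range(n)) / n
            for _ in range(repeats)]
    return max(abs(e - 0.5) for e in etas)

# The spread of eta around P1 = 0.5 narrows as n grows.
for n in (10, 100, 10_000):
    print(n, spread_of_eta(n))
```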
Now, if we conduct more than n experiments, say 10n or 100n, our evaluation should be better, which means P1 should lie within a narrower margin around whatever value we have obtained. So if we have only 10 experiments and some value of eta, we can say that P1 should be somewhere between eta minus delta_10 and eta plus delta_10, where delta_10 is the error for 10 experiments. But for 100 experiments, delta_100 should be smaller, so the range would be narrower and P1 would be evaluated more precisely. That's the idea.

But can I say absolutely that, knowing n and the value obtained as a result of our n experiments, we also know this margin precisely, so that an inequality like this is actually true, depending only on the number of experiments? Well, no; let me confuse you a little bit more. Is it possible that all n times I get heads? The answer is yes. It is very improbable, but not impossible. The probability of conducting n experiments and getting heads all n times would be, for an ideal coin, one half to the n-th power: the first toss should be heads, the second should be heads, etc., and since the experiments are independent of each other, the probability of the combined event, every one of the n experiments being heads, is the product of the individual probabilities. This is a very small number for relatively large n. For instance, take n equal to 100: is it possible to toss the coin 100 times and get 100 heads? Again, yes, it is possible; very improbable, but not impossible. That means that if I get heads n times, the sum will be one plus one plus... n times, that is n, and n divided by n gives eta equal to one. So eta can be one, with probability one half to the n-th power in the ideal case.
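The all-heads probability is a one-line computation, and it shows just how fast "improbable" sets in:

```python
# Probability that all n tosses of a fair coin come up heads: (1/2)**n.
# Possible for every n, but vanishingly improbable for large n.
for n in (10, 100):
    print(n, 0.5 ** n)
# n = 10 gives 1/1024; n = 100 is on the order of 1e-30 --
# "very improbable, but not impossible".
```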
Now, what if I get tails n times in a row? Then I will have 0, 0, 0, ..., 0, in which case eta would be equal to 0. So one and zero are both possible values of eta, and if one and zero are possible, this gives us nothing in absolute terms about P1, because P1 is a probability and can be anywhere from zero to one anyway. We can get extreme values: improbable, but not impossible. So this particular inequality cannot be stated in absolute terms. The only thing I can definitely say in absolute terms is that P1 is between zero and one, which tells us nothing at all, because that's just the definition of a probability.

As soon as I try to narrow this interval from zero to one, say I've got eta equal to 0.51 and I would like to say something like 0.51 minus 0.1 < P1 < 0.51 plus 0.1, I cannot say it in absolute terms. What I can say is that, since the extreme outcomes are very improbable, and the greater n is, the less probable they are, I can narrow the range with only a certain probability; there is always a probability attached to the statement, let's call it lowercase p. So my most important result is that I can derive a probabilistic inequality for the evaluation of P1: probabilistic in the sense that I can say that P1 is between one value and another with a certain probability. With probability 1, that is, in absolute terms, I can only say that P1 is between zero and one. However, if my n is sufficiently large, and that's the point, then with a probability maybe not 1, but close to 1, I can narrow the range of values. And if I'm satisfied with a probability of, say, 0.9, I can cut off all those results of the experiment which together have probability less than 0.1, and whatever is left will have probability 0.9.
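The lecture derives this probabilistic inequality informally. One standard tool that makes it quantitative, not stated in the lecture, so treat it as an illustration, is Hoeffding's inequality: P(|eta − P1| ≥ delta) ≤ 2·exp(−2·n·delta²). Solving it for delta gives a margin that holds with any desired certainty:

```python
import math

def confidence_delta(n, certainty):
    """Margin delta such that |eta - P1| < delta holds with probability
    at least `certainty`, by Hoeffding's inequality:
    P(|eta - P1| >= delta) <= 2 * exp(-2 * n * delta**2)."""
    # Solve 2 * exp(-2 * n * delta**2) = 1 - certainty for delta.
    return math.sqrt(math.log(2 / (1 - certainty)) / (2 * n))

# With 100 tosses and 90% certainty the margin is fairly wide (~0.12);
# with 10,000 tosses it shrinks by a factor of 10.
print(confidence_delta(100, 0.9))
print(confidence_delta(10_000, 0.9))
```

So for eta = 0.51 from 10,000 tosses one could say, with probability at least 0.9, that P1 lies within about 0.012 of 0.51.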
So, again, I'm cutting off improbable results like these, and by cutting them off I'm decreasing my certainty about this particular inequality. But maybe not significantly; maybe a certainty of 90 percent, 0.9, that something will happen is good enough for me. So that's the most important result statistics can give you: the evaluation of a certain probabilistic characteristic of a random variable can be made within a certain range with a certain probability. If you widen the range, your certainty in your evaluation is greater, so p would be closer to 1. If you narrow the range, then the probability of P1 being within this range would be smaller. And you have to adjust your demands: how precise and how certain you would like to be.

Precision and certainty are kind of opposite things. The more precisely you would like to evaluate some characteristic, the probability in our case, the less certain you are that your evaluation is correct. The wider the range you agree to, the greater your probability would be. So they go against each other: more certainty means you have to increase the range, which means you have to decrease the precision of your evaluation; less certainty means you can make your evaluation a little narrower.

So what kind of entities are we dealing with in statistics? This is very important, and it's the result I would like you to keep in mind. We have three major components in all this philosophy. We have N, which is the number of experiments. We have a certain level of certainty which we would like to achieve: the probability of our evaluation being correct. And we have a certain range.
Sometimes the range is expressed as from A to B, and sometimes as a midpoint plus or minus a margin, in which case M is A plus B divided by 2 and delta is B minus A divided by 2. It doesn't really matter; both forms are equivalent. So: the range, the certainty, and the volume, that is, the number of experiments. Three major components in this science, if you wish, and they are interrelated.

For instance, suppose you fix the certainty, say you would like to be able to evaluate with a probability of 0.9, and you fix the range, say you would like to evaluate your probability within a range of one tenth. Then you can find out exactly what number of experiments you need. Or alternatively, if you have the number of experiments and a certain level of certainty you would like to adhere to, you would like your conclusions to be certain to a certain percentage, then that defines your range, or, if you wish, your delta, your margin of error. Or, if the volume is defined and the margin of error is defined, then you can calculate your certainty.

So basically, there is some kind of functional dependency between n, p, and delta, the margin of error: some function F(n, p, delta) = 0. Whatever the function is doesn't really matter; knowing two of the elements allows you to define the third one. If you know the number of experiments and the certainty level you would like to achieve, that dictates how wide your margin of error is. If you have defined the number of experiments and the margin of error, then your certainty comes out a certain way. Or, if you would like your margin of error to be such and such, and your certainty to be such and such, that dictates the number of experiments you have to provide.
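The three-way dependency F(n, p, delta) = 0 can be made concrete. As one possible choice of F, not the lecture's, I again use Hoeffding's bound, 2·exp(−2·n·delta²) = 1 − p; with it, fixing any two of the three quantities determines the third:

```python
import math

def n_needed(p, delta):
    """Number of tosses for margin delta at certainty p."""
    return math.ceil(math.log(2 / (1 - p)) / (2 * delta ** 2))

def certainty(n, delta):
    """Certainty achieved with n tosses and margin delta."""
    return 1 - 2 * math.exp(-2 * n * delta ** 2)

def margin(n, p):
    """Margin of error with n tosses at certainty p."""
    return math.sqrt(math.log(2 / (1 - p)) / (2 * n))

# The example from the lecture: certainty 0.9 and margin one tenth.
print(n_needed(0.9, 0.1))   # about 150 tosses suffice under this bound
```

This is exactly the triple of problems described above: knowing n and p gives delta, knowing n and delta gives p, and knowing p and delta gives n.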
So mathematical statistics actually solves these three major problems. The maximum that mathematical statistics can achieve is the following: to state that a certain parameter which we are evaluating is within a certain range with a certain level of certainty, expressed as the probability of the statement being true; and the number of experiments is involved in all of this. That's the purpose of mathematical statistics in more concrete terms. Not just to evaluate the probability, which is a very general purpose, but these three more concrete purposes: knowing n and p, to define delta; knowing n and delta, to define p; and knowing p and delta, to define n. One of these three problems is what you solve when you deal with mathematical statistics. This particular expression, that our parameter is within a certain range with a certain probability, is the key point for understanding mathematical statistics. It's never absolute; it's always related to a certain level of certainty with which we can make our conclusions.

Well, that was it, and that probably concludes my introductory part to mathematical statistics. I would suggest you actually read the notes for this lecture; they are presented on Unizor.com, and I may have explained something slightly differently in the notes than during the lecture, so you will just be more familiar with the whole approach. All right, that's it for today. Thank you very much, and good luck.