Hi, I'm Zor. Welcome to Unisor Education. We are continuing our discussion of mathematical statistics. It's part of the advanced math course for teenagers and high school students presented on Unisor.com, and that's actually where I suggest you watch this lecture from, because there are notes, exams, and the whole functionality of the website, which allows you to enroll and follow the whole progress of your education. Anyway, this is the second lecture on mathematical statistics, so let me just repeat the main point of the previous lecture, which was about the purpose of mathematical statistics. The purpose, which is in a sense the opposite of probability theory, is this: having data from past random experiments with a certain random variable, mathematical statistics tries to evaluate the probabilistic characteristics of this variable, like its mathematical expectation, its variance, and so on. And based on this empirically obtained data, we try to predict the future behavior of this random variable. Now, one extremely important detail about this process is that the random experiments, which we conduct with the purpose of getting values of the random variable, are supposed to be conducted under the same conditions again and again and again. If the conditions are changing, that means the whole probabilistic picture of this particular random variable is changing, and then the data from the past have absolutely no relevance to anything in the future. So this lecture is about stability, and let me start with a couple of examples. Let's assume you would like to know the probability of getting two aces from a deck of cards. Say five decks of cards mixed together, and you want to know the probability of drawing two aces from them. Here is the experiment you conduct.
You take five decks of cards, shuffle them together, and then pick any two cards and see whether they are aces or not. You put these two cards aside, then pull another pair: aces or not aces. Again, put it aside and write down the result. Now, how many times are you pulling cards? Let's see. You have 52 cards in a deck and five decks, so 260 cards. You take two cards every time, so that's 130 pairs. Let's assume that during these 130 drawings of a pair from the remaining cards, you get, say, three pairs of aces. And then you say: you know what, I conducted this experiment, I made 130 drawings, and three times I got my aces. So approximately, and you understand it's not precise, but approximately, the probability is 3 out of 130, something like 2.3 percent. So 2.3 percent is the probability of getting two aces from the deck. Is this right? Well, it's absolutely wrong. Let's think about what's wrong with this picture. Obviously, your experiment is not stable. The first time you're drawing two cards from 260, the second time from 258, because two cards are already out. Your experimental conditions are changing from drawing to drawing, so you cannot talk about any kind of stability, and these numbers are absolutely meaningless. Now, let's talk about something a little closer to reality. Say you are trying to predict temperature. How can you predict the temperature using some experiments? Again, let me start with a wrong experiment. You observe the temperature for a whole year, 365 days. Let's say that out of 365 observations, you have 50 days with the temperature from 0 to 10 degrees, and 100 days with the temperature from 10 to 20 degrees.
You have 150 days from 20 to 30, and 65 days above 30, so the total is 365. Those are your observations over the year. And now you say: here are the probabilities for tomorrow's temperature, today's temperature, whatever. With probability 50/365 the temperature will be between 0 and 10; with probability 100/365 it will be between 10 and 20; with probability 150/365 it will be between 20 and 30; and with probability 65/365 it will be above 30. So you think you have obtained a complete probability distribution for what the temperature will be, say, tomorrow. Is that right? Again, absolutely wrong. And the main reason it's wrong (there are many reasons, but this is the main one) is that the Earth is rotating around the Sun. There is winter, summer, autumn, spring. Different times of the year have different ranges of temperature. If today is, say, summer, and I use these probabilities, they are absolutely wrong, because most likely the temperature will lean towards the higher values. And if it's winter, again the probabilities are wrong, because most likely the temperature will be concentrated around the lower numbers. So this is also wrong. This is just an example of how easily people who do not attentively address this issue of experimental stability can go wrong. Now, these are extreme cases, and it's obvious that they are wrong. But there are much less obvious reasons. For instance, with the same climate, beyond the main factor, which in this case is the rotation around the Sun, there are less major issues: for instance, a volcano erupting, or the Sun's activity changing in a cycle.
Actually, we are talking about something like an 11-year cycle in the Sun's activity. And there are even more interesting, even less-known factors affecting our climate. The Earth rotates around its axis, but the axis does not point in the same direction all the time. It also goes through a cycle, called precession: the axis is not always directed at the same star, so to speak, but slowly circles around another axis. That also changes the seasons, because if the Earth's rotation axis were perpendicular to the plane of its orbit around the Sun, it would be warm in the middle latitudes and cold at the poles all year round; and when the axis is tilted, the situation is different: the temperatures at the poles become more extreme in summer and in winter. So there are many different aspects of this climate thing, and nobody really knows what's more important and what's less important. And I'm not denying, for instance, that the CO2 we are putting into the atmosphere is a factor. It is a factor. But how much greater this factor is than, for instance, the Sun's activity, or deforestation, or volcanoes, nobody actually knows. So statistics is a very, very easy thing to screw up. That's basically my point. Now, let me move on to correct statistics. Correct statistics are those where we observe a certain random variable under exactly the same conditions, time after time after time. Otherwise, whatever mathematical apparatus we try to apply just doesn't work. So let's consider a relatively correct experiment, for instance with the aces pulled from the deck of cards. Let's go back: you have five decks of cards, 260 cards total. You shuffle them first, then pull two cards and write down whether they are aces or not.
Now, the correct way to repeat exactly the same experiment is to return these two cards you have just pulled, whether they are aces or not, back to the deck, so that it again contains exactly 260 cards. Then you shuffle again, pull again, write down the result, put the cards back into the deck, and so on. That's how you can conduct this experiment 100, 200, or however many times. And then you can say that whatever the statistics are, they represent correct experimentation, where each random experiment is conducted under exactly the same conditions as the next one. Then the statistics will really mean something. As far as temperature is concerned, probably the best way to make the experiment as good as possible, let's put it this way, is to fix a position on Earth, a fixed time, say 12 noon, and a fixed day, say December 31st. And then year after year, on December 31st, at that particular location and time, you measure the temperature. After 100 years or so of measurements, you can say: you know what, our statistics really represent values of a stationary, stable random experiment. And then you can distribute the observed temperatures among certain ranges, and that would correspond much better. Still not ideal, because again, there are volcanoes erupting and there is the CO2 we are producing, and so on. But at least it's much better, because the main factor, the rotation around the Sun, is neutralized by repeating the experiment on exactly the same day, at exactly the same location and time. So these are good and bad statistics, so to speak. Now, back to the theory of mathematical statistics. What we really need is to repeat the experiment under exactly the same conditions. And what is the result of this experiment? Let's think about it; I'm actually approaching a very important point here.
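Before moving on, the reshuffle-and-return card protocol can be sketched as a simulation. The code itself is my illustration, not from the lecture: `random.sample` draws two distinct cards, and because the full 260-card shoe (with its 20 aces) is restored before every trial, the trials are independent and identically distributed.

```python
import random

def stable_experiment(n_trials=100_000, seed=0):
    """Each trial: from the full 260-card shoe (20 aces), draw two
    distinct cards, record whether both are aces, then put them back."""
    rng = random.Random(seed)
    shoe = [1] * 20 + [0] * 240      # 1 marks an ace
    hits = 0
    for _ in range(n_trials):
        a, b = rng.sample(shoe, 2)   # the shoe is untouched between trials
        hits += (a == 1 and b == 1)
    return hits / n_trials

exact = (20 / 260) * (19 / 259)      # exact probability, roughly 0.00564
print(stable_experiment(), exact)
```

Because every trial is run under identical conditions, the observed frequency settles near the exact probability as the number of trials grows, which is exactly the stability the lecture demands.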
The results of our experiments are numbers, which we treat as values of the same random variable, let's call it xi, observed at different times under exactly the same experimental conditions. What exactly does that mean? It means each result is a value of essentially the same random variable, but I will use an index for the experiment number. In the first experiment we have a random variable xi_1, which is exactly the same as xi, meaning it has exactly the same distribution of probabilities, and we get its value, x1. Then we conduct another experiment, independent of the first one, under exactly the same conditions, which means this variable xi_2 again has exactly the same distribution of probabilities as xi, and we get its value, x2, and so on. Finally there is the nth experiment, and again its result is a random variable xi_n with exactly the same distribution of probabilities, because all the experiments are the same: independent of each other, but under the same conditions. And we get its value, xn. Now, what do we want to do with these numbers? Most likely we want to say this: the mathematical expectation of our random variable xi is approximately equal to (x1 + x2 + ... + xn) / n. That's what we want to say. But do we have the right to say it? Let's think about it, because here is a very interesting shift. You see, xi is a random variable, while x1, ..., xn are values we have received as results of the experiments. What I would like to do now is switch from the random variable xi to its mathematical expectation, which is a constant, and treat this average of values as an approximation of that constant. But how do I know that this approximation is good?
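The averaging formula can be tried out on a toy random variable. The fair six-sided die here is my own illustrative choice, not from the lecture; its expectation is 3.5.

```python
import random

def sample_mean(xs):
    """The estimate from the lecture: E[xi] is approximately (x1 + ... + xn) / n."""
    return sum(xs) / len(xs)

# Hypothetical random variable: a fair six-sided die, expectation 3.5.
# Each call to randint is an independent repetition under identical conditions.
rng = random.Random(42)
observations = [rng.randint(1, 6) for _ in range(10_000)]
print(sample_mean(observations))   # close to 3.5
```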
Well, x1, ..., xn are individual values of individual random variables. If I conduct the experiments again, I will get different values of the same random variables. So my point is that all of these values vary as the experiments are conducted, one after another. If I run another series of experiments, I will get other numbers. But I still would like to say that their average will be close to the expectation, right? So how can I evaluate this? Here is how. Let me make the following construction: instead of using the concrete values x1, ..., xn, I will use the random variables themselves, xi_1, ..., xi_n, combined in exactly the same way: their sum divided by n. Now let me ask you: is this a random variable? You see, before, x1, ..., xn were constants and xi was the random variable. Now I'm switching: the expectation is the constant, and the average is a random variable, because we are using random variables to approximate that constant. To evaluate the quality of the approximation, I have to find out, regardless of the concrete values, how close this average of random variables is to that constant. That's the purpose of mathematical statistics here. If I can show that this average, even as a random variable, is relatively close to the constant, then my evaluation is right: no matter what numbers any particular experiments produce, since I have theoretically proven that the random variable constructed in this particular way is close to the constant, I can say the result will be a good approximation. Now, let's think: what does it mean for a random variable to be close to a constant? Well, it means it doesn't really deviate from this constant very much.
A more or less rigorous definition: if I have a random variable, let's call it eta, which has certain probabilistic characteristics, and I know that its expectation is a certain constant, and that its variance around this expectation is very, very small, whatever the word small means, then it averages out to that constant, and its deviation (variance, standard deviation, whatever) around that average is small. Then I can say it's a good approximation. Now, what small means is a different story. For instance, say the average height of a man in the United States of America is 175 centimeters. Then, if the deviation from this number is within a couple of centimeters, I can take a certain number of people, use their average height, and it will probably be close to 175, more or less. So it all depends. If you're measuring something of a given size, and you know that no matter how you measure, your deviation will not be greater than that size, is that good? For many purposes, yes. Really, every particular situation requires its own definition of what small actually means. Now, in our particular case, how can we evaluate the quality of using this random variable eta as an approximation of the expectation of xi? Let's think about it. The mathematical expectation of eta is exactly equal to the mathematical expectation of xi. Why? Because the expectation of the sum is the sum of expectations, with n as a factor in the denominator. These are all independent and identically distributed random variables, so the expectation of each of them is exactly the same as the expectation of xi. I get n times that expectation, divided by n, so it's the same. This is good, but again, it's really good only if the standard deviation of eta is really small.
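The computation just described in words can be written out as a short derivation, using xi for the observed random variable and eta for the average of n i.i.d. copies, as in the lecture:

```latex
\eta = \frac{\xi_1 + \xi_2 + \dots + \xi_n}{n},
\qquad
E[\eta] = \frac{1}{n}\sum_{i=1}^{n} E[\xi_i]
        = \frac{1}{n}\, n\, E[\xi]
        = E[\xi].
```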
So let's think about the variance of eta. If you remember, a constant factor comes out of the variance squared, so the 1/n comes out as 1/n squared. Inside, we have the variance of a sum of independent variables, which is the sum of the variances, so it's n times the variance of xi. One n cancels, and the variance of eta is the variance of xi divided by n. Now, this n is very important. It means that the more experiments we conduct, the more precise our evaluation becomes, because the variance goes down as n increases. So we have satisfied our requirements: the expectation of eta equals the constant we are trying to evaluate, and the variance of this random variable can be made as small as we want. First we obviously have to establish what exactly we want from the variance, and that's a choice which depends on the conditions: human conditions, experimental conditions, whatever. And then we can choose the number n to satisfy this particular requirement. So that's great. However, how can we do this if we don't know the variance of xi, and we don't really know the expectation of xi either? We are saying that eta is a good approximation for the constant, which is the expectation of xi, because the expectation of eta equals the expectation of xi. Okay, fine, that means it's a good approximation. But now I'm saying that the variance of eta is 1/n times the variance of xi, and we don't know the variance of xi. So how can I find out what my n should be? Well, again, I can approximate the variance of xi using the same strategy: I take ((xi_1 minus eta) squared + ... + (xi_n minus eta) squared), divided by n. That might work. I'm averaging the squared deviations from my estimate. In theory, to calculate the variance, you need the constant which is the mathematical expectation.
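To make the 1/n effect concrete, here is a small simulation sketch; the fair die is my own illustrative choice, not from the lecture. It draws many independent copies of eta = (xi_1 + ... + xi_n) / n for a die roll and checks that their empirical variance shrinks like Var(xi)/n. Note that `statistics.pvariance` computes exactly the plug-in estimator described above: the average of squared deviations from the sample mean.

```python
import random
import statistics

def variance_of_eta(n, trials=2000, seed=1):
    """Empirical variance of eta = (X1 + ... + Xn) / n over many
    independent repetitions, where each Xi is a fair die roll."""
    rng = random.Random(seed)
    etas = [sum(rng.randint(1, 6) for _ in range(n)) / n
            for _ in range(trials)]
    # pvariance is the plug-in estimator: the mean of squared
    # deviations from the sample mean, (1/N) * sum (eta_k - mean)^2
    return statistics.pvariance(etas)

var_xi = 35 / 12                 # exact variance of one die roll, about 2.917
print(variance_of_eta(1))        # close to var_xi
print(variance_of_eta(10))       # close to var_xi / 10
print(variance_of_eta(100))      # close to var_xi / 100
```

The printed values drop by roughly a factor of 10 each time n is multiplied by 10, which is the practical meaning of the variance of xi divided by n.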
But I don't know the mathematical expectation; I only know an estimate of it, an approximate value. So I can use it, and say that this is approximately the variance, just as eta is approximately the mathematical expectation. These are all approximations. Now, how good is this approximation? That's a different story. You see how complicated the whole thing is? You cannot find the proper number n without knowing the variance of xi. You don't know the variance of xi, but you can estimate it. But in that estimate you are not using the expectation of xi, because you don't know it; you are using, again, an estimate of this expectation. It gets more and more complex, and that's basically why mathematical statistics is so complex. Now, what else did I not cover? I think that's actually everything I wanted to talk about, so let me just summarize a couple of things. First of all, an absolutely mandatory condition for conducting statistical experiments, which you would like to use as the source of data for evaluating a random variable and its future behavior, is stability of the experiments and their independence. The results xi_1, xi_2, ..., xi_n must represent random variables with exactly the same distribution as xi, and they must be independent of each other, because the experiments must be independent: independent and identically distributed. There is even an abbreviation for it, i.i.d.: independent and identically distributed random variables. This is a must; this is how you should conduct experiments. And now you can say: hey, that's impossible. And you're absolutely right. Because, for instance, if you're doing experiments with the climate, like predicting the temperature, you obviously cannot exactly repeat the experiment you did, say, a day ago, because the conditions are changing.
Not only the rotation around the Sun, but lots of other things. So that's a problem. And a way to address that problem, and this is largely unknown territory, let's put it this way, is to think: maybe the experiments are not exactly independent, but almost independent. Maybe they are not exactly identically distributed, but have distributions which are very close. And then maybe you can still use the same evaluation, perhaps losing a little bit of precision, but still getting something reasonable. But pay attention to the words I just used. I said: maybe they are not exactly independent, but almost. What does almost independent mean? You need some kind of quantitative measure of independence, which is very, very difficult. There are concepts like correlation and such, but it's difficult. And the other thing: the distribution is not exactly the same, but almost the same. Again, what does almost mean? How can I judge, quantitatively, how one probability distribution differs from another? It's not easy. There are many different approaches; people address these problems in different ways, and all of them make sense. So it's not a completely established theory. I just wanted to emphasize how difficult this is, and how carefully you should approach any kind of statistical result presented to you, whether it's a prediction of who will be the president of the United States or of what the temperature on Earth will be 50 years from now. All these numbers are good to have, but you should always approach them very, very cautiously, and with a big grain of salt. Let's put it this way.
So in theory, and we will probably be talking about theory only, I do consider the experiments conducted prior to the evaluation of the probabilistic characteristics to be independent and identically distributed, just like the variable in question. And then I will try to construct certain formulas, like the ones above, which will allow us to use the concrete values of these random variables to evaluate certain characteristics of the random variable we are studying. That's it. I hope I scared you a little bit today, because I really wanted you to understand that statistics is not quite as mathematically rigorous a subject as you might think, and you have to approach it very, very cautiously. All right, that's it for today. Thank you very much, and good luck.