Hi, I'm Zor. Welcome to Unizor education. I would like to solve a problem about statistical distribution. I'm going to solve it within the framework of task A, as I formulated it, and I will go into the details. By the way, this lecture is presented on Unizor.com, and that's where I suggest you watch it, because there are detailed notes for every lecture. Plus, there is some functionality: you can enroll in the course, take exams, etc., if you're registered. So that is the better place to watch this lecture, although it is on YouTube as well.

All right. First, let me briefly mention what I mean by task A when we're talking about statistical distribution. Consider a random variable which takes values x1, x2, etc., xk with unknown probabilities p1, p2, etc., pk. You know the theoretically possible values the random variable can take, and you know certain statistics about what exactly happened with it: in n experiments it took the value x1 a certain number of times, x2 a certain number of times, and xk a certain number of times. Based on these numbers, I would like to estimate what these probabilities are and, what's very important, what the margin of error is.

Now, obviously, the best estimates of these probabilities are the empirical frequencies, right? The number of times x1 occurred divided by n is the best we can do for p1, the number of times x2 occurred divided by n is the best we can do for p2, etc. So that's obvious. The next thing is to evaluate the margin of error, that is, where exactly our real probabilities are relative to the empirical frequencies. And I'm going to do this based on a concrete problem, which I'm going to specify right now.

So here is the problem. Let's consider that in some country we have elections. We have three different parties which are offering their candidates for the top position: the white party, the blue party, and the red party.
So each party has a candidate, right? And then each person in the country votes for the candidate this particular person prefers to see at the top. Now, we have three positions: the president, the vice president, and the minister of defense, or defense minister, okay? These three people will take these three positions as follows: whoever gets the most votes will be the president, the next one will be the vice president, and the third of these three will take the defense minister's portfolio.

Now our statistical task, before the real election has started, is to evaluate the chances of each of the parties, right? So our purpose is to make certain predictions for the election based on certain statistical results. Here are the statistical results. There was a survey of 4,000 people. Out of these 4,000 people, 1,480 said that they prefer the white party candidate, 1,320 prefer the blue party representative, and 1,200 said they would prefer the representative of the red party.

Now the first problem: what are the chances, or what are the probabilities, for each of those guys to have the top position? It's obvious. It's 1,480 divided by 4,000 for the white party, which is 0.37. So that's the estimated probability that a voter prefers the white party candidate, and that's what I use as his chances to become the president. For the blue party we have, obviously, 1,320 divided by 4,000, which is 0.33, and for the red party candidate the probability is 1,200 divided by 4,000, which is 0.30. So that's easy.

Now the question is, what are the margins of error in these cases? Here's how we can calculate that. I could just refer to the previous lecture for the formula, but I would like to remind you of the derivation. Here's how we will approach it. Let's introduce a new random variable, beta, which is equal to 1 or 0, and I'm talking right now only about the white party candidate.
So I would like to evaluate the margin of error of this particular probability, 0.37. I'm introducing a new variable beta which, for each of the 4,000 people, equals 1 if this person votes for W and 0 if not. So if the voter names the white party candidate as his favorite, then beta is equal to 1; otherwise beta is equal to 0.

First of all, let's consider the true distribution of beta. Suppose the true probabilities of becoming the president for the representatives of the white, blue and red parties are correspondingly PW, PB and PR. We don't know these probabilities; we are evaluating them, right? The numbers 0.37, 0.33 and 0.30 are our evaluations. So if PW is the true probability, then the probability of beta being equal to 1 is PW, and the probability of beta being equal to 0 is 1 minus PW. Its mathematical expectation is also PW, and as far as the variance is concerned, as I calculated many times, it's PW times (1 minus PW). I also mentioned that this expression is less than or equal to one quarter if PW is a probability, which means it's between 0 and 1. This is just a regular quadratic polynomial, and on the segment from 0 to 1 one quarter is its maximum, reached when PW is equal to one half. Remember, the parabola opens downward. So I can use this to evaluate the variance of beta. Even if PW is unknown to me, I can use the maximum value the variance can take, and that determines a certain precision of my evaluation of the probabilities, right?

Now, I'm talking about 4,000 different people, each of whom votes for the representative of the white party with this probability. So I have 4,000 identically distributed random variables, beta 1 through beta 4,000, which have exactly the same distribution as the beta I defined for one particular person, and I am averaging them, right? Now what is this?
This is actually one experiment, which gave me 1,480 divided by 4,000, right? So this is my 0.37 evaluation, because 1,480 of these betas are ones and the rest are zeros. So basically this average is a random variable on which I made an experiment once and got the value 0.37. It's a new variable; let's call it eta. Now what is the expectation of eta? Obviously it's the same as the expectation of beta, which is PW, right? Because expectations add together, and these are all the same, so it's 4,000 of them divided by 4,000; it's still the same. The variance of eta is obviously smaller: it's PW times (1 minus PW) divided by 4,000. Remember, for the average of independent identically distributed variables you factor out 1 over 4,000 squared, and then there are 4,000 equal variances added together, so that will be the result, which is obviously no greater than 1 over (4 times 4,000). So that's how I evaluate the variance of eta. That's actually a very big deal, because it allows me to evaluate the margin of error of this particular estimate which I made.

Now, here's how. Since I know a bound on the variance, I know a bound on the standard deviation: it's less than or equal to 1 over (2 times the square root of 4,000), right? Because the standard deviation is the square root of the variance. Now, if I would like to do my evaluation with 95% certainty, which is kind of standard, this is an interval of 2 sigma on each side of the midpoint, right? The midpoint is 0.37, and 2 sigma is at most 1 over the square root of 4,000, which is approximately 0.0158. So that's my margin of error for this particular number. Now, what does it mean? It means that the real probability is, with 95% certainty, between 0.37 minus this number and 0.37 plus this number, that is, from 0.3542 to 0.3858. That's my interval considering the margin of error.
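As a minimal sketch of the calculation above, the following Python reproduces the crude margin of error for the white party candidate. The function name `crude_margin` and the variable names are illustrative assumptions, not part of the lecture; the numbers 4,000 and 1,480 come from the survey.

```python
import math

def crude_margin(n: int) -> float:
    """Upper bound on the 2-sigma margin of error of an empirical frequency.

    Uses Var(beta) = p * (1 - p) <= 1/4, so the standard deviation of the
    average of n independent indicators is at most 1 / (2 * sqrt(n)).
    """
    sigma_bound = 1.0 / (2.0 * math.sqrt(n))
    return 2.0 * sigma_bound

n = 4000            # number of people surveyed
white_votes = 1480  # respondents preferring the white party candidate

freq_white = white_votes / n
margin = crude_margin(n)
low, high = freq_white - margin, freq_white + margin

print(f"frequency = {freq_white:.2f}, margin = {margin:.4f}")
print(f"95% interval: {low:.4f} to {high:.4f}")
```

Because this bound uses only n, the same margin applies to any candidate in a survey of this size.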
So with 95% certainty, I can say that the probability of the representative of the white party becoming the president is in this interval. Okay, now let's talk about the other two parties. I will apply the same logic to the representative of the blue party. Obviously, the margin is exactly the same, because it does not depend on anything but the number of people in the survey, which is 4,000. So this value, 0.0158, is based just on that number, which means my margin of error will be exactly the same for the blue party representative when we evaluate his chances to become the president. Same thing with the red guy. So let me just put it down. The probability for the blue party representative is from 0.33 minus this, which is 0.3142, to 0.33 plus this, which is 0.3458. And the same thing for the red party: minus would be 0.2842 and plus would be 0.3158. So that's the result of my calculations based on a crude evaluation of the variance of all the random variables I am talking about, and those are the intervals to which, with 95% certainty, our probabilities belong.

Now, is it sufficient to predict the results of the election? Not exactly. Look, the white interval is definitely above everything else: even the maximum of the blue interval is still smaller than the minimum of the white interval. This is 0.3458 and this is 0.3542, which means that with 95% certainty I can say that the white party candidate will become the president. Now, how about the vice president and the minister of defense? We know that the vice president position should be awarded to the guy who comes second, right? But now you see the blue interval and the red interval intersect. The lower end of the blue one is 0.3142 and the upper end of the red one is 0.3158, which is bigger. So the intervals overlap, which means we cannot differentiate between these two probabilities with 95% certainty. That's what's very important.
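The comparison of intervals described above can be sketched in Python as follows. The dictionary layout and the helper `overlaps` are hypothetical names chosen for illustration; the counts are the survey numbers from the lecture, and the margin is the crude bound 1 over the square root of n.

```python
import math

n = 4000
votes = {"white": 1480, "blue": 1320, "red": 1200}  # survey counts from the lecture

margin = 1.0 / math.sqrt(n)  # crude 2-sigma bound, the same for every candidate

# 95% interval (low, high) around each empirical frequency
intervals = {party: (count / n - margin, count / n + margin)
             for party, count in votes.items()}

def overlaps(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]

white_blue_overlap = overlaps(intervals["white"], intervals["blue"])
blue_red_overlap = overlaps(intervals["blue"], intervals["red"])

print("white vs blue overlap:", white_blue_overlap)
print("blue vs red overlap:", blue_red_overlap)
```

With these numbers the white and blue intervals do not overlap, while the blue and red intervals do, which is the situation described in the lecture.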
So what we have done is sufficient to predict who will be the president with 95% certainty. But it's not sufficient to predict who will be the vice president and who will be the defense minister. Now, if I use exactly the same methodology and want a more precise evaluation, I need the number of people to be greater. 4,000 is not enough, because 4,000 gives me only this precision, and this precision is too crude to differentiate between these two guys.

Now, suppose I get more people surveyed, and let's just assume for a second that the proportions stay the same. That is not necessarily the case, but let's assume that if I ask more people the same question about their preferred candidate, I will have the same empirical frequencies. Then I need the intervals around these two numbers, 0.30 and 0.33, not to overlap, which means the double margin of error should be smaller than the distance between these two numbers. Right now it's not good, because the distance between them is 0.03, right? But the double margin is greater: it's 0.0316. I need it to be smaller. So I need 2 over the square root of n, where n is the number of people surveyed, to be smaller than 0.03. In other words, 1 over the square root of n should be smaller than half of this distance, 0.015. That's my condition, and from it I can derive what n is supposed to be. Square this and invert, and you will get n greater than 4,444 and a fraction, something like this. I did it once.

So 4,000 is not sufficient to differentiate when you have such a difference between the empirical frequencies. But with the same empirical frequencies, about 4,445 people would be sufficient, because then you will definitely differentiate between the second and the third place. Then you can say that 0.33 is, with 95 percent certainty, greater than 0.30.
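The sample size condition above can be sketched in Python like this. The function name `min_sample_size` is a hypothetical helper, and the sketch relies on the same assumption as the lecture: the empirical frequencies stay the same when more people are surveyed, and the crude bound 1 over the square root of n is used as the margin for both candidates.

```python
import math

def min_sample_size(gap: float) -> int:
    """Smallest integer n with 2 * (1 / sqrt(n)) < gap.

    Two intervals of half-width 1 / sqrt(n) stop overlapping once the
    double margin is strictly smaller than the gap between their midpoints.
    """
    threshold = (2.0 / gap) ** 2       # n must be strictly greater than this
    return math.floor(threshold) + 1   # first integer above the threshold

gap = 0.03  # distance between the blue (0.33) and red (0.30) frequencies
n_needed = min_sample_size(gap)

print("double margin at n = 4000:", round(2.0 / math.sqrt(4000), 4))
print("minimum n:", n_needed)
```

The threshold is about 4,444.4, so the first whole number of respondents that satisfies the strict inequality is 4,445.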
With only 4,000 people, you cannot say with 95 percent certainty that 0.33 gives the blue guy better chances than 0.30 gives the red guy. Okay. This was based on a very crude evaluation of my variance, based on just the number of experiments, the number of people we have surveyed.

Now, there is a better way. We talked before about sample variance, which is slightly better than this maximum which I have just used. Let's do the same calculations based on sample variance, because that's what practical people actually do in cases like this. Now, what's good and what's bad about sample variance? Well, the good thing is that it gives you a somewhat more precise value of the variance. The bad thing is that it introduces a certain additional element of uncertainty, however small it is.

So, let's do that. Instead of saying about this variable beta that its variance is just less than or equal to one quarter, I would like to evaluate it better. How can I do it? I will use the sample variance. What is variance? Variance is the average of the squared deviations, right? So let's just compute this average of squared deviations. If you remember, we had these statistics out of 4,000 for white, blue and red: 1,480, 1,320 and 1,200. These are the preference numbers. The random variable beta is equal to one if the person chose the white party candidate and zero otherwise. So in 1,480 cases my variable beta took the value one, and the square of its deviation from the empirical frequency is (1 minus 0.37) squared. In the rest out of 4,000, the rest is what? 2,520 cases, it took the value zero, with squared deviation (0 minus 0.37) squared. I add all of these up and, if you remember, I have to divide not by 4,000 but by 3,999 to make this evaluation of my variance unbiased.
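A minimal Python sketch of the sample variance just described, for a variable that takes only the values 1 and 0. The function name `indicator_sample_variance` is an illustrative assumption; the counts 1,480 and 4,000 are the survey numbers for the white party candidate.

```python
def indicator_sample_variance(ones: int, n: int) -> float:
    """Unbiased sample variance of n observations, `ones` of which are 1 and the rest 0."""
    mean = ones / n          # empirical frequency of the value 1
    zeros = n - ones
    # Sum of squared deviations from the empirical frequency
    squared_deviations = ones * (1 - mean) ** 2 + zeros * (0 - mean) ** 2
    # Divide by n - 1 rather than n to keep the estimate unbiased
    return squared_deviations / (n - 1)

s2_white = indicator_sample_variance(1480, 4000)
print(f"sample variance = {s2_white:.5f}")
```

The result is about 0.23316, below the crude bound of 0.25.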
Again, if somebody doesn't remember where I got 3,999, just go to the previous lecture, where I talk about sample variance. The value of this is about 0.2331. Now, is it better than the crude bound? Yes, it is. That was 0.25; this is 0.2331, which is slightly less. What I can say is that sample variance, in this particular case at least, does give you a somewhat more precise evaluation of the variance, which means the interval around the empirical frequency where your real probability lies will be narrower. That is good, because it allows you to distinguish the probabilities for different parties a little more precisely. So maybe this is even sufficient to differentiate between all three candidates, because the crude evaluation was not sufficient to distinguish between the vice president and the defense minister, right? Between blue and red. Maybe this one will be.

So for this 0.37 I can use this particular variance, 0.2331, and that's a little bit better than 0.25. Now, this is the variance of beta, right? But we are talking about (beta 1 plus etc. plus beta 4,000) divided by 4,000, right? This is the average of the random variables, one for each individual who was surveyed, and its observed value is 0.37, my empirical frequency. So the variance of this estimate would be 0.2331 divided by 4,000, okay? From this I take the square root, and that's my standard deviation sigma. And 2 sigma, which I need for the 95% margin of error, would be in this particular case 0.0153, okay? It's a little bit better than before: before I got 0.0158 for all three candidates. Now for the first candidate I've got this margin, which gives me a slightly narrower interval for the real probability. So my PW would be from 0.37 minus this, which is 0.3547, to 0.37 plus this, which is 0.3853, okay?
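The margin based on sample variance can be sketched in Python as below. The function name `sample_variance_margin` is a hypothetical helper. It relies on one algebraic shortcut that the lecture does not state: for a 0/1 variable with empirical frequency f, the sum of squared deviations equals n times f times (1 minus f).

```python
import math

def sample_variance_margin(votes: int, n: int) -> float:
    """2-sigma margin of error using the unbiased sample variance of a 0/1 indicator."""
    f = votes / n
    # For a 0/1 variable the sum of squared deviations from f is n * f * (1 - f),
    # so dividing by n - 1 gives the unbiased sample variance.
    sample_var = n * f * (1 - f) / (n - 1)
    # The average of n independent copies has variance sample_var / n.
    return 2.0 * math.sqrt(sample_var / n)

n = 4000
white_votes = 1480
f_white = white_votes / n
m_white = sample_variance_margin(white_votes, n)

print(f"margin = {m_white:.4f}")
print(f"95% interval: {f_white - m_white:.4f} to {f_white + m_white:.4f}")
```

Unlike the crude bound, this margin depends on the observed counts as well as on n, so it has to be recomputed for each candidate.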
Got that. Now, if you remember, with the crude methodology I didn't really have to calculate the margins of error separately for the other two guys. Now I have to, because I have to take the exact statistics into consideration to evaluate my variance. So exactly the same evaluation of 2 sigma for blue gives me 0.0149, and the evaluation for the red guy gives me 0.0145. So I have different numbers, as you see, for different candidates now, because the margin depends not only on the number of people in the survey but also on the exact statistics. The empirical frequencies are 0.33 and 0.30, if you divide the counts by 4,000. So the probability of the blue party candidate becoming the president is from 0.33 minus 0.0149, which is 0.3151, to 0.3449. And the probability of the red party candidate becoming the president is from 0.30 minus 0.0145, which is 0.2855, to 0.3145.

And now what's important: the maximum for the red guy, 0.3145, is less than the minimum for the blue guy, 0.3151, which means that these intervals are not intersecting anymore. They're not overlapping. So by replacing the crude methodology of evaluating the variance with a more precise evaluation based on the sample variance, we get non-intersecting intervals, which means that with 95% certainty I can say that it's the blue guy who will be the vice president, and the red guy will be the defense minister.

So the purpose of all these calculations is just to show how to use the real results to evaluate unknown probabilities. We are talking about task A of statistical distribution, when you know that there are certain predefined results of an experiment and you just don't know the probabilities with which these results are obtained in random experiments, right? And the results depend very strongly on the number of experiments we are doing. As you see, in this particular case of a relatively close distribution among the participants, you need to survey a lot of people to make your evaluation sufficiently precise to differentiate between the positions.

Well, that was my first problem, and it's related to task A. Now I will probably spend some time on analogous problems in some other cases, for instance when you don't know the exact values our random variable can take. That may be in the discrete case or in some continuous cases, where the results are either bounded or unbounded, etc. But in all these cases we have certain preliminary assumptions and preliminary calculations which will eventually fall into the same task A algorithm. So it's very important to understand how all these manipulations are done in this particular case, because all the other tasks we have, B, C and D, will basically follow the same logic, and I will spend much less time presenting them.

Now, I do suggest you read the notes for this lecture again. And what would be even better than just reading my notes: forget about whatever you saw on the screen right now and try to do the same calculations yourself. If you come up with the same numbers, that would be great. If you don't, think about who made the mistake. Maybe I made a mistake. All right, that's it for today. Thank you very much and good luck.