Hi, I'm Zor. Welcome to Unizor education. I would like to continue talking about the Bernoulli distribution and Bernoulli statistics as part of this advanced course of mathematics for high school students. I suggest you watch this lecture from Unizor.com, because the site contains notes, problems with solutions, and exams for registered students, so it's a whole course for either self-study or as auxiliary material for regular school. All right, in the previous lecture we were talking about the problem and its solution, and now I would like to illustrate what I said about Bernoulli statistics with three problems, which are actually one problem turned three different ways. There are three very important characteristics when you are talking about statistics: the volume of the sample, that is, the number of experiments; the margin of error you would like to have, or already have based on your data; and the level of certainty, which is the probability that your evaluation is correct. These three statistical parameters are closely related to each other, and the three problems I'm going to present today show how to evaluate one parameter based on the two others. Okay, so let's consider a manufacturing facility that makes, say, car parts, and there is quality control. Quality control takes a sample of, let's say, 10,000 parts manufactured in this facility during a certain period of time and finds that 300 of them are defective. They don't pass the QA control. Something is wrong. Why it happens is a different story; it depends on what kind of problems were identified with these 300.
Now, as a person who is more interested in the financial situation of this particular manufacturing facility, I might just want to know how probable the manufacturing of a defective part is. Obviously, if I have a choice between two manufacturing facilities to invest my money in, I would like to invest in the one which is more reliable, which has fewer problems like this. So I would like to evaluate how big this probability actually is. I can do exactly the same with another facility, which has another number of parts taken for quality control and another number of defects, but how can I compare them? I have to compare with a certain mathematical validity. So my first question is this: I would like to evaluate the probability of manufacturing a defective part with a certain level of certainty, and my level of certainty is 0.9545. Now, it's not just an arbitrary number. If you remember, the normal distribution has sigma rules, and they are as follows. You have a bell curve, which is the distribution of a normal random variable around its mean value mu. This is the interval of one sigma around the mean, from mu minus sigma to mu plus sigma. This is from minus 2 sigma to plus 2 sigma, and this is minus and plus 3 sigma. The probability of a normally distributed random variable to be within one sigma of its mean, which is a relatively narrow strip of values, is about 0.6827. The two sigma rule answers the question: what's the probability to fall between mu minus 2 sigma and mu plus 2 sigma? That's obviously bigger. That's 0.9545, and this is exactly the number I have chosen. And the 3 sigma interval is a very wide interval around mu; its probability is 0.9973, if I'm not mistaken. So that's the reason why I have chosen this particular certainty level.
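For readers who want to check the sigma rules numerically, here is a minimal Python sketch. It is not part of the lecture; the function name `within_k_sigma` is made up for illustration, and it relies on the standard identity that a normal variable falls within k sigma of its mean with probability erf(k / sqrt(2)).

```python
import math

def within_k_sigma(k: float) -> float:
    """Probability that a normal random variable falls within
    k standard deviations of its mean (illustrative helper)."""
    return math.erf(k / math.sqrt(2))

# Print the one, two and three sigma probabilities to 4 decimal places.
for k in (1, 2, 3):
    print(f"{k} sigma: {within_k_sigma(k):.4f}")
```

The two sigma value is the 0.9545 certainty level used in the problems.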
So with the probability of 0.9545, or I would rather say with the certainty level of 0.9545, I would like to evaluate the probability of manufacturing a defective part. Now obviously my best approach would be just to divide 300 by 10,000. Why? Well, let's go back to the definition of probability. One of the definitions, and probably the most natural one, is that probability is related to frequency. If you would like to know the probability of a certain event, you observe whether this event occurs or not during a certain number of experiments. As the number of experiments goes to infinity, the statistical frequency of occurrence of this event tends to some number, which is defined as the probability of this event. Now obviously it's an open question whether there is a limit, whether it's a unique limit, et cetera, but that's beside the point. The frequency approach to the definition of probability is natural and everybody understands it. So if I have 300 defective parts out of 10,000, it prompts me to say that most likely the real probability of manufacturing a defective part should be somewhere around 300 divided by 10,000, which is 0.03. So my best estimate for the probability of manufacturing a defective part would be 0.03. Now in the previous lecture, when I was talking about solutions, I suggested that the best way to approach this problem mathematically is to consider every test I have made in my quality control, and I've made 10,000 tests, to be a Bernoulli random variable which has the value 1 or 0. 1 means a defective part, 0 means a good part, not defective. And we are interested in the real probability p of this random variable taking the value 1. So we have experimented with this random variable, and in the series of 10,000 experiments we have come up with a frequency of 300 over 10,000.
Now if somebody else makes another series of 10,000 experiments, he will obviously have a different number, maybe 310 or 250, who knows. So what I have just done is calculate the frequency, which can be expressed as follows. Say the result of the first experiment is x1 and the result of the last experiment is xn, where n is equal to 10,000. These are the results of my experiments with xi. I know that 300 times I have 1s somewhere here, and the rest are 0s. So if I sum them up and divide by n, that would be my frequency, or sample mean, as we say. Again, the sum of Bernoulli variables is the number of times the event occurs, because we have assigned 1 when it occurs and 0 when it does not. So their sum divided by n is my sample frequency, which in this case is 0.03. Now if I make another series of n experiments, I will have another number. So what I would like to say is that this is actually a single value of some random variable. This random variable is eta, which is equal to xi1 plus, etc., plus xin, divided by n, where every one of xi1, xi2, etc. is distributed exactly the same way as xi, and they are all independent. Each of them represents an individual experiment. So I have 10,000 experiments, I have 10,000 random variables. Each one of them is 1 with probability p and 0 with probability 1 minus p, and their sum divided by n is a probabilistic picture of what I have done. What I have done is take one single value of eta, one single series of experiments, and I've got this particular number. Now, as you know, for a Bernoulli variable the mathematical expectation of xi is 1 times p plus 0 times 1 minus p, so the whole thing is equal to p. So the mathematical expectation of xi is p. And eta, it seems, should have the same mathematical expectation as xi, right? Mathematical expectation of eta is what?
1 over n can be brought out of the expectation, then the mathematical expectation of a sum is the sum of expectations, and each one is exactly the same p as for xi. So I will have np divided by n, which is exactly the same p. So the mathematical expectations of these two are exactly the same. That's why the single value of eta which I have obtained might actually be an approximation of the real p. And how good is it? Well, as we were saying, a good measure is the variance. So what's the variance of eta? We were discussing it in the previous lecture. The variance of eta is the variance of xi divided by n. Again, it's very easy, because variance is a quadratic thing, right? A constant factor is taken out squared, so it will be 1 over n squared times the variance of the sum. The variance of a sum of independent variables is the sum of the variances, so it will be n variances of xi divided by n squared, which is the variance of xi divided by n. Now what is the variance of xi? The variance of xi is equal to p times 1 minus p. That's what I derived in the previous lecture, and we derived it many times before when discussing Bernoulli variables. All right, so I have the variance, and the fact that the denominator contains n is very encouraging. It means that as n grows, my variance is getting smaller and smaller. Now variance is a measure of deviation from the mean value, right? So the deviation from the mean gets smaller and smaller as n grows. That's a good thing. Our distribution becomes more and more concentrated around its mean value. And that's why even a single value of our random variable eta might be a good approximation of its mean value. The mean value is p, which is what I would like to know and don't know. But a single value of eta would be a good approximation of the mean value, because all the values of eta are concentrated around it.
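The two facts just derived, that eta has mean p and variance p times 1 minus p divided by n, can be checked exactly for small n. The Python sketch below is only an illustration, not something from the lecture: the function name `sample_mean_moments` and the chosen values of n are assumptions. It sums over the binomial distribution of the number of 1s among n independent Bernoulli variables.

```python
import math

def sample_mean_moments(n: int, p: float) -> tuple[float, float]:
    """Exact mean and variance of eta = (xi_1 + ... + xi_n) / n,
    where the xi_i are independent Bernoulli(p) variables."""
    mean = 0.0
    second_moment = 0.0
    for k in range(n + 1):
        # Probability that exactly k of the n experiments give 1.
        prob = math.comb(n, k) * p**k * (1 - p) ** (n - k)
        value = k / n  # the value eta takes in that case
        mean += value * prob
        second_moment += value**2 * prob
    return mean, second_moment - mean**2

p = 0.03  # illustrative probability of a defective part
for n in (10, 100, 1000):
    mean, variance = sample_mean_moments(n, p)
    print(f"n = {n}: mean = {mean:.6f}, variance = {variance:.3e}, "
          f"p(1-p)/n = {p * (1 - p) / n:.3e}")
```

The mean stays at p for every n, while the variance shrinks in proportion to 1 over n. The sketch stops at n = 1000 because this direct summation is only meant for small n.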
And how well they are concentrated, how close they are, that's what the variance measures, and this measure is getting smaller and smaller, which is a good thing. Now another assumption is that this sum, with relatively large n, is very close in its distribution to the normal distribution. We have spent a lot of time discussing this issue. This is the central limit theorem: the sum of random variables, under relatively liberal conditions, is distributed closely to a normal distribution with the same mean and variance as the sum. So I can say that eta is almost a normal random variable with mean value p, so the mean value is here, that's my p, and with this variance, which is getting smaller and smaller as n goes to infinity. Now how can I estimate where my mean value p is if I have only a single value of this random variable? I can do it using the sigma rules. I know that the 2 sigma interval around the mean value is the interval where my random variable falls with probability 0.9545. And that's the certainty level which I would like. So I can say that with this certainty level the random variable eta is within the 2 sigma interval from the mean. So 2 sigma around my unknown probability p is actually the margin of error which I need to establish. I can say that the value of the frequency, which is 0.03, is a good estimate of the mean of this random variable. And the quality of this estimate is measured as follows: with probability 0.9545 it is within the 2 sigma interval around the unknown mean. It doesn't matter that the mean is unknown, because this is my estimate and I know that the mean is within the 2 sigma range from it, plus or minus. Now that's great, but I don't know my sigma, because it depends on the variance of xi, and the variance of xi again depends on the unknown probability p.
But here is what I suggested the previous time as a solution to this problem. It's not an ideal solution, let's put it this way, but it's a very practical solution in many cases. Again, my variance is equal to p times 1 minus p, divided by n, right? What I suggested in the previous lecture is this: yes, I don't know p times 1 minus p, however the function x times 1 minus x is a parabola, right? 0 and 1 are the points where it intersects the x axis, and its maximum is in between, at one half. And at one half the maximum is equal to one quarter. We don't care about the parts of the parabola outside of these points, because the probability cannot be less than 0 or greater than 1. So I can say that the maximum of p times 1 minus p is one quarter; this I definitely know. So now this is already something which I do know, because I know n, it's 10,000. I can say that my variance is not greater than 1 over 4n, which means my standard deviation, which is the square root of the variance, is not greater than 1 over 2 square root of n. So although I don't know the variance of eta exactly, I know its upper bound. If I use this bound for my 2 sigma, it will be a wider interval than in reality, but at least it's some measure for the evaluation. So what is this? Well, n is 10,000, so the square root of n is 100, and sigma is at most 1 over 200. Now I'm talking about the rule of 2 sigma, right? 2 sigma is one hundredth, 0.01, which means that the real probability would be within 0.03 plus or minus 0.01. So it's between 0.02 and 0.04 with certainty level 0.9545. This is actually a very simple problem which I have spent lots of time discussing, and I just wanted you to understand that you cannot just say that the frequency is 0.03, which means the probability is somewhere around 0.03, if you do not define "somewhere around". Now "somewhere around" means the following. The real interval is probably narrower than this, because we have taken the upper bound, but at least I can say that the real probability is between 0.02 and 0.04 with this certainty level.
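The arithmetic of this first problem can be written as a short Python sketch. It is an illustration only; the function name `margin_upper_bound` is made up, and it uses the conservative bound on sigma, 1 over 2 square root of n, described above.

```python
import math

def margin_upper_bound(n: int, k_sigma: float = 2.0) -> float:
    """Conservative margin of error: k sigma, with sigma replaced
    by its upper bound 1 / (2 * sqrt(n))."""
    return k_sigma / (2 * math.sqrt(n))

n = 10_000        # number of parts tested
defective = 300   # number of defective parts found

frequency = defective / n        # sample frequency, the estimate of p
margin = margin_upper_bound(n)   # 2 sigma margin for certainty 0.9545

print(f"estimate: {frequency:.2f}")
print(f"margin of error: {margin:.2f}")
print(f"interval: from {frequency - margin:.2f} to {frequency + margin:.2f}")
```

Because the upper bound of sigma is used, the interval is at least as wide as the true 2 sigma interval.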
It's not 100% certain, by the way. The probability can be significantly greater or significantly smaller than this. But based on what I know right now, I cannot say anything better than this. So that's already something, which is good. Now I would like to go through the two other problems much faster. You see, in this problem we had n and we had the certainty level p. Well, I use lowercase p for it. This is not the uppercase P, which is the probability of xi having the value 1. And what have I derived from these? I have derived the margin of error, which is 2 sigma, which is 0.01, one hundredth. In the two other problems I will just change the setup: I will be given a different pair of parameters and will determine the third one. So that's the end of the first problem. Based on the number of experiments and the required level of certainty, 0.9545, I have derived the margin of error. So from n equal to 10,000 and certainty level 0.9545, my margin of error is equal to one hundredth. All right, let's go to the next problem, and it will be much easier, since we basically know what we are doing. By the way, I do suggest that after you finish this lecture you go to the website Unizor.com and read the notes, because they are like a textbook. These are my notes for this lecture, which I wrote before starting it. All right, now what if, under the same circumstances, I would like to narrow my margin of error from one hundredth to five thousandths, from 0.01 to 0.005, so to reduce it by half? What would be the certainty level of my evaluation in this case? What I am saying is that the probability, this is the uppercase P, is between 0.025 and 0.035. What would be the probability of this being true? So I have reduced the margin by half. Going back to what I was just saying, before, my margin of error was equal to 2 sigma. Now I would like to reduce it by half, so my interval would be half of that one. So it would be the single sigma rule. Now single sigma rule for normal distribution is this.
If we require a narrower interval around my sample average, my sample mean, then obviously I can say it only with less certainty. I know that within the larger interval the certainty is 0.9545. But if I would like to say that the probability is actually much closer to my estimate, I cannot be as certain. So my certainty level goes down to that of a single sigma. So what is the probability in this case? Well, obviously it is about 0.6827, because everything else is exactly the same. All my calculations are the same: my n is the same and my number of defective parts is the same, so my sample average is also the same. And I am requiring an interval of 0.005 around it, so it's from 0.03 minus 0.005 to 0.03 plus 0.005. And I have already calculated my sigma, which was equal to 1 over 200, which is exactly 0.005. That's why I have chosen this margin, and that's why I have applied the rule of single sigma. So that's my second problem: to establish the level of certainty if my margin of error is given. In the first problem I fixed my level of certainty and then found what my margin of error should be. In that case, when the certainty was 0.9545, my margin of error was greater: 0.01, one hundredth. But if I am satisfied with a smaller certainty level, I can narrow my interval in half. And the third problem is: what if I would like to keep the 2 sigma certainty level and also have a little bit more precise evaluation? So I would like my margin of error to be this small, 0.005. With my 10,000 experiments I can achieve this margin of error only with the smaller certainty level. But now I would like to have more certainty, and to do this I have to have more experiments. The question is how many. Well, let's just think about it. Sigma, if you remember, was bounded by 1 over 2 square root of n, right? And that's why if I would like this to be equal to 0.005, just one second.
No, sorry, I need this certainty level. I would like to have certainty level 0.9545, but I do not know my n. That's my problem. So with this certainty level, what interval around my mean value do I have to be satisfied with? It's 2 sigma, right? This certainty level is associated with the 2 sigma rule. I know that if I am between minus 2 sigma and plus 2 sigma, then I will be within 0.9545. So I need 2 sigma to be equal to 0.005, because that's the margin of error which I would like to have. Which means that 2 divided by 2 square root of n is equal to five thousandths, which is 1 over 200. So the square root of n is equal to 200, and n is equal to 40,000. So if I would like to reduce my interval by half relative to my first problem, where I was requiring this same probability but was satisfied with a 0.01 margin of error, then my number of experiments should go up by a factor of 4. That's because variance is a quadratic function: the margin of error shrinks only as the square root of n. So if I would like to have this precision, that is, if I would like the probability of my p being within five thousandths of my average value, which is 0.03, to be equal to 0.9545, then my number of experiments should be 40,000. So these are basically three typical problems which statisticians are solving all the time. Let me make a quick summary. In the first problem I was given the number of experiments and the level of certainty, and I derived my margin of error. In the second problem I was also given the number of experiments, but I defined my margin of error to be twice as narrow, and then I derived that the certainty level of my evaluation was actually smaller. The first was 0.9545, the second was 0.68 something. And the final problem was the third one: if I have the level of certainty and the margin of error, what is the number of experiments which I have to conduct? I don't know which problem is more prevalent.
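The second and third problems can be sketched in Python as follows. This is only an illustration under the same assumptions as in the lecture (eta is approximately normal, and sigma is replaced by its upper bound 1 over 2 square root of n); the function names are made up for this example.

```python
import math

def sigma_upper_bound(n: int) -> float:
    """Upper bound of the standard deviation of the sample frequency."""
    return 1 / (2 * math.sqrt(n))

def certainty_for_margin(n: int, margin: float) -> float:
    """Problem 2: certainty level for a given n and margin of error,
    using the normal approximation and the upper bound of sigma."""
    k = margin / sigma_upper_bound(n)   # margin measured in sigmas
    return math.erf(k / math.sqrt(2))

def experiments_for_margin(margin: float, k_sigma: float = 2.0) -> int:
    """Problem 3: number of experiments so that k sigma does not exceed
    the margin; solves k / (2 * sqrt(n)) = margin for n."""
    n = (k_sigma / (2 * margin)) ** 2
    # Round first to guard against floating point noise, then round up.
    return math.ceil(round(n, 6))

print(f"certainty for margin 0.005 with 10,000 tests: "
      f"{certainty_for_margin(10_000, 0.005):.4f}")
print(f"tests needed for margin 0.005 at 2 sigma: "
      f"{experiments_for_margin(0.005)}")
```

Since the upper bound of sigma is used, the certainty from `certainty_for_margin` is a cautious value, and the count from `experiments_for_margin` is enough for any p.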
It seems to me that if the experiments have already been done, then it's either of the first two. But if you are planning an experiment, then you really have to think about what certainty level you would like to achieve and what margin of error you would be satisfied with. And then you will derive the number of experiments you have to do. Let me point out another very important aspect of this. You remember that the real variance of eta, eta being xi1 plus xi2, etc., divided by n, is equal to the variance of xi divided by n, which is equal to p times 1 minus p, divided by n, where p is the probability of xi taking the value 1. Now what I have done is replace it with its upper bound, 1 over 4n. How good is this? Well, obviously, if p is equal to one half, that would be exactly equal. Now let me go back to the graph. This is my graph, y is equal to x times 1 minus x. This is 0, this is 1. Somewhere in the middle, yes, it's very, very close to the maximum value of one quarter. But if you go to either extreme, so if your event is either very unlikely or very likely, then this evaluation is really not good at all. The real variance would be significantly smaller than this one if my p is close to 0 or close to 1. So sometimes it's a good estimate, and what we have derived is relatively good. But in other cases it's not, and it's not good whenever my events are too rare or too frequent. Now speaking about defective parts, is it a rare event, or is it frequent, or is it like one-half probability, 50-50 as they say? Well, that's supposed to be a rare event. So the real probability should actually be close to 0. In this particular case, let's say the probability is 0.03, which is my evaluation of the probability. Then obviously 0.03 times 1 minus 0.03 would be significantly smaller than one quarter. It would be something like 0.029 approximately, right? It's significantly smaller than 0.25, which is one quarter.
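To see how loose the one quarter bound becomes for rare events, here is a small Python sketch. It is only an illustration of the comparison made above; the function name and the list of sample probabilities are assumptions chosen for this example.

```python
def bernoulli_variance(p: float) -> float:
    """Variance p * (1 - p) of a Bernoulli variable with P(xi = 1) = p."""
    return p * (1 - p)

BOUND = 0.25  # maximum of x * (1 - x), reached at x = 1/2

# Compare the real variance with the bound for several probabilities.
for p in (0.5, 0.4, 0.1, 0.03):
    variance = bernoulli_variance(p)
    print(f"p = {p}: p(1-p) = {variance:.4f}, "
          f"share of the bound = {variance / BOUND:.2f}")
```

Near one half the bound is almost exact, while for 0.03 the real variance is only a small fraction of it.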
So our evaluation was really not very good in this particular case. What we can do is make the evaluation of the variance significantly better, and therefore smaller. However, it will not be with 100% certainty. So that's another level of uncertainty which we can introduce to improve this evaluation of my variance, but again for the price of losing a certain amount of certainty. This will be the subject of the next lecture, where I will spend some time on a better evaluation of this variance. Not 100% certain, but better, let's put it this way. So then we really have to choose what we prefer. Either we have absolute certainty in this evaluation of the variance, and then we still have the other uncertainty, because our variable can go outside of this interval. Or, on top of that uncertainty, we add another one by evaluating the variance more precisely than this, which would narrow our interval but, again, would reduce the certainty level. So as you see, statistics is a really tricky kind of science, but it's a science nevertheless. All right, thanks very much. I do suggest you read the notes for this lecture on Unizor.com. Thanks very much and good luck.