Hi, I'm Zor. Welcome to Unizor Education. I continue the course of advanced mathematics for teenagers, which is presented on Unizor.com, and that's where I suggest you watch this lecture. This lecture contains a couple of problems in the theory of probabilities; I call it problems number six in this series. These are problems related to Bayes' formula for conditional probabilities, and I do suggest you refresh the theoretical material dedicated to conditional probabilities, which precedes this particular lecture. Well, let's just go and solve problems. I have only two problems, so I'll probably speak a little bit longer about each one of them. So, problem number one. Let's say you have two events, A and B. We know the probabilities of these events, P(A) and P(B), and we also know the conditional probability of the event B under the condition of occurrence of A, P(B|A). Graphically, it can be presented like this: this is A, this is B, and this piece is A and B. So that's what's known. What's necessary to find is the conditional probability of A under the condition of occurrence of B, P(A|B). The opposite. Now, I don't want you to consider this problem as purely theoretical. Here is an example which might lead you to an appreciation of problems of this type. Let's say that event A is that a randomly chosen person is a good mathematician, and event B is that a randomly chosen person, from whoever you meet, is a good chess player. You might have heard that good mathematicians are, in many cases, good chess players as well. So let's assume that you know, at least approximately, the proportion of good mathematicians among all the people you might meet randomly. And you also know approximately the proportion of people who are good chess players.
Now, if you know that a certain percentage of mathematicians are good chess players, you basically know the conditional probability of being a good chess player under the condition that the person is a good mathematician. But you want to do the reverse: what's the probability of the person being a good mathematician if you know that he is a good chess player? Well, it makes sense, actually. Now, we can put in some numbers. Let's say the probability of being a good mathematician is 0.01, so 1% of the people you might randomly meet are good mathematicians. The probability of the person being a good chess player, let's say, is 10%, so it's 0.1: 10% of the people you meet are good chess players. And also assume that out of all the mathematicians you know or might meet, 90% are good chess players, which means the conditional probability of being a good chess player under the condition of being a good mathematician is 0.9. What you are looking for is this: you randomly find a person who is a good chess player. What's the probability that he is a good mathematician as well? So that's the practical problem we are talking about. Now, actually, the problem is trivial, quite frankly. It's very simple, and it can be solved using just the definition of conditional probability. What's the conditional probability of A under the condition of B? By definition, it's the probability of simultaneous occurrence of A and B divided by the probability of B: P(A|B) = P(A and B) / P(B). In the geometrical interpretation, where a probability is represented by the area of some geometrical figure, it's the area of this common piece divided by the whole area of B. Now, at the same time, we know these three parameters: P(A), P(B) and P(B|A).
Of these, the probability of B is known, but the probability of A and B is not. We can find it, though, because I know that the probability of B under the condition of A is the ratio of the same intersection to the probability of A: P(B|A) = P(A and B) / P(A). So, if you divide this common area by the area of A, that is the conditional probability of B relative to the occurrence of A. We know P(B|A) and we know P(A), so we can determine P(A and B) and substitute it here, right? The probability of A and B is the probability of A times the conditional probability of B relative to A: P(A and B) = P(A) · P(B|A). I will substitute it into this formula. What will I have? The probability of A times the conditional probability of B under the condition of A, divided by the probability of B: P(A|B) = P(A) · P(B|A) / P(B). So, that's the formula for the conditional probability of A under the condition of occurrence of B, if I know the probabilities of A and B and the conditional probability of B under the occurrence of A. Now, let's apply it to this particular case. Again, I randomly meet a person who happens to be a good chess player. My question is, is he a good mathematician? What are the chances? What's the probability of him being a good mathematician? Well, let's just substitute into this formula. What do we see? The probability of A, which is 0.01, times the conditional probability, 0.9, divided by the probability of B, which is 0.1, and that equals 0.09. So, if you have met a good chess player under these conditions, the chances are nine percent that he is a good mathematician as well. That's nine times greater, by the way, than if you just randomly pick a person, because among randomly picked people the good mathematicians are only one percent. So, the chance to meet a good mathematician at random is one percent.
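For those who like to check the arithmetic on a computer, here is a minimal Python sketch of the first problem. The function name `reverse_conditional` and the variable names are made up for this illustration; they are not part of the lecture or its notes. The numbers are the ones assumed in the example: 0.01 for mathematicians, 0.1 for chess players, and 0.9 for chess players among mathematicians.

```python
def reverse_conditional(p_a, p_b, p_b_given_a):
    """Return P(A|B) = P(A) * P(B|A) / P(B).

    Hypothetical helper for illustration only.
    """
    if not 0 < p_b <= 1:
        raise ValueError("P(B) must be positive and at most 1")
    # P(A and B) follows from the definition of P(B|A).
    p_a_and_b = p_a * p_b_given_a
    # Dividing by P(B) gives the reverse conditional probability.
    return p_a_and_b / p_b


# Numbers assumed in the lecture's example.
p_mathematician = 0.01        # P(A)
p_chess_player = 0.1          # P(B)
p_chess_given_math = 0.9      # P(B|A)

p_math_given_chess = reverse_conditional(
    p_mathematician, p_chess_player, p_chess_given_math
)
print(round(p_math_given_chess, 4))                    # 0.09
print(round(p_math_given_chess / p_mathematician, 2))  # 9.0
```

The second printed number is the ratio of the conditional probability to the unconditional one, which is the "nine times greater" from the example.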
But the chance to meet a good mathematician among good chess players is nine times greater. So, if you want to find a good mathematician, go to a chess tournament and ask around, and the chances that you will meet a good mathematician there are significantly greater than if you just try to randomly pick people from the street. Okay, that was my first problem. My second problem is in some way similar and more practical, I would say, and I can exemplify it in a very practical way. Let's say you have two events, A and B, which are mutually exclusive and together cover everything, which means that the total sample space is divided between these two events. As an example, which I'm going to use, event A means that the person is sick with a certain illness, whatever the illness is; it doesn't matter. And event B means that he's not sick with that particular illness. These are two mutually exclusive events, and they completely cover the entire population: either the person is sick with this illness or not. There are no other cases. Next, we have a certain event X. It's somewhere here, partially overlapping with A and partially overlapping with B. What do we know about event X? We know the conditional probabilities of event X under this and under that condition, P(X|A) and P(X|B). So, geometrically speaking, we know the ratio of this area, which is common between X and A, relative to A, and we know the ratio of this area, which is common between X and B, relative to B. That's what we know. And obviously, we know the total probability of A and the total probability of B. Now, what I would like to determine is the conditional probability of A under the condition of X, P(A|X). And here is a very practical example of this. Let's say event X is that a certain diagnostic test, which you are running on a person, gives a positive result. You know, some person comes to a doctor for a regular checkup, for instance.
And the doctor wants to conduct this particular test, which is supposed to identify this illness. The event that this person is ill with this particular illness is A, right? So, the question is whether this diagnostic procedure correctly determines whether the person is ill or not. Okay, now we need some numbers. We're talking about an example, right? So let's consider the following situation: only 1% of people are sick with this particular illness, which means that the probability of A is 0.01. Obviously, B is the rest, 99% of the cases, so the probability of B is equal to 0.99. Now, what about this test? This is a diagnostic procedure, and a diagnostic procedure is not perfect. I mean, if it's a good diagnostic procedure, it has a very high probability of identifying the illness. But here is what the real numbers are. If the person is really sick, then the test shows the positive result, which means the person is sick, not in 100% of the cases, which would be ideal, but, let's say, in 95%. So, the probability of positive identification of the illness when the illness is really there, P(X|A), is 0.95, which is pretty high, right? Unfortunately, if the test is run on a healthy person, then sometimes it gives a so-called false positive, which means that the person is not really sick, but the test shows something. Let's say it's an X-ray. Sometimes an X-ray is clear, sometimes it's not exactly clear, and it's not always obvious what exactly the X-ray shows. All right, so let's assume that false positives are 5% of the cases, P(X|B) = 0.05, which is not a lot, right? So, that's what's given. What I'm looking for is the following: what's the probability of the person being really sick if the test is positive? It's a very practical situation. In medicine, something like this happens all the time. Any test has certain reliability factors, and the reliability factors are these two.
What's the percentage of correctly identifying the illness when it really exists? And what's the percentage of false positive results, when there is no illness but the test shows a positive result? Okay, so these are reliability numbers, which are quite practical. Let's use them to find this particular probability. We'll do exactly the same thing. P(A|X) is equal to the probability of their simultaneous occurrence relative to the probability of X: P(A|X) = P(A and X) / P(X). That's what we're looking for: the ratio of this piece, which is common between X and A, to the whole area of X. Both of these we can determine. The probability of the simultaneous occurrence of A and X, using my known probabilities, is the probability of A times the conditional probability of X under the condition of A: P(A and X) = P(A) · P(X|A). That's simple; it's exactly the same as in the previous problem. Now, the full probability of X is very easy, actually; we can see it from here. X is divided into two different pieces: the one which is common with A and the one which is common with B. The one which is common with A we have just calculated, so I will put exactly the same thing. The one which is common with B is analogous: it is P(B) times P(X|B). So P(X) = P(A) · P(X|A) + P(B) · P(X|B). This is the full probability of X represented as a sum of two pieces, X common with A and X common with B. Since A and B completely encompass the entire space, there are no other pieces. So P(A|X) = P(A) · P(X|A) / [P(A) · P(X|A) + P(B) · P(X|B)], and this is the formula which is called Bayes' formula in the theory of probabilities. Now, let's use this formula in our case. What do we see? 0.01 times 0.95, divided by 0.01 times 0.95 plus P(B), which is 0.99, times the conditional probability 0.05. Let me write the approximate result: about 0.16. And this is, quite frankly, a striking result. You see, just judging by these numbers, it doesn't look like the test is really bad. I mean, it looks like a relatively reliable test.
You see, in 95% of the cases, if the person is really sick, it shows, and only in 5% of cases does it show on a healthy person. But that, unfortunately, is the problem. You see, 99% of people are healthy, and 5% of them give a false positive. That's a lot of people. So that's a lot of positive results which are really misleading, so to speak. That's why I have only a 16% probability of the person being really sick if his test is positive. 16% is a really low probability. It's low reliability. So no matter what you think about this test, the real result, which is supposed to be applied to real patients, is that a positive result of the test is, unfortunately, only 16% reliable, because there are many false positives. Okay, and I'm sorry, the last number in the denominator on the board has to be 0.05, the false positive rate. Okay, so what I wanted to present to you is basically another application of Bayes' formula, especially the second problem, related to the reliability of tests. I do recommend you read the notes for this lecture. They are very detailed, and that will probably help you understand this approach to the theory of probability much better. And, well, it's very practical, as you see. So that's it for today. Thanks very much and good luck.
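As a supplement to the second problem, here is a small Python sketch that repeats the calculation and then counts people in an imagined population. The function names `total_probability` and `posterior`, and the population size of 10,000, are assumptions made only for this illustration. The probabilities are the ones from the example: 0.01 for being sick, 0.95 for a positive test on a sick person, and 0.05 for a false positive.

```python
def total_probability(p_a, p_x_given_a, p_x_given_b):
    """P(X) = P(A) * P(X|A) + P(B) * P(X|B), where B is 'not A'."""
    p_b = 1.0 - p_a
    return p_a * p_x_given_a + p_b * p_x_given_b


def posterior(p_a, p_x_given_a, p_x_given_b):
    """Bayes' formula for two complementary events: returns P(A|X)."""
    p_x = total_probability(p_a, p_x_given_a, p_x_given_b)
    return p_a * p_x_given_a / p_x


# Numbers assumed in the lecture's example.
p_sick = 0.01                 # P(A)
p_pos_given_sick = 0.95       # P(X|A), correct identification
p_pos_given_healthy = 0.05    # P(X|B), false positive

p_sick_given_pos = posterior(p_sick, p_pos_given_sick, p_pos_given_healthy)
print(round(p_sick_given_pos, 3))  # 0.161

# The same result by counting, in a hypothetical population of 10,000.
population = 10_000
sick = population * p_sick                         # 100 people
healthy = population - sick                        # 9,900 people
true_positives = sick * p_pos_given_sick           # about 95 people
false_positives = healthy * p_pos_given_healthy    # about 495 people
share = true_positives / (true_positives + false_positives)
print(round(true_positives), round(false_positives), round(share, 3))
# 95 495 0.161
```

The counting version shows where the low number comes from: about 95 sick people test positive, but so do about 495 healthy ones, so only about 95 out of 590 positive results belong to people who are really sick.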