Hello, everyone. This is Alice Gao. In this video, I'm going to talk about a universal approach for calculating a probability. In the previous videos, I introduced several rules that you can use to calculate probabilities: the sum rule, the product rule, the chain rule, and Bayes' rule. Believe it or not, these rules are sufficient to calculate any probability that you will ever want to calculate. Now, when you have a problem, the challenge is often deciding which rule to apply first, and then which rule to apply next. In this video, I will discuss a universal approach that I came up with, and you can use it to calculate any probability. This approach may not be the most efficient one, but it always works.

Let's look at some example problems. Suppose that we have three Boolean variables, a, b, and e. Any probability calculation problem you can come up with using a set of variables belongs to one of two categories. In the first category, we want to calculate a joint probability. In this example, we're calculating a joint probability over a and e. This probability may involve all the variables, but in this case, it only involves some of the variables. In the second category, we want to calculate a conditional probability. In this example, we want to calculate the probability of a given e. Again, this probability may involve all the variables, or it may only involve some of the variables.

Here is the universal approach that I came up with. Let's take a look. There are three steps. Step number one, we want to take a conditional probability and convert it into several joint probabilities. We can do this by using the product rule. Step number two, we want to take the joint probabilities and introduce the missing variables into each one, so that every joint probability involves all of the variables. We can do this by using the sum rule.
Now, step number three, the only work left is to calculate each joint probability that involves all of the variables. We can do this by using the chain rule. Now, if you are calculating a conditional probability, you need all three steps. If you are calculating a joint probability, you only need to perform steps two and three.

Let's look at an example. I will use example number two on the previous slide. We want to calculate a conditional probability: the probability of a being true given that e is true. What probabilities are we given? Generally, we assume we are given enough probabilities that we have the information to calculate what we need to calculate. In this case, we're given the probability of a, we're given the probability of b given a and of b given not a, and we're also given the probability of e given each combination of values of a and b. If you look at this for a second, you might notice that we are given probabilities in the form of the chain rule already. We have the probability of a by itself, then the probability of b depending on a, and then the probability of e depending on a and b. This suggests that we can calculate the joint probability of the three variables by applying the chain rule to the probabilities that we were given.

Step number one, take a conditional probability and convert it into several joint probabilities. We can do this by using the product rule in reverse, or we can make use of a form of Bayes' rule. Let's take a look. By applying the product rule in reverse or Bayes' rule, we get the following expression. Calculating the probability of a given e is equivalent to calculating two other probabilities and dividing one by the other. The numerator is the joint probability of a and e, and the denominator is just the probability of e.

Step number two, take each joint probability and change it to involve all of the variables.
In other words, if a joint probability has missing variables, then we should introduce the missing variables into its expression. And we can do this by applying the sum rule in reverse. Let's take a look. Here are two examples of using the sum rule in reverse.

For example, if we take a look at the joint probability of a and e, we only have two of the three variables in this expression. And we want to introduce the third, missing variable, which is b. How can we do this? Well, we can write this joint probability as a sum of several joint probabilities, where each term in the sum has all of the variables. And we have to consider all of the possible values of the missing variables. In this case, the only missing variable is b, so we have to consider a term where b is true and a term where b is false.

As another example, we can take the probability of e and write it as a sum of several joint probabilities where each joint probability has all of the variables. In this case, we have two missing variables, a and b. And because they're both Boolean, we have four combinations of values for a and b. So we end up having four terms, covering all of the combinations: a and b are both true; a is true and b is false; a is false and b is true; and a and b are both false.

So at this step, we took a problem where we are supposed to calculate a joint probability over some of the variables and turned it into another problem where we have to calculate several joint probabilities, but each joint probability contains all of the variables.

Finally, we're at step number three. This step says take each joint probability and calculate it using the chain rule. At this point, I haven't introduced Bayesian networks yet, so the chain rule is our best friend for calculating a joint probability.
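Before moving on to the chain rule calculation, it may help to see the expansions from steps one and two written out. The notation below is an assumption, since the slides are not part of this transcript: a comma denotes a joint probability and \lnot marks a variable being false.

```latex
\begin{align*}
P(a \mid e) &= \frac{P(a, e)}{P(e)} \\
P(a, e) &= P(a, b, e) + P(a, \lnot b, e) \\
P(e) &= P(a, b, e) + P(a, \lnot b, e) + P(\lnot a, b, e) + P(\lnot a, \lnot b, e)
\end{align*}
```

The numerator contributes two full joint probabilities and the denominator contributes four.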
When I introduce Bayesian networks, you will see that sometimes there might be an easier way of calculating a joint probability. Let's take a look.

From step two, we have six joint probabilities to calculate. Let me just pick one of them and show you an example. Let's suppose that we want to calculate the joint probability of e and not a and b. First of all, notice that the order in which I wrote the variables in the joint probability doesn't matter. We can change the order of the variables however we like, and the meaning stays the same. So let me make this a little more convenient for myself and change it so that a appears first, then b, and then e. The reason I changed the order of the variables is so that I can more easily see how to apply the chain rule.

Remember how we apply the chain rule? We have to order the variables, and then based on that order, we write a product of conditional probabilities. The first variable has a probability by itself, the second variable conditions on the first variable, the third variable conditions on the first two variables, and so on. So every variable conditions on all of the variables that came before it. In this case, we have the probability of not a, multiplied by the probability of b given not a, multiplied by the probability of e given not a and b.

Looking at this expression, you can see that each probability is either derived from what's given or given directly. The probability of not a is just one minus the probability of a. The other two probabilities, b given not a and e given not a and b, are both given directly. So we have a way of calculating this joint probability involving all three variables.

That's it for the universal approach. Let me do a quick recap. The universal approach has three steps.
In step one, we take a conditional probability and convert it into an expression involving several joint probabilities. In step two, we take each joint probability and make sure that it contains all of the variables, so we introduce all of the missing variables into the expression. And finally, in step three, we calculate each joint probability using the chain rule.

That's everything for this video. After watching this video, you should be able to do the following: calculate a conditional or a joint probability, given several probabilities, using Alice Gao's universal approach for calculating probabilities. Thank you very much for watching. I will see you in the next video. Bye for now.
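As a rough illustration of the three steps, here is a minimal Python sketch for the three Boolean variables a, b, and e. The numeric values are hypothetical, since the video does not state any numbers in this transcript, and the function names (`full_joint`, `joint`, `conditional`) are made up for this sketch. It is one possible way to mechanize the approach, not a definitive implementation.

```python
from itertools import product

# Hypothetical numbers, NOT from the video; chosen only so the sketch runs.
P_A = 0.1                                # P(a)
P_B_GIVEN_A = {True: 0.3, False: 0.6}    # P(b | a) and P(b | not a)
P_E_GIVEN_AB = {                         # P(e | a, b), keyed by (a, b)
    (True, True): 0.9,
    (True, False): 0.8,
    (False, True): 0.2,
    (False, False): 0.1,
}

VARIABLES = ("a", "b", "e")


def bernoulli(p_true, value):
    """Probability that a Boolean variable takes `value`, given P(true)."""
    return p_true if value else 1.0 - p_true


def full_joint(a, b, e):
    """Step three: chain rule, P(a, b, e) = P(a) * P(b | a) * P(e | a, b)."""
    return (
        bernoulli(P_A, a)
        * bernoulli(P_B_GIVEN_A[a], b)
        * bernoulli(P_E_GIVEN_AB[(a, b)], e)
    )


def joint(assignment):
    """Step two: sum rule in reverse over every variable not in `assignment`."""
    missing = [v for v in VARIABLES if v not in assignment]
    total = 0.0
    for values in product([True, False], repeat=len(missing)):
        full = {**assignment, **dict(zip(missing, values))}
        total += full_joint(full["a"], full["b"], full["e"])
    return total


def conditional(query, evidence):
    """Step one: P(query | evidence) = P(query, evidence) / P(evidence)."""
    return joint({**query, **evidence}) / joint(evidence)


if __name__ == "__main__":
    # P(a = true | e = true) under the hypothetical numbers above.
    print(conditional({"a": True}, {"e": True}))
```

Calling `joint` directly covers the first category of problem (a joint probability, steps two and three only), while `conditional` covers the second category and uses all three steps. The sketch enumerates every combination of the missing variables, which mirrors the point that the approach always works but may not be the most efficient one.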