Hello everyone, this is Alice Gao. In this video, I'm going to do a short review of two important concepts in probability theory: unconditional independence and conditional independence. I'll first go over the formal definitions of these two independence assumptions, and then I'll talk about why it is important to understand these definitions for constructing a Bayesian network.

The definition of unconditional independence refers to a relationship between two variables. Suppose we have two random variables x and y. We say that these two variables are unconditionally independent if they satisfy any one of the three formulas below. These three formulas are all equivalent, and we'll see why. When referring to this definition, we often leave out the word "unconditional" and just say that the variables are independent.

Intuitively, what does unconditional independence mean? It means that learning the value of one variable does not influence our belief about the other variable. This relationship is symmetric: even though I've only written out the English description for one direction, the other direction also holds. Learning the value of y does not affect our belief about x, and vice versa, learning the value of x does not affect our belief about y.

You can also see the symmetry by looking at the formulas. The first two formulas are a formal statement of the English description I just gave. The first formula says that our belief about x is not influenced by whether we know y or not: on the left-hand side we know the value of y, whereas on the right-hand side we don't, but either way the probability of x is the same. The second formula is the same statement with the roles reversed: whether or not we know the value of x, the probability of y stays the same. The third formula is a little different. It says that the joint probability of x and y is equal to the product of the marginal (prior) probability of x and the prior (unconditional) probability of y.

So the first two formulas and the third are two different ways of expressing this definition, and there is an easy way to convert between them, which shows that they are equivalent. Take the second formula, for example. Starting from the second formula, we can multiply both sides by P(x). After doing that, the left-hand side is exactly part of the product rule, or the chain rule. By either of those rules, the left-hand side is equal to the joint probability of x and y, and this is exactly how we recover the third formula. So you can see that all three are equivalent.

One more interesting observation: looking at the third formula, the left-hand side suggests that, to specify the joint distribution of these two variables, we would normally need to define four numbers. We need the joint probability of x being true and y being true, x true and y false, x false and y true, and both x and y false. There are four combinations, so we need to specify four values. But because of this formula, every joint probability is in fact equal to the product of the unconditional (prior) probabilities of the individual variables.
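For reference, since the slide itself is not reproduced in this transcript, here is a sketch in standard notation of the three formulas discussed above, together with the multiplication step connecting the second and third; the exact symbols on the slide may differ.

```latex
% Unconditional independence of x and y: three equivalent statements.
\begin{align*}
  P(x \mid y) &= P(x)         && \text{knowing } y \text{ does not change our belief about } x \\
  P(y \mid x) &= P(y)         && \text{knowing } x \text{ does not change our belief about } y \\
  P(x, y)     &= P(x)\,P(y)   && \text{the joint is the product of the marginals}
\end{align*}

% Deriving the third formula from the second.
\begin{align*}
  P(y \mid x)       &= P(y)         && \text{second formula} \\
  P(y \mid x)\,P(x) &= P(y)\,P(x)   && \text{multiply both sides by } P(x) \\
  P(x, y)           &= P(x)\,P(y)   && \text{product (chain) rule on the left-hand side}
\end{align*}
```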
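As a small aside (not from the video, and with made-up numbers), here is a sketch of this factorization in code; it also previews the point made next about how few numbers actually need to be specified.

```python
# Illustration (made-up numbers): two independent binary variables x and y.
# Only two numbers need to be specified; the four joint entries then follow
# from P(x, y) = P(x) * P(y).
p_x_true = 0.7   # P(x = true); P(x = false) = 1 - 0.7
p_y_true = 0.2   # P(y = true); P(y = false) = 1 - 0.2

joint = {
    (x, y): (p_x_true if x else 1 - p_x_true) * (p_y_true if y else 1 - p_y_true)
    for x in (True, False)
    for y in (True, False)
}

for (x, y), p in joint.items():
    print(f"P(x={x}, y={y}) = {p:.2f}")

# Sanity check: the four entries of any joint distribution sum to 1.
assert abs(sum(joint.values()) - 1.0) < 1e-9
```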
According to the right-hand side, we only need to specify two numbers. One is the probability of x being true, from which we can derive the probability of x being false; the other is the probability of y being true, which lets us derive the probability of y being false. Specifying these two numbers is enough information to recover the entire joint distribution of the two variables. This is an interesting point to realize: if two variables are independent, then to specify their joint distribution, it is sufficient to specify the individual prior (unconditional) probabilities. We'll see how this can help us find a more compact representation of a probability distribution.

Let's now look at the definition of conditional independence. This definition is quite similar to the one we've just seen. It is still about the relationship between x and y, and it still claims that x and y are independent, except that now there is a third variable, z, involved, and any statement we make about x and y is conditioned on the fact that we already know the value of z. The English description is very similar to before, except that we have to say: if we already know the value of z, then learning the value of one of x and y does not influence our belief about the other variable. The sentence here says that learning y does not influence our belief about x, but you can write it the other way around: learning the value of x does not affect our belief about y.

Formally, we can express these relationships using the first two formulas (written out below), by directly translating the English description into mathematical notation. In the first formula, we are given the value of z; z appears on the right-hand side of the conditioning bar as the value we condition on. Given that, knowing the value of y does not influence our belief about the value of x. Similarly, if we already know the value of z, then knowing the value of x does not influence our belief about the value of y. The third formula is, again, very similar to the one we had before: instead of writing the relationship in terms of one variable conditioned on the other, we write it in terms of the joint probability. It says that the joint probability of x and y, given z, can be written as a product of the individual probabilities: the probability of x given z, multiplied by the probability of y given z.

Remember that for the previous definition we were able to take the second formula and do some manipulation to derive the third formula. We can do the same thing for this definition; I will leave that as an exercise for you. The derivation is a little more interesting than before because it involves conditional probabilities.

Having seen these two definitions, you may be wondering: these definitions sound really similar, so do they have some inherent relationship? If we know that two variables are independent, for example, do we know anything about whether they are conditionally independent given a third variable, and vice versa? Unfortunately, these definitions do not have any inherent relationship at all. If we know one is satisfied, it tells us nothing about whether the other one is satisfied.
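For reference, here is a sketch in standard notation of the three conditional-independence formulas discussed above; again, the slide itself is not reproduced in this transcript, so the exact symbols may differ.

```latex
% Conditional independence of x and y given z: three equivalent statements.
\begin{align*}
  P(x \mid y, z) &= P(x \mid z)              && \text{given } z \text{, knowing } y \text{ does not change our belief about } x \\
  P(y \mid x, z) &= P(y \mid z)              && \text{given } z \text{, knowing } x \text{ does not change our belief about } y \\
  P(x, y \mid z) &= P(x \mid z)\,P(y \mid z) && \text{given } z \text{, the joint factors into the product}
\end{align*}
```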
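The claim that the two definitions have no inherent relationship can be checked with small numerical examples. Here is a sketch of one direction (made-up numbers, not from the video): x and y are conditionally independent given z by construction, yet they are not unconditionally independent.

```python
# Illustration (made-up numbers): x and y are conditionally independent
# given z, but NOT unconditionally independent.
p_z = 0.5                                 # P(z = true)
p_x_given_z = {True: 0.9, False: 0.1}     # P(x = true | z)
p_y_given_z = {True: 0.9, False: 0.1}     # P(y = true | z)

def p_xy(x, y):
    """P(x, y), summing z out of P(z) * P(x | z) * P(y | z)."""
    total = 0.0
    for z, pz in ((True, p_z), (False, 1 - p_z)):
        px = p_x_given_z[z] if x else 1 - p_x_given_z[z]
        py = p_y_given_z[z] if y else 1 - p_y_given_z[z]
        total += pz * px * py
    return total

p_x = p_xy(True, True) + p_xy(True, False)    # marginal P(x = true)
p_y = p_xy(True, True) + p_xy(False, True)    # marginal P(y = true)

print(f"P(x=T, y=T)     = {p_xy(True, True):.3f}")   # 0.410
print(f"P(x=T) * P(y=T) = {p_x * p_y:.3f}")          # 0.250
# The two values differ, so x and y are not unconditionally independent,
# even though they are conditionally independent given z by construction.
```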
That's everything for this video. Let me summarize. After watching this video, you should be able to do the following: formally define unconditional independence and conditional independence; explain the two independence relationships intuitively in your own words; and show that, for each independence definition, the multiple mathematical expressions are equivalent. Thank you very much for watching. I will see you in the next video. Bye for now.