Welcome to our lecture on probability. It's really very difficult to define probability. The best way to see it is to look at how often an event will occur in the long run. For example, if you keep tossing a coin over and over again, you'll note that 50% of the time you'll get a head. So we say the probability of a head is 0.5. So think of it that way, but really there's no good definition of probability.

Now, the difference between probability and statistics. In probability, the population is known. You know what the mu is going to be. We talked about this: mu is np. You toss a fair coin 100 times, you expect to get 50 heads. In statistics, you don't know the population parameters, so you've got to make an inference. You take a sample, and from the sample you make an inference to estimate the population parameter.

The definition of the probability of an event that we just saw is an objective probability. It's the classical mathematical definition of the probability of an event as the long-run frequency of occurrence of that event. On the other hand, a subjective probability is the kind of probability we might use in ordinary conversation that's not based on mathematics. It measures the strength of our personal beliefs. For example, someone says, as a personal opinion, the probability that this new product will succeed is 95%. That's the strength of a personal belief. Or even better, maybe: the probability that my commute will not exceed one hour tomorrow morning is 90%. I haven't exactly computed that, so it measures the strength of my personal belief.

That which is observed as the result of a stochastic or probabilistic process is a random variable. A random variable takes on values. Each of these values has a probability. Each is related to an event: the event is the value that the random variable takes on. So for example, when we toss a die, that's a process. It's a probabilistic process. We could get a one on the face of a die.
We could get a two. We could get a three. There are six possibilities. Assuming the die is fair, they're all equally likely, so each one of those has a probability of one sixth.

A simple probability, which as we'll see soon is also called a marginal probability, is the probability of an event. We're looking at one event, and we're looking at the probability that that event will occur. A joint probability is the probability of two events occurring together in time or space. A conditional probability is the probability of an event where we have some information about some other event having occurred. So if we know that B is true and we're asking about the probability of A, that's a conditional probability. It's the probability of A given that B has occurred.

Here are some basic rules about probability. Rule one: the probability of an event will never be less than zero. It will never be negative. It will never be greater than one. It can't be, by the definition of a probability. A probability is exactly like a proportion, which also can't be negative and can't be greater than the total, which is one. So if you tell me that the probability that you will pass this course is 150%, you're starting at a disadvantage right away.

Rule two: the sum of the probabilities of all possible outcomes of a particular process must equal one. So the probability of A and the probability of everything that's not A, together, that's got to be equal to one. The probability of getting a one on the face of a die, or a two or a three or a four or a five or a six, those are all the possible values. You can't get anything else. You add those up, and that has to be 100%, or one. If the sum of all your probabilities is less than one, what does that mean? It means you're missing something. You've probably forgotten about an event or an outcome. If it's more than one, that's a little bit more complicated, and it probably means that there's some overlap among the events you're looking at.
Maybe you're looking at the probability of getting an odd number and the probability of getting a one at the same time. Well, there's overlap there.

Before we get to the rule of addition, let me define something called mutually exclusive. Mutually exclusive has the word exclude in it. It means two events that can't occur together. Like heads or tails: you can't have a head-tail. Male or female: it's one or the other, you can't have both simultaneously.

All right. Say two events, call them A and B, are mutually exclusive. Take the probability of A or B, and notice there are two different ways of showing it. Some books use that U-shaped union symbol, ∪. I'll probably write it with "or". The probability of A or B is the probability of A plus the probability of B, if A and B are mutually exclusive. But in general, the probability of A or B is the probability of A plus the probability of B minus the probability of A and B. Remember, that's a joint probability. But again, by definition, the joint probability, the probability of A and B, is zero if A and B are mutually exclusive. These are things that can't happen together. Other things can happen together, and you have to know whether they're mutually exclusive or not.

Now, we know that the probability of getting a head or a tail is the probability of a head plus the probability of a tail. You don't have to worry about that joint probability of a head-tail, because it's zero. It doesn't exist. That's why the probability of getting a head or a tail is the probability of a head, which is a half, plus the probability of a tail, which is a half, and that equals one.

This Venn diagram here is a very simple way of seeing how A and B look when they're mutually exclusive. See the top box? There's the A and the B, and they're not touching each other, because they're mutually exclusive. Notice in the bottom box, A and B are overlapping, because they're not mutually exclusive.
Notice you have an A, and there's an intersection. That's A and B, right there where they both intersect. So that's the probability of A and B, the intersection of A and B. The Venn diagram lets you see it very clearly. And that means they're not mutually exclusive.

With the Venn diagrams, you can also see the probability of A prime, which means not A. You read it as A prime, but it means not A, and it's one minus the probability of A. And the probability of not A or not B is one minus the probability of A and B.

These examples will make it clear. Say at our college we allow double majors. You can double major, two majors. We know 10% of the students are majoring in accounting. That's A. 15% of the students are majoring in business. That's B. 3% are double majors. That's A and B. Okay? So what's the probability of being an accounting or a business major? Here's a little hint: they're not mutually exclusive, because we just said you can be a double major, A and B. That means there's an intersection. So the answer is, the probability of A or B is 0.10 plus 0.15, and notice we have to subtract the 0.03. That works out to 0.22. If you didn't subtract that 0.03, you'd be double counting the accounting and business majors, because they show up in both lists. Think of it as two lists. The accounting list shows that 3%. The business list shows the same 3%. You don't want to count that 3% two times, and that's why we subtract it out. So this is a very good illustration of the rule of addition.

The rules of multiplication are used to compute joint probabilities, which we know from the previous slide. A joint probability is the probability that event A and event B occur together, that they both occur. Like the example we just saw: the probability that someone is both an accounting major and a business major, that's a joint probability. First let's look at a very specific case, and then we'll go to the general case.
In the specific case, if the two events we're talking about are independent of each other, in other words, knowing that one has occurred doesn't change the probability that the other one occurs, then we just multiply the probabilities. For example, say we're tossing a coin, and let's say it's a fair coin. You toss it once and you know that a head occurred. What's the probability that you'll get a head on the second toss? Well, it was 50% on the first toss, and it's still 50% on the second toss. It's still one out of two, because each toss is independent of what happened before. So with independent events, we just multiply the probabilities of the events.

Here are some examples to help you understand what independence is. Suppose I take a student at random from our college population and I tell you that student is male. What's the probability that that student has blue eyes? Well, that would be just the same as the probability that a student has blue eyes without knowing whether the student is male or female. There's independence: there's no relationship between sex and eye color. On the other hand, if I take a student at random and I say that student is female, now what's the probability that the student is over six feet tall? Well, knowing that the student is female should make a difference. It would change the probability from the simple probability of just any student being over six feet tall. And you can find all kinds of other similar examples that help you make the case.

Here's the general formula for multiplication. If we don't know that events A and B are independent, or if we know for sure they're not independent, then we can compute the joint probability of A and B by taking the probability of A given B multiplied by the probability of B. You always multiply by the probability of the event after the "given".
We could do it the other way around, too, where you take the probability of B given A and multiply it by the probability of A. That will also give us the joint probability of A and B. It all depends on what you have and what you're trying to compute. And, of course, we know the probability of A given B is a conditional probability.

If the two events are independent, as we just saw, then the probability of A given B is just the same as the probability of A. And so the general formula for multiplication, in the case of independent events, turns into the probability of A times the probability of B, which is exactly what we had before for independent events. So the two formulas for multiplication are not different at all. They're exactly the same. It all depends on whether the events we're looking at are independent or not.

Sometimes we're going to want to use these formulas to show that events are independent or not independent, and we can. If we have the probabilities, say the conditional probability of A given B and the simple probability of A, and they're not equal to each other, then we can say these events are not independent. Because if they were independent, knowing something about B would not change the probability that A occurs. Similarly, we can use the multiplication rule to determine if two events are independent. If the probability of A times the probability of B is exactly equal to the joint probability of A and B, that's proof enough that events A and B are independent of each other.

And finally, we have the rule for conditional probability, for how we compute a conditional probability. All we're doing is turning around the previous rule, number 4b, which we actually have on the same slide.
If the probability of A and B is equal to the probability of A given B times the probability of B, then if we need to compute the probability of A given B and we have the other terms, it's exactly equal to the joint probability of A and B divided by the probability of the event after the "given", that is, divided by the probability of B.

We're not going to cover Bayes' theorem in this course. It's an optional topic, so we're not going to worry about it for now.

Here we have a simple example to start. We're looking at the readership of two different newspapers, the New York Times and the Wall Street Journal. In this particular village, the probability that an individual reads the New York Times is 0.25, because 25% of the population does. The probability that an individual reads the Wall Street Journal is 0.20. The probability that an individual reads both the New York Times and the Wall Street Journal is 0.05, or 5%. So the question is, what's the probability of an individual being either a New York Times reader or a Wall Street Journal reader? That's the rule of addition we're going to be using. The probability of being a New York Times reader or a Wall Street Journal reader is equal to the probability of reading the Times, 0.25, plus the probability of reading the Journal, 0.20, minus the probability of reading both of them, because we don't want to count these people twice. And that's 0.05. So 0.25 plus 0.20 minus 0.05 is equal to 0.40, so 40% of the population reads at least one of the two papers.

Here you see the same thing laid out in a Venn diagram, and you can see the joint probability in the middle. What you see here is the same data you saw before, but instead of being laid out as probabilities, it's laid out as counts on a base of 100.
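The addition rule from the newspaper example can be sketched in a few lines of Python. This is only an illustration: the function name `prob_a_or_b` is made up here, and exact fractions are an assumption chosen so the arithmetic comes out without rounding noise.

```python
from fractions import Fraction


def prob_a_or_b(p_a, p_b, p_a_and_b):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B).

    For mutually exclusive events, pass p_a_and_b = 0.
    """
    return p_a + p_b - p_a_and_b


# Figures from the newspaper example, kept as exact fractions.
p_times = Fraction(25, 100)    # P(reads the New York Times)
p_journal = Fraction(20, 100)  # P(reads the Wall Street Journal)
p_both = Fraction(5, 100)      # P(reads both)

p_either = prob_a_or_b(p_times, p_journal, p_both)
p_neither = 1 - p_either       # complement: reads neither paper

print(float(p_either))   # 0.4
print(float(p_neither))  # 0.6
```

Passing zero for the joint probability gives the mutually exclusive case, such as a head or a tail on one coin toss: `prob_a_or_b(Fraction(1, 2), Fraction(1, 2), 0)` is 1.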
So suppose we have 100 people selected at random from this village. Five of them would read both the New York Times and the Wall Street Journal, 40 of them would read at least one, that is, one or the other or both, and then there are 60 people out of 100 who don't read either one.

Sometimes the probability problems you do for homework will ask you for various different types of probabilities, and here you have an example of some of the different types of questions you could be asked.

Another way to look at problems like this, where you have a set of probabilities and then you're asked a lot of questions about them, is to put them into a table of joint probabilities. We'll see more of this later on, so we're not going to work with this now. It's just to show you that this is an alternate way. In some problems it's actually a much better way of solving the problem.

It's very important to understand the difference between independent and mutually exclusive events. Don't confuse them. A lot of students tend to do that. Mutually exclusive simply means that the probability of A and B is zero. They can't occur together. It's a physical thing. You can't have a head and a tail at the same time. You can't pass and fail a course at the same time. Those are mutually exclusive. So simply put, it's just that the probability of A and B is zero.

Independence is something altogether different. It has to do with the effect of, let's say, B on A. Does knowing B have any effect on the probability of A? If knowing B has an effect on A, then we say they're not independent. If knowing B has no effect on A, then they're independent. So independence is more like the concept of unrelated, uncorrelated.

So the example we have here is waist size and gender. Are they independent of each other? Suppose you know that somebody is an adult with a 24-inch waist. Does that give you a hint as to whether that person's male or female?
Well, we can figure that out, because we know the probability of having a 24-inch waist given that you're female is not the same as the probability of having a 24-inch waist in general, or the probability of a 24-inch waist given that you're male. The probability of a 24-inch waist in general combines males and females. But the minute I know that somebody has a 24-inch waist, I know they're highly likely to be female, because few men have a 24-inch waist. So that's another way of seeing this idea of relationship and independence. So again, if the probability of A given B is the probability of A, then A and B are independent. If the probability of A given B is not the probability of A, then there's some kind of relationship here. They're not independent.

Researchers are always interested in whether two events are independent or not. For example, is there a relationship between lung cancer and smoking, or are they independent of each other? Well, we know the answer to that one. Everyone in this room believes they're not independent. Certainly smoking has an effect on lung cancer, and people who get lung cancer generally have been smokers. All right? Is there a relationship between occupation and how long you live, so-called longevity, or are they independent? Well, studies show that occupation does affect your lifespan. Librarians, I think, have the longest lifespan of them all. Coal miners and drug dealers have the shortest lifespans. So you see there is some kind of relationship. And again, insurance companies will use that information when they sell you life insurance. They look at your occupation, too. Is there a relationship between salary and how slender you are? Well, studies are showing that women who are very slender get paid more. Unfortunately, I mean, that's one of these kinds of discrimination, but there have been studies done on this. What about how many dates you get and your hair color? Well, believe it or not, somebody studied that.
I think on sites like eHarmony and others, they found that women who had blonde hair had more dates on these sites than women who had other hair colors. So you see, this is the kind of thing researchers want to know. You're not going to win the Nobel Prize for finding out what doesn't cure cancer. You're going to win the Nobel Prize when you find the cure for cancer or the cure for Alzheimer's. We're always looking for relationships. So these tests for independence, which we're going to learn soon, are very valuable to statisticians.

Well, in this example, we're looking at a contingency table. We're looking at smokers and non-smokers, and whether they died of cancer or not. We're looking at a sample of 1,000 people. Those are frequencies in the table. And the way you read it is, in the sample of 1,000, 100 people were smokers who died of cancer. There's a total of 400 smokers. You get that from the marginal total. 300 smokers did not die of cancer. Among non-smokers, we have 50 who died of cancer and 550 who didn't die of cancer. Incidentally, those totals on the side, which give us what we call the marginal or simple probabilities, are useful. Now, 150 people in this sample died of cancer in total. Right? That's all the way on the right. 850 did not die of cancer. We have 400 smokers in the sample, 600 non-smokers, and that's how we have 1,000 people.

If you divide through by 1,000, then you get a joint probability table. You see the joint probabilities at the bottom. You take 100 divided by 1,000, that's 0.10. That's the probability of C and S. If you want the probability of not C and S, 300 over 1,000, that's 0.30. If you divide 50 by 1,000, that's 0.05, the probability of C and not S. And if you want the probability of not C and not S, that's 550 over 1,000, or 0.55. That gives you the joint probabilities. If you take those marginal totals and divide them by 1,000, the probability of C, 150 over 1,000, is 0.15. The probability of not C is 0.85. That's 850 over 1,000.
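The step of dividing a contingency table through by its total can be written out as a short Python sketch. The dictionary layout and the labels `"C"`, `"not C"`, `"S"`, and `"not S"` are illustrative choices, not something taken from the slides; the counts are the ones read out above.

```python
# Counts from the smoking and cancer sample of 1,000 people.
# Keys are (cancer outcome, smoking status).
counts = {
    ("C", "S"): 100,          # smokers who died of cancer
    ("C", "not S"): 50,       # non-smokers who died of cancer
    ("not C", "S"): 300,      # smokers who did not die of cancer
    ("not C", "not S"): 550,  # non-smokers who did not die of cancer
}

total = sum(counts.values())  # grand total of the table

# Joint probabilities: each cell divided by the grand total.
joint = {cell: n / total for cell, n in counts.items()}

# Marginal totals: add the counts along each row and each column.
row_totals = {}
col_totals = {}
for (cancer, smoke), n in counts.items():
    row_totals[cancer] = row_totals.get(cancer, 0) + n
    col_totals[smoke] = col_totals.get(smoke, 0) + n

# Marginal (simple) probabilities: marginal totals over the grand total.
marginal_cancer = {k: n / total for k, n in row_totals.items()}
marginal_smoke = {k: n / total for k, n in col_totals.items()}

print(joint)
print(marginal_cancer)
print(marginal_smoke)
```

Computing the marginals from the counts, and only then dividing, keeps the printed values tidy; adding up the floating-point joint probabilities instead would give the same numbers up to rounding.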
Now look at the column totals. There are 400 smokers, divided by 1,000. The probability of being a smoker is 40%, 0.40. The probability of being a non-smoker is 0.60.

A researcher looking at this table would want to answer the question: are smoking and cancer independent or not? Well, here's one way to answer it. We look at the probability of C, and we see if it's equal to the probability of C given S. And if you want, you can go and look at the probability of C given not S, too. The probability of C given S is cancer given that you're a smoker. That's the probability of cancer and smoker over the probability of smoker, or 0.10 over 0.40, or 25%. So once we know you're a smoker, there's a 25% chance you're going to die of cancer. Look at the probability of C given not S. That's the probability of C and not S over the probability of not S. That works out to 0.05 over 0.60, which is 0.083, or 8.3%. If you're a non-smoker, there's an 8.3% chance of dying of cancer. The probability of cancer, the simple probability, is kind of an average of those two. It's a weighted average, actually. Remember, there were more non-smokers than smokers, so it's not a simple average. That's 0.15. You've only got to look at two of them, but looking at all three, you can see what's going on. In general, there's a 15% chance you're going to die of cancer. If you're a smoker, it's 25%. If you're a non-smoker, that probability is 0.083. So a researcher looking at these numbers will simply state that smoking and cancer are not independent. There is actually a relationship between the two. And this is how you win the Nobel Prize, by finding relationships. With this one, of course, you're not going to win the Nobel Prize. I think everybody knows this one.

Here's another way you can determine that cancer and smoking are not independent.
If cancer and smoking are independent, then the probability of C and S is equal to the probability of C times the probability of S. Remember, that was the formula for independent events. We did it with A and B: the probability of A and B is the probability of A times the probability of B if they're independent. Well, the probability of C and S, you know, is 0.10. Is that equal to the probability of C, 0.15, times the probability of S, 0.40? Well, 0.10 is not equal to 0.06 on this planet. So we conclude that cancer and smoking are not independent. And if you want to do this properly, it's always useful to set up a joint probability table, which we'll show you in the next two slides.

Okay, here we see the joint probability table for the same data we had before. Remember, you take those frequencies in the four cells, the cancer and smoking cells, there are four of them, and divide through by 1,000, and you get a joint probability table. You divide the marginal totals through by 1,000, too. And notice you've got everything there. You read each cell as an "and", always. The probability of smoker and cancer is 0.10. Cancer and non-smoker is 0.05. Smoker and no cancer, 0.30. And non-smoker without cancer is 0.55. And all these probabilities add up to 1, you see that? It's very useful to set up this joint probability table, because you can see everything you want. And again, the marginal probabilities are the row and column totals.

Okay, so look at the same problem set up with a joint probability table. It's a lot easier to solve. Again, here we have the joint probabilities set up nice and neat. Everything is here. Notice you get the joint probabilities in the center of the table and the marginal probabilities on the margins, and a conditional probability is computed simply by dividing a joint probability by a marginal probability.
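The two checks for independence in the smoking and cancer example, and the "joint over marginal" recipe for a conditional probability, can be sketched in Python as follows. The variable names are made up for this illustration, and `math.isclose` is used instead of `==` only because floating-point arithmetic is not exact.

```python
import math

# Joint probabilities from the smoking and cancer table.
p_c_and_s = 0.10          # cancer and smoker
p_c_and_not_s = 0.05      # cancer and non-smoker
p_not_c_and_s = 0.30      # no cancer and smoker
p_not_c_and_not_s = 0.55  # no cancer and non-smoker

# Marginal probabilities are the row and column totals.
p_c = p_c_and_s + p_c_and_not_s  # P(C), about 0.15
p_s = p_c_and_s + p_not_c_and_s  # P(S), about 0.40
p_not_s = 1 - p_s                # P(not S), about 0.60

# Conditional probability: joint over the marginal of the event after the "given".
p_c_given_s = p_c_and_s / p_s
p_c_given_not_s = p_c_and_not_s / p_not_s

# P(C) is the weighted average of the two conditional probabilities.
weighted_average = p_c_given_s * p_s + p_c_given_not_s * p_not_s

# Check 1: is P(C given S) equal to P(C)?
independent_by_conditional = math.isclose(p_c_given_s, p_c)
# Check 2: is P(C and S) equal to P(C) times P(S)?
independent_by_product = math.isclose(p_c_and_s, p_c * p_s)

print(round(p_c_given_s, 3))      # 0.25
print(round(p_c_given_not_s, 3))  # 0.083
print(round(weighted_average, 2)) # 0.15
print(independent_by_conditional, independent_by_product)  # False False
```

Both checks come out false, which matches the conclusion in the lecture: 0.25 is not 0.15, and 0.10 is not 0.06.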
So the probability of C given S is the probability of C and S, that's 0.10, divided by the column total, the marginal probability, which is 0.40, or 0.25.

Here's another example, with gender and beer drinking. We have B for beer drinker. B prime means not a beer drinker. We have M and F for male and female. And we see the frequencies here, 450, 450, 350, and so on. We turn it into a joint probability table by dividing through by 2,000, the 2,000 in the sample. And you see all the joint probabilities given, right? For example, a male who drinks beer, and notice B and M is the same as M and B, that probability would be 0.225. A female who drinks beer, that's 0.175. A man who doesn't drink beer, that's 0.225 also. And a female who doesn't drink beer is 0.375. And we have the marginals there, the marginal probabilities. So what's the probability of a beer drinker in this town? It's 40%. That's the probability of B. That includes men and women together. And what's the probability of somebody being a male in this town? Well, in this town, 900 out of 2,000, or 0.45, are males. So you see, again, in the joint probability table, everything you need to know is right there.

Well, here are the conditional probabilities. Given that an individual is male, what's the probability that person is a beer drinker? That's B given M, and that's the probability of B and M over the probability of M. That works out to 0.225 over 0.45, or 50%. Once I know you're a male, there's a 50% chance in this town that you drink beer. What about if we're given that you're a female? What's the probability of being a beer drinker? Well, that's the probability of B given F. That's the probability of B and F over the probability of F. Again, we take the joint over the marginal. So in this case, B given F is 0.175, that's beer and female, over the probability of being female in this town, which is 0.55. And that works out to 0.318.

Are beer drinking and gender independent?
We know the probability of beer drinking is 0.40. It's a weighted average, again, of those numbers above. Notice if you're a male, there's a 50% chance you drink beer. If you're a female, it's 0.318, let's say it exactly, a 31.8% chance that you drink beer. Clearly, there's some kind of relationship, and we know men are more likely to drink beer than women. Again, the 0.40 is a weighted average of the two. A researcher looking at this data would say beer drinking and sex are not independent. There's some kind of pattern. Once I know that you're a man, I know you're more likely to drink beer. Okay?

Here we're looking at a marketing example: Dove soap. Do you use Dove soap or not? We want to know if there's a relationship between gender, male or female, and use of Dove soap. So we have this contingency table here. Notice there's a sample of 1,000 people in some town. And here's what we found. Males who use Dove soap, there are 80 out of 1,000. Females who use Dove soap, 120. Men who didn't use Dove soap, 320. And females who didn't use Dove soap, 480. All the joint probabilities are on the bottom there. So the probability of D and M, which by the way is the same as M and D, is 8%. The probability of D and F is 12%. The probability of not D and F is 0.48. And we have the marginal totals. There's a 20% chance of using Dove soap. 80% don't use Dove soap; that's the probability of not D. The probability of M is 40%. There's a 40% chance you're a male in this sample. And the probability of female is 60%. So we have all the important probabilities computed for you.

Well, the marketing researcher might want to know whether gender and use of Dove soap are independent or not. Here's what you find out. The probability of being a Dove user is 20% in this town. The probability of D given M, that's Dove user given that you're male, is 0.08 over 0.40, which is also 20%.
The probability of D given F, remember that's D and F over the probability of F, is 0.12 over 0.60, and that's also 20%. Guess what? They're independent. It doesn't matter. Knowing the gender of the person doesn't affect whether they use Dove or not. We know it's 20% whether you're male or female. So they're independent.

You can do it a different way. We know that the probability of M and D, the joint probability, is 8%. Is that equal to the probability of M times the probability of D? Well, 0.08 is equal to 0.40 times 0.20. So yes, the two events M and D are independent of each other.
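The product check can also be run directly on the counts, which avoids decimals altogether: a joint probability equals the product of its marginals exactly when the cell count times the grand total equals the row total times the column total. The sketch below applies that to the Dove table and, for contrast, to the beer table. The function name `is_independent` is made up for this illustration, and the beer count of 750 is not read out in the lecture; it is derived here from the stated joint probability, 0.375 times 2,000.

```python
def is_independent(counts):
    """Return True if every cell of a table of counts satisfies
    P(row and column) == P(row) * P(column).

    The check is done exactly with integers:
    cell * grand_total == row_total * column_total.
    """
    total = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    return all(
        counts[i][j] * total == row_totals[i] * col_totals[j]
        for i in range(len(counts))
        for j in range(len(counts[0]))
    )


# Rows: uses Dove, does not use Dove. Columns: male, female.
dove = [[80, 120],
        [320, 480]]

# Rows: beer drinker, not a beer drinker. Columns: male, female.
# The 750 is derived from the stated joint probability 0.375 * 2,000.
beer = [[450, 350],
        [450, 750]]

print(is_independent(dove))  # True
print(is_independent(beer))  # False
```

An exact match like the one in the Dove table is what textbook numbers give you. With real sample data the two sides rarely agree to the last digit, which is why the formal tests for independence mentioned earlier in the lecture are needed.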