 As Salaamu Alaikum, welcome to lecture number 27 of the course on statistics and probability. Students, you will recall that in the last lecture, we discussed in detail the concept of bivariate probability distributions, both discrete and continuous. Towards the end of the last lecture, we began the discussion of two important properties of mathematical expectation that are valid in the case of bivariate distributions. Expected value of x plus y is equal to expected value of x plus the expected value of y and the other property that if x and y are independent, then expected value of x y is equal to expected value of x into expected value of y. In today's lecture, I will begin with a detailed discussion of exactly these two properties with reference to the same example that we were talking about last time. As you now see on the screen, if we have a situation where x and y are two discrete random variables with the joint probability distribution given as you see on the slide and if we are required to find expected value of x, expected value of y, expected value of x plus y and expected value of x y, then in order to find e of x and e of y, we will compute the marginal probability distributions g of x and h of y. As indicated in the last lecture, g of x will be found by summing the joint probabilities over y and h of y is found by summing the joint probabilities over x. So, as you now see on the screen, in this example, the probabilities which are denoted by g of x are 0.40 and 0.60. On the other hand, the marginal probabilities of y are 0.25, 0.50 and 0.25. Students, in order to compute expected value of x, our formula is sigma x into g of x and applying this formula, we obtain 0.80 which is the product of 2 and 0.40 plus 2.40 which is the product of 4 and 0.60. So, the expected value of x comes out to be 3.2. In a similar manner, we will find the expected value of y given by sigma y into h of y. 
We take each y value, multiply it by the corresponding h of y value, and sum the products, and doing so, we obtain expected value of y equal to 3.0. Hence, as you see on the screen, expected value of x plus expected value of y comes out to be equal to 6.2. Students, what we are interested in now is to compute the expected value of x plus y, and as I told you last time, its formula is only slightly more complicated. It is a kind of an extension of the formula that you have in the univariate situation. So, the expected value of x plus y is given by double summation x i plus y j into f of x i y j. And what will we do in order to solve this expression? Look, in this problem, we have six cells in the body of the table, and what we will do is this: in each cell, we add the corresponding x value and the corresponding y value and write the sum in the top corner of that same cell. So, as you now see on the screen, the first sum of this type will be 2 plus 1, because the first value of x is 2 and the first value of y is 1, and this sum 2 plus 1 is written in the top corner of the first cell. Similarly, we write 2 plus 3 that is 5, 2 plus 5 that is 7, 4 plus 1, 4 plus 3 and 4 plus 5 in the top corners of the other cells. In other words, students, all we have to do is this: whichever cell you consider, take the y value at the top against it and the x value on the left side against it, add those two x and y values, and write the sum in the top corner of that same cell. So, it is actually very simple, and the next thing is to multiply this sum that you have written with the probability that you have in that particular cell, and thus you will have the product of x i plus y j and f of x i y j for any cell i j. 
Of course, i j is the general expression: i means we are talking about the ith row and j means that we are talking about the jth column, and so, combining the two, we can say that we are talking about the i jth cell, i taking the values 1, 2, 3 and so on, and j taking the values 1, 2, 3 and so on; in this way, you can handle the entire table. Now that we have all these sums multiplied by their corresponding probabilities, students, all we have to do is to add these in order to get expected value of x plus y. So, as you now see on the screen, we obtain 0.30 plus 1.00 plus 0.70 plus 0.75 plus 2.10 plus 1.35, and the sum comes out to be 6.20, exactly the same that we had when we added e of x to e of y. The second property which I have discussed with you is that if x and y are independent, then expected value of x y is equal to expected value of x into expected value of y. So, in this problem, in order to see whether x and y are indeed independent, we will have to try to see whether or not this equation holds. And if I look at the right hand side of this equation, e of x into e of y, 3.0 into 3.2, obviously I obtain 9.6. But the question arises: is the left hand side, e of x y, also equal to 9.6? So, what do we do? Students, the procedure is very, very similar to the one that I just explained. As you now see on the screen, the expected value of x y is given by double summation x i y j into f of x i y j. So, this means that in this particular situation, we do not have to write the sum of the x value and the y value in the top corner of any cell; rather, we write the product. Therefore, the value that we write in the top corner of the first cell is 2 into 1 and that is 2. For the second one, we write 2 into 3 and that is 6, next 2 into 5 which is 10, and so on and so forth, multiplying each one of these products by the corresponding probability. 
We obtain all those products whose sum is going to give us the expected value of x y, and students, when you work on it exactly the same way as I explained a short while ago for the sum, you find that the expected value of x into y also comes out to be 9.6. Hence, we can say that in this particular example, the two random variables x and y are indeed statistically independent. Students, this was the computation of the expected value of x plus y and the expected value of x y in the discrete situation. Let us now consider the continuous situation, and I hope you remember that in the continuous situation, the summation sign is replaced by integration. Let us do this with the help of the example that you now see on the screen. Let x and y be independent random variables with joint probability density function f of x y equal to x into 1 plus 3 y square, divided by 4, and this expression is valid for x lying between 0 and 2 and y lying between 0 and 1. We would like to compute the expected value of x and the expected value of y, and to verify the two properties that we have already discussed. In order to find the expected value of x, we will first of all compute the marginal distribution of x, and as explained last time, for this purpose we will find the integral of our expression with respect to y, and doing so, g of x comes out to be x over 2. Similarly, to find the marginal distribution of y, we integrate our expression with respect to x, and doing that, the marginal distribution of y comes out to be 1 plus 3 y square, divided by 2. Now, in order to find the expected value of x, the formula is the integral of x into g of x with respect to x. Students, I have shared with you the formula, and I would encourage you to note the similarity with the formula of the discrete situation: in the discrete case, we had expected value of x equal to sigma x into g of x, and here, of course, we have the integral with respect to x instead. 
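For students who would like to check these computations themselves, the whole discrete procedure can be sketched in a few lines of Python. This is only an illustration, not part of the lecture slides; the joint probabilities below are the ones implied by the cell-by-cell products read out above.

```python
# Joint probability table of the example: x in {2, 4}, y in {1, 3, 5}.
f = {
    (2, 1): 0.10, (2, 3): 0.20, (2, 5): 0.10,
    (4, 1): 0.15, (4, 3): 0.30, (4, 5): 0.15,
}

# Marginals g(x) and h(y): sum the joint probabilities over the other variable.
g, h = {}, {}
for (x, y), p in f.items():
    g[x] = g.get(x, 0) + p
    h[y] = h.get(y, 0) + p

E_x = sum(x * p for x, p in g.items())                    # 3.2
E_y = sum(y * p for y, p in h.items())                    # 3.0
E_x_plus_y = sum((x + y) * p for (x, y), p in f.items())  # 6.2 = E(X) + E(Y)
E_xy = sum(x * y * p for (x, y), p in f.items())          # 9.6 = E(X) * E(Y)
```

Running this reproduces exactly the hand computation: 6.2 for the sum and 9.6 for the product, confirming both properties for this table.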
So, as you now see, the formula is very similar. As you see on the slide, the expected value of x in this particular example is the integral from 0 to 2, with respect to x, of the expression x into x over 2, that is, x square over 2, and applying this integral and applying the limits 0 to 2, our answer comes out to be 4 by 3 or 1.33. Similarly, the expected value of y is given by the integral of y into h of y, the integration being with respect to y, and applying this formula, e of y comes out to be 5 over 8. Now comes the more complicated part of the example, and that is to compute the expected value of x plus y. In the discrete case, we had double summation x i plus y j into f of x i y j; now, instead of double summation, we have a double integral. So, as you see on the screen, the expected value of x plus y is the double integral of x plus y into f of x y, and applying this in this particular example and going through all the steps in a fashion similar to what I explained in the last lecture, the expected value of x plus y comes out to be 47 divided by 24. Also, we are interested in the expected value of x y, and very similar to what we have just done for x plus y, in this case we have the formula double integral of x y into f of x y, and once again applying the method that is already known to us, the expected value of x y in this example comes out to be 5 over 6. Hence, students, we find that both the properties are being fulfilled: 4 by 3 plus 5 by 8 is equal to 47 by 24, and 4 by 3 multiplied by 5 by 8 does come out to be equal to 5 by 6. Hence, e of x plus e of y is equal to e of x plus y, and because x and y in this example are independent, e of x into e of y has come out to be equal to e of x y. The next concept that I am going to discuss with you, students, is the covariance and the correlation of bivariate probability distributions. You will remember that we discussed the concept of correlation earlier. 
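The continuous results can also be checked numerically. The sketch below is not the symbolic integration done on the slides; it simply approximates each double integral with a midpoint Riemann sum over a fine grid, which is close enough to confirm 4/3, 5/8, 47/24 and 5/6.

```python
# f(x, y) = x(1 + 3y^2)/4 on 0 < x < 2, 0 < y < 1.
N = 400  # grid resolution; midpoint rule error shrinks like 1/N^2

def expect(weight):
    """Approximate the double integral of weight(x, y) * f(x, y)."""
    dx, dy = 2.0 / N, 1.0 / N
    total = 0.0
    for i in range(N):
        x = (i + 0.5) * dx
        for j in range(N):
            y = (j + 0.5) * dy
            total += weight(x, y) * x * (1 + 3 * y**2) / 4 * dx * dy
    return total

E_x = expect(lambda x, y: x)         # close to 4/3
E_y = expect(lambda x, y: y)         # close to 5/8
E_sum = expect(lambda x, y: x + y)   # close to 47/24
E_prod = expect(lambda x, y: x * y)  # close to 5/6
```

Comparing E_sum with E_x + E_y and E_prod with E_x * E_y shows both properties holding up to the grid error.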
In lecture number 15, immediately before we started the segment on probability theory, we had a lecture on regression and correlation. You will remember that there we were dealing with sample data. We had a sample and we had measurements on two variables, for example, height and weight, or marks in mathematics versus marks in statistics, that is, two related variables. So, now we are going to discuss the same subject, but the difference is that now we are going to talk about the covariance and the correlation of not just a sample but an entire probability distribution. Here there is an interesting and important point: when we say that we are dealing with a probability distribution, you can also say that we are dealing with the entire population, and it is from the entire population that a sample is drawn. When we draw a sample and describe it, we say that we are doing descriptive statistics, but when we expand this concept to the entire population, then we are talking about probability distributions, because every population has a distribution. What is the basic concept of covariance, students? As you now see on the screen, the covariance of two random variables X and Y is a numerical measure of the extent to which their values tend to increase or decrease together. It is denoted by sigma x y, and it is defined as the expected value of x minus e of x multiplied by y minus e of y. The shortcut formula is: covariance of x y is equal to e of x y minus e of x into e of y. Students, do you remember the variance that we discussed in the case of a probability distribution, and what its shortcut formula was? The expected value of x square minus the expected value of x, whole square, gives you the variance of the random variable X. So, you see that the variance of X is just the covariance of X with itself: the expected value of x into x minus the expected value of x into the expected value of x. 
It is exactly the same shortcut formula that I conveyed to you just now for the covariance. The covariance of two random variables is the extent to which the two variables vary together. So, when we were talking about only X, at that time we did not have any option except to multiply X with X, and if we talk about only Y, then we do not have any option except to say that the variance of Y is equal to the expected value of Y square minus the expected value of Y, whole square. Here, because we are trying to measure two variables together, our formula is: the covariance of X and Y is the expected value of X Y minus the expected value of X into the expected value of Y. This is the covariance, and what is the correlation? Students, if we divide the covariance by the standard deviation of X into the standard deviation of Y, we obtain what is called the correlation of X and Y. So, as you now see on the slide, the correlation coefficient is given by the covariance of X Y over the standard deviation of X into the standard deviation of Y. Students, the correlation coefficient in the case of a bivariate probability distribution is not denoted by r as we had in the case of sample data; rather, it is denoted by the Greek letter rho. Note that this is not the English word row; it is a Greek letter which we pronounce as rho, and if you want to spell it in English, then you will spell it R H O. There is nothing surprising in this: from the beginning, we have seen that for a sample statistic, we generally use an English letter, and for the population parameter, we use a Greek letter. For example, sample mean X bar, population mean mu; sample standard deviation S and population standard deviation sigma. Let us apply this to an example. Suppose that we have a discrete bivariate probability distribution in which X takes the values 0, 1, 2 and Y takes the values 0, 1, 2 and 3. 
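Before moving to the new example, the direct and shortcut formulas for the covariance can be compared on the joint table from the start of today's lecture. This is an illustrative Python sketch: since X and Y were independent there, both formulas should give a covariance of 0.

```python
import math

# Joint table from the first example of the lecture: x in {2, 4}, y in {1, 3, 5}.
f = {
    (2, 1): 0.10, (2, 3): 0.20, (2, 5): 0.10,
    (4, 1): 0.15, (4, 3): 0.30, (4, 5): 0.15,
}

E_x = sum(x * p for (x, y), p in f.items())  # 3.2
E_y = sum(y * p for (x, y), p in f.items())  # 3.0

# Direct formula: E[(X - E(X)) (Y - E(Y))]
cov_direct = sum((x - E_x) * (y - E_y) * p for (x, y), p in f.items())

# Shortcut formula: E(XY) - E(X) E(Y)
E_xy = sum(x * y * p for (x, y), p in f.items())
cov_shortcut = E_xy - E_x * E_y

assert math.isclose(cov_direct, cov_shortcut, abs_tol=1e-12)
```

Both come out to 0 (up to floating-point error), as expected for independent variables, and the assertion confirms the two formulas agree.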
The probabilities are as you see on the slide, and what we are interested in is to find the correlation coefficient between the random variables X and Y. In order to solve this question, students, we can compute, first of all, E of X and E of Y, the reason being that in the formula of the correlation coefficient, we will have to divide the covariance by the product of the standard deviation of X and the standard deviation of Y. In order to find the standard deviation of X, we will first need to find the expected value of X, and then we apply the formulae for the variance and the standard deviation of X. So, proceeding as before, the expected value of X comes out to be 1.10, and the expected value of Y comes out to be 1.65. Also, we find the expected value of X square by the formula sigma X i square into g of X i, and the answer is 1.70. In a similar way, the expected value of Y square, which is sigma Y j square into h of Y j, comes out to be 3.45. Substituting these quantities in the formulae of the variance of X and the variance of Y, we obtain the variance of X equal to 0.49, whereas the variance of Y is equal to 0.7275. Now, I want to remind you that the expected value of X raised to the power k is equal to sigma X i raised to the power k into f of X i; as I had told you, to find the expected value of any power of X, we raise each X i to that power and multiply by its probability. In the bivariate situation, it is exactly the same; it is just that the marginal distribution of X is denoted by g of X instead of f of X. So, our formula is: the expected value of X square is equal to sigma X i square into g of X i. Having found the variance of X and the variance of Y, students, the next thing is to find the covariance of X and Y. As you now see on the slide, according to the shortcut formula, the covariance of X Y is equal to the expected value of X Y minus the expected value of X into the expected value of Y. 
Now, the expected value of X Y is computed in exactly the same manner as we did in the example that we considered in the beginning of today's lecture, and applying that procedure, it comes out to be 1.90. Substituting the values of E of X and E of Y, the covariance comes out to be 0.085. Now, the correlation coefficient is given by the covariance of X Y over the square root of the variance of X into the variance of Y, and substituting all the values, the correlation comes out to be 0.14. Let us now try to interpret the answer. You may remember that in lecture number 15, I had told you that the correlation coefficient always lies between minus 1 and plus 1. If there is a positive correlation between X and Y, then our correlation coefficient is somewhere between 0 and 1, and if there is a negative correlation, then our correlation coefficient lies between 0 and minus 1. Of course, if X and Y are absolutely uncorrelated, then the correlation coefficient comes out to be exactly equal to 0. So, in this example, it has come out to be equal to 0.14. So, what is your interpretation? It is clear that since this is a positive answer, we will not say that there is a negative correlation between these two random variables. It is clear that if there is a correlation, then it is positive or direct: as X increases, Y also increases. The closer my answer is to 1, the stronger is the correlation between X and Y. In this case, actually, it is not at all close to 1. I mean that if you compare the distance between 0 and 0.14, that is much, much less than the distance between 0.14 and 1. Therefore, our answer is much closer to 0 than to 1. Hence, our final interpretation is that there is a weak positive linear correlation between X and Y. As for the word linear: I have already indicated in lecture 15 that the correlation we are computing is actually a measure of the strength of the linear relationship between the two variables. 
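The arithmetic of this example can be verified in a few lines, starting from the moments read out on the slides (the joint table itself is on the slide, so here only the moments are used).

```python
import math

# Moments from the slides:
# E(X) = 1.10, E(Y) = 1.65, E(X^2) = 1.70, E(Y^2) = 3.45, E(XY) = 1.90
E_x, E_y = 1.10, 1.65
E_x2, E_y2, E_xy = 1.70, 3.45, 1.90

var_x = E_x2 - E_x**2      # shortcut formula -> 0.49
var_y = E_y2 - E_y**2      # shortcut formula -> 0.7275
cov_xy = E_xy - E_x * E_y  # shortcut formula -> 0.085

rho = cov_xy / math.sqrt(var_x * var_y)
print(round(rho, 2))       # 0.14
```

The result, 0.14, matches the weak positive linear correlation discussed above.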
Let us now discuss the computation of the correlation coefficient in the case of a continuous bivariate probability distribution. As you see on the screen, suppose we have the joint probability density function X square plus X Y over 3, such that X lies between 0 and 1 and Y lies between 0 and 2, and suppose that we would like to find the variance of X, the variance of Y and the correlation of X and Y. In order to solve this question, the first step is to find the marginal pdfs: g of X comes out to be 2 X square plus 2 X over 3, and h of Y comes out to be 1 over 3 plus Y over 6. Next, we find the expected value of X, which is the integral of X into g of X, the integration being done with respect to X, and the answer is 13 over 18. Similarly, the expected value of Y comes out to be 10 over 9. Now, in order to find the variance of X, we can either apply the direct formula or we apply the shortcut formula. For the sake of interest, let us apply the direct formula in this example. And if we do so, we have the expected value of X minus e of X, whole square, equal to the integral of X minus e of X, whole square, multiplied by g of X, the integration being done with respect to X. Now, X minus e of X, whole square, is the same thing as X minus 13 over 18, whole square, and substituting this in the formula, we obtain the product of two expressions, which we can expand and then integrate. The variance of X comes out to be 73 divided by 1620. Students, I would like to encourage you to compute the variance of X in this problem by both the shortcut formula and the direct formula and see for yourself that you get exactly the same answer. You do not have to get confused by thinking that the direct formula is something that is beyond your reach. Absolutely not. In fact, it is quite simple. 
Similarly, we can also find the variance of Y, and as you now see on the screen, the variance of Y is equal to the expected value of Y minus e of Y, whole square, and that is the integral of Y minus e of Y, whole square, multiplied by h of Y, the integration being done with respect to Y. Substituting 10 over 9 in place of e of Y, we have to find the integral of Y minus 10 over 9, whole square, multiplied by 1 over 3 plus Y over 6. So, we multiply these parts and we get a number of terms upon which we apply this integral, and doing all the calculations, the final answer is 26 over 81. To find the covariance of X and Y by the direct formula, we have to compute the expected value of X minus e of X into Y minus e of Y. In other words, we have to find the double integral of X minus 13 over 18 into Y minus 10 over 9 into the joint pdf X square plus X Y over 3, and applying the integral in the same manner as we discussed earlier, we find that the covariance of X and Y comes out to be minus 1 over 162. The correlation coefficient is equal to the covariance divided by the standard deviation of X into the standard deviation of Y. So, in this example, the correlation coefficient comes out to be minus 1 over 162 divided by the square root of the variance of X, which is 73 over 1620, into the variance of Y, which is 26 over 81, and hence the final answer is minus 0.05. Students, you have noted that in this example, our answer, minus 0.05, is very close to 0. So, what is our interpretation? We can say that there is a very weak negative linear correlation between the random variables X and Y. I will again encourage you to work through the covariance, the correlation and the double integrals or double summations in all these concepts yourself, so that you feel at home and comfortable and confident with these ideas. We have completed the discussion of the basic concepts of univariate and bivariate probability distributions, both the discrete situation and the continuous situation. 
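As with the earlier continuous example, these results can be checked numerically. The sketch below approximates each expectation with a midpoint Riemann sum rather than doing the symbolic integration from the slides, and recovers 73/1620, 26/81, minus 1/162 and a correlation of about minus 0.05.

```python
import math

# f(x, y) = x^2 + xy/3 on 0 < x < 1, 0 < y < 2.
N = 400
dx, dy = 1.0 / N, 2.0 / N

def expect(weight):
    """Approximate the double integral of weight(x, y) * f(x, y)."""
    total = 0.0
    for i in range(N):
        x = (i + 0.5) * dx
        for j in range(N):
            y = (j + 0.5) * dy
            total += weight(x, y) * (x**2 + x * y / 3) * dx * dy
    return total

E_x, E_y = expect(lambda x, y: x), expect(lambda x, y: y)    # 13/18, 10/9
var_x = expect(lambda x, y: (x - E_x)**2)                    # close to 73/1620
var_y = expect(lambda x, y: (y - E_y)**2)                    # close to 26/81
cov_xy = expect(lambda x, y: (x - E_x) * (y - E_y))          # close to -1/162
rho = cov_xy / math.sqrt(var_x * var_y)                      # about -0.05
```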
And students, after this we begin the discussion of some important univariate distributions that are encountered in practice. The distributions that I would like to discuss with you are the discrete uniform distribution, the binomial distribution, the hypergeometric distribution and the Poisson distribution. In the continuous scenario, I would like to discuss with you the continuous uniform distribution and, last but not least, in fact most importantly, the normal distribution. So, let us begin this new segment. We begin with the discussion of the discrete uniform distribution, and I would like to explain it to you with the help of an example. Suppose that we toss a fair die and let x denote the number of dots on the uppermost face. Since the die is fair, each of the x values from 1 to 6 is equally likely to occur, and hence the probability distribution of the random variable x is: x equal to 1, 2, 3, 4, 5 and 6, and for each one of these x values, the probability is 1 over 6. The sum of the probabilities is 1 and none of the probabilities is negative; hence, we are dealing with a proper discrete probability distribution. You have seen that in this particular probability distribution, all the probabilities are identical. So, if we draw the line chart of this probability distribution, we have a uniform distribution, as you now see on the screen. Since the height of each of the six lines is equal to 1 over 6, we get a horizontal impression, and that is why this is known as the discrete uniform distribution. You have seen that for any distribution, after drawing the line chart, we are interested in the center and the spread of the distribution. So, in this particular example, if I wish to find the mean of this distribution, students, what do you suggest? 
The same old formula, sigma x into f of x: we have the column of x values and we have the column of f of x, that is, the probabilities, and we can find the mean very simply. But what I would like to convey to you is that you do not even need to go through this formula in order to find the mean of this particular distribution. The point to be understood is that this distribution is absolutely symmetric: if you place a mirror at its center, you will find that the left hand side is the mirror image of the right hand side, and hence the mean of the distribution lies at the exact center of the distribution. So, as you now see on the slide, the mean of this particular distribution is equal to 3.5. Once again, you might say that 3.5 dots are not possible. Of course not: if you toss the die, either you will get 3 dots or 4 dots or 2 dots or 6 dots. So, what does 3.5 dots per toss on the average mean? It means that if you toss the die many, many times, then in every 10 tosses you can expect to have a total of 35 dots. The next thing is the spread of this distribution, and of course we would like to compute the variance and the standard deviation of the distribution. For this, obviously, we will have to apply the formula, and I would like to encourage you, students, to find the variance, the standard deviation and the coefficient of variation of this particular distribution on your own. Let us now consider another example: the lottery conducted in various countries for purposes of money making provides a good example of the discrete uniform distribution. Suppose that in a particular lottery, as many as 10,000 lottery tickets are issued and the numbering of these tickets is from 0 0 0 0 to 9 9 9 9. Since each of these numbers is equally likely to occur, we have the following situation. 
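As a check on the exercise just set, the mean of the fair-die distribution can be computed with the usual formulas; this short sketch confirms that sigma x into f of x does give 3.5, and applies the same pattern for the variance.

```python
# Discrete uniform distribution of a fair die: f(x) = 1/6 for x = 1, ..., 6.
f = {x: 1 / 6 for x in range(1, 7)}

mean = sum(x * p for x, p in f.items())  # 3.5, the exact center by symmetry
E_x2 = sum(x**2 * p for x, p in f.items())
variance = E_x2 - mean**2                # E(X^2) - [E(X)]^2 = 35/12
sd = variance ** 0.5                     # about 1.71
```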
We have a uniform distribution in which the X variable has 10,000 values, starting from 0 0 0 0 and going up to 9 9 9 9, and because the number of values is 10,000, the probability of any one of these lottery numbers is 1 over 10,000. Now, the line chart of this distribution is absolutely uniform, for the same reason: we have exactly the same probability for each lottery number. Because the height of every line is equal, we can see that this is a perfect example of a discrete uniform distribution. Students, you know that the crucial point here is that every lottery number has an equal chance. Look, the very meaning of a lottery is that everyone has an identical chance of being selected for the prize. So, between this example and the example of the throwing of the die that I presented, you have seen that in every situation where you have various outcomes which are equally likely, and you can express those outcomes in numerical form as a random variable X which takes values 0, 1, 2, 3 or 1, 2, 3, 4 or even values like you had in this example, 0 0 0 0 up to 9 9 9 9, you do get what is called a discrete uniform distribution. Note that the X values are equally spaced; if the spacing between them varies, you will agree that the symmetry of your line chart will be distorted. I would like to encourage you to think about this and to work on this point on your own. The next important, very important discrete probability distribution that I am going to discuss with you is the binomial distribution. Students, this distribution was discovered by James Bernoulli around the year 1700, and even today it is considered a very important discrete probability distribution; there are many situations in real life which can be identified with the binomial distribution. Let me begin its discussion with the help of an example. 
Suppose that we toss a fair coin 5 times and we are interested in determining the probability distribution of the random variable X, where X represents the number of heads that we obtain. Now, in this problem, you note that the number of heads is going to be a variable that will go from 0 to 5. If you toss the coin 5 times, then what are the various possibilities? First of all, you may not obtain even a single head, or you can have one head or two heads or three, four, or you can have five heads. And it is clear that if you toss the coin 5 times, then you cannot obtain six heads. Hence, we note that in this kind of a situation, our random variable X goes from 0 to 5, or generally speaking, from 0 to n, small n, where n denotes the number of trials. A binomial experiment is the technical term that we have for this kind of experiment. Formally speaking, a binomial experiment is an experiment which satisfies the following four conditions. Number one, every trial results in either a success or a failure. Number two, every trial is independent of every other trial. Number three, the probability of success remains constant from trial to trial. And number four, the total number of trials, that is n, is fixed in advance. Students, these four properties that I have conveyed to you are essential for us to be able to apply the formula of the binomial distribution. So, I would like to discuss these points with you in detail, one by one. First of all, we have said that every trial results in a success or a failure. Look, success and failure are technical terms. Success does not mean that we are talking about something that is very good, and failure does not mean that we are talking about something that is wrong or bad. They are technical terms, and they simply mean that the outcome that we are interested in is called success, and the complementary outcome is called failure. For example, suppose our experiment is that we are testing a person for a particular rare disease. 
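The formula for these probabilities comes in the next lecture, but the coin-tossing example above can already be explored by brute force: the sketch below simply enumerates all 2 to the power 5, that is 32, equally likely outcomes and counts heads, which gives the distribution of X without any formula.

```python
from itertools import product

# All 32 equally likely outcomes of tossing a fair coin 5 times.
outcomes = list(product("HT", repeat=5))

# Count how many outcomes give each number of heads X = 0, 1, ..., 5.
dist = {k: 0 for k in range(6)}
for outcome in outcomes:
    dist[outcome.count("H")] += 1

# Probability of each X value: count / 32.
probs = {k: count / 32 for k, count in dist.items()}
# probs comes out as {0: 1/32, 1: 5/32, 2: 10/32, 3: 10/32, 4: 5/32, 5: 1/32}
```

These counts, 1, 5, 10, 10, 5, 1, are exactly what the binomial formula of the next lecture will produce directly.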
When we conduct the test on a person, its result will be either negative or positive. A negative result means that this person does not have the disease, and a positive result means that he does have the disease. If we are interested in doing research on that disease, then, students, the presence of the disease will be regarded as success. So, out of a sample of 10 patients, if three of them do have the disease and seven of them do not, then we say that there are three successes and seven failures in this particular example. This was the first point, and the second point was that every trial is independent of every other trial. In this example, the 10 patients that we tested are unrelated to one another, and so we say that the trials are independent. The third property is that the probability of success should remain constant from trial to trial. Now, in this example, how do we decide that this is happening? In this problem, we will have to apply the relative frequency definition of probability: in the large population from which we took these 10 patients, the proportion of people who have the disease will act as the probability of this particular disease. So, for example, if we have the knowledge from past records that 5 percent of the people in the large population have this particular disease, then we say that the probability of success is 0.05, and this probability remains constant from trial to trial, the reason being that the trials are independent and the patients have come from that same population. The last point is that the number of trials is fixed in advance. In this problem, we had already decided that we will test 10 patients, and so the number of trials is 10. Students, in the next lecture, I will discuss the binomial distribution with you in more detail, and we will be looking at the formula for the probabilities that are computed in any such situation where these four properties are fulfilled. 
In the meantime, you will be practicing the concepts of covariance, correlation and the discrete uniform distribution. Best of luck and until next time, Allah Hafiz.