Let me briefly review the standard normal distribution. We know that the standard normal distribution is used to find the probabilities for any normal distribution, which need not itself be a standard normal distribution. The shape of a normal distribution is a bell. This is a picture of the standard normal curve. The area to the left of any point z is shown here, and these areas have been tabulated for different values of z. You can see that the curve is symmetric about 0, so the area to the left of z can be used to find the area to the left of minus z, somewhere here. The table provides the area to the left of z where z is positive. However, the same table can be used to find the area to the left of minus z, and how do we do that? By symmetry, the area to the left of minus z is the same as the area to the right of z, and this is easily worked out as 1 minus the value provided in the table, which is the area to the left of z. These things we had discussed, and these techniques are used to find different areas under the normal curve. The table, as you can see, looks like this; it gives you the area to the left of any point z. The values of z are here. So the area to the left of, say, 3.09 is 0.9990, as you can see in the picture drawn on top.

Now we describe the z alpha notation, which will be used frequently in the inference problems that we will discuss later on. z alpha denotes the value on the measurement axis for which the area under the z curve lying to the right of z alpha is alpha, as you can see in this picture. So the area to the right of z alpha is equal to alpha. That is, the probability that Z is greater than or equal to z alpha is alpha, which is nothing but 1 minus the area to the left of z alpha, and these are values we can get from the tables. Let us take an example. Let Z be the standard normal variable.
You are asked to find the value of little z given that the probability that Z is less than little z is 0.9278. What you are given here, 0.9278, is a value in the body of the table, and from the body of the table you are required to find the value of z. In other words, you are given that this area is 0.9278 and asked for the corresponding value of z. So basically we read the table in reverse: look in the body of the table for the entry 0.9278, or the entry closest to it, and read back to the margins to find that the value of z is 1.46.

Now suppose you are asked to find z such that the central area between minus z and z is 0.8132. Since the standard normal curve is symmetric about 0, I can write this area as twice the area between 0 and z. The area between 0 and z is nothing but the area to the left of z minus 0.5, because the area to the left of 0 is 0.5. So twice the quantity (area to the left of z minus 0.5) equals 0.8132, which means the area to the left of z is 0.5 plus 0.4066, that is 0.9066. This is the form in which we have to write it down, so that we can look in the body of the table for 0.9066, and reading back gives the value of z as 1.32, again using the reverse calculation.

Now, the standard normal curve is important because it lets us find areas under any normal curve. So we look at the normal distribution in relation to the standard normal distribution. If X has a normal distribution with mean mu and standard deviation sigma, then X minus mu, divided by sigma, has a standard normal distribution. This is an important result, since it enables us to write all probability statements about the normal variable X in terms of probabilities of the standard normal variable Z.
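As a quick check of the two reverse table lookups above, here is a minimal Python sketch. It assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the helper names `z_for_left_area` and `z_for_central_area` are made up for this illustration and are not part of the lecture.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_for_left_area(area):
    """Return z such that P(Z <= z) = area (the reverse table lookup)."""
    return Z.inv_cdf(area)


def z_for_central_area(area):
    """Return z > 0 such that P(-z <= Z <= z) = area."""
    # By symmetry the central area is 2 * (P(Z <= z) - 0.5),
    # so the left-tail area to look up is 0.5 + area / 2.
    return Z.inv_cdf(0.5 + area / 2)


z_left = z_for_left_area(0.9278)        # lecture's table answer: 1.46
z_central = z_for_central_area(0.8132)  # lecture's table answer: 1.32
print(round(z_left, 2), round(z_central, 2))
```

The software inverts the cumulative distribution numerically, so it can differ from a four-decimal table in the third decimal place; rounded to two decimals it agrees with the table readings.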
Suppose X is normal with mean 80 and standard deviation 20, and we ask how to find the area to the left of, say, 65. What we do is standardize the variable X. So X minus 80 by 20, which is Z, is less than or equal to 65 minus 80 by 20, which says Z is less than or equal to minus 0.75. The area under the curve for the variable X is the same as the corresponding area under the z curve. The only thing we are doing is shifting by the mean and scaling by the standard deviation. So that amounts to finding the area to the left of minus 0.75. If you look at the table, it does not give you a direct answer for minus 0.75, since it lists only positive values of z. So you find the area to the right of 0.75 instead: take the area to the left of 0.75 from the table, which is 0.7734, and subtract it from 1. That gives your answer, 0.2266.

Let us take one more example. A particular rash is noticed in kids at an elementary school. It has been determined that the length of time the rash lasts is normally distributed with mean 6 days and standard deviation 1.5 days. So X is a variable indicating the length of time that a rash lasts. Find the probability that, for a student selected at random from this elementary school, the rash will last between 3.75 and 9 days. We are required to find the probability that X lies between 3.75 and 9. So we standardize X into Z by subtracting the mean, which is 6, and dividing by the SD, which is 1.5, and that asks for the area under the z curve between minus 1.5 and 2. Looking at the table, the area to the left of 2 is 0.9772. From this we subtract the area to the left of minus 1.5, and for that we first find the area to the right of 1.5, which gives 0.0668. So the required probability is 0.9772 minus 0.0668, which is 0.9104.

Now let us look at percentiles of an arbitrary normal distribution. We have already looked at the percentiles of the standard normal curve.
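The two standardization examples above (the area to the left of 65 for a normal variable with mean 80 and SD 20, and the rash lasting between 3.75 and 9 days) can be sketched in Python as follows. This is only an illustration under the assumption that `statistics.NormalDist` from the standard library (Python 3.8+) is available; the function name `prob_between` is hypothetical.

```python
from statistics import NormalDist


def prob_between(mu, sigma, lo=None, hi=None):
    """P(lo <= X <= hi) for X ~ Normal(mu, sigma), computed through z-scores.

    A missing bound is treated as minus or plus infinity.
    """
    Z = NormalDist()  # standard normal
    upper = 1.0 if hi is None else Z.cdf((hi - mu) / sigma)
    lower = 0.0 if lo is None else Z.cdf((lo - mu) / sigma)
    return upper - lower


p_left_65 = prob_between(80, 20, hi=65)       # z = -0.75
p_rash = prob_between(6, 1.5, lo=3.75, hi=9)  # z runs from -1.5 to 2
print(round(p_left_65, 4), round(p_rash, 4))
```

The function deliberately converts to z-scores first, mirroring the table method, rather than calling `NormalDist(mu, sigma).cdf` directly; both routes give the same areas.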
Now, the 100 times p-th percentile of a normal variable with mean mu and SD sigma is equal to mu plus sigma times the 100 times p-th percentile of the standard normal. Let us see what is happening. Suppose this is the standard normal curve for Z. If the area to the left of this point is p, then the 100 times p-th percentile is this point, which we call eta p. For the variable X, however, the 100 times p-th percentile is nothing but mu plus eta p times sigma, where the mean mu is here.

Take the rash problem for the elementary school, and suppose we want the 80th percentile. We need the value c such that the probability that X is less than or equal to c is 0.8. In other words, we are looking at Z less than or equal to c minus 6 by 1.5, and this probability is 0.8. Now we look at the standard normal table and find the value such that the area to its left is 0.8, and from the tables that value is 0.84. This is nothing but eta 0.8. So the 80th percentile is 6 plus 0.84 times 1.5, which works out to 7.26, and this is mu plus eta 0.8 times sigma. So for 80 percent of the school kids with rashes, the rash lasts at most 7.26 days. Again, what we have done is use the standard normal curve to find a percentile for a variable whose normal curve has a given mean mu and SD sigma.

Now I am going to take up an IQ example. Consider an IQ model having a normal distribution with mean 100 and sigma 15. So the normal curve is centered at 100, and since the SD is 15, I mark 115 and 130 to the right and 85 and 70 to the left.
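The percentile rule above, mu plus eta p times sigma, can be written as a short Python sketch. It assumes `statistics.NormalDist` from the Python 3.8+ standard library; `normal_percentile` is an illustrative name, not something from the lecture.

```python
from statistics import NormalDist


def normal_percentile(p, mu, sigma):
    """100p-th percentile of Normal(mu, sigma), computed as mu + eta_p * sigma."""
    eta_p = NormalDist().inv_cdf(p)  # 100p-th percentile of the standard normal
    return mu + eta_p * sigma


# Rash example from the lecture: mean 6 days, SD 1.5 days, 80th percentile.
rash_p80 = normal_percentile(0.80, mu=6, sigma=1.5)
print(round(rash_p80, 2))
```

The software uses eta 0.8 to more decimals than the table value 0.84, so the result can differ from the hand calculation beyond the second decimal place.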
I have marked these numbers on the measurement scale so that we can see that almost 95 percent of the area lies between 70 and 130. That is part of the empirical rule, which says that about 68 percent of the values lie within 1 sigma of the mean, around 95 percent lie within 2 sigma of the mean, and almost 99.7 percent of the values of the random variable X lie within 3 sigma of the mean, provided the distribution of the variable is normal.

So now the question: what proportion, or what percentage, of IQs is below 130, where the distribution of IQ is normal with mean 100 and SD 15? Can you work this out? Let the IQ values be represented by the random variable X. Then what is being sought is the probability that X is less than 130. That means finding the probability that Z is less than 130 minus 100 divided by 15, which is the probability that Z is less than 2, and from the tables you should get the value 0.9772. In other words, around 97.72 percent of the students have IQs less than 130.

Now if I ask what IQ values capture the middle 95 percent of the IQs, can anybody attempt this? The middle 95 percent of the values would be captured by two numbers, one less than 100 and the other more than 100. So what are those values, and how do we attack this problem? The first step is to draw the picture and translate the question into pictorial form. In other words, what is sought is the two numbers corresponding to these two points here.
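The empirical rule and the below-130 proportion quoted above can be checked numerically. The sketch below is illustrative only; it assumes the standard-library `statistics.NormalDist` (Python 3.8+), and `within_k_sigma` is a hypothetical helper name.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # the IQ model used in the lecture


def within_k_sigma(dist, k):
    """Probability that a normal variable falls within k SDs of its mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)


coverage = {k: within_k_sigma(iq, k) for k in (1, 2, 3)}
p_below_130 = iq.cdf(130)  # same as P(Z < 2)

for k, prob in coverage.items():
    print(f"within {k} sigma: {prob:.4f}")
print(f"P(IQ < 130) = {p_below_130:.4f}")
```

The coverage figures do not depend on the particular mean and SD; any normal distribution gives the same three numbers.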
Now if I look at the standard normal curve, do we know these values? What are the two values of the standard normal variable Z that capture the middle 95 percent of the area under the standard normal curve? If you look at the table you should be able to find them, and the value has to be very close to 2, since we have already seen that almost 95 percent of the values lie within 2 SD of the mean. The middle area is 0.95, so the upper tail area is 0.025 and the lower tail area is 0.025. What we now need to find is A and B such that the area between them is 0.95, and this is equivalent to saying that the probability that A minus 100, the mean, divided by 15, the SD, is less than Z, which is less than B minus 100 by 15, is 0.95. We have seen from this picture that the lower value is minus 1.96 and the upper value is 1.96. So we find A and B by equating these. A is nothing but 100 minus 1.96 times 15, which works out to 70.6, and B is 100 plus 1.96 times 15, which comes out to 129.4. So the IQ values 70.6 and 129.4 capture the middle 95 percent of the IQs.

Now let us finally take one more question. What IQ is necessary to be in the top 5 percent bracket? In other words, we are looking for the value A such that the area to its right is 0.05. So the probability that X is greater than A is 0.05, and in standardized form this says the probability that Z is greater than A minus 100 by 15 is 0.05. So you need to equate A minus 100 by 15 to the value of z for which the area to the right is 0.05, and that value is 1.645. So what is the value of A? It is 100 plus 1.645 times 15, which gives A as about 124.67.
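Both IQ questions above (the middle 95 percent and the top 5 percent cutoff) are inverse lookups, so they can be sketched with the inverse CDF. This assumes `statistics.NormalDist` from the Python 3.8+ standard library; the function names are illustrative and not from the lecture.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # the IQ model used in the lecture


def middle_interval(dist, coverage):
    """Return (a, b) with equal tails such that P(a < X < b) = coverage."""
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


def upper_cutoff(dist, top_fraction):
    """Return a such that P(X > a) = top_fraction."""
    return dist.inv_cdf(1 - top_fraction)


a, b = middle_interval(iq, 0.95)  # hand calculation: 70.6 and 129.4
top5 = upper_cutoff(iq, 0.05)     # hand calculation: about 124.67
print(round(a, 1), round(b, 1), round(top5, 2))
```

Internally this is the same arithmetic as the hand calculation: the standard normal cutoff (about 1.96 or 1.645) times 15, added to or subtracted from 100.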
So basically here we consider the picture in which the area to the right is 0.05, and this value is 1.645. So these were some examples of finding values of X.

Now let us get into the use of the normal approximation to the binomial. Let X be a binomial random variable based on n trials, each with probability of success p. If the binomial probability histogram is not too skewed, X may be approximated by a normal distribution with mean mu equal to n p and standard deviation sigma equal to the square root of n p q, where q is 1 minus p. When we say that the probability histogram is not too skewed, it means that the parameter p of the binomial variable is neither very close to 1 nor very close to 0. The approximation holds well when p is closer to 0.5. So the probability that the binomial variable takes a value less than or equal to, say, little x is approximated by the area under the standard normal curve to the left of little x plus 0.5 minus n p, divided by root n p q. This is the standardized form of x, except for the added 0.5, which is a continuity correction. This correction improves the approximation.

Let us use this result to find a probability for a random variable that follows a binomial distribution. At a particular small college the pass rate in intermediate algebra is 72 percent, so the probability of passing is 0.72. If 500 students enroll in a semester, determine the probability that at most 375 students pass. If you look at the mean and the SD of the random variable for the number of students passing, the mean is n p, which is 500 times 0.72, which is 360, and the standard deviation is the square root of 500 times 0.72 times 0.28, which works out to about 10. To find the probability that X is less than or equal to 375 we could of course use the binomial probabilities, but that would involve a lot of computation. So we approximate it by the normal result. The standardized value, including the correction factor of 0.5, is 375 plus 0.5 minus 360, divided by 10.
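To see how close the approximation set up above is, the following Python sketch compares the continuity-corrected normal value with the exact binomial sum for the pass-rate example. It is a rough illustration that assumes Python 3.8+ (`math.comb` and `statistics.NormalDist`); the function names are hypothetical. Note that it uses the unrounded standard deviation, the square root of 100.8, rather than the rounded value 10 used in the hand calculation.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf_exact(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p), by summing the probabilities."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def binom_cdf_normal(x, n, p):
    """Normal approximation to P(X <= x) with the 0.5 continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    return NormalDist(mu, sigma).cdf(x + 0.5)


n, p, x = 500, 0.72, 375  # pass-rate example from the lecture
approx = binom_cdf_normal(x, n, p)
exact = binom_cdf_exact(x, n, p)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial sum:   {exact:.4f}")
```

The exact sum needs 376 terms, which is the "lot of computation" that the approximation avoids when working by hand.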
So this is the requirement: we are trying to find the probability that Z is less than or equal to this quantity, and looking at the table, that amounts to finding the area to the left of 1.55 under the standard normal curve, which works out to 0.9394.

Let us now look at some other continuous distributions. For that we first define the gamma function. For alpha greater than 0, the gamma function, gamma of alpha, is defined as the integral from 0 to infinity of x to the power alpha minus 1 times e to the power minus x, dx. As a special case, gamma of 1 is the integral from 0 to infinity of e to the power minus x dx, which works out to minus e to the power minus x evaluated from 0 to infinity, which is 1. So gamma of 1 is 1. In general, gamma of alpha plus 1 is equal to alpha times gamma of alpha. What happens when alpha is an integer, say alpha equal to n? Then gamma of n plus 1 is equal to n times gamma of n, and this can be used iteratively to get that gamma of n plus 1 is n factorial. Also, gamma of one half is root pi; this can be worked out.

Using these results we are now in a position to define the exponential distribution. A continuous random variable X has an exponential distribution with parameter lambda, where lambda is greater than 0, if its probability density function is f of x equal to lambda e to the power minus lambda x for x greater than or equal to 0, and 0 otherwise. For an exponential distribution the mean is 1 by lambda and the variance is 1 by lambda squared. Let us see the mean, which is the expected value of X. It is the integral from 0 to infinity of x lambda e to the power minus lambda x dx, which, with y equal to lambda x, becomes the integral from 0 to infinity of y by lambda times e to the power minus y dy. This is nothing but 1 by lambda times gamma of 2, which is 1 by lambda. The picture is right here; this is your density curve for the exponential.
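The gamma function facts and the exponential mean derived above can be checked numerically. The sketch below uses only the standard `math` module; the rate `lam = 2.0` is an arbitrary value chosen for illustration (it does not come from the lecture), and the midpoint-rule integrator is a simple stand-in for the integral, not a recommended numerical method.

```python
import math

# Gamma function facts quoted in the lecture.
gamma_one = math.gamma(1)      # should be 1
gamma_half = math.gamma(0.5)   # should be sqrt(pi)
factorial_check = all(
    math.isclose(math.gamma(n + 1), math.factorial(n)) for n in range(1, 11)
)


def exp_mean_via_gamma(lam):
    """E[X] for an exponential with rate lam, written as gamma(2) / lam."""
    return math.gamma(2) / lam


def exp_mean_numeric(lam, steps=100_000):
    """Midpoint-rule estimate of the integral of x * lam * exp(-lam * x)."""
    upper = 50.0 / lam  # beyond this point the integrand is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x * lam * math.exp(-lam * x)
    return total * h


lam = 2.0  # illustrative rate only
print(gamma_one, gamma_half, math.sqrt(math.pi), factorial_check)
print(exp_mean_via_gamma(lam), exp_mean_numeric(lam))
```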
So what about the variance? Sigma squared, the variance of X, is nothing but the expected value of X squared minus the square of the expected value of X. For this we need the expected value of X squared. Can you work it out in the same fashion as I did for the expected value of X? It is the integral from 0 to infinity of x squared lambda e to the power minus lambda x dx, and what is the answer to this? You need to practice a bit. The answer is 2 over lambda squared. Did anybody else get that? This is nothing difficult; you just have to refresh your integration. In fact, it is nothing but 1 by lambda squared times gamma of 3, in the same way as we got 1 by lambda times gamma of 2; substituting y equal to lambda x gives you this integral. So the variance is 2 over lambda squared minus 1 by lambda squared, which works out to 1 by lambda squared.

Now let us look at how to find areas under an exponential density curve, using the cumulative distribution function, the CDF, of the exponential distribution. The CDF is given by 1 minus e to the power minus lambda x for x greater than or equal to 0, and is 0 otherwise. In other words, if this is the density curve f of x, the CDF at little x is the area to the left of little x, which we write as capital F of x. So the probability that X is less than or equal to little x is the CDF, F of x with parameter lambda, and that amounts to integrating lambda e to the power minus lambda t dt from 0 to x, which, with the transformation y equal to lambda t, is the integral from 0 to lambda x of e to the power minus y dy.
This works out to minus e to the power minus y evaluated from 0 to lambda x, which is minus the quantity e to the power minus lambda x minus 1, which is 1 minus e to the power minus lambda x, as was mentioned. This gives you the probability to the left of little x, and from it we also see that the probability to the right of little x is nothing but 1 minus the CDF, which is e to the power minus lambda x. So areas under the density curve of an exponential are easy to calculate; we do not need any tables here.

Let us take this example. Let the random variable X denote the response time in seconds on a certain online computer. That means X is the time between the end of a user's inquiry and the beginning of the system's response to the inquiry. The density function of this random variable X is given as 0.2 e to the power minus 0.2 x, where x is greater than or equal to 0. The exponential curve looks like this: it starts at height 0.2 at x equal to 0 and decays as the response time x increases. First of all, you can easily check that this is a PDF, that is, the area under the curve is 1. In other words, the integral of f of x dx from minus infinity to infinity, which in our case is the integral from 0 to infinity of 0.2 e to the power minus 0.2 x dx, is 1. This integral can be worked out and you can check that it is 1.

Let us find the probability that the response time is less than 0.5, that is, the proportion of inquiries with a response time less than 0.5 seconds. That is simply the integral from 0 to 0.5 of 0.2 e to the power minus 0.2 x dx, and from the CDF that we just saw it is 1 minus e to the power minus 0.2 times 0.5, which works out to 0.095. So we simply used the CDF result. Similarly, the probability that the response time is more than 0.5 seconds is nothing but 1 minus the probability that X is less than 0.5, which works out to 0.905. So using the CDF you can easily compute areas under the exponential curve. I think I will stop here today, and we will continue with the memoryless property next time.
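The response-time calculations above use only the closed-form CDF, so they are easy to reproduce. The following Python sketch is illustrative; the function names `exp_cdf` and `exp_right_tail` are made up for this example.

```python
import math


def exp_cdf(x, lam):
    """P(X <= x) for an exponential with rate lam: 1 - exp(-lam * x) for x >= 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


def exp_right_tail(x, lam):
    """P(X > x), which is 1 minus the CDF: exp(-lam * x) for x >= 0."""
    return 1.0 - exp_cdf(x, lam)


lam = 0.2  # response-time density from the lecture: 0.2 * exp(-0.2 * x)
p_less = exp_cdf(0.5, lam)         # proportion of responses under 0.5 seconds
p_more = exp_right_tail(0.5, lam)  # proportion of responses over 0.5 seconds
print(round(p_less, 3), round(p_more, 3))
```

Because the CDF has a closed form, no table or numerical integration is needed, which is the point made in the lecture.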