Friends, in the last lecture I started on continuous distributions; in particular we did the exponential distribution and the Erlang (or gamma) distribution, and I showed that both arise as distributions of waiting times for occurrences in a Poisson process. Today I will introduce some more continuous distributions which need not be related to the Poisson process.

The first is one of the simplest distributions, the uniform distribution. We have already seen the uniform distribution in the discrete case, where we allocate equal probability to each outcome. In the continuous case, if the density is constant over a finite interval, we call it a continuous uniform distribution. So we define it like this: a continuous random variable X is said to have a uniform distribution on the interval (a, b) — one may take the interval open or closed, it makes no difference — if its probability density function is

f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 elsewhere.

If you plot this, you can easily see how it looks: between a and b the density sits at the constant height 1/(b − a), so the graph is a rectangle. That is why this distribution has another name: it is also called the rectangular distribution. This type of distribution is applicable in various situations. For example, consider waiting at a traffic crossing on a busy road: if the signal cycle is up to 3 minutes, your waiting time may be anywhere from 0 to 3 minutes. The time spent by a customer at a telephone booth, and many other quantities of this nature, can also be modelled by a uniform distribution.

Since the density is constant, the mean (the first moment) is simply the midpoint (a + b)/2 of the interval. In fact, moments of any order can be calculated just as easily:

μₖ′ = E(Xᵏ) = ∫ₐᵇ xᵏ/(b − a) dx = (b^(k+1) − a^(k+1)) / ((k + 1)(b − a)).

In particular, from μ₂′ we get the variance μ₂ = (b − a)²/12, so the standard deviation is (b − a)/(2√3). The moment generating function E(e^(tX)) works out to

M(t) = (e^(bt) − e^(at)) / (t(b − a)) for t ≠ 0, and naturally M(0) = 1.
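To make these formulas concrete, here is a minimal Python sketch (the endpoints a and b are arbitrary illustrative choices) that checks the mean, variance and MGF of the uniform distribution by numerical integration:

```python
# Numerical check of the uniform-distribution formulas above
# (a sketch; the interval endpoints a, b are arbitrary choices).
import numpy as np
from scipy import integrate

a, b = 2.0, 5.0
pdf = lambda x: 1.0 / (b - a)

# k-th raw moment: integral of x^k / (b - a) over [a, b]
def raw_moment(k):
    val, _ = integrate.quad(lambda x: x**k * pdf(x), a, b)
    return val

mean = raw_moment(1)
var = raw_moment(2) - mean**2
print(mean, (a + b) / 2)                    # both 3.5
print(var, (b - a)**2 / 12)                 # both 0.75
print(np.sqrt(var), (b - a) / (2 * np.sqrt(3)))

# MGF at t != 0: (e^{bt} - e^{at}) / (t (b - a))
t = 0.7
mgf_num, _ = integrate.quad(lambda x: np.exp(t * x) * pdf(x), a, b)
mgf_formula = (np.exp(b * t) - np.exp(a * t)) / (t * (b - a))
print(mgf_num, mgf_formula)                 # agree
```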
As I mentioned, the uniform distribution is used whenever the density can be assumed constant. Let us now look at some other useful distributions, and let me first introduce one concept. We have considered the exponential distribution, defined as the distribution of the waiting time for the first occurrence in a Poisson process. It can model various kinds of phenomena: lifetimes of components in a manufacturing process, the life of an electronic system, and so on.

When we take this lifetime interpretation of the exponential distribution, one important concept is that of a rate of occurrence. In the lifetime context the incident of interest is the failure of the system, so we are interested in the failure rate. Let us define it formally, connect it to the exponential distribution, and then see what further generalizations can be made.

Let X denote the life of a component or system. Given that the system is working till time t, what is the probability that it fails immediately after t, that is, in the interval (t, t + h] for a small h? To obtain a rate, we divide by h and take the limit as h → 0:

h(t) = lim_{h→0} (1/h) P(t < X ≤ t + h | X > t).

The interpretation is straightforward: the system is working till time t, we look at the probability that it fails in the small interval just after t, divide by the length of that interval, and take the limit. This is called the instantaneous failure rate of the system at time t. Let us evaluate it. In the conditional probability, the event in the numerator is a subset of the conditioning event, so

h(t) = lim_{h→0} (1/h) · P(t < X ≤ t + h) / P(X > t).

Taking X to be continuous with cumulative distribution function F_X and pdf f_X, the numerator is (F_X(t + h) − F_X(t))/h, which tends to the density as h → 0, and the denominator is 1 − F_X(t). Therefore

h(t) = f_X(t) / (1 − F_X(t)).

This failure rate h(t) is also called the hazard rate of the system. The quantity 1 − F_X(t) = P(X > t) is called the reliability of the system at time t, written R_X(t): the probability that the system survives till time t.

In the light of this, let us first look at the exponential distribution and then at generalizations. Take the standard model f_X(t) = λe^(−λt) for t > 0. Here F_X(t) = 1 − e^(−λt), so R_X(t) = e^(−λt), and

h(t) = λe^(−λt) / e^(−λt) = λ.

Note that this is free of the time t: the failure rate of a system with an exponential life distribution is constant. That is a very distinctive property. One more thing we can see here: this relationship can also be inverted. Given the distribution we can evaluate the failure rate, and given the failure rate we can also recover the distribution.
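As a small numerical illustration of this constant hazard rate (a sketch, with λ = 0.5 an arbitrary choice), one can compare the limit definition of h(t) against f(t)/(1 − F(t)):

```python
# Numerical illustration that the exponential hazard rate h(t) = f(t)/(1 - F(t))
# is the constant lambda (a sketch; lambda = 0.5 is an arbitrary choice).
import numpy as np

lam = 0.5
f = lambda t: lam * np.exp(-lam * t)          # pdf
F = lambda t: 1.0 - np.exp(-lam * t)          # cdf
R = lambda t: 1.0 - F(t)                      # reliability (survival function)

for t in [0.1, 1.0, 5.0, 20.0]:
    h_small = 1e-6
    # h(t) from the limit definition with a small increment, and from f/(1-F)
    h_limit = (F(t + h_small) - F(t)) / (h_small * R(t))
    print(t, h_limit, f(t) / R(t))            # all approximately 0.5
```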
Let me call the relation h(t) = f_X(t)/(1 − F_X(t)) relation (*). From (*) we can write

h(t) = −(d/dt) log(1 − F_X(t)),

which gives log(1 − F_X(t)) = −∫ h(t) dt plus a constant, so that

1 − F_X(t) = k · e^(−∫ h(t) dt),

where the constant k is determined from the initial condition F_X(0) = 0. For the exponential distribution h(t) = λ, and you indeed recover 1 − F_X(t) = e^(−λt).

Now think of a situation where the failure rate is not constant, that is, it depends on t. The simplest cases you might consider are a linear failure rate like λt, or a parabolic one like λt², and so on. Given any such rate, we can evaluate the distribution. Let us take one example: h(t) = λt. In that case

1 − F_X(t) = e^(−λt²/2),

and differentiating (the two minus signs cancel) gives the density

f_X(t) = λt e^(−λt²/2).

More generally, if I had taken h(t) = λtᵏ, then 1 − F_X(t) = e^(−λt^(k+1)/(k+1)), and differentiating would again produce the factor λtᵏ in front:

f_X(t) = λtᵏ e^(−λt^(k+1)/(k+1)).

This gives rise to a general class of distributions called the Weibull distributions. That is, we consider densities of the form

f(x) = αβ x^(β−1) e^(−αx^β), x > 0, with α, β > 0.

With this density you can see that 1 − F(x) = e^(−αx^β), so the hazard rate becomes h(t) = αβ t^(β−1), which is a power of t. So whenever the hazard rate is a power of t, we get a general Weibull distribution.

Here you can have two types of behaviour. If β > 1, then as time increases the failure rate increases. That describes most systems used in industry — a television set, a manufacturing system, any machine: the failure rate increases as the system ages. But we can also model a different kind of system by taking β < 1. If β < 1, the exponent of t is negative, so t effectively goes into the denominator: as time increases, the rate of failure decreases. This applies, for example, to organic life forms. For a small child, just after birth, the mortality rate is quite high, but as time progresses the rate of death decreases.
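A short sketch (with arbitrary parameter values) showing how the Weibull hazard αβt^(β−1) behaves for β < 1, β = 1 and β > 1:

```python
# Weibull hazard rates: h(t) = alpha * beta * t^(beta - 1) increases in t for
# beta > 1 and decreases for beta < 1 (a sketch; parameter values are arbitrary).
import numpy as np

alpha = 1.0

def weibull_pdf(t, alpha, beta):
    return alpha * beta * t**(beta - 1) * np.exp(-alpha * t**beta)

def weibull_surv(t, alpha, beta):
    # survival function 1 - F(t) = exp(-alpha * t^beta)
    return np.exp(-alpha * t**beta)

for beta in [0.5, 1.0, 2.0]:
    hz = [weibull_pdf(t, alpha, beta) / weibull_surv(t, alpha, beta)
          for t in (0.5, 1.0, 2.0, 4.0)]
    print(beta, np.round(hz, 3))
# beta = 0.5: decreasing hazard ("infant mortality")
# beta = 1.0: constant hazard (the exponential special case)
# beta = 2.0: increasing hazard (wear-out)
```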
For example, from 0 to 5 years of age the mortality rate is quite high; from 5 to, say, 50 years it is much less. So we can model systems where the failure rate is very high initially and decreases as time increases. This gives us flexibility.

Now, in the light of the reliability function and the failure rate, we also consider systems with multiple components. We may have components connected in series — this is called a series system — and we may have components connected in parallel, a parallel system. A series system works only if all of the components 1 to n are working, whereas a parallel system works if at least one of the components 1 to n is working. These configurations are used quite often in engineering design. Let us look at the reliability of such systems.

Reliability of a series system. Let a system consist of n components with lives given by random variables X₁, X₂, …, Xₙ, connected in series, and let X denote the system life. Define the system reliability R_X(t) = P(X > t). The entire system functions at time t exactly when each of the components functions at time t, so

R_X(t) = P(X₁ > t, X₂ > t, …, Xₙ > t).

If we know the joint distribution of X₁, …, Xₙ, this can be evaluated. As a special case, if X₁, …, Xₙ are independent, then

R_X(t) = ∏ᵢ₌₁ⁿ P(Xᵢ > t),

that is, the product of the reliabilities of the individual components. So for a series system with independent components, the system reliability is simply the product of the component reliabilities.

For a physical interpretation, suppose each component has the same reliability p at time t. With n components the system reliability becomes pⁿ. If p = 1/2 and n = 3, then p³ = 1/8, which is much less than 1/2. So as you add components in series, the overall system reliability keeps decreasing. The reason is that with more components, any one of them can cause the system to fail, and therefore the system as a whole becomes weaker.
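In code, the series formula is just a product (a minimal sketch; the reliability values are illustrative):

```python
# Series-system reliability with independent components: the product of the
# component reliabilities (a sketch; p and n are illustrative values).
def series_reliability(component_reliabilities):
    r = 1.0
    for p in component_reliabilities:
        r *= p
    return r

print(series_reliability([0.5] * 3))    # 0.125 = (1/2)^3, as in the example
print(series_reliability([0.99] * 10))  # ~0.904: even good parts weaken a chain
```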
On the other hand, let us consider the reliability of a parallel system. Let a system consist of n components with lives X₁, X₂, …, Xₙ connected in parallel: if the circuit runs from a to b, it functions as long as at least one of the components functions. Again let X denote the system life. We can compute the reliability via the complementary event: if the system works when at least one component works, then the complementary event is that all of them have failed. So

R_X(t) = 1 − P(X₁ ≤ t, X₂ ≤ t, …, Xₙ ≤ t).

Again, in the special case where X₁, …, Xₙ are independent,

R_X(t) = 1 − ∏ᵢ₌₁ⁿ P(Xᵢ ≤ t) = 1 − ∏ᵢ₌₁ⁿ (1 − Rᵢ(t)).

Once again, a practical example: suppose the components have identical reliability p = 1/2 at time t, and we have 2 components. Then the system reliability is 1 − (1/2)² = 3/4. So connecting components in parallel increases the system reliability, whereas connecting them in series decreases it. An important engineering design principle, therefore, is to increase system reliability by adding extra components in parallel as backups; this is generally called standby redundancy: as soon as one component fails, another takes over and the system keeps functioning. This is why the concepts of reliability, failure rate and hazard rate have extremely useful applications in engineering studies: we are generally dealing with system lives, and these are the important quantities to consider.
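A minimal simulation sketch (assuming independent exponential component lives, with arbitrary λ, t and n) that checks both the parallel and the series formulas, using the fact that a parallel system's life is the maximum of the component lives while a series system's life is the minimum:

```python
# Parallel-system reliability 1 - prod(1 - R_i(t)), checked by simulating
# independent exponential component lives (a sketch; lam, t, n are arbitrary).
import numpy as np

rng = np.random.default_rng(0)
lam, t, n = 1.0, 1.0, 3
R_comp = np.exp(-lam * t)                      # one component's reliability

formula = 1.0 - (1.0 - R_comp)**n
lives = rng.exponential(1.0 / lam, size=(200_000, n))
sim = np.mean(lives.max(axis=1) > t)           # parallel: system life = max life
print(formula, sim)                            # close agreement

# Series for comparison: system life = min of the component lives
print(R_comp**n, np.mean(lives.min(axis=1) > t))
```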
Based on the Weibull distribution there are some other distributions which are also used. As you have seen, the Weibull density has a polynomial power x^β in the exponent; there are other distributions where, in place of a polynomial power, an exponential term appears in the exponent. Naturally, such densities go to 0 very rapidly, and the corresponding distributions are called extreme value distributions. One can study various extreme value distributions; in this course we just mention the point.

Now we move to one of the most widely used distributions in statistical theory: the normal distribution. Let me first introduce the distribution; afterwards we will look at its importance, why it is considered the most popular, and where the name "normal" comes from. A continuous random variable X is said to have a normal distribution with parameters μ and σ², written X ~ N(μ, σ²), if its pdf is

f(x) = (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)), −∞ < x < ∞,

where the parameter μ ranges over (−∞, ∞) and σ > 0. If we plot the curve, it is the familiar bell shape, symmetric about the value μ.

Let me demonstrate how to evaluate the integrals related to this density. First, if it is a proper density, it should integrate to 1 over (−∞, ∞). Consider the transformation z = (x − μ)/σ; on (−∞, ∞) this is one-to-one, and the integral becomes

∫ (1/√(2π)) e^(−z²/2) dz over (−∞, ∞).

This is a convergent integral, since the integrand decays to 0 very rapidly on both sides, and by symmetry it equals twice the integral from 0 to ∞. Substituting z = √(2t), that is z²/2 = t, transforms it into a gamma integral — those who have done the theory of the gamma function will already know its value, but let me demonstrate:

(2/√(2π)) ∫₀^∞ e^(−t) dt/√(2t) = (1/√π) ∫₀^∞ t^(−1/2) e^(−t) dt = (1/√π) Γ(1/2) = 1,

since Γ(1/2) = √π.

Using this we can evaluate E(X − μ)ᵏ. This integral is much easier to handle than E(Xᵏ), because the transformation z = (x − μ)/σ simplifies the factor (x − μ)ᵏ, whereas xᵏ would not simplify. We get

E(X − μ)ᵏ = ∫ (σᵏ zᵏ/√(2π)) e^(−z²/2) dz over (−∞, ∞),

which obviously vanishes for odd k, since the integrand is then an odd function. For k = 1 this gives E(X − μ) = 0, that is, E(X) = μ. So the parameter μ denotes the mean of the normal distribution; from the shape of the curve it is also the median and the mode.

Since μ is the mean, E(X − μ)ᵏ is in fact the kth central moment μₖ. Consider k = 2, so that μ₂ denotes the variance of X:

μ₂ = (1/√(2π)) ∫ σ² z² e^(−z²/2) dz = (2σ²/√(2π)) ∫₀^∞ z² e^(−z²/2) dz,

using that the integrand is an even function. Applying the same substitution z²/2 = t gives

(2σ²/√(2π)) ∫₀^∞ 2t e^(−t) dt/√(2t) = (2σ²/√π) Γ(3/2) = (2σ²/√π)(√π/2) = σ².

So we have shown that the parameter σ² of the normal distribution is indeed the variance. Thus when I write X ~ N(μ, σ²), the mean of the distribution is μ and the variance is σ².
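These three facts — total probability 1, mean μ and variance σ² — can also be verified by direct numerical integration (a sketch; μ and σ are arbitrary choices):

```python
# Numerical check that the N(mu, sigma^2) density integrates to 1 and that its
# mean and variance are mu and sigma^2 (a sketch; mu, sigma are arbitrary).
import numpy as np
from scipy import integrate

mu, sigma = 1.5, 2.0
pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

total, _ = integrate.quad(pdf, -np.inf, np.inf)
mean, _ = integrate.quad(lambda x: x * pdf(x), -np.inf, np.inf)
var, _ = integrate.quad(lambda x: (x - mu)**2 * pdf(x), -np.inf, np.inf)
print(total)          # 1.0
print(mean, mu)       # both 1.5
print(var, sigma**2)  # both 4.0
```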
As I have mentioned, μ can be any real number and σ any positive number. One important special case is μ = 0 and σ² = 1; this is called the standard normal distribution. Its probability density function is

φ(x) = (1/√(2π)) e^(−x²/2), −∞ < x < ∞,

and in statistical texts the special notation φ (small phi) is used for it. The cumulative distribution function of the standard normal distribution is denoted by Φ (capital phi):

Φ(x) = ∫₋∞ˣ φ(t) dt.

These functions have some special properties. Since a normal distribution is symmetric about its mean, the standard normal is symmetric about 0, so φ(−t) = φ(t). Using this in the CDF gives Φ(t) = 1 − Φ(−t), and in particular Φ(0) = 1/2, so 0 is the median, which is consistent with what we found.

There is another important reason for considering the standard normal distribution: any normal distribution can be shifted to the standard normal one. To establish this, let me first derive the moment generating function of the normal distribution:

M_X(t) = E(e^(tX)) = ∫ e^(tx) (1/(σ√(2π))) e^(−(x − μ)²/(2σ²)) dx.

As before, consider the transformation z = (x − μ)/σ, so that x = μ + σz. Then

M_X(t) = ∫ (1/√(2π)) e^(t(μ + σz)) e^(−z²/2) dz = e^(μt) ∫ (1/√(2π)) e^(−(1/2)(z² − 2σtz)) dz.

Completing the square by adding and subtracting σ²t², we have z² − 2σtz = (z − σt)² − σ²t², so taking the constant factor outside,

M_X(t) = e^(μt + σ²t²/2) ∫ (1/√(2π)) e^(−(1/2)(z − σt)²) dz.

The integrand is nothing but the probability density function of a normal random variable with mean σt and variance 1 (put σt in place of μ and 1 in place of σ), so the integral is simply equal to 1. Therefore the MGF of a normal distribution with mean μ and variance σ² is

M_X(t) = e^(μt + σ²t²/2).
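As a check on this formula, here is a short sketch (with arbitrary μ, σ and t) comparing the integral E(e^(tX)) against e^(μt + σ²t²/2):

```python
# Checking the normal MGF E[e^{tX}] = e^{mu t + sigma^2 t^2 / 2} by numerical
# integration (a sketch; mu, sigma and the t values are arbitrary choices).
import numpy as np
from scipy import integrate

mu, sigma = 1.0, 0.8
pdf = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

for t in [-0.5, 0.3, 1.2]:
    mgf_num, _ = integrate.quad(lambda x: np.exp(t * x) * pdf(x), -np.inf, np.inf)
    mgf_formula = np.exp(mu * t + 0.5 * sigma**2 * t**2)
    print(t, mgf_num, mgf_formula)   # agree to high precision
```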
Using this we can prove the linearity property of the normal distribution. Suppose X ~ N(μ, σ²) and define Y = aX + b with a ≠ 0. The moment generating function of Y is

M_Y(t) = E(e^(tY)) = E(e^(t(aX + b))) = e^(bt) M_X(at).

Substituting M_X(at) = e^(μat + σ²a²t²/2), this gives

M_Y(t) = e^(bt) e^(aμt + a²σ²t²/2) = e^((aμ + b)t + (1/2)a²σ²t²),

which is the moment generating function of a normal distribution with mean aμ + b and variance a²σ². By the uniqueness property of the MGF we conclude that Y ~ N(aμ + b, a²σ²).

For any random variable X the mean transforms linearly and the variance picks up the square of a, but the form of the distribution itself may change; for the normal distribution, however, any linear function of X is again normal. Taking the special case Z = (X − μ)/σ, we get Z ~ N(0, 1). This is called the standardized variable (or standardized value): subtract the mean and divide by the standard deviation, and the result has the N(0, 1) distribution. So from any normal distribution we can always shift to the standard normal.

This property is useful for evaluating probabilities related to a general normal distribution. The cumulative distribution function P(X ≤ x) is the integral from −∞ to x of (1/(σ√(2π))) e^(−(u − μ)²/(2σ²)) du, which leads to an incomplete-gamma-type integral with no closed form, so evaluating it afresh for every μ and σ would be difficult. Using the linearity property, however,

P(X ≤ x) = P(Z ≤ (x − μ)/σ) = Φ((x − μ)/σ).

Tables of Φ are available in almost all statistical texts: for different values of z, the probability up to z under the standard normal curve is tabulated. For example, the probability up to 0 is 1/2, and the probability up to +1 is Φ(1) = 0.8413.

Looking at these tables we come across some very interesting phenomena of the normal distribution, which I will show you now. Take Φ(1) = 0.8413, and recall the relationship Φ(x) = 1 − Φ(−x). Then

P(−1 ≤ Z ≤ 1) = Φ(1) − Φ(−1) = 2Φ(1) − 1 = 2(0.8413) − 1 = 0.6826.

Writing Z = (X − μ)/σ, this says

P(μ − σ ≤ X ≤ μ + σ) = 0.6826.

This is interesting: draw the normal curve centred at μ and mark the zone from μ − σ to μ + σ.
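In place of tables one can of course compute Φ directly; a minimal sketch (with illustrative μ, σ and x) of the standardization P(X ≤ x) = Φ((x − μ)/σ):

```python
# Computing P(X <= x) for X ~ N(mu, sigma^2) via the standard normal CDF
# Phi((x - mu)/sigma) (a sketch; the numerical values are illustrative).
from scipy.stats import norm

mu, sigma = 10.0, 2.0
x = 12.0
z = (x - mu) / sigma                     # standardized value, here z = 1
print(norm.cdf(z))                       # Phi(1) = 0.8413...
print(norm.cdf(x, loc=mu, scale=sigma))  # same probability, without standardizing
print(2 * norm.cdf(1) - 1)               # P(mu - sigma <= X <= mu + sigma) = 0.6826...
```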
What we are saying is that more than 68 percent of the probability is concentrated in the zone from μ − σ to μ + σ. Likewise, from the tables of the standard normal distribution, Φ(2) = 0.9772. Considering

P(μ − 2σ ≤ X ≤ μ + 2σ) = 2(0.9772) − 1 = 0.9544,

multiplying by two and subtracting one as before, we conclude that more than 95 percent of the probability lies between μ − 2σ and μ + 2σ. Finally, the tables give Φ(3) = 0.9987, so

P(μ − 3σ ≤ X ≤ μ + 3σ) = 2(0.9987) − 1 = 0.9974,

that is, more than 99 percent of the probability lies within 3σ of the mean, between μ − 3σ and μ + 3σ.

In industrial quality control, when we look at the quality of items, we try to ensure that most of the material lies between μ − 3σ and μ + 3σ. For a very long time these 3-sigma limits have been found very useful in industry, and one says that the process is under control if it stays within the 3-sigma limits. Nowadays there is a further refinement: 6-sigma limits are considered in place of 3-sigma, because with 6-sigma the probability of inclusion becomes about 0.999998, so the probability of falling outside is of the order of one in a million.

To conclude the discussion of the normal distribution — I will say more in the following lecture — let us look at its measures of skewness and kurtosis. You can easily check from the calculation, since we have already computed the general central moment μₖ, that μ₃ = 0, so the measure of skewness is 0. For the measure of kurtosis we need μ₄, which can be checked to equal 3σ⁴. The measure of kurtosis is then

μ₄/μ₂² − 3 = 3σ⁴/σ⁴ − 3 = 0.

When I discussed the peakedness (kurtosis) of a distribution, I mentioned the normal peak, the higher peak that we call leptokurtic, and the flatter peak that we call platykurtic. Here we can see that the "normal peak" is precisely the peak of the normal curve.

Why we actually call it the normal distribution I will cover in the next lecture. What we have observed here is that the normal distribution satisfies a linearity property, it is symmetric, and probabilities connected to any normal distribution can be calculated using standard normal probabilities, which are available in the form of tables. In the following lecture we will show that the normal distribution arises naturally as the limiting distribution of various distributions; those results are known as the central limit theorem. I will cover that in the next lecture.
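Finally, a small sketch confirming the 1σ/2σ/3σ probabilities and the zero skewness and zero excess kurtosis of the normal distribution (the probabilities are the same for any μ and σ):

```python
# The 1-2-3 sigma rule and the normal skewness/kurtosis via scipy (a sketch).
from scipy.stats import norm

for k in [1, 2, 3]:
    print(k, 2 * norm.cdf(k) - 1)    # 0.6827, 0.9545, 0.9973

# mu_3 = 0 (skewness 0) and mu_4 = 3 sigma^4 (excess kurtosis 0)
print(norm.stats(moments='sk'))      # (0.0, 0.0)
```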