So, after discussing the normal random variable, today we will talk about the exponential random variable. A continuous random variable X whose pdf, for some lambda greater than 0 (please note that the parameter has to be positive), is f(x) = lambda e^(-lambda x) for x non-negative and 0 otherwise is said to be exponentially distributed, or to have an exponential distribution, whichever way you want to say it. So, again we validate that this is indeed a pdf. It is non-negative, since lambda is positive and the exponential is always positive. And the integral from 0 to infinity of lambda e^(-lambda x) dx is -(lambda/lambda) e^(-lambda x) evaluated from 0 to infinity. At infinity the exponential is 0, at 0 it is 1, and with the minus sign, minus minus is plus, so this is equal to 1. So, this is indeed a pdf. Then, you want to compute its distribution function. This is F(a) = P(X <= a), which is the integral of the pdf from 0 to a, and this comes out to be 1 - e^(-lambda a) for a greater than or equal to 0, because our variable itself is non-negative. Now, we verify the conditions that a cdf must satisfy. The limit of F(a) as a goes to plus infinity is 1: as a goes to infinity, e^(-lambda a) goes to 0, so this reduces to 1. I should also have written down the limit of F(a) as a goes to minus infinity. We have anyway said that for x less than 0 there is no mass, so if a is less than 0 the integral is 0, that is, F(a) = 0. So, the limit of F(a) as a goes to minus infinity is 0. Then, F is monotonically increasing.
So, you take the derivative of F, which is F'(x); treat a as x, it does not matter. If you differentiate 1 - e^(-lambda x), the -lambda comes down, and minus minus is plus, so F'(x) = lambda e^(-lambda x). And since lambda is greater than 0, this is positive for x non-negative, and so the function is monotonically increasing there. So, all the properties of a cumulative distribution function are satisfied. Then, we find the expectation of the random variable, and that will be lambda times the integral from 0 to infinity of x e^(-lambda x) dx. By integration by parts, I treat x as the first function. So, this will be [-x e^(-lambda x)] evaluated from 0 to infinity, plus the integral from 0 to infinity of e^(-lambda x) dx. Now, you see that at 0, x is 0 and e^(-lambda x) is 1, so the product is 0. At infinity, if you write the product as x / e^(lambda x), then e^(lambda x) goes to infinity much faster than x does, so the ratio tends to 0. So, there is no contribution from this term and you are left with just the integral. So, again, when you integrate you get [-(1/lambda) e^(-lambda x)] from 0 to infinity, and that gives you 1/lambda. So, the way to remember is that if lambda is the parameter of the exponential distribution, and it is defined in this way, then the inverse of the parameter is your mean, or the expectation. In some places people define the pdf as (1/lambda) e^(-x/lambda). In that case your expectation will become lambda. So, it is the inverse of the parameter: whatever you use for defining the pdf, the inverse of that will come out to be your mean. So, then to find the variance you find the expectation of X squared. I have not done the calculations here, but in the same way you will have to do two iterations of the integration by parts.
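The normalisation and the mean 1/lambda can also be checked numerically. The following is a minimal sketch, not part of the lecture: the rate `lam = 0.5`, the truncation point `upper` and the helper `integrate` (a plain midpoint rule) are all illustrative assumptions. The second moment is included because it is what the variance calculation needs.

```python
import math

def exp_pdf(x, lam):
    """Exponential density lam * exp(-lam * x) for x >= 0, and 0 otherwise."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

lam = 0.5            # illustrative rate (an assumption, not from the lecture)
upper = 60.0 / lam   # the tail beyond this point is of order e^-60, negligible

total = integrate(lambda x: exp_pdf(x, lam), 0.0, upper)           # close to 1
mean = integrate(lambda x: x * exp_pdf(x, lam), 0.0, upper)        # close to 1/lam
second = integrate(lambda x: x * x * exp_pdf(x, lam), 0.0, upper)  # E[X^2]
variance = second - mean ** 2

print(total, mean, variance)
```

With this choice of `lam` the printed mean should be close to 2 and the variance close to 4, in line with 1/lambda and 1/lambda^2.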
Here x squared is the first function and the exponential is the second function, and you continue doing it. So, E[X^2] comes out to be 2/lambda^2 and, therefore, the variance is 2/lambda^2 - 1/lambda^2, which is equal to 1/lambda^2. So, a quick example. Normally this distribution is associated with service times. You know, when you go to any public place where there is a service counter, for example a post office or railway booking and so on (now, of course, it is mostly online, but still people have to go to counters for all these services), the time that the clerk takes to service a customer is most of the time a random variable, and exponential variables do model that situation in quite a few cases. So, suppose the length of service at a post office counter, in minutes, is an exponential random variable with parameter lambda = 1/15. So, immediately you can say that the expected duration of service to a customer is 15 minutes, because the expected value of this random variable is 1/lambda, which is 15 minutes. If someone arrives immediately ahead of you at the counter, find the probability that you will have to wait (a) at least 15 minutes, and (b) between 15 and 30 minutes. For the first one, the event is X >= 15, where X is the time you have to wait. So, P(X >= 15) is equal to 1 - P(X < 15). And we have already computed the distribution function for the exponential distribution, so this is 1 - (1 - e^(-15/15)), because your lambda is 1/15. So, this probability comes out to be e^(-1), which is 0.368.
As I have said, because we have already made this computation, this is 1 - (1 - e^(-15 lambda)) with lambda = 1/15, so it is equal to e^(-1), which is 0.368. Similarly, for the second one we want P(15 <= X <= 30), which means your waiting time is now between 15 minutes and 30 minutes. By the same computation this is F(30) - F(15), which is (1 - e^(-2)), because 30/15 is 2, minus (1 - e^(-1)). And whether you write less than 15 or less than or equal to 15 makes no difference for a continuous distribution. So, the probability is e^(-1) - e^(-2), which is 0.233. Now, I want to show you another important property of the exponential distribution. So, first of all we will talk about the memoryless property. We say that a random variable X has the memoryless property if P(X > s + t | X > t) = P(X > s) for all s and t non-negative, which means that it does not matter how long you have already waited. If you have already waited for time t, then the probability of having to wait a further s is the same as the probability of waiting more than s from the start. So, the system we are modeling does not have a memory, and we are talking of a random variable whose distribution satisfies this condition. So, now I will show you that, among all continuously distributed random variables, the exponential distribution has the property of being memoryless.
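The two post-office probabilities can be reproduced directly from the distribution function F(a) = 1 - e^(-lambda a). This is a small sketch; the helper name `exp_cdf` is an assumption chosen for illustration.

```python
import math

def exp_cdf(a, lam):
    """Distribution function of the exponential: 1 - exp(-lam * a) for a >= 0."""
    return 1.0 - math.exp(-lam * a) if a >= 0 else 0.0

lam = 1.0 / 15.0  # service rate per minute, so the mean service time is 15 minutes

p_at_least_15 = 1.0 - exp_cdf(15.0, lam)              # P(X >= 15) = e^-1
p_15_to_30 = exp_cdf(30.0, lam) - exp_cdf(15.0, lam)  # F(30) - F(15) = e^-1 - e^-2

print(round(p_at_least_15, 3))  # 0.368
print(round(p_15_to_30, 3))     # 0.233
```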
There are distributions which are discrete and which also have the memoryless property, but among continuously distributed random variables it is the exponentially distributed one that has the memoryless property. So, in case this random variable is exponentially distributed with parameter lambda, you write down the expression P(X > s + t | X > t). Because t is smaller than s + t, the intersection of the two events X > s + t and X > t is just X > s + t. So, this reduces to P(X > s + t) divided by P(X > t). Now, by our definition F(a) is 1 - e^(-lambda a), so 1 - F(a) is simply e^(-lambda a). So, here it is e^(-lambda (s + t)) divided by e^(-lambda t), which is e^(-lambda s), which is equal to P(X > s). So, exponential random variables are memoryless. In fact, with a little more calculus you can show that if you impose this property, then only exponential random variables have it. Since this is a first course I am omitting the mathematics here, but those interested can sit down and work it out: when you start with this condition for a continuous random variable, you will see that the only pdf which satisfies it is the exponential pdf. So, let us again look at an example. Suppose that the number of hours before a transistor fails is exponentially distributed with mean 500 hours. So, here the mean is 500, which means lambda is 1/500. If a person desires to go on a long trekking trip of 300 hours and uses the transistor in his radio, then we want to find the probability that the transistor will not fail.
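The memoryless identity P(X > s + t | X > t) = P(X > s) is easy to spot-check numerically. The sketch below assumes the rate lambda = 1/15 from the post-office example; the helper names and the particular (s, t) pairs are illustrative choices, not from the lecture.

```python
import math

def exp_survival(a, lam):
    """P(X > a) = exp(-lam * a) for a >= 0 (and 1 for negative a)."""
    return math.exp(-lam * a) if a >= 0 else 1.0

def conditional_survival(s, t, lam):
    """P(X > s + t | X > t) = P(X > s + t) / P(X > t)."""
    return exp_survival(s + t, lam) / exp_survival(t, lam)

lam = 1.0 / 15.0
pairs = [(5.0, 0.0), (5.0, 10.0), (5.0, 100.0), (20.0, 3.0)]  # (s, t) values

for s, t in pairs:
    lhs = conditional_survival(s, t, lam)
    rhs = exp_survival(s, lam)
    # The time already waited, t, does not change the answer.
    print(s, t, lhs, rhs)
```

For each pair the last two printed numbers should agree up to floating-point rounding, whatever the value of t.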
So, what is the probability that the person will be able to complete his trip without having to replace the transistor? He wants the probability that the transistor will not fail during the 300-hour trekking trip that he is undertaking. So, the solution is that f(x) is (1/500) e^(-x/500) for x non-negative, because the problem says that the lifetime of the transistor in hours is exponentially distributed with mean 500. As I was telling you, if the mean is 500 then the parameter will be 1/500. So, that is the pdf of the random variable representing the lifetime of the transistor. And because the distribution is memoryless, it does not matter how long the transistor has already been in use. So, P(X > 300) is simply e^(-300/500), which is e^(-3/5), and you can compute this value from the tables. So, what can be said when the random variable is not exponentially distributed? In that case the memoryless property is not there, and so this will be simply a conditional probability in terms of capital F, namely (1 - F(t + 300)) / (1 - F(t)), where t is the number of hours the transistor has already been in use. So, you see the advantage of having a random variable which is memoryless. Now, another thing that is useful, and which again gets quite simplified for an exponential random variable, is the hazard rate function, which is sometimes also called the failure rate function. So, suppose X is the random variable representing the lifetime of some item; of course, it is a non-negative variable, with pdf given by small f and cdf by capital F. Then the hazard rate function, also called the failure rate function, is denoted by lambda(t) and is defined as lambda(t) = f(t) / (1 - F(t)), that is, the pdf divided by 1 minus the cdf. A simple explanation is possible. What we are looking at is the event that X belongs to the interval (t, t + delta t), where delta t is very small, given that the item has already worked for time t.
Now, what we are saying is that it will fail just after t. So, X lies in the interval (t, t + delta t), given that it has been functional till time t, where delta t is a very small positive quantity. When you take the intersection of these two events you get the event that X belongs to (t, t + delta t), so the conditional probability is P(X in (t, t + delta t)) divided by P(X > t), and by definition this is (F(t + delta t) - F(t)) / (1 - F(t)). Now, you divide by delta t and multiply by delta t. As delta t becomes smaller, the limiting value of (F(t + delta t) - F(t)) / delta t is the derivative of F at t. So, the probability becomes approximately F'(t) dt / (1 - F(t)), and F'(t) is your pdf f(t), so this is f(t) dt / (1 - F(t)), which is lambda(t) dt. So, that explains the definition of the failure rate: the item has been functional up to time t and then it fails just at time t. Now, if the distribution is exponential, then the pdf is f(t) = mu e^(-mu t); I have used the symbol mu because lambda is already being used here. And 1 - F(t) will be e^(-mu t). So, you see the ratio is mu, a constant. So, for an exponential random variable the hazard rate is a constant; it is not a function of t. Otherwise, as you see, it will be a function of t, so the hazard rate changes with time, as it should if you are talking of a lifetime. But for the exponential distribution the explanation is simple: since it is memoryless, the hazard rate function does not change; it is a constant.
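The interpretation of the hazard rate as a conditional failure probability per unit time, and the fact that it is constant for the exponential, can be illustrated as follows. This is a sketch under assumed values: the rate `mu = 0.2`, the step `dt` and the time points are arbitrary choices for illustration.

```python
import math

def exp_pdf(t, mu):
    """Exponential density mu * exp(-mu * t) for t >= 0."""
    return mu * math.exp(-mu * t) if t >= 0 else 0.0

def exp_cdf(t, mu):
    """Exponential distribution function 1 - exp(-mu * t) for t >= 0."""
    return 1.0 - math.exp(-mu * t) if t >= 0 else 0.0

def hazard(t, mu):
    """Hazard rate f(t) / (1 - F(t))."""
    return exp_pdf(t, mu) / (1.0 - exp_cdf(t, mu))

def failure_rate_estimate(t, dt, mu):
    """P(t < X <= t + dt | X > t) / dt, which tends to the hazard as dt -> 0."""
    return (exp_cdf(t + dt, mu) - exp_cdf(t, mu)) / ((1.0 - exp_cdf(t, mu)) * dt)

mu = 0.2                       # illustrative rate (an assumption)
times = [0.0, 1.0, 5.0, 10.0]  # ages at which the item is still working
dt = 1e-6

for t in times:
    # Both columns should stay close to mu regardless of t.
    print(t, hazard(t, mu), failure_rate_estimate(t, dt, mu))
```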
So, this is the rate of failure, and because the distribution is memoryless it does not matter how old the instrument is: the chance of its failing in the next instant is always the same, and so the rate is constant. Mu is therefore also referred to as the rate of the exponential distribution. So, mu is the parameter, it is also the rate of failure for an exponential distribution, and we saw that 1/mu is the mean of the exponential distribution. Now, given the hazard rate function lambda(t) of a continuous random variable, it is possible to determine its distribution. So, we will just show this. You see, lambda(t) I can write as F'(t) / (1 - F(t)), where F'(t) is d/dt of F(t). Then I take the integral of both sides and attach a minus sign. So, on one side there is minus the integral from 0 to x of lambda(t) dt, and on the other side the integral from 0 to x of -F'(t) dt / (1 - F(t)). Now, by the formula for integration, because the derivative of 1 - F(t) is -F'(t), this integral is ln(1 - F(t)) evaluated from 0 to x. When you put x here you get ln(1 - F(x)), and at 0, F(0) is 0, because the lifetime is non-negative and there is no mass below 0, so you get ln 1, which is 0. So, the left-hand side reduces to ln(1 - F(x)), and on the right-hand side we have minus the integral from 0 to x of lambda(t) dt. So, therefore, you can say 1 - F(x) = e^(-integral from 0 to x of lambda(t) dt), and from here F(x) = 1 - e^(-integral from 0 to x of lambda(t) dt). So, if I know lambda(t), then I can integrate it and my F(x) will be of this form.
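The formula F(x) = 1 - e^(-integral from 0 to x of lambda(t) dt) can be tried out numerically. In the sketch below, `cdf_from_hazard` is a hypothetical helper that integrates the hazard with a midpoint rule. The constant hazard 1/500 corresponds to the transistor example from earlier; the linear hazard 2t is an assumed second test case, chosen only because its integral x^2 is known in closed form.

```python
import math

def cdf_from_hazard(hazard, x, n=100_000):
    """F(x) = 1 - exp(-integral_0^x hazard(t) dt), using a midpoint rule."""
    if x <= 0:
        return 0.0
    h = x / n
    integral = h * sum(hazard((i + 0.5) * h) for i in range(n))
    return 1.0 - math.exp(-integral)

# Constant hazard 1/500: exponential lifetime with mean 500 hours.
survive_300 = 1.0 - cdf_from_hazard(lambda t: 1.0 / 500.0, 300.0)
print(round(survive_300, 4))  # 0.5488, i.e. e^(-3/5)

# Assumed linear hazard 2t: the integral is x^2, so F(x) = 1 - exp(-x^2).
f_linear = cdf_from_hazard(lambda t: 2.0 * t, 1.5)
print(f_linear, 1.0 - math.exp(-1.5 ** 2))
```

A constant hazard gives back the exponential distribution, while a hazard that grows with t gives a distribution with memory, in the sense that older items are more likely to fail soon.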
So, that means it is enough to know the hazard rate function of a random variable; from it you can determine its distribution. And once you know the cumulative distribution function you can determine the pdf also. Now, I will just illustrate the concept some more through an example. It is said that the death rate of a smoker is, at each age, twice that of a non-smoker; loosely, people say that your age gets reduced by half if you are a smoker compared to a non-smoker. So, what is the ratio of the probability of a non-smoker to that of a smoker of surviving up to the age of 60, given that both have survived up to 50 years? Maybe the ratio part is not important; all I am saying is, let us find the probability that a non-smoker will survive up to the age of 60 given that he or she has survived up to 50 years, and similarly the probability of a smoker surviving up to 60 years given that he has survived up to 50 years. So, I will define lambda_s(t) as the hazard rate of the smoker and lambda_n(t) as the hazard rate of the non-smoker. Now, instead of 50, let me just take some age a. So, what we want is the probability that a non-smoker's lifetime is more than 60 years given that it has been more than a. This is a conditional probability, and again, because 60 is more than a, the intersection of the two events is just that the lifetime is more than 60. So, it becomes (1 - F_n(60)) divided by (1 - F_n(a)). And we have just seen that 1 - F(x) is e^(-integral from 0 to x of lambda(t) dt). So, the numerator is e^(-integral from 0 to 60 of lambda_n(t) dt) and the denominator, 1 - F_n(a), is e^(-integral from 0 to a of lambda_n(t) dt).
So, this is what we just computed, and if you take the denominator upstairs, the integral from 0 to a is subtracted from the integral from 0 to 60. So, this will be e^(-integral from a to 60 of lambda_n(t) dt), and this let me call p_n. So, p_n is the probability of a non-smoker surviving up to the age of 60 given that he has already survived up to the age of a. Correspondingly, for a smoker the probability of surviving up to 60 years given that he has survived up to a years is e^(-integral from a to 60 of lambda_s(t) dt), and I am calling it p_s. Now, if the belief is that lambda_s(t) is twice lambda_n(t), that means the death rate is twice as high for a smoker compared to a non-smoker, then I substitute 2 lambda_n(t) for lambda_s(t) here. I have not written the step, but this is equal to (e^(-integral from a to 60 of lambda_n(t) dt)) squared, which is p_n squared. So, the effect is that the probability gets squared for a smoker: the probability of a smoker surviving up to the age of 60 is the square of the probability of a non-smoker surviving up to the age of 60. So, this 50 was not really important, because a could be anything here; that is why we said at each stage. So, it does not matter when you are making this comparison. However old both the people are, if you then ask for the probability of surviving up to 60, the probability for the smoker is the square of the probability for the non-smoker. Now, as I said, suppose you take lambda_n(t) to be 1/30. That means the hazard rate is now a constant.
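The squaring relation p_s = p_n^2 can be checked for the constant hazard 1/30 with a = 50. The sketch below is illustrative only: the helper `conditional_survival` and its midpoint integration are assumptions, and the constant hazard is the special case taken in the lecture rather than a realistic mortality model.

```python
import math

def conditional_survival(hazard, a, b, n=10_000):
    """P(X > b | X > a) = exp(-integral_a^b hazard(t) dt), using a midpoint rule."""
    h = (b - a) / n
    integral = h * sum(hazard(a + (i + 0.5) * h) for i in range(n))
    return math.exp(-integral)

def hazard_non_smoker(t):
    """Constant hazard 1/30, i.e. an exponentially distributed lifetime."""
    return 1.0 / 30.0

def hazard_smoker(t):
    """Twice the non-smoker hazard at every age t."""
    return 2.0 * hazard_non_smoker(t)

p_n = conditional_survival(hazard_non_smoker, 50.0, 60.0)
p_s = conditional_survival(hazard_smoker, 50.0, 60.0)

print(round(p_n, 4))  # 0.7165, i.e. e^(-1/3)
print(round(p_s, 4))  # 0.5134, i.e. e^(-2/3) = p_n squared
```

Because only the integral of the hazard from a to 60 enters, the relation p_s = p_n^2 holds for any hazard function, not just a constant one.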
So, therefore, I am taking the exponential situation, that means the lifetime is an exponentially distributed random variable. Then the probability that a non-smoker reaches the age of 60, given that he has reached 50, is e^(-integral from 50 to 60 of (1/30) dt). So, what will that integral be? It is 1/30 into 10, which is 1/3. So, this is e^(-1/3), which turns out to be 0.7165. That is for a non-smoker, and for a smoker it will be the square of this, which is 0.5134. So, see how fast the probability has reduced because the person is smoking. The probability of a smoker surviving up to the age of 60 is 0.5134, and for a non-smoker it is 0.7165. So, in that way one can have many more applications, and I may have put some problems related to the hazard rate function in your exercise 4 also.
Now, we continue with some more special continuous random variables, and the next one is the gamma distribution. X is said to have a gamma distribution with parameters alpha and lambda, where both the parameters have to be positive, if its pdf is f(x) = lambda e^(-lambda x) (lambda x)^(alpha - 1) / Gamma(alpha) for x non-negative, and 0 otherwise. The gamma function appears in the denominator, and that is why the name. So, let me show you the calculation for Gamma(alpha); let us compute its value for alpha greater than 1. The definition is Gamma(alpha) = integral from 0 to infinity of e^(-y) y^(alpha - 1) dy. We use integration by parts. The integral of e^(-y) is -e^(-y), so the first term is [-e^(-y) y^(alpha - 1)] evaluated from 0 to infinity, and then the second term becomes plus, because minus minus is plus.
So, this will be plus the integral from 0 to infinity of (alpha - 1) y^(alpha - 2) e^(-y) dy, because the derivative of y^(alpha - 1) is (alpha - 1) y^(alpha - 2). Now, let us compute the values of the first term at the two end points. For y equal to 0, y^(alpha - 1) is 0 because alpha is greater than 1, and that is why this condition is important: the power is positive. And of course, at 0, e^(-y) is 1. So, the product is 0. Again, when y goes to infinity, e^y goes to infinity faster than y^(alpha - 1), so the product goes to 0. So, there is no contribution from this term, and Gamma(alpha) reduces to just the integral, which by our notation is (alpha - 1) Gamma(alpha - 1). So, Gamma(alpha) = (alpha - 1) Gamma(alpha - 1). Now, if I go on doing this iteratively, the next iteration will bring in Gamma(alpha - 2), and so on. So, for positive integer values of alpha you will finally end up with just the integral from 0 to infinity of e^(-y) dy, which is 1, and so Gamma(alpha) reduces to (alpha - 1) factorial for alpha a positive integer. Now, it can also be shown that the gamma function is defined for alpha between 0 and 1; that is, the integral is defined and has a finite value.
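The recursion Gamma(alpha) = (alpha - 1) Gamma(alpha - 1) and the factorial formula can be checked numerically. In this sketch, `gamma_integral` is a hypothetical helper that approximates the defining integral by a midpoint rule on a truncated range; the value alpha = 3.5 is an arbitrary illustrative choice, and `math.gamma` from the Python standard library is used as the reference.

```python
import math

def gamma_integral(alpha, upper=60.0, n=200_000):
    """Midpoint approximation of the integral of exp(-y) * y**(alpha - 1) over [0, upper].

    Intended for alpha > 1; the tail beyond `upper` is of order e^-60 and is ignored.
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        y = (i + 0.5) * h
        total += math.exp(-y) * y ** (alpha - 1)
    return h * total

alpha = 3.5  # illustrative non-integer value (an assumption)
lhs = gamma_integral(alpha)
rhs = (alpha - 1) * gamma_integral(alpha - 1)
print(lhs, rhs, math.gamma(alpha))  # all three should agree closely

# For positive integers, Gamma(n) equals (n - 1) factorial.
for k in range(1, 11):
    print(k, math.gamma(k), math.factorial(k - 1))
```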
So, for all values of alpha between 0 and 1 the integral is also defined, that means Gamma(alpha) is defined for alpha between 0 and 1. One important value is Gamma(1/2), which is root pi, and we will obtain such values later on in the forthcoming chapters, when I talk about fractional values of alpha. And there are tables available for Gamma(alpha) at positive fractional values of alpha. Now, for alpha equal to 1 the gamma distribution reduces to the exponential distribution. See, our gamma pdf is lambda e^(-lambda x) (lambda x)^(alpha - 1) / Gamma(alpha). When you put alpha equal to 1, the term (lambda x)^(alpha - 1) is gone, and Gamma(1) is 1. So, the gamma pdf reduces to lambda e^(-lambda x) for x non-negative. So, for alpha equal to 1 the gamma distribution reduces to the exponential distribution. This is one relationship, and later I will show you some other relationships between these distributions. Now, you want to again check that this pdf is a valid pdf, and therefore I have to show that its integral evaluates to 1. So, here you put lambda x equal to y, then lambda dx is equal to dy. Immediately the integrand reduces to e^(-y) y^(alpha - 1) / Gamma(alpha), with lambda dx replaced by dy. The integral of e^(-y) y^(alpha - 1) is Gamma(alpha), and you divide by Gamma(alpha), so this is equal to 1. So, the validation is complete; this is a valid pdf. Now you want to compute the expectation of this gamma random variable, so you have to multiply the pdf by x and integrate. But it is easy to manipulate. I absorb the x into (lambda x)^(alpha - 1); that needs a lambda also, so I divide by lambda. Then I multiply and divide by alpha, so that in the denominator alpha Gamma(alpha) becomes Gamma(alpha + 1) by our recursion.
So, the denominator will be Gamma(alpha + 1) and the power term will be (lambda x)^alpha. That means the integrand is the pdf of a gamma distribution whose parameters are alpha + 1 and lambda. Since this is again a pdf, for gamma(alpha + 1, lambda), it will integrate to 1, and you will be left with alpha/lambda. So, the expected value of a gamma(alpha, lambda) variable is alpha/lambda. That is, if the convention is to write the gamma distribution as (alpha, lambda), then the mean is the first parameter divided by the second. Similarly, the expectation of X squared can be computed immediately by simple manipulation. I need lambda squared here to absorb x squared into the power term, so I divide by lambda squared, and I multiply and divide by alpha (alpha + 1) to make the denominator Gamma(alpha + 2), and the power term will be (lambda x)^(alpha + 1). So, again this is the pdf of gamma(alpha + 2, lambda), which will integrate to 1, and I will be left with alpha (alpha + 1) / lambda^2. And so the variance will be this quantity minus (alpha/lambda) squared, and that gives me alpha/lambda^2. So, simple calculations give you the required quantities. Now, let me just show you an application. In this application I will be using a concept which we have yet to do, but that does not matter; I still thought that this would be a good time to mention it. Here we are considering the case when alpha is equal to n, a positive integer. Gamma distributions arise as the distribution of the time one has to wait until a total of n events have occurred. So, it is like going to a railway ticket booking counter: you have people ahead of you in the queue, and for each person, as I told you, we will treat the service time as a random variable.
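The mean alpha/lambda and variance alpha/lambda^2 derived above can be confirmed by integrating the gamma pdf numerically. This is a sketch with assumed illustrative parameters alpha = 3 and lambda = 2; `gamma_pdf` and `integrate` are hypothetical helper names, and `math.gamma` supplies the normalising constant.

```python
import math

def gamma_pdf(x, alpha, lam):
    """Gamma density lam * exp(-lam x) * (lam x)**(alpha - 1) / Gamma(alpha) for x >= 0."""
    if x < 0:
        return 0.0
    return lam * math.exp(-lam * x) * (lam * x) ** (alpha - 1) / math.gamma(alpha)

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

alpha, lam = 3.0, 2.0  # illustrative parameters (assumptions)
upper = 40.0           # the tail beyond this point is negligible for these values

total = integrate(lambda x: gamma_pdf(x, alpha, lam), 0.0, upper)
mean = integrate(lambda x: x * gamma_pdf(x, alpha, lam), 0.0, upper)
second = integrate(lambda x: x * x * gamma_pdf(x, alpha, lam), 0.0, upper)
variance = second - mean ** 2

print(total)     # close to 1
print(mean)      # close to alpha / lam = 1.5
print(variance)  # close to alpha / lam**2 = 0.75
```

Setting `alpha = 1.0` in `gamma_pdf` gives back the exponential density lambda e^(-lambda x), matching the special case noted in the lecture.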
And remember, I had shown you that the service time, being a random variable, can be modeled as an exponential random variable. So, for each person the service time is a random variable, and since they are different people, independent people, each one gets serviced independently. So, the total time would be the sum of that many independent, exponentially distributed random variables; this is the idea. What I am trying to show you is that the gamma variable is actually the time one has to wait till all the people ahead of you have been serviced and you have also been serviced. So, when I am saying alpha equal to n, I am counting that you also have been serviced, so that a total of n events have occurred. Of course, later on when we do the Poisson process and so on, the whole thing will become much more clear, but you can already get a feeling for the application that I am trying to discuss here. So, now let T_n be the time at which the nth event occurs. Then the event T_n <= t holds if and only if all the n events have occurred by time t, which in our particular case means all n people have been serviced at the railway ticket booking counter by time t. Now, let N(t) be the number of events in the interval [0, t]. In this particular case, if there are n - 1 people ahead of you and you are the nth person, and the time span that you are taking is 0 to t, then that many events should have occurred in the time 0 to t; this is how we look at it.
So, in this case there are n events in this time, and so if you want to compute the probability P(T_n <= t), this is the same as P(N(t) >= n), because there must be at least n events in [0, t]. There can be more, but when you are talking of n people being serviced within this particular span of time, then at least that many events must have occurred. And because people arriving or being serviced are discrete events, this probability is the sum over j from n to infinity of P(N(t) = j), the probability that there are exactly j events in the time [0, t]. Now, very often when you talk of discrete events occurring in a span of time, under certain conditions it can be shown that these will be Poisson arrivals, which means the number of events in a span of time follows a Poisson distribution. This I will prove, hopefully, when we are talking about stochastic processes, and then I will discuss in detail how, under certain assumptions, you arrive at this probability. So, P(N(t) = j) = e^(-lambda t) (lambda t)^j / j!. Essentially N(t) is a Poisson random variable whose mean value is lambda t, because you are taking the span [0, t]. This will be explained later on. So, this is your Poisson probability, and you sum it up from j equal to n to infinity. Now, this sum is the cumulative distribution function of T_n. To find the pdf, f_{T_n}(t), I take the derivative, so we differentiate this expression with respect to t. Each term is a product, and if you first differentiate (lambda t)^j, then a lambda comes out and a j.
So, the derivative of the jth term is j lambda e^(-lambda t) (lambda t)^(j - 1) / j!, from differentiating (lambda t)^j, minus lambda e^(-lambda t) (lambda t)^j / j!, from differentiating e^(-lambda t), where (lambda t)^j remains intact. And here j varies from n to infinity. Now just rearrange things a little bit. In the first part the j cancels with the j factorial, so it becomes lambda e^(-lambda t) (lambda t)^(j - 1) / (j - 1)!. So, what you see is that the first part has the power j - 1 and the second part has the power j, and things will cancel out in pairs. When you put j equal to n + 1 in the first part, you get lambda e^(-lambda t) (lambda t)^n / n!, and when you put j equal to n in the second part, you get the same thing with a minus sign. So, the first part for j equal to n + 1 cancels with the second part for j equal to n, the first part for n + 2 cancels with the second part for n + 1, and this process goes on. Only the first part for j equal to n is left out, which is lambda e^(-lambda t) (lambda t)^(n - 1) / (n - 1)!. So, this is now a gamma pdf with parameters (n, lambda), since Gamma(n) is (n - 1) factorial. So, the amount of time a person has to wait till he is serviced, if there are n - 1 people ahead of him in the queue, is a random variable, and we have just shown that when the events are Poisson it has a gamma distribution with parameters (n, lambda). In this case it is also referred to as the n-Erlang distribution; that is another name in the literature, and some books may refer to this distribution as n-Erlang.
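The telescoping argument says that the derivative of P(N(t) >= n) with respect to t is the gamma(n, lambda) density. A numerical spot check is sketched below under assumed values n = 4, lambda = 0.5 and t = 6. The tail sum is computed as one minus the finite sum over j < n, which is the same quantity as the infinite sum from j = n; the function names are illustrative.

```python
import math

def poisson_pmf(j, mean):
    """P(N = j) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** j / math.factorial(j)

def erlang_cdf(t, n, lam):
    """P(T_n <= t) = P(N(t) >= n) = 1 - sum over j < n of P(N(t) = j)."""
    return 1.0 - sum(poisson_pmf(j, lam * t) for j in range(n))

def erlang_pdf(t, n, lam):
    """Gamma(n, lam) density: lam * exp(-lam t) * (lam t)**(n - 1) / (n - 1)!."""
    return lam * math.exp(-lam * t) * (lam * t) ** (n - 1) / math.factorial(n - 1)

n, lam, t = 4, 0.5, 6.0  # illustrative values (assumptions)
h = 1e-5

# Central-difference derivative of the cdf, compared with the closed-form pdf.
numeric_pdf = (erlang_cdf(t + h, n, lam) - erlang_cdf(t - h, n, lam)) / (2.0 * h)
print(numeric_pdf, erlang_pdf(t, n, lam))
```

With n = 1 the same functions reduce to the exponential case: `erlang_cdf(t, 1, lam)` is 1 - e^(-lambda t).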
So, we have got some feeling for the gamma distribution, and as we go on I will give you some more insight into it. Now, the other continuous distribution of importance is the beta distribution. A random variable $X$ has a beta distribution if its pdf is $f_X(x) = \frac{1}{B(a, b)} x^{a-1} (1 - x)^{b-1}$ for $0 < x < 1$, and 0 otherwise, where $a$ and $b$ are both positive. Here $B(a, b)$ is the beta function, which denotes the integral $B(a, b) = \int_0^1 x^{a-1} (1 - x)^{b-1}\, dx$. Just as for the gamma distribution, for $a \ge 1$ and $b \ge 1$ you can show by integration by parts that the integral converges, and for integer values, just like the computation we did for the gamma distribution, it can be shown that $B(a, b) = \Gamma(a)\Gamma(b) / \Gamma(a + b)$. For fractional values of $a$ and $b$ also it can be shown that the integral is defined and equal to $\Gamma(a)\Gamma(b) / \Gamma(a + b)$, so it is defined for all positive $a$ and $b$. So, I should correct my statement: the beta pdf is the function $x^{a-1}(1-x)^{b-1}$ divided by this number, that is, the constant in front is $\Gamma(a + b) / (\Gamma(a)\Gamma(b))$, and therefore the integral of the pdf turns out to be 1. Now, there are many useful applications of the beta distribution, and one is that it models random phenomena whose set of possible values is some finite interval $[c, d]$.
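The identity $B(a, b) = \Gamma(a)\Gamma(b)/\Gamma(a + b)$ quoted above can be checked numerically. This is a rough sketch, not a proof: the function names are hypothetical, the integral is approximated with a plain midpoint rule, and the parameter values are chosen with $a, b \ge 1$ so that the integrand is bounded on $[0, 1]$.

```python
import math

def beta_integral(a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of x^(a-1) (1-x)^(b-1) over [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h  # midpoint of the i-th subinterval
        total += x ** (a - 1) * (1 - x) ** (b - 1)
    return h * total

def beta_via_gamma(a, b):
    """Gamma(a) * Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

for a, b in [(3, 4), (2.5, 3.5)]:  # illustrative parameter values
    print(a, b, beta_integral(a, b), beta_via_gamma(a, b))
```

For parameters below 1 the integrand blows up at an endpoint, and a midpoint rule this simple converges slowly there, so the sketch stays with $a, b \ge 1$.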
So, all possible values of this random phenomenon occur within a set interval $[c, d]$, but here we have defined the variable to be from 0 to 1. By scaling and shifting we can transform this interval to $[0, 1]$, and one obvious transformation is $Y = (X - c)/(d - c)$. Then, for all possible values of $X$ within $[c, d]$, the corresponding variable $Y$ takes values between 0 and 1. We will see further applications of the beta distribution and compute the other quantities related to it. Now, I am just trying to give you pictures of the graph of the beta pdf for different values of the parameters. When $a = b$ the graph of the beta pdf is symmetric; for example, for $a = 3$ it will be something like this, and as $a$ becomes bigger the mass gets concentrated, the graph becomes narrower, and it stays symmetric. So, if you draw it for $a = 10$, probably it will be something like this, the peak will be higher, and so on. For $a \ne b$ the graph is asymmetric, and here it is skewed towards the left. For example, for $a = 1/2$ it is pushed almost onto the y-axis, and as $a$ increases the skewness again shifts towards the center. Of course, the graphs are not drawn to scale, but in any case $a/(a + b) = 1/20$ here. So, in this situation, if $a$ is 6, say, you can find out what the value of $b$ is, and so on. Now, there are situations, for example, where you have a big project made up of a large number of jobs, and a project of this kind may never have been handled before.
So, there is a lot of uncertainty about the job completion times, and the project manager has to have some estimate of how long it will take for the whole project to be completed, which means he or she must have a good idea of how long it will take for each job to be completed. This has to be done in the absence of any previous experience, because the jobs have not been performed before. For example, and this is now a very old example, when they were trying to put a man on the moon it was a completely new project; all the jobs that made up the project were new. So, people had no idea how long it would take for the jobs to be completed, but certainly each completion time lies in a certain finite span. Also, the normal distribution, for example, is symmetric, but here there was no reason to believe that the completion time distributions would be symmetric. So, beta distributions fitted the bill very well, because here was a distribution which has a finite span, even though it is continuous, and which need not be symmetric. So, for huge projects the time estimates were made using beta distributions. These are interesting applications, where job completion times are not predictable and you have no previous idea about them. Then, again by integration by parts, you can show that the expected value is $E[X] = a/(a + b)$ and the variance is $\mathrm{Var}(X) = ab / \big((a + b)^2 (a + b + 1)\big)$. If one gets time, I can give you an idea of how the time estimates are done using beta distributions. Now, as we go along we also need to keep coming back to distributions of a function of a random variable; I will do some sample functions here and then try to give you a general result.
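The mean and variance formulas for the beta distribution, together with the rescaling from $[0, 1]$ to a finite interval $[c, d]$, can be tried out in a small simulation. This is only a sketch under assumptions: the parameters `a`, `b` and the interval `c`, `d` (thought of as a shortest and a longest completion time) are invented for illustration, the function names are hypothetical, and the samples come from the standard library's `random.betavariate`.

```python
import random

def beta_mean_var(a, b):
    """E[X] = a/(a+b) and Var(X) = ab / ((a+b)^2 (a+b+1)) for X ~ Beta(a, b)."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, var

def rescaled_mean_var(a, b, c, d):
    """Mean and variance of T = c + (d - c) X, which takes values in [c, d]."""
    mean, var = beta_mean_var(a, b)
    return c + (d - c) * mean, (d - c) ** 2 * var

def simulate(a, b, c, d, size=200_000, seed=0):
    """Sample mean and sample variance of simulated completion times."""
    rng = random.Random(seed)
    samples = [c + (d - c) * rng.betavariate(a, b) for _ in range(size)]
    m = sum(samples) / size
    v = sum((s - m) ** 2 for s in samples) / (size - 1)
    return m, v

a, b, c, d = 2.0, 5.0, 10.0, 30.0  # made-up values: a job taking between 10 and 30 days
print(rescaled_mean_var(a, b, c, d))
print(simulate(a, b, c, d))
```

The simulated values should be close to the formulas but not identical, since they depend on the sample size and the seed.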
So, suppose $X$ is uniform on $(0, 1)$, which means $X$ takes non-negative values. If you define $Y = X^n$, then obviously $X^n$ also remains in the interval 0 to 1, so the range of $Y$ is between 0 and 1. To find the pdf of $Y$, note that the event $\{Y \le y\}$ is the same as $\{X^n \le y\}$, so $P(Y \le y) = P(X^n \le y) = P(X \le y^{1/n})$. Remember, taking the $n$th root keeps the inequality the same because the values here are non-negative. So $F_Y(y) = F_X(y^{1/n})$, and if you differentiate both sides with respect to $y$, the left side gives the pdf of $Y$ and the right side gives $f_X(y^{1/n})$ times the derivative of $y^{1/n}$, which is $\frac{1}{n} y^{1/n - 1}$, for $y$ between 0 and 1. For a uniform random variable $f_X$ is equal to 1 there, and therefore the pdf of $Y$ reduces to $f_Y(y) = \frac{1}{n} y^{1/n - 1}$ when $y$ is between 0 and 1, and 0 otherwise. Take another function, $Y = X^2$, and in this case you are not saying that $X$ can only take non-negative values; $X$ can take negative values also. Then, when you write down the event, $P(Y \le y) = P(X^2 \le y) = P(-\sqrt{y} \le X \le \sqrt{y})$ for $y \ge 0$. If I had put the restriction that $X$ has to be non-negative, it would have been just the part $X \le \sqrt{y}$, and the lower part would not have been there. But since I am allowing $X$ to take both positive and negative values, this is the right event, and writing it down in terms of the cumulative distribution function it is $F_Y(y) = F_X(\sqrt{y}) - F_X(-\sqrt{y})$.
Then we differentiate again with respect to $y$. Now $\frac{d}{dy} F_Y(y)$ is the pdf of $Y$; on the right, the first term gives $f_X(\sqrt{y})$ times the derivative of $\sqrt{y}$, which is $1/(2\sqrt{y})$. In the second term the two minus signs become a plus, because one minus comes from the derivative of $-\sqrt{y}$ and there is a minus there already, so we get plus $f_X(-\sqrt{y})$ into $1/(2\sqrt{y})$. So, the pdf for $Y = X^2$ is $f_Y(y) = \frac{1}{2\sqrt{y}} \left[ f_X(\sqrt{y}) + f_X(-\sqrt{y}) \right]$ for $y > 0$. Now, the third kind of function that I am looking at is $Y = |X|$, where $X$ has pdf $f_X$. In this case $Y$ has non-negative values even though $X$ has negative values. We write down the event: $P(Y \le y) = P(|X| \le y)$, which again can be written as $P(-y \le X \le y)$. The absolute value has to be at most $y$, and $y$ is a non-negative number, so even if $X$ is negative it has to be at least $-y$. If you are saying that $|X|$ should be less than or equal to 3, then $X$ cannot be $-4$, because the absolute value would be 4, which is not less than 3. So, the values of $X$ have to be between $-y$ and $y$. Therefore $F_Y(y) = F_X(y) - F_X(-y)$, and when you differentiate with respect to $y$ you get the pdf of $Y$, which is $f_Y(y) = f_X(y) + f_X(-y)$ for $y \ge 0$; again the minus sign gets converted to plus because there is a minus coming from the derivative of $-y$. One can go on, but I will just summarize all this in a theorem. Here $X$ is a continuous random variable and $f_X$ is its pdf, and suppose $g(x)$ is a strictly monotone function, increasing or decreasing, and differentiable. Strictly monotone means that the graph is either rising throughout, like this, or falling throughout, like this.
So, whether $g$ is monotonically decreasing or monotonically increasing, the random variable $Y = g(X)$ has a pdf given by this formula, and it will take a few minutes to prove the result. I just realized that in the statement of the theorem the absolute value sign is missing, but it is important, and I will show you why. So, when we look at the function $g(X)$ of the random variable $X$ and find the pdf of $Y$, the result is $f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right|$ when $y = g(x)$ for some $x$, and 0 if $y$ is not equal to $g(x)$ for any $x$. Of course, spurious values are not being considered here: since $g$ is a monotone function, the inverse gives exactly one $x$ for each such $y$, so I do not have to worry about extra values. Now let us look at the proof of the theorem. We start with the event $\{Y \le y\}$, which is equal to $\{g(X) \le y\}$, and this I can write as $\{X \le g^{-1}(y)\}$. This inequality is valid because we have assumed $g$ is an increasing function, so the direction of the inequality does not change. Therefore $F_Y(y) = F_X(g^{-1}(y))$. Differentiate both sides with respect to $y$: the left side is $f_Y(y)$, the pdf of $Y$, and on the right, when you differentiate $F_X$ you get $f_X$ at $g^{-1}(y)$ into the derivative of $g^{-1}(y)$. So $f_Y(y) = f_X(g^{-1}(y)) \frac{d}{dy} g^{-1}(y)$. Now, because $g$ is increasing, it is a result from calculus that $g^{-1}$ is also increasing and so has a non-negative derivative, and $f_X$, being the pdf of $X$, is non-negative. So, the product is non-negative, and therefore $f_Y$ is non-negative.
So, this satisfies the first condition, and of course you can verify that it is also a valid pdf, meaning the integral is 1. So, when you take an increasing function, because the derivative is non-negative I do not have to put the absolute value sign. But when $g$ is decreasing, the event $\{g(X) \le y\}$ transforms to $\{X \ge g^{-1}(y)\}$, and that is what I am showing you in the picture: if this is the function $g(x)$, then the values of $x$ beyond $g^{-1}(y)$ are the ones for which the function is less than or equal to $y$. So, the inequality reverses and becomes $X \ge g^{-1}(y)$. Therefore $F_Y(y) = 1 - F_X(g^{-1}(y))$, and differentiating both sides gives $f_Y(y) = -f_X(g^{-1}(y)) \frac{d}{dy} g^{-1}(y)$. Now, since $g$ is a decreasing function, this derivative is negative, so the two minus signs make the product non-negative. That is why it is important that we write the absolute value sign in the theorem: a pdf cannot be negative, that is the first condition, and with the absolute value the formula covers both cases. So, it is for completeness' sake that I wrote down this theorem. Normally what we do is write down the equivalent event when you take a function of a random variable, and then, differentiating both sides, get the pdf of the function of the random variable; but at times it also helps to be able to use the theorem. Now let me show you, as I was saying, that one can obtain results either directly or using the theorem.
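To see why the absolute value matters, here is a small sketch of the theorem for a decreasing function. Everything in it is an assumption made for illustration: $X$ is taken to be exponential with parameter `lam`, the function is $g(x) = e^{-x}$, so $g^{-1}(y) = -\ln y$ on $(0, 1)$, and the derivatives are computed numerically. It compares the theorem's formula with the derivative of $F_Y(y) = 1 - F_X(g^{-1}(y))$.

```python
import math

lam = 2.0  # illustrative parameter of the exponential distribution

def f_x(x):
    """pdf of X ~ exponential(lam)."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def cdf_x(x):
    """cdf of X ~ exponential(lam)."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0

def g_inv(y):
    """Inverse of the decreasing function g(x) = exp(-x), for 0 < y < 1."""
    return -math.log(y)

def derivative(f, y, h=1e-6):
    """Central-difference derivative of f at y."""
    return (f(y + h) - f(y - h)) / (2 * h)

def pdf_by_theorem(y):
    """f_X(g^{-1}(y)) * |d/dy g^{-1}(y)|."""
    return f_x(g_inv(y)) * abs(derivative(g_inv, y))

def pdf_from_cdf(y):
    """Derivative of F_Y(y) = 1 - F_X(g^{-1}(y)), the decreasing case of the proof."""
    return derivative(lambda u: 1.0 - cdf_x(g_inv(u)), y)

y = 0.4
print(pdf_by_theorem(y), pdf_from_cdf(y))
# Without the absolute value the same product comes out negative:
print(f_x(g_inv(y)) * derivative(g_inv, y))
```

In this particular example the pdf can also be worked out by hand as $\lambda y^{\lambda - 1}$ on $(0, 1)$, which gives a third value to compare against.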
So, if $X$ is an exponentially distributed random variable with mean $1/\lambda$, then we show that $E[X^k] = k!/\lambda^k$ for $k = 1, 2, \ldots$, that is, for any finite positive integer $k$. First I am giving you the direct solution; I should have called this solution 1. Here I know the pdf of $X$: we are told that the mean is $1/\lambda$, so the parameter is $\lambda$ and the pdf is $\lambda e^{-\lambda x}$. So $E[X^k] = \int_0^{\infty} x^k \lambda e^{-\lambda x}\, dx$, and we integrate by parts, treating $x^k$ as the first function. Integrating $\lambda e^{-\lambda x}$, the lambdas cancel and we get $-e^{-\lambda x}$ (I wrote $t$ here; it should be $\lambda x$), so $E[X^k] = \left[ -x^k e^{-\lambda x} \right]_0^{\infty} + k \int_0^{\infty} x^{k-1} e^{-\lambda x}\, dx$, where the two minus signs make the second term plus and the $k$ comes from differentiating $x^k$. The bracket is 0 at both limits. Then I multiply and divide by $\lambda$, so that the integral has the same form as the one we started with. Writing $I_k = \int_0^{\infty} \lambda e^{-\lambda x} x^k\, dx$, the remaining integral is $I_{k-1}$, and so $I_k = \frac{k}{\lambda} I_{k-1}$. Here, of course, $k$ is a positive integer, so if I go on doing this repeatedly, the expected value is $E[X^k] = \frac{k!}{\lambda^k} I_0$.
So, writing out $I_0$, this becomes $E[X^k] = \frac{k!}{\lambda^k} \int_0^{\infty} \lambda e^{-\lambda x}\, dx$, and this integral is 1, because it is the integral of the pdf of an exponential distribution with parameter $\lambda$; or you can show directly that it is 1. Therefore $E[X^k] = k!/\lambda^k$ is the answer. Now we want to use the theorem, which means you compute the pdf of the random variable $X^k$, the function of the random variable, and through that route compute the expected value. So, define $Y = X^k$. Then $P(Y \le y) = P(X^k \le y)$, and because everything is non-negative the inequality converts to $X \le y^{1/k}$, so $F_Y(y) = F_X(y^{1/k})$. The pdf of $Y$ is then $f_Y(y) = f_X(y^{1/k}) \cdot \frac{1}{k} y^{1/k - 1}$ for $y > 0$. So $E[Y] = \int_0^{\infty} y f_Y(y)\, dy$, and substituting for $f_Y$ in terms of $f_X$ this is $\int_0^{\infty} y\, \lambda e^{-\lambda y^{1/k}} \frac{1}{k} y^{1/k - 1}\, dy$. Now put $y^{1/k} = s$. Then $\frac{1}{k} y^{1/k - 1}\, dy = ds$, and $y = s^k$, so $E[Y] = \int_0^{\infty} s^k \lambda e^{-\lambda s}\, ds$. If I multiply and divide by $\lambda^k$ and combine $\lambda^k$ with $s^k$, this is $\frac{1}{\lambda^k} \int_0^{\infty} \lambda e^{-\lambda s} (\lambda s)^k\, ds$. We recognize the integrand as a gamma pdf with parameters $(k + 1, \lambda)$, except that it is not quite the pdf, because the pdf would have $\Gamma(k + 1)$ in the denominator.
So, this integral is equal to $\Gamma(k + 1)$, and then we divide it by $\lambda^k$. Since $k$ is a positive integer, $\Gamma(k + 1) = k!$, so this is $k!/\lambda^k$, which is what we got in the direct solution also. So, whichever is convenient, one can get the result either way.
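Both routes to $E[X^k] = k!/\lambda^k$ can be compared with a direct numerical integration. The sketch below makes several assumptions that are not part of the lecture: the function names are hypothetical, the integral is truncated at `60 / lam` (where the exponential factor is negligible) and approximated by a midpoint rule, and the values of `k` and `lam` are arbitrary.

```python
import math

def moment_by_recursion(k, lam):
    """Apply I_j = (j / lam) * I_{j-1} starting from I_0 = 1."""
    value = 1.0
    for j in range(1, k + 1):
        value *= j / lam
    return value

def moment_by_gamma(k, lam):
    """Gamma(k + 1) / lam^k, the result of the second solution."""
    return math.gamma(k + 1) / lam**k

def moment_by_quadrature(k, lam, steps=200_000):
    """Midpoint-rule estimate of the integral of x^k * lam * e^(-lam x) over [0, 60/lam]."""
    upper = 60.0 / lam  # truncation point; the remaining tail is negligible
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += x**k * lam * math.exp(-lam * x)
    return h * total

k, lam = 3, 2.0  # illustrative values only
print(moment_by_recursion(k, lam), moment_by_gamma(k, lam), moment_by_quadrature(k, lam))
```

All three numbers should agree up to the quadrature error, and for $k = 1$ they reduce to the mean $1/\lambda$.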