Now, I want to discuss how to find the distribution of a function of a random variable. Often you have some outcome of an experiment, which I will describe by a random variable X. But most of the time you are not interested in X itself, but in some function of X. For example, suppose the outcome can take both positive and negative values, but you do not care about the sign. Then you may want to define a new random variable Y = |X|: what matters is the magnitude, not the sign. So you are really interested in a function of the random variable. Or, alternatively, you may not want the outcome itself but the square of the outcome. How do we find the distribution of such random variables? Let us start with functions of a single random variable, and then move on to functions of a set of random variables. Suppose you have a random variable X, and I am interested in some function of X; call that function g, so Y = g(X). I will tell you the distribution of X, and I want you to give me the distribution of Y. How are we going to do that? If I were merely interested in the expected value of Y, that is easy: E[Y] = E[g(X)] = ∫ g(x) f_X(x) dx, assuming X is a continuous random variable with pdf f_X. So if you tell me the pdf and the function g, I can just compute this integral. But I am asking for more than this. You can find the expectation like this, but I do not want just the expectation; I am asking you to give me the actual distribution of Y itself. How are we going to do that?
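As a quick sanity check on the expectation formula E[g(X)] = ∫ g(x) f_X(x) dx, here is a minimal sketch in Python. The choices g(x) = x² and X ~ N(0, 1) are illustrative assumptions, not from the lecture; for them the exact answer is E[X²] = Var(X) = 1, so a sample average of g over draws of X should land close to 1.

```python
import random
import statistics

random.seed(0)

# Illustrative assumption: X ~ N(0, 1) and g(x) = x**2,
# so the exact value is E[g(X)] = E[X^2] = Var(X) = 1.
samples = [random.gauss(0.0, 1.0) ** 2 for _ in range(200_000)]
est = statistics.fmean(samples)
print(est)  # close to 1.0
```

This only checks the expectation; the rest of the lecture is about recovering the full distribution of Y, which the sample average alone does not give you.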
So, one thing to do is to start with the cdf of Y. For some c in R, I start with the probability that Y ≤ c. This is simply the probability that g(X) ≤ c. Suppose the function g is invertible and increasing; then this event is the same as X ≤ g⁻¹(c). Since c is a constant, and you know the function g and can invert it, you end up with another constant here. And if you already know the cdf of X, you can evaluate it there: F_Y(c) = P(Y ≤ c) = P(X ≤ g⁻¹(c)) = F_X(g⁻¹(c)). In general, what we have done is translate the condition on Y into a condition on X: we can write P(Y ≤ c) = P(X ∈ A), where the set A is defined in terms of the constant c and the function g. So far we have the cdf of Y; how do we get the pdf? Differentiate with respect to what? With respect to c. Assuming for the time being that F_Y is continuous and differentiable, you differentiate with respect to c and you get right away the pdf of Y, which is what you really wanted. Now, this was for the case when X is a continuous random variable; what if X is a discrete random variable? In that case you are first interested in the PMF of Y: p_Y(y) = P(Y = y) = P(g(X) = y), and this is the sum of p_X(x) over the set of all x such that g(x) = y.
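The discrete rule p_Y(y) = Σ over {x : g(x) = y} of p_X(x) can be sketched directly in code. The PMF below (X uniform on {−2, −1, 0, 1, 2}) and the choice g(x) = x² are illustrative assumptions; note how the two points x = −1 and x = +1 both contribute to p_Y(1), exactly the "many x map to the same y" situation discussed next.

```python
from collections import defaultdict

# Illustrative assumption: X uniform on {-2, -1, 0, 1, 2}
p_x = {x: 0.2 for x in (-2, -1, 0, 1, 2)}
g = lambda x: x * x  # illustrative choice of g

# p_Y(y) = sum of p_X(x) over all x with g(x) = y
p_y = defaultdict(float)
for x, p in p_x.items():
    p_y[g(x)] += p

print(dict(p_y))  # {4: 0.4, 1: 0.4, 0: 0.2}
```

The loop visits every x once and folds its mass into the bucket g(x), so no preimage is missed and none is counted twice.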
See, when I am writing a probability, we also have to be careful: am I computing this probability with respect to the random variable X, or with respect to Y, or jointly? In this case, when I write P(Y ≤ c), the probability is with respect to the distribution of Y; and when I replace Y by g(X), the probability is with respect to the distribution of X. So, to be more precise, we can subscript these probabilities as P_Y and P_X, but when it is obvious we will just drop the subscript. Now let us come back: the probability that Y takes the value y is, by the definition of Y, the probability that g(X) = y, and what I am interested in is the set of all x such that g(x) = y. It is not necessary that only one point maps to y when you apply the function g; there could be multiple points that map to the same y. For example, say this is your sample space Ω and X is your random variable. What X does is, for each ω in Ω, assign some value on the real line; that is the meaning of X. Now what we are further doing is applying g on top of this to get Y, which, for the same ω, gives the new value Y(ω). Just to be clear, what we mean by this is Y(ω) = g(X(ω)) for all ω. So when I apply this function g and look at some particular value y, it may happen that many values of x fall to the same y.
So, that is why I have to look for all x that map to the value y and add up all their probabilities; that gives me the probability that Y takes the value y. So this is the process: if I give you X with a certain pdf, cdf, or PMF and ask you to find the PMF or pdf of Y, these are the general steps; you basically translate the event about Y into an event about X, for some known set, and write it in this form. Now, where is this useful? Of course, it is useful whenever you want to find the pdf of a function of a random variable, but most often it comes to use when you want to simulate a random variable, that is, generate samples according to a given distribution. Many times you want to do some a priori analysis: you believe that your system has a certain cdf, but you cannot always work on the system itself. For example, your system could be a very costly aircraft carrier, or a big fighter aircraft. You cannot directly experiment on it; you have to simulate some of the things in your laboratory. So if you believe that some critical component behaves according to a certain distribution, then in your laboratory, or on your computer, you want to generate samples according to that distribution. How are you going to generate them? Here functions of random variables come in handy. Suppose you have a system which produces outcomes according to some distribution F. I want to generate samples that obey this cdf, or equivalently, construct a random variable that has this cdf. If I can set up an experiment that gives outcomes with the cdf you require, that means I am simulating the real thing that you wanted to characterize through F.
So, how do we generate a random variable which has this cdf F? Often we are given X and told what its cdf is. Now I am asking the reverse question: given the cdf, generate for me a random variable whose outcomes have this cdf. How are we going to do that? One simple approach uses uniform random variables. The claim is that using a uniform random variable and the description of the function F, you can generate a random variable whose cdf is exactly F. Suppose U is uniformly distributed on the interval (0, 1). My claim is that if I construct the random variable X = F⁻¹(U), then X has the cdf F. Let us first see why that is true, and then try to make it a bit more precise. F has been given to you; when I say F is a cdf, I expect it to satisfy all three properties of a cdf. What are those? Monotonicity, the limits being 0 and 1 at the two extremes, and right continuity. Let us say F already has those properties. Now my task is to generate a random variable with this cdf, and my claim is that X = F⁻¹(U) has exactly that cdf. Why is that? Here F⁻¹ plays the role of g, and U is a uniform random variable, which I already know how to generate. Let us compute: P(X ≤ c) = P(F⁻¹(U) ≤ c), and if you simplify this, it becomes P(U ≤ F(c)). Notice that F is a cdf, so its range lies between 0 and 1. So I am asking: what is the probability that a uniform random variable on (0, 1) takes a value less than or equal to F(c), which is a number between 0 and 1?
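The construction X = F⁻¹(U) can be sketched for a concrete target distribution. The exponential with rate λ is an illustrative choice (not from the lecture): its cdf F(x) = 1 − e^(−λx) inverts in closed form to F⁻¹(u) = −ln(1 − u)/λ, so samples built this way should have mean 1/λ.

```python
import math
import random
import statistics

random.seed(1)
lam = 2.0  # illustrative rate parameter

# Target cdf F(x) = 1 - exp(-lam * x), so F^{-1}(u) = -ln(1 - u) / lam
def sample_exp():
    u = random.random()           # U ~ Uniform(0, 1)
    return -math.log(1.0 - u) / lam

xs = [sample_exp() for _ in range(200_000)]
print(statistics.fmean(xs))  # close to 1/lam = 0.5
```

The only randomness consumed is a uniform draw; all the shaping into the target distribution is done by the inverse cdf, exactly as the claim says.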
It is simply F(c). So you see that the cdf of this random variable is exactly what you wanted; it is exactly the function F. Now, to do this I should be able to invert the function F, whatever F is given to me. Is it always possible to invert a cdf that has these three properties? Not quite; let us look at a simple example. Take an arbitrary cdf, say one that rises with some shape, jumps at a point, and then continues; at the jump it is right continuous. Now, how does the inverse look? If I want to find the inverse of F at some value y1 on the vertical axis, I draw a horizontal line at y1, find where it meets the curve, and come back down to a point x1 on the horizontal axis; this x1 is the inverse of y1. But how are you going to do this at a value y2 that falls inside the jump? The horizontal line at y2 never meets the curve, so it is not possible to directly define the inverse at y2; but to make the construction work, I have to assign some value to F⁻¹ there, so I need to define what my F⁻¹ function is. One possible way to overcome this is to define the generalized inverse F⁻¹(u) = min{x : F(x) ≥ u}. That is, you look at all the values x at which F(x) is greater than or equal to u, and among those you take the smallest one. Since F is monotonically increasing, this is well defined.
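The generalized inverse min{x : F(x) ≥ u} is exactly what a "find the first index where the cdf reaches u" search computes, which is how one samples from a cdf with jumps. The discrete distribution below (P(X=1)=0.2, P(X=2)=0.5, P(X=3)=0.3) is an illustrative assumption; its cdf is a pure staircase, so the plain inverse does not exist anywhere, yet the sketch still produces the right frequencies.

```python
import bisect
import random
from collections import Counter

random.seed(2)

# Illustrative discrete distribution: P(X=1)=0.2, P(X=2)=0.5, P(X=3)=0.3
values = [1, 2, 3]
cdf    = [0.2, 0.7, 1.0]   # F evaluated at each value (a staircase)

def f_inverse(u):
    # Generalized inverse: smallest x with F(x) >= u
    return values[bisect.bisect_left(cdf, u)]

n = 100_000
counts = Counter(f_inverse(random.random()) for _ in range(n))
print({v: counts[v] / n for v in values})  # approx {1: 0.2, 2: 0.5, 3: 0.3}
```

Every u that lands inside a jump of F gets mapped to the value where the jump occurs, which is why the jump heights come out as the sampling probabilities.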
So, you take all the points where F is at least u, and from that region you take the smallest one. If you do this, F⁻¹ is still well defined, and you can check that with this definition of the inverse everything goes through, okay, fine. So you see how functions of random variables come to help: if you want to simulate even a complicated random variable, and you can define an appropriate function g, then using a simple random variable, in this case a uniform random variable, you can generate a random variable that satisfies the more complex cdf characteristics. In the textbook there are many examples of how to derive the distribution of one random variable expressed as a function of another. One example you can work out: suppose you have a random variable defined as Y = tan(Θ), where Θ is distributed uniformly on (−π/2, π/2). You understand this? Θ is a uniform random variable that takes values between −π/2 and π/2; you apply the tan function to it and define the new random variable Y. How does this function look? On (−π/2, π/2), tan increases from −∞ to +∞. So what is the range of Y? It is (−∞, +∞), whereas the range of Θ is just (−π/2, π/2). Now, how do you find the distribution of Y here? Using the cdf method above, you can work out that the pdf is f_Y(c) = 1/(π(1 + c²)) for c in (−∞, ∞). In the literature this is called the Cauchy pdf.
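The tan-of-a-uniform example can be checked by simulation. One caveat worth building in: the Cauchy distribution has no mean, so a sample average is useless as a check; instead we compare the empirical cdf at one point against the closed-form cdf F_Y(c) = 1/2 + arctan(c)/π (the antiderivative of the pdf above), which gives F_Y(1) = 0.75. The sample size and the check point c = 1 are arbitrary choices for illustration.

```python
import math
import random

random.seed(3)
n = 200_000

# Theta ~ Uniform(-pi/2, pi/2); Y = tan(Theta) should be standard Cauchy
ys = [math.tan(random.uniform(-math.pi / 2, math.pi / 2)) for _ in range(n)]

# Cauchy cdf: F_Y(c) = 1/2 + arctan(c)/pi, so F_Y(1) = 0.75
emp = sum(y <= 1.0 for y in ys) / n
print(emp)  # close to 0.75
```

Comparing cdf values rather than moments is the robust way to test heavy-tailed outputs like this one.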
So, this was a function of one random variable. But it is not necessary that you will always be dealing with one random variable. As I said, if you are interested in the weather, you may care about temperature, humidity, the density of clouds, and so on. There is a whole bunch of random variables you have to deal with, and you may want to look at transformations of them. Here we had a transformation of a single random variable, but you may be interested in a transformation of a set of random variables. How to do that? When we have a set of random variables, all defined on the same probability space, I can treat them as a vector. Suppose I have m random variables x1, x2, all the way up to xm; I can treat them as a vector with m components, and I am going to represent that collection as a random vector, denoted X. I am going to use the same notation for a random vector that I sometimes use for a single random variable, so it should be clear from the context whether X is a single random variable or a random vector. Now, what is the distribution of this random vector? When we have a set of random variables, we have the joint probability density function, and we are going to take that joint density as the distribution of the random vector. I have already told you that m random variables are completely characterized by their joint pdf; I am now taking this joint pdf itself as the pdf of the vector, and that makes sense. In this case I will simply write f_X(x), where X is the random vector and x is the vector of values.
So, X is the vector of random variables, and x is the vector of realizations they take. Again, notice that I am simply writing X to denote the entire vector, and it should be clear from context whether I am talking about a single random variable or multiple random variables. Now, on this random vector I may be interested in the same question: if I have the pdf of the random vector X, and I am interested in another random vector Y = g(X), how do I find the distribution of Y? We already discussed how to do this when X has just a single component; but now X is a vector. For this I am just going to write down a formula expressed in terms of the Jacobian matrix. We will not derive it, but we will take it, make ourselves familiar with it, and do some exercises on how to use it. Assuming g is invertible, I can write x = g⁻¹(y), where y is again a vector, and the formula is f_Y(y) = f_X(g⁻¹(y)) |det J(y)|, where J(y) is the Jacobian matrix evaluated at y, and the two vertical bars indicate the absolute value of the determinant of this Jacobian matrix. Now, what is this Jacobian matrix? It records how each component of x varies as a function of each component of y: the (i, j) entry is J_ij = ∂x_i/∂y_j. So the first column is ∂x_1/∂y_1, ∂x_2/∂y_1, and so on. If you are going to use this formula, you have to construct a matrix like this, and it is going to be a square matrix.
So, we are also going to assume that y has the same dimension as x. Then you find the determinant of this matrix. You compute it for a given y: for that you first find x = g⁻¹(y), and then you evaluate the matrix at y. Let us quickly look at an example, not to see where the formula comes from, but just to get a grasp. Suppose we are in a Euclidean space, where we denote a given point through its Cartesian coordinates; in a two-dimensional space, any point is denoted (x1, x2). Now suppose you want to shift to polar coordinates. In polar coordinates, what components do you need? R and θ, and R and θ can be expressed in terms of x1 and x2; that is the map. So you are in Euclidean coordinates, and you want to go to polar coordinates, through some function. And if the Cartesian point has some distribution, you want to understand the distribution of the polar coordinates. This method is going to help you do that. So let us take the simple case where R = √(x1² + x2²), the radius, and θ = arctan(x2/x1), the angle. Here my x is (x1, x2) and my y is (R, θ), and my function g does this mapping: it takes x1 and x2 and gives you the radius R and the angle θ. So you have expressed the dependent variables in terms of the independent random variables here. I could also write it in the reverse form: how can I represent x1 and x2 in terms of R and θ?
x1 = R cos θ and x2 = R sin θ. Now, in this case my Jacobian matrix is simply a 2 × 2 matrix, because I have only two components. Can you quickly compute it and tell me what the Jacobian matrix looks like? Take y1 to be R and y2 to be θ. Differentiating, ∂x1/∂R = cos θ, ∂x1/∂θ = −R sin θ, ∂x2/∂R = sin θ, ∂x2/∂θ = R cos θ. (One has to be careful about whether a transpose is needed, but the determinant is the same either way.) Now, what is the determinant of this matrix? It is cos θ · R cos θ − (−R sin θ) · sin θ = R. So now, what is the quantity f_X(g⁻¹(y))? For a given y, we do not know it yet; I have not told you what f_X is. To compute it, I need to know the distribution of the underlying random variables. Suppose I assume that X1 is Gaussian with mean 0 and variance 1, assume the same for X2, and assume that X1 and X2 are independent. Then what is f_X(x1, x2)? Since they are independent, it is going to split: f_X(x1, x2) = f_{X1}(x1) f_{X2}(x2), and for each factor we know the Gaussian formula. Now, what are x1 and x2 if I tell you y, which for me is (R, θ)? I have already written what they are: x1 = R cos θ and x2 = R sin θ. So, finally, what is the formula going to look like for f_Y at some (r, θ)?
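The hand computation of the Jacobian determinant can be double-checked numerically with central finite differences on the inverse map (r, θ) ↦ (r cos θ, r sin θ); the test point (r, θ) = (2, 0.7) and the step size are arbitrary illustrative choices. The determinant should come out approximately equal to r.

```python
import math

def polar_to_cart(r, theta):
    # Inverse map g^{-1}: (r, theta) -> (x1, x2)
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(r, theta, h=1e-6):
    # Central differences for the 2x2 Jacobian d(x1, x2)/d(r, theta)
    dx1_dr = (polar_to_cart(r + h, theta)[0] - polar_to_cart(r - h, theta)[0]) / (2 * h)
    dx2_dr = (polar_to_cart(r + h, theta)[1] - polar_to_cart(r - h, theta)[1]) / (2 * h)
    dx1_dt = (polar_to_cart(r, theta + h)[0] - polar_to_cart(r, theta - h)[0]) / (2 * h)
    dx2_dt = (polar_to_cart(r, theta + h)[1] - polar_to_cart(r, theta - h)[1]) / (2 * h)
    return dx1_dr * dx2_dt - dx1_dt * dx2_dr

print(jacobian_det(2.0, 0.7))  # approx 2.0, matching det = r
```

A numeric check like this is a cheap way to catch sign errors or a forgotten transpose before plugging the determinant into the density formula.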
It is f_{X1}(r cos θ) f_{X2}(r sin θ) times the determinant, which is r. Can you quickly plug in the Gaussian densities and see what you get? It is (r/2π) e^{−(r² cos²θ + r² sin²θ)/2}, and since cos²θ + sin²θ = 1, the exponent is simply −r²/2, so f_Y(r, θ) = (r/2π) e^{−r²/2}. Notice that θ appears only through the constant 1/2π, so integrating θ out over its range of length 2π leaves f_R(r) = r e^{−r²/2} for r ≥ 0. Now, if you notice, we have already come across this distribution. Where was that, and what name did we give it? It was the last distribution in our list of continuous distributions: the Rayleigh distribution. There we wrote a more general form with another term, a parameter σ², but if you set σ² = 1, this is exactly what we get; so this is Rayleigh with parameter σ² = 1. When we discussed the Rayleigh distribution, we said that it arises as the envelope of a sum of random variables. If you look at this now, it is exactly that envelope: you take the sum of the squares and look at the square root of that. So the square root of the sum of squared Gaussians recovers the Rayleigh distribution. We will see that more things can be computed depending on your applications. Let us stop here.
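The Rayleigh conclusion can be checked end to end by simulation: draw two independent standard Gaussians, form R = √(X1² + X2²), and compare the empirical cdf of R at one point with the Rayleigh(σ² = 1) cdf F_R(r) = 1 − e^{−r²/2}. The sample size and check point r = 1 are arbitrary illustrative choices.

```python
import math
import random

random.seed(4)
n = 200_000

# X1, X2 i.i.d. N(0, 1); R = sqrt(X1^2 + X2^2) should be Rayleigh(sigma^2 = 1)
rs = [math.hypot(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(n)]

# Rayleigh cdf: F_R(r) = 1 - exp(-r^2 / 2); check it at r = 1
target = 1.0 - math.exp(-0.5)
emp = sum(r <= 1.0 for r in rs) / n
print(emp, target)  # the two values should be close
```

This ties the whole section together: a function of two random variables (the radius of a Gaussian pair) whose distribution we derived via the Jacobian formula, verified by sampling.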