Let us go from here to problem 3. Suppose phi is not one to one; suppose it is many to one. In that case, how can I still maintain an equivalence between 2 and star? That is the question here: when phi is not one to one, how can I maintain the correspondence between 2 and star? There may be multiple z's that give you the same x, but basically the issue is that for every x there need not be a z, right? In that case, how can we still ensure a correspondence between 2 and star? The trick is the following. Why did we fail here, when there are x's for which there is no z? What kind of problem happened? We substituted phi(z) as x and got problem star, right? And what has happened as a result? We were searching over too many x's, far more than we were supposed to in problem 2. So what we can do is bring that restriction back through a constraint, okay? Suppose I look at this problem. It has all the constraints that star has, and in addition it requires that x must be equal to phi(z). Let me write it like this: x - phi(z) = 0. So x should be such that there is a z for which x = phi(z), okay? And now I am minimizing over both variables, x and z. As x varies, z will also vary, but my additional constraint x = phi(z) ensures that I am searching only over those x's for which there is a corresponding z.
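For reference, the three problems can be written side by side. The form of star matches how it is described later in the lecture (m inequality constraints, p equality constraints). The form of problem 2 is not stated in this part of the transcript and is an assumption here: it is taken to be star with x replaced by phi(z).

```latex
\begin{align*}
(\star)\quad & \min_{x}\; f(x)
  \quad \text{s.t. } g_i(x) \le 0,\ i = 1,\dots,m,\quad h_j(x) = 0,\ j = 1,\dots,p \\
(2)\quad & \min_{z}\; f(\phi(z))
  \quad \text{s.t. } g_i(\phi(z)) \le 0,\ i = 1,\dots,m,\quad h_j(\phi(z)) = 0,\ j = 1,\dots,p \\
(3)\quad & \min_{x,\,z}\; f(x)
  \quad \text{s.t. } g_i(x) \le 0,\ i = 1,\dots,m,\quad h_j(x) = 0,\ j = 1,\dots,p,\quad x - \phi(z) = 0
\end{align*}
```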
This automatically constrains problem star to search only over those x's for which there is a corresponding z, and the problem therefore becomes equivalent to problem 2. Is this clear? Let us take a pair (x hat, z hat) that solves 3. If (x hat, z hat) solves 3, then you must have x hat = phi(z hat), because of the constraint, and z hat solves 2. In this sense the two problems are equivalent. Is this clear? So you can do the substitution or change of variables, but if you are losing information about z by transforming to star, you can bring it back through the back door by putting it in as a constraint, okay? Now, this may or may not simplify the problem; that depends on the problem at hand. For example, one issue that occurs is that algorithms might not be able to process these kinds of functions, a function passed to another function; they may make errors in calculating gradients and so on. So it is much cleaner to give them one clean function in each constraint and then put this in as an additional constraint. The cost you pay is that you now have additional variables, so your problem size has increased in terms of the number of variables you are optimizing over. So this can sometimes be a cleaner way of formulating the problem than formulating it in this kind of complicated, composed manner, okay?
Yes. So, a case where for every x there need not be a z would be when you are changing dimension.
So, if you are mapping down, for example going from R^n to something small, some R^k where k is smaller than n, then there will be vectors in the range space for which there is no corresponding value, okay?
Another commonly used trick, which you probably already know, is to do transformations of a monotone kind, but I will explain this in a more general way now. You may have seen, for example, that instead of trying to maximize the exponential of a certain function, people simply take logarithms and maximize the log. The reason is that log is a monotone function, right? That is what I will now tell you in a much more general way. Suppose psi_0 is a monotone increasing function. Suppose also that psi_1 through psi_m are functions from R to R that satisfy psi_i(u) <= 0 if and only if u <= 0, okay? And suppose psi_{m+1} through psi_{m+p} are also functions from R to R, and they satisfy psi_i(u) = 0 if and only if u = 0. Now define f tilde as psi_0 composed with f, g tilde_i as psi_i composed with g_i, and h tilde_j as psi_{m+j} composed with h_j. So I am composing from the left this time, all right? Now look at this optimization problem, which is my problem 4. I am minimizing f tilde(x), and this transformation does not change my space, the variable is still x, subject to g tilde_i(x) <= 0 for all i from 1 to m and h tilde_j(x) = 0 for all j from 1 to p. This is actually equivalent to star.
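Before going through the argument, the claim can be sanity-checked numerically on a small instance. The sketch below is illustrative only: the functions f, g, psi0 and psi1 and the grid are hypothetical choices, not taken from the lecture, and a brute-force grid search stands in for a real solver. Here psi0 is the logarithm, which is increasing on the positive reals where f takes its values.

```python
import math

# Hypothetical one-dimensional instance of problem star (illustration only):
#   minimize f(x) = exp((x - 2)^2)  subject to  g(x) = x - 1 <= 0.
def f(x):
    return math.exp((x - 2.0) ** 2)

def g(x):
    return x - 1.0

# Transformations chosen to satisfy the stated conditions:
# psi0 is increasing (on the positive reals, where f lives),
# psi1(u) <= 0 if and only if u <= 0.
def psi0(t):
    return math.log(t)

def psi1(u):
    return u ** 3

# Problem 4: compose from the left.
def f_tilde(x):
    return psi0(f(x))

def g_tilde(x):
    return psi1(g(x))

# Brute-force search over a grid of candidate points in [-3, 3].
grid = [i / 100.0 for i in range(-300, 301)]

feasible_star = [x for x in grid if g(x) <= 0]
feasible_4 = [x for x in grid if g_tilde(x) <= 0]

x_star = min(feasible_star, key=f)    # grid minimizer of star
x_4 = min(feasible_4, key=f_tilde)    # grid minimizer of problem 4

print("same feasible grid points:", feasible_star == feasible_4)
print("minimizers:", x_star, x_4)
print("optimal value of 4 equals psi0(optimal value of star):",
      math.isclose(f_tilde(x_4), psi0(f(x_star))))
```

On this instance the two feasible sets coincide on the grid and both searches return the same minimizer, which is what the argument that follows establishes in general.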
So how do you see this? You want to say that these are equivalent; equivalent in what sense? Equivalent means solving one gets you to the solution of the other, right? In this case it is even better. First, let us compare the feasible regions of the two. How does the feasible region of problem 4 compare with the feasible region of star? Let us start with 4. Suppose I have an x that is feasible for 4, which means it satisfies all the inequality constraints and all the equality constraints. Just as an example, take g tilde_i(x) <= 0. What was g tilde_i? It was psi_i composed with g_i, right? So this is saying psi_i(g_i(x)) <= 0. And my condition on the psi_i, for i from 1 to m, is simply that psi_i(u) <= 0 if and only if u <= 0, right? So if I look at all the x's such that g tilde_i(x) <= 0, which is the same as the x's such that psi_i(g_i(x)) <= 0, that is actually the same as the x's for which g_i(x) <= 0. This is because psi_i(g_i(x)) <= 0 if and only if g_i(x) <= 0, which follows from the underlined property here. And you can do this for all the inequality constraints together. Likewise, for the equality constraints, you have that psi_i(u) = 0 if and only if u = 0, which means that 0 is the only zero of this function psi_i. It can have whatever shape you want, but it has only one zero, which is at 0.
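In symbols, the conditions on the psi_i give the following set identities, constraint by constraint. The equality case uses the same reasoning as the inequality case, with the "only zero at 0" property in place of the sign property.

```latex
\begin{align*}
\{x : \tilde g_i(x) \le 0\} &= \{x : \psi_i(g_i(x)) \le 0\} = \{x : g_i(x) \le 0\},
  & i &= 1,\dots,m, \\
\{x : \tilde h_j(x) = 0\} &= \{x : \psi_{m+j}(h_j(x)) = 0\} = \{x : h_j(x) = 0\},
  & j &= 1,\dots,p.
\end{align*}
```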
So, in that case, again, the set of x's for which h tilde_j(x) = 0 is the set for which h_j(x) = 0. So the feasible regions of the two problems are the same: the feasible region of 4 is equal to the feasible region of star. What about the objective? The objective has been transformed through a monotone increasing function. So if you have a solution x star of star, so that f(x star) <= f(x) for all feasible x, then it will also be that f tilde(x star) <= f tilde(x) for all feasible x. This is because I can always apply a monotone increasing function to an inequality without changing its direction. And this is an if and only if: since psi_0 is a (strictly) increasing function, it is also invertible, so I can go back and forth. What this means is that the optimal value of 4 is psi_0 applied to the optimal value of star.
Let us look at another example of transforming problems, this time one that concerns introducing additional variables and changing the nature of the constraints. Look at this problem: minimize f(x) subject to my original equality constraints h_j(x) = 0. But I will also do the following. I take g_i(x), which was an inequality, and write it like this: g_i(x) + s_i = 0, for all i from 1 to m, where s_i is a new variable that I introduce. Remember that in the original problem g_i(x) was less than or equal to 0. So there is a gap between g_i(x) and 0, and I can fit a nonnegative number into that gap so that the two become equal. That number is s_i.
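Collecting the pieces, the problem being built here (problem 5) can be written out as follows. The board is not reproduced in the transcript, so this is a reconstruction from the spoken description; the sign condition on s_i comes from the remark that s_i fills the gap between g_i(x) and 0.

```latex
\begin{align*}
(5)\quad \min_{x,\,s}\quad & f(x) \\
\text{s.t.}\quad & h_j(x) = 0, && j = 1,\dots,p, \\
& g_i(x) + s_i = 0, && i = 1,\dots,m, \\
& s_i \ge 0, && i = 1,\dots,m.
\end{align*}
```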
So s_i is now something greater than or equal to 0 that has been fitted into that gap. Now, as my x changes, my s will change value in order to maintain this equation. So I need to bring s in as a variable as well; it is not a constant anymore. If I make it a constant, that will change the meaning of my constraint. It has to float as my x changes. So my variables now are x as well as s. What has happened as a result of this transformation? Earlier I had inequalities on g. Now I have equality constraints in g, but I have introduced new variables, these s's, and I now have an inequality on each s. My earlier problem, star, had m inequality constraints, p equality constraints and of course n variables. Problem 5, on the other hand, has m inequality constraints, m plus p equality constraints, and m plus n variables. Now, what is the advantage of doing something like this? One key advantage is that it standardizes things quite a bit for us. All inequality constraints can now, without loss of generality, be considered to be of this very simple form: some variable greater than or equal to 0. So inequality constraints simply take the form of defining some quadrant or orthant in your space, and equality constraints define surfaces in the space. Earlier you had surfaces defined by equality constraints and regions defined by inequality constraints, and that interaction can be a little complicated. This tends to make things a lot simpler.
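The rewriting can be tried out on a small instance. The sketch below is illustrative only: the two constraint functions and the test points are hypothetical, chosen with n = 2, m = 2 and p = 0. It checks, point by point, that x satisfies the inequalities of star exactly when the pair (x, s) with s_i = -g_i(x) satisfies the constraints of problem 5.

```python
# Hypothetical instance with n = 2 variables and m = 2 inequalities (p = 0):
#   g_1(x) = x1 + x2 - 1 <= 0,   g_2(x) = -x1 <= 0.
def g(x):
    x1, x2 = x
    return [x1 + x2 - 1.0, -x1]

def feasible_star(x):
    # Star: every inequality g_i(x) <= 0 must hold.
    return all(gi <= 0 for gi in g(x))

def feasible_5(x, s, tol=1e-12):
    # Problem 5: g_i(x) + s_i = 0 (up to a tolerance) and s_i >= 0 for every i.
    return all(abs(gi + si) <= tol and si >= 0 for gi, si in zip(g(x), s))

def slack_for(x):
    # The only s that can satisfy g_i(x) + s_i = 0 is s_i = -g_i(x).
    return [-gi for gi in g(x)]

points = [(0.2, 0.3), (0.0, 1.0), (0.8, 0.5), (-0.1, 0.0)]
for x in points:
    s = slack_for(x)
    print(x, s, feasible_star(x), feasible_5(x, s))
```

Problem 5 for this instance has m + n = 4 variables (x1, x2, s1, s2), two equality constraints and two sign constraints, in place of the two general inequalities of star.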
So, yes, s_i >= 0 is the inequality constraint. The question is, what are the inequality constraints in problem 5? These are the inequality constraints: the variable s_i itself is being asked to be greater than or equal to 0. It should be treated as an independent variable. Eventually all these variables will interact; you have to pick an optimal choice for all of them jointly. But when we write variables in an optimization problem, if one is known to be a function of the other, then you should substitute and get rid of it, so that you are eventually left with a bunch of independent decisions that are bound by constraints. That is how one poses the problem. So s_i is another variable, but the values it can take are dictated by these two constraints. Once you tell me the x, that fixes the range for s_i for every i. And of course, at the optimum they all have to be chosen together; they cannot be chosen independently of each other, because their regions are such that you need to know one to know the other.
So, as I was saying, can you observe why the two are equivalent? I have not discussed that yet. Why is 5 equivalent to star? The way to observe this is to look at the feasible region of star: this is the set of x's such that all the constraints of star hold. Now, how do I compare this with the feasible region of 5, which is not in the same space anymore, because it now has more variables, right? Problem 5 is in both the x and the s space, so it lives in a larger dimensional space. So how do I compare the feasible region of star and the feasible region of 5?
No, no, s_i is not present in my original problem; it is not present in star. So how do I compare the feasible region of star with that of 5? One way to do this is to observe the following. The feasible region of star is the set of x's such that the constraints of star hold. The feasible region of 5 can be brought in in the following way: look at the set of x's for which you can find some s such that x and s together satisfy the constraints of 5, and take only the x component of each such pair (x, s). Now, for those x's, wouldn't the constraints of star hold automatically? Because if (x, s) satisfies the constraints of 5, then of course h_j(x) = 0 would be satisfied, and you would have g_i(x) + s_i = 0 and s_i >= 0. But g_i(x) + s_i = 0 and s_i >= 0 together mean that g_i(x) <= 0, right? So these two together automatically imply that g_i(x) <= 0. Conversely, if x satisfies the constraints of star, then s_i = -g_i(x) is greater than or equal to 0, and (x, s) satisfies the constraints of 5. Consequently, the x's for which the constraints of star are satisfied are the same as the x's for which you can find some s such that the constraints of 5 are satisfied by x and s together. This is the way you can establish a link between the two feasible regions. Geometrically, if you want to picture it, suppose here is the space, and here is the region in it that is the feasible region of star.
This is the feasible region of star. What we have done, by introducing an additional variable, is that the feasible region of 5 has become something like this. So the feasible region of star is this flat square here, whereas the feasible region of 5 is this red three-dimensional box. And it is such that if I take the shadow of this box, that is, if I look at its projection onto the space of x, that gives me back the feasible region of star. This is the space of x, and the additional axis that we have is the space of s. Any point x here has a corresponding s such that (x, s) lies in this floating box on top. And likewise, if I take an (x, s) here, I can project it back down to get a point in the feasible region of star. So the feasible region of star is in some sense a shadow of the feasible region of 5; shadow or projection, however you want to think of it. But this simply relates the two feasible regions. Why are they equivalent as optimization problems? The reason is this. In star you are optimizing only over this square, the feasible region down here. In 5, the objective function has not changed going from star to 5: you have introduced this new variable s, but that variable does not appear in the objective. What matters for the objective is the value of x alone; it is still f(x), which was the objective in star also. So the values of f(x) that you can get in star are the same as the values of f(x) that you can get in 5. As a result, the optimal values will actually be the same. So the bottom line is that if x star is a solution of star, then there exists an s star such that (x star, s star) is a solution of 5.
And likewise, if (x star, s star) solves 5, then x star is a solution of star. This is how you can go back and forth, and moreover, the optimal value of star is equal to the optimal value of 5. This new variable s that we just introduced has a name: it is what is called a slack variable. So s is what is called a slack variable, that is the