Suppose $(x^*, y^*)$ is a local maximum of this particular optimization problem. Now, how did we argue in the case when the constraint set was an open set? We said: $(x^*, y^*)$ is a local solution and the constraint set is open, so we can find a ball around the point that lies completely in the feasible region. If the ball lies completely in the feasible region, we can move in any direction $h$. From there, using Taylor's theorem, we concluded that the gradient transpose $h$, that is $\nabla f(x^*)^\top h$, should be greater than or equal to 0 for every direction $h$, or less than or equal to 0, as the case may be. Then we said that one can choose $h$ suitably and show that this implies the gradient itself must be equal to 0, that is, the derivative of the function must be equal to 0.

So, where does that argument fail here? It fails because the feasible region is not an open set. Just because the point is a local maximum, we cannot compare the value of the objective there with its value at every point in a neighborhood of $(x^*, y^*)$. We have to compare it only with those points that are in the neighborhood and in the feasible region. So, you have to be sensitive to the shape of the feasible region near this point, and that has to get factored into your analysis. It is therefore not possible to work with neighborhoods alone, because neighborhoods are not necessarily contained in the feasible region.

So, now let us look at a point like this. Suppose $(x^*, y^*)$ is a local maximum of this optimization problem. Can someone tell me, without loss of generality, can I take either $x^*$ or $y^*$ to be non-zero? I did not mention it, but it goes without saying that $\alpha$ is positive here.
So, can I take either $x^*$ or $y^*$ to be non-zero? If both of them are zero, then this equation will not be satisfied. The origin is not on the ellipse. So, clearly either $x^*$ is not zero or $y^*$ is not zero. For simplicity, let us suppose $y^*$ is non-zero. In terms of this diagram, that means I am not looking at these two points here, where $y^*$ could be zero. I am looking only at points like these, either here or here, where $y^*$ is non-zero.

Now, when $y^* \neq 0$, we are somewhere at a point like this. Let us take this point itself. You are on the ellipse, and the thing we want to exploit is that at a point like this, if I tell you what $x$ is, I should be able to solve for $y$ in terms of $x$, knowing that I am on the ellipse. Being on the ellipse gives me precisely this: if I tell you what $x$ is, I know what $y$ is. For example, if this is my $x$, I know that this is the $y$. If this is my $x$, then this is the $y$. On the ellipse, I can solve for $y$ in terms of $x$. Now, I cannot actually do that at this sort of point, where $y$ is zero, because if I tell you an $x$ near here, there are two possible $y$'s, one above and one below. But near the point $(x^*, y^*)$ that I am considering, I can solve for $y$ uniquely in terms of $x$ on the ellipse.
So, let me explain this a little more slowly. Suppose this is my point $(x^*, y^*)$ and I look in a neighborhood of $x^*$. In that neighborhood, every $x$ maps to a unique $y$. That means if I give you an $x$, I can solve through the equation of the ellipse for what the value of $y$ should be. Effectively, I can look at the equation $f_1(x, y) = \alpha$, fix the $x$, and solve for the $y$ using this equation. So long as I am not at this sort of point, I should get back a unique answer. I should be able to get $y$ in terms of $x$.

Now, this thing which I am telling you geometrically is actually a theorem. It is called the implicit function theorem. I will tell you what the result says in this particular context. The implicit function theorem guarantees that there exist three things; let me number these. One, an $\epsilon > 0$. Two, an open set $V$ containing my point $(x^*, y^*)$. Three, a differentiable function $g$. This function is what gives me back my $y$ in terms of $x$: its argument is an $x$ and its output is a $y$. But it does this only in a limited scope. Only in a neighborhood of $x^*$ can I talk of the existence of such a function. If I make this neighborhood too large, then all hell breaks loose: I get multiple $y$'s and there is no more correspondence between $x$ and $y$. But so long as I am close enough, I should be able to solve for $y$ in terms of $x$.
So, solving for $y$ in terms of $x$ means this: there is a differentiable function $g$ whose domain is the interval $(x^* - \epsilon, x^* + \epsilon)$, with the points $(x, g(x))$ lying in $V$. In this figure, $V$ is this open set, and $x^* \pm \epsilon$ is this interval, from $x^* - \epsilon$ to $x^* + \epsilon$. And $g$ is the function that takes the value of $x$ and returns the value of $y$ for me.

So, now I will tell you what the theorem says. There exist an $\epsilon > 0$, an open set $V$ containing $(x^*, y^*)$, and a differentiable function $g$ defined on $(x^* - \epsilon, x^* + \epsilon)$, such that being on the ellipse, meaning $f_1(x, y) = \alpha$, and in the neighborhood $V$ is equivalent to evaluating $g$ on that interval. That is, $(x, y)$ lies on the ellipse and belongs to $V$ if and only if $y = g(x)$ with $x$ in $(x^* - \epsilon, x^* + \epsilon)$.

In one direction, it means that any point that is on the ellipse and in this small neighborhood $V$ of $(x^*, y^*)$ can be expressed as $y = g(x)$, where $x$ of course lies in this domain. The reverse is also true: take any $x$ that is in this interval, evaluate the function, and lo and behold, you get a point that is bang on the ellipse. This is what it formally means to solve for $y$ in terms of $x$. You can actually work this out: solve this as a quadratic equation and get what this function actually is. We do not need the form of it. I just want you to know that there is such a function, and we will see how this generalizes. That is the point. Now, because there is such a function, we can say the following.
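The remark that one could solve the ellipse equation as a quadratic to obtain g can be illustrated with a small sketch. It assumes the ellipse has the standard form x²/a² + y²/b² = α and that y* > 0, so g is the upper branch; the values of `A`, `B`, `ALPHA`, `x_star`, and `eps` are hypothetical choices for illustration, not taken from the lecture.

```python
import math

# Hypothetical ellipse data (assumed for illustration only).
A, B, ALPHA = 2.0, 1.0, 1.0


def f1(x, y):
    """Constraint function; the ellipse is the set where f1(x, y) == ALPHA."""
    return x**2 / A**2 + y**2 / B**2


def g(x):
    """Upper-branch solution y = g(x) of f1(x, y) = ALPHA (assumes y* > 0)."""
    return B * math.sqrt(ALPHA - x**2 / A**2)


# A point with y* != 0 and a small interval (x* - eps, x* + eps) around x*.
x_star = 1.0
eps = 0.5
xs = [x_star - eps + k * (2 * eps) / 10 for k in range(1, 10)]

# Every x in the interval gives a point (x, g(x)) that lies on the ellipse.
max_residual = max(abs(f1(x, g(x)) - ALPHA) for x in xs)
print(max_residual)
```

If the interval were widened until it reached the points where y = 0, the square root would stop being differentiable there, and the lower branch would give a second y for the same x, which is the failure the lecture describes.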
Since $(x^*, y^*)$ is a local maximum, it is the best among all points on the ellipse in a small neighborhood around $(x^*, y^*)$; it has the maximum value for this area. Let us denote the objective here by $f_0(x, y)$. So, local maximum implies $f_0(x^*, y^*) \geq f_0(x, y)$ for all $(x, y)$ in a small neighborhood of $(x^*, y^*)$ and lying on the ellipse $f_1(x, y) = \alpha$.

Now, in one small neighborhood around $(x^*, y^*)$, the point $(x^*, y^*)$ gives you the best value of $f_0$. In another small neighborhood of $(x^*, y^*)$, $y$ can be solved for in terms of $x$. If I take the smaller of the two neighborhoods, both should hold. That means there is a small enough neighborhood in which I can solve for $y$ in terms of $x$, as given by the implicit function theorem, and moreover $(x^*, y^*)$ is the maximum over that neighborhood.

So, you can look at this particular optimization in which $y$ has now been substituted by $g(x)$ and you are optimizing only over $x$: maximize $f_0(x, g(x))$ over $x$ in $(x^* - \epsilon, x^* + \epsilon)$. For small enough $\epsilon$, your original $x^*$ will also be an optimal solution of this problem. Now, what kind of problem has this become? You have $f_0$, which was a differentiable function, and I told you $g$ is also differentiable. So, this is all nice: your objective is differentiable. But importantly, what is the nature of the constraint? It is now an open set. This is what I wanted to tell you.
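Written side by side, the original problem and the reduced problem described here would look as follows. This is only a sketch in the lecture's notation, assuming $g$ and $\epsilon$ are the function and radius supplied by the implicit function theorem.

```latex
% Original problem: optimize over a curve (a closed set) in (x, y).
\max_{(x,\,y)} \; f_0(x, y)
\quad \text{subject to} \quad f_1(x, y) = \alpha

% Reduced problem near (x^*, y^*): optimize over an open interval in x only.
\max_{x \,\in\, (x^* - \epsilon,\; x^* + \epsilon)} \; f_0\bigl(x,\, g(x)\bigr)
```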
So, you take a problem where you are optimizing a function over a surface. The implicit function theorem, if it kicks in, lets you convert that to an optimization over an open set, over only some of the variables. This was an optimization over $x$ and $y$ to begin with; we now have an optimization over only $x$. The earlier problem in $x$ and $y$ was an optimization over a closed set. What we have been able to conclude is that $x^*$, coming from the $(x^*, y^*)$ which was the solution of that earlier problem, is also a local maximum of this new optimization problem, which is an optimization of a differentiable function over an open set, but an open set only in $x$, not in $x$ and $y$ both.

So, now we can simply apply what we know from our previous lecture, which is that the derivative of this new function $f_0(x, g(x))$ must be equal to 0 at $x^*$. Working that out is just an application of the chain rule. We therefore get $f_{0x}(x^*, y^*) + f_{0y}(x^*, y^*)\, g'(x^*) = 0$. What is the left-hand side here? It is simply the derivative with respect to $x$ of the function $f_0(x, g(x))$, evaluated at $x = x^*$. I have just applied the chain rule.

Now, in addition to this, I also have that $g$ returns me a point that is on the ellipse: in the neighborhood of $x^*$, $(x, g(x))$ is a point on the ellipse. That means $f_1(x, g(x))$ must be equal to $\alpha$. So, the first equation comes from the result on optimization over open sets; this one here says that $(x, g(x))$ lies on the ellipse.
Now, for what $x$ does this lie on the ellipse? The implicit function theorem tells you that this is true only if you are in a small neighborhood of $x^*$. So, $x$ should be within $\epsilon$ of $x^*$: the point $(x, g(x))$ lies on the ellipse for all $x$ such that $|x - x^*| < \epsilon$. Since this is true for all $x$ in the interval from $x^* - \epsilon$ to $x^* + \epsilon$, what does it mean? The left-hand side $f_1(x, g(x))$ is actually independent of $x$ as I vary $x$ over that open interval, because for all these $x$ it is equal to $\alpha$.

What that means is that the derivative of the left-hand side with respect to $x$ should be equal to 0, because it is independent of $x$. In particular, the derivative at $x^*$ must be equal to 0. That gives me $f_{1x}(x^*, y^*) + f_{1y}(x^*, y^*)\, g'(x^*) = 0$. I want to highlight that we have got two equations like this. To repeat the reason: $f_1(x, g(x))$ is identically equal to $\alpha$ for all $x$ in that interval, and $x^*$ is in that interval. Viewed as a function of $x$, it is a constant, so its derivative must be equal to 0.

Now, what is $f_{1y}(x^*, y^*)$? It is the partial derivative of your ellipse equation with respect to $y$, evaluated at $(x^*, y^*)$. So, look at that equation here.
If I take the partial derivative with respect to $y$, I am going to get $2y / b^2$, and I had assumed that $y^*$ is not equal to 0. So, since $y^* \neq 0$, you get $f_{1y}(x^*, y^*) = 2y^* / b^2 \neq 0$. What that means is I can use the second equation here. Let us number these: call the first one equation (1) and this one equation (2). From (2), I can solve for $g'(x^*)$: $g'(x^*) = -\,[f_{1y}(x^*, y^*)]^{-1} f_{1x}(x^*, y^*)$. Note that there is an inverse on $f_{1y}$. Substituting this into equation (1) gives me the condition $f_{0x}(x^*, y^*) - f_{0y}(x^*, y^*)\, [f_{1y}(x^*, y^*)]^{-1} f_{1x}(x^*, y^*) = 0$.

So, notice what has happened. One of you asked me whether we need to know $g$ in closed form. It is true that we did not need $g$ in closed form. Using equation (2), and using that $f_{1y}(x^*, y^*)$ is not 0, I was able to eliminate $g$ from the equations completely. I have now got an equation that is only in terms of $x^*$ and $y^*$ and the known functions $f_0$ and $f_1$. So, this gives me a new necessary condition for $(x^*, y^*)$ to be a local maximum. Go back to the optimization problem that we had. If $(x^*, y^*)$ is a local maximum, then it is necessary that $(x^*, y^*)$ satisfies this boxed condition.
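For reference, the chain of steps in this derivation can be collected in one place. This is a restatement in the lecture's notation, writing $g'(x^*)$ for the derivative of $g$ at $x^*$ and using $y^* = g(x^*)$; all partial derivatives are evaluated at $(x^*, y^*)$, and it adds nothing beyond the equations already derived.

```latex
\begin{align}
  f_{0x}(x^*, y^*) + f_{0y}(x^*, y^*)\, g'(x^*) &= 0 \tag{1} \\
  f_{1x}(x^*, y^*) + f_{1y}(x^*, y^*)\, g'(x^*) &= 0 \tag{2} \\
  % From (2), valid because f_{1y}(x^*, y^*) = 2 y^* / b^2 \neq 0:
  g'(x^*) &= -\bigl[f_{1y}(x^*, y^*)\bigr]^{-1} f_{1x}(x^*, y^*) \notag \\
  % Substituting into (1) eliminates g:
  f_{0x}(x^*, y^*) - f_{0y}(x^*, y^*)\,\bigl[f_{1y}(x^*, y^*)\bigr]^{-1} f_{1x}(x^*, y^*) &= 0 \notag
\end{align}
```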
So, what we have done is this. We took an optimization of a differentiable function over a surface, used the fact that we were on the surface, and used the equations of the surface to solve for some variables in terms of the others. We then substituted that back, used the chain rule and some more calculus, and got a bunch of equations that must be satisfied. You can see that if you did not have constraints, some of these terms would vanish, and all you would be left with is the derivative of the objective equal to 0. So, this is how what we did earlier generalizes. The implicit function theorem, and this particular term in particular, has an important meaning, and I will discuss all these things in the next class, along with how this generalizes to more general settings.
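As a numerical sanity check of the necessary condition from this lecture, here is a minimal sketch. Everything concrete in it is an assumption made for illustration: the objective is taken to be f0(x, y) = x·y (a stand-in for an area-type objective), the ellipse is taken to be x²/a² + y²/b² = α, and the values of `A`, `B`, and `ALPHA` are arbitrary. For this assumed problem the maximizer in the positive quadrant is known in closed form, so the condition can be evaluated there and at another feasible point.

```python
import math

# Hypothetical problem data (assumed for illustration only).
A, B, ALPHA = 2.0, 1.0, 1.0


def f0_x(x, y):
    """Partial derivative of the assumed objective f0 = x*y with respect to x."""
    return y


def f0_y(x, y):
    """Partial derivative of the assumed objective f0 = x*y with respect to y."""
    return x


def f1_x(x, y):
    """Partial derivative of f1 = x^2/A^2 + y^2/B^2 with respect to x."""
    return 2.0 * x / A**2


def f1_y(x, y):
    """Partial derivative of f1 with respect to y; nonzero whenever y != 0."""
    return 2.0 * y / B**2


def condition_residual(x, y):
    """Left-hand side of f0_x - f0_y * (f1_y)^(-1) * f1_x = 0."""
    return f0_x(x, y) - f0_y(x, y) * f1_x(x, y) / f1_y(x, y)


# Known maximizer of x*y on this ellipse in the positive quadrant.
x_star = A * math.sqrt(ALPHA / 2.0)
y_star = B * math.sqrt(ALPHA / 2.0)
residual_at_max = condition_residual(x_star, y_star)

# Another point on the ellipse that is not a maximizer.
x_other = 1.0
y_other = B * math.sqrt(ALPHA - x_other**2 / A**2)
residual_elsewhere = condition_residual(x_other, y_other)

print(residual_at_max, residual_elsewhere)
```

The residual vanishes (up to rounding) at the maximizer and not at the other feasible point, which is consistent with the condition being necessary at a local maximum. The check never uses g itself, only the partial derivatives of f0 and f1.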