So, now let me come to the point that one of you asked: what about existence? Remember, Weierstrass' theorem gave us existence under what condition? The objective function has to be continuous and the feasible region has to be closed and bounded. A feasible region that is open could well be bounded, but it is not closed. (There do exist sets that are both closed and open, but let us not bother about those here.) Basically, for an open feasible region there is no guarantee of closedness, and therefore you need an independent way of verifying that a solution actually exists.

This also brings me back to the first point I had made: the importance of Weierstrass' theorem in optimization is that it lets you check that a solution exists without actually asking you to find one. Without a way of claiming that a solution exists, you can be in the following kind of situation: you just try to find one, you find all these candidate points, but there is no guarantee about any of them. They could be solutions, they could not be; nothing can be said.

Now let us go back to the argument, the way we proved this result. We started with the theorem saying: let $x^*$ be an optimal solution of this particular optimization problem. And what did we do? We took every direction in $\mathbb{R}^n$, shrank that direction down to the point where $x^* + \delta h$ lies inside the ball and therefore inside the set, and then said something about the inner product between the direction and the gradient. This kind of argument can also be made if you are talking not of a global optimal solution but of a local optimal solution. Here, "optimal solution" meant a global optimal solution, and in the proof I used the property that $f(x^*) \le f(x)$ for all $x \in S$. But we do not need that; we can also work with a local optimal solution, and the earlier ideas continue to work. I can still take a vector in my local neighborhood, shrink it further down to the point where $x^* + \delta h$ lies completely in $S$, and repeat the same argument. So the above theorem also holds for $x^*$ that are local optimal solutions, that is, local minima: if $x^*$ is a local minimum of $f$ over $S$, then $\nabla f(x^*) = 0$.

Now suppose you get into the situation where you know that a solution exists, but there are multiple points, more than one point $x^*$, that satisfy the condition gradient equal to $0$. Can we eliminate some of these? That is what we can look at now. So suppose $f$ is twice differentiable, and let $x^*$ be a local minimum; then it must be that $\nabla f(x^*) = 0$. And now we can actually use a stronger version of Taylor's theorem: when the function is twice differentiable, there is a version of Taylor's theorem that also makes use of the second derivative.
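To make this situation of multiple candidate points concrete, here is a minimal sketch; the example function $f(x) = x^4 - 2x^2$ and the use of sympy are my own illustrative choices, not something from the lecture. It finds every point where the derivative vanishes and shows that the first-order condition alone cannot tell the minima apart from the maximum.

```python
# A minimal sketch (the function and the use of sympy are illustrative
# choices, not from the lecture) showing that the first-order condition
# grad f = 0 can be satisfied by several points, none of which is
# certified to be a minimum by that condition alone.
import sympy as sp

x = sp.symbols('x', real=True)
f = x**4 - 2*x**2                          # has several stationary points

stationary = sp.solve(sp.diff(f, x), x)    # points where f'(x) = 0
print(stationary)                          # [-1, 0, 1]

# f(-1) = f(1) = -1 are local minima, while f(0) = 0 is a local maximum:
# the condition f'(x) = 0 by itself cannot distinguish them.
for p in stationary:
    print(p, f.subs(x, p))
```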
So, the Taylor's theorem I wrote here earlier used only the first derivative: it said that you can construct a linear approximation of the function near a point. If you also have second-derivative information, then you can construct a quadratic approximation. That is the theorem we will use. By Taylor's theorem,

$$f(x^* + \delta h) = f(x^*) + \delta\, \nabla f(x^*)^\top h + \tfrac{1}{2}\delta^2\, h^\top \nabla^2 f(x^*)\, h + o(\delta^2),$$

written this way just to be consistent with the notation we have been using. So you can construct a quadratic approximation, and the residual error that remains after that quadratic approximation is $o(\delta^2)$: a quantity which, even after dividing by $\delta^2$, still goes to $0$.

But now remember: because $x^*$ is a local minimum, the gradient here is $0$, so the first-order term is gone. You are left with just

$$f(x^* + \delta h) = f(x^*) + \tfrac{1}{2}\delta^2\, h^\top \nabla^2 f(x^*)\, h + o(\delta^2).$$

And once again, in a small neighborhood we have $f(x^* + \delta h) \ge f(x^*)$. So for $\delta$ positive and small enough, this implies $\tfrac{1}{2}\delta^2\, h^\top \nabla^2 f(x^*)\, h + o(\delta^2) \ge 0$. Divide by $\delta^2$ and let $\delta$ go to $0$; that gives $h^\top \nabla^2 f(x^*)\, h \ge 0$. And you get this condition for all $h$, since $h$ was an arbitrary direction.

A matrix that satisfies this kind of inequality for all $h$ is called positive semidefinite. Let me just define that for you here: $M \in \mathbb{R}^{n \times n}$ is said to be positive semidefinite if $v^\top M v \ge 0$ for all $v \in \mathbb{R}^n$. With this definition, what we get is that the Hessian matrix evaluated at $x^*$ must be positive semidefinite.

So if your $f$ is twice differentiable, we can say more: if $x^*$ is a local minimum, not only must the gradient be equal to $0$, but amongst the points where the gradient is $0$ you can further narrow down to those where the Hessian is positive semidefinite. Your optimal solutions must be amongst those points, because this is a necessary condition.
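To make the positive-semidefiniteness check concrete, here is a minimal sketch, assuming illustrative Hessian matrices of my own choosing (nothing here comes from the lecture): a symmetric matrix satisfies $v^\top M v \ge 0$ for all $v$ exactly when all of its eigenvalues are nonnegative, which numpy's `eigvalsh` lets us test directly.

```python
# Sketch: testing the necessary condition "Hessian positive semidefinite".
# A symmetric matrix M satisfies v^T M v >= 0 for all v exactly when all
# of its eigenvalues are >= 0. The example Hessians are my own, chosen
# only for illustration.
import numpy as np

def is_psd(M, tol=1e-10):
    """True if the symmetric matrix M is positive semidefinite."""
    return bool(np.all(np.linalg.eigvalsh(M) >= -tol))

H1 = np.array([[2.0, 0.0], [0.0, 3.0]])   # eigenvalues 2, 3 -> PSD
H2 = np.array([[2.0, 0.0], [0.0, -1.0]])  # eigenvalue -1  -> not PSD

print(is_psd(H1))  # True: such an x* may still be a local minimum
print(is_psd(H2))  # False: such an x* cannot be a local minimum
```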
Now, what about sufficiency? Necessary means simply that once a point is a solution, this is a condition it must satisfy. Sufficiency means the other way around: here is a condition such that if I can verify it, then that point is definitely a solution. Sufficient conditions are stronger than necessary conditions, because sufficient conditions imply the necessary ones. So what would be the sufficient condition in this case? You must have that the gradient is equal to $0$; that is necessary. But in addition, instead of the Hessian being positive semidefinite, if we ask for something more, namely that the Hessian is strictly positive definite, then that ensures that $x^*$ is a local minimum. What does positive definite mean? It means $v^\top \nabla^2 f(x^*)\, v > 0$ for all $v \neq 0$. So if you have an $x^*$ in $S$ such that the gradient is $0$ and the Hessian is positive definite, then that $x^*$ is a local minimum.

Now, optimization over open sets is not a very common problem. Usually, when one writes an optimization problem, you are minimizing a function $f$ subject to constraints $g(x) \le 0$ and $h(x) = 0$, and if the functions $g$ and $h$ are continuous, then these constraints ensure that the feasible region is actually closed. But a class of problems where we naturally encounter open sets are those with no constraints at all: the problem of minimizing $f(x)$ where $x$ ranges over all of $\mathbb{R}^n$. This is what is called unconstrained minimization, and it is actually a minimization over an open set, because $\mathbb{R}^n$ itself is open. So all the previous results I just mentioned apply to unconstrained minimization.

Another point to note: suppose you had maximization instead of minimization; how would our conclusions change? The conclusion of the first theorem, that the gradient is equal to $0$, continues to hold even for maximization; the gradient still has to be $0$. The conclusion of the second theorem, that the Hessian must be positive semidefinite, changes to: the Hessian should be negative semidefinite. And the sufficient condition changes to: the Hessian should be negative definite instead of positive definite. Negative semidefinite and negative definite simply mean that you switch the inequalities around: $v^\top M v \le 0$ for all $v$, and $v^\top M v < 0$ for all $v \neq 0$, respectively. I am skipping over these details; they are easy to work out, and the sketch below ties the two cases together.
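Here is a minimal sketch of that classification, again with illustrative example Hessians of my own choosing (not from the lecture): at a stationary point (gradient $0$), the signs of the Hessian's eigenvalues decide between the sufficient condition for a local minimum (all positive, i.e. positive definite) and for a local maximum (all negative, i.e. negative definite), with mixed signs indicating a saddle point.

```python
# Sketch (illustrative, not from the lecture): classifying a stationary
# point, i.e. a point with grad f = 0, from the eigenvalues of its
# (symmetric) Hessian.
import numpy as np

def classify_stationary_point(H, tol=1e-10):
    """Classify a stationary point from its symmetric Hessian H."""
    eig = np.linalg.eigvalsh(H)
    if np.all(eig > tol):
        return "local minimum (Hessian positive definite)"
    if np.all(eig < -tol):
        return "local maximum (Hessian negative definite)"
    if np.all(eig >= -tol) or np.all(eig <= tol):
        return "inconclusive (Hessian only semidefinite)"
    return "saddle point (Hessian indefinite)"

print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, 1.0]])))    # local minimum
print(classify_stationary_point(np.array([[-2.0, 0.0], [0.0, -1.0]])))  # local maximum
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -1.0]])))   # saddle point
```

So, we will end the class here.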