Now, let us look at this even further; let us go a little more into depth here. What is this particular expression on the left? It is the infimum of this linear function evaluated over the set G. If you recall, I discussed this a little earlier: if you evaluate a linear function over G, when lambda is greater than or equal to 0 and the coefficient of t is 1, then minimizing that linear function over G is the same as minimizing the Lagrangian over the entire space, so its value is actually the value of the dual function. So, now I want to relate this expression in that way. Let me write it out once again: the infimum of lambda tilde transpose u plus theta tilde transpose v plus mu tilde times t, over (u, v, t) in G, is greater than or equal to mu tilde times p star. That is where we are. Now, before I take the two cases, let us look at this once more. What we have done is we have said that T and G are convex sets, we applied the separating hyperplane theorem, and that has got us to this particular inequality. So far, what have we assumed about the optimization problem? We have assumed that the problem is convex and that A is full row rank. From there, we showed only that G must be a convex set and T must be a convex set, that they are disjoint, and that therefore there must be a separating hyperplane that separates G and T. I have not yet made any assumption about constraint qualifications. So, now what I will do is actually introduce a constraint qualification.
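In symbols (using this lecture's notation, with tildes on the hyperplane coefficients), the inequality that the separating hyperplane gives us can be written as:

```latex
% Separating-hyperplane inequality between the convex sets G and T:
% there exist coefficients, not all zero, with
(\tilde\lambda,\ \tilde\theta,\ \tilde\mu) \neq 0,
\qquad \tilde\lambda \ge 0, \qquad \tilde\mu \ge 0,
% such that
\inf_{(u,v,t)\in G}\ \Big( \tilde\lambda^{\top} u + \tilde\theta^{\top} v + \tilde\mu\, t \Big)
\;\ge\; \tilde\mu\, p^{\star}.
```

The two cases that follow split on whether the coefficient mu tilde on t is positive or zero, i.e. whether the hyperplane is non-vertical or vertical.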
So, we have reached this stage; before we move any further, let me state the theorem. Consider the convex optimization problem above, where A is p cross n and the rank of A is p, and suppose there exists an x hat such that g i of x hat is strictly less than 0 for all i from 1 to m, and A x hat equals b. Such a point is called a Slater point, and the requirement that one exists is called the Slater condition. The theorem says: if a Slater point exists, then p star equals d star. And what is d star? It is simply the maximum of d of lambda comma theta over lambda greater than or equal to 0. In other words, the optimal value of the primal equals the optimal value of the dual. That is what we are now setting up to show. We came up to this stage: we wrote the set G, we wrote the set T, we found that these two sets are disjoint, and therefore there is a separating hyperplane, which gave us quantities lambda tilde, theta tilde, mu tilde, not all zero, such that lambda tilde is greater than or equal to 0, mu tilde is greater than or equal to 0, and this inequality holds. From here onwards, I will have to invoke the existence of a Slater point. Now, suppose mu tilde is positive; let us take that case first. If mu tilde is positive, what can I do with this expression? Let me write it once again on the next slide.
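Written out compactly, the theorem being set up here is:

```latex
% Slater's theorem, as stated in this lecture, for the convex problem
%   minimize f(x)  s.t.  g_i(x) \le 0 \ (i=1,\dots,m), \quad Ax = b,
% with f and each g_i convex and A \in \mathbb{R}^{p \times n}:
\text{If } \operatorname{rank}(A) = p
\text{ and } \exists\, \hat{x}:\ g_i(\hat{x}) < 0 \ \forall i,\ A\hat{x} = b,
\quad \text{then} \quad
p^{\star} \;=\; d^{\star} \;=\; \max_{\lambda \ge 0,\ \theta}\ d(\lambda, \theta).
```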
So, I have this expression: the infimum over (u, v, t) in G of lambda tilde transpose u plus theta tilde transpose v plus mu tilde times t is greater than or equal to mu tilde times p star; that is what we have so far. Now, suppose mu tilde is positive, and I already have that lambda tilde is greater than or equal to 0. Then I can divide both sides by mu tilde, and I get that the left-hand side is nothing but the infimum of the Lagrangian evaluated at lambda tilde divided by mu tilde and theta tilde divided by mu tilde, and that is greater than or equal to p star. How did I get this? Because of what we wrote earlier: when lambda is greater than or equal to 0 and the coefficient of t is 1, the dual function equals the minimum of this linear function over the set G. And if you look, that is exactly what this was: the minimum of a linear function over G, with lambda tilde greater than or equal to 0. We did not have a coefficient of 1 on t, but that coefficient can be made 1 by dividing throughout by mu tilde, which is allowed because we assumed mu tilde is greater than 0. So, when mu tilde is greater than 0, the left-hand side becomes the dual function d of lambda tilde over mu tilde comma theta tilde over mu tilde, and that is greater than or equal to p star. And what does that mean? We have obtained the inequality opposite to that of weak duality, which implies that strong duality holds; equivalently, p star equals d star.
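The step just described, dividing through by mu tilde and recognizing the dual function, can be written as:

```latex
% Case \tilde\mu > 0: divide the separation inequality by \tilde\mu > 0.
\inf_{(u,v,t)\in G}\ \Big(
  \tfrac{1}{\tilde\mu}\tilde\lambda^{\top} u
  + \tfrac{1}{\tilde\mu}\tilde\theta^{\top} v
  + t \Big)
\;=\; d\!\left( \tfrac{\tilde\lambda}{\tilde\mu},\ \tfrac{\tilde\theta}{\tilde\mu} \right)
\;\ge\; p^{\star}.
% Weak duality gives d(\lambda,\theta) \le p^{\star} for all \lambda \ge 0,
% so the two inequalities together force d^{\star} = p^{\star}.
```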
So, when mu tilde is positive, effectively what we have done is found a non-vertical supporting hyperplane, and that is what gave us this direction of the inequality. Now suppose mu tilde equals 0; recall mu tilde is greater than or equal to 0, and we have taken care of the case where mu tilde is strictly positive, where we got that strong duality holds. When mu tilde equals 0, we should either still show that strong duality holds, or rule this case out as impossible. Note that up to this point we have not yet made use of the Slater condition; we have only used convexity. Now we are actually going to need it. So, suppose mu tilde equals 0. Then what happens to my right-hand side? It becomes 0, and on the left-hand side the term mu tilde times t also vanishes. What I am left with is the infimum of lambda tilde transpose u plus theta tilde transpose v over (u, v, t) belonging to G. Since lambda tilde is greater than or equal to 0, and since for points of G we have v equal to A x minus b, the infimum over u is attained when u is actually equal to g of x. So this can be written as the infimum over all x of lambda tilde transpose g of x plus theta tilde transpose times A x minus b. And the right-hand side, remember, evaluates to 0, so what I get is that this whole quantity is greater than or equal to 0.
In other words, I am getting that lambda tilde transpose g of x plus theta tilde transpose times A x minus b is greater than or equal to 0 for all x. Now, take x equal to x hat, which was my Slater point. What do we get? Remember g of x hat is strictly negative and A x hat equals b, so the second term vanishes and I am left with lambda tilde transpose g of x hat greater than or equal to 0. But g of x hat is strictly negative and lambda tilde is greater than or equal to 0, so the only way this is possible is that lambda tilde equals 0. So, not only is mu tilde equal to 0, I have now also got that lambda tilde equals 0. We are now almost at the end. Recall that the separating hyperplane theorem told us that the coefficient vector must be non-zero. With mu tilde and lambda tilde both 0, the only way that is possible is that theta tilde is not 0. And with lambda tilde equal to 0, the first term is gone, so with theta tilde not equal to 0 we must have that theta tilde transpose times A x minus b is greater than or equal to 0 for all x. Now, let us think about what that means. At x equal to x hat, you have A x hat equals b; that means x hat is a point exactly on the intersection of all of these hyperplanes.
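The mu tilde equals 0 case up to this point can be summarized as:

```latex
% Case \tilde\mu = 0: the separation inequality reduces to
\tilde\lambda^{\top} g(x) \;+\; \tilde\theta^{\top}(Ax - b) \;\ge\; 0
\qquad \forall x.
% Evaluating at the Slater point x = \hat{x}, where A\hat{x} = b:
\tilde\lambda^{\top} g(\hat{x}) \;\ge\; 0,
\qquad g(\hat{x}) < 0,\quad \tilde\lambda \ge 0
\;\;\Longrightarrow\;\; \tilde\lambda = 0.
% Since (\tilde\lambda, \tilde\theta, \tilde\mu) \neq 0, this forces \tilde\theta \neq 0.
```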
So, consider a neighborhood of this point. The quantity theta tilde transpose times A x minus b is, after all, a linear scalar function of x, and at x equal to x hat the left-hand side is exactly 0. So, for whatever slope you take, there should exist points in the neighborhood of x hat where the inequality actually gets reversed, points where this function becomes negative. The only way a linear function that is 0 at x hat does not become negative anywhere is that its slope is actually 0. To state it precisely: at x equal to x hat, you have A x hat equals b and the LHS equals 0; in the neighborhood of x hat, there exist x such that theta tilde transpose times A x minus b is less than 0, unless the slope itself is 0, that is, unless theta tilde transpose A is equal to 0. If the slope is 0, then you cannot vary x and reach a negative value; otherwise, around that point the linear function takes values that are both positive and negative. It is a constant function only if the coefficient itself is 0. But can this coefficient actually be 0? What is it saying? For theta tilde transpose A to be equal to 0 is saying that if you take the rows of A and sum them up in a linear combination with a bunch of weights that are not all 0, you get 0.
So, the left-hand side is a linear combination of rows of A, and we are saying that this linear combination is equal to 0 with weights that are not all 0. What does that mean? It means that the rows of A are linearly dependent. But recall what we assumed: the rank of A is equal to p, that is, A has full row rank, which is a contradiction. The only way theta tilde transpose A equals 0 with theta tilde non-zero is that A has linearly dependent rows, but the problem we are starting with is one where A has full row rank, so this is also not possible. What this has got us to is that the case where mu tilde equals 0 is not possible. So, mu tilde has to be greater than 0; that is the only possible case, and in that case we concluded that strong duality holds. And that is the proof. To summarize: we have a convex optimization problem as written here, with the requirements that the linear constraints are full row rank and that there exists a Slater point for the constraints. Put these together, and the optimal value of the primal has to equal the optimal value of the dual. That is what this theorem has shown. In proving it, we first got to the point where we wrote the set G and the set T, showed that both sets are convex, and then showed that there must be a separating hyperplane.
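The contradiction that closes out the mu tilde equals 0 case can be written as:

```latex
% With \tilde\mu = 0 and \tilde\lambda = 0, separation leaves
\tilde\theta^{\top}(Ax - b) \;\ge\; 0 \quad \forall x,
\qquad \tilde\theta \neq 0.
% A linear function of x that is nonnegative everywhere and equals 0
% at x = \hat{x} (where A\hat{x} = b) must have zero slope:
\tilde\theta^{\top} A = 0,\ \tilde\theta \neq 0
\;\;\Longrightarrow\;\;
\text{rows of } A \text{ are linearly dependent},
% contradicting the assumption \operatorname{rank}(A) = p (full row rank).
% Hence \tilde\mu > 0 is the only possible case, and strong duality holds.
```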
Now, the work from there onwards was to show the existence of a non-vertical separating hyperplane: we wanted mu tilde to be positive and the other coefficients to be greater than or equal to 0. The greater-than-or-equal-to-0 part came quite easily, but showing that the separating hyperplane is non-vertical, that is, that mu tilde is positive, is where we needed a constraint qualification. Until that point, everything works without constraint qualifications. The constraint qualification ensures that your separating hyperplane is actually a non-vertical one, and that is what makes strong duality work. So, with this we have now shown strong duality for convex optimization as well. You will recall that the Slater condition, the constraint qualification for ensuring strong duality, was also what we used in the KKT conditions. So, all these things come together: when strong duality holds, the optimal values of the dual variables are also the Lagrange multipliers that solve your KKT conditions. So, with this I will end here; we will next move on to a new topic.
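As a small numerical sanity check (my own illustrative example, not part of the lecture), here is a toy convex problem with a Slater point where the primal and dual optimal values can be compared directly. The problem is minimize f(x) = x squared subject to g(x) = x + 1 less than or equal to 0; the point x hat = -2 is a Slater point since g(-2) = -1 < 0, and there are no equality constraints, so the full-row-rank condition is vacuous.

```python
# Toy problem: minimize x^2 subject to x + 1 <= 0.
# Slater point: x_hat = -2, since g(-2) = -1 < 0.

def f(x):
    return x * x

def g(x):
    return x + 1

def dual(lam):
    # d(lambda) = inf_x [x^2 + lambda*(x + 1)];
    # the unconstrained minimizer of the Lagrangian is x = -lambda/2.
    x = -lam / 2.0
    return f(x) + lam * g(x)

# Primal optimum: the constraint is active at the solution, x* = -1, p* = 1.
p_star = f(-1.0)

# Maximize d(lambda) over lambda >= 0 by a coarse grid search on [0, 10].
d_star = max(dual(0.01 * k) for k in range(0, 1001))

print(p_star, d_star)  # both values should agree (approximately 1),
                       # illustrating strong duality under the Slater condition
```

Analytically, d(lambda) = -lambda squared over 4 plus lambda, which is maximized at lambda = 2 with value 1, matching p star = 1 as the theorem predicts.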