Welcome everyone. Today I will talk about another type of optimization method, a full-blown primal-dual method. That means it is a method that works in both the primal and dual spaces; moreover, it gives no preference to either space. It works in the primal and dual spaces simultaneously, treating them on an equal footing, and hence computes the primal and dual optimal solutions together. This method is what is called an interior point method. I will show you the interior point method only for the case of linear programming, because that is where it is easiest to explain, although the method has now been extended to non-linear programming as well. So what is the idea? Suppose my linear program is: minimize c^T x subject to Ax = b and x >= 0. What is its dual? Let us write the dual in terms of a Lagrange multiplier, or dual variable, lambda: maximize b^T lambda subject to A^T lambda <= c. Now, the usual dual constraint is A^T lambda <= c, but instead of writing it this way I will insert a slack variable s: I will write it as A^T lambda + s = c and put s >= 0. From our study of complementary slackness we know that x and s are complementary variables, so at the optimal solution they satisfy complementary slackness: for each component i, either x_i = 0 or s_i = 0. Now, one way of writing the optimality conditions is as follows: A^T lambda + s = c, and also Ax = b; these ensure dual and primal feasibility.
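For reference, the primal-dual pair just described can be written side by side (this display is added for clarity; it restates exactly the two programs above):

```latex
\text{(P)}\quad \min_{x}\; c^{\mathsf T} x
\quad \text{s.t.}\quad Ax = b,\; x \ge 0;
\qquad
\text{(D)}\quad \max_{\lambda,\, s}\; b^{\mathsf T} \lambda
\quad \text{s.t.}\quad A^{\mathsf T}\lambda + s = c,\; s \ge 0.
```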
I must also have x and s both >= 0, and, as I said, for every component either x_i = 0 or s_i = 0; that is, the product x_i s_i = 0 for all i from 1 to n. The difficulty in solving a linear program lies precisely in ensuring the last two conditions; having all of these hold simultaneously is what makes the problem difficult. The first two conditions are simply linear equations, so they can be solved by any classical technique. The trouble is that we also have to satisfy x, s >= 0 and the complementary slackness. Of these, the complementary slackness condition is really a non-linear equation, so it can be clubbed together with the first two: the first two are linear equations, and this is an additional non-linear equation. But the trouble is the inequality, which needs to hold in addition to these linear and non-linear equations. Ensuring both of these at once is the challenge, and that is what makes the problem complicated. So what we will do is solve the first three conditions as a system of non-linear equations, while our method always ensures that x and s remain >= 0. Effectively, we are going to solve the first, second, and third conditions as a system of non-linear equations while ensuring that we never violate the fourth. That is the idea.
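Collecting the four conditions above into one display (a standard way of writing the optimality conditions for this primal-dual pair):

```latex
\begin{aligned}
A^{\mathsf T}\lambda + s &= c, \\
Ax &= b, \\
x_i s_i &= 0, \qquad i = 1, \dots, n, \\
(x, s) &\ge 0.
\end{aligned}
```

The first two are linear, the third is non-linear, and the fourth is the inequality that the method must protect throughout.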
To express this, consider the function F(x, lambda, s) defined as the stacked vector F(x, lambda, s) = (A^T lambda + s - c, Ax - b, XS1), where 1, remember, is the column vector of all ones. Now what are capital X and capital S? Capital X is simply the diagonal matrix formed from the components of small x: you can write it as X = diag(x_1, ..., x_n), and likewise S = diag(s_1, ..., s_n). So X is the matrix with x_1 through x_n on the diagonal and zeros elsewhere, and similarly for S. Then XS is another diagonal matrix, with x_1 s_1 through x_n s_n on the diagonal and zeros elsewhere, and XS times the vector 1 is simply the vector whose i-th component is the product x_i s_i, for i from 1 to n. So what are we looking for? To solve the optimization problem we need to solve F(x, lambda, s) = 0 with x, s >= 0. The way we will do this is iteratively: at each iteration we take a step in a certain direction. But remember, this is not just about solving non-linear equations; we also need to maintain feasibility with respect to x, s >= 0. This additional constraint means it is not vanilla non-linear equation solving, but we can take inspiration from it. For instance, we can take something like a Newton step: we can look for a search direction (delta x, delta lambda, delta s) that satisfies J(x, lambda, s) (delta x, delta lambda, delta s) = -F(x, lambda, s). Now what have I written here?
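As a quick sanity check, the residual function F can be sketched in a few lines of NumPy. The tiny LP data A, b, c and the test point below are my own illustrative choices, not from the lecture:

```python
import numpy as np

def F(x, lam, s, A, b, c):
    """Stacked optimality residual from the lecture: F = 0 at a primal-dual optimum."""
    r_dual = A.T @ lam + s - c   # dual feasibility residual, A^T lambda + s - c
    r_prim = A @ x - b           # primal feasibility residual, Ax - b
    r_comp = x * s               # XS1, i.e. the vector of products x_i * s_i
    return np.concatenate([r_dual, r_prim, r_comp])

# Tiny example LP: minimize x1 + x2 subject to x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]])
b = np.array([1.0])
c = np.array([1.0, 1.0])

# A primal-dual optimal point: x = (0.5, 0.5), lambda = 1, s = (0, 0).
x = np.array([0.5, 0.5]); lam = np.array([1.0]); s = np.array([0.0, 0.0])
print(F(x, lam, s, A, b, c))  # all zeros: every optimality residual vanishes
```

At an optimal primal-dual pair all three residual blocks are zero, which is exactly the statement that F = 0 encodes the optimality conditions.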
Here J(x, lambda, s) is simply the Jacobian of F. So one can see that this equation is nothing but a Newton step: (delta x, delta lambda, delta s) is simply the Newton direction for solving F = 0. A typical Newton iteration would set (x_{k+1}, lambda_{k+1}, s_{k+1}) = (x_k, lambda_k, s_k) + (delta x, delta lambda, delta s): you start from the point where you are, compute the direction, and simply add that direction to get the new iterate. This amounts to taking a full step along the Newton direction. But that usually leads to a problem, which is that we need to remember to also maintain feasibility with respect to the inequalities. If one takes a full step along this direction, the constraint x, s >= 0 typically ends up violated. What we need to do, therefore, is search along the Newton direction, but only up to the point where we do not violate x, s >= 0. So the idea is to take inspiration from the Newton method but not go whole hog into it, because we are not solving non-linear equations in their most basic form; this is a much more specific and structured problem, and we are developing our own way of solving it.
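To make the full Newton step concrete, here is a sketch in NumPy. The toy problem data and starting point are my own, chosen so that the full step actually drives a component of s negative, which is exactly the failure mode just described:

```python
import numpy as np

def newton_step(x, lam, s, A, b, c):
    """Solve J(x, lam, s) d = -F(x, lam, s) for the full Newton direction."""
    n, m = x.size, b.size
    # Jacobian of F in block form: rows (0, A^T, I), (A, 0, 0), (S, 0, X)
    J = np.block([
        [np.zeros((n, n)), A.T,              np.eye(n)],
        [A,                np.zeros((m, m)), np.zeros((m, n))],
        [np.diag(s),       np.zeros((n, m)), np.diag(x)],
    ])
    rhs = -np.concatenate([A.T @ lam + s - c, A @ x - b, x * s])
    d = np.linalg.solve(J, rhs)
    return d[:n], d[n:n + m], d[n + m:]

# Toy LP: minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0.
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])
x = np.array([0.5, 0.5]); lam = np.zeros(1); s = np.array([1.0, 1.0])

dx, dlam, ds = newton_step(x, lam, s, A, b, c)
print(s + ds)  # [-0.5  0.5] -- the full step makes s1 negative
```

The full step would land at s = (-0.5, 0.5), violating s >= 0, which is why the method must shorten the step rather than take the raw Newton iterate.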
So what an interior point method usually does is set (x_{k+1}, lambda_{k+1}, s_{k+1}) = (x_k, lambda_k, s_k) + alpha (delta x, delta lambda, delta s) for some step length alpha. Now even this can be diluted further, and this is actually one of the insights that led to the method becoming so successful: it is not merely that you want to maintain feasibility with respect to x, s >= 0. If that were all you did, you would take a Newton step to the extent that you can without violating feasibility: you would keep going along the Newton direction and stop only when you hit one of the coordinate half-spaces. If these were your axes, say this is the x axis and this is the s axis, your iterates would bounce around in this manner: at each step you take as large a step as you can without violating the x >= 0 and s >= 0 requirements, so starting from one point you would go up to where some component of x becomes 0, then from there up to where some component of s becomes 0, and so on and so forth.
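The "search along the Newton direction without violating x, s >= 0" can be implemented as a fraction-to-the-boundary rule. This is a small sketch; the safety fraction 0.99 and the numbers (taken from the toy step above, where s + ds went negative) are my own illustrative choices:

```python
import numpy as np

def max_step(v, dv, frac=0.99):
    """Largest alpha in (0, 1] with v + alpha*dv > 0, shrunk by a safety
    fraction so the iterate stays strictly inside the positive orthant."""
    neg = dv < 0
    if not neg.any():
        return 1.0  # no component is decreasing, a full step is safe
    return min(1.0, frac * float(np.min(-v[neg] / dv[neg])))

# s = (1, 1) with direction ds = (-1.5, -0.5): a full step would make s1 < 0.
s = np.array([1.0, 1.0]); ds = np.array([-1.5, -0.5])
alpha = max_step(s, ds)
print(alpha)           # 0.99 * (1/1.5), i.e. 0.66
print(s + alpha * ds)  # strictly positive: (0.01, 0.67)
```

In the actual iteration one takes the minimum of this quantity over both pairs (x, delta x) and (s, delta s), so that neither variable leaves the positive orthant.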
Now, an actual primal-dual method does something even less aggressive: it does not go even that far. What it tries to do is look at a kind of central region, and it makes sure that your iterates remain within that central region. More concretely, the term that is often used is the central path. The central path is simply the set of points that satisfy x_i s_i = tau for every component i, for some tau > 0. As you make tau smaller and smaller, these points trace out a central sort of path, and what we want the method to do is to follow this particular path. As a result, the method does not zigzag from a point where some s_i is 0 to a point where some x_i is 0 and so on. Effectively, it never actually hits any of the corner points of the polyhedron, either the primal or the dual one; it reaches the eventual corner-point solution, but through the interior of the polyhedron. That is the origin of the name interior point method. If it did exactly any of those other things, it would simply be another way of doing what the older methods for linear programming were doing, which is searching over the corner points of the polyhedron: since we know the solution lies at a corner point, you would simply be going from one corner point to another looking for the solution.
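Written out, the central path is the set of strictly positive, feasible points whose complementarity products are all equal to the parameter tau:

```latex
\mathcal{C} \;=\; \Bigl\{ (x, \lambda, s) \;:\;
A^{\mathsf T}\lambda + s = c,\;
Ax = b,\;
x_i s_i = \tau \ \ (i = 1, \dots, n),\;
(x, s) > 0 \Bigr\}, \qquad \tau > 0,
```

and as tau is driven to 0 the points on the path approach an optimal primal-dual solution.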
The interior point method refrains from doing that, and is therefore able to move directly to the solution. So the actual, diluted form is the following. To define it, let us first define what is called the duality measure: mu = (1/n) sum over i of x_i s_i, equivalently written as x^T s / n. The duality measure measures how far you are from satisfying complementary slackness: if complementary slackness were satisfied, the duality measure would be exactly 0, so it captures that particular property. Now, what the actual interior point method does is take a Newton step, but aimed towards a point where each x_i s_i is close to some multiple sigma of mu. In short, it takes a Newton step while trying to remain in the vicinity of the central path: it wants all the products x_i s_i to be approximately sigma times the duality measure, which is the average of the x_i s_i. So it tries to keep all of them within roughly a multiple sigma of the average. The modified step then becomes the following. The Jacobian is the block matrix with rows (0, A^T, I), (A, 0, 0), and (S, 0, X), and the step (delta x, delta lambda, delta s) that you want solves this matrix equation, where the right-hand side, which remember was -F, is now modified to (-r_c, -r_b, -XS1 + sigma mu 1).
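The modified (centered) Newton system just described, in display form:

```latex
\begin{bmatrix} 0 & A^{\mathsf T} & I \\ A & 0 & 0 \\ S & 0 & X \end{bmatrix}
\begin{bmatrix} \Delta x \\ \Delta \lambda \\ \Delta s \end{bmatrix}
=
\begin{bmatrix} -r_c \\ -r_b \\ -XS\mathbf{1} + \sigma\mu\mathbf{1} \end{bmatrix},
\qquad
r_c = A^{\mathsf T}\lambda + s - c, \quad r_b = Ax - b.
```

Compared with the plain Newton step, only the last block of the right-hand side changes: instead of driving every product x_i s_i to 0 at once, it aims them at the smaller target sigma mu.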
So what are r_c and r_b? Well, r_c is simply the residual infeasibility of the dual constraint, and r_b is the residual infeasibility of the primal constraint. What the actual method is doing is basically this: it takes steps in such a way that it satisfies this particular equation. This is the master equation. So let me write out the entire algorithm. Initialize (x^0, lambda^0, s^0) such that all the components of x^0 and s^0 are greater than 0. Then for k = 0, 1, 2, ...: choose sigma_k in (0, 1) and solve the block system with rows (0, A^T, I), (A, 0, 0), (S_k, 0, X_k), applied to the step (delta x^k, delta lambda^k, delta s^k), with right-hand side (-r_c^k, -r_b^k, -X_k S_k 1 + sigma_k mu_k 1), where mu_k is simply the duality measure at step k, mu_k = (x^k)^T s^k / n. Then set (x^{k+1}, lambda^{k+1}, s^{k+1}) = (x^k, lambda^k, s^k) + alpha_k (delta x^k, delta lambda^k, delta s^k), where alpha_k is chosen so that x^{k+1} and s^{k+1} are greater than 0. The way this algorithm has been specified, it is already tilting your search direction towards the central path, and you choose alpha_k in such a manner that you do not violate feasibility. So it takes a step along a modified Newton direction, one tilted slightly towards the central path, and proceeds in that direction only up to the point where the inequality constraints are not violated. This is basically the idea; there are more concrete ways of making sure that you do not violate feasibility and that you move towards the central path.
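Putting the pieces together, here is a minimal sketch of the whole algorithm in NumPy. The fixed centering parameter sigma = 0.5, the 0.99 safety fraction, the stopping tolerance, and the toy LP at the bottom are all my own illustrative choices; a practical solver would choose sigma_k adaptively and solve the linear system by exploiting its block structure:

```python
import numpy as np

def primal_dual_ipm(A, b, c, x, lam, s, sigma=0.5, tol=1e-8, max_iter=100):
    """Sketch of the path-following primal-dual method from the lecture.
    Requires x > 0 and s > 0 at the start (feasibility is not required)."""
    m, n = A.shape
    for _ in range(max_iter):
        mu = x @ s / n                    # duality measure
        r_c = A.T @ lam + s - c           # dual residual
        r_b = A @ x - b                   # primal residual
        if max(np.abs(r_c).max(), np.abs(r_b).max(), mu) < tol:
            break
        # Centered Newton system: rows (0, A^T, I), (A, 0, 0), (S, 0, X)
        J = np.block([
            [np.zeros((n, n)), A.T,              np.eye(n)],
            [A,                np.zeros((m, m)), np.zeros((m, n))],
            [np.diag(s),       np.zeros((n, m)), np.diag(x)],
        ])
        rhs = np.concatenate([-r_c, -r_b, -x * s + sigma * mu * np.ones(n)])
        d = np.linalg.solve(J, rhs)
        dx, dlam, ds = d[:n], d[n:n + m], d[n + m:]
        # Damped step: stay strictly inside x > 0, s > 0
        alpha = 1.0
        for v, dv in ((x, dx), (s, ds)):
            neg = dv < 0
            if neg.any():
                alpha = min(alpha, 0.99 * float(np.min(-v[neg] / dv[neg])))
        x, lam, s = x + alpha * dx, lam + alpha * dlam, s + alpha * ds
    return x, lam, s

# Toy LP: minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0; optimum is x = (1, 0).
A = np.array([[1.0, 1.0]]); b = np.array([1.0]); c = np.array([1.0, 2.0])
x, lam, s = primal_dual_ipm(A, b, c,
                            x=np.ones(2), lam=np.zeros(1), s=np.ones(2))
print(np.round(x, 6))  # very close to the corner-point solution (1, 0)
```

Note that the iterates approach the corner point x = (1, 0) from the interior: every iterate has strictly positive x and s, exactly as the lecture describes.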
For example, it is possible to define a neighborhood of the central path and keep the search within that neighborhood; that is a more practical way of implementing this sort of method. Those are details that can always be plugged in, but what I have explained is the overall structure of a primal-dual method. So what is the result that we can expect from a primal-dual method? One gets a result which essentially shows that the duality measure satisfies mu_{k+1} <= gamma mu_k, where gamma is a constant between 0 and 1 (it might depend on n, but it remains between 0 and 1). So the duality measure goes down to 0 geometrically. This means that as k becomes larger and larger, your duality measure shrinks, and once the duality measure goes to 0 you have satisfied complementary slackness; meanwhile, the way you have designed your search, it has always maintained feasibility with respect to the inequality constraints. So you have not only solved the linear and non-linear equations, you have also maintained feasibility and eventually satisfied complementary slackness, and that effectively ensures that your method has converged to the solution. So this is basically the essence of an interior point method.
You can see the things that I mentioned to you at the start: the method has made no distinction between the primal variable and the dual variable. It has searched, or computed, the primal and dual variables simultaneously, and its effectiveness lies in being able to do this, in computing both for you. It has attacked the problem jointly in the primal-dual space, and in that space optimization is not about a function over a set; rather, in the formulation it looks at, the optimization problem is a bunch of non-linear equations that need to be solved, and that is what it has tried to do. So with this I will stop my coverage of the Newton method, and that also covers the algorithms that we planned to cover in this course. In the rest of this course I will take up dynamic optimization and a little bit of dynamic programming, to show you how it works and to relate it to static optimization, the kind of optimization that we have studied so far.