Okay, so the next kind of problem we may be interested in solving is to minimize the Euclidean norm ||Ax - b||_2 over some subset of R^n, rather than the unconstrained problem over all possible x. So for example, you may want to minimize ||Ax - b||_2 subject to ||x||_2 = 1. This is called least squares minimization over a sphere. Or the constraint could be ||x||_2 <= alpha, which is like looking among points within a certain ball. But before we look at this particular problem, let us look at the more general least squares problem with a quadratic inequality constraint. The general form of such a problem is: minimize ||Ax - b||_2 subject to ||Bx - d||_2 <= alpha. Here A, as usual, is of size m x n and b is of length m; the matrix B is of size p x n, d is of length p, and alpha is greater than or equal to zero. Clearly, if alpha were negative, I would be asking that a norm be less than or equal to a negative number, which is never possible, so there would be no solution; the problem is meaningful only if the bound alpha on ||Bx - d||_2 is nonnegative. Now, in order to solve this problem, we need one more result, called the generalized SVD theorem. Again, I don't have time to prove this theorem for you, but we will just utilize the result to find the solution to that problem.
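Before the closed-form derivation, here is a minimal numerical sketch of the same problem solved by a generic constrained optimizer. All matrices and the value of alpha are made up, and SciPy's SLSQP method is assumed to be available; this only illustrates what problem we are solving, not the GSVD-based method.

```python
# Sanity-check sketch of the LSQI problem on made-up data, using a
# generic solver (SciPy SLSQP). The GSVD-based derivation solves the
# same problem without iterating over x.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m, n, p = 6, 4, 3
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
B, d = rng.standard_normal((p, n)), rng.standard_normal(p)
alpha = 0.5

def obj(x):        # squared objective, smooth for the solver
    return np.sum((A @ x - b) ** 2)

def slack(x):      # >= 0 exactly when ||Bx - d||_2 <= alpha
    return alpha**2 - np.sum((B @ x - d) ** 2)

# A feasible starting point: a least-squares solution of Bx = d.
x0 = np.linalg.lstsq(B, d, rcond=None)[0]
res = minimize(obj, x0, method="SLSQP",
               constraints=[{"type": "ineq", "fun": slack}])
x_star = res.x
```

The returned x_star satisfies the quadratic constraint (up to solver tolerance) and has no larger a residual than the feasible starting point.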
So if you are given two matrices, A of size m x n with m >= n, and B of size p x n, then you can find two orthonormal matrices, U of size m x m and V of size p x p, and an invertible matrix X of size n x n, such that U^T A X = D_A, a diagonal matrix containing alpha_1 through alpha_n, and V^T B X = D_B, a diagonal matrix containing beta_1 through beta_q, where q = min(p, n). So this is the theorem we are going to utilize. For now it's just some notation, but the key point is that this X is invertible; that's one thing to keep in mind. Using this theorem, I can rewrite ||Ax - b||_2: I pre-multiply by U^T, which doesn't change the norm, and insert X X^{-1}, so I get ||U^T A X (X^{-1} x) - U^T b||_2. Now U^T A X is the matrix D_A with entries alpha_1 through alpha_n on the diagonal; U^T b I'll define to be b tilde; and X^{-1} x I'll define to be y. So I'm just doing a variable transformation. Similarly, pre-multiplying by V^T, I can write ||Bx - d||_2 = ||D_B y - d tilde||_2, where d tilde = V^T d and D_B = diag(beta_1, ..., beta_q). With this, the optimization problem becomes: minimize ||D_A y - b tilde||_2^2 subject to ||D_B y - d tilde||_2^2 <= alpha^2. But since D_A and D_B are both diagonal matrices, I can easily expand this out and write the cost as sum_{i=1}^{n} (alpha_i y_i - b~_i)^2 plus sum_{i=n+1}^{m} b~_i^2; the y_i don't touch those remaining entries, so for them the alpha_i y_i term is zero and I'm just left with b~_i^2. And similarly the constraint expands into terms (beta_i y_i - d~_i)^2.
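The theorem is easiest to see by building a GSVD synthetically: pick orthonormal U, V, an invertible X, and diagonal D_A, D_B, form A and B from them, and verify the identities and the variable change. All shapes and values below are made up for illustration.

```python
# Synthetic GSVD check: construct A, B from the factors of the theorem
# and verify U^T A X = D_A, V^T B X = D_B, plus the variable change.
import numpy as np

rng = np.random.default_rng(1)
m, n, p = 5, 3, 4
q = min(p, n)                                    # q = min(p, n)

U, _ = np.linalg.qr(rng.standard_normal((m, m)))  # m x m orthonormal
V, _ = np.linalg.qr(rng.standard_normal((p, p)))  # p x p orthonormal
X = rng.standard_normal((n, n)) + 3 * np.eye(n)   # invertible (a.s.)

DA = np.zeros((m, n)); np.fill_diagonal(DA, [1.0, 2.0, 3.0])  # alpha_i
DB = np.zeros((p, n)); np.fill_diagonal(DB, [0.5, 0.0, 1.5])  # beta_i

A = U @ DA @ np.linalg.inv(X)
B = V @ DB @ np.linalg.inv(X)

# The two identities of the theorem:
assert np.allclose(U.T @ A @ X, DA)
assert np.allclose(V.T @ B @ X, DB)

# Variable change y = X^{-1} x turns ||Ax - b|| into ||D_A y - U^T b||:
x = rng.standard_normal(n)
b = rng.standard_normal(m)
y = np.linalg.solve(X, x)
assert np.isclose(np.linalg.norm(A @ x - b),
                  np.linalg.norm(DA @ y - U.T @ b))
```

The last assertion is exactly why the transformation is useful: the diagonalized problem has the same objective value as the original.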
And what I've done here is to take the upper limit in the constraint sum to be r, where r is the rank of the matrix B, so that beta_1 through beta_r are nonzero. If r is the rank of B, then beta_{r+1} through beta_q are all zero, and for those terms the beta_i y_i part vanishes, so the constraint becomes sum_{i=1}^{r} (beta_i y_i - d~_i)^2 + sum_{i=r+1}^{p} d~_i^2 <= alpha^2. So this problem looks a little easier, but we still have to do some work to solve it. First, notice that no matter what y I choose, ||D_B y - d~||_2^2 is always at least equal to the tail term sum_{i=r+1}^{p} d~_i^2, because the first sum is nonnegative. So if that tail term itself is bigger than alpha^2, then there is no feasible point and hence no solution. Now, if the tail term is exactly equal to alpha^2, then we have no choice but to make the first sum zero. That means we must choose beta_i y_i = d~_i, or y_i = d~_i / beta_i, for i = 1 to r. And for i = r+1 to n we can set alpha_i y_i = b~_i, that is y_i = b~_i / alpha_i, because we want to minimize the cost, and we can do that whenever alpha_i is nonzero. But if alpha_i itself equals zero, we can't affect the cost through y_i, so we might as well choose y_i = 0.
So this is the solution when sum_{i=r+1}^{p} d~_i^2 is exactly equal to alpha^2: y_i = d~_i / beta_i for i from 1 to r; y_i = b~_i / alpha_i for i from r+1 to n when alpha_i is nonzero; and y_i = 0 in the other cases. So we were able to solve it in this case. The next case is sum_{i=r+1}^{p} d~_i^2 strictly less than alpha^2. Here we have more options to pursue, because there is some legroom: you can possibly choose different y_i's, all of which satisfy the inequality constraint. For this case, we'll follow the opposite approach: solve the unconstrained problem first, just minimizing ||D_A y - b~||^2, and then check whether that solution satisfies the constraint ||D_B y - d~||_2^2 <= alpha^2. If it satisfies the constraint, we are done; if not, we'll have to do some more work. Now, ||D_A y - b~||^2 is minimized simply by choosing y_i = b~_i / alpha_i for all the alpha_i that are nonzero. But if alpha_i = 0, the cost doesn't care what y_i I choose, so I might as well choose it to minimize the constraint part, which gives y_i = d~_i / beta_i; this applies for i = 1 to r. And if alpha_i = 0 with i between r+1 and n, then I can arbitrarily choose y_i = 0. So with this y, we check whether ||D_B y - d~||^2 is less than or equal to alpha^2.
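The case analysis above, in diagonal coordinates, can be sketched as follows. The alpha_i, beta_i, b~, d~ values are made up; the point is the componentwise choice of y and the feasibility check.

```python
import numpy as np

# Unconstrained-first solution in diagonal coordinates (made-up data):
# minimize sum (alpha_i y_i - bt_i)^2 componentwise, spending the
# "free" components (alpha_i == 0, i <= r) on the constraint instead.
alpha_i = np.array([2.0, 0.0, 1.0, 3.0])   # diagonal of D_A (n = 4)
beta_i  = np.array([1.5, 0.5])             # nonzero betas (r = 2)
bt = np.array([1.0, -2.0, 0.3, 0.5])       # b tilde (first n entries)
dt = np.array([0.6, -0.4, 0.1])            # d tilde (p = 3)
alpha2 = 1.0                               # constraint level alpha^2
r = len(beta_i)

y = np.zeros_like(alpha_i)
for i in range(len(y)):
    if alpha_i[i] != 0:
        y[i] = bt[i] / alpha_i[i]          # minimizes the cost term
    elif i < r:
        y[i] = dt[i] / beta_i[i]           # free: help the constraint
    # else: leave y[i] = 0 (arbitrary)

# Feasibility: ||D_B y - d tilde||^2 <= alpha^2 ?
lhs = np.sum((beta_i * y[:r] - dt[:r]) ** 2) + np.sum(dt[r:] ** 2)
feasible = lhs <= alpha2
```

Note that for the component with alpha_i = 0 and i <= r, the term beta_i y_i - d~_i cancels exactly, just as claimed in the next step of the derivation.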
If so, then we are done. But if not, we need to do some more work. Just for clarity: if I expand ||D_B y - d~||^2 at this y, there are three kinds of terms. There is the sum over i = 1 to r with alpha_i nonzero of (beta_i b~_i / alpha_i - d~_i)^2. There would also be a sum over i = 1 to r with alpha_i = 0, but for those i we chose y_i = d~_i / beta_i, so beta_i y_i - d~_i cancels and those terms are zero; that's why I'm not writing them. And finally there is sum_{i=r+1}^{p} d~_i^2. If this total is greater than alpha^2, then the unconstrained solution is not feasible, so the constraint is active. In this case, and this is a small exercise you should show, the solution actually occurs at the boundary; that is, the optimum is attained when the constraint is met with equality. The way you show this: if the optimum were a point where ||D_B y - d~||^2 is strictly less than alpha^2, then you would have some headroom, which you could utilize to further reduce the original cost ||D_A y - b~||_2^2, contradicting optimality (given that the unconstrained minimizer is infeasible). So the solution occurs at the boundary, which means we need to solve: minimize ||D_A y - b~||^2 subject to ||D_B y - d~||^2 = alpha^2.
So we can utilize the method of Lagrange multipliers, where we write the Lagrangian L(y, lambda) = ||D_A y - b~||_2^2 + lambda (||D_B y - d~||_2^2 - alpha^2). We now differentiate this with respect to y. These are vector derivatives, among the very first results you'll see if you look up vector calculus; they work similarly to scalar derivatives, but you have to keep track of the transposes that show up. It's like differentiating x^2 to get 2x, and I'll drop the factor of 2 since it doesn't affect where the gradient vanishes; if I differentiate ||Ax||^2, I get 2 A^T A x, which is where the D_A^T D_A type of term comes from. So setting the gradient to zero gives D_A^T (D_A y - b~) + lambda D_B^T (D_B y - d~) = 0; lambda is just a scalar here, the Lagrange multiplier. Combining the terms that involve y and moving the rest to the other side: (D_A^T D_A + lambda D_B^T D_B) y = D_A^T b~ + lambda D_B^T d~. Now these are just diagonal matrices. Let's assume the matrix on the left is nonsingular; in fact, the singular case is also easy to handle. The other thing to note is that lambda is a parameter, so I can choose different values of lambda to try to make this nonsingular; if it is singular in spite of choosing different lambdas, you will have to handle that separately. But for the moment, assume this matrix is nonsingular. In that case, since it is diagonal, I can directly write out what y_i(lambda) is.
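As a numerical check of the normal equations just derived, a small sketch on made-up diagonal data: solve (D_A^T D_A + lambda D_B^T D_B) y = D_A^T b~ + lambda D_B^T d~ for one fixed lambda and confirm the Lagrangian's gradient in y vanishes at the solution.

```python
import numpy as np

# Stationarity check for the Lagrangian (all data made up).
m, n, p = 5, 4, 3
DA = np.zeros((m, n)); np.fill_diagonal(DA, [2.0, 1.0, 0.5, 3.0])
DB = np.zeros((p, n)); np.fill_diagonal(DB, [1.5, 0.5, 1.0])
bt = np.array([1.0, -2.0, 0.3, 0.5, 0.2])   # b tilde
dt = np.array([0.6, -0.4, 0.1])             # d tilde
lam = 0.7

# (D_A^T D_A + lam D_B^T D_B) y = D_A^T bt + lam D_B^T dt
M = DA.T @ DA + lam * DB.T @ DB             # diagonal, nonsingular here
y = np.linalg.solve(M, DA.T @ bt + lam * DB.T @ dt)

# Gradient of L = ||DA y - bt||^2 + lam (||DB y - dt||^2 - alpha^2)
# with respect to y (the alpha^2 term does not depend on y):
grad = 2 * DA.T @ (DA @ y - bt) + 2 * lam * DB.T @ (DB @ y - dt)
```

Since M is diagonal, the solve is really n independent scalar divisions, which is what gives the componentwise formula below.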
Componentwise, the ith diagonal entry of the left matrix is alpha_i^2 + lambda beta_i^2 and the ith entry of the right side is alpha_i b~_i + lambda beta_i d~_i, so y_i(lambda) = (alpha_i b~_i + lambda beta_i d~_i) / (alpha_i^2 + lambda beta_i^2), for i from 1 to r. Beyond i = r, the beta_i are zero, so I'm just left with y_i = b~_i / alpha_i for i = r+1 to n. And how do I find lambda? I need to solve for the lambda that satisfies the constraint ||D_B y(lambda) - d~||^2 = alpha^2. If I expand this out and call the left-hand side phi(lambda), I get phi(lambda) = sum_{i=1}^{r} (beta_i y_i(lambda) - d~_i)^2 + sum_{i=r+1}^{p} d~_i^2 = alpha^2; for the terms beyond r, y(lambda) multiplies zero, so only the d~_i^2 remain. Now put beta_i y_i(lambda) - d~_i over the common denominator alpha_i^2 + lambda beta_i^2. In the numerator I get beta_i (alpha_i b~_i + lambda beta_i d~_i) - d~_i (alpha_i^2 + lambda beta_i^2) = alpha_i beta_i b~_i + lambda beta_i^2 d~_i - alpha_i^2 d~_i - lambda beta_i^2 d~_i; the lambda beta_i^2 d~_i terms of course cancel, and alpha_i is common to what remains. So what I'm left with is sum_{i=1}^{r} alpha_i^2 (beta_i b~_i - alpha_i d~_i)^2 / (alpha_i^2 + lambda beta_i^2)^2 + sum_{i=r+1}^{p} d~_i^2 = alpha^2. This equation is called the secular equation. The name apart, the point is that lambda is sitting in the denominator and you have a sum of r such terms, so you can't solve this in closed form. But when I set lambda = 0, phi(0) is exactly the constraint value I would have obtained from the unconstrained solution.
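The secular equation can be solved numerically by exploiting monotonicity, as discussed next; here is a bisection sketch on made-up data (with r = p, so there is no constant tail term).

```python
import numpy as np

# Solve the secular equation phi(lam) = alpha^2 by bisection, where
# phi(lam) = sum_i alpha_i^2 (beta_i bt_i - alpha_i dt_i)^2
#                 / (alpha_i^2 + lam beta_i^2)^2
# is monotone decreasing for lam >= 0 (all data made up, r = p = 3).
ai = np.array([2.0, 1.0, 0.5])     # alpha_1 .. alpha_r
bi = np.array([1.5, 0.5, 1.0])     # beta_1 .. beta_r
bt = np.array([1.0, -2.0, 0.3])    # b tilde
dt = np.array([0.6, -0.4, 0.1])    # d tilde
alpha2 = 0.05                      # target alpha^2

def phi(lam):
    num = ai**2 * (bi * bt - ai * dt)**2
    return np.sum(num / (ai**2 + lam * bi**2)**2)

# phi(0) > alpha^2 is the case where the constraint is active.
lo, hi = 0.0, 1.0
while phi(hi) > alpha2:            # bracket: phi -> 0 as lam -> inf
    hi *= 2
for _ in range(100):               # plain bisection on the bracket
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if phi(mid) > alpha2 else (lo, mid)
lam_star = 0.5 * (lo + hi)
```

Bisection is the simplest choice; anything exploiting smoothness (Newton's method on phi, say) would converge faster, but the monotone bracket makes bisection foolproof here.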
And we know that phi(0) must be bigger than alpha^2; that's why we came into all of this. Because lambda is sitting in the denominator, phi(lambda) is a monotone decreasing function of lambda >= 0: as I increase lambda, phi(lambda) keeps decreasing, toward the constant tail sum_{i=r+1}^{p} d~_i^2, which is less than alpha^2 in this case. So there is a unique positive lambda* such that phi(lambda*) is exactly equal to alpha^2, but you'll need to use some numerical recipe to solve this equality condition. So that completes my description of how to solve least squares problems with quadratic inequality constraints. Knowing how to solve this is actually very powerful. For instance, if you want to perform least squares minimization over a sphere, you can do that by setting B equal to the identity matrix and d = 0; then the constraint just becomes ||x||_2 <= alpha, so you are minimizing the L2 error between Ax and b subject to ||x||_2 <= alpha. There is also another succinct solution via the SVD, but I won't discuss that here. Similarly, if you want to solve an equality-constrained least squares problem, that is, minimize ||Ax - b||_2 subject to Bx = d, all I have to do is set alpha = 0 and use the solution developed here. There is also another succinct solution via QR that I won't discuss here. So basically, this brings me to the end of what I wanted to discuss in this course. Are there any questions?
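For the sphere special case (B = I, d = 0), here is a self-contained sketch on made-up data. Note it uses the ridge-regularized normal equations x(lambda) = (A^T A + lambda I)^{-1} A^T b directly, rather than the GSVD diagonalization, and finds lambda by bisection on the monotone decreasing norm ||x(lambda)||.

```python
import numpy as np

# Least squares over a ball: min ||Ax - b||_2 s.t. ||x||_2 <= alpha.
# If the unconstrained minimizer is infeasible, find lam >= 0 with
# ||x(lam)||_2 = alpha, where x(lam) = (A^T A + lam I)^{-1} A^T b.
rng = np.random.default_rng(2)
m, n = 8, 4
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
alpha = 0.1

def x_of(lam):
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ b)

x_uncon = x_of(0.0)
if np.linalg.norm(x_uncon) <= alpha:
    x_star = x_uncon                       # constraint inactive: done
else:
    lo, hi = 0.0, 1.0
    while np.linalg.norm(x_of(hi)) > alpha:
        hi *= 2                            # ||x(lam)|| decreases in lam
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(x_of(mid)) > alpha:
            lo = mid
        else:
            hi = mid
    x_star = x_of(0.5 * (lo + hi))         # solution on the boundary
```

The returned x_star is feasible and, since the problem is convex, beats any other point of the ball, e.g. the unconstrained solution rescaled onto the sphere.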