Last time we were looking at the variational characterization of eigenvalues, by which we mean characterizing eigenvalues as solutions to optimization problems. This is specific to Hermitian matrices, which have the property that all their eigenvalues are real, so you can consider ordered eigenvalues, say in increasing order. We saw the Rayleigh-Ritz theorem. It says that if you have a Hermitian matrix A with ordered eigenvalues λ_1 ≤ λ_2 ≤ ... ≤ λ_n, then

λ_1 x^H x ≤ x^H A x ≤ λ_n x^H x for any x ∈ C^n.

The quantity x^H x is the squared Euclidean length of x. That length gets scaled when you form x^H A x: the smallest possible scaling is λ_1 and the largest possible scaling is λ_n. So that gives you bounds on how large or small x^H A x can become compared to x^H x. Further, λ_max = λ_n is the largest value that x^H A x / x^H x can take over all x ≠ 0, which is the same as the maximum of x^H A x over the complex unit sphere {x : x^H x = 1}. Similarly, λ_1, the smallest eigenvalue, is the minimum of x^H A x / x^H x over all x ≠ 0, which is the same as the minimum of x^H A x over the complex unit sphere.

A corollary to this was: if A is a Hermitian matrix and we define α = x^H A x / x^H x for any nonzero x ∈ C^n, then there is at least one eigenvalue of A in the interval (−∞, α] and at least one eigenvalue in the interval [α, ∞). Today we continue this discussion and talk about further results on such variational characterizations of eigenvalues. So that was the theorem. Is there a question?

"Yes, sir. What is the use of this corollary?"

It allows you to identify intervals in which eigenvalues of A must lie, and we'll see examples of further results we can derive based on it. In fact, this corollary is an easy consequence of the Rayleigh-Ritz theorem. Sometimes we may not explicitly refer to the corollary and will instead go back to the Rayleigh-Ritz theorem itself, but several of the results we derive that way are consequences of this corollary as well. For now, just note what happens if you pick a specific x. For example, take x = e_1: then x^H A x = a_11, the (1,1) entry of A, and of course x^H x = 1 for that vector. So there is at least one eigenvalue of A in (−∞, a_11] and at least one eigenvalue in [a_11, ∞). This applies to any diagonal entry: taking x = e_k picks out the k-th diagonal entry of A. So there is at least one eigenvalue less than or equal to any given diagonal entry of A, and at least one eigenvalue greater than or equal to it. In fact, it is often useful to approximately locate the eigenvalues this way.
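To make this concrete, here is a minimal numerical sketch, not from the lecture; the random Hermitian test matrix is my own illustrative choice. It checks the Rayleigh-Ritz bounds for a random x and the diagonal-entry consequence of the corollary:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                 # (B + B^H)/2 is always Hermitian
lam = np.linalg.eigvalsh(A)              # real eigenvalues, ascending order

# Rayleigh-Ritz: lambda_1 <= x^H A x / x^H x <= lambda_n for any nonzero x.
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
alpha = (x.conj() @ A @ x).real / (x.conj() @ x).real
assert lam[0] <= alpha <= lam[-1]

# Corollary with x = e_k: x^H A x = a_kk, so every diagonal entry lies
# between lambda_1 and lambda_n, i.e. at least one eigenvalue sits on
# either side of each a_kk.
d = np.diag(A).real
assert np.all(lam[0] <= d + 1e-12) and np.all(d <= lam[-1] + 1e-12)
print("eigenvalues:", lam, "\nalpha:", alpha, "\ndiagonal:", d)
```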
You may not want the exact eigenvalues, simply because computing them is a computationally expensive task, especially for very large matrices. So finding bounds, or intervals in which the eigenvalues must lie, is actually very useful. "Thank you, sir." Okay, so we'll continue.

Now, this result tells us something about λ_max and λ_min. A natural question is: what about the other eigenvalues? Can we have a variational characterization of those as well? Any Hermitian matrix is unitarily diagonalizable, so suppose A = U Λ U^H, where U is unitary with columns u_1 through u_n, and Λ is the diagonal matrix containing the eigenvalues of A. Now suppose we consider only the vectors x ∈ C^n that are orthogonal to u_1, the first column of U, whose corresponding eigenvalue λ_1 is the smallest. Then we have the following. Expanding x^H A x with A = U Λ U^H,

x^H A x = Σ_{i=1}^n λ_i |(U^H x)_i|^2 = Σ_{i=1}^n λ_i |u_i^H x|^2,

since the i-th entry of U^H x is simply u_i^H x (U has columns u_1 to u_n). Now u_1^H x = 0, because I am considering only x orthogonal to u_1, so I can drop the i = 1 term and write this as Σ_{i=2}^n λ_i |u_i^H x|^2. This is a non-negative combination of λ_2 through λ_n, and λ_2 is the smallest of these numbers. So if I replace all the eigenvalues in the sum by λ_2, I am only making the sum smaller. Then I get

x^H A x ≥ λ_2 Σ_{i=2}^n |u_i^H x|^2 = λ_2 Σ_{i=1}^n |(U^H x)_i|^2 = λ_2 x^H U U^H x = λ_2 x^H x,

where I have reinserted the i = 1 term, which is zero, and used the fact that U is unitary, so U U^H is the identity. So we now have that x^H A x ≥ λ_2 x^H x for any x that is orthogonal to u_1.

We can achieve equality by choosing x = u_2. You can see that from the expansion itself: if x = u_2, only the i = 2 term survives, because the eigenvectors are orthonormal, and the sum becomes λ_2 |u_2^H u_2|^2 = λ_2. So u_2^H A u_2 = λ_2. That means

min over x ≠ 0, x ⟂ u_1 of x^H A x / x^H x = min over x^H x = 1, x ⟂ u_1 of x^H A x,

where instead of considering all nonzero x, I can just as well minimize over all x with x^H x = 1 while retaining the constraint x ⟂ u_1; then I don't have to divide by the denominator. And this is equal to λ_2. So this shows how I can characterize other eigenvalues as solutions to optimization problems: if I want λ_2, I need to insert the constraint that x be perpendicular to u_1.
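Again as a sketch rather than anything from the lecture (the random matrix and the projection step are illustrative assumptions), we can check numerically that the Rayleigh quotient of any x orthogonal to u_1 stays at or above λ_2, with equality at x = u_2:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2              # random Hermitian test matrix
lam, U = np.linalg.eigh(A)            # ascending eigenvalues, eigenvector columns
u1, u2 = U[:, 0], U[:, 1]

def rayleigh(x):
    return (x.conj() @ A @ x).real / (x.conj() @ x).real

# Any x orthogonal to u1 has Rayleigh quotient at least lambda_2 ...
for _ in range(1000):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x = x - (u1.conj() @ x) * u1      # project out the u1 component
    assert rayleigh(x) >= lam[1] - 1e-10

# ... and x = u2 achieves equality.
assert abs(rayleigh(u2) - lam[1]) < 1e-10
```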
By making the same exact argument and extending it, we have

min over x ≠ 0, x ⟂ u_1, u_2, ..., u_{k-1} of x^H A x / x^H x = min over x^H x = 1, x ⟂ u_1, ..., u_{k-1} of x^H A x = λ_k,

and this is true for k = 2, 3, etc. It is also true for k = 1, except that the constraint x ⟂ u_1, ..., u_{k-1} drops off when k = 1. We'll follow that convention going forward, and so we may even write k = 1, 2, 3, etc. When I say x is perpendicular to u_1 through u_{k-1} and set k = 1, that would be saying x is perpendicular to u_0, but there is no such vector u_0; what that means is that the constraint simply drops off. So this is one way to write all the eigenvalues of the matrix A in terms of optimization problems.

Similarly, remember that we also characterized λ_max as the solution to a maximization problem. Starting from λ_n, if you consider all x that are perpendicular to u_n and then proceed with exactly these arguments, what you can show is that

max over x ≠ 0, x ⟂ u_n, u_{n-1}, ..., u_{n-k+1} of x^H A x / x^H x = max over x^H x = 1, x ⟂ u_n, ..., u_{n-k+1} of x^H A x = λ_{n-k},

since the constraints go up to u_{n-k+1}, the answer is λ_{n-k}, again for k = 1, 2, et cetera; when I put k = 1, I get λ_{n-1}. So this is another way to characterize the eigenvalues of A, as solutions to maximization problems.

So we've seen these variational characterizations of all the eigenvalues of the matrix A. Now, this is nice, but it has a small drawback: in order to set up the optimization problem, in the first case you need to know what u_1, u_2, ..., u_{k-1} are, and in the second case you need to know what u_n down to u_{n-k+1} are. (Before we address this, the sketch below checks both characterizations numerically.)
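This sketch is my own illustration with an assumed random test matrix; it uses the fact that restricting A to the span of the remaining eigenvectors turns each constrained problem into an ordinary smallest- or largest-eigenvalue problem:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 7
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
lam, U = np.linalg.eigh(A)   # ascending eigenvalues, eigenvector columns u_1..u_n

for k in range(1, n + 1):
    # Min form: x perp u_1, ..., u_{k-1} means x lies in span(u_k, ..., u_n);
    # the smallest Rayleigh quotient there is lambda_k.
    Q = U[:, k - 1:]
    assert abs(np.linalg.eigvalsh(Q.conj().T @ A @ Q)[0] - lam[k - 1]) < 1e-10

for k in range(1, n):
    # Max form: x perp u_n, ..., u_{n-k+1} means x lies in span(u_1, ..., u_{n-k});
    # the largest Rayleigh quotient there is lambda_{n-k}.
    Q = U[:, :n - k]
    assert abs(np.linalg.eigvalsh(Q.conj().T @ A @ Q)[-1] - lam[n - k - 1]) < 1e-10
```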
We can overcome this dependence on the knowledge of the eigenvectors as follows. Let w be an arbitrary vector in C^n. Then consider the maximum, which we'll write as sup for no particular reason: for the purposes of this course, sup and max are the same, and the textbook writes sup, so I am also writing sup here. Substituting A = U Λ U^H and expanding in terms of a summation,

sup over x^H x = 1, x ⟂ w of x^H A x = sup over x^H x = 1, x ⟂ w of Σ_{i=1}^n λ_i |(U^H x)_i|^2.

Now I'll call this vector z: U^H x = z, or x = U z (multiply by U on the left). Then x^H x = 1 is the same as z^H U^H U z = 1, but U^H U is the identity matrix, so I can write the constraint as z^H z = 1. Likewise x = U z, so the constraint that x be perpendicular to w becomes U z ⟂ w, and the cost function becomes Σ_{i=1}^n λ_i |z_i|^2. Now, U z ⟂ w means w^H U z = 0, equivalently z^H U^H w = 0, which is the same as saying z ⟂ U^H w; these are all equivalent statements. So instead of constraining U z to be perpendicular to w, I can constrain z to be perpendicular to U^H w. At this point we have shown that the original supremum is exactly equal to

sup of Σ_{i=1}^n λ_i |z_i|^2 subject to z^H z = 1 and z ⟂ U^H w.

And then we do something which I consider quite brilliant: observe that this is greater than or equal to the supremum of the same quantity, subject to the same two constraints, but with one extra constraint thrown in, namely z_1 = z_2 = ... = z_{n-2} = 0. All I have done is add a few extra constraints, and forcing some of the z_i to 0 can only decrease the optimal value: whatever supremum you could achieve before, you may or may not be able to achieve it now, because not every point that was feasible remains feasible under the additional constraint. (Note that the restricted feasible set is still nonempty: vectors with only z_{n-1} and z_n nonzero form a two-dimensional subspace, and the single linear constraint z ⟂ U^H w still leaves at least one dimension of unit vectors within it.)

Since z_1 = ... = z_{n-2} = 0 and z^H z = 1, the norm constraint becomes |z_{n-1}|^2 + |z_n|^2 = 1, and dropping the zero terms, the cost function becomes λ_{n-1} |z_{n-1}|^2 + λ_n |z_n|^2.
These two squared magnitudes are non-negative and add up to 1, so this is a convex combination of λ_{n-1} and λ_n. A convex combination of two numbers lies between them, and λ_{n-1} is the smaller, so whatever this is, it is at least λ_{n-1}. So in effect what we ended up showing is that

sup of x^H A x subject to x^H x = 1 and x ⟂ w is greater than or equal to λ_{n-1},

and this is true for any arbitrary w ∈ C^n. Since it is true for every w, the inequality survives even if we take the infimum of the left-hand side over w. In other words,

inf over w ∈ C^n of [ sup over x^H x = 1, x ⟂ w of x^H A x ] ≥ λ_{n-1}.

"Sir? In the infimum statement, why is there not equality?"

That's what I'm coming to next; that is exactly the point. What we have shown so far is that the infimum is at least λ_{n-1}. But from what we saw above, the supremum achieves equality if I set w = u_n: with that choice, the supremum is exactly λ_{n-1}, as we showed earlier. And so the conclusion is that

inf over w ∈ C^n of sup over x^H x = 1, x ⟂ w of x^H A x = λ_{n-1}.

In other words, this is a different optimization problem that characterizes λ_{n-1}: instead of taking the supremum over x perpendicular to u_n, I take the supremum over x perpendicular to an arbitrary w, and then the infimum over all such choices of w.
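Summarizing the whole argument compactly in the lecture's notation (this is only a restatement, nothing new; it assumes amsmath for the LaTeX macros):

```latex
\[
\sup_{\substack{x^H x = 1 \\ x \perp w}} x^H A x
  \;=\; \sup_{\substack{z^H z = 1 \\ z \perp U^H w}} \sum_{i=1}^{n} \lambda_i |z_i|^2
  \;\ge\; \lambda_{n-1}
  \qquad \text{for every } w \in \mathbb{C}^n,
\]
\[
\text{hence} \qquad
\inf_{w \in \mathbb{C}^n} \; \sup_{\substack{x^H x = 1 \\ x \perp w}} x^H A x
  \;=\; \lambda_{n-1},
  \qquad \text{with equality at } w = u_n.
\]
```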
And so, at least technically, the way this optimization problem is set up, I don't need to know what u_n is in order to pose the problem. It's another matter that the solution to this optimization problem occurs at w = u_n; but the problem setup itself carries no requirement that u_n be known.
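As a closing sanity check, here is a minimal self-contained numpy sketch, my own illustration rather than anything from the lecture (the random test matrix and the QR-based construction of a basis orthogonal to w are assumptions of the sketch). It confirms that the supremum over x ⟂ w never falls below λ_{n-1} for randomly drawn w, and that the choice w = u_n attains it:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                    # random Hermitian test matrix
lam, U = np.linalg.eigh(A)                  # ascending eigenvalues, eigenvectors

def sup_orthogonal_to(w):
    # Largest value of x^H A x over unit vectors x with x perp w:
    # build an orthonormal basis Q for {x : x perp w} (QR of [w | I], then
    # drop the first column, which is parallel to w); the constrained
    # supremum is the top eigenvalue of the restricted matrix Q^H A Q.
    Q = np.linalg.qr(np.column_stack([w, np.eye(n)]))[0][:, 1:]
    return np.linalg.eigvalsh(Q.conj().T @ A @ Q)[-1]

# sup over x perp w of x^H A x >= lambda_{n-1} for every w ...
for _ in range(500):
    w = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    assert sup_orthogonal_to(w) >= lam[-2] - 1e-10

# ... and w = u_n attains it, so the inf-sup equals lambda_{n-1}.
assert abs(sup_orthogonal_to(U[:, -1]) - lam[-2]) < 1e-10
print("inf-sup value matches lambda_{n-1}:", lam[-2])
```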