Okay, now the next thing I want to say is something also very useful. In general, the eigenvalues of a matrix are roots of its characteristic polynomial. But for Hermitian matrices, we can write the eigenvalues as solutions to optimization problems. That different way of looking at the eigenvalues of Hermitian matrices is called the variational characterization of the eigenvalues of Hermitian matrices. Variational characterization just means that each eigenvalue is the solution to an optimization problem: by varying some cost function and looking for local minima, local maxima, and saddle points of this cost function, you can identify all the eigenvalues of the matrix. So, recall that the eigenvalues of a Hermitian matrix are all real. So, we can consider ordered eigenvalues: for a Hermitian matrix A, we will call lambda min, or lambda 1, the smallest eigenvalue, and that is less than or equal to lambda 2, less than or equal to, and so on, up to lambda n, the largest eigenvalue. So, we will consider ordered eigenvalues like this. Then we have the following theorem, called the Rayleigh-Ritz theorem. Let A in C to the n cross n be Hermitian. Then lambda 1 times x Hermitian x is less than or equal to x Hermitian A x, which is less than or equal to lambda n times x Hermitian x, for every x in C to the n. What this means is that for any x in C to the n, the quantity x Hermitian A x is lower bounded by lambda 1 times x Hermitian x and upper bounded by lambda n times x Hermitian x. And in fact, both the lower bound and the upper bound are achievable, and you can achieve them by setting x to be the eigenvector corresponding to the smallest and largest eigenvalues respectively. So, basically we have that lambda max, which is lambda n, is equal to the max over all nonzero x of x Hermitian A x over x Hermitian x.
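As a quick numerical sanity check of the theorem statement above, here is a small sketch (assuming NumPy is available; the random Hermitian matrix and test vectors are arbitrary choices for illustration) that verifies the two bounds lambda 1 times x Hermitian x <= x Hermitian A x <= lambda n times x Hermitian x for many random x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random Hermitian matrix A = (B + B^H) / 2.
n = 5
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

# Ordered real eigenvalues: lambda_1 <= ... <= lambda_n.
eigvals = np.linalg.eigvalsh(A)
lam1, lamn = eigvals[0], eigvals[-1]

# Check lambda_1 * x^H x <= x^H A x <= lambda_n * x^H x for random x.
for _ in range(1000):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    quad = (x.conj() @ A @ x).real   # x^H A x is real since A is Hermitian
    norm2 = (x.conj() @ x).real      # x^H x
    assert lam1 * norm2 - 1e-9 <= quad <= lamn * norm2 + 1e-9
```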
But if I scale x by some constant, the numerator scales by that constant modulus squared and the denominator also scales by that constant modulus squared. And so I can also write this as the max over all x such that x Hermitian x equals one of x Hermitian A x. The first form is what you would call an unconstrained optimization problem, except for the small constraint that x cannot be equal to 0; the second is a constrained optimization problem. So, I can find lambda n by solving this optimization problem: maximize x Hermitian A x subject to x Hermitian x being equal to one. And similarly, I can write lambda min, equal to lambda 1, as the min over all x not equal to 0 of x Hermitian A x divided by x Hermitian x, which is equal to the min over x Hermitian x equals one of x Hermitian A x. So, let us see why this is true. This is a very important result, which we will use many times in the coming classes. If A is Hermitian, there exists a unitary U such that A equals U Lambda U Hermitian, where Lambda is a diagonal matrix containing the eigenvalues along the diagonal. Now consider, for any x in C to the n, the quantity x Hermitian A x. That is the same as x Hermitian U Lambda U Hermitian x, which is of course equal to (U Hermitian x) Hermitian times Lambda times (U Hermitian x). Now, Lambda here is a diagonal matrix, so I can expand this out and write it as the sum over i equal to 1 to n of lambda i times the mod square of the ith component of U Hermitian x. And each of these mod-square terms is non-negative. So, since I am taking a linear combination of non-negative terms scaled by the lambda i's, if I replace all the lambda i's by lambda min, I get a lower bound on whatever value this can achieve, and if I replace them all by lambda max, I get an upper bound.
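The claim that the bounds are attained exactly at the extreme eigenvectors can be checked directly. The sketch below (again assuming NumPy; the `rayleigh` helper is a name introduced here just for illustration) evaluates the Rayleigh quotient at the eigenvectors for lambda 1 and lambda n.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2

# eigh returns eigenvalues in ascending order with matching eigenvectors.
lams, U = np.linalg.eigh(A)
v1, vn = U[:, 0], U[:, -1]   # unit eigenvectors for lambda_1 and lambda_n

def rayleigh(A, x):
    """Rayleigh quotient x^H A x / x^H x (real when A is Hermitian)."""
    return ((x.conj() @ A @ x) / (x.conj() @ x)).real

# The quotient attains its min and max at the extreme eigenvectors.
assert np.isclose(rayleigh(A, v1), lams[0])
assert np.isclose(rayleigh(A, vn), lams[-1])
```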
So, that means that lambda min times the summation over i equal to 1 to n of the mod square of the ith component of U Hermitian x is less than or equal to x Hermitian A x, which is less than or equal to lambda max times the summation over i equal to 1 to n of the mod square of the ith component of U Hermitian x. And because U is unitary, this summation is nothing but the summation of mod x i squared, which is x Hermitian x. Another way to see this is that the summation is (U Hermitian x) Hermitian times (U Hermitian x), which is x Hermitian U U Hermitian x, and U U Hermitian is equal to the identity matrix for U unitary. And so, this is nothing but x Hermitian x. Substituting that in, I get lambda min times x Hermitian x, where lambda min is the same as lambda 1 in my notation, is less than or equal to x Hermitian A x, which is less than or equal to lambda n times x Hermitian x, which is equal to lambda max times x Hermitian x; lambda max is the same as lambda n in my notation. Okay, so we will call this (star) for later use. Now, we have found these bounds; when can equality be attained here and here? Equality can be attained by choosing x equal to the eigenvector corresponding to lambda 1 for the lower inequality, the first one, and equal to the eigenvector corresponding to lambda n for the upper inequality. Further, from this equation, for x not equal to 0, x Hermitian A x over x Hermitian x, where I am just taking x Hermitian x to the other side, is less than or equal to lambda max, or lambda n, with equality when x is an eigenvector of A corresponding to lambda n. And similarly, x Hermitian A x over x Hermitian x is greater than or equal to lambda 1, with equality when x is an eigenvector of A corresponding to lambda 1.
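The key identity in the proof, that x Hermitian A x equals the sum of lambda i times the mod square of the ith component of U Hermitian x, and that this sum of mod squares equals x Hermitian x, can also be verified numerically. A short sketch under the same NumPy assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
lams, U = np.linalg.eigh(A)   # A = U diag(lams) U^H

x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y = U.conj().T @ x            # y = U^H x

# x^H A x = sum_i lambda_i |y_i|^2
quad = (x.conj() @ A @ x).real
assert np.isclose(quad, np.sum(lams * np.abs(y) ** 2))

# sum_i |y_i|^2 = x^H x, because U is unitary
assert np.isclose(np.sum(np.abs(y) ** 2), (x.conj() @ x).real)
```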
So, these two in turn imply, since the quotient is always bounded by these values and attains equality at the eigenvectors, so the bounds are achievable, that the max over x not equal to 0 of x Hermitian A x over x Hermitian x is equal to lambda n, and the min over x not equal to 0 of x Hermitian A x over x Hermitian x is equal to lambda 1. And finally, if x is not equal to 0 and f of x equals x Hermitian A x over x Hermitian x, then f of alpha x is (alpha x) Hermitian A (alpha x) divided by (alpha x) Hermitian (alpha x), which is equal to mod alpha squared times x Hermitian A x divided by mod alpha squared times x Hermitian x. The mod alpha squared factors cancel each other, and so this is equal to f of x. So, basically, I can solve these optimization problems equivalently by scaling any nonzero x to have unit norm, and so I can equivalently optimize over x with L2 norm equal to 1, to get that the max over x such that the L2 norm of x equals 1 of x Hermitian A x equals lambda n, and the min over x such that the L2 norm of x equals 1 of x Hermitian A x equals lambda 1. So, that is basically the proof of this theorem. Geometrically, what is happening is that lambda n is the largest value of x Hermitian A x as I vary x over the complex n-dimensional unit sphere, and lambda 1 is the smallest value of x Hermitian A x as I vary x over that sphere. So, that is this result. I will go through the statement of the Rayleigh-Ritz theorem once again, because this result is very, very crucial and we will be using it quite extensively. If A is a Hermitian matrix, then lambda 1 times x Hermitian x is a lower bound on x Hermitian A x, and lambda n times x Hermitian x is an upper bound on x Hermitian A x, for all x in C to the n. The fact that the matrix is Hermitian is crucial for this result to hold.
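The geometric picture, that x Hermitian A x ranges over the interval from lambda 1 to lambda n as x varies over the complex unit sphere, can be illustrated by sampling. This is only a sketch under the same NumPy assumptions, with an arbitrary random Hermitian matrix:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2
lams = np.linalg.eigvalsh(A)

# Sample unit vectors on the complex unit sphere and evaluate x^H A x.
vals = []
for _ in range(20000):
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x /= np.linalg.norm(x)
    vals.append((x.conj() @ A @ x).real)
vals = np.array(vals)

# Every sampled value lies in [lambda_1, lambda_n].
assert lams[0] - 1e-9 <= vals.min()
assert vals.max() <= lams[-1] + 1e-9
```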
If A is not Hermitian, then this result need not hold. So, just to give you a simple example to illustrate that, we go back to our favorite defective matrix. If I take N to be the 2 by 2 matrix with rows 0 1 and 0 0, and if I take x to be the vector with both components equal to 1 over square root of 2, then if I compute x Hermitian N x, that is going to be equal to 1 over square root of 2 times 1 over square root of 2, which is equal to one half. That is strictly greater than 0, even though all the eigenvalues of N are 0. So, in other words, the inequality required by the Rayleigh-Ritz theorem does not hold for this example. Also, from the fact that lambda 1 times x Hermitian x is a lower bound on x Hermitian A x and lambda n times x Hermitian x is an upper bound on x Hermitian A x, we have the following eigenvalue inclusion result. If A is again a Hermitian matrix and x is some nonzero vector in C to the n, let alpha be defined as x Hermitian A x over x Hermitian x. Then there is at least one eigenvalue of A in the interval from minus infinity to alpha, and at least one eigenvalue of A in the interval from alpha to infinity, both intervals including the point alpha. This is because, for an arbitrary x, x Hermitian A x over x Hermitian x is going to be between lambda 1 and lambda n, so there is at least one eigenvalue to the left of alpha, including the point alpha, and at least one eigenvalue to the right of alpha, including the point alpha. Now, of course, this result was essentially about bounding x Hermitian A x in terms of the smallest and largest eigenvalues of the matrix A, and so one could wonder about the other eigenvalues. Can we also develop variational characterizations for, for example, lambda 2, lambda 3, and the other eigenvalues? It turns out the answer is yes, and that is another very nice theorem that we will cover in the next class: the Courant-Fischer theorem.
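Both the non-Hermitian counterexample and the eigenvalue inclusion result can be checked in a few lines. In the sketch below (NumPy assumed; the Hermitian matrix and vector used for the inclusion check are arbitrary illustrative choices, not from the lecture), x Hermitian N x comes out to one half even though every eigenvalue of N is 0, so the Rayleigh-Ritz upper bound fails for N.

```python
import numpy as np

# The defective matrix N from the example: both eigenvalues are 0.
N = np.array([[0.0, 1.0],
              [0.0, 0.0]])
x = np.array([1.0, 1.0]) / np.sqrt(2)

# x^H N x = 1/2 > 0 = lambda_max * x^H x, so Rayleigh-Ritz fails here.
quad = x @ N @ x
assert np.isclose(quad, 0.5)
assert np.allclose(np.linalg.eigvals(N), 0.0)

# Eigenvalue inclusion for a Hermitian A: alpha = x^H A x / x^H x
# always has at least one eigenvalue on each side.
A = np.array([[2.0, 1.0],
              [1.0, -1.0]])   # real symmetric, hence Hermitian
lams = np.linalg.eigvalsh(A)
z = np.array([1.0, 2.0])
alpha = (z @ A @ z) / (z @ z)
assert lams[0] <= alpha <= lams[-1]
```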