The reason we are looking at many of these results is that, ultimately, when we develop algorithms for optimization, we have to deal with matrices. You often end up with powers of these matrices, and you want to know whether your algorithm converges. So a fundamental question is whether A^k converges as k gets large.

So let me first give this definition. A matrix A in C^{n x n} is said to be a convergent matrix if the limit of A^k as k tends to infinity equals 0. Here 0 is the n x n all-zero matrix. What we will be interested in is under what conditions we can say that a matrix A is convergent. Obviously, you can imagine that in particular applications you may not have a matrix A such that A^k goes to 0. But what we will see is that by suitably scaling the matrix, you can ensure that the scaled matrix is a convergent matrix. What I mean by that is: if you scale a matrix, all its eigenvalues also scale by the same amount. So I can choose the scaling factor so that all of its eigenvalues have magnitude strictly less than 1. Then the scaled matrix has a spectral radius which is less than 1, and what we will see momentarily is that if a matrix has a spectral radius less than 1, then it is convergent. So let us get there.

We have the following lemma. Lemma: let A be an n x n matrix. If there is a matrix norm ||.|| such that ||A|| < 1, then the limit of A^k as k goes to infinity is 0. What this means is that every entry of A^k goes to 0 as k goes to infinity.
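To make the lemma concrete, here is a minimal sketch in plain Python. It assumes a hypothetical 2 x 2 matrix A and uses the maximum absolute row sum as the matrix norm; the helper names (matmul, power, inf_norm, max_entry) are made up for this illustration. It prints the largest entry of A^k next to ||A^k|| and ||A||^k, so you can watch all three shrink.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def power(X, k):
    """k-th power of X by repeated multiplication, starting from the identity."""
    n = len(X)
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = matmul(P, X)
    return P

def inf_norm(X):
    """Maximum absolute row sum (the norm induced by the vector infinity norm)."""
    return max(sum(abs(v) for v in row) for row in X)

def max_entry(X):
    """Largest entry in magnitude."""
    return max(abs(v) for row in X for v in row)

# Hypothetical example matrix; its row sums are 0.8 and 0.6, so ||A|| < 1.
A = [[0.5, 0.3],
     [0.2, 0.4]]
norm_A = inf_norm(A)

rows = []
for k in (1, 5, 20, 50):
    Ak = power(A, k)
    rows.append((k, max_entry(Ak), inf_norm(Ak), norm_A ** k))
    print(k, max_entry(Ak), inf_norm(Ak), norm_A ** k)
```

In each printed row the second number (largest entry) should not exceed the third (the norm of A^k), which in turn should not exceed the fourth (the k-th power of the norm of A), up to rounding.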
So we will just quickly see the proof; it is essentially one line. If ||A|| < 1, then by submultiplicativity ||A^k|| <= ||A||^k, and since ||A|| is less than 1, ||A||^k goes to 0 as k goes to infinity. So basically what it says is that the norm of A^k goes to 0 as k goes to infinity.

Now, there are various ways to finish the argument. One way is this: the norm of A^k is a non-negative quantity, and it is less than or equal to something that is going to 0, so as k gets large the norm of A^k goes to 0. And by the positivity property of a matrix norm, if the norm of a matrix goes to 0, it must mean that the matrix itself is going to 0.

The other way to argue it is to say that A^k goes to 0 as k goes to infinity with respect to this particular norm. But, as we saw in a previous class, any matrix norm is also a vector norm on C^{n^2}, and all vector norms on C^{n^2} are equivalent. What that means is that A^k goes to 0 with respect to the infinity norm as well. The infinity norm here is the maximum magnitude entry, which implies that every entry of A^k converges to 0. So it was not quite a one-line proof, but nonetheless the core part of the proof is really the submultiplicativity step.

Sir, what does this "equivalent" mean? Is it related to the homework assignment problem?
Yes, there was a homework problem on this also, but it is a property that I discussed in a previous class: given a particular norm, you can bound any other norm in terms of that norm. So there are constants c_m and C_M such that the first norm is bounded between c_m times the second norm and C_M times the second norm, for any vector. So instead of the norm we used here, we can use the infinity norm, and since an upper bound on the infinity norm goes to 0, the infinity norm must also go to 0, which means that every entry of A^k must go to 0.

Yes sir, okay sir, thank you sir.

So now we know that if there is a norm for which ||A|| < 1, then the limit is 0. We have also seen that rho(A) is a lower bound on any possible matrix norm of A, and that for any epsilon > 0 I can find a norm such that ||A|| <= rho(A) + epsilon. So if rho(A) is strictly less than 1, then I can always find a norm under which ||A|| is strictly less than 1, and therefore such a matrix is convergent. So that is this lemma here. Let A be in C^{n x n}. Then the limit of A^k as k goes to infinity equals 0 if and only if rho(A) < 1.
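The statement can be tried out on small examples. The sketch below is plain Python and sticks to hypothetical upper triangular 2 x 2 matrices, because for a triangular matrix the eigenvalues are the diagonal entries and the spectral radius can be read off without an eigenvalue solver. All helper names are made up for this illustration. It covers three cases: a matrix with a large norm but rho < 1, a matrix with rho = 1, and a matrix with rho = 2 that is scaled by 1 / (rho + eps) as described at the start of the lecture.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def power(X, k):
    """k-th power of X by repeated multiplication, starting from the identity."""
    n = len(X)
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        P = matmul(P, X)
    return P

def max_entry(X):
    """Largest entry in magnitude."""
    return max(abs(v) for row in X for v in row)

def rho_triangular(X):
    """Spectral radius of a TRIANGULAR matrix: largest |diagonal entry|."""
    return max(abs(X[i][i]) for i in range(len(X)))

# Case 1: rho = 0.5 < 1 although the row-sum norm is 10.5.
A = [[0.5, 10.0],
     [0.0, 0.5]]
# Case 2: rho = 1, so the powers cannot go to the zero matrix.
B = [[1.0, 0.0],
     [0.0, 0.5]]
# Case 3: rho = 2; scale by 1 / (rho + eps) to force rho < 1.
C = [[2.0, 1.0],
     [0.0, 1.0]]
eps = 0.5
scale = 1.0 / (rho_triangular(C) + eps)
C_scaled = [[scale * v for v in row] for row in C]

for name, M, k in (("A", A, 80), ("B", B, 80), ("C_scaled", C_scaled, 200)):
    print(name, "rho =", rho_triangular(M),
          "max |entry| of power", k, "=", max_entry(power(M, k)))
```

The entries of the powers of A grow at first because of the large off-diagonal entry, and only later decay. That is why the criterion is stated in terms of rho(A) and not in terms of any one fixed norm.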
So this is saying more than what I just said; it is an if and only if statement. But the other part is very easy. Suppose A^k goes to 0 as k goes to infinity, and let x in C^n, x not equal to 0, be such that Ax = lambda x, so x is one of the eigenvectors of the matrix. Then A^k x is just repeated multiplication of x by A, and each time I multiply I get another factor of lambda, so A^k x = lambda^k x. Since A^k goes to 0, A^k x must go to 0. Now x is a fixed non-zero vector, so the only dependence on k is through lambda^k, and this will go to 0 only if |lambda| < 1. This must hold for all eigenvalues of A, which implies that rho(A) < 1. So that is the "only if" part.

For the converse, if rho(A) < 1, then there is some norm such that ||A|| < 1, and by the previous lemma this implies that the limit of A^k as k goes to infinity is 0. That shows the other direction.

Now, I mentioned that in algorithms we often want to consider scaled versions of matrices that allow us to bound the entries of a matrix as you take higher and higher powers. This next corollary helps us bound the size of the entries of A^k as k goes to infinity. So let A be in C^{n x n} and let epsilon be some positive number. Then there exists a constant C such that |(A^k)_{ij}| <= C (rho(A) + epsilon)^k for k = 1, 2, ... and for all i, j = 1, ..., n. So that is the corollary. What it says is that, based on rho(A), you can bound the (i, j)th entry of A^k in terms of some constant times (rho(A) + epsilon)^k. The proof is very simple. Again, these bounds come in very useful when you're analyzing the convergence
behavior of algorithms. So suppose I define A tilde to be (rho(A) + epsilon)^{-1} times A. Then I claim that rho(A tilde) is less than 1. Why is this true? I just scaled every entry of the matrix A by (rho(A) + epsilon)^{-1}, so the eigenvalues also get scaled by (rho(A) + epsilon)^{-1}. In particular, the largest eigenvalue in magnitude gets scaled by the same factor, so its magnitude becomes rho(A) / (rho(A) + epsilon), which is less than 1. So rho(A tilde) < 1, which means A tilde is a convergent matrix, that is, (A tilde)^k goes to 0 as k goes to infinity.

Now think of this as a sequence of matrices: A tilde, (A tilde)^2, (A tilde)^3, (A tilde)^4, and so on, and this sequence of matrices is converging. I can also think of this as n^2 sequences, each sequence corresponding to a distinct entry of the matrix, and all these n^2 sequences are converging. One fundamental property of a convergent sequence is that every convergent sequence is bounded. So, in words, the entries of (A tilde)^k are bounded. What bounded means is that there exists some constant C > 0 such that |((A tilde)^k)_{ij}| <= C, and this is true for all k and for all i, j. Now I just remove the scaling that I applied and take the scaling factor to the other side. That means |(A^k)_{ij}| <= C (rho(A) + epsilon)^k for every k = 1, 2, ... and for every i, j = 1, 2, ..., n.

Now, if you remember, two classes ago we were discussing invertibility of matrices, and we made a small remark. We said that, in terms of convergence, if x is a
scalar, say real or complex, and |x| < 1, then (1 - x)^{-1} can be expanded as 1 + x + x^2 + ... . This suggests a formula of the following kind: if I want to find (I - A)^{-1}, I can write it as I + A + A^2 + ... . And we asked when this is valid. The scalar expansion is valid for |x| < 1, and we said that the matrix version is true if ||A|| < 1, where any matrix norm will do. Now we know that if rho(A) < 1, then I can find a norm under which this condition holds, so the equivalent condition is that rho(A) should be less than 1.

So that is the next corollary. A in C^{n x n} is invertible if there is a matrix norm such that ||I - A|| < 1. If this condition is satisfied, then I can write A^{-1} = sum over k from 0 to infinity of (I - A)^k. Notice that earlier I wrote this in terms of A, but now I am writing it in terms of I - A. Effectively, you replace A by I - A: then (I - A)^{-1} becomes A^{-1}, the series becomes I + (I - A) + (I - A)^2 + ..., and the condition ||A|| < 1 gets replaced with ||I - A|| < 1. So it is exactly what we said earlier. At that point it was a conjecture, but now we are formally stating the result.

Given what we have seen so far, the proof is again very simple. If ||I - A|| < 1, then the series sum over k from 0 to infinity of (I - A)^k converges. Why? Remember that I am not just looking at (I - A)^k here; I am looking at the summation of such terms. We have already seen that if ||A|| < 1, then the limit of A^k is 0. But what we are doing now is adding up these A^k type of terms from k equal to 0 to infinity, and
what we are saying here is that this kind of summation actually converges to some matrix. That is because of the result we just saw: the magnitudes of the entries of (I - A)^k are at most some constant times (rho(I - A) + epsilon)^k. Since rho(I - A) is less than or equal to ||I - A||, which is less than 1, there exists some small enough epsilon such that rho(I - A) + epsilon is also less than 1. So each entry of the series is dominated by a convergent geometric series, which in turn implies that every entry of the summation of (I - A)^k is convergent.

So now all I have to do is show that whatever this converges to is actually A^{-1}. That is very easy. I multiply A by whatever is supposed to be A^{-1}; I will take N terms and then make N go to infinity. So I have A times the sum over k from 0 to N of (I - A)^k. I will write A as I - (I - A), so this is (I - (I - A)) times the sum over k from 0 to N of (I - A)^k. Now if I expand this out, multiplying with I gives I + (I - A) + (I - A)^2 + ... + (I - A)^N. But then I also get negative terms, which start with (I - A), then (I - A)^2, (I - A)^3, and so on up to (I - A)^{N+1}. The first negative (I - A) term cancels the (I - A) term I get when I multiply I with the summation, and so on. This is what is called a telescoping sum: all the intermediate terms cancel off, and this is going to be equal to I - (I - A)^{N+1}. And (I - A)^{N+1} goes to the zero matrix as N goes to infinity, so this converges to the identity as N goes to infinity. That is why the summation over k from 0 to infinity of (I - A)^k is equal to A^{-1}.

Okay, so this is one lemma about how we can use the fact that ||I - A|| < 1 to write
A inverse as a series. But keep in mind that if ||I - A|| is greater than or equal to 1, that does not say anything about the invertibility of the matrix A. It could well be invertible even then; you can easily see that. To avoid confusion, let me write that as a note. Note: A may be invertible even if ||I - A|| >= 1 for all norms. This is a one-way result, is all I am trying to say. For example, rho(I - A) could be greater than 1, in which case every norm of I - A will be greater than 1, but A being invertible is still possible. Take A = 3I: then I - A = -2I, so rho(I - A) = 2, and A is clearly invertible.

A very closely related result is something called the Banach lemma, and I like to state it because the proof is slightly different. Let B be in C^{n x n} and let ||.|| be any operator norm on C^{n x n}, that is, a norm induced by a vector norm on C^n. If ||B|| < 1, then I + B is invertible and (1 + ||B||)^{-1} <= ||(I + B)^{-1}|| <= (1 - ||B||)^{-1}. So this is a lemma that allows you to bound the norm of (I + B)^{-1} in terms of the norm of B: it is at least (1 + ||B||)^{-1} and at most (1 - ||B||)^{-1}.

So we will just quickly see how this is done. The first part is by contradiction. Suppose I + B is not invertible. Then there is a non-zero v in C^n which lies in the null space of I + B, that is, (I + B)v = 0, because I + B is a singular matrix. If I expand this out, this means that Bv = -v. So there is at least one eigenvalue of B whose magnitude is 1, which means the largest magnitude eigenvalue must be greater than or equal to 1, and so ||B|| is greater than or equal to 1. Let me justify that not through the eigenvalue having magnitude 1, but directly from the fact that this is an induced
norm. Let me just explain this. If Bv = -v, then since ||B|| is an operator norm, it is equal to the max over x not equal to 0 of ||Bx|| / ||x||, where the vector norm is whatever norm induces this matrix norm. Now if I choose a particular value for x, that will only give me a lower bound, so ||B|| is greater than or equal to ||Bv|| / ||v||. But Bv = -v, so that is equal to ||-v|| / ||v||, and ||-v|| is equal to ||v||, so this is 1. So ||B|| >= 1, which is the claim. But then this is a contradiction, because we started out by assuming that ||B|| < 1. So basically what that means is that if ||B|| < 1, then I + B must be invertible.

Now for the last step. We know that for any matrix norm, the norm of the identity matrix is at least 1; this follows from the submultiplicativity of norms. In fact, for operator norms it is equal to 1. So 1 = ||I|| = ||(I + B)(I + B)^{-1}||, which by submultiplicativity is less than or equal to ||I + B|| times ||(I + B)^{-1}||. I can further upper bound this using the triangle inequality: it is at most (||I|| + ||B||) times ||(I + B)^{-1}||. The norm of the identity is equal to 1, so that means ||(I + B)^{-1}|| is greater than or equal to 1 / (1 + ||B||). So we got one part already.

Similarly for the second half. Again, I = (I + B)(I + B)^{-1}, which I can write as (I + B)^{-1} + B(I + B)^{-1}. I will bring the second term to the other side, so (I + B)^{-1} = I - B(I + B)^{-1}. Now I take norms on both sides: ||(I + B)^{-1}|| is equal to ||I - B(I + B)^{-1}||. By the triangle inequality, and submultiplicativity for the second term, this is at most ||I|| + ||B|| times ||(I + B)^{-1}||, that is, 1 + ||B|| ||(I + B)^{-1}||. Now I take the second term to the other side, which gives (1 - ||B||) ||(I + B)^{-1}|| <= 1. Since 1 - ||B|| is greater than 0, you can divide by it, so ||(I + B)^{-1}|| is less than or equal to 1 / (1 - ||B||). So that concludes the proof.

So that's all we have time for today, and we'll continue again on Friday.
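The last two results of this lecture, the series formula for A inverse and the two-sided bound on the norm of (I + B) inverse, can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical 2 x 2 matrix A with ||I - A|| < 1 in the maximum absolute row sum norm (which is an operator norm). The helper names are made up for this illustration, and the closed-form 2 x 2 inverse is used only as a reference value.

```python
def matmul(X, Y):
    """Product of two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][t] * Y[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inf_norm(X):
    """Maximum absolute row sum (induced by the vector infinity norm)."""
    return max(sum(abs(v) for v in row) for row in X)

def inverse_2x2(X):
    """Closed-form inverse of a 2 x 2 matrix, used as a reference."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical example: I - A has row sums 0.2, so ||I - A|| < 1.
A = [[1.0, 0.2],
     [0.1, 1.1]]
I2 = identity(2)
M = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]  # M = I - A

# Partial sum S = I + M + M^2 + ... + M^40 of the series for A inverse.
S, term = identity(2), identity(2)
for _ in range(40):
    term = matmul(term, M)
    S = add(S, term)

# A * S should be close to I; by the telescoping argument it is I - M^41.
AS = matmul(A, S)
residual = max(abs(AS[i][j] - I2[i][j]) for i in range(2) for j in range(2))
print("max |A*S - I| =", residual)

# Two-sided bound with B = A - I, so that I + B = A and ||B|| = ||I - A||.
norm_B = inf_norm(M)
lower, upper = 1.0 / (1.0 + norm_B), 1.0 / (1.0 - norm_B)
actual = inf_norm(inverse_2x2(A))
print(lower, "<=", actual, "<=", upper)
```

The number of terms (40) is an arbitrary choice. Since the terms shrink like ||I - A||^k, far fewer would already reach double precision accuracy for this example.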