Okay, so the equivalence of matrix norms basically says that if you have two different norms, you can bound the norm of any matrix with respect to the first norm in terms of its norm with respect to the second norm. And this is useful, again, because you may be interested in showing convergence of certain algorithms, and in these algorithms you will get a sequence of matrices. Once again, we have seen that if you are able to show that the norms of the differences from a limit go to zero, then the sequence of matrices itself converges to that limit. So we are often interested in showing that a sequence of norms converges, but for different problems it may be easier to work with different types of norms, and the convenient norm may not be the one in which you are actually executing the optimization problem you are interested in. Fortunately for us, these norms are all equivalent, so if the sequence converges in one particular norm, then it converges in every norm; the norm values themselves may differ, but convergence carries over. And so it is useful to have this kind of result on the equivalence of matrix norms, because it means we can use any norm that is convenient for us in trying to show convergence properties.

So, basically: given any two matrix norms $\|\cdot\|_\alpha$ and $\|\cdot\|_\beta$, there exists a least positive constant $C_m(\alpha, \beta)$ such that

$$\|A\|_\alpha \le C_m(\alpha, \beta)\, \|A\|_\beta \quad \text{for every } A.$$

In fact, $C_m(\alpha, \beta)$ can be computed by solving

$$C_m(\alpha, \beta) = \max_{A \ne 0} \frac{\|A\|_\alpha}{\|A\|_\beta}.$$

So you take this ratio and you find the biggest value it can take; then of course it must be true that $\|A\|_\alpha \le C_m(\alpha, \beta)\, \|A\|_\beta$ for every other $A$, because any other $A$ is like a suboptimal choice, which must satisfy this inequality. Now, I just said "alpha" and "beta" here, and there is nothing special about the order: I can always exchange alpha and beta. The way to say that is, just as there exists this $C_m(\alpha, \beta)$, there similarly exists a $C_m(\beta, \alpha)$ such that $\|A\|_\beta \le C_m(\beta, \alpha)\, \|A\|_\alpha$, and this $C_m(\beta, \alpha)$ is computed as the maximum over $A \ne 0$ of $\|A\|_\beta / \|A\|_\alpha$. That is a completely different, unrelated optimization problem, so in general there is no relationship between $C_m(\alpha, \beta)$ and $C_m(\beta, \alpha)$; but for induced norms, that is, when both the alpha norm and the beta norm are induced norms, they are equal. This is actually Theorem 5.6.18. Now, the textbook has a lot of theorems, and I obviously cannot cover all of them, but where appropriate I'll indicate some. Mainly I'll focus on stating and proving theorems that we'll actually try to use later in the course; this one is just an interesting result that is there in the text, which you can go take a look at if you're interested.

Okay, now there are just two more definitions. The first is the notion of a unitarily invariant norm. We say that a norm is unitarily invariant, and I'll write this in short here, if $\|A\| = \|UAV\|$ for every $A \in \mathbb{C}^{n \times n}$ and all unitary $U, V \in \mathbb{C}^{n \times n}$.
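To make the equivalence constant concrete, here is a minimal numerical sketch (my own illustration, not from the lecture): it samples random matrices and tracks the largest observed ratio of the Frobenius norm to the spectral norm. For this particular pair the extremal constant is known to be $\sqrt{n}$, attained for example at $A = I$, so the sampled ratios approach but never exceed it.

```python
import numpy as np

# Estimate C_m(alpha, beta) for alpha = Frobenius norm, beta = spectral norm,
# by sampling random matrices and tracking the largest ratio ||A||_F / |||A|||_2.
# For this pair the exact constant is sqrt(n) (attained at A = I).
rng = np.random.default_rng(0)
n = 5
best_ratio = 0.0
for _ in range(10_000):
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    ratio = np.linalg.norm(A, 'fro') / np.linalg.norm(A, 2)
    best_ratio = max(best_ratio, ratio)

print(best_ratio, np.sqrt(n))  # best_ratio stays at or below sqrt(n) = C_m
```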
So one example is the spectral norm: the spectral norm is unitarily invariant. This is a small exercise: show this. Start from the definition and you will be able to show that the spectral norm of $UAV$ is equal to the spectral norm of $A$, for any $A$ and all possible unitary matrices.

The second definition is this: suppose $\|\cdot\|$ is a matrix norm on $\mathbb{C}^{n \times n}$. Then the new function I'm defining, with an $H$ on top, $\|A\|^H := \|A^H\|$, is also a matrix norm. Now, the norm I'm defining here need not be an induced norm; this is true for any matrix norm on $\mathbb{C}^{n \times n}$: if, instead of computing the norm of $A$, I say the norm of $A$ is whatever norm I've defined applied to $A^H$, that is also a matrix norm. It's actually straightforward to show this from the definition.

So in particular, take the Frobenius norm $\|A\|_2$, the square root of the sum of the squares of the magnitudes of all entries of the matrix $A$. By definition, its $H$ norm is $\|A\|_2^H = \|A^H\|_2$, the Frobenius norm of $A^H$; of course, a conjugate transpose doesn't change the sum of the squares of the magnitudes of all the entries of $A$, so this is exactly equal to $\|A\|_2$. Similarly, we defined, with two bars, $\|A\|_1$ to be the sum of the magnitudes of all the entries of $A$; so similarly, if I define the $H$ norm $\|A\|_1^H = \|A^H\|_1$, then again taking the conjugate transpose doesn't change the magnitudes of the entries of $A$, so this is also equal to $\|A\|_1$. This is what we call the $L_1$ norm.

And now the spectral norm. If I take the $H$ version of the spectral norm, I'll work with the square because that's easier: I have to compute the spectral norm of $A^H$ instead of the spectral norm of $A$, and then square it. Now, the spectral norm of a matrix is the square root of the largest eigenvalue of $A^H A$, and if I apply that to $A^H$, I get that the square is the spectral radius of $A A^H$:

$$|||A^H|||_2^2 = \rho(A A^H).$$

And this is another result that you have to show in your homework: $\rho(A A^H)$ is the same as $\rho(A^H A)$, which is equal to $|||A|||_2^2$. So I'm slipping in two different things here. One is that the spectral norm is invariant to this $H$ operation: the $H$ norm of the spectral norm is the spectral norm itself. The second is that the spectral radius has a nice relationship with the spectral norm, in that the spectral norm squared, $|||A|||_2^2$, is exactly equal to the spectral radius of $A^H A$. So that's the relation between the spectral radius and the spectral norm. This is homework.

So you see that here, when I take the $H$ norm of the spectral norm, it gives the same norm; here also, when I took the $H$ norm of the Frobenius norm, it gave me the same Frobenius norm; and the $H$ norm of the $L_1$ norm is also equal to the $L_1$ norm. Norms for which the $H$ version is the same as the norm itself are called self-adjoint norms. And by the way, note that if I compute the $H$ norm of the max column sum norm of $A$, that is the max column sum norm of $A^H$, which by definition is the max row sum norm of $A$, and that is different from the max column sum norm of $A$. So this is not a self-adjoint norm: the max column sum norm and the max row sum norm are not self-adjoint.
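A quick numerical check of both claims (a sketch; the random test matrix is my own choice):

```python
import numpy as np

# (1) |||A|||_2^2 = rho(A^H A) = rho(A A^H): spectral norm vs spectral radius.
# (2) The max column sum norm of A^H is the max row sum norm of A, which in
#     general differs from the max column sum norm of A (not self-adjoint).
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
AH = A.conj().T

rho = lambda M: np.max(np.abs(np.linalg.eigvals(M)))        # spectral radius
print(np.linalg.norm(A, 2) ** 2, rho(AH @ A), rho(A @ AH))  # all three agree

max_col_sum = lambda M: np.max(np.abs(M).sum(axis=0))   # max column sum norm
max_row_sum = lambda M: np.max(np.abs(M).sum(axis=1))   # max row sum norm
print(max_col_sum(AH), max_row_sum(A))  # equal to each other ...
print(max_col_sum(A))                   # ... but generally not to this
```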
Okay, so there is one very nice result that says two things: first, the spectral norm is the only matrix norm that is both induced and unitarily invariant; second, the spectral norm is the only norm that is both induced and self-adjoint. The proof of this is in the text, but these are again two special properties of the spectral norm, and they are among the reasons it is so widely used: multiplying by unitary matrices and taking the conjugate transpose are fundamental operations that arise in many, many signal processing and engineering applications, and that's why for many of these applications we are interested in working with the spectral norm. It's invariant to these two operations; it's a very special norm in that way.

Okay, so we still have about 12 minutes in the class, and the next thing I want to talk about is some uses of these norms. We discussed several theorems, but now we can talk about solving linear systems and how these norms help us, for example, in bounding the error in computing inverses or solving linear systems. Let's just make a few remarks and see how far we can go: inverses and solutions to linear systems.

So basically, if we are given a matrix $A$ which is non-singular, we know that $A^{-1}$ exists. But when we want to compute $A^{-1}$, we may have to compute it on a finite-precision arithmetic machine; or maybe we don't get to observe $A$, we only get to observe a noisy version of $A$, and then we compute the inverse of that noisy version. It turns out that a simple general mathematical model under which you can consider both of these types of errors is to assume that what you have actually inverted is some other matrix $A + E$. So: given $A \in \mathbb{C}^{n \times n}$ non-singular (pay attention to this, this is important: I'm starting out by assuming that the original matrix I wish to invert is actually non-singular, and that's why I'm brave enough to try to compute its inverse), we wish to compute $A^{-1}$; instead we compute $(A + E)^{-1}$, where $E$ is an error matrix. Here $E$ is small, such that $A + E$ is also invertible. The error I've incurred is the error matrix

$$A^{-1} - (A + E)^{-1} = A^{-1} - (I + A^{-1}E)^{-1} A^{-1},$$

where I've pulled an $A^{-1}$ out by writing $A + E = A(I + A^{-1}E)$ and using the fact that $(AB)^{-1} = B^{-1} A^{-1}$. Now, yeah, go ahead.

[Student] Here we are trying to compute the inverse of a non-singular matrix, so this matrix $E$, is it deliberately added, or is it, you know, observed? I mean, to make it invertible?

[Instructor] No, no: $A$ is non-singular to begin with, so I can freely write quantities like $A^{-1}$; otherwise this would be a meaningless thing to write, if $A$ could be singular. So $A$ is invertible, and $A + E$ is also invertible. It's just that I couldn't compute, or didn't have, $A$ exactly in my hands; what I got to observe was $A + E$, where $E$ was a small perturbation on $A$, such that $A + E$ was also invertible. So I wanted $A^{-1}$, but I have $(A + E)^{-1}$ in my hand; the difference between these two is the error.

Now, what we've seen recently is that if the spectral radius of $A^{-1}E$ is less than one, $\rho(A^{-1}E) < 1$, then we can write $(I + A^{-1}E)^{-1}$ as the convergent series

$$(I + A^{-1}E)^{-1} = \sum_{k=0}^{\infty} (-1)^k (A^{-1}E)^k.$$
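Here is a small sketch of that series in action (the test matrices and their scaling, chosen so that $\rho(A^{-1}E) < 1$, are my own):

```python
import numpy as np

# Partial sums of sum_{k>=0} (-1)^k (A^{-1}E)^k converge to (I + A^{-1}E)^{-1}
# whenever rho(A^{-1}E) < 1.
rng = np.random.default_rng(2)
n = 4
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))
E = 0.05 * rng.standard_normal((n, n))
M = np.linalg.solve(A, E)                        # M = A^{-1} E
assert np.max(np.abs(np.linalg.eigvals(M))) < 1  # rho(M) < 1 holds here

partial, term = np.zeros((n, n)), np.eye(n)      # term holds M^k
for k in range(60):
    partial += (-1) ** k * term
    term = term @ M

print(np.allclose(partial, np.linalg.inv(np.eye(n) + M)))  # True
```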
So now I'll substitute this series in here, which means

$$A^{-1} - (I + A^{-1}E)^{-1} A^{-1} = A^{-1} - \sum_{k=0}^{\infty} (-1)^k (A^{-1}E)^k A^{-1}.$$

Now, if I take the $k = 0$ term, I get $(-1)^0$ times this thing to the power $0$, which is the identity matrix, times $A^{-1}$; so the first term of the series exactly cancels the leading $A^{-1}$, and I'll be left with all the other terms, $k = 1$ to infinity. Absorbing the minus sign into the $(-1)^k$, I write it as

$$A^{-1} - (A + E)^{-1} = \sum_{k=1}^{\infty} (-1)^{k+1} (A^{-1}E)^k A^{-1}.$$

So keep in mind, my starting point is: this is true if $\rho(A^{-1}E) < 1$. Now suppose that, measured according to some matrix norm, $\|A^{-1}E\| < 1$; since the spectral radius is bounded above by any matrix norm, $\rho(A^{-1}E)$ is then also less than $1$. And if I compute the norm of $A^{-1} - (A + E)^{-1}$, that will be equal to the norm of this summation, which is less than or equal to what I get by taking the norm inside and applying sub-multiplicativity:

$$\|A^{-1} - (A + E)^{-1}\| = \Big\| \sum_{k=1}^{\infty} (-1)^{k+1} (A^{-1}E)^k A^{-1} \Big\| \le \sum_{k=1}^{\infty} \|A^{-1}E\|^k \, \|A^{-1}\|.$$

Now $\|A^{-1}\|$ is not dependent on $k$, so it can come out of the summation, and since $\|A^{-1}E\|$ is less than $1$, the geometric series is summable, so I can write

$$\|A^{-1} - (A + E)^{-1}\| \le \frac{\|A^{-1}E\|}{1 - \|A^{-1}E\|} \, \|A^{-1}\|.$$

Thus we know that the relative error, if I define it to be $\|A^{-1} - (A + E)^{-1}\|$ divided by the norm of the quantity I wanted to compute, $\|A^{-1}\|$, can be upper bounded:

$$\frac{\|A^{-1} - (A + E)^{-1}\|}{\|A^{-1}\|} \le \frac{\|A^{-1}E\|}{1 - \|A^{-1}E\|} \quad \text{if } \|A^{-1}E\| < 1,$$

which was the assumption we made. (A quick numerical check of this bound is sketched below.) So we see that norms are useful to help us bound the relative error in computing things like $A^{-1}$. There are many more uses, which we will see in the next class, also with respect to solving linear equations; but we'll stop here for today and continue on Monday. Thank you.
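A minimal check of the bound (the test matrices are my own choice, and the spectral norm stands in for a generic sub-multiplicative matrix norm):

```python
import numpy as np

# Relative error of the perturbed inverse vs the bound t / (1 - t),
# with t = ||A^{-1} E|| < 1.
rng = np.random.default_rng(3)
n = 5
A = np.eye(n) + 0.2 * rng.standard_normal((n, n))
E = 0.01 * rng.standard_normal((n, n))

Ainv = np.linalg.inv(A)
t = np.linalg.norm(Ainv @ E, 2)
assert t < 1                                     # hypothesis of the bound
rel_err = np.linalg.norm(Ainv - np.linalg.inv(A + E), 2) / np.linalg.norm(Ainv, 2)
print(rel_err, t / (1 - t))                      # rel_err <= t / (1 - t)
```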