In the last few lectures, we finished the direct methods for solving linear systems. The next topic is iterative methods for linear systems, but first we will take a small detour from our usual stream and learn a tool called matrix norms. This tool is very important for the error analysis of iterative methods. We will start our discussion with vector norms. First, let us understand what is meant by a norm. A norm is an abstract version of the distance concept. We all know that if we have two points x and y, say both x and y are real numbers, then the distance between x and y is given by the modulus |x - y|. Similarly, if x is a vector in the plane given by (x_1, x_2) and y is a vector in the plane given by (y_1, y_2), then the distance between these two points is the square root of (x_1 - y_1)^2 + (x_2 - y_2)^2. We know that this is called the Euclidean distance or Euclidean norm, and we generally denote it by ||x - y|| with a suffix 2, that is, ||x - y||_2. This is the physical distance between the two points in whatever unit we measure. Now, our interest is to generalize this idea, so that we can bring more functions into this class which mimic the distance idea. That is the aim of defining norms. Let us see the definition of a vector norm. A vector norm is a function defined on R^n such that when you apply it to a vector x in R^n, it gives a non-negative real number. That is the basic requirement of a distance: the distance between two points is always a non-negative real number. Let us denote this norm by ||.||. Generally, for a function we use notation like f, g and so on, but here we have this notation, where you give any point x and it gives us a number ||x||, say alpha, which is greater than or equal to 0.
Not only that, this function has a few more properties. What we do is carefully observe some of the important properties that a distance function satisfies, that is, the Euclidean distance in the case of R^2, its generalization to any R^n, and in one dimension just the modulus. We then impose those conditions on our function and call the resulting function a vector norm. That is the idea. If you take any vector x in R^n, the norm of x stands for the distance between x and the zero vector, so you should read the definition of a norm with the distance concept in mind. The norm should give some non-negative number for a given vector x. That is the first condition that we want our function to satisfy, and it is already built into the way we have defined our function. The next condition is that the distance between x and the origin is 0 if and only if x itself is the origin. That is something you can easily understand, and the same condition is imposed on our norm function. Next is the scaling property. If you multiply x by a real number alpha and then find the distance from the origin, that should be the same as finding the distance between x and 0 and then multiplying it by mod alpha. So, this condition is also imposed on the vector norm. Finally, we have the well known triangle inequality, which is satisfied by the Euclidean distance. In one dimension the modulus also satisfies this: mod of x plus y is less than or equal to mod x plus mod y, a very simple property that we all know. We impose that property also on our function, and if all these conditions are satisfied, we call the function a vector norm. Now, we will see some examples.
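For reference, the conditions described above can be written compactly as follows. This is a summary in standard notation, assuming the slide writes the norm as \|\cdot\|; the ordering of the conditions follows the lecture.

```latex
A function $\|\cdot\| : \mathbb{R}^n \to [0, \infty)$ is a vector norm if,
for all $x, y \in \mathbb{R}^n$ and all $\alpha \in \mathbb{R}$:
\begin{align*}
  &\|x\| \ge 0,                        && \text{(non-negativity)} \\
  &\|x\| = 0 \iff x = 0,               && \text{(zero only at the origin)} \\
  &\|\alpha x\| = |\alpha|\,\|x\|,     && \text{(scaling)} \\
  &\|x + y\| \le \|x\| + \|y\|.        && \text{(triangle inequality)}
\end{align*}
```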
The first is the physically realistic distance function called the Euclidean norm, which for x in R^n is the square root of the sum of the squares of the components of x. We wrote it for n equal to 2, and it is the same for any n that we take. This is the physically realistic way of finding the distance between a given point x and the origin. Now, we can extend this distance concept to something abstract and say that there are also other functions that can mimic our distance function. One example is what we call the L infinity norm. It is denoted by the norm notation with the suffix infinity, and its definition is that you take the modulus of each component of the vector x and then take the maximum of all of them. That is what we assign as the infinity norm of x. You can verify that all the important properties that we listed on the previous slide, which are satisfied by the physically realistic distance function, are also satisfied by this function. The number that the infinity norm gives for a given x and the number given by the Euclidean norm for the same x may be different, but the infinity norm satisfies all the properties that the Euclidean norm satisfies. That is the only idea behind this abstract concept of a norm. The next example is what we call the L 1 norm. It is denoted by the parallel lines with a suffix 1, and its definition is that you take the modulus of all the components of the vector x and then sum them up. You can again verify that this function has all the properties listed in the definition of a vector norm, and therefore it is also a vector norm. Remember, these are the three norms that are commonly used in numerical analysis while doing error analysis involving vectors. There are many other ways to define vector norms, but we will not go into listing all of them. In our course we will use one or the other of these three norms only, and mostly we will use the L infinity norm because it is very convenient to use.
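The three vector norms described above can be sketched in a few lines of Python. This is only an illustration, not part of the lecture: the function names `norm_1`, `norm_2` and `norm_inf` and the sample vector are our own choices.

```python
import math

def norm_1(x):
    """L 1 norm: sum of the absolute values of the components."""
    return sum(abs(xi) for xi in x)

def norm_2(x):
    """Euclidean (L 2) norm: square root of the sum of squares."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm_inf(x):
    """L infinity norm: largest absolute value among the components."""
    return max(abs(xi) for xi in x)

# Hypothetical sample vector, chosen only for illustration.
x = [3.0, -4.0, 0.0]
print(norm_1(x), norm_2(x), norm_inf(x))  # 7.0 5.0 4.0
```

The three numbers differ for the same x, which is the point made above: each function measures "distance from the origin" in its own way, while satisfying the same set of properties.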
You may ask the question: when we have the physically realistic distance concept, why are we putting effort into making an abstract idea out of it and getting all these definitions which are not physically realistic? They just mimic the distance concept, but they are not going to give you what physically comes out as the distance. If that is so, why are we even worrying about this abstract concept? Well, there are two reasons. One is that we need to find the distance not only between two vectors. We will also come across situations where we want the distance between two matrices. Now, how will you even imagine finding the distance between two matrices? In general, we may also have to find the distance between two operators or two functions. It is not very clear how to define the distance between two such mathematical objects. So, one needs to put the distance concept in an abstract form, so that it can be adapted to objects such as matrices. That is one reason for going for an abstract definition of distance. The second reason is that a physically realistic distance concept like the Euclidean one is often not easy to handle. Therefore, to understand the errors and their behavior, we may have to simplify our problem by going for some other distance concept for which the error analysis is, in some sense, equivalent to the analysis with the physically realistic distance. That is why we also want to take up other ways of defining the distance, which may not be physically realistic but are in some way equivalent to our physically realistic distance. These are the two reasons why we put the distance concept in an abstract way. Having understood this, let us now define what is meant by a matrix norm.
Remember, wherever I say norm you have to keep the distance concept in mind and see the norm as a generalization of the distance concept. Now, we are trying to make a tool to measure the distance between two matrices. Once you have understood how we measure the distance between two vectors, we can just mimic that to define what is meant by a matrix norm. We will use the same notation for the matrix norm. We will generally not distinguish the matrix norm from the vector norm through its notation, because the argument itself will tell us whether we have a vector norm or a matrix norm. So, a matrix norm is a function, denoted by the same symbol, from the set of all n cross n matrices with real entries to the non-negative real numbers, and that function should have the following properties. The first property is that when you plug any matrix A into this function, it should generate a number that is greater than or equal to 0. The next property is that the number given by the matrix norm is 0 if and only if A is the zero matrix. The third condition is the scaling property. It is the same set of properties that we listed for the vector norm: the norm of alpha times A should be equal to mod alpha times norm A, and that should hold for any real number alpha. Finally, the well known triangle inequality should also be satisfied. We will generally use the usual notation for matrices, so just keep this notation in mind, and let us see some examples of matrix norms. First, if you define a function which takes a matrix A and gives this as the number, you can see that it is clearly a non-negative number, and you can also check that all the other properties of the matrix norm are satisfied by this expression. Therefore, the function given by this expression for a given matrix is indeed a matrix norm.
Similarly, taking the modulus of all the elements of the matrix A and then taking the maximum of them will also define a matrix norm. Finally, taking the absolute values of all the elements of A and summing them up will also define a matrix norm. These are some examples of matrix norms. There is another important example of a matrix norm which is quite common in applications, and I will tell you a little later why it is so important. For now, let me just define it. It is generally denoted by the norm with a suffix 2, and it is defined as the square root of the largest Eigen value of the matrix A transpose A. It is not the Eigen values of A, it is the Eigen values of A transpose A. That is the point that one has to keep in mind. This particular norm is very important, and you can check that it is indeed a matrix norm. Now, we will discuss a very useful class of norms called matrix norms subordinate to vector norms. The idea is that you give me a vector norm and I will generate a matrix norm with the help of that vector norm. How am I going to generate a matrix norm for a given vector norm? Let us see. You are given a vector norm on R^n. The matrix norm subordinate to this vector norm is defined as the supremum of norm A x over all x in R^n such that norm x is equal to 1. That is, you take all the unit vectors x in R^n, find A x, take the vector norm of each of these vectors A x, and then take the supremum. Remember, whenever we put the two parallel lines on both sides of a vector it denotes a vector norm, and if you put the same on both sides of a matrix then it is a matrix norm. The subordinate matrix norm is very useful because this particular way of defining a matrix norm for a given vector norm satisfies three important properties which are very useful in our error analysis.
Also, when we are dealing with the error analysis of an iterative method for solving A x equal to b, you will naturally see that the equation combines a vector and a matrix. Therefore, you need some way of linking your matrix norm with the vector norm in order to get useful results. That is why we would like to work with a matrix norm that is generated from the vector norm in this way. In other words, we will consider a particular vector norm, take the corresponding matrix norm subordinate to that vector norm, and use these two norms in our error analysis for iterative methods in the next section. For that we have to understand the concept of the subordinate norm. Let us put the definition of the subordinate norm in a slightly different way that is more useful for working with it. You are given any matrix A and a vector norm, and you know, at least mathematically, how to define the subordinate matrix norm. The subordinate norm can also be written as the maximum of norm A z divided by norm z over all non-zero vectors z. Remember, by definition we have to take the supremum over unit vectors, but now we are taking it over all non-zero vectors. That is the only difference, and it should not be difficult to understand. Take any non-zero vector z. You can always write x equal to z divided by norm z, and then x is a unit vector. Therefore, in the subordinate matrix norm, which by definition is the supremum over all unit vectors x, you simply substitute this expression for x. That gives you the maximum over all z not equal to 0 of the norm of A applied to z divided by norm z. By the scaling property you can take norm z outside, and then you get norm A z divided by norm z, which is what we wanted to show.
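The substitution argument above can be written as a single chain of equalities. This is a sketch in standard notation (the symbol z for the non-zero vector is the one used in the lecture); the last step uses the scaling property of the vector norm with the positive scalar 1/\|z\|.

```latex
\begin{align*}
  \|A\|
    &= \sup_{\|x\| = 1} \|Ax\| \\
    &= \sup_{z \neq 0} \left\| A \frac{z}{\|z\|} \right\|
       && \text{(every unit vector is } x = z/\|z\| \text{ for some } z \neq 0\text{)} \\
    &= \sup_{z \neq 0} \frac{\|Az\|}{\|z\|}
       && \text{(scaling property)}.
\end{align*}
```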
So, it is a very simple proof, and it is a very useful way of writing the subordinate norm. The earlier form is only for the sake of the definition, and in most of the results we will use this second form. Let me now list the three properties that I mentioned previously, which are very important for us in doing error analysis in the next section. The properties of the matrix norm that we are interested in are here. The first is that norm A x is less than or equal to norm A into norm x. Remember that norm A x is a vector norm, norm A is a matrix norm, and norm x is again a vector norm. Let us see how to prove this first result. If x is equal to 0 it is very clear, because in this case A x is 0 and therefore the left hand side is the norm of 0, which is 0. On the right hand side the norm of the 0 vector is again 0, so you have 0 equal to 0. Therefore, if x is equal to 0 the relation holds with equality. Now let us take x not equal to 0. Then you can write that norm A x divided by norm x is less than or equal to the maximum, over all x not equal to 0, of norm A x divided by norm x, because the left hand side is the value for just one vector from this set and the right hand side is obtained by taking the maximum over all such vectors. Therefore, the left hand side is surely less than or at most equal to the right hand side, and the right hand side is nothing but norm A by our previous lemma. From here we immediately get the first inequality. The second inequality can be easily obtained from the first inequality, and that should not be a problem. Let us take the third property, that norm A B is less than or equal to norm A into norm B. What you can do is write norm A B x divided by norm x and take the maximum over all x not equal to 0. That is nothing but norm A B by definition. Now A B x can be seen as A applied to the vector B x. Therefore, by the first property, norm A B is less than or equal to the maximum over all x not equal to 0 of norm A into norm B x divided by norm x.
Again you apply the first inequality to the term norm B x, so the whole expression is less than or equal to the maximum over all x not equal to 0 of norm A into norm B into norm x divided by norm x. Now norm x cancels, and what remains is independent of x. Therefore, it is equal to norm A into norm B, and that is the proof of the third inequality. These are some of the properties that we will use quite often in our error analysis of iterative methods. Now, the interesting part is this: if I give you some vector norm, how can you generate the matrix norm subordinate to that vector norm? If you look carefully at the definition, it is very abstract, and you cannot make out what the matrix norm will look like for a given vector norm. But interestingly, you can get a very nice formula for the subordinate matrix norm for the three important vector norms that we gave at the beginning of this lecture, that is, the L 2 norm, the L infinity norm and the L 1 norm. Let us give the formulas for the matrix norms subordinate to these three vector norms. We will not prove these theorems, but we will just take them as formulas. Suppose your given norm is the L infinity norm. Recall that this vector norm is the maximum of the moduli of the components. Then the corresponding subordinate matrix norm is computed as follows. You take each row and sum the absolute values of the entries in that row. In an n by n matrix you have n such rows, and therefore you have n such numbers. Now, take the maximum of them. So, this is called the maximum row sum norm. Let us see an example. Suppose you are given this matrix. From the first row you get the number 3 by taking the absolute value of each entry in that row and summing them up. Then the second row gives 5 and the third row gives 4.
Therefore, the matrix norm subordinate to the L infinity norm, denoted by norm A with the suffix infinity, is 5, because you have to take the maximum over these three numbers. Now, for the matrix norm subordinate to the L 1 norm, row should just be replaced by column in the previous definition. That is, you take each column and sum the absolute values of the entries in that column. For each column you have one number, so you have n numbers like that. Now, you take the maximum of all those numbers. That is called the maximum column sum norm, and it is the matrix norm subordinate to the L 1 norm. You can compute this norm for the previous matrix yourself. Now, coming to the L 2 norm: suppose the given vector norm is the Euclidean norm, the physically realistic norm that we recalled earlier. Then the corresponding subordinate norm is the one which I have already introduced. You take the maximum of all the Eigen values of the matrix A transpose A, and then take the square root. This is very important: it is the Eigen values of the matrix A transpose A. That is the matrix norm subordinate to the physically realistic norm, that is, the Euclidean norm. Now, you can see that if you have to work with the physically realistic norm, then the corresponding matrix norm involves forming the product of these two matrices and then finding the Eigen values of the product. That is computationally very costly. That is why we often do not prefer to work with the L 2 norm. Rather, we will try to work with the L infinity or L 1 norm, because the subordinate norms in the case of L 1 and L infinity are very easy to compute. That is also a reason why we look for some equivalent way of doing our error analysis rather than going with the physically realistic distance concept.
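The three formulas above can be sketched in plain Python as follows. Everything here is an illustration under stated assumptions: the function names are ours, and the matrix `A` is a hypothetical one chosen only so that its row sums are 3, 5 and 4 like in the example above (the matrix on the lecture slide is not reproduced in the transcript). For the 2-norm, the sketch swaps in power iteration on A transpose A instead of a full Eigen value computation; this gives an estimate that can fall below the true value if the starting vector happens to have no component along the dominant Eigen vector.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, x):
    """Matrix-vector product A x for a matrix stored as a list of rows."""
    return [dot(row, x) for row in A]

def matrix_norm_inf(A):
    """Maximum row sum: subordinate to the L infinity vector norm."""
    return max(sum(abs(a) for a in row) for row in A)

def matrix_norm_1(A):
    """Maximum column sum: subordinate to the L 1 vector norm."""
    return max(sum(abs(a) for a in col) for col in zip(*A))

def matrix_norm_2(A, iterations=500):
    """Estimate sqrt(largest Eigen value of A^T A) by power iteration."""
    # Start from the largest row of A (an assumption of this sketch).
    v = [float(a) for a in max(A, key=lambda row: dot(row, row))]
    if all(a == 0.0 for a in v):
        return 0.0  # every row is zero, so A is the zero matrix
    At = [list(col) for col in zip(*A)]
    lam = 0.0
    for _ in range(iterations):
        w = matvec(At, matvec(A, v))      # w = (A^T A) v
        lam = dot(w, v) / dot(v, v)       # Rayleigh quotient of A^T A at v
        size = math.sqrt(dot(w, w))
        v = [wi / size for wi in w]       # normalise for the next step
    return math.sqrt(max(lam, 0.0))

# Hypothetical matrix with row sums 3, 5, 4.
A = [[1.0, -2.0, 0.0],
     [2.0, 1.0, -2.0],
     [0.0, 3.0, 1.0]]
print(matrix_norm_inf(A))  # 5.0
print(matrix_norm_1(A))    # 6.0
print(matrix_norm_2(A))    # estimate of the subordinate 2-norm
```

Note how the first two norms need only one pass over the entries, while the 2-norm needs repeated products with A and A transpose, which reflects the cost argument made above.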
Again, let us take an example and compute the 2-norm of the matrix A. For that you have to find the Eigen values of A transpose A. In this case they are given by these numbers, and clearly the L 2 norm of this matrix is the square root of 12.9128, which is approximately 3.5934. This is the L 2 norm of the matrix A. In the next class we will see a concept called the condition number of a matrix, which will tell you how sensitive computations with that matrix on a computer are. It in fact involves the matrix norm, and therefore it is very important for us to understand the meaning of a matrix norm. You also have to understand how to compute the matrix norm subordinate to a given vector norm, and in particular we will restrict ourselves to one of these three norms. Thank you for your attention.