Yeah, I mean, I think two weeks after classes have begun, you should know which meeting to join, so please join the right meeting. Okay, so we'll begin. Last time we discussed determinants briefly, and then we started discussing norms; today we'll discuss several properties of norms.

So that we have the right starting point, recall that a function ||.|| on a vector space V over a field F is a norm provided that, for all x and y in V: (1) ||x|| >= 0, and (1a) ||x|| = 0 if and only if x = 0; (2) homogeneity, namely ||c x|| = |c| ||x|| for every c in F; and (3) the triangle inequality, ||x + y|| <= ||x|| + ||y||. If property (3) is not satisfied, we call it a pre-norm, and if property (1a) is not satisfied, we call it a semi-norm. Also recall the definition of an inner product. An inner product <., .> takes two vectors in V and maps the pair to the field F. Given any x, y, z in V: the inner product of x with itself is non-negative, <x, x> >= 0, with equality if and only if x = 0; it is additive, <x + y, z> = <x, z> + <y, z>; it is homogeneous in the first argument, <c x, z> = c <x, z> for every c in F; and it is Hermitian: if you exchange the order, <y, x> is the complex conjugate of <x, y>. We also saw one crucial property that connects inner products to norms: if <., .> is an inner product, then <x, x>^(1/2), the inner product of x with itself raised to the power one-half, is a vector norm on V. So from any inner product you can define a norm.

Today we'll start by discussing some example norms; these are perhaps the most popular norms used in different applications. The most popular is the Euclidean norm, also known as the L2 norm or, more simply, the two-norm: the sum of the squares of the entries of x, raised to the power one-half. One important and simple formula is ||x||_2^2 = x^T x. It's a very useful formula, and it gives you an algebraic way of writing the Euclidean norm. The L2 norm of x - y measures the Euclidean distance between x and y, meaning our conventional notion of the distance between the two points.

The second norm I want to discuss is what is known as the L1 norm or the sum norm, also called the taxicab norm or the Manhattan norm. Manhattan in New York is famous for its streets being laid out in a perfectly rectangular grid. If you're given a point A and a point B on such a grid, you can go from one to the other along many different grid paths, but however you go, the total distance you traverse is the same, as long as you keep moving toward your destination along the sides of the grid. And that grid distance is basically the sum norm: it is written ||x||_1, and it equals |x_1| + |x_2| + ... + |x_n|, the sum of the magnitudes of the entries of x.
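As a quick sanity check, here is a minimal numerical sketch (assuming numpy; the vector x is just an arbitrary example of mine) verifying that ||x||_2^2 = x^T x and that the sum norm matches its definition:

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])  # arbitrary example vector

# Euclidean (L2) norm two ways: from the definition and via ||x||_2^2 = x^T x
l2_from_def = np.sqrt(np.sum(np.abs(x) ** 2))
l2_from_xtx = np.sqrt(x @ x)                 # x^T x for a real vector
print(l2_from_def, l2_from_xtx, np.linalg.norm(x, 2))   # all print 13.0

# Sum (L1) norm: the sum of the magnitudes of the entries
l1 = np.sum(np.abs(x))
print(l1, np.linalg.norm(x, 1))              # both print 19.0
```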
A small exercise for you is to verify that the sum norm is in fact a vector norm; that means it must satisfy the four properties we discussed just now. Another property of it is that it is not derived from an inner product.

The third norm is what is known as the max norm or the L-infinity norm. This is also called the Chebyshev norm, and it is written ||x||_inf; the reason for the subscript will be obvious in a second. It is the largest-magnitude entry of x: ||x||_inf = max_i |x_i|. If you think about it, these are all different ways of measuring the length of a vector. Taking the sum of the squares and raising it to the power one-half is one way: in a two-dimensional space, if I draw a vector, its length is (first coordinate squared plus second coordinate squared) to the power one-half. We are just using Pythagoras' theorem to say that's the length of the vector, so it's a reasonable way of measuring length. Similarly, the L1 norm measures the length you have to travel if you are restricted to move along the sides of the grid. And the L-infinity norm picks off the biggest entry of x in magnitude. I can't give you as direct an example of how that is a length, but you can imagine a cost that is completely dominated by the largest segment you need to traverse along any one dimension; that is the L-infinity norm.

However, if I took the min instead, the minimum of the magnitudes of the entries of x, that is not a norm. Can anybody think why? Is it because the positivity property fails? Yes, that is one reason: definiteness does not hold, since if any one entry of x is zero, the min is zero, so it can be zero even when x is not the zero vector. You can also show that it does not satisfy the triangle inequality; that's also easy to show. For instance, x = (1, 0) and y = (0, 1) each have min 0, but x + y = (1, 1) has min 1. Similarly, for p < 1, the definition of the Lp norm below does not satisfy the triangle inequality; show this as an exercise. The triangle inequality for the Lp norm with p >= 1 reads ||x + y||_p <= ||x||_p + ||y||_p. Again, it's something to think about how you show this for any p >= 1: if you define the Lp norm as ||x||_p = (sum over i of |x_i|^p)^(1/p), then it satisfies this triangle inequality, which is called Minkowski's inequality. In this definition, if I substitute p = 1, I get the sum of |x_i|^1, whole thing to the power 1, which reduces to the sum norm. And if I take p very large, then since I am raising each |x_i| to a very large power and adding over all i, the largest-magnitude entry of x completely dominates the sum. So as p tends to infinity, the sum approaches the magnitude of the largest entry raised to the power p, and taking that to the power 1/p leaves the largest-magnitude entry of x. That is the Lp norm as p tends to infinity, and it's the reason for the notation ||x||_inf = max_i |x_i|. This p-norm also reduces, when p equals two, to the sum of the squares of the magnitudes of the entries of x raised to the power one-half, which is exactly the Euclidean norm. So the Lp norm is a generalization that includes the L1 norm, the L2 norm, and the L-infinity norm as special cases.
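Here is a small numerical illustration of two of these points (a sketch assuming numpy; the vectors are arbitrary examples): the Lp norm approaches the max norm as p grows, and the min of the magnitudes violates the triangle inequality:

```python
import numpy as np

x = np.array([1.0, -3.0, 2.0])   # arbitrary example vector

# As p grows, the largest-magnitude entry dominates, so ||x||_p -> ||x||_inf
for p in [1, 2, 4, 10, 100]:
    print(p, np.sum(np.abs(x) ** p) ** (1.0 / p))
print("inf", np.max(np.abs(x)))  # the max norm, here 3.0

# The min of the magnitudes is not a norm: the triangle inequality fails
def min_mag(v):
    return np.min(np.abs(v))

a, b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# min_mag(a + b) = 1.0 but min_mag(a) + min_mag(b) = 0.0
print(min_mag(a + b) <= min_mag(a) + min_mag(b))  # False
```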
Now, to get a feel for how these norms look, you can work in a two-dimensional space and ask: what is the locus of points with a fixed norm? If I take the set of points v such that ||v||_2 = 1 in the plane, what will it look like? A circle, yes, and its radius will be equal to 1. And if I take the set of points v such that ||v||_inf = 1, so the norm is now the largest-magnitude entry, what will that look like? Think about it if you're quick. Squares? Exactly, a square. For any point along the right edge of the square, whatever the value of y, the value of x equals 1, so the infinity norm of any point along that edge is 1; the same holds along the other three edges, and that's how you get a square with corners at (1, 1), (1, -1), (-1, 1), and (-1, -1). Finally, if I take all v such that ||v||_1 = 1, what shape do I get? A line? No, you get a diamond; I'll just call it a diamond for simplicity. Its vertices are the points (1, 0), (0, 1), (-1, 0), and (0, -1), and for any point along one of its edges, the sum of the magnitudes of the two coordinates is always equal to 1. That's how you get the diamond shape. So that's roughly the shape of these norms. And if I take the L3 norm or the L4 norm, the unit curve will be like a circle that bulges outward: not quite a circle, but pushed out toward the square compared to the true circle.

Okay. Now, what are norms good for? There are several things they're good for, and I'll give some examples. One very important use is for showing convergence. For example, we know the formula 1/(1 - x) = 1 + x + x^2 + ..., an infinite series. When is this true? When |x| < 1, so that x^k tends to 0. Exactly: the magnitude of x should be less than 1. So this is true for a scalar, but suppose I wanted to find (I - A)^(-1) for a matrix A. When can I write it as I + A + A^2 + ...? The scalar condition suggests that we need a condition on the 'size' of the matrix A in some sense. The answer is that this holds if a matrix norm of A, which I'm going to write with three lines as |||A||| (going forward, I will use three lines to denote matrix norms), is less than 1. I need to tell you which norm to use here, and it turns out that any matrix norm will do: if you can find a matrix norm under which |||A||| < 1, then a formula like this can be used to compute the inverse of I - A.
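To illustrate (a minimal sketch assuming numpy; the matrix A is an arbitrary example whose norm is below 1), we can watch the partial sums I + A + ... + A^k of this Neumann series converge to (I - A)^(-1):

```python
import numpy as np

A = np.array([[0.2, 0.1],
              [0.0, 0.3]])            # arbitrary example matrix
print(np.linalg.norm(A, 2))           # spectral norm, about 0.33 < 1

exact = np.linalg.inv(np.eye(2) - A)  # the target: (I - A)^(-1)

# Partial sums S_k = I + A + A^2 + ... + A^k of the Neumann series
S = np.eye(2)
term = np.eye(2)
for k in range(1, 30):
    term = term @ A                   # term is now A^k
    S = S + term                      # the tail shrinks roughly like ||A||^k
print(np.max(np.abs(S - exact)))      # essentially zero: the series converged
```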
The other use is this: if you know |x| < 1, you can actually bound how big the rest of the series will be, and in turn determine how many terms you need in the summation to get a sufficiently accurate estimate of (1 - x)^(-1). Similarly, the norm of A will tell us how many terms to include in the series to get a sufficiently accurate representation of (I - A)^(-1). In a more general sense, norms are useful for determining how many iterations of an iterative algorithm you need in order to solve a certain problem to a desired level of accuracy.

The second use is more about quantifying the accuracy of matrix computations. These are again things we're going to look at later in the course when we study the stability of matrix computations. Suppose we want to find A^(-1), but the entries of A are noisy, so what we actually have is A = A_0 + E. We go and compute A^(-1), but what we really want is A_0^(-1). Then we want to know the potential error incurred: the error in computing (A_0 + E)^(-1) instead of A_0^(-1). Again, the answer lies in norms. The third use is in bounding eigenvalues, or perturbations of eigenvalues: if you perturb a matrix by adding a small error matrix to it, how much will the eigenvalues get perturbed? All of these answers lie in norms, and this is also something we're going to see later in the course.

Now, another question: we've seen a few kinds of norms, but can we come up with new norms based on existing norms that we know? You can, and the constructions exploit what are known as algebraic properties of norms. I'll give two examples. The first is that if ||.||_alpha and ||.||_beta are vector norms, and I define ||x|| to be max(||x||_alpha, ||x||_beta), then this is a vector norm; again, a property that is easy to verify, but you have to check that it satisfies the four properties of norms. Similarly, if ||.|| is a vector norm on C^n and T in C^(n x n) is non-singular, an n-by-n non-singular matrix, then the T-norm defined by ||x||_T = ||Tx|| for any x in C^n is a vector norm. So you can produce lots of different norms: take the max of any pair of norms you have (or any number of them) and you get another vector norm; take any non-singular matrix T in C^(n x n) and ||Tx|| gives you yet another norm. Obviously, in general, the norm of Tx is going to be different from the norm of x. In particular, if I'm using the Euclidean norm and T is a unitary matrix, then ||x||_T will be equal to ||x||, but otherwise it need not be equal. So we can produce lots of different norms like this, but another question is where these different norms get used.
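Here is a small numerical sketch of both constructions (assuming numpy; T, the scaling factor 2, and the random test vectors are arbitrary choices of mine): the max of two norms, and the T-norm ||x||_T = ||Tx||_2, with a spot check of the triangle inequality on random pairs. A random spot check is only suggestive, of course, not a proof:

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3))      # a random T is almost surely non-singular

def max_norm(v):
    # the max of two vector norms (here L1 and twice L2) is again a norm
    return max(np.linalg.norm(v, 1), 2.0 * np.linalg.norm(v, 2))

def t_norm(v):
    # the T-norm ||v||_T = ||T v||_2, a norm whenever T is non-singular
    return np.linalg.norm(T @ v, 2)

# Spot-check the triangle inequality on random pairs (suggestive, not a proof)
for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    assert max_norm(x + y) <= max_norm(x) + max_norm(y) + 1e-12
    assert t_norm(x + y) <= t_norm(x) + t_norm(y) + 1e-12
print("triangle inequality held on all random samples")
```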
Yeah, go ahead. Do orthogonal matrices preserve only the Euclidean norm, or every norm? What do you think? Every norm? No: orthogonal matrices preserve the Euclidean norm only. And the reason is very simple: it's because the Euclidean norm comes from an inner product, in fact what we call the usual inner product. If I write ||x||_2^2 = x^T x, this implies that if T is orthogonal and I compute ||Tx||_2^2, it is (Tx)^T (Tx) = x^T T^T T x; and T^T T is the identity matrix for an orthogonal matrix, so this is equal to x^T x, which is ||x||_2^2. But for other norms you can't write them like this, and so it's not true that orthogonal matrices preserve other norms. You can in fact ask whether there are classes of matrices that preserve, for example, the L1 norm or the L-infinity norm. You can always find a matrix that preserves the L1 norm or L-infinity norm of one particular vector. But apart from special matrices such as permutation matrices, possibly combined with sign changes of the coordinates, no n-by-n matrix preserves the L1 norm or the L-infinity norm of every x. That's again something that you can try to show.
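As a closing numerical sketch (assuming numpy; I use a QR factorization of a random matrix to get a generic orthogonal Q, and the specific vector and permutation are arbitrary): Q preserves the Euclidean norm but generally not the L1 norm, while a permutation matrix preserves every Lp norm, since it only reorders the entries:

```python
import numpy as np

rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # Q is orthogonal: Q^T Q = I
P = np.eye(3)[[2, 0, 1]]                          # a permutation matrix

x = np.array([1.0, 2.0, 3.0])                     # arbitrary example vector

# An orthogonal Q preserves the Euclidean norm ...
print(np.linalg.norm(x, 2), np.linalg.norm(Q @ x, 2))  # equal
# ... but in general not the L1 norm
print(np.linalg.norm(x, 1), np.linalg.norm(Q @ x, 1))  # generally different

# A permutation matrix only reorders entries, so it preserves every Lp norm
for p in [1, 2, np.inf]:
    print(p, np.linalg.norm(x, p), np.linalg.norm(P @ x, p))  # equal
```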