Last time we looked at convergent matrices, discussed polynomials of matrices (in particular the notion of a monic polynomial), and then started on matrix factorizations: Gaussian elimination and, at the end of class, triangular factorization, in particular the LU decomposition. So today I will talk about this LU decomposition. Our goal is to find a lower triangular matrix L and an upper triangular matrix U such that an n × n matrix A can be written as the product A = LU; here A, L and U are all n × n. The basic unit of such a decomposition is the Gauss transform, and a sequence of Gauss transforms is exactly Gaussian elimination. That is to say, there exist matrices M_1, M_2, ..., M_{n-1} such that the product M_{n-1} ... M_2 M_1 A is upper triangular, and it is the same upper triangular matrix you would get by performing Gaussian elimination on A. Here M_k, the kth matrix in the sequence, is the one that introduces zeros below the main diagonal in the kth column, after the previous k-1 transforms have been applied. So the first transform introduces zeros below the diagonal element of the first column; multiplying by M_2 keeps the first column intact but introduces zeros below the main diagonal of the second column; then the third column, and so on. After the first n-1 transforms you have introduced zeros below the diagonal in the first n-1 columns, the result is upper triangular, and the Gaussian elimination process is complete. So what I need to tell you now is how to determine M_1, M_2, and so on. Then, to write A as LU, I need to do two more things: take the matrices M_1 through M_{n-1} to the right-hand side, which leaves A equal to the inverse of their product times U, and then show that this inverse is lower triangular, so that we have effectively written A as L times U. Those are the steps that remain. We start by understanding the structure of the matrix M_k. To answer this, suppose we have already found M_1 through M_{k-1} and we want to find M_k; if we know how to do that, we can start with M_1 and successively find M_2, M_3, up to M_{n-1}. So suppose for some k < n we already have M_1 through M_{k-1}, and define A^(k-1) = M_{k-1} M_{k-2} ... M_1 A. Because each step introduces zeros below the main diagonal of one more column, A^(k-1) has the block structure

A^(k-1) = [ A_11^(k-1)   A_12^(k-1) ]
          [ 0            A_22^(k-1) ]

where A_11^(k-1) is (k-1) × (k-1), the second block row has n-k+1 rows, and the second block column has n-k+1 columns. This is the structure you arrive at, assuming you know how to find M_1 through M_{k-1}. In particular, because we are assuming the first k-1 transforms have done their job, A_11^(k-1) is upper triangular.
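To make the end-to-end claim concrete, here is a minimal numerical sketch (my illustration, not part of the lecture): plain Gaussian elimination without pivoting on the 3 × 3 matrix that appears in the worked example later in this lecture. The helper name lu_no_pivot is an arbitrary choice, and the code assumes every pivot it encounters is nonzero.

```python
# A minimal sketch (my illustration, not from the lecture): Gaussian elimination
# without pivoting, returning L and U such that A = L U.  Assumes every pivot
# encountered is nonzero; the test matrix is the 3 x 3 worked example used
# later in the lecture.
import numpy as np

def lu_no_pivot(A):
    """Return (L, U) with L unit lower triangular, U upper triangular, A = L U."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]       # multiplier l_ik = a_ik / a_kk
            U[i, :] -= L[i, k] * U[k, :]      # row_i <- row_i - l_ik * row_k
    return L, U

A = np.array([[2., -1., 0.], [2., -2., 1.], [-2., -1., 5.]])
L, U = lu_no_pivot(A)
print(np.allclose(A, L @ U))          # True: A = L U
print(np.allclose(U, np.triu(U)))     # True: U is upper triangular
print(np.allclose(np.diag(L), 1.0))   # True: ones on the diagonal of L
```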
Okay, so for the next stage our goal is to find M_k such that M_k A^(k-1) has two properties: (a) the block A_11^(k-1) is not disturbed, because I have already placed zeros where I want them, so that part is preserved; and (b) the first column of A_22^(k-1) ends up with zeros below its first element after multiplication by M_k. So what M_k will do this? It is

M_k = I - α_k e_k^T,

an n × n matrix, where α_k is the n × 1 vector

α_k = (0, ..., 0, l_{k+1,k}, l_{k+2,k}, ..., l_{n,k})^T in R^n,

with k zeros at the front, and e_k is the vector that has zeros everywhere except for a one in the kth position. So α_k is n × 1 and e_k^T is the transpose of an n × 1 vector, which makes α_k e_k^T an n × n matrix. What is the rank of α_k e_k^T? One: any nonzero matrix B of the form u v^T, with u and v n × 1 vectors, has rank one. A matrix M_k of this form is called a Gauss transform. I also need to tell you how to choose the values l_ik: take

l_ik = a_ik^(k-1) / a_kk^(k-1),   for i = k+1, ..., n.

Here I am assuming that a_kk^(k-1) is nonzero; this entry plays a very significant role in Gaussian elimination and is called the pivot element. If you go back to the Gaussian elimination we defined last time, you will find that there we used exactly such a quantity to multiply one row of the matrix before adding it to another row, so this is doing exactly the same process, just written in a different way. Now notice that α_k has nonzero entries only in positions k+1 through n, while e_k has zeros in its first k-1 positions, a one in position k, and zeros after that. So if I multiply the two together, I get an n × n matrix in which the entries of α_k appear in the kth column and every other entry is zero, because everything else multiplies a zero. So M_k has the following structure: ones along the diagonal; in the kth column, below the diagonal, the entries -l_{k+1,k}, -l_{k+2,k}, ..., -l_{n,k}; and zeros everywhere else. In particular, its top-left k × k block is just the k × k identity.
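As a sketch of this construction (illustrative code, not from the lecture; the helper name gauss_transform and the 0-based indexing are my own choices, so gauss_transform(A, 0) plays the role of M_1), the following builds M_k = I - α_k e_k^T explicitly as a rank-one update of the identity and applies M_1 to the 3 × 3 example matrix used later in the lecture.

```python
# Sketch of a single Gauss transform M_k = I - alpha_k e_k^T (my own helper
# name and 0-based indexing: gauss_transform(A, 0) plays the role of M_1).
import numpy as np

def gauss_transform(Akm1, k):
    """Return M_k, which zeros out column k of Akm1 below the pivot Akm1[k, k]."""
    n = Akm1.shape[0]
    alpha = np.zeros(n)
    alpha[k + 1:] = Akm1[k + 1:, k] / Akm1[k, k]   # l_ik = a_ik / a_kk (pivot assumed nonzero)
    e_k = np.zeros(n)
    e_k[k] = 1.0
    return np.eye(n) - np.outer(alpha, e_k)        # rank-one update of the identity

A = np.array([[2., -1., 0.], [2., -2., 1.], [-2., -1., 5.]])
M1 = gauss_transform(A, 0)
print(M1)        # ones on the diagonal, -l_21 and -l_31 in the first column
print(M1 @ A)    # first row unchanged, zeros below the (1,1) entry
```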
Now let's see what M_k A^(k-1) looks like. Work in block form (the board sketch is rough, so the blocks are not drawn exactly to size): M_k is the identity except for the entries -l_{k+1,k}, ..., -l_{n,k} sitting in the kth column below the diagonal, and A^(k-1) has A_11^(k-1) in the top left, A_12^(k-1) in the top right, a zero block in the bottom left, and A_22^(k-1) in the bottom right. When you multiply, the identity rows of M_k reproduce the first k rows of A^(k-1), so A_11^(k-1) is preserved; the zero block in the bottom left gets combined only with rows that are already zero in those columns, so it remains a zero block; and the nontrivial part of M_k acts only on A_22^(k-1), which is where the new zeros below the main diagonal get introduced. So: the first k rows of A^(k-1) remain unchanged under multiplication by M_k, the lower-left block of A^(k-1) remains a zero block, and in effect M_k only affects the A_22^(k-1) block of A^(k-1). Now, these l_ik are exactly the multipliers required by Gaussian elimination to place zeros in those positions of the matrix. So pre-multiplication by M_k performs exactly the same row operations as Gaussian elimination: it replaces row i by

row i - (a_ik^(k-1) / a_kk^(k-1)) × row k,   for k = 1, ..., n-1 and i = k+1, ..., n.

Specifically, look at the kth column: in row i this operation produces a_ik^(k-1) - (a_ik^(k-1) / a_kk^(k-1)) a_kk^(k-1), the a_kk^(k-1) cancels, and you get zero in that position, while all the other entries of the row can potentially change. In other words, A_22^(k-1) is replaced by a matrix whose first column has zeros below its first element. So now we know how to construct the matrices M_1, M_2, ..., M_{n-1}. Each of them is lower triangular by construction, and the product of a sequence of lower triangular matrices is lower triangular. So what we have is

M_{n-1} ... M_2 M_1 A = U,

where the product M_{n-1} ... M_1 is lower triangular and U, the matrix you end up with after all these operations, is upper triangular. Now define this product to be L^(-1). Then L^(-1) A = U, where L^(-1) is lower triangular, and the inverse of a lower triangular matrix is also lower triangular, so this implies A = LU. Also, by construction each M_k is lower triangular with ones along the diagonal; for a triangular matrix the eigenvalues are the diagonal entries, so all the eigenvalues of this product are equal to one, it is nonsingular and can be inverted, and you indeed have A = LU. So there is a one-to-one correspondence between Gaussian elimination and LU factorization; they are equivalent operations.
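The whole chain can be checked numerically. This sketch (again my illustration; each M_k is built directly by placing -l_ik below the pivot, which is the same matrix as I - α_k e_k^T) applies M_1, ..., M_{n-1} in sequence, accumulates their product, and verifies that L = (M_{n-1} ... M_1)^(-1) is unit lower triangular and that A = LU.

```python
# Sketch of the whole chain (my illustration): apply M_1, ..., M_{n-1} in turn,
# so M_{n-1} ... M_1 A = U, then check that L = (M_{n-1} ... M_1)^{-1} is unit
# lower triangular and that A = L U.
import numpy as np

A = np.array([[2., -1., 0.], [2., -2., 1.], [-2., -1., 5.]])
n = A.shape[0]
Ak = A.copy()               # A^(0) = A
Mprod = np.eye(n)
for k in range(n - 1):
    Mk = np.eye(n)
    Mk[k + 1:, k] = -Ak[k + 1:, k] / Ak[k, k]   # the -l_ik entries of M_k
    Ak = Mk @ Ak                                # A^(k) = M_k A^(k-1)
    Mprod = Mk @ Mprod                          # accumulates M_{n-1} ... M_1
U = Ak
L = np.linalg.inv(Mprod)                        # L = (M_{n-1} ... M_1)^{-1}
print(np.allclose(Mprod, np.tril(Mprod)))       # True: the product is lower triangular
print(np.allclose(L, np.tril(L)))               # True: so is its inverse L
print(np.allclose(np.diag(L), 1.0))             # True: ones on the diagonal
print(np.allclose(A, L @ U))                    # True: A = L U
```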
Now, at first glance it appears that in order to find the LU decomposition I need to take these n-1 matrices M_1 through M_{n-1}, multiply them together, and then invert that product to obtain L. But it turns out that L is actually very easy to recover because of the structure of these matrices. First note that L is the inverse of the product, and when you invert, the multiplication order gets reversed:

L = (M_{n-1} ... M_1)^(-1) = M_1^(-1) M_2^(-1) ... M_{n-1}^(-1).

So we will find the structure of L by first identifying the structure of each M_k^(-1), then identifying the structure of their product, and we will see that forming L is very easy. Structure of M_k^(-1): recall that M_k is the matrix such that M_k A^(k-1) = A^(k), where A^(k-1) is a matrix whose top-left (k-1) × (k-1) block is upper triangular with zeros below it, and A^(k) is a matrix whose top-left k × k block is upper triangular with zeros below it. And M_k has the structure I - α_k e_k^T; that was our construction. So we obtain A^(k) from A^(k-1) by subtracting (α_k e_k^T) A^(k-1) from it. If we want to invert this operation and take M_k to the other side, it is intuitive that we should instead perform an addition, and that intuition turns out to be correct. Take the candidate M_k^(-1) = I + α_k e_k^T and multiply it with M_k:

(I + α_k e_k^T)(I - α_k e_k^T) = I + α_k e_k^T - α_k e_k^T - α_k (e_k^T α_k) e_k^T.

Now e_k^T α_k is the inner product between the vector e_k and the vector α_k, and I claim it is equal to zero. Why is that true? Because of the way the matrix is constructed: α_k has nonzero entries only in positions k+1 through n, whereas e_k has a one only in the kth position, and the kth entry of α_k is always zero. Go back to the definition: α_k has k zeros followed by l_{k+1,k}, ..., l_{n,k}, and e_k has a one only in position k. So when you take the inner product of these two vectors, the one multiplies a zero and every nonzero entry of α_k multiplies a zero of e_k, so the inner product is zero. The last term therefore drops out, the two middle terms cancel each other, and the product is the identity matrix. Hence I + α_k e_k^T is the inverse of M_k. Notice that the only thing we used here is that the inner product of these two vectors is zero: in general, if A = I + u v^T and u is orthogonal to v, then A^(-1) = I - u v^T. So finding the inverse of M_k is very easy; all you have to do is change the sign of the rank-one part in M_k = I - α_k e_k^T.
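Here is a quick numerical check (my illustration) of the rank-one inverse identity used above: if v^T u = 0 then (I - u v^T)^(-1) = I + u v^T. The vectors below mimic α_k and e_k; the size n, the index k and the random seed are arbitrary choices.

```python
# Quick numerical check (my illustration) of the identity used above:
# if v^T u = 0 then (I - u v^T)^{-1} = I + u v^T.
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
u = np.zeros(n)
u[k + 1:] = rng.standard_normal(n - k - 1)   # plays the role of alpha_k (zeros up to position k)
v = np.zeros(n)
v[k] = 1.0                                   # plays the role of e_k
M = np.eye(n) - np.outer(u, v)
M_inv = np.eye(n) + np.outer(u, v)
print(v @ u)                                 # 0.0: the orthogonality that makes it work
print(np.allclose(M @ M_inv, np.eye(n)))     # True
print(np.allclose(M_inv @ M, np.eye(n)))     # True
```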
Concretely, if M_k is the matrix with ones along the diagonal, the entries -l_{k+1,k}, ..., -l_{n,k} in the kth column below the diagonal, and zeros everywhere else, then M_k^(-1) is the same matrix with those entries replaced by +l_{k+1,k}, ..., +l_{n,k}, and ones along the diagonal as before. So we now know how to find M_k^(-1), and that is trivially easy once we have M_k. Now look at the structure of L. L is the product of these inverses,

L = M_1^(-1) M_2^(-1) ... M_{n-1}^(-1) = product over k = 1, ..., n-1 of (I + α_k e_k^T),

with a plus sign inside each factor because these are the inverse matrices. If I expand this product out, I get the identity matrix (from multiplying all the identity terms together), plus the terms where I pick one factor α_k e_k^T and identities everywhere else, which gives the sum over k of α_k e_k^T, plus cross terms of the form α_i (e_i^T α_j) e_j^T with i < j. Now e_i^T α_j picks out the ith entry of α_j, and α_j has nonzero entries only from position j+1 onwards; since i < j, that entry is zero, so every such inner product is zero and all the cross terms drop out. Therefore

L = I + Σ_{k=1}^{n-1} α_k e_k^T.

That's it. The identity matrix puts the ones on the main diagonal of L, and each α_k e_k^T contributes nonzero entries only below the main diagonal in the kth column. So L is unit lower triangular, unit meaning it has ones on the main diagonal, with nonzero entries only below the diagonal. For example, for n = 4, L has the form

L = [ 1                     0                     0                     0 ]
    [ a_21^(0)/a_11^(0)     1                     0                     0 ]
    [ a_31^(0)/a_11^(0)     a_32^(1)/a_22^(1)     1                     0 ]
    [ a_41^(0)/a_11^(0)     a_42^(1)/a_22^(1)     a_43^(2)/a_33^(2)     1 ]

where A^(0) is just A itself, so the first column of L comes from the entries of A, the second column from the entries of A^(1), and the third from A^(2). So, given the sequence of Gauss transforms, you can form the matrix L without any further explicit computation: the inversions are accomplished by flipping a set of signs, and the multiplication is accomplished by simply placing the nonzero elements of each α_k into the appropriate positions of L; a code sketch of this follows below. So, just to reiterate, the sequence of Gauss transforms we performed is exactly the same as Gaussian elimination, and therefore the LU decomposition is really a high-level description of Gaussian elimination; there is no difference between the two. Gaussian elimination itself takes on the order of 2n^3/3 floating point operations, and the same count carries over to the factorization I just described.
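The "L comes for free" point can be seen in code. A common way to organize this (a sketch under my own naming, not the lecture's pseudocode) is to store each multiplier l_ik in the entry it has just zeroed out, so that at the end L = I + Σ_k α_k e_k^T is read straight off the strictly lower triangle and U off the upper triangle; no Gauss transform is ever formed or inverted explicitly.

```python
# Sketch (my own organization, not the lecture's pseudocode): store each
# multiplier l_ik in the entry it just zeroed out, so that at the end
# L = I + sum_k alpha_k e_k^T is read off the strictly lower triangle.
import numpy as np

def lu_in_place(A):
    n = A.shape[0]
    A = A.astype(float).copy()
    for k in range(n - 1):
        A[k + 1:, k] /= A[k, k]                                    # store l_ik in column k
        A[k + 1:, k + 1:] -= np.outer(A[k + 1:, k], A[k, k + 1:])  # update the trailing block
    L = np.tril(A, -1) + np.eye(n)    # I + sum_k alpha_k e_k^T
    U = np.triu(A)
    return L, U

A = np.array([[2., -1., 0.], [2., -2., 1.], [-2., -1., 5.]])
L, U = lu_in_place(A)
print(L)                          # unit lower triangular, multipliers below the diagonal
print(np.allclose(A, L @ U))      # True
```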
In fact this is the lowest operation count of any triangularization technique for square matrices, if you do not exploit any further structure in the matrix. Now, since L is unit lower triangular, the determinant of L is equal to one. And since A = LU, det(A) = det(L) det(U) = det(U); U is upper triangular, so det(U) is the product of its diagonal entries,

det(A) = det(U) = Π_{i=1}^{n} u_ii,

so the product of the diagonal elements of U gives you the determinant of A. Okay, let's illustrate this with an example. Suppose my matrix A is

A = [  2   -1    0 ]
    [  2   -2    1 ]
    [ -2   -1    5 ]

Then M_1 is lower triangular with ones on the diagonal. Its (2,1) entry is -a_21/a_11 = -2/2 = -1, its (3,1) entry is -a_31/a_11 = -(-2)/2 = +1, and its (3,2) entry is zero, because in this step we only place nonzero entries below the diagonal of the first column:

M_1 = [  1   0   0 ]
      [ -1   1   0 ]
      [  1   0   1 ]

This is exactly I - α_1 e_1^T; for your reference later, the below-diagonal entries are -a_21/a_11 and -a_31/a_11. Now if I compute M_1 A, the first row stays as it is, zeros are placed below the first diagonal entry, and the remaining entries you actually have to calculate:

A^(1) = M_1 A = [ 2   -1   0 ]
                [ 0   -1   1 ]
                [ 0   -2   5 ]

Once you have A^(1), M_2 can be written down immediately by inspection: it has a single nonzero entry below the main diagonal, in the second column, equal to -a_32^(1)/a_22^(1) = -(-2)/(-1) = -2, and the identity structure everywhere else:

M_2 = [ 1    0   0 ]
      [ 0    1   0 ]
      [ 0   -2   1 ]

Then

U = M_2 M_1 A = M_2 A^(1) = [ 2   -1   0 ]
                            [ 0   -1   1 ]
                            [ 0    0   3 ]

so I have reduced A to upper triangular form. And L = M_1^(-1) M_2^(-1) = I + Σ_{k=1}^{2} α_k e_k^T, so once again all I have to do is flip the signs of those below-diagonal entries and place them below the main diagonal:

L = [  1   0   0 ]
    [  1   1   0 ]
    [ -1   2   1 ]

You should check that A = LU: multiply this L with this U and you get back the matrix A. And the determinant of A equals the product of the diagonal entries of U, which is 2 × (-1) × 3 = -6. You can easily check these for yourself. So this is how the LU decomposition works.
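For completeness, a few lines verifying the worked example with numpy (a check I am adding here, not part of the lecture notes):

```python
# Verifying the worked example with numpy (a check added here, not lecture notes).
import numpy as np

A = np.array([[2., -1., 0.], [2., -2., 1.], [-2., -1., 5.]])
L = np.array([[1., 0., 0.], [1., 1., 0.], [-1., 2., 1.]])
U = np.array([[2., -1., 0.], [0., -1., 1.], [0., 0., 3.]])
print(np.allclose(L @ U, A))                   # True: A = L U
print(np.prod(np.diag(U)), np.linalg.det(A))   # -6.0 and ~-6.0: det(A) = u_11 u_22 u_33
```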
Okay, so that is the vanilla version of Gaussian elimination, or LU decomposition. But there are things one can do to make it more numerically stable, and that is called Gaussian elimination with pivoting. We start with a simple arithmetic example just to illustrate why this is needed. Suppose we do our computations in base-10 arithmetic but can only store three digits of any numerical result; all computers operate with finite precision arithmetic, so they have to chop off digits that cannot be represented. Suppose I want to solve the system of equations

[ 0.001   1 ] [ x1 ]   [ 1 ]
[ 1       2 ] [ x2 ] = [ 3 ]

You can already see that if I set x1 = x2 = 1, the two left-hand sides become 1.001 and 3, which almost solves the system; so if I get an answer close to (1, 1), I know I have essentially solved the problem. Now try to do this using the LU decomposition. Remember that the LU decomposition is one way to solve such systems, and it gives a computational advantage in very large systems because, once you have the factorization, you solve with one step of forward substitution followed by one step of backward substitution. If you work out the LU decomposition exactly as in the numerical example above, but in three-digit precision arithmetic, you get (I write L hat and U hat because these are the values you obtain when the computation is carried out carefully but in finite, three-digit precision)

L hat = [ 1      0 ]        U hat = [ 0.001    1     ]
        [ 1000   1 ]                [ 0       -1000  ]

The multiplier 1000 is just the ratio of the two entries in the first column. The exact (2,2) entry of U would be 2 - 1000 × 1 = -998, but because of the round-off, or chopping, of three-digit arithmetic it ends up stored as -1000. Now if you compute L hat U hat, which is easy to do quickly, the (1,1) entry is 0.001, the (1,2) entry is 1, the (2,1) entry is 1, and the (2,2) entry comes out to 0:

L hat U hat = [ 0.001   1 ]  =  [ 0.001   1 ]  +  [ 0    0 ]
              [ 1       0 ]     [ 1       2 ]     [ 0   -2 ]

So L hat U hat is quite different from the matrix A; you are actually making a large error. And if you use this L hat and U hat and then do your forward substitution followed by backward substitution, what you get is x hat = (0, 1), the computed answer truncated to three digits, which is quite different from the correct solution truncated to three digits, namely (1.002, 0.998). So the problem arose because, when I tried to find L, I had to divide 1 by 0.001, and that gave me the large factor 1000; in finite precision arithmetic these kinds of large numbers mess up your calculations.
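Three-digit decimal arithmetic is awkward to emulate directly, so before continuing with the row exchange, here is a hedged stand-in (my own illustration, not the lecture's setup): the same system in double precision, but with the pivot shrunk to 1e-20 so that the identical breakdown appears at machine precision instead of in the third digit.

```python
# Double-precision stand-in for the three-digit example: the tiny pivot produces
# a huge multiplier, the factorization loses the (2,2) information, and the
# computed solution is useless.
import numpy as np

eps = 1e-20
A = np.array([[eps, 1.], [1., 2.]])
b = np.array([1., 3.])

l21 = A[1, 0] / A[0, 0]                                   # 1e20: a huge multiplier
U = np.array([[eps, 1.], [0., A[1, 1] - l21 * A[0, 1]]])  # u_22 = 2 - 1e20, stored as -1e20
L = np.array([[1., 0.], [l21, 1.]])
y = np.array([b[0], b[1] - l21 * b[0]])                   # forward substitution
x2 = y[1] / U[1, 1]
x1 = (y[0] - U[0, 1] * x2) / U[0, 0]                      # back substitution
print(x1, x2)                     # 0.0 1.0: the useless answer from the tiny pivot
print(np.linalg.solve(A, b))      # ~[1. 1.]: the accurate answer (LAPACK pivots)
```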
If instead I exchange rows one and two and then perform the LU decomposition, I start from the system

[ 1       2 ] [ x1 ]   [ 3 ]
[ 0.001   1 ] [ x2 ] = [ 1 ]

Note that when I exchange the rows of the matrix, the entries of the right-hand side get exchanged in the same way. Now if I do the LU decomposition on this exchanged system, I get

L hat = [ 1       0 ]        U hat = [ 1   2     ]
        [ 0.001   1 ]                [ 0   0.998 ]

where 0.001 is the ratio of the two entries in the first column. And if I now compute L hat U hat, I get

L hat U hat = [ 1       2 ]
              [ 0.001   1 ]

which, to three digits, is exactly the row-exchanged matrix. The solution computed with this L hat and U hat is accurate to three decimals; in other words, it gives exactly the correct solution I wrote earlier. So the basic idea is to stabilize Gaussian elimination by exchanging rows and columns; when I exchange rows, the right-hand side gets exchanged, when I exchange columns, the entries of the unknown vector get exchanged, and I just have to undo that exchange after I have solved the problem. We will stabilize Gaussian elimination by exchanging rows and columns so that the element of largest magnitude ends up in the top-left, pivot position of the block being eliminated. That is the core idea of Gaussian elimination with pivoting. To do this we will use permutation matrices, which I will denote by P and Π: P is the identity matrix with some rows i and j exchanged (I could write it as P_ij, but I am simplifying the notation here), and Π is the identity matrix with columns i and j exchanged. Pre-multiplying by P, that is forming PA, exchanges rows i and j of A, and post-multiplying, AΠ, exchanges columns i and j. So we will perform this kind of row and column exchange so that at each step the element of largest possible magnitude becomes the pivot element, and then use that to construct the LU decomposition. I will cover that in the next class, so we will stop here for today and continue again on Monday.
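As a closing illustration (mine, not the lecture's), here is the same double-precision stand-in after the row exchange, written with an explicit permutation matrix P as introduced above; the multiplier is now tiny and the computed solution is accurate.

```python
# The same double-precision stand-in after exchanging the rows with an explicit
# permutation matrix P: the multiplier is harmless and the answer is accurate.
import numpy as np

eps = 1e-20
A = np.array([[eps, 1.], [1., 2.]])
b = np.array([1., 3.])

P = np.array([[0., 1.], [1., 0.]])     # identity with rows 1 and 2 exchanged
PA, Pb = P @ A, P @ b                  # [[1, 2], [eps, 1]] and [3, 1]

l21 = PA[1, 0] / PA[0, 0]              # eps: a harmless multiplier now
U = np.array([[PA[0, 0], PA[0, 1]], [0., PA[1, 1] - l21 * PA[0, 1]]])
L = np.array([[1., 0.], [l21, 1.]])
y = np.array([Pb[0], Pb[1] - l21 * Pb[0]])   # forward substitution
x2 = y[1] / U[1, 1]
x1 = (y[0] - U[0, 1] * x2) / U[0, 0]         # back substitution
print(x1, x2)                          # 1.0 1.0: accurate after the row exchange
```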