In the last lecture, we were looking at direct methods for solving sparse linear systems. Sparse linear systems are ones in which the matrix has very few non-zero elements and a large number of zeros. Towards the end of the lecture, we looked at the Thomas algorithm, which is for tri-diagonal systems. Before that, we also looked at block-diagonal systems, where you can exploit the structure to come up with a computational method that is significantly more efficient than conventional Gaussian elimination. For the Thomas algorithm, I said that the number of multiplications and divisions is linearly proportional to n, which is a significant reduction from n^3/3, particularly for large matrices. If n is 1000, n^3/3 is a very large number compared to 5n - 8; just compare the number of multiplications and divisions you have to do for a modest n of 1000.

Continuing with this, today I am going to talk about the Thomas algorithm for block tri-diagonal matrices. For sparse systems we looked at tri-diagonal systems, and now we look at this special class. I am trying to solve an equation A x = d where A is a block tri-diagonal matrix, and I again split x into sub-vectors x^(1), x^(2), up to x^(n). In the earlier case, the entries B_1, C_1, A_2, B_2, C_2 were all scalars, and what we derived earlier was the Thomas algorithm for that scalar case. If you look carefully, the matrix we have here is not really tri-diagonal, because each of these entries could itself be a dense matrix; if each of them is dense, the raw matrix is not tri-diagonal, it may have multiple non-zero diagonals. But we are able to identify blocks B_1, C_1, A_2, B_2, C_2 and so on such that the matrix is block tri-diagonal. What do I mean by a block tri-diagonal matrix? The so-called elements of this matrix are themselves matrices: B_1 is a matrix, C_1 is a matrix, A_2 is a matrix, B_2 is a matrix, C_2 is a matrix, each of appropriate dimension. Is everyone with me on this? These are sub-matrices within this huge matrix. So B_1 could be 3 x 3; if the next diagonal block is 4 x 4, then the coupling block between them would be 4 x 3, and so on. They need not all be of the same dimension; they only have to have compatible dimensions.

Now I want to exploit the special structure: this matrix has a lot of zero blocks, null matrices, which is why I am writing them in square brackets. So this is a huge matrix whose elements are sub-matrices, and the unknowns x^(1), x^(2) and so on are sub-vectors of appropriate dimension: if B_1 is 3 x 3, x^(1) will be 3 x 1; if B_2 is 4 x 4, x^(2) will be 4 x 1. The right-hand side has also been divided into sub-vectors, and now I am just going to apply the Thomas algorithm that we came up with. Step 1 is called block triangularization. What do we do in Gaussian elimination?
In Gaussian elimination, we make everything below the main diagonal zero. Here we are not going to make scalar entries zero; we are going to make the blocks below the main block diagonal zero, and retain matrices along the main block diagonal. The block triangularization starts with Gamma_1 = B_1^(-1) C_1. If you look at the Thomas algorithm which I developed for the scalar case, you will notice that what I am doing here is just the matrix analog of what I did there; it is not different. Earlier I had written a scalar gamma; here I am writing a capital Gamma. You have to be careful now because you have matrices: you have to talk about pre-multiplication and post-multiplication very carefully, you cannot interchange the order, and whatever was a division earlier comes out to be a matrix inverse here. Then beta_1 = B_1^(-1) d_1, and in general the recursion is the matrix analog of the scalar one:

Gamma_k = (B_k - A_k Gamma_{k-1})^(-1) C_k,
beta_k = (B_k - A_k Gamma_{k-1})^(-1) (d_k - A_k beta_{k-1}).

So my first step is block triangularization. In this step, if I go back to the matrix, every diagonal block is replaced by an identity matrix of appropriate dimension, everything below the main block diagonal becomes a null matrix, and the blocks above the diagonal are replaced by Gamma_1, Gamma_2, and so on. The right-hand-side sub-vectors are replaced by beta_1, beta_2, up to beta_n.

Now, notice one thing: even though you are doing matrix inversions here, these are small matrices. Each one of them is 3 x 3 or 4 x 4; even if it is 10 x 10, these are small compared to the big matrix. So the individual inverses involved are small, and you can compute them by a standard method, Gaussian elimination or Gauss-Jordan, with relatively few computations. If I were to apply Gaussian elimination to the big matrix, the computations would be much more than doing these small inverses.

The next step is, of course, the backward sweep. In the backward sweep we start from the last block equation: I x^(n) = beta_n, so this is itself the solution for the x^(n) component. Using x^(n) in the second-last equation I can recover x^(n-1), then x^(n-2), and going back I recover the entire state vector via x^(k) = beta_k - Gamma_k x^(k+1). So this algorithm exploits the large number of zeros. Actually, if you go back and look at this matrix in its original form (right now I have written it in block tri-diagonal form), it is a banded matrix: there are a few non-zero diagonals, and everything above and below them is zero. We are able to express those diagonals in terms of these sub-matrices, and that is why we are able to come up with a computationally efficient solution for this particular problem.
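To make this concrete, here is a minimal sketch of the block Thomas algorithm in Python with NumPy. The list-of-blocks data layout and the function name are my own illustrative choices, not from the lecture notes, and I use a linear solve in place of forming each small inverse explicitly, which is the same arithmetic.

```python
import numpy as np

def block_thomas(A_sub, B_diag, C_sup, d):
    """Solve a block tri-diagonal system A x = d.

    A_sub[k-1] : sub-diagonal block A_k,   k = 2..n
    B_diag[k-1]: diagonal block B_k,       k = 1..n
    C_sup[k-1] : super-diagonal block C_k, k = 1..n-1
    d[k-1]     : right-hand-side sub-vector d_k
    """
    n = len(B_diag)
    Gamma = [None] * (n - 1)
    beta = [None] * n

    # Step 1: block triangularization (forward pass).
    Gamma[0] = np.linalg.solve(B_diag[0], C_sup[0])
    beta[0] = np.linalg.solve(B_diag[0], d[0])
    for k in range(1, n):
        Bk = B_diag[k] - A_sub[k - 1] @ Gamma[k - 1]   # small dense block
        if k < n - 1:
            Gamma[k] = np.linalg.solve(Bk, C_sup[k])
        beta[k] = np.linalg.solve(Bk, d[k] - A_sub[k - 1] @ beta[k - 1])

    # Step 2: backward sweep, x_n = beta_n, then x_k = beta_k - Gamma_k x_{k+1}.
    x = [None] * n
    x[-1] = beta[-1]
    for k in range(n - 2, -1, -1):
        x[k] = beta[k] - Gamma[k] @ x[k + 1]
    return x
```

Each solve here involves only a small diagonal block, so the work grows linearly with the number of blocks rather than cubically with the full matrix dimension.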
There are many other such forms; as I told you, if you go to the Matlab sparse-matrix toolbox, you will find many more. Another simple form is triangular matrices, either lower triangular or upper triangular. Where do you get upper or lower triangular matrices? In Gaussian elimination: you get an upper triangular matrix and then you do a backward sweep. If you have a lower triangular matrix you can do the computations very fast, and likewise, if you have a block lower triangular matrix you can come up with an algorithm which is again very fast; exploit the structure, do not waste effort on the zeros, and construct the solution (a small sketch of such a triangular solve follows at the end of this passage). Just look at the notes; I am not going to do it on the board, because we have to move on and there is one more concept I want to introduce.

Block triangular, lower triangular, upper triangular: these are special structures, and you can go on exploiting such structures to come up with efficient modifications of Gaussian elimination suited to a specific structure, so that you can do fast computations. That is the idea. As I said, my motivation behind talking about sparse matrices was to sensitize you to the fact that there exists something called sparse matrix computations. Where do these problems originate? In problem discretization: discretization of boundary value problems and of partial differential equations by finite elements, finite differences, or orthogonal collocation on finite elements. All of these give rise to nice structures with a lot of zeros, and you can exploit that to come up with efficient solutions.

This is particularly useful when you have iterative procedures, where you may have to solve equations of the form A x = b many times. In Newton-Raphson you are actually solving A x = b multiple times. When you do a Newton-Raphson step on a large-scale problem, you never actually form the inverse; it is fine to do that for a beginner problem with three or four variables, but when you have a large number of equations, you actually solve the linear problem A Δx = -f(x_k) and then add Δx to get the new x. So you have to solve linear algebraic equations multiple times, and that is where exploiting the structure pays off. If your solution scheme gives rise to a specific structure in the A matrix, that structure will appear in every iteration; only the numbers change from iteration to iteration, the structure of the matrix remains the same. So if you are calling a specific sparse subroutine within your Newton-Raphson, that subroutine remains the same; it is not going to change.
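Coming back to the triangular systems mentioned above, here is a minimal forward-substitution sketch, the lower triangular analog of the backward sweep; this is standard material I am adding for illustration, not something worked out on the board.

```python
import numpy as np

def forward_substitution(L, b):
    """Solve L x = b for lower triangular L in about n^2/2 multiplications,
    never touching the zeros above the diagonal."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n):
        # Row i of L has non-zeros only in columns 0..i.
        x[i] = (b[i] - L[i, :i] @ x[:i]) / L[i, i]
    return x
```

An upper triangular system is solved the same way with the loop run backwards, which is exactly the backward sweep after Gaussian elimination.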
So, suppose in the TRAM problem I decide to do, say, orthogonal collocation on three finite elements. Then in every iteration I will get a matrix whose numbers might change, but the structure will be the same: wherever zeros appear, zeros will appear every time, because of the structure of your discretization. Or, if you decide to do it using finite differences, within the iterations that matrix is always going to be tri-diagonal; it is not going to change. Those features remain the same; just remember that, and you can actually make use of them.

There is one more trick for reducing the computations when you have a large system. Now I am not necessarily talking about sparse matrices; what I am going to talk about need not be a sparse matrix, but it is a trick to make computations fast: solving by matrix partitioning. I am going to explain the basic concept; it can be extended to more complex partitionings. I want to solve A x = b where A is a large matrix, and I do not want to invert the large matrix or use Gaussian elimination on the entire big matrix. One possibility is to transform this equation into two sub-equations. I partition A into four sub-matrices A_11, A_12, A_21, A_22, and correspondingly partition x into x_1, x_2 and b into b_1, b_2. (If one of the off-diagonal blocks happens to be almost a null matrix, great, you have a simplified solution.) So we have two equations, the first being A_11 x_1 + A_12 x_2 = b_1. Can you try and solve this? How will I proceed? Let us say A_11 is invertible. Then I can eliminate x_1 using the first equation: x_1 = A_11^(-1) (b_1 - A_12 x_2). What next? I take the second equation, A_21 x_1 + A_22 x_2 = b_2, and substitute this x_1 into it: A_21 A_11^(-1) (b_1 - A_12 x_2) + A_22 x_2 = b_2. Now I can rearrange and put all the terms in x_2 together, which gives

(A_22 - A_21 A_11^(-1) A_12) x_2 = b_2 - A_21 A_11^(-1) b_1.

What is the advantage of doing this? Just remember that Gaussian elimination requires computations proportional to n^3/3. Suppose this is a 1000 x 1000 matrix and I decide to divide it into 500 x 500 blocks. Then the first problem requires a Gaussian elimination for a 500 x 500 matrix; whatever I compute from A_11 I compute once, store, and reuse in the next step. Then I have to do one more Gaussian elimination of a 500 x 500 system, and two eliminations at 500 x 500 require far less computation than one at 1000 x 1000.
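Here is a minimal sketch of this partitioned solve in Python with NumPy, assuming A_11 is invertible. The function name, the block split index m, and the use of solve() instead of explicit inverses are my own illustrative choices.

```python
import numpy as np

def partitioned_solve(A, b, m):
    """Solve A x = b by partitioning A into 2x2 blocks at index m,
    eliminating x1 and solving the reduced system for x2."""
    A11, A12 = A[:m, :m], A[:m, m:]
    A21, A22 = A[m:, :m], A[m:, m:]
    b1, b2 = b[:m], b[m:]

    # Factor-and-reuse: solve A11 Y = [A12 | b1] once.
    Y = np.linalg.solve(A11, np.hstack([A12, b1[:, None]]))
    A11inv_A12, A11inv_b1 = Y[:, :-1], Y[:, -1]

    # Reduced system: (A22 - A21 A11^(-1) A12) x2 = b2 - A21 A11^(-1) b1.
    S = A22 - A21 @ A11inv_A12
    x2 = np.linalg.solve(S, b2 - A21 @ A11inv_b1)
    x1 = A11inv_b1 - A11inv_A12 @ x2
    return np.concatenate([x1, x2])

# Quick check against a direct solve on a small test matrix:
A = np.random.rand(6, 6) + 6 * np.eye(6)   # well-conditioned test matrix
b = np.random.rand(6)
assert np.allclose(partitioned_solve(A, b, 3), np.linalg.solve(A, b))
```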
So, likewise, I have just done a simple partitioning here; I could think of blocks A_11, A_12, A_13, A_21, A_22, A_23 and make multiple partitions. Actually, matrix theory is very interesting and has a history of some 100 to 150 years. Cayley was one of the founders of matrix theory; you probably know the Cayley-Hamilton theorem. Cayley and a second mathematician, whose name I forget (I will try to remember and tell you), were friends. Incidentally, Cayley's friend, though he was a mathematician, was never granted his degree: at that time in England, to graduate you had to take an oath as a devout Christian, and being a Jew he refused to take that oath. So he was not given a degree in mathematics, and he took to law, and during the recesses between court sessions these two used to work on matrix theory. That is what has come down to us. They came up with a voluminous matrix theory, and they are known as the twins of... I will tell you the name tomorrow. So there is a lot to matrix theory, and a great deal has been developed. The Bible of matrix theory would be the book by Gantmacher; it was published, I think, around 1945, and we have it in the library. It is huge, like two times Perry's handbook, and that was back then; you can imagine what matrix theory must be by now. What we look at as matrix theory is just the tip of the iceberg, and linear equation solving is the bread and butter of numerical computing: you need it everywhere, you cannot live without it. So this is about sparse matrices and matrix partitioning, and you have all kinds of tricks to make your computations faster.

Now, this matrix partitioning method still belongs to the direct methods; I have not gone to iterative methods yet. So I am going to move on to iterative methods. The idea is that you start with a guess solution. Again, I want to solve A x = b. Many times it is difficult to come up with a general count of the number of multiplications and divisions for iterative methods, but in general it can be much smaller than for Gaussian elimination based methods. The next few lectures I am going to spend on how to solve this problem iteratively. What is an iterative scheme? I start with some guess solution, let us say x^(0), and somehow I form a sequence such that x^(k+1) is generated from x^(k): x^(0) will give me x^(1), x^(1) will give me x^(2), x^(2) will give me x^(3), and so on. A very simple, crude way of doing this: I rearrange the equation A x = b as [I + (A - I)] x = b, that is, x = (I - A) x + b. I have just rewritten the same equation, and from it I can form an iteration scheme: x^(k+1) = (I - A) x^(k) + b.
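As a quick demonstration, here is that crude scheme in Python; the 2 x 2 test matrix is my own, chosen close to the identity so that this particular iteration happens to converge. For an arbitrary A it may well diverge, which is exactly the convergence question taken up below.

```python
import numpy as np

def naive_iteration(A, b, x0, n_iter=50):
    """Iterate x^(k+1) = (I - A) x^(k) + b, a rewriting of A x = b."""
    I = np.eye(A.shape[0])
    x = x0
    for _ in range(n_iter):
        x = (I - A) @ x + b
    return x

A = np.array([[1.1, 0.2], [0.1, 0.9]])   # close to I, so (I - A) is "small"
b = np.array([1.0, 2.0])
print(naive_iteration(A, b, np.zeros(2)))   # approaches
print(np.linalg.solve(A, b))                # the true solution
```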
So, I start with a guess solution x^(0), I get x^(1); I put back x^(1), I get x^(2); I get a sequence of vectors. Then, well, I need to talk about convergence, so I have to say when to terminate this. I am going to terminate when the ratio ||x^(k+1) - x^(k)|| / ||x^(k+1)|| becomes very small, less than or equal to some epsilon, where epsilon is typically a very small number, 10^(-10) or 10^(-8) or something. I will write a more generic form of the iterations:

x^(k+1) = (I - M A) x^(k) + M b,

where M is a matrix which I have to choose such that this sequence will converge to the solution; that is what I want. What will happen if M is exactly equal to A^(-1)? Then you will get the solution in one step, because the right-hand side reduces to A^(-1) b. Here M is called an approximate inverse of A. How to construct these iterations we will see now, but the basic idea is this: I start with a guess, I construct a new guess, and I go on iterating till it converges to a solution.

Another way of looking at convergence is through a residual. I look at the residual vector r^(k) = b - A x^(k). When you arrive at the solution, what should this difference be? It should be exactly zero; but in numerical computations we do not look for zeros, we look for a small number. So another criterion for convergence could be ||r^(k)|| / ||b|| (it is good to normalize this with b) less than or equal to some tolerance. Call the first tolerance epsilon_1 and this one epsilon_2: either I look for the first criterion to be satisfied or for this one, for terminating my iterations, and I iterate till I reach a solution. This is the philosophy of iterative schemes; in the next few lectures we will see how to form these iterations and under what conditions you are guaranteed to converge to a solution.

See, the problem you might face is this: I am trying to solve a large system of equations and I have to guess a solution. That is a tough problem, because how do you guess a vector which is 1000 x 1 or 10000 x 1? Well, that is where you have to use your knowledge from physics and chemical engineering. But fortunately, what we will see is that if you take care in correctly choosing this M matrix, you are guaranteed convergence to the true solution even if you give a wrong, completely arbitrary guess, once you understand what makes the convergence work; and that is what we are going to see. First we will look at the algorithmic part of these methods; after we are done with that, we will move to analysis: under what conditions do these methods converge, and can you do some tweaks and tricks to make them converge faster? Those things we will look at.
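A minimal sketch of this generic scheme with both termination criteria might look as follows. The choice of M as the inverse of the diagonal of A is just one crude approximate inverse, anticipating the Jacobi method below, and the tolerances are illustrative assumptions, not prescribed in the lecture.

```python
import numpy as np

def generic_iteration(A, b, x0, M, eps1=1e-10, eps2=1e-10, max_iter=10000):
    """Iterate x^(k+1) = (I - M A) x^(k) + M b, terminating on either a
    small relative step or a small normalized residual."""
    x = x0
    for k in range(max_iter):
        x_new = x + M @ (b - A @ x)     # algebraically (I - M A) x + M b
        step = np.linalg.norm(x_new - x) / np.linalg.norm(x_new)
        resid = np.linalg.norm(b - A @ x_new) / np.linalg.norm(b)
        x = x_new
        if step <= eps1 or resid <= eps2:
            break
    return x, k

A = np.array([[4.0, 1.0], [2.0, 5.0]])
M = np.diag(1.0 / np.diag(A))           # a crude approximate inverse of A
x, k = generic_iteration(A, np.array([1.0, 2.0]), np.zeros(2), M)
```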
So, let us start developing these methods one by one; I am going to talk about three methods which are very prominently used. The first is called the Jacobi method; some of you might have done this in your undergraduate curriculum. These iterative schemes are very often used while solving partial differential equations, because you just want to get close to the solution very quickly. Well, you might wonder: with these iterative methods, am I getting the true solution, or am I only going close to the true solution? You are getting an approximate solution, no doubt; but even when you do Gaussian elimination, you too are getting an approximate solution, because you are doing truncation and there are all kinds of errors. Even when you want to solve A x = b, you normally end up solving Ã x = b̃; for example, if the element pi appears in your matrix, you cannot represent pi exactly. The computer has finite precision, so when you do multiplications you get truncation and overflow errors. So even when you solve using Gaussian elimination, you cannot construct the true solution; there are approximations there too, so there is nothing to feel bad about in an approximate solution.

So, let us look at the Jacobi method, which is the simplest one. Let us say x^(k) is my k-th guess solution, where k starts, of course, from 0. Starting from this guess, I want to create a new guess. The way I am going to do it is to look at each equation in this set of equations line by line. How many equations do I have? I have n equations in n unknowns. Let us look at the first equation:

a_11 x_1^(k) + a_12 x_2^(k) + ... + a_1n x_n^(k) = b_1.

Is everyone with me on this? Well, actually, since x^(k) is a guess, this is not exactly an equality, but for the time being just understand how the method is developed. I have this guess solution and I want to construct a new guess. What I am going to do is rearrange this equation and say that

x_1^(k+1) = (1 / a_11) [b_1 - a_12 x_2^(k) - ... - a_1n x_n^(k)].

Is this fine? I am starting from the previous guess and constructing a new guess for x_1. Is it clear what I have done? Just look at one equation. Of course, my assumption is that a_11 is not 0. How will I use the second equation? I can do the same trick. Skipping the in-between steps, I can write

x_2^(k+1) = (1 / a_22) [b_2 - a_21 x_1^(k) - a_23 x_3^(k) - ... - a_2n x_n^(k)]:

use the second row, take the x_2 term to the left-hand side, and you get this equation; I have just done the same thing as before. The x_2 term has been taken to the left-hand side, and all the remaining terms of the row appear inside the bracket. Is everyone with me on this? Is this clear? Likewise I can go on, and in general I can write

x_i^(k+1) = (1 / a_ii) [b_i - sum over j ≠ i of a_ij x_j^(k)].

If you have the notes, in equation (59) just correct a typo: it should be b_i, not b_2.
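Here is a minimal Jacobi sweep in Python, assuming all diagonal entries are non-zero (the requirement discussed next); a fixed iteration count stands in for the stopping criteria above.

```python
import numpy as np

def jacobi(A, b, x0, n_iter=100):
    """Jacobi iteration: every component of the new guess is built
    entirely from the old guess."""
    n = len(b)
    x = x0.copy()
    for _ in range(n_iter):
        x_new = np.empty(n)
        for i in range(n):
            # x_i^(k+1) = (b_i - sum_{j != i} a_ij x_j^(k)) / a_ii
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x_new[i] = (b[i] - s) / A[i, i]
        x = x_new   # old and new guesses are kept as two separate vectors
    return x
```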
So, I have written the expression in general for the i-th row. What is the implicit assumption here? The implicit assumption is that all the diagonal elements are non-zero. If that is not the case, you have to do a rearrangement of your equations such that all the diagonal elements are non-zero; that is a critical thing here, you cannot implement this algorithm unless that is done. But you can see that it is very simple to do these calculations: I am just generating a new guess from the old guess.

Yes, that is the next modification; she has rightly guessed it. It is called the Gauss-Seidel method; that is right, Smita. The obvious thing to do next is this: if you are generating a new guess for x_1 here, why continue with the old guess when you go to x_2? In each sweep I start from the first equation and generate x_1^(k+1); having generated x_1^(k+1), in the second equation I need not use the old guess, I can just replace x_1^(k) by x_1^(k+1). Likewise, when I go to x_3, I will use the new x_1 and x_2 and the old x_4 to x_n, and so on. In general, in the i-th expression I will use the new values x_1^(k+1) up to x_{i-1}^(k+1), and from x_{i+1} onwards I will use the old values. This obvious modification is called the Gauss-Seidel method, and what you can show is that its convergence is much better than the Jacobi method; it will converge much faster, and it is very simple to implement. In fact, even in terms of computer storage it is easier, because in the earlier case you have to maintain two separate vectors, the old guess and the new guess, whereas here you can just keep using the new values as they are created. Those of you who know programming well will appreciate that you do not have to maintain two vectors; you can have just one vector, and the new value will automatically be used in the next equation. So implementation-wise this is much more efficient, and its convergence properties are also better; we will see why a little later.
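A minimal Gauss-Seidel sweep, again assuming non-zero diagonal entries; notice that a single vector is overwritten in place, so each equation automatically picks up the newest values.

```python
import numpy as np

def gauss_seidel(A, b, x0, n_iter=100):
    """Gauss-Seidel iteration: components computed earlier in the sweep
    are used immediately, so only one vector is stored."""
    n = len(b)
    x = x0.copy()
    for _ in range(n_iter):
        for i in range(n):
            # x[:i] already holds k+1 values; x[i+1:] still holds k values.
            s = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x[i] = (b[i] - s) / A[i, i]
    return x
```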
So, this is one modification that we make to come up with an iteration scheme. There is one more modification, called the over-relaxation method, and I will talk about it in my next class. We say: from Jacobi to Gauss-Seidel you end up making the convergence faster, so why not accelerate it even further by putting in an acceleration parameter? This acceleration parameter, actually called a relaxation parameter, we will see later. But these iterative methods tend to work much faster than the conventional methods, so for large-scale systems they are often preferred, and the solutions you get are pretty close. If you do a MATLAB experiment in which you solve a problem using these iterative methods, and probably also check how many multiplications and divisions are required, you will get an idea of which particular method is faster.

From the next class onwards, we will start on how to analyze this convergence: how do you make sure that convergence will occur, and are there any tricks to enhance it? All those things we will see from the next class. It would help if you bring the notes, because there are too many summations, the i, j, k business, and instead of writing it all out I want to spend the time explaining rather than writing on the board. So just get a copy of the notes and then we can do it. The other thing I am going to do is this: what I have written so far is the algorithmic part. I am going to rearrange these equations in matrix form, because once I write them in matrix form, it is easier to analyze the convergence behavior. It is difficult to analyze in this raw form; the raw form is useful for implementation, you can write the algorithm this way, but the analysis of when it will converge requires rearranging these calculations into some nice matrix forms, and then we will look at the matrix properties and say when convergence will occur. That is what we will do next.