Hi, we are learning iterative methods for linear systems. In the last class we learnt the Jacobi method and a sufficient condition for the convergence of the Jacobi iteration sequence. In this class, we will study the Gauss-Seidel method. This method is named after two German mathematicians, Carl Friedrich Gauss and Philipp Ludwig von Seidel. Von Seidel was an assistant to Jacobi, and it is widely believed that Gauss already knew both of these methods, that is, what we now call the Jacobi method and the Gauss-Seidel method, but much of Gauss's work was not communicated. Therefore, it was not known to the mathematical community until Jacobi and Seidel discovered these methods independently. With this short historical note, let us now start learning the Gauss-Seidel method. The Gauss-Seidel method is a modified version of the Jacobi method. Let us quickly recall the Jacobi method and see how we modify it to get the Gauss-Seidel method. Let us again consider only a 3 by 3 system, because once we understand this case, generalizing to any n by n system is not difficult. So, if you recall, we are given a linear system A x = b. What we do is keep the diagonal terms on the left hand side, push all the non-diagonal terms to the right hand side, and then divide both sides by the diagonal element a_ii. Of course, we have to assume that each a_ii is non-zero. Then we get an equivalent system. You can immediately note that any solution to the original system is also a solution to this one, and vice versa; it means the two systems are equivalent. With this observation, we choose a vector arbitrarily, name it x^(0), plug it into the right hand side of the equivalent system, and get a new vector which we name x^(1). Again you plug x^(1) into the right hand side to get x^(2), and the iteration process goes on like this. In general, the Jacobi iteration is given like this.
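The Jacobi update recalled above can be sketched in a few lines of Python. The 3 by 3 system below is a made-up diagonally dominant example for illustration, not the one from the lectures.

```python
# A sketch of the Jacobi iteration recalled above:
# x_i^(k+1) = (b_i - sum over j != i of a_ij * x_j^(k)) / a_ii

def jacobi_step(A, b, x):
    n = len(b)
    # Every coordinate uses only values from the previous iterate x.
    return [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)]

# Hypothetical diagonally dominant 3x3 system (its exact solution is (1, 1, 1)).
A = [[4.0, 1.0, 1.0],
     [1.0, 5.0, 2.0],
     [1.0, 2.0, 6.0]]
b = [6.0, 8.0, 9.0]

x = [0.0, 0.0, 0.0]          # arbitrary initial guess x^(0)
for _ in range(50):
    x = jacobi_step(A, b, x)

# Residual should be tiny, since A is diagonally dominant.
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
print(residual)
```

Here the whole vector x is rebuilt from the old iterate, which is exactly the point the Gauss-Seidel modification below changes.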
One can also write it in matrix notation, which we did in the last class. Now that we have recalled the Jacobi method, let us come to the Gauss-Seidel method and see what modification is made to get this new method. The idea is rather nice. Look at the first equation. When you are computing x_1, you only know x_2 and x_3 from the previous iteration; therefore, you substitute the values from the previous iteration. Now, when you go to calculate x_2, you can see that a more recent value of x_1 is already known. We do not know whether it is going to be closer to the exact solution than the previous iterate or not. However, what we see is that we have the latest computed value of x_1 at the time of computing x_2. Therefore, the idea is: why not put x_1^(k+1) here instead of the old value? That is the idea. Of course, x_3 is not known to us because we have not yet computed it; therefore, we again borrow its value from the previous iteration, while the value of x_1 is taken from the present iteration. Similarly, when you go to compute x_3, at that time both x_1 and x_2 hold more recent values than the ones coming from the previous iteration. Therefore, why not put x_1^(k+1) and x_2^(k+1) instead of the values from the previous iteration? So, that is the idea of the Gauss-Seidel method. Therefore, the Gauss-Seidel method is defined as follows: for x_1, all the unknowns on the right hand side are replaced by values from the previous iteration; when you come to x_2, the x_1 term is taken from the present iteration and the x_3 term from the previous iteration; and for the coordinate x_3, you take both terms from the present iteration. In a sense, what we are doing is

x_i^(k+1) = (1/a_ii) [ b_i − (a_i1 x_1^(k+1) + ... + a_i,i−1 x_{i−1}^(k+1)) − (a_i,i+1 x_{i+1}^(k) + ... + a_in x_n^(k)) ].
Here, when you are computing the ith coordinate, all the terms below the diagonal term take their values from the present iteration, whereas all the terms on the other side of the diagonal term are known only from the previous iteration. The diagonal term does not appear on the right hand side because it is kept on the left, and then you multiply everything by 1/a_ii; there is also the term b_i, from which these sums are subtracted. So, that is how we get the method. What is important here is that we are splitting the off-diagonal terms into those on the lower side of the diagonal element and those on the upper side. The terms on the lower side of the diagonal element are substituted with coordinate values from the present iteration, whereas the terms on the upper side of the diagonal element are substituted with values from the previous iteration. So, that is the main idea of the Gauss-Seidel method. I hope you have understood it. Let us now ask the question: when does the sequence generated by the Gauss-Seidel method converge? Again, the idea is more or less similar to the Jacobi method. We can see that if the coefficient matrix A is diagonally dominant, then the iteration sequence generated by the Gauss-Seidel method will always converge, irrespective of the initial guess that we take. Let us try to prove this theorem now. Again we will follow the ideas that were introduced in the convergence theorem of the Jacobi method. What we do is write the error componentwise and then try to estimate it. For that, we will first write the Gauss-Seidel method componentwise. As I explained, the ith term is written with a split between the terms on the lower side of the diagonal element and the terms on the upper side.
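One Gauss-Seidel sweep can be sketched as below; updating x in place is exactly what makes the lower-side entries come from the present iteration. The 3 by 3 system is again a hypothetical diagonally dominant example, not the lecture's.

```python
# A sketch of one Gauss-Seidel sweep. Because x is updated in place,
# x[0..i-1] already hold the present iteration's values when x[i] is computed,
# while x[i+1..n-1] still hold the previous iteration's values.

def gauss_seidel_step(A, b, x):
    n = len(b)
    for i in range(n):
        lower = sum(A[i][j] * x[j] for j in range(i))        # present iteration
        upper = sum(A[i][j] * x[j] for j in range(i + 1, n)) # previous iteration
        x[i] = (b[i] - lower - upper) / A[i][i]
    return x

# Hypothetical diagonally dominant system (exact solution (1, 1, 1)).
A = [[4.0, 1.0, 1.0],
     [1.0, 5.0, 2.0],
     [1.0, 2.0, 6.0]]
b = [6.0, 8.0, 9.0]

x = [0.0, 0.0, 0.0]
for _ in range(30):
    x = gauss_seidel_step(A, b, x)

residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
print(residual)
```

The only code difference from the Jacobi sketch is that the new list is not built separately; overwriting x[i] immediately is the whole modification.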
On the lower side we are putting the superscript k+1, and the terms on the other side are taken from the previous iteration. So, this is precisely what we have defined as the Gauss-Seidel method. Now, you can see that the same relation is also satisfied by the exact solution. That is, the ith component of the exact solution can be written as

x_i = (1/a_ii) ( b_i − Σ_{j=1}^{i−1} a_ij x_j − Σ_{j=i+1}^{n} a_ij x_j ),

where the first sum runs over j = 1 to i−1 and the second over j = i+1 to n. Now, in order to get the ith component of the error in the (k+1)th iteration, what do you have to do? You have to subtract these two equations. Again, as we did for the Jacobi method, you can see that b_i gets cancelled, x_j − x_j^(k+1) gives you e_j^(k+1), and similarly x_j − x_j^(k) gives e_j^(k), each with its coefficient. So, that is what we get when we subtract the two equations: b_i goes off, and we are left with these terms, because we are computing x_i − x_i^(k+1). Now we have this expression for the error. Note that in the Jacobi method the corresponding terms are combined together with the superscript k, whereas here they are split into two parts: the left part with k+1 and the right part with k. That is the only difference in the Gauss-Seidel method. Now, what do we do? We give the coefficient sums names. Of course, we take moduli, because we are going to take the modulus on both sides in the next step. So we denote the modulus sum of the lower part by α_i and that of the upper part by β_i. Here you can see that when i equals 1, the first sum goes from j = 1 to i−1, that is, from 1 to 0, which does not look nice.
Therefore, we will simply define α_1 = 0 separately. Similarly, the same problem arises for β when i equals n: in that case j would run from n+1 to n, which also does not look nice. Therefore, we will separately define β_n = 0. So, this is just a notational convention that we keep in mind when defining these terms. Now, what do we do? We take the modulus on both sides of this equation. When we push the modulus inside the sums, we get a less-than-or-equal sign. Then, as we did in the previous theorem on the Jacobi method, we dominate each of these errors by the infinity norm. What do we get? Taking the modulus of the equation gives |e_i^(k+1)| bounded by α_i times the maximum of the |e_j^(k+1)| plus β_i times the maximum of the |e_j^(k)|: each |e_j^(k+1)| is replaced by its maximum, which is the infinity norm ||e^(k+1)||_∞, and similarly for the second term, which gives ||e^(k)||_∞. Therefore, the less-than-or-equal sign comes from two steps: one when you take the modulus inside the sums, and another when you replace |e_j^(k+1)| and |e_j^(k)| by the respective infinity norms. Remember, this inequality is satisfied for each i = 1 up to n; it means all the coordinates of the error vector e^(k+1) satisfy it. Therefore, it is also satisfied by the coordinate at which the infinity norm of e^(k+1) is achieved.
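The componentwise relations just described can be restated in standard notation; this is only a typeset summary of the steps above, not a new result:

```latex
e_i^{(k+1)} = -\frac{1}{a_{ii}}\Big(\sum_{j=1}^{i-1} a_{ij}\, e_j^{(k+1)}
             + \sum_{j=i+1}^{n} a_{ij}\, e_j^{(k)}\Big),
\qquad
\alpha_i = \sum_{j=1}^{i-1}\frac{|a_{ij}|}{|a_{ii}|},
\quad
\beta_i  = \sum_{j=i+1}^{n}\frac{|a_{ij}|}{|a_{ii}|},
```

with the conventions α_1 = 0 and β_n = 0, so that taking moduli gives

```latex
\bigl|e_i^{(k+1)}\bigr| \le \alpha_i\,\bigl\|e^{(k+1)}\bigr\|_\infty
                          + \beta_i\,\bigl\|e^{(k)}\bigr\|_\infty,
\qquad i = 1,\dots,n.
```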
Say the infinity norm of e^(k+1) is achieved at some lth coordinate, where l lies between 1 and n. Then, since the inequality is satisfied for all such i, it is also satisfied for that l at which the maximum norm is achieved. Therefore, for that coordinate you have |e_l^(k+1)|, which is nothing but the infinity norm of the vector e^(k+1), less than or equal to α_l ||e^(k+1)||_∞ + β_l ||e^(k)||_∞. So, let us keep this inequality in hand and make a small observation. The observation is the following. The μ given here is familiar to us from the convergence theorem of the Jacobi method; if you recall, we defined this μ when proving that theorem. Now, you can see that the same μ is nothing but the maximum of α_i + β_i. Why? Because in the definition of μ we remove the diagonal element: j goes from 1 to i−1, then i is skipped, then j goes from i+1 to n, and that is precisely how we defined the α_i and β_i. Therefore, taking the maximum over the entire sum is the same as taking the maximum of the sums α_i + β_i. So, at this level we have not done anything new; it is just similar to what we did with the Jacobi method, except that the terms to the left of the diagonal and those to the right of the diagonal are treated separately. Now, we have assumed that A is diagonally dominant, and by direct observation you can see that the diagonal dominance of A implies that μ is less than 1.
We explained in the last class why the diagonal dominance of A implies μ < 1; it comes directly from the definition. If you carefully look at this expression and at the definition of diagonal dominance, you will immediately see that μ must be less than 1. Now, μ < 1, and μ is nothing but the maximum of the sums of the alphas and betas. Therefore, the α_i taken alone are all less than 1; in particular, this α_l is also less than 1. That is the important observation we get from the hypothesis that A is diagonally dominant. Now, we bring the α_l term to the left hand side, which gives a factor of 1 − α_l there. We can then divide both sides by 1 − α_l, because 1 − α_l is positive: we just saw that α_l < 1. Dividing, we get ||e^(k+1)||_∞ ≤ (β_l / (1 − α_l)) ||e^(k)||_∞. This is much nicer for us, because everything involving e^(k+1) is kept on the left hand side and everything involving e^(k) is on the right hand side. Now, let us try to understand how this number β_l / (1 − α_l) behaves; that is our aim now. First we will remove its dependency on a particular coordinate, because in any estimate we want a fixed number; it should not depend on anything that may vary, and here l may be anything between 1 and n. So, let us dominate it by something specific: we take the maximum of this expression over all coordinates. Such an expression makes sense for each i, and β_i / (1 − α_i) is a non-negative number for each i = 1, 2, up to n, because each β_i is non-negative and each α_i is less than 1, so each 1 − α_i is positive.
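The rearrangement just described, written out in standard notation (only a summary of the step above, with l the coordinate at which the infinity norm is attained):

```latex
(1-\alpha_l)\,\bigl\|e^{(k+1)}\bigr\|_\infty \le \beta_l\,\bigl\|e^{(k)}\bigr\|_\infty
\;\Longrightarrow\;
\bigl\|e^{(k+1)}\bigr\|_\infty
\le \frac{\beta_l}{1-\alpha_l}\,\bigl\|e^{(k)}\bigr\|_\infty
\le \eta\,\bigl\|e^{(k)}\bigr\|_\infty,
\qquad
\eta := \max_{1\le i\le n}\frac{\beta_i}{1-\alpha_i}.
```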
Therefore, when we take the maximum of all these, that number, call it η, is also non-negative, and we dominate β_l / (1 − α_l) by η. Therefore, we have ||e^(k+1)||_∞ ≤ η ||e^(k)||_∞, which is what we finally got. This is much nicer than the earlier form because we now have a clean estimate. The question now is about this η, because we can apply the estimate recursively: ||e^(k+1)||_∞ ≤ η² ||e^(k−1)||_∞, and so on, the same idea we always use, to finally get ||e^(k+1)||_∞ ≤ η^(k+1) ||e^(0)||_∞. Why is this nice? Because we have already chosen x^(0); we may not know the exact solution x, but it is given to us, and therefore ||e^(0)||_∞ is a fixed quantity, which is what matters. Now we only have to see that η^(k+1) goes to 0 as k tends to infinity, and for that we have to see whether η is less than 1 or not. It is not immediately clear from the way η is defined and from the condition that A is a diagonally dominant matrix, but it is not very difficult to see. Let us take the expression α_i + β_i; if you recall, this is the expression which, after taking the maximum, led to μ, while β_i / (1 − α_i) is the expression which gave us η after taking the maximum. So, let us call these μ_i and η_i, and see how they compare. We can simply compute the difference μ_i − η_i: after simplification, it becomes α_i (1 − α_i − β_i) divided by 1 − α_i. Now, 1 − α_i − β_i is just 1 − μ_i, and since μ_i ≤ μ and there is a negative sign here, this expression is greater than or equal to the same expression with μ_i replaced by μ.
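The comparison between μ_i = α_i + β_i and η_i = β_i / (1 − α_i) described here is a one-line computation (a typeset restatement of the simplification above):

```latex
\mu_i - \eta_i
= \alpha_i + \beta_i - \frac{\beta_i}{1-\alpha_i}
= \frac{(\alpha_i+\beta_i)(1-\alpha_i) - \beta_i}{1-\alpha_i}
= \frac{\alpha_i\,(1-\alpha_i-\beta_i)}{1-\alpha_i}
= \frac{\alpha_i\,(1-\mu_i)}{1-\alpha_i}.
```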
Now, you can see that α_i is non-negative, and 1 − μ is positive because of the diagonal dominance property of A, which says that μ is strictly less than 1; 1 − α_i is also positive. Therefore, the whole expression is non-negative, which shows that μ_i − η_i ≥ 0, that is, η_i ≤ μ_i. So what we have proved is that η_i ≤ μ_i for all i = 1 to n, and since μ is the maximum of all the μ_i, we can as well write η_i ≤ μ for all i. Again, this is true for all i; therefore, it holds in particular for the i at which η attains its maximum, and we can write η ≤ μ. That is what I have written, and we already know that μ < 1; therefore η < 1. Therefore, the bound ||e^(k+1)||_∞ ≤ η^(k+1) ||e^(0)||_∞ holds with η < 1. Now you know how to conclude that the sequence x^(k) converges to x: since η < 1, the term η^(k+1) goes to 0 as k tends to infinity, ||e^(0)||_∞ is a fixed quantity, and the error norm is always greater than or equal to 0; therefore, you use the sandwich theorem to show that it goes to 0 as k tends to infinity. Therefore, the Gauss-Seidel iteration sequence also converges if the coefficient matrix A is diagonally dominant. Now, our next question is: can we write the Gauss-Seidel method in matrix notation? Recall that we wrote the Jacobi method in matrix form, that is, we wrote the Jacobi sequence as x^(k+1) = B x^(k) + c for some matrix B and some vector c.
Let us denote this matrix by B_j, called the Jacobi iteration matrix, and similarly we will call the vector c_j, in order to indicate that they are the matrix and the vector coming from the Jacobi method. If you recall, B_j is given by D^(−1) C and c_j is given by D^(−1) b, where D is the diagonal matrix whose elements are precisely the diagonal elements of the matrix A, and C is given by D − A. Now the question is: can we write the Gauss-Seidel method also in this form? Well, the answer is yes. How do we write it? If you recall, instead of keeping all the non-diagonal elements in one matrix, in the Gauss-Seidel method we further decompose the non-diagonal elements into the part below the diagonal and the part above the diagonal. Therefore, we now write the matrix A as L + D + U, where L contains all the elements below the diagonal, D the diagonal elements, and U all the elements above the diagonal, each as a separate matrix. Once you do that, the given system A x = b can be written as (L + D + U) x = b, which can be written as (L + D) x = b − U x. From there we can easily write x = (L + D)^(−1) b − (L + D)^(−1) U x, and thereby define the Gauss-Seidel iteration sequence in the form x^(k+1) = B x^(k) + c, the form we would like to have for the Gauss-Seidel method as well, which we already had for the Jacobi method. Now we know how to write this iteration matrix; it is called the Gauss-Seidel iteration matrix, and we give it the notation B_g, and similarly the vector is denoted by c_g, where B_g is given by −(D + L)^(−1) U and c_g is given by (D + L)^(−1) b.
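The splitting and the two iteration matrices described above can be sketched with NumPy. The system below is the same hypothetical diagonally dominant example used earlier, not the lecture's; forming explicit inverses is done here only for clarity, as the lecture itself notes that inverting matrices is not the efficient way to implement the method.

```python
import numpy as np

# A = L + D + U, with
#   B_j = D^(-1) (D - A),   c_j = D^(-1) b        (Jacobi)
#   B_g = -(D + L)^(-1) U,  c_g = (D + L)^(-1) b  (Gauss-Seidel)

def iteration_matrices(A, b):
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)              # strictly lower part
    U = np.triu(A, k=1)               # strictly upper part
    B_j = np.linalg.inv(D) @ (D - A)
    c_j = np.linalg.inv(D) @ b
    B_g = -np.linalg.inv(D + L) @ U
    c_g = np.linalg.inv(D + L) @ b
    return B_j, c_j, B_g, c_g

# Hypothetical system for illustration (exact solution (1, 1, 1)).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])
b = np.array([6.0, 8.0, 9.0])
B_j, c_j, B_g, c_g = iteration_matrices(A, b)

x = np.zeros(3)
for _ in range(40):
    x = B_g @ x + c_g                 # Gauss-Seidel in matrix form
print(np.linalg.norm(A @ x - b))
```

Iterating x = B_j @ x + c_j instead reproduces the Jacobi sequence in the same form.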
Well, let us take an example. You can clearly observe that in this example the coefficient matrix is not diagonally dominant; not only that, no interchange of rows will make this system diagonally dominant. Therefore, none of our previous theorems, whether for the Jacobi method or the Gauss-Seidel method, gives us any idea of whether the sequences are going to converge or not. Let us see what happens with this system when we apply the Jacobi and Gauss-Seidel methods. Remember, the coefficient matrix is

A = [ 1 0 1; −1 1 0; 1 2 −3 ].

First, to write the Jacobi iteration in matrix form, you take D, which is precisely the diagonal matrix with the diagonal elements of A, and the rest of the elements are arranged to get C. Therefore, your Jacobi iteration matrix is given like this. Similarly, for the Gauss-Seidel iteration, you collect the terms below the diagonal as L, keep all the diagonal terms in D, and arrange the elements above the diagonal to get U. Once you have that, you can write the iteration matrix for the Gauss-Seidel method, and it is given like this. Therefore, the Jacobi iteration sequence in matrix notation is given like this, and the Gauss-Seidel iteration sequence in matrix notation is given like this. You can also write it in component form and implement it as computer code; that may be simpler and more efficient than the matrix form, because the matrix form needs you to invert this matrix, which may not be computationally efficient. This is just for our understanding; you will probably see why I am giving this form in the next class. For now, we will do the iteration using the Jacobi method and the Gauss-Seidel method, taking our initial guess as (1, 1, 1). I am not choosing the zero vector, because the zero vector is the solution of our system.
Therefore, that may not be a good choice for us as an initial guess. So, I am taking this vector; there is a reason for this too, but I will tell you that later. Let us do the computation using the Jacobi method. You can see that the Jacobi method started with the iterate x^(1), and the l2 error is given in this column. I am not showing you all the iterations, only some of them. After 10 iterations the Jacobi method gave this vector; the error has improved, but not very fast. You see, even after 10 iterations we have not captured even one significant digit of the exact solution, and that is really too slow, because generally 10 iterations is already a lot. Not only that, I went up to 50 iterations, and it had hardly captured one significant digit, perhaps not even that: you can see that only 2 digits after the decimal point are captured as 0. That is really very slow. I then went further, up to 200 iterations, and even there hardly 4 zero digits are captured. So, this shows that the Jacobi method seems to be converging, but very, very slowly. 200 iterations is generally not computationally affordable; you can count the number of operations involved in each iteration, and if you need, say, 500, 600, or 1000 iterations, it may not be meaningful to go for iterative methods rather than the Gaussian elimination method. We should get convergence within 10 or 20 iterations or so, depending on the dimension of the system; such slow convergence is really not affordable computationally.
Well, let us now see what happens with Gauss-Seidel, because we may expect something better there. Why? Because we have seen that η ≤ μ. What does that mean? In the error bound for the Jacobi method we had μ^(k+1) ||e^(0)||; of course, we computed it with the infinity norm, but that does not matter. Similarly, for the Gauss-Seidel method we had ||e^(k+1)||_∞ ≤ η^(k+1) ||e^(0)||. Therefore, at least for diagonally dominant matrices, this coefficient, which we generally call the rate of convergence, seems to be better for the Gauss-Seidel method than for the Jacobi method. So, at least in the diagonally dominant case, the Gauss-Seidel method is expected to perform better than the Jacobi method because of this factor. But remember, we are now working with a non-diagonally-dominant system; will the Gauss-Seidel method still perform better than the Jacobi method? Let us see: we will compute the iteration sequence of the Gauss-Seidel method and see what happens. Before that, I will show you the graph of the error, that is, the 2-norm of the error: the x axis is the index of the iteration, that is, k, and the y axis is the l2 norm of the error. You can see that as you go on increasing the iteration index, the error gradually decreases. You also see some oscillations here; this is quite common in iterative methods. It means one iterate was better than the next: the error just increased, fell again, increased again, and so on, but as you go on, the error is tending to 0.
So, this is the zero line; the error is tending to 0, but quite slowly, in the Jacobi case. Let us see what happens in the Gauss-Seidel case. At the first iteration we get x^(1) = (−1, −1, −1), and the l2 error is given like this. Remember, our initial guess is (1, 1, 1). If you go back to the iteration matrix of the Gauss-Seidel method, you can see that B_g is given like this; therefore, when you multiply it with (1, 1, 1), you clearly get (−1, −1, −1) as x^(1). That is why you see this vector as the first term of the iteration sequence in the Gauss-Seidel method. Now, you can also see what happens at the next iteration just by looking at the iteration matrix: x^(2) is simply x^(0). Once you have this, again x^(3) will be x^(1), and so on. In this way we get an oscillating sequence in the Gauss-Seidel method, and therefore the Gauss-Seidel method never converges here. In fact, I have gone up to 200 iterations, and you can see that the sequence simply oscillates and keeps the error at a constant level. Therefore, in this example we see that the Jacobi method converges, but rather very slowly, whereas the Gauss-Seidel method does not converge; it simply oscillates. So, what we understood in today's class is that when the coefficient matrix is diagonally dominant, the Gauss-Seidel method will perform better than the Jacobi method; order-wise both of them are of linear order, but Gauss-Seidel's accuracy will be a little better than Jacobi's if the coefficient matrix is diagonally dominant. If the coefficient matrix is not diagonally dominant, then we cannot say anything: the Jacobi method may converge while Gauss-Seidel does not, and the other way around can also happen. There are examples where Gauss-Seidel converges but Jacobi never converges.
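The oscillation just described can be checked directly. This small sketch assumes the system is A x = 0, since the lecture notes that the zero vector is the exact solution.

```python
import numpy as np

# The lecture's coefficient matrix, with right-hand side b = 0
# (the zero vector is the exact solution, as noted above).
A = np.array([[ 1.0, 0.0,  1.0],
              [-1.0, 1.0,  0.0],
              [ 1.0, 2.0, -3.0]])
D_plus_L = np.tril(A)               # D + L
U = np.triu(A, k=1)
B_g = -np.linalg.inv(D_plus_L) @ U  # Gauss-Seidel iteration matrix

x0 = np.array([1.0, 1.0, 1.0])      # initial guess from the lecture
x1 = B_g @ x0                       # -> (-1, -1, -1)
x2 = B_g @ x1                       # -> ( 1,  1,  1) = x0, so the sequence oscillates
print(x1, x2)
```

Since x2 returns to x0, the iterates repeat with period 2 and the error stays at a constant level, exactly as observed in the lecture's 200-iteration run.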
Here we saw an instance where the sequence is not converging but just oscillating; the sequence can also diverge, meaning it can go off to infinity. Anything can happen if the coefficient matrix is not a diagonally dominant matrix. Therefore, our understanding of these methods is still not complete; something interesting is happening in these methods when the system is not diagonally dominant. So there is still some scope for us to analyze and understand what is going on with these methods when the coefficient matrix is not diagonally dominant. That is precisely what we are going to do in the next class. Thank you for your attention.