In our last lecture we saw that if the coefficient matrix A in the system of linear equations Ax = b is ill conditioned, that is, if the condition number ||A|| ||A inverse|| is a big number, then the solution can be sensitive to perturbations in the right hand side and in the coefficient matrix. Now, if the matrix A given to us is ill conditioned, then we cannot do much about it; but what is in our hands is not to turn a well conditioned matrix into an ill conditioned one by our own operations. Let me explain. In the Gauss elimination method we multiply a certain row by a non-zero constant and subtract it from another row. If you divide by a small pivot, then the multiplier becomes big, and that can turn an originally well conditioned matrix into an ill conditioned one. So today we are going to study this phenomenon through an example, and then we will consider backward error analysis. We will not go into all the details, but I want to give you some idea about backward error analysis. First, let me recall the floating point representation of a real number, and what happens when you subtract two numbers which are about the same. Let x be a real number. Its floating point representation has the standard form fl(x) = plus or minus .d1 d2 ... dn times beta raised to e. We will assume that d1 is not equal to 0, and the digits di lie between 0 and beta minus 1. This beta is known as the base or radix: beta = 2 for binary representation, beta = 10 for decimal representation, and beta = 16 for hexadecimal. The number n of digits depends on your computer; in any case n is finite. The string d1 d2 ... dn is known as the significand or mantissa, and e is the exponent, which lies between small m and capital M.
So, again, the values of small m and capital M depend on the computer you use and on whether you are using single precision or double precision. This is the floating point representation. If beta = 2, then each digit di is 0 or 1; so if we adopt the convention that d1 is not equal to 0, then d1 is always equal to 1 and we do not need to store it. Now, the definition of significant digits: the first non-zero digit and all succeeding digits. If you look at the number 1.7320, it has 5 significant digits, because the succeeding digits may be zero or non-zero; whereas 0.0491 has only 3 significant digits, since the leading zeros are not counted. Now let us see what happens when you subtract two numbers which are about the same: there is going to be a loss of accuracy. Take f(x) = (1 - cos x)/x^2 with x = 1.2 times 10^(-5), and round cos x to 10 significant digits: cos x = 0.9999999999. Then 1 - cos x = 0.0000000001 = 10^(-10), which has only 1 significant digit. You started with 10 significant digits, and in 1 - cos x you have only 1 significant digit: there is a lot of cancellation. Now x^2 = 1.44 times 10^(-10), so (1 - cos x)/x^2 = 10^(-10) / (1.44 times 10^(-10)) = 0.6944. This number is completely wrong, because one can show that 0 <= f(x) < 1/2.
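For beta = 2 this convention of not storing the leading digit is exactly what real hardware exploits. A small Python check (a sketch; `math.frexp` returns the normalized binary significand and exponent, and the example value 0.15625 is my own choice because it is exactly representable):

```python
import math

x = 0.15625                 # = 0.00101 in binary, exactly representable
m, e = math.frexp(x)        # x = m * 2**e with 0.5 <= m < 1

# The significand m starts with the binary digit 1 (since m >= 0.5),
# so that leading digit never needs to be stored.
print(m, e)                 # 0.625 -2
```

Here m = 0.625 is 0.101 in binary: the leading digit d1 is 1, as the convention guarantees.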
So, this is catastrophic cancellation: we had the number 1 and another number very near to 1. When you subtract two numbers which are about the same, or many of whose digits coincide, there is a loss of accuracy, a loss of significant digits. Now, what is the way out? For (1 - cos x)/x^2 you can use trigonometric identities and write the same formula in a different manner, for example using 1 - cos x = 2 sin^2(x/2). What I mean to say is: we have to keep in mind this phenomenon of catastrophic cancellation, and as far as possible we should avoid subtracting two numbers which are nearly equal. Look at the subtraction x = a - b, where a and b are the exact values. Let a hat = a(1 + delta a) and b hat = b(1 + delta b) be the perturbed values. The computed result is x hat = a hat - b hat = (a - b) + (a delta a - b delta b), and hence the relative error is (x - x hat)/x = -(a delta a - b delta b)/(a - b), so that |x - x hat| / |x| <= max(|delta a|, |delta b|) times (|a| + |b|) / |a - b|. In this inequality the factor max(|delta a|, |delta b|) appearing in the bound for the relative error is something normal; what you have to focus on is the term (|a| + |b|) / |a - b|. If |a - b| is small, then 1/|a - b| is big, and the relative error can be big. Once again this illustrates the fact that subtracting two numbers which are almost equal is going to cause trouble. This is the background. Now we are going to look at a system of linear equations in which the pivot is small.
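The cancellation above, and the cure by rewriting, are easy to reproduce. In the sketch below (Python; the 10-significant-digit rounding of cos x is simulated with string formatting, and the variable names are mine, not the lecture's), the naive formula gives the wrong value near 0.6944 while the identity 1 - cos x = 2 sin^2(x/2) gives the correct value 1/2:

```python
import math

x = 1.2e-5

# cos(x) rounded to 10 significant digits, as in the lecture
c = float(f"{math.cos(x):.10g}")            # 0.9999999999

naive = (1.0 - c) / x**2                    # catastrophic cancellation
stable = 2.0 * (math.sin(x / 2) / x) ** 2   # identity: 1 - cos x = 2 sin^2(x/2)

print(naive)    # about 0.694, completely wrong since 0 <= f(x) < 1/2
print(stable)   # about 0.5, correct to machine precision
```

The stable version subtracts nothing, so no significant digits are lost.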
So, if the pivot is small, then the multipliers are going to be big, and Gauss elimination without partial pivoting is going to be affected: the results we obtain will not be reliable, the error is going to be big. For the illustration we are going to work with a fixed number of significant digits: the computations themselves are exact, but at any time we are allowed to keep only a certain number of digits. This is what happens when you use a computer: there is always some fixed number of significant digits, and one does either rounding or chopping, but in any case you can work only with a finite number of digits. Here is the example. We have this system, and we will be working with four significant digits; that means at any time you are allowed to retain only four digits. Now, one can verify that the coefficient matrix A is well conditioned: one can calculate A inverse, look at either the infinity norm or the one norm, and check that the matrix is well conditioned. The matrix is so chosen that the exact solution is (1, 1, 1). If you look at the entry 0.002, it is a small pivot, but it is still not equal to 0, so I am perfectly justified in choosing it as my pivot entry. In the Gauss elimination method we are going to introduce zeros in the first column below the diagonal, by subtracting a21/a11 times the first row from the second row and a31/a11 times the first row from the third row. Here a11 is a small number, so a21/a11 is a big number: we will be subtracting large multiples of the first row from the second row and from the third row.
So, when you do this, the submatrix on which you are going to work next becomes ill conditioned. Let us see. Our multiplier m21 = a21/a11 = 1.196 / 0.002 = 598.0; note that this 0 is significant, since at any time we retain 4 digits. Similarly m31 = a31/a11 = 1.475 / 0.002 = 737.5. The operations we are going to do are R2 - m21 R1 and R3 - m31 R1; these introduce zeros in the first column, and this is the first step of the Gauss elimination method. In the (2,2) position we compute 3.165 - m21 times 1.231, and the way we do the subtraction and multiplication is: at any stage retain only 4 digits. So 598.0 times 1.231 = 736.138, which is retained as 736.1, and then 3.165 - 736.1 = -732.935, which is retained as -732.9. That means the digits 65 of 3.165, which were significant, are lost; this loss of information is known as swamping. This was for the element a22; the same thing happens for the other elements, and at the end of the first step the first row is unaffected, we have zeros in the first column (which is where I am writing the multipliers), and we get -732.9, -1475, -903.6 and -1820. In the next step we are going to work on this 2 by 2 submatrix, subtracting an appropriate multiple of the second row from the third row. But look at this 2 by 2 matrix: its two rows are almost linearly dependent, because both are almost multiples of (1.231, 2.471). So the condition number of this 2 by 2 matrix is big.
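This 4-digit machine can be mimicked with a small helper that rounds every intermediate result to 4 significant digits. The sketch below (Python; `fl` is my own name for the rounding helper, not the lecture's notation) reproduces the multipliers and the swamped (2,2) entry:

```python
import math

def fl(x, digits=4):
    """Round x to the given number of significant digits (a toy 4-digit machine)."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - e)

m21 = fl(1.196 / 0.002)        # 598.0 (the trailing 0 is significant)
m31 = fl(1.475 / 0.002)        # 737.5

prod = fl(m21 * 1.231)         # 736.138 -> 736.1, only 4 digits kept
a22_new = fl(3.165 - prod)     # -732.935 -> -732.9

print(m21, m31, a22_new)       # the digits 65 of 3.165 have been swamped
```

Every intermediate quantity passes through `fl`, exactly as the hand computation in the lecture keeps only 4 digits at each stage.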
So, the condition number of A tilde in the infinity norm is going to be big: it is about 8400. If you remember, we had seen that the ill conditioning of a matrix has nothing to do with a small determinant; it has to do with near linear dependence of the rows or columns. We started with a matrix A which was well conditioned; because of our small pivot the multipliers became big, and we subtracted large multiples of the first row from the second row and the third row. In the process, the rows of the modified 2 by 2 matrix became almost multiples of the same vector, hence almost linearly dependent, and that is what makes the condition number of that matrix big. Now, if the condition number is big, then the matrix is sensitive to perturbations, and the computed solution will be much further away from the exact solution. We have seen the first step of the Gauss elimination method, so let us continue. In the second step our multiplier is m32 = 903.6 / 732.9 = 1.233; the multiplier is not big, and in the (3,2) position we are anyway going to get 0. In the (3,3) position we need the subtraction -1820 - 1.233 times (-1475): the multiplication gives 1818.675, retained as 1819, so we get -1820 + 1819 = -1. Again there is severe, catastrophic cancellation: you are losing a lot of significant digits. So we have reduced the system Ax = b to an upper triangular system Ux = y, with swamping in the first step and severe cancellation in the second, and then we do the back substitution. You can verify that these are the values you get; the only thing is, even when you do the computations by hand, whenever a multiplication gives some 6 or 7 digits you round it off and keep only 4 digits at every stage. That is how the computations are to be done.
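The cancellation in this second step can be reproduced with the same kind of 4-significant-digit helper (Python; `fl` is again my own name for the rounding helper):

```python
import math

def fl(x, digits=4):
    """Round x to the given number of significant digits (toy 4-digit machine)."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - e)

m32 = fl(903.6 / 732.9)        # 1.233
t = fl(m32 * -1475)            # -1818.675 -> -1819
a33_new = fl(-1820 - t)        # -1820 + 1819 = -1.0: nearly all digits cancel

print(m32, t, a33_new)
```

Two numbers agreeing in their leading digits, -1820 and -1819, are subtracted, so the final pivot -1.0 carries almost no correct information.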
If for this system you assume that you have infinitely many digits at your disposal, the way we do for hand computations, then there is no problem. The problem is that you can retain only 4 digits at a time, and that is what happens when you do the computations using a computer. So we have Ax = b; we reduce it to upper triangular form and do the back substitution. The exact solution was (1, 1, 1), but the solution we obtain is completely different: the first component is 4, the second is -1.012 and the third is 2.000. What happened was: a small pivot implies large multipliers, so very large multiples of the pivotal row get subtracted from the other rows; there is loss of information, that is swamping; the resulting submatrix is ill conditioned; and the ill conditioning leads to cancellation. These were the problems, and the main problem in all of this was the small pivot, 0.002, which made the multiplier big. So why not multiply the first equation by, say, 1000? If I do that, the pivot becomes 2, and then there should not be any problem. When I consider a system of linear equations and multiply an equation by a non-zero constant, I do not change the system: the solution of the new system is exactly the same as that of the original system. So let us see whether this will work. The pivot 0.002 was small, so let me multiply the first row by 1000 and see what happens. This was our original system; I multiply the first equation throughout by 1000, so only the first equation changes. After the multiplication this is my new system: the second and third rows are the same, and the pivot 2.000 is not small. Now look at m21: it is 1.196 divided by 2.
So m21 = 0.5980, and m31 = a31/a11 = 1.475 / 2.000 = 0.7375. After the scaling, the first row is (2.000, 1231, 2471). When I compute the (2,2) entry, 3.165 - 0.5980 times 1231, I get exactly the same number as before: earlier we got -732.9, and with the new calculation we again get -732.9. The same happens for the other elements. So in the new system, after multiplying by 1000, the multipliers are small, but the 2 by 2 submatrix here is exactly the same as before. In the earlier case the multipliers were big; now the first row is big, but the 2 by 2 matrix A tilde is the same, so the later computations will be the same, and you are going to get exactly the same wrong solution as before. What did we do? We multiplied the first row by 1000 and made the pivot element 2; the choice was somewhat arbitrary. But recall our result: if you multiply the whole matrix A by a non-zero number alpha, the condition number remains exactly the same, because the inverse of alpha A is 1/alpha times A inverse. However, if I multiply a single row by a non-zero number, then the condition number does change; that was the idea in row scaling and column scaling. We have seen that if you have two columns such that the norm of one is much bigger than the norm of the other, then your condition number is going to be big. So one tries to do the scaling so that the rows and the columns are, as far as possible, of the same order.
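Both facts, invariance under scaling the whole matrix and change under scaling a single row, are easy to check numerically. A sketch (Python with NumPy; the random test matrix and the factor 1000 are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

def cond_inf(M):
    """Condition number in the infinity norm: ||M|| * ||M^{-1}||."""
    return np.linalg.norm(M, np.inf) * np.linalg.norm(np.linalg.inv(M), np.inf)

base = cond_inf(A)
whole = cond_inf(1000.0 * A)    # scaling the whole matrix: condition number unchanged

B = A.copy()
B[0] *= 1000.0                  # scaling a single row: condition number changes
row_scaled = cond_inf(B)

print(base, whole, row_scaled)
```

For a generic matrix, blowing up one row drives the rows badly out of scale and inflates the condition number, which is exactly what happened to the scaled system in the lecture.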
Now, here our matrix was well conditioned; in order to get rid of the small pivot we multiplied the first row by a big number, and in the process we made the well conditioned matrix ill conditioned. That is the problem: making the pivot arbitrarily large is not going to work. The remedy in this case is: do not do Gauss elimination without pivoting, but use it with partial pivoting. That means: look at the first column, find the entry with maximum modulus, interchange the corresponding rows, and then the results you get will be acceptable. As I said, in the earlier system we had a small pivot and large multipliers; in the new system the first row is large, with entries 1231 and 2471, so the rows and columns are out of scale. We had seen that the condition number is bigger than or equal to the norm of column cj divided by the norm of column ci, where cj and ci are the j-th and i-th columns, and what one says about columns is true for rows also. So the scaled system is ill conditioned, and that was the problem. Now let us look at the error analysis. We start with a system Ax = b; when we use a computer, instead of Ax = b we solve a nearby system, because whatever computations we do, we do in finite precision. There are going to be two sorts of error: one because of catastrophic cancellation, and another because at each stage there is round off error, which keeps accumulating. In practice the accumulation of round off errors does not cause much trouble. Now, if I want to look at the Gauss elimination method with pivoting, then in the forward error analysis one tracks each stage: we perform various operations, we subtract a multiple of one row from another, we do forward and back substitution.
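Here is a minimal sketch of Gauss elimination with partial pivoting (Python with NumPy). The lecture does not reproduce the full system, so the matrix below keeps the lecture's first row and first column and the entry 3.165, but the remaining entries and the right hand side are illustrative values of my own, with b built so that the exact solution is (1, 1, 1):

```python
import numpy as np

def gauss_partial_pivot(A, b):
    """Solve Ax = b by Gauss elimination with partial pivoting."""
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    for k in range(n - 1):
        # bring the entry of maximum modulus in column k to the pivot position
        p = k + int(np.argmax(np.abs(A[k:, k])))
        A[[k, p]] = A[[p, k]]
        b[[k, p]] = b[[p, k]]
        for i in range(k + 1, n):
            m = A[i, k] / A[k, k]     # multipliers now have modulus <= 1
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):    # back substitution
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

# First row/column from the lecture; the remaining entries are illustrative.
A = [[0.002, 1.231, 2.471],
     [1.196, 3.165, 3.000],
     [1.475, 4.300, 2.000]]
b = np.array(A) @ np.ones(3)          # exact solution is (1, 1, 1)
print(gauss_partial_pivot(A, b))
```

The row interchange moves 0.002 out of the pivot position, so no large multiples of any row are ever subtracted.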
So, if in each operation we can guarantee that there is no catastrophic cancellation, then we can say that the error is going to be acceptable. But keeping track of all the errors at every stage is something very difficult. In the forward or direct approach one finds a bound for each intermediate result, and for each addition or subtraction one has to prove that there is no catastrophic cancellation; this is practically not possible. One would also have to rule out gradual accumulation of small errors, which in practice does not happen, and the claim "if no cancellation occurs in the algorithm then the result will be accurate enough" is itself difficult to verify. So instead one does backward error analysis. You have the exact equation Ax = b, and x hat is the computed solution. One tries to find a matrix delta A such that (A + delta A) x hat = b, and then uses perturbation theory to find a bound for the relative error: norm(x - x hat)/norm(x) is less than or equal to kappa(A) (norm delta A / norm A) divided by (1 - kappa(A) (norm delta A / norm A)), where kappa(A) is the condition number of A. So: we have the system Ax = b; we do either Cholesky decomposition or Gauss elimination, with or without partial pivoting, and we get a computed solution x hat. If we can show that this computed solution is the exact solution of a nearby system (A + delta A) x hat = b, that is, if I can find such a delta A, then I can use the perturbation theory we have developed to say something about norm(x - x hat)/norm(x). The condition number of A appears in that bound; that is inherent to the system, so we cannot do much about it. The other term appearing in the bound is norm delta A / norm A.
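Written out in standard notation, the perturbation bound just quoted reads (kappa denotes the condition number):

```latex
\frac{\lVert x - \hat{x} \rVert}{\lVert x \rVert}
\;\le\;
\frac{\kappa(A)\,\dfrac{\lVert \delta A \rVert}{\lVert A \rVert}}
     {1 - \kappa(A)\,\dfrac{\lVert \delta A \rVert}{\lVert A \rVert}},
\qquad
\kappa(A) = \lVert A \rVert \, \lVert A^{-1} \rVert .
```

The bound is useful only when kappa(A) times norm delta A / norm A is smaller than 1, so everything hinges on showing that norm delta A is small.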
So, if our computed solution x hat is the exact solution of a nearby system with norm delta A small, then our relative error is going to be small. This is backward error analysis. We will not be doing it in detail; I just want to state some results. Consider Ax = b, solved through the LU decomposition of the matrix, LUx = b. What one can show is norm delta A <= 3 n epsilon norm L norm U: the bound on norm delta A depends on the norms of L and U. If you are doing Gauss elimination with partial pivoting, then the entries of L have modulus less than or equal to 1, so that factor is acceptable; we then have to see how norm U grows. In the case of Gauss elimination with partial pivoting one can show norm delta A (infinity norm) <= 3 g n cubed epsilon norm A (infinity norm). Here epsilon is the precision of your computer, n is the size of the matrix, and g is known as the growth factor. One can show that g <= 2 to the power n minus 1, and one can construct an example where g equals 2 to the power n minus 1; since 2 to the power n minus 1 grows much faster than n cubed, that is something one has to worry about. If you consider Gauss elimination with complete pivoting, then the growth factor grows much more slowly, and in the case of Cholesky decomposition there is no growth factor. So let me summarize. With Cholesky decomposition there is no growth factor, and that is why the method is stable. Of course, you cannot do Cholesky decomposition for all systems: your matrix should be positive definite. But when it is possible, it is a stable method. Next, consider Gauss elimination with complete pivoting.
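An example with g = 2 to the power n minus 1 is the well-known construction with 1 on the diagonal, -1 below it, and 1 in the last column: partial pivoting performs no row interchanges on it (every candidate pivot already has maximum modulus), and the last column doubles at every elimination step. A sketch (Python with NumPy; function names are my own):

```python
import numpy as np

def worst_case_matrix(n):
    """1 on the diagonal, -1 below it, 1 in the last column."""
    A = np.eye(n) - np.tril(np.ones((n, n)), -1)
    A[:, -1] = 1.0
    return A

def growth_factor(A):
    """max |u_ij| / max |a_ij| after elimination (no row swaps are needed here)."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    for k in range(n - 1):
        for i in range(k + 1, n):
            U[i, k:] -= (U[i, k] / U[k, k]) * U[k, k:]
    return float(np.max(np.abs(U))) / float(np.max(np.abs(A)))

n = 10
print(growth_factor(worst_case_matrix(n)))   # 2**(n-1) = 512
```

Every elimination step adds the pivotal row to the rows below it, so the entries of the last column go 1, 2, 4, ..., 2**(n-1), while the original matrix has entries of modulus 1.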
So, in complete pivoting what one does is: look at all the elements of your matrix (there are n squared of them), find the one with maximum modulus, and interchange rows and columns so that this element of maximum modulus occupies the (1, 1) position, the first row and first column; then proceed similarly on the remaining submatrix. This is Gauss elimination with complete pivoting, and in this case the growth factor does not increase too fast, so the method is stable. However, complete pivoting is expensive, because you need to do a lot of comparisons. Now consider Gauss elimination with partial pivoting. In the case of partial pivoting, as I said, one can construct examples where the growth factor is 2 to the power n minus 1, which can be a very big number. But in practice, when people have done extensive computations, it was realized that this does not happen. So the Gauss elimination method with partial pivoting really works well. Gauss elimination without pivoting you should not do, because we have seen that your starting matrix can be well conditioned and then become ill conditioned. So, among the methods we have studied: if your matrix is positive definite, use Cholesky decomposition; if it is not, then one compromises and uses Gauss elimination with partial pivoting. This completes our study of the solution of systems of linear equations by direct methods. There are also methods known as indirect methods or iterative methods. We will be considering those, but before we do, we are going to look at some problems, and we will first consider the solution of non-linear equations, that is, finding a zero of a function: f(x) = 0.
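Complete pivoting can be sketched as follows (Python with NumPy; my own minimal implementation, tracking only the reduced matrix U since the growth factor depends only on it). On the classical worst case for partial pivoting (1 on the diagonal, -1 below it, 1 in the last column) the growth now stays small:

```python
import numpy as np

def worst_case(n):
    """Partial pivoting's worst case: 1 on the diagonal, -1 below, 1 in last column."""
    A = np.eye(n) - np.tril(np.ones((n, n)), -1)
    A[:, -1] = 1.0
    return A

def max_abs_after_complete_pivot(A):
    """Eliminate with complete pivoting; return max |u_ij|."""
    U = np.array(A, dtype=float)
    n = U.shape[0]
    for k in range(n - 1):
        # entry of maximum modulus in the trailing submatrix
        i, j = divmod(int(np.argmax(np.abs(U[k:, k:]))), n - k)
        U[[k, k + i]] = U[[k + i, k]]         # row interchange
        U[:, [k, k + j]] = U[:, [k + j, k]]   # column interchange
        for r in range(k + 1, n):
            U[r, k:] -= (U[r, k] / U[k, k]) * U[k, k:]
    return float(np.max(np.abs(U)))

n = 10
g = max_abs_after_complete_pivot(worst_case(n))   # max |a_ij| is 1 here
print(g)   # small, nowhere near 2**(n-1) = 512
```

As soon as an entry of modulus 2 appears in the trailing submatrix, complete pivoting moves it into the pivot position, which is exactly what prevents the repeated doubling. The price is the argmax search over the whole trailing submatrix at every step.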
So, we will first consider that topic, and at the end we will consider two iterative methods known as the Jacobi method and the Gauss-Seidel method. Now let us look at some problems. We have considered vector norms and matrix norms. For vector norms we defined the one norm, the infinity norm and the two norm. One can define analogues of these norms for matrices; the analogue of the two norm, the Euclidean norm for vectors, is the Frobenius norm. What we did instead was to define what are known as induced matrix norms: start with a vector norm and define norm A to be the maximum of norm(Ax)/norm(x) over x not equal to the zero vector. The induced matrix norm has some desirable properties. We have the fundamental inequality norm(Ax) <= norm(A) norm(x), where A is a matrix and x is a vector. We also have the consistency condition: you can multiply two matrices provided they are of appropriate size, so if A and B are two square matrices of the same size, then norm(AB) <= norm(A) norm(B). Now I want to consider the analogue, for matrices, of the maximum norm for vectors, and show that it does not satisfy the consistency condition. Here is the example, or problem: A is an n by n matrix, and we define norm A max to be the maximum of |aij| over 1 <= i, j <= n. Does the property norm(AB) max <= norm A max times norm B max hold? Note that I have not used the notation norm A infinity, because we have reserved that notation for the induced matrix norm.
So, what was our norm A infinity? It is the maximum of norm(Ax) infinity divided by norm(x) infinity over x not equal to the zero vector, where norm(x) infinity is the maximum of |xj| over 1 <= j <= n, and we proved that norm A infinity equals the maximum over 1 <= i <= n of the sum over j from 1 to n of |aij|: the row sum norm. Our norm A max, on the other hand, is the maximum of |aij| over 1 <= i <= n and 1 <= j <= n. Let us first verify that it satisfies the three properties of a norm, and then that it fails the fourth property, consistency. First, norm A max >= 0, and norm A max = 0 if and only if every aij = 0, that is, A is the zero matrix. Second, norm(alpha A) max is the maximum of |alpha aij|, because each entry aij gets multiplied by alpha; this equals |alpha| times the maximum of |aij| over i and j, which is |alpha| times norm A max. So the second property is satisfied. The third property is the triangle inequality: norm(A + B) max is the maximum of |aij + bij| over i and j, which is less than or equal to the maximum of |aij| plus the maximum of |bij|, that is, norm A max plus norm B max. Now the question is whether norm(AB) max <= norm A max times norm B max. This is not true, and when one constructs counterexamples one should always try a matrix of small size, in order to reduce or simplify the work.
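The row sum formula can also be checked numerically: the ratio norm(Ax) infinity over norm(x) infinity never exceeds the maximum row sum, and the sign vector of the worst row attains it. A sketch (Python with NumPy; the random test matrix is my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

row_sums = np.sum(np.abs(A), axis=1)
row_sum_norm = float(np.max(row_sums))   # the proved formula for ||A||_inf

# the sign vector of the worst row attains the maximum of ||Ax||_inf / ||x||_inf
i = int(np.argmax(row_sums))
x = np.sign(A[i])                        # ||x||_inf = 1
attained = float(np.max(np.abs(A @ x)))  # (Ax)_i = sum_j |a_ij| = row_sum_norm

print(row_sum_norm, attained)
```

Choosing x with entries plus or minus 1 matching the signs of the worst row turns that row's dot product into a sum of absolute values, which is why the supremum in the definition is actually attained.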
So, try 2 by 2 matrices and see whether it works. What we want are two 2 by 2 matrices A and B such that norm(AB) max is strictly bigger than norm A max times norm B max. Our norm is the maximum of the moduli of all the entries. Consider the 2 by 2 matrix A with rows (1, 1) and (0, 1); then norm A max = 1. Take B to be the same matrix. When I multiply these two matrices, one of the entries of the product is 2, and that makes norm(AB) max strictly bigger than norm A max times norm B max. Indeed, with A = B as above, norm A max = norm B max = 1, and in the product AB the (1, 1) entry is 1 (first row into first column), the (1, 2) entry is 2 (first row into second column), and the second row is (0, 1). So norm(AB) max = 2, the biggest entry, and this is strictly bigger than norm A max times norm B max, which is 1. So the consistency condition does not hold for the max norm, the analogue of the infinity norm for vectors. In the next lecture we are going to start a new topic, the solution of non-linear equations. Thank you.
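The counterexample takes only a few lines to verify (Python with NumPy; `max_norm` is my own helper name):

```python
import numpy as np

def max_norm(A):
    """Entrywise max norm: max |a_ij|."""
    return float(np.max(np.abs(A)))

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = A.copy()

print(A @ B)                        # the (1, 2) entry of the product is 2
print(max_norm(A @ B))              # 2.0
print(max_norm(A) * max_norm(B))    # 1.0, so consistency fails
```

The off-diagonal 1's combine in the product to produce an entry larger than any entry of A or B, which is exactly what the consistency condition would forbid.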