So, in this lecture, we will be discussing numerical aspects of linear systems. In the previous lecture, I told you that in linear systems we encounter certain good situations, which we should try to exploit for efficiency, and sometimes bad situations, where there can be numerical errors, overflows and underflows, in which case we need to know how to handle these special cases. So, in this lecture, we will consider measures which help us in handling bad systems. First, we will define a few terms like norms, condition number, ill-conditioning and sensitivity of matrices; then we will study rectangular systems; then the most general solution of a linear system, through the singularity-robust solution; and finally, we will have a quick glimpse of iterative methods for solving linear systems.

First, the underlying definitions. Norm means size. When we talk of the norm of a vector, we basically mean a measure of the size of the vector. All of you are familiar with the ordinary Euclidean norm, or 2-norm, written ||x|| or, in the general notation, ||x||_2 with a subscript 2, which is simply the length of the vector x in the ordinary geometric sense:

||x||_2 = (x_1^2 + x_2^2 + ... + x_n^2)^(1/2),

the square root of the sum of squares. The way you take the distance from the origin to the point (x, y, z) as sqrt(x^2 + y^2 + z^2), this formulation works in the same manner. In the same sense, you can define the general p-norm,

||x||_p = (|x_1|^p + |x_2|^p + ... + |x_n|^p)^(1/p),

that is, the p-th root of the sum of the p-th powers of the absolute values of the coordinates. Inserting p = 2 gives the ordinary Euclidean norm with which all of you are familiar, and other values of p define other norms. In particular, the 1-norm is the sum of the absolute values, ||x||_1 = |x_1| + |x_2| + ... + |x_n|, and the infinity norm, sometimes also called the max norm, is defined in the limit as p tends to infinity:

||x||_inf = max_i |x_i|.

It is called the max norm because, as you raise |x_1|, |x_2|, etcetera to a large power, the largest of these magnitudes dominates the sum; when you take the p-th root, all the others die out in proportion and you recover the largest coordinate magnitude. Even in the Euclidean case, you could define the norm not directly as sqrt(x^T x) but with respect to a weight matrix W, as ||x||_W = sqrt(x^T W x). The condition is that this weight matrix must be symmetric and positive definite, otherwise this function will fail to define a distance measure. With these definitions of the norm of a vector, we can then proceed to define the norm of a matrix. Unless otherwise stated, in all our dealings we will be talking about the Euclidean norm, but other norms can also be used in the same context.
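As a quick numerical aside, here is a minimal Python/NumPy sketch of these vector norms; the vector x and the weight matrix W are arbitrary examples chosen purely for illustration.

```python
# Minimal sketch of the vector norms defined above (illustrative values).
import numpy as np

x = np.array([3.0, -4.0, 12.0])

two_norm = np.linalg.norm(x)          # Euclidean / 2-norm: sqrt(9 + 16 + 144) = 13
one_norm = np.linalg.norm(x, 1)       # 1-norm: |3| + |-4| + |12| = 19
inf_norm = np.linalg.norm(x, np.inf)  # infinity / max norm: max(3, 4, 12) = 12

# Weighted norm ||x||_W = sqrt(x^T W x); W must be symmetric positive definite.
W = np.diag([1.0, 2.0, 0.5])          # an assumed example weight matrix
w_norm = np.sqrt(x @ W @ x)

print(two_norm, one_norm, inf_norm, w_norm)
```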
So, when we talk of the norm of a matrix, we ask: what is the job of a matrix, and in what sense can we talk of its size? The task of a matrix is to multiply a vector and give another vector; the input vector has a size and the output vector has a size, and the matrix A will generally produce a magnification, namely the size of the resulting vector divided by the size of the original vector. That magnification could, in some sense, be considered the size of the matrix; but the problem starts when we find that on different vectors the same matrix produces different magnifications. Therefore, we define the norm of a matrix as the maximum magnification it is capable of producing:

||A|| = max over x of ||Ax|| / ||x||,

or, if you take all x of the same size, namely on the unit sphere ||x|| = 1, then equivalently

||A|| = max over ||x|| = 1 of ||Ax||.

That is, on the unit sphere you take all vectors, and the one which is magnified to the largest extent defines the norm of the matrix. As a direct consequence of this definition, we have the inequality

||Ax|| <= ||A|| ||x||,

because the largest possible magnification has been defined as the norm of the matrix; whatever you have on the other side can be at most equal to it, and most of the time it will be less.

Another important term that we need to define is the condition number. We already know that when the columns of a square matrix are linearly dependent, we call that matrix singular: its determinant is 0 and its inverse does not exist. Even when the matrix is not exactly singular, we can talk of its closeness to singularity; if the determinant is close to 0, we say the matrix is close to singular. A measure of this closeness to singularity, or rather of being away from singularity, is given by the condition number,

kappa(A) = ||A|| ||A^{-1}||,

the product of the norm of A as defined above and the norm of A inverse. If the condition number is 1, the matrix scales all vectors in equal proportion. In most cases that will not be the situation: the matrix will scale some vectors less and some vectors more, and the condition number will be higher than 1. In the worst case, where the matrix is singular, some vectors are mapped to 0, so their magnification is 0, while some other vectors are magnified much more; you then have a very disparate spectrum of magnifications, and the product of these two norms tends to infinity. For a singular matrix, in the limit, kappa tends to infinity.
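These quantities are easy to compute; here is a brief illustrative sketch, using the nearly singular 2 by 2 matrix of the example that follows.

```python
# Sketch: induced matrix norm and condition number with NumPy.
import numpy as np

A = np.array([[0.9999, -1.0001],
              [1.0,    -1.0   ]])

# Induced 2-norm: the maximum magnification ||Ax|| / ||x|| over all x != 0.
norm_A = np.linalg.norm(A, 2)

# Condition number kappa(A) = ||A|| * ||A^{-1}||.
kappa = norm_A * np.linalg.norm(np.linalg.inv(A), 2)

print(norm_A, kappa, np.linalg.cond(A, 2))   # cond() computes kappa directly
```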
So, the higher the condition number, the closer the matrix is to singularity; the lower the condition number, the more well-rounded its mapping. If all vectors are mapped with equal magnification, we call such a matrix isotropic: iso, equal; tropic, direction. The performance of that matrix in all directions is the same. At the other extreme, we have singular matrices; that gives you the two limits of the condition number, 1 and infinity. In between, if we find a very high condition number, for example 200 or 400 or something of that sort, we call the matrix ill-conditioned. Its health is bad: it scales some vectors to a very high magnification, while some other vectors are scaled to a very low magnification. That disparity is the illness, the ill-conditioning of the matrix. If instead you have small condition numbers like 4, 5 or 7, you say the matrix is well behaved, well conditioned.

Now, why is this situation of a large condition number called ill-conditioned? Ill in what sense? To see that, let us consider a small example. Here we have two equations:

0.9999 x_1 - 1.0001 x_2 = 1,
x_1 - x_2 = 1 + epsilon.

You can see that for epsilon = 0 the solution is x_1 = 1/2, x_2 = -1/2: substituting (1/2, -1/2) satisfies both equations exactly. In terms of epsilon, the solution works out to

x_1 = 1/2 + 5000.5 epsilon,  x_2 = -1/2 + 4999.5 epsilon.

Now we put epsilon equal to something small and see what happens. Even a value of epsilon = 0.0001 changes x_1 and x_2 drastically: the small epsilon gets magnified by a factor of about 5000 in each component, so there is a significant change in x_1 and x_2. That tells us that the solution of this system is very sensitive to small changes in the right-hand side: with small changes in epsilon, there are significant changes in the solution. This sensitivity to small changes in the right-hand side is the ill-conditioning. If you find the condition number of this coefficient matrix, you will see that it is very large. You can also see it directly: the first column is very close to (1, 1) and the second column is very close to (-1, -1), so the columns are obviously close to linear dependence. That means the coefficient matrix is very close to singularity, that is, ill-conditioned according to our definition, and that ill-conditioning manifests itself in this high sensitivity to small changes in the right-hand side. Why is this dangerous? Because all numbers that we get in practical situations are results of measurements or other calculations, and any measured data or computed result is susceptible to small errors. With small errors in the data, the final solution will suffer badly; this is the result of ill-conditioning. In this sense it is ill: if a person is in bad health, then a small exposure to heat, cold or fatigue will push him into other illnesses, and in that sense this coefficient matrix is ill.
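Here is a small illustrative computation of this sensitivity:

```python
# Sketch: sensitivity of the ill-conditioned 2x2 system to epsilon.
import numpy as np

A = np.array([[0.9999, -1.0001],
              [1.0,    -1.0   ]])

for eps in [0.0, 1e-4, 2e-4]:
    b = np.array([1.0, 1.0 + eps])
    x = np.linalg.solve(A, b)
    print(f"eps = {eps:7.1e}  ->  x = {x}")

# eps = 0 gives about (0.5, -0.5); eps = 1e-4 already gives about (1.0, 0.0):
# a change of order 1e-4 in b produces a change of order 1 in x.
```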
Apart from sensitivity to small changes in the right-hand side, there is another bad consequence of the ill-conditioning of this matrix, and that is in the validation of a guess. Suppose somebody makes a guess at the solution of this system; then, over a wide range of guesses, verification through the equations will tell us that the guess is almost right. That means mistakes in the guessed solution will not be identified, will not be caught, in this kind of system. To see how this works, consider the illustration. The two equations, plotted as lines 1 and 2 in the (x_1, x_2) plane, intersect at (1/2, -1/2), as I told you; that is the correct solution of the reference system. Now make a small change in epsilon: the second equation changes, which means the second line shifts parallel to itself. In the figure, the dashed line marked 2b is the shifted position and line 2 is the original position. Now see where the point of intersection is. Because of the very small angle between the two lines, a small parallel shift of line 2 makes the point of intersection jump suddenly from the old point to a far-away one. This is the result we saw: a small change in epsilon, a large change in x_1 and x_2; the solution shifts by a huge distance. That is one difficulty: through a parallel shift, the point of intersection changes suddenly. Next, suppose we had tried to validate a guess which is not right. For epsilon = 0 the correct solution is (1/2, -1/2), but suppose we decide to try the point (1, 0), which is somewhere here. Even the point (1, 0) is extremely close to both lines. If you substitute (1, 0) into the equations, see what you get: 0.9999 * 1 - 1.0001 * 0 = 0.9999, almost exactly 1; and 1 - 0 = 1, exactly right. That means, if you had tried the point (1, 0) as a guessed solution, these equations would tell you that your guess is almost right. But from the figure you know that the correct solution is here at (1/2, -1/2) and (1, 0) is there. Not only (1, 0): any point in this narrow shaded zone, guessed and verified, will appear to be almost the right answer, though it is far away from the right answer. Any point in the shaded portion will impersonate the right answer.
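This near-perfect validation of a wrong guess is easy to reproduce; a small illustrative check:

```python
# Sketch: a wrong guess that "validates" almost perfectly in an ill-conditioned system.
import numpy as np

A = np.array([[0.9999, -1.0001],
              [1.0,    -1.0   ]])
b = np.array([1.0, 1.0])              # epsilon = 0; true solution is (0.5, -0.5)

guess = np.array([1.0, 0.0])          # far from the true solution
residual = A @ guess - b
print(residual)                       # [-1e-4, 0]: the equations say "almost right"
print(guess - np.array([0.5, -0.5])) # yet the error in the guess is of order 1
```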
There is another interesting feature here. Rather than shifting line 2 a little through epsilon, suppose you rotate line 2. That means changing its coefficients: making them 0.9999 and -1.0001 rather than 1 and -1 means a slight rotation of the line. If, while rotating, you anchor the line at the intersection point, the two lines become coincident; in that case there are infinitely many solutions, exact singularity. On the other hand, if you anchor at any other point, say at (1, 0), while rotating, you get a line parallel to line 1, and then there is no solution: another manifestation of singularity of the coefficient matrix.

So, all these qualitatively and quantitatively different scenarios can arise through small changes in the right-hand side or in the coefficient matrix in the case of an ill-conditioned system; this will not be so if the condition number of the coefficient matrix is small, that is, if the matrix is nice and healthy. Let us now see these things analytically. Consider the system Ax = b, for which the solution is x = A^{-1} b, and let us analyze the first variations. Taking first variations of Ax = b, we get

(delta A) x + A (delta x) = delta b.

From here we find an expression for delta x: take the first term to the other side and pre-multiply throughout with A^{-1}, which gives

delta x = A^{-1} (delta b) - A^{-1} (delta A) x.

This is our important relationship; it tells us how x changes with small errors in the right-hand side and small errors in the coefficient matrix. We consider the two issues separately. First, suppose the matrix A is exactly known: delta A = 0, delta b is something, and we are interested in its effect. Taking norms on both sides, and recalling that the norm of a product is at most the product of the norms,

||delta x|| <= ||A^{-1}|| ||delta b||.

Our intention is to compare the fractional error in x with the fractional error in the data b, that is, to compare ||delta x|| / ||x|| with ||delta b|| / ||b||. So we divide both sides by ||x||, and multiply and divide the right-hand side by ||b||; the multiplication and division take care of each other, and we get

||delta x|| / ||x|| <= ||A^{-1}|| (||b|| / ||x||) (||delta b|| / ||b||).

Now remember that ||b|| = ||Ax|| <= ||A|| ||x||, because of that same old relationship; therefore ||b|| / ||x|| <= ||A||, and hence

||delta x|| / ||x|| <= ||A|| ||A^{-1}|| (||delta b|| / ||b||) = kappa(A) (||delta b|| / ||b||),

where ||A|| ||A^{-1}|| is, by definition, the condition number of A. That means the fractional error in the result x is bounded by kappa times the fractional error in the right-hand side. Now, if the condition number is large, that is, if the matrix is ill-conditioned, suppose kappa is 1000; then a small error in b can make the result erroneous by 1000 times that fractional error. If there is a 1 percent error in b, then as a consequence the error in x can be as large as 1000 percent.
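This bound is easy to check numerically; here is an illustrative sketch using the same 2 by 2 system.

```python
# Sketch: checking ||dx||/||x|| <= kappa(A) * ||db||/||b|| numerically.
import numpy as np

A = np.array([[0.9999, -1.0001],
              [1.0,    -1.0   ]])
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)

db = np.array([0.0, 1e-4])            # small perturbation of the right-hand side
dx = np.linalg.solve(A, b + db) - x

lhs = np.linalg.norm(dx) / np.linalg.norm(x)
rhs = np.linalg.cond(A, 2) * np.linalg.norm(db) / np.linalg.norm(b)
print(lhs, rhs, lhs <= rhs)           # the fractional error in x respects the bound
```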
This is the result of the ill-conditioning which we saw in the illustration. In the second situation, the right-hand side b is known exactly, so delta b = 0 and delta A is something; a parallel derivation reaches a similar conclusion. From the relationship above, delta x = -A^{-1} (delta A) x, so, taking norms twice as before,

||delta x|| <= ||A^{-1}|| ||delta A|| ||x||.

Dividing both sides by ||x||, and multiplying and dividing by ||A||, we get

||delta x|| / ||x|| <= ||A|| ||A^{-1}|| (||delta A|| / ||A||) = kappa(A) (||delta A|| / ||A||).

Here kappa(A) is the condition number, ||delta A|| / ||A|| is the fractional change in the size of A, and on the left is the fractional change in the size of x. Again, if the condition number is something like 1000, a 1 percent error in the coefficient matrix entries, or rather in the matrix norm, allows a 1000 percent error in the result x. This is again the result of sensitivity to small changes, this time in the matrix. So, if the matrix is ill-conditioned, that is, if kappa is large, then with small errors in the right-hand side or in the coefficient matrix you can expect a large change, a large error, in the result, and the solution that you find numerically will not be quite reliable.
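For completeness, a companion sketch perturbing the matrix instead of the right-hand side; since the derivation above is first-order, the perturbation is kept small.

```python
# Sketch: checking ||dx||/||x|| <= kappa(A) * ||dA||/||A|| numerically.
import numpy as np

A = np.array([[0.9999, -1.0001],
              [1.0,    -1.0   ]])
b = np.array([1.0, 1.0])
x = np.linalg.solve(A, b)

dA = np.array([[1e-6, 0.0],
               [0.0,  0.0]])          # tiny perturbation of one matrix entry
dx = np.linalg.solve(A + dA, b) - x

lhs = np.linalg.norm(dx) / np.linalg.norm(x)
rhs = np.linalg.cond(A, 2) * np.linalg.norm(dA, 2) / np.linalg.norm(A, 2)
print(lhs, rhs, lhs <= rhs)
```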
Now we need to handle such situations, and we know that when very large numbers and very small numbers turn up together in the same computation, numerical errors grow faster and the resulting solution may not be very reliable. How do we handle such situations and find solutions which are still robust? That is one issue we take up in the subsequent discussion. Apart from ill-conditioning, another way in which a system of linear equations can be bad is through the matrix shape. If the coefficient matrix is rectangular, that is, if the number of equations is larger than the number of unknowns or vice versa, then standard Gaussian-elimination-type processes do not apply directly. We can analyze such a situation completely through the methods discussed earlier, regarding whether solutions exist, how many there are, whether they are infinite in number, and so on, and try to describe them. But quite often you will find situations where, irrespective of whether there are infinitely many solutions or no solution at all, you are required computationally to produce one solution: a good, representative, useful working solution. How to find that? So, first we consider two cases of rectangular systems: one in which the number of equations is larger, and the other in which the number of unknowns is larger. In both cases, we first consider the situation where the coefficient matrix is at least of full rank, which is relatively easier; afterwards we combine all sorts of difficulties together, that is, the system may be rectangular and at the same time ill-conditioned, and we see how to work out a solution even in the face of all these difficulties.

First consider the system Ax = b in which A is of size m by n with m larger: more equations than unknowns, so the matrix is a thin, tall matrix, with more rows than columns. This is the first case we take up, in which n < m. Fortunately, in this case, the rank of A is n: all n columns are linearly independent, so it is a full-rank matrix. Later we will remove this special assumption and handle even the worst case. Now, with only n unknowns and so many (m) equations, we do not expect the equations to be consistent; we expect conflict, and conflict means that, mathematically, a solution does not exist. Still, there are situations where we say that whatever little discrepancy, whatever little conflict exists among the equations is due to experimental error, and we do want a representative solution. Such situations arise quite often in problems where we seek a least-square-error solution: we need a parameter set which is more or less right, while the data is erroneous because of experimental measurement. In that case, one quick remedy we can suggest is to multiply both sides with A transpose. If we pre-multiply both sides with A^T, the product A^T A is a square matrix of size n by n, and A^T b is again a small vector of size n. So, through pre-multiplication with A^T on both sides, we get the system

A^T A x = A^T b,

which is small, n by n; and because of the full-rank nature of A, the new coefficient matrix A^T A is square, non-singular and invertible. So now we can apply our Cholesky-decomposition kind of method, which we studied in the previous lecture, and immediately get the solution:

x = (A^T A)^{-1} A^T b.
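To make this concrete, here is a minimal sketch of the least-squares recipe on a small, assumed data-fitting example; in production code, np.linalg.lstsq or a QR factorization is numerically preferable to forming A^T A explicitly.

```python
# Sketch: least-squares solution of a tall system via the normal equations.
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])            # m = 4 equations, n = 2 unknowns, full column rank
b = np.array([0.1, 0.9, 2.1, 2.9])    # slightly inconsistent "measured" data

# Normal equations: A^T A x = A^T b; A^T A is n x n, symmetric positive definite,
# so a Cholesky factorization would apply.
x = np.linalg.solve(A.T @ A, A.T @ b)
print(x)                              # about [0.06, 0.96]: the least-squares fit
```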
Now, this pre-multiplication with A^T is not something ad hoc, not something arbitrary that we did just in order to solve the system; it has a much deeper meaning, and to discover that meaning let us consider, completely separately and independently, the question of minimizing the error norm. Ax = b was the system of equations, and we expected conflict; so the error is Ax - b, and we want to find that x for which this error, or rather the square of its norm, is minimum. Opening up the square of the norm, we find

E(x) = (Ax - b)^T (Ax - b),

which gives rise to the expanded expression. For this function, this squared error, to be minimum, its first derivative with respect to x, that is, its gradient with respect to x, must be zero. Working out the gradient, we find

grad E = 2 A^T (Ax - b) = 0,

and this is nothing but the equation system which we actually solved: A^T A x = A^T b. That means, through this small trick of pre-multiplying the equation system with A^T on both sides, we have filtered out the solution which minimizes the squared error; that is why the resulting x is called the least-squares solution. The matrix (A^T A)^{-1} A^T sitting before b is known as the pseudoinverse, or Moore-Penrose inverse, because in a way it acts like an inverse: pre-multiplying b with it gives you something like a solution of Ax = b, so that "something" is in a way acting like the inverse of A. It is not truly an inverse, but it is some sort of inverse: a pseudoinverse. In this case it is also called the left inverse, because multiplying A with it on the left gives the identity: (A^T A)^{-1} A^T A = I.
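The left-inverse property is easy to verify numerically; a brief sketch with the same assumed matrix:

```python
# Sketch: the pseudoinverse (A^T A)^{-1} A^T as a left inverse of a full-column-rank A.
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])

A_pinv = np.linalg.inv(A.T @ A) @ A.T            # the left inverse
print(np.allclose(A_pinv @ A, np.eye(2)))        # True: it times A gives the identity
print(np.allclose(A_pinv, np.linalg.pinv(A)))    # True: matches NumPy's pseudoinverse
```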
That was the case in which the matrix is tall. Now we consider the other case, in which the matrix is short and fat: m < n. In Ax = b this means fewer equations in a larger number of unknowns: too many unknowns to determine. This problem is typically indeterminate; you expect infinitely many solutions if the rank is full, as we assume in this particular case: there are only m rows, with too many (n) columns, but those m rows are all linearly independent. In that situation the system of equations has infinitely many solutions. But there may be many situations in which you are not interested in those infinitely many solutions; you want to find just one solution which is, in some way, a very nice, good, useful working solution. To find one solution you want a square coefficient matrix, but this time, unlike last time, you should not pre-multiply with A^T: here A is short and fat, so A^T is n by m, and the product A^T A would be an n by n matrix, a huge matrix. Moreover, this matrix is sure to be singular, because A is of rank m and A^T is of rank m, and their product can never have rank higher than m; at most it has rank m, which is less than n. So A^T A is certainly singular, and methods like Gaussian elimination or LU decomposition will certainly fail. This is not the way to go in this kind of situation.

So, what to do? You apply a different trick. What you must do here is not to pre-multiply with A^T, but to look for an m-dimensional vector lambda that satisfies

A^T lambda = x,

and then, in place of x, insert A^T lambda into the system. What is A^T here? It is the transpose of this short, fat matrix, that is, a tall, thin matrix, and the product A A^T is a small square matrix of size m by m. So we get the system

A A^T lambda = b,

and since the rank of A is m, the matrix A A^T is symmetric and non-singular; in fact, it is positive definite. Then you can solve this to get lambda = (A A^T)^{-1} b, and that lambda, when multiplied with A^T, gives you x:

x = A^T (A A^T)^{-1} b.

So you get this solution, which is somewhat different from the earlier one: earlier it was (A^T A)^{-1} post-multiplied with A^T; now it is (A A^T)^{-1} pre-multiplied with A^T. Now, last time we found that the trick of multiplying with A^T had a meaning in terms of minimizing the error. What is the meaning here? To discover it, consider this small constrained optimization problem: minimize half the squared size of the solution itself, (1/2) ||x||^2, subject to the equality constraints Ax = b, which are actually the system of equations you wanted to solve. Because this system has infinitely many solutions, you ask for that vector x which satisfies the equations as a required condition and, among all those infinitely many solutions, is the one of smallest size. One possible formulation of this constrained optimization problem is to find the extremum of the Lagrangian,

L(x, lambda) = (1/2) x^T x - lambda^T (Ax - b),

and the extremum of this Lagrangian is found at that point (x, lambda) where the gradients with respect to x and with respect to lambda both vanish. Applying those conditions through differentiation: the gradient with respect to x gives x - A^T lambda = 0, which is exactly what we assumed at the start, and the gradient with respect to lambda gives Ax - b = 0, which is the original system of equations. These are the two systems of equations whose solution has been found here in this manner. That means, through this small trick, you find that x which is the minimum-size x satisfying the system of equations, apart from the many others with which you are not bothered.
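An illustrative sketch of this minimum-norm recipe on a small assumed system:

```python
# Sketch: minimum-norm solution of a wide (underdetermined) full-row-rank system.
import numpy as np

A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 3.0]])       # m = 2 equations, n = 3 unknowns, rank 2
b = np.array([6.0, 14.0])

lam = np.linalg.solve(A @ A.T, b)     # solve (A A^T) lambda = b, an m x m system
x = A.T @ lam                         # x = A^T lambda: the smallest solution
print(x, A @ x)                       # A x reproduces b exactly
print(np.allclose(x, np.linalg.pinv(A) @ b))   # agrees with the pseudoinverse route
```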
Now, this particular solution gives you the foot of the perpendicular on the solution plane. Many solutions are possible for this system, because the system actually defines a plane-like entity in the space of x; out of all the points you can take on that plane-like entity, the one closest to the origin, the one of least size, is exactly the one which minimizes (1/2) ||x||^2 subject to the constraints. The constraints come in as the equation of the plane, and the minimization gives you the foot of the perpendicular from the origin to that plane. The matrix A^T (A A^T)^{-1}, the entire thing working like some sort of an inverse, is the pseudoinverse, or Moore-Penrose inverse, in this case; it is also called the right inverse, because if you multiply A with this matrix from the right side, the result is the identity.

So we have covered two situations of rectangular systems of equations: in one case the number of equations is more, in the other the number of unknowns is more. Now take the most general situation, where it could be either way, compounded with the additional problem of rank deficiency: the rank of A may be less than m in the one case, or less than n in the other. Then you have the problem of the system being rectangular, combined with the additional problem of rank deficiency, that is, singularity of A A^T in the one case and of A^T A in the other; and compounded with that could be the ill-conditioning issue. This most general situation is quite often encountered in ill-posed problems, in which the statement of the problem itself is not complete or not fully clear. Even in that situation we can still find a solution which is a very good working solution and which is quite robust, not so sensitive; it is called the singularity-robust solution. The method to find it is called Tikhonov regularization, and this technique gives you a recipe for any linear system, m > n, m = n or m < n, in any condition: well-conditioned, ill-conditioned or even singular. In all cases one recipe is useful, at a little additional computational cost and with a little extra error; but how much error is in your hand. A single recipe will be able to handle all sizes, shapes and conditions.

The way to handle it is to first observe that the system Ax = b may have conflict; to handle that conflict, you pre-multiply with A^T and get the so-called normal system of equations, A^T A x = A^T b, which is certainly consistent. Then you say: still, the coefficient matrix in hand, A^T A, may be singular or ill-conditioned. To handle that, you rig the system of equations: add a little nu^2 to the diagonal entries, which is equivalent to adding nu^2 times the identity to the matrix:

(A^T A + nu^2 I) x = A^T b.

As you add this, you enrich the diagonal entries by a little amount. Note that A^T A, even in the case of ill-conditioning or singularity, is at least positive semi-definite: for any x, the value x^T (A^T A) x is never less than 0; in the worst case it can be equal to 0. With this addition, it cannot even be 0. So now this coefficient matrix is symmetric and positive definite; you can prove this, and in the book one or two of the exercises ask you to prove this particular statement, that A^T A + nu^2 I is always symmetric and positive definite. The symmetry is of course visible; the positive definiteness you can prove. And if it is symmetric and positive definite, then ordinary decomposition methods, such as Cholesky decomposition, or any other method, will suffice to give you a solution of this system, that is, a value of the vector x.
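A minimal sketch of this recipe; the function name tikhonov_solve, the value of nu and the test matrix are chosen purely for illustration.

```python
# Sketch of Tikhonov regularization: one recipe for any shape and any conditioning.
import numpy as np

def tikhonov_solve(A, b, nu=1e-3):
    """Solve (A^T A + nu^2 I) x = A^T b; the coefficient matrix is symmetric
    positive definite for any A, so the solve always succeeds."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + nu**2 * np.eye(n), A.T @ b)

# Works even when A is square and exactly singular:
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])            # rank 1
b = np.array([2.0, 2.0])
print(tikhonov_solve(A, b))           # close to the minimum-norm solution (1, 1)
```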
The idea of this method of Tikhonov regularization is that, expecting, or being concerned about, some level of ill-conditioning in the system, you first immunize the system with this little dose of error. That is the little price: it results in a slight amount of error, and how much you allow depends on the magnitude of nu. You can choose nu as small as you like, 10^-3 or 10^-6, based on your understanding of up to what level your computation will proceed without trouble; accordingly, your error will be large or small. With this little immunizing error, you make the solution process safe; there will be no further difficulties. So, the issues are the choice of nu, which is in your hand, and one more small point: if m is smaller than n, that is, if the number of equations is less, then rather than handling an n by n matrix here, you can get a computational advantage by considering the equivalent system

(A A^T + nu^2 I) lambda = b,  followed by  x = A^T lambda;

rather than an n by n system, you then solve an m by m system, get lambda, and evaluate x separately. If m is significantly smaller than n, this alternative route gives you a computational advantage, fewer computational operations, and you can show that this calculation gives the same final result as the other one; that is a small exercise which I leave for you.

Apart from these methods, which are non-iterative and need no initial guess, there are a few methods, quite useful in many application problems, which are iterative in nature. Two iterative methods for solving linear systems are quite well known: one is Jacobi's iteration method and the other is the Gauss-Seidel method. In both you start with a guess, and through iterations you try to improve it: at every equation you update one of the unknowns, and iteration after iteration you try to improve the estimates of the unknowns. Now, the iterations will result in improvement only when the equations are organized in a particular manner, and that manner is related to the concept of diagonal dominance. When the diagonal entries of the coefficient matrix are significantly larger than the off-diagonal entries, we call the matrix diagonally dominant, and in that case these kinds of steps actually lead to improvements in the values of x. These methods work very well when a good initial approximation is available, a situation which quite often arises with the linear equations coming from differential equations; therefore the Gauss-Seidel and Jacobi methods are quite often used in methods for the solution of differential equations. Such methods, in general, are called relaxation-based methods.
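As an illustration of such a relaxation sweep, here is a minimal Gauss-Seidel sketch; the function and the small diagonally dominant test system are invented for demonstration.

```python
# Minimal Gauss-Seidel sketch, assuming a diagonally dominant coefficient matrix.
import numpy as np

def gauss_seidel(A, b, x0, iters=50):
    """Sweep equation by equation, updating one unknown in place per equation."""
    x = x0.astype(float)
    n = len(b)
    for _ in range(iters):
        for i in range(n):
            # Solve equation i for x[i], using the latest values of the others.
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
    return x

A = np.array([[4.0, 1.0, 1.0],
              [1.0, 5.0, 2.0],
              [1.0, 2.0, 6.0]])       # diagonally dominant, so the sweeps converge
b = np.array([6.0, 8.0, 9.0])
print(gauss_seidel(A, b, np.zeros(3)))   # approaches (1, 1, 1); cf. np.linalg.solve
```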
So, in this lesson we have considered these important points: we have noted that solutions are unreliable when the coefficient matrix is ill-conditioned; we have found that computing the pseudoinverse, and hence a quick solution, is easy for a full-rank coefficient matrix, even a rectangular one; when that is not the situation, we have the Tikhonov regularization method to find a singularity-robust solution; and finally we have seen iterative methods, which may have an edge in certain situations where good approximations are available. Now I will take you briefly back to the overall contents of this course. In the two lectures covered so far, we have completed the lessons of one module of our course, the module on systems of linear equations. In the next lectures we will take up the algebraic eigenvalue problem, which will run through six lessons. Here you will note that we have covered quite a few topics in systems of linear equations, and if we have covered these lessons extremely fast, then I will remind you that skipping the exercises will sometimes make you lose contact with the subject matter that we are discussing. So I will draw your attention back to the tutorial plan: for chapters 2, 3, 4, 5, 6 and 7 of the book, which we have covered in these few lectures, there are problems in the exercises which you must attempt in order to keep your understanding up to pace, and some of them, flagged here as possible tutorial problems, are particularly important. So, try to cover as many problems from the exercises in the book as possible, certainly including these, with special attention to the flagged tutorial problems. In the next lecture we will make an introduction to the eigenvalue problem and continue forward. Thank you.