In our last lecture we were looking at matrix conditioning. Matrix conditioning allows us to separate bad matrices from good ones: we know when the calculations can go wrong because the matrix is bad, and we can make a judgment on the quality of the solution of a system of linear algebraic equations. I gave a very simple example from polynomial approximation. You have some continuous function f(z) that you are trying to approximate, and if you develop a polynomial approximation, then, as we have shown, you get equations of the type H θ = u, where H is the Hilbert matrix, θ is the parameter vector containing the polynomial coefficients, and u is a finite-dimensional vector whose entries depend on how you have formulated the problem. This is exactly like solving A x = b, with x = θ, b = u and A = H. What I showed you last time was the 3 × 3 Hilbert matrix

H3 = [ 1    1/2  1/3 ]
     [ 1/2  1/3  1/4 ]
     [ 1/3  1/4  1/5 ]

For θ = (1, 1, 1)^T, the vector of ones in R³, the right-hand side is u = (11/6, 13/12, 47/60)^T, so this θ is the exact solution. I showed you how things can go wrong even for this matrix, whose condition number is not so bad: we found c∞(H3) = 748. Instead of solving the original problem, we solved a slightly modified version of it. I truncated the fractions in the matrix,

H3 + ΔH3 = [ 1      0.5    0.333 ]
           [ 0.5    0.333  0.25  ]
           [ 0.333  0.25   0.2   ]

so this matrix has a slight error compared to the original H3, and in place of u I took a slightly perturbed right-hand side, again obtained by simply truncating the fractions: u + Δu = (1.83, 1.08, 0.783)^T. The solution changed drastically, to θ + Δθ ≈ (1.09, 0.49, 1.49)^T, and this for a matrix whose condition number is only about 700. If I estimate the fractional change made in H or in u, using ||ΔH3|| / ||H3|| or ||Δu|| / ||u||, it is of the order of 0.3 percent. Yet the solution changes by almost 50 percent, a drastic change; and here, whatever you try, maximal pivoting or any reordering of the calculations, you will get into trouble, because this matrix is ill conditioned. Now, the 2 × 2 example I gave you earlier might lead you to believe that this has something to do with singular matrices. It has nothing to do with singularity.
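As a quick numerical check, here is a minimal sketch in Python/NumPy (the lecture's own experiments were done in MATLAB/Scilab) that reproduces the effect:

```python
import numpy as np

# 3x3 Hilbert matrix; for theta = (1, 1, 1)^T the exact right-hand side
# is u = (11/6, 13/12, 47/60)^T
H3 = np.array([[1.0, 1/2, 1/3],
               [1/2, 1/3, 1/4],
               [1/3, 1/4, 1/5]])
u = np.array([11/6, 13/12, 47/60])

print(np.linalg.cond(H3, np.inf))   # 748.0, the infinity-norm condition number
print(np.linalg.solve(H3, u))       # [1. 1. 1.], the exact solution

# Truncated matrix and right-hand side, as in the lecture (~0.3% perturbation)
H3p = np.array([[1.0,   0.5,   0.333],
                [0.5,   0.333, 0.25 ],
                [0.333, 0.25,  0.2  ]])
up = np.array([1.83, 1.08, 0.783])

print(np.linalg.solve(H3p, up))     # roughly [1.09, 0.49, 1.49]: ~50% change
```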
What matters here is the condition number. In terms of the 2-norm, the condition number is the ratio of the largest singular value of the matrix to the smallest: find the eigenvalues of AᵀA, take the ratio of the maximum eigenvalue to the minimum eigenvalue, and take the square root. If that ratio is small, the matrix is well conditioned. The individual eigenvalues need not be 0; but if the ratio of the largest eigenvalue of AᵀA to the smallest is large, the matrix is ill conditioned, and that will create problems for you no matter how sophisticated your program or your computer is. It is an inherent problem of the matrix, not a problem of the computer or the program. There is one more example I have given here, a very simple one that I think I demonstrated to you: the matrix

A = [ 1 2 3 ]
    [ 4 5 6 ]
    [ 7 8 9 ]

This matrix is not called by any special name; it appears you have just written the numbers 1 to 9 in a particular sequence. Yet it is highly ill conditioned. Its condition number

c2(A) = sqrt( λmax(AᵀA) / λmin(AᵀA) )

turns out to be about 3.81 × 10^16, very large indeed. You can do a simple experiment in MATLAB or Scilab or any such software: take this matrix and find its inverse. MATLAB will give you a warning that the matrix is highly ill conditioned and the results may not be reliable. And you can check: if you find the inverse of a matrix and multiply it by the matrix itself, you should get the identity matrix. If you do that numerical experiment with this matrix, you will get something that has nothing to do with the identity matrix; you get numbers like 2, 8, 18 when you multiply A by its computed inverse, because this is a highly ill conditioned matrix. I have given one more example of the same kind: for that matrix too, if you compute A times its inverse in MATLAB, you will never get the identity matrix. What I want to stress is that every matrix comes with its own inherent characteristics, which dictate how the calculations proceed, and you should be able to recognize bad, ill-conditioned matrices.
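The same experiment in a minimal NumPy sketch (unlike MATLAB, NumPy typically returns a garbage inverse silently rather than warning):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

print(np.linalg.cond(A, 2))   # of the order of 1e16: numerically singular

Ainv = np.linalg.inv(A)       # huge entries built from round-off noise
print(A @ Ainv)               # nothing like the identity matrix
```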
But now I will give you another matrix, B, whose entries are of the order of 1 and 2, every element multiplied by 10⁻¹⁷. You might say this is like a null matrix: all its elements are close to 0. Yet if I invert this matrix and multiply B⁻¹ into B, MATLAB will give you a perfect identity matrix. Why? Because the condition number of this matrix, even though it looks like a null matrix, is c2(B) = 5.474: a very well conditioned matrix, no problems with the calculations. If you take BᵀB, find its maximum and minimum eigenvalues, take their ratio and then the square root, that is what comes out. So inherently you are not going to get into any trouble when you do calculations with this matrix, even though it is close to a null matrix and its eigenvalues are very close to 0. That does not matter. What matters is the condition number, the ratio of the maximum to the minimum singular value; the square root of that ratio is what dictates the calculations. The quick check below makes the point.
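Here is the experiment as a NumPy sketch. The exact entries of the lecture's matrix are not recoverable from the transcript, so this uses a stand-in 2 × 2 matrix of the same flavor; its condition number is 3 rather than the lecture's 5.474, but the conclusion is identical:

```python
import numpy as np

# Stand-in near-null matrix: every element is of order 1e-17
B = 1e-17 * np.array([[1.0, 2.0],
                      [2.0, 1.0]])

print(np.linalg.cond(B, 2))    # 3.0: scaling by 1e-17 leaves the ratio unchanged
print(np.linalg.inv(B) @ B)    # essentially the identity matrix
```

Note that the condition number is scale-invariant: multiplying a matrix by 10⁻¹⁷ multiplies the largest and the smallest singular values by the same factor, so their ratio does not change.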
With this we come to the end of our discussion of linear algebraic equations. We looked at many things, and I suppose you have learned much more about solving A x = b than what your undergraduate courses covered. You probably thought you knew everything about A x = b: Gaussian elimination, and you are done. But what you have seen here goes far beyond that. We looked at sparse matrix methods, efficient ways of organizing the calculations; I could cover only a few of them, just to give you a taste, and there is much more to it. There are sparse matrix toolboxes in MATLAB and Scilab with many routines that exploit the special structure of a matrix to do fast computations. The reason for introducing sparse matrices was to sensitize you to the fact that sparse matrix computations exist, so that in your M.Tech or PhD problem, when you hit upon large-scale matrices, you look for sparsity and try to exploit it; it can make your computations very fast. Next we looked at iterative methods, such as the Jacobi and Gauss-Seidel methods. It is difficult to prove in general, but iterative methods typically work faster than Gaussian elimination, particularly for large-scale matrices. We looked at two classes of iterative methods: the Jacobi and Gauss-Seidel family on the one hand, and optimization-based methods, the gradient and conjugate gradient methods and so on, on the other. For the Jacobi and Gauss-Seidel family we also analyzed the convergence behavior, that is, when you are guaranteed to converge to a solution of A x = b, and we looked at how to tweak the problem to ensure convergence. So we have broadened our toolkit for solving A x = b, and moreover we know what really matters in iterative methods: eigenvalues. The eigenvalue problem propped up, perhaps unexpectedly, when we analyzed these iterations; it is not really unexpected, since such problems arise naturally when you analyze linear difference equations. We related the spectral radius, the maximum-magnitude eigenvalue of a certain matrix, to convergence: if it lies inside the unit circle, we are guaranteed convergence, and we looked at some theorems that ensure convergence. Finally, we moved to matrix conditioning, to weed out bad matrices from the set of matrices. We now know how to judge whether a matrix is bad, and hence whether you are getting wrong answers because the matrix itself is bad, or whether the matrix is good and your problem formulation or computing strategy has mistakes; you know how to distinguish between the two. With this you have a good idea of how to deal with A x = b, so let us move on to solving nonlinear algebraic equations. That is what I am going to do next, and I have uploaded my notes for this.

We now start with the next tool. Let me draw the diagram again, to bring you back to the overall theme of this course. We have an original problem and a mathematical model, some problem that we cannot solve directly; so, using approximation theory, we transform the problem into a computable form. We said there are typically four different tools, or approaches, to solve the resulting problem, and I am going to cover three of them. The first is solving A x = b. This tool can be used directly, or indirectly: to solve f(x) = 0 by Newton's method, it is quite likely that inside the method I am solving A x = b. The second is solving f(x) = 0 itself. The third is ODE initial value problem solvers, the IVP solvers: the Euler method, Runge-Kutta methods and so on. The fourth tool is the stochastic tool, which I am not going to cover; it would probably need one more course of its own. In every case, what you get is an approximate solution to the original problem. So the first tool is done; I am moving to the second now; toward the end of the course we will cover the third; and the fourth will be left untouched. This is the overall structure, just to give you the global picture of what has been happening.

So now let us move on to f(x) = 0: solving nonlinear algebraic equations. We have already done something about this; we derived Newton's method starting from Taylor series approximation. Now you might wonder: I know Newton's method, why do I need anything more? But just as Gaussian elimination is only one way of solving linear algebraic equations, you will realize that Newton's method is just one approach. There are many ways of doing it, and the reason is that no single method is a panacea, one method that works for everything. Sometimes one approach works better, sometimes another, so you have to be ready with multiple tools and use the appropriate one whenever required. In some cases you do not require Newton's method; in some cases it is not even possible to apply Newton's method, because it requires Jacobian calculations.
If I have a hundred equations in a hundred unknowns (you have seen that kind of scenario in solving partial differential equations), then developing a 100 × 100 matrix at each iteration, even if only numerically, is painful and computationally intensive. And just imagine a steady-state simulation of a complete chemical plant: thousands of nonlinear algebraic equations to be solved simultaneously. Even a section of a plant gives many thousands of equations, and computing the Jacobian there, even numerically, is not an easy task. What has happened is that as computers have become more and more powerful, we have also been trying to solve larger and larger problems. Twenty-five or thirty years ago probably nobody thought of solving a thousand equations in a thousand unknowns in a classroom; now you can do it as part of an assignment, whereas it might have been an M.Tech thesis some time back. What we want to solve has changed along with the growing power of computers. There were problems that took days to solve on slow computers; there are still problems that take days to solve, except that what we attempt now is different from what we attempted earlier. Even with very fast computers and very good software there are still hard problems, and there is no end to this; they just keep on growing.

So now let us look at different methods for solving nonlinear algebraic equations. I can work with abstract forms, because you have already seen many examples where you have to solve nonlinear equations. My intention is to solve

f_i(x) = 0 for i = 1, 2, ..., n, where x ∈ Rⁿ,

or, since we are now comfortable with the notion of a function vector, to write the same problem compactly as f(x) = 0, where f is a map from Rⁿ to Rⁿ: you are looking for the value of x at which the function vector f returns the zero vector. These are the kinds of equations we are interested in solving.

What would be the simplest method? First of all, note that nonlinear algebraic equations can be solved analytically only in some very special cases. For example, just as you can solve a quadratic equation in one variable analytically, there are special multivariable systems of quadratic equations for which solutions can be constructed; but such analytical solutions are very few. In general, even a single polynomial equation of nth order in one variable has no closed-form solution (for degree five and above, no general formula for the roots exists), so it is very difficult to construct solutions or roots analytically. We therefore need numerical methods, and one thing I would say is that the less computation a method requires, the better. Let us first look at methods that do not require derivative calculations: I want to solve f(x) = 0 without having to compute derivatives, or, if I must compute derivatives, to do it in some simple way rather than computing the entire Jacobian. So I am going to give you a gradation of methods. Before that, to make the notation concrete, here is how such a function vector looks in code.
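A minimal illustration; the two equations are made up purely for the sake of the example:

```python
import numpy as np

# A made-up system f(x) = 0 with n = 2, written as a function vector:
#   f1(x) = x1^2 + x2^2 - 4
#   f2(x) = x1 * x2 - 1
def f(x):
    return np.array([x[0]**2 + x[1]**2 - 4.0,
                     x[0] * x[1] - 1.0])

print(f(np.array([2.0, 0.5])))   # f at a guess; at a root this is the zero vector
```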
Finally, of course, we will move to Newton's method. In Newton's method the problematic step, in terms of heavy computation, is the Jacobian calculation. There are therefore methods that do what is called a Jacobian update: they do not require explicit differentiation, but construct an approximate Jacobian by taking the last value of the Jacobian and adding some correction, because you have moved to a new point. These methods, broadly called Broyden updates or quasi-Newton methods, are also something we will look at.

The first class of methods I am going to call successive substitution methods. In iterative schemes everything is, in a sense, successive substitution, but by this name I specifically mean the subclass of methods in which you do not have to compute any derivative. The question is: can I arrange my calculations in such a way that I start with some initial guess x^(0) and then generate a new guess from the old guess, in order to solve f(x) = 0? In some problems, for example the tubular reactor with axial mixing, which we have been taking as a theme example throughout the course, you can rearrange the equations f(x) = 0 into the special form

A x = g(x),

where A is a constant matrix and g is some nonlinear function; in effect, f(x) is nothing but A x - g(x), and solving f(x) = 0 reduces to this problem. In some cases, such as the tubular reactor with axial mixing, the discretization naturally yields this kind of form, depending on the structure of the problem. Another way of creating the form is simply to add x on both sides of f(x) = 0 and call x + f(x) the function g(x), giving x = g(x); or, more generally, you can have some matrix B on the left, B x = g(x). The form is the same either way. Now, what can I do with this special form? I can convert the problem into solving a sequence of linear algebraic equations by a very simple trick. I start with some initial guess x^(0), and then I solve

A x^(k+1) = g(x^(k))

for x^(k+1). Is everyone with me on this? If I substitute x^(0) on the right-hand side, g(x^(0)) is known to me; what is not known is x^(1). But then this is exactly a linear algebraic equation A x = b, with the known vector g(x^(k)) playing the role of b, so I can solve for x^(k+1) using any method for A x = b: Gaussian elimination, the Gauss-Seidel method, or whatever. This generates an iteration, and the advantage is that I am not computing a Jacobian. When do you terminate? I will terminate the iterations when ||x^(k+1) - x^(k)|| becomes less than some tolerance ε. A sketch of this scheme, for a small made-up system, follows below.
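A minimal sketch, assuming a made-up A and g(x) chosen so that the iteration contracts (the lecture's tubular-reactor equations would slot into the same loop):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

def g(x):
    return np.array([np.cos(x[1]), np.sin(x[0])])

x = np.zeros(2)                          # initial guess x0
for k in range(100):
    x_new = np.linalg.solve(A, g(x))     # linear solve: A x_(k+1) = g(x_k)
    step = np.linalg.norm(x_new - x)
    x = x_new
    if step < 1e-10:                     # terminate when the iterates stop moving
        break

print(k, x)   # x solves f(x) = A x - g(x) = 0; no Jacobian was ever formed
```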
This method in general looks very simple to formulate, with no Jacobians to compute, and in some cases it does work; when we move on to implicit methods for solving ODE initial value problems, we will see merit in using it. What is very critical here is that this method will converge only if you give an initial guess that is reasonably close to the solution. When and how it converges we will postpone to a little later; I will at least mention it toward the end, though we cannot go too much into the details. Given a good initial guess, the method will converge to the solution; but generating a good initial guess may not always be possible, particularly for large problems. If you are solving a steady-state simulation of an entire section of a plant, generating an initial guess is no joke, it is quite difficult, so it may be hard to use the method there. But in some smaller problems where you can generate a good initial guess, for example the implicit Euler method or the trapezoidal rule, where an explicit method creates a good guess for the implicit calculation, this method works quite well.

While implementing these methods I can also have variations similar to the Jacobi method, the Gauss-Seidel method and the relaxation method, so I am now going to talk about variants of the successive substitution method along those lines. When I do that, I cannot, of course, use the vector-matrix notation: in the Jacobi method we went equation by equation, remember, and I am going to do the same thing here. So I go back to the original form and rearrange it as

x_i = g_i(x) for i = 1, 2, ..., n,

where the function vector is g(x) = (g_1(x), g_2(x), ..., g_n(x))^T and each small g_i is one element of it; I am looking at it element by element. Converting into this form is not difficult: I can pre-multiply both sides of A x = g(x) by A⁻¹, giving x = A⁻¹ g(x); removing the A matrix is not a big deal. Now suppose I have this equation. How will you form Jacobi-like iterations? My Jacobi iteration is

x_i^(k+1) = g_i(x^(k)) for i = 1, 2, ..., n;

I use the old values to create the new values. How will you create Gauss-Seidel-like iterations? Use each new value as soon as it is created. Gauss-Seidel iteration is a concept: you can use it in the context of linear algebraic equations, and equally in the context of nonlinear algebraic equations; the same goes for relaxation iterations. So my first equation will be

x_1^(k+1) = g_1(x_1^(k), x_2^(k), ..., x_n^(k)).

My second equation uses the just-computed x_1^(k+1):

x_2^(k+1) = g_2(x_1^(k+1), x_2^(k), x_3^(k), ..., x_n^(k));

note that, unlike in the linear algebraic case, x_2 can appear on both sides, because these are nonlinear equations and you may not be able to separate the variables. Likewise,

x_3^(k+1) = g_3(x_1^(k+1), x_2^(k+1), x_3^(k), ..., x_n^(k)).

Is this clear? A sketch of these element-wise variants appears below.
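A sketch of the Gauss-Seidel-like variant for a hypothetical two-equation system x_i = g_i(x); the Jacobi-like variant is indicated as a comment, and both converge here because the made-up example is a contraction:

```python
import numpy as np

def g1(x): return 0.5 * np.cos(x[1])                  # x1 = g1(x)
def g2(x): return 0.5 * np.sin(x[0]) + 0.25 * x[1]    # x2 = g2(x); x2 on both sides

x = np.zeros(2)
for k in range(200):
    # Jacobi-like would be: x_new = np.array([g1(x), g2(x)]), old values only.
    # Gauss-Seidel-like: use each new component as soon as it is available.
    x_new = x.copy()
    x_new[0] = g1(x_new)      # g1 sees the old x2
    x_new[1] = g2(x_new)      # g2 sees the freshly computed x1 (and the old x2)
    step = np.linalg.norm(x_new - x)
    x = x_new
    if step < 1e-10:
        break

print(k, x)
# A relaxation variant would set x_new[i] = x[i] + omega * (g_i(...) - x[i])
# for some omega > 0.
```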
As and when a new value gets created, I use it in the very next equation: I am solving the n equations one at a time, equation by equation. This is the Gauss-Seidel iteration, and I can write a generic form for it: for the i-th equation you use the new values for components 1 up to i - 1 and the old values for components i to n. How will you create the iteration for the relaxation method? You make one Gauss-Seidel step, choose an ω greater than 0, and form

x_i^(k+1) = x_i^(k) + ω ( x_i^GS - x_i^(k) ),

where x_i^GS is the value predicted by the Gauss-Seidel step, so that the new guess amplifies (ω > 1) or damps (ω < 1) the change predicted by that step. Here it is difficult to say anything like "convergence will occur for ω between 0 and 2": in the linear case we could give necessary and sufficient conditions for convergence, but that is not possible here. Inherently, because you use each new value as soon as it is generated, one would expect the Gauss-Seidel iterations to converge faster than the Jacobi iterations; this cannot be proved in general, but one can at least hope that the correction helps. So one can devise all these variations here. The advantage of these methods is that there is no gradient evaluation, no Jacobian calculation. The flip side is that they converge only from a good initial guess. If you have a good initial guess and the scheme works, great: you save on computation and solve the equations very fast. If not, you will have to go for gradient-based calculations.

Now I want to talk about one method which is in between. It uses gradient evaluations, but it does not build the full Jacobian; it only calculates some gradients. This method is called the Wegstein method, or the multivariate secant method. So let us move on to gradient-based methods. In the class of derivative-based methods we have already looked at Newton's method; now I am going to revisit Newton's method for the univariate case. Why I am looking at this will become clear soon: the motivation for the intermediate Wegstein method comes from univariate methods. So, the univariate Newton's method. I have f(x) = 0 with x ∈ R, a one-variable equation. If the function is differentiable, Newton's method is

x_{k+1} = x_k - f(x_k) / f'(x_k),

where f' is the derivative of f with respect to x. There is a slight variation of this classical Newton's method, called the secant method, in which the derivative f'(x_k) is replaced by an approximation built from the last two iterates.
That is, we approximate

f'(x_k) ≈ ( f(x_k) - f(x_{k-1}) ) / ( x_k - x_{k-1} ).

In a strict sense this need not be a good approximation of the derivative, because x_k - x_{k-1} need not be small; nevertheless, the method works quite well for many simple problems. To kick off the secant method you need two initial guesses, not one: starting from x_0 and x_1 you create x_2, because the derivative approximation requires two points; then from x_1 and x_2 you create x_3, from x_2 and x_3 you create x_4, and so on.

One more variation of this method is called regula falsi. Probably all these method names are familiar to you from your B.Tech background, since the one-variable Newton, secant and regula falsi methods are typically taught in the undergraduate curriculum. The variation is based on the following observation. Plot f(x) against x and look for the points where f(x) = 0, that is, the roots of the equation: the values of x for which f(x) reduces to 0. Whenever there is an interval on which f(x) has positive sign at one end and negative sign at the other, and f is a continuous function, f must cross zero somewhere in between: at least once, possibly multiple times, we do not know, but at least once. (This is only true for a scalar function of one variable; it is difficult to say something like this for a multivariable function vector.) The regula falsi method uses this idea and makes a modification to the secant method. It starts with two initial guesses x_0 and x_1 such that f(x_0) and f(x_1) have opposite signs, and as the calculations proceed it tries to maintain this: the two guesses defining the derivative approximation should always have function values of opposite sign, so that a root remains bracketed between them. If that is maintained, convergence to the solution can be faster. The modification, then, is in the choice of points: you carry out the same secant-type iteration, but whether you keep x_k or x_{k-1} when you move forward, that is, which point you use to compute the derivative approximation, is decided by checking the sign of the function value at the new iterate; you retain the old point whose function value has the opposite sign. A sketch of the secant iteration, with the regula falsi sign check indicated, follows below.
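A sketch of the secant iteration on a made-up scalar equation, with the regula falsi modification noted in the comments:

```python
# Secant iteration on the made-up example f(x) = x^3 - x - 2,
# whose single real root is near x = 1.5214.
def f(x):
    return x**3 - x - 2.0

x_prev, x_curr = 1.0, 2.0    # two initial guesses; f(1) = -2 < 0 < f(2) = 4
for k in range(50):
    # Derivative approximation from the last two iterates
    slope = (f(x_curr) - f(x_prev)) / (x_curr - x_prev)
    x_next = x_curr - f(x_curr) / slope
    if abs(x_next - x_curr) < 1e-12:
        break
    # Plain secant always advances the pair. Regula falsi would instead keep
    # whichever old point has a function value of sign opposite to f(x_next),
    # so that the root stays bracketed between the two points.
    x_prev, x_curr = x_curr, x_next

print(x_next)    # approximately 1.5214
```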
So when you go for the new iteration, you keep checking the sign and, based on it, you make a judgment as to how to proceed further. That is the regula falsi approximation. What I am going to do next is take this univariate secant method and use it to create a multivariate secant method. This multivariate secant method is called the Wegstein method. Its advantage is that the number of derivative calculations is very small, equal to the number of equations, whereas in the classical Newton's method you have to compute the full Jacobian, n × n elements, which can be quite large. If you look at software like Aspen Plus, it seems to prefer this Wegstein method, the multivariate secant method, because it works quite well. We will look at it in our next lecture.