So, we have been looking at methods for solving nonlinear algebraic equations. Initially we looked, very briefly, at successive substitutions, which are derivative-free methods requiring no gradient calculation, and then I moved to a variant of the univariate Newton's method called the secant method. The motivation for doing this is to arrive at a method which is intermediate between a univariate method and a multivariate method, called Wegstein iterations; after that we will move on to modifications of Newton's method. The derivation of Newton's method we have already done using Taylor series expansion, but there are many more modifications which are useful when implementing it in a practical scenario, so I will discuss those today.

First, the secant method. I want to solve $f(x) = 0$, $x \in \mathbb{R}$, a univariate problem. The update rule for Newton's method is

$$x^{k+1} = x^k - \frac{f(x^k)}{f'(x^k)},$$

and in the secant method you simply approximate this derivative: instead of using the exact derivative, you use the last two iterates,

$$f'(x^k) \approx \frac{f(x^k) - f(x^{k-1})}{x^k - x^{k-1}},$$

and then you find the next guess. So to kick off the secant method you need two initial guesses. A minimal sketch of this iteration is given below.
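Here is a minimal sketch of the scalar secant iteration just described; the function name, tolerance, and iteration cap are illustrative choices, not from the lecture.

```python
def secant(f, x0, x1, tol=1e-8, max_iter=100):
    """Solve f(x) = 0 by the secant method, starting from two guesses.

    The derivative in Newton's update is replaced by the slope of the
    line through the last two iterates.
    """
    f0, f1 = f(x0), f(x1)
    for _ in range(max_iter):
        slope = (f1 - f0) / (x1 - x0)   # secant approximation of f'(x^k)
        x2 = x1 - f1 / slope            # Newton-like update with the crude slope
        if abs(x2 - x1) < tol:
            return x2
        x0, f0 = x1, f1                 # shift the two-point history
        x1, f1 = x2, f(x2)
    raise RuntimeError("secant method did not converge")

# Example: a root of x^3 - x - 2 between the guesses 1 and 2
root = secant(lambda x: x**3 - x - 2, 1.0, 2.0)
```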
Now what I want to do is come up with a multivariate analog of the secant method. This multivariate secant method is known as the Wegstein method, or Wegstein iterations. I want to solve $F(x) = 0$, where $x \in \mathbb{R}^n$ and $F$ is an $n \times 1$ function vector, so we essentially have $f_i(x) = 0$ for $i = 1, \dots, n$. Let us say there is some way of arranging these equations in the form $x_i - g_i(x) = f_i(x) = 0$. The simplest way is just adding and subtracting $x_i$: starting from $f_i(x) = 0$, I can write $f_i(x) = x_i - \big(x_i - f_i(x)\big)$ and call $g_i(x) = x_i - f_i(x)$. Why am I putting it into this form? Because of the particular way this method is implemented; but what you will see soon is that the method is nothing but a multivariable analog of the secant method.

Now I am going to define a slope,

$$s_i^k = \frac{g_i(x^k) - g_i(x^{k-1})}{x_i^k - x_i^{k-1}}.$$

This is not exactly a partial derivative, but some kind of crude derivative of $g_i$ with respect to $x_i$ — the rate of change with respect to $x_i$; that is how you can look at it. Keep this $s_i^k$ aside for the time being; why I am defining it will become clear soon. Now I am going to apply the secant method to the $i$-th equation. The new value of the $i$-th component of the $x$ vector is the old value plus a correction which is like the secant method applied only to the scalar function $f_i$ — capital $F$ is the vector function, $f_i$ is its $i$-th component — with the derivative computed for the $i$-th function with respect to the $i$-th element of $x$:

$$x_i^{k+1} = x_i^k - f_i(x^k) \bigg/ \left[\frac{f_i(x^k) - f_i(x^{k-1})}{x_i^k - x_i^{k-1}}\right].$$

If you generate iterations like this, that is nothing but the Wegstein method. What is the advantage of doing it this way over Newton's method? First of all, you do not require explicit derivative calculations; this is only an approximation. How many such derivative approximations do you need? Equal to the number of equations, $n$. Compare with Newton's method, where you need to compute the Jacobian, which has $n \times n$ elements: if you have 100 equations to solve, you have to compute a $100 \times 100$ array of derivatives; if you have 1000 equations, even computed numerically, it is $1000 \times 1000$ — a huge number of calculations — whereas here you have far fewer. So this is somewhere in between: it does use some kind of rate information, but not all possible rates; some partial rate information is used. That makes it computationally more attractive, because it requires fewer calculations.

Now, this is not the way the method is normally reported or implemented; a slight variation is done, not in the formula but in the way it is implemented. Since $f_i = x_i - g_i$, I am going to substitute that into the derivative approximation. Let us first develop a slightly simplified notation so that things become cleaner: define $g_i^k = g_i(x^k)$, so that $g_i^{k-1}$ means $g_i$ evaluated at $x^{k-1}$, and so on. With this notation the derivative approximation becomes

$$\frac{\big(x_i^k - g_i^k\big) - \big(x_i^{k-1} - g_i^{k-1}\big)}{x_i^k - x_i^{k-1}} = 1 - s_i^k,$$

and the update formula can be rewritten as

$$x_i^{k+1} = x_i^k - \big(x_i^k - g_i^k\big)\,\frac{1}{1 - s_i^k}.$$

It is the same formula, just rewritten in a different form. If you were to implement the earlier formula as it is, that is fine — nothing wrong with it; that by itself is the Wegstein method. Why I am doing this rearrangement is because I want to clamp some values, to introduce some heuristics into my iterations.
I can rearrange this as follows — just rearranging that equation, nothing else. I am going to define another variable,

$$\omega_i^k = \frac{1}{1 - s_i^k},$$

and in terms of $\omega_i^k$ I am going to rewrite the update as

$$x_i^{k+1} = \big(1 - \omega_i^k\big)\, x_i^k + \omega_i^k\, g_i^k.$$

The reason I am doing this is to draw a parallel with successive substitution with relaxation: this is like relaxation iterations, except that the $\omega$ in the relaxation method is typically fixed, whereas here $\omega$ is changing from one iteration to the next — a time-varying $\omega$. And now, as a kind of thumb rule to stabilize the computations, we typically try to restrict $\omega$ between $0$ and some $\alpha$; a suggested value for $\alpha$ is 5, so typically you restrict $\omega$ between 0 and 5. So you are doing something like the relaxation method — successive substitutions with a variable $\omega$ — and you restrict $\omega$ to lie between 0 and 5. All this jugglery I have done because I wanted to impose this limit; otherwise I could have used the earlier form directly. This multivariate version of the secant method, together with the limit imposed on $\omega$, is what is called Wegstein iterations. It is a very popular method: if you go to many of the plant-wide steady-state simulation softwares like Aspen or HYSYS, you will find that one of the options they offer is Wegstein iterations.

Wegstein iterations can be performed even if the functions $f_i$ are not differentiable; you do not require differentiability here, you only require evaluation of the function at two points and a divided difference. You may have discontinuities when you are writing equations for some unit operation — different equations in different regions depending upon the operating regime: if flooding occurs you may have one equation, under normal operation a different equation. In chemical plants you can have these kinds of situations, so we actually often prefer this as an in-between way of doing calculations, between completely gradient-based and completely gradient-free. Given a large-scale problem, what I would do is first try the Wegstein method; if it works, great; if it does not, I should probably look for something else — because this is computationally more friendly. A minimal sketch of the clamped iteration is given below. Is this clear? Any doubts about this?
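This is a minimal sketch of the clamped Wegstein iteration derived above, written for the fixed-point form $x = g(x)$; the function name, the zero-denominator guard, the convergence test, and taking one plain substitution step to start are my own choices, while the clamp $0 \le \omega \le \alpha$ with $\alpha = 5$ follows the suggested value.

```python
import numpy as np

def wegstein(g, x0, alpha=5.0, tol=1e-8, max_iter=200):
    """Wegstein iterations for the fixed-point form x = g(x), g: R^n -> R^n.

    Each component gets its own secant-like slope s_i (a crude derivative
    of g_i with respect to x_i) and relaxation factor w_i = 1/(1 - s_i),
    clamped to [0, alpha].
    """
    x_old = np.asarray(x0, dtype=float)
    g_old = np.asarray(g(x_old), dtype=float)
    x = g_old.copy()                     # first move: plain successive substitution
    for _ in range(max_iter):
        g_new = np.asarray(g(x), dtype=float)
        dx = x - x_old
        dx[dx == 0.0] = 1e-12            # guard against zero denominators
        s = (g_new - g_old) / dx         # crude derivative of g_i w.r.t. x_i
        w = np.clip(1.0 / (1.0 - s), 0.0, alpha)   # time-varying relaxation factor
        x_new = (1.0 - w) * x + w * g_new
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x_old, g_old = x, g_new
        x = x_new
    raise RuntimeError("Wegstein iterations did not converge")
```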
Okay, let us move on now to Newton's method, which we have derived from multivariate Taylor series approximations; we also did some exercises in which we solved problems using it, so you already know something about Newton's method. What more is there to it? I want to solve $F(x) = 0$, $x \in \mathbb{R}^n$. To make my writing simple, let us define the Jacobian at iteration $k$,

$$J^k = \left.\frac{\partial F}{\partial x}\right|_{x = x^k},$$

and one more piece of notation: $F^k$ is the function vector $F$ evaluated at $x^k$. I am introducing this so that my subsequent derivations become simpler in notation; just remember that the superscript $k$ means "evaluated at $x^k$". So the formula which you have implemented is

$$x^{k+1} = x^k + \Delta x^k, \qquad J^k\, \Delta x^k = -F^k,$$

where $\Delta x^k$ is computed by solving this linear algebraic equation. This is what you know right now as Newton's method, isn't it? Well, one variant which is often implemented, rather than solving that equation directly, is to make the equation well-conditioned by premultiplying it by $(J^k)^T$:

$$(J^k)^T J^k\, \Delta x^k = -(J^k)^T F^k.$$

In fact, in the computing tutorial I gave you, I asked you to modify the step like this and then solve the resulting problem using the Gauss-Seidel method. Why? What is the reason? The coefficient matrix becomes positive definite, and the Gauss-Seidel method is guaranteed to converge for positive definite matrices; in general, positive definite matrices are well-conditioned and easier to work with. So instead of the raw step, we often implement this step; a sketch of it is given below.
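A sketch of one Newton iteration in this normal-equation form follows. The function name is illustrative, and I am using NumPy's direct solver as a stand-in where the tutorial used Gauss-Seidel iterations on the positive definite system.

```python
import numpy as np

def newton_step(F, J, xk):
    """One Newton step for F(x) = 0, in normal-equation form.

    Instead of solving J dx = -F directly, premultiply by J^T so that
    the coefficient matrix J^T J is symmetric positive definite (for
    nonsingular J); an iterative solver such as Gauss-Seidel is then
    guaranteed to converge on it.
    """
    Jk = J(xk)                      # Jacobian evaluated at x^k
    Fk = F(xk)                      # function vector evaluated at x^k
    A = Jk.T @ Jk                   # well-conditioned, positive definite system
    b = -Jk.T @ Fk
    dx = np.linalg.solve(A, b)      # or Gauss-Seidel, as in the tutorial
    return xk + dx
```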
There is one more modification, called the damped Newton method. What we do here is change the update equation to

$$x^{k+1} = x^k + \lambda\, \Delta x^k,$$

where $\lambda$ is chosen between 0 and 1. What is the reason? First of all, remember that when you took the Newton step, it was based on a local linearization of the function vector at $x^k$: this $J^k$ is a local Jacobian, a local derivative. Now, what should actually happen when you take this step is that the function vector should reduce, because I want to go to $F(x) = 0$: when I go from $x^k$ to $x^{k+1}$, the function vector should shrink. But a step based on linearization may not ensure that the function actually reduces at the new point, because you have made a local approximation of a nonlinear function and decided your move based on that approximation alone; this may not lead to a good value of $x^{k+1}$. I should move towards the solution, shouldn't I? But what is the guarantee that a decision based on the local slope alone will lead to a decrease, a smaller value of $F$? To put it in slightly more mathematical words, I want

$$\big\|F^{k+1}\big\| < \big\|F^k\big\|.$$

Do you agree with me? When I take a new step, the function vector evaluated at the new point $x^{k+1}$ should actually be smaller in norm than at $x^k$, since I finally want to reach $F(x) = 0$. This may not happen if I set $\lambda = 1$, because $\Delta x$ has been determined using the local slope. So what is the way out? Essentially, I choose a $\lambda$, by some means, such that this condition is met. Let me give you a very crude way of doing it. First I check $\lambda = 1$: if the condition is satisfied for $\lambda = 1$, I am happy; I accept $\Delta x$, choose $\lambda = 1$, and proceed with the next iteration. If not, I reduce $\lambda$ — say to 0.9, for example. Note that $\Delta x$ is fixed; I am not going to change it, since it has been found using the Jacobian. I only reduce $\lambda$ and check whether the condition is satisfied; if not, I reduce $\lambda$ further, and I go on reducing it until the condition is satisfied. The moment I get a $\lambda$ for which the condition is met, I take that step. One algorithm for doing this selection is called the Armijo line search; I am not going to go into its details — it is given in the notes, in Table 1, the damped Newton algorithm with Armijo line search. Those are the implementation details of how you iteratively select $\lambda$; a crude backtracking sketch is given below.
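This is a crude backtracking sketch of the "reduce $\lambda$ until the norm decreases" rule just described — not the Armijo line search from Table 1 of the notes. The shrink factor 0.9 mirrors the example above, while the function name, tolerance, and the floor on $\lambda$ are my own safeguards.

```python
import numpy as np

def damped_newton(F, J, x0, tol=1e-8, max_iter=50):
    """Damped Newton method: x^{k+1} = x^k + lambda * dx^k, lambda in (0, 1]."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        Fx = F(x)
        if np.linalg.norm(Fx) < tol:
            return x
        dx = np.linalg.solve(J(x), -Fx)   # full Newton direction, held fixed below
        lam = 1.0                         # first try the undamped step
        while np.linalg.norm(F(x + lam * dx)) >= np.linalg.norm(Fx):
            lam *= 0.9                    # shrink lambda; dx is not recomputed
            if lam < 1e-4:                # crude floor: accept a tiny step and move on
                break
        x = x + lam * dx
    raise RuntimeError("damped Newton did not converge")
```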
What I want to do on the board is not the algorithm for selecting $\lambda$ — that is a matter of implementation — but the rationale for why this is done. We choose $\lambda < 1$ because we are using a Taylor series approximation, and a Taylor series approximation is actually valid only in a small neighborhood; this $\lambda$ helps us find out what that neighborhood is, where the approximation should be applied. So I am going to look at the function

$$\phi\big(x^{k+1}\big) = \tfrac{1}{2}\big(F^{k+1}\big)^T F^{k+1} = \tfrac{1}{2}\big\|F^{k+1}\big\|_2^2 = \tfrac{1}{2}\big\|F\big(x^k + \lambda\, \Delta x^k\big)\big\|_2^2,$$

where $F$ is my function vector. (As an aside: the raw method which we studied in the beginning, and which we implemented for some simple problems, works only for simple cases; to make it work for large, complex problems you have to do all these kinds of tricks.) Now $\phi$ is a scalar function, and I am going to expand it in the neighborhood of $x^k$. This Taylor's theorem is like a foundation — it helps you everywhere you move in applied engineering mathematics; it is one of the cornerstones. When I write the expansion, what is unknown here is only $\lambda$: $\Delta x^k$ has already been calculated and $x^k$ is known, so this vector is known; I am only worried about choosing $\lambda$ correctly. Expanding as a function of $\lambda$,

$$\phi\big(x^{k+1}\big) = \phi\big(x^k\big) + \lambda\, \frac{\partial \phi}{\partial \lambda} + \frac{\lambda^2}{2}\, \frac{\partial^2 \phi}{\partial \lambda^2} + \cdots$$

Now, what should happen? Should the quantity $\phi(x^{k+1}) - \phi(x^k)$ be positive or negative? It should be negative: look here — this is nothing but the difference of the squared norms we were just discussing, and the factor of one-half, being positive, is not going to make any difference. We also know that $\Delta x^k = -(J^k)^{-1} F^k$. The first-order term $\lambda\, \partial\phi/\partial\lambda$ is the same as $\lambda\, \nabla\phi(x^k)^T \Delta x^k$ — just check this; I am doing the derivatives in succession, the chain rule $\frac{\partial \phi}{\partial x} \cdot \frac{\partial x}{\partial \lambda}$, and I am just skipping the in-between step.
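Working out that first-order term — my filling-in of the skipped step, using $\nabla\phi(x^k) = (J^k)^T F^k$ and $\Delta x^k = -(J^k)^{-1} F^k$ from above:

$$\lambda\, \nabla\phi\big(x^k\big)^T \Delta x^k = \lambda\, \big[(J^k)^T F^k\big]^T \big[-(J^k)^{-1} F^k\big] = -\lambda\, \big(F^k\big)^T J^k (J^k)^{-1} F^k = -\lambda\, \big\|F^k\big\|_2^2 < 0.$$

Since this first-order term is strictly negative for any $\lambda > 0$, a sufficiently small $\lambda$ makes the higher-order terms negligible and guarantees $\phi(x^{k+1}) < \phi(x^k)$ — which is exactly why a small enough damped Newton step always reduces the norm of $F$.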