Today we will start non-linear estimation, and here is the content of this topic: linear models versus non-linear models, a detailed definition of non-linear models, and then least squares in the non-linear case, that is, how to fit a non-linear model using the least-squares technique. Given a set of observations (x_i, y_i), the starting point is the simple linear regression model or, if we have more than one regressor, the multiple linear regression model. Those are linear models, and you must have realized that the polynomial regression model is also a linear model. So let me give the precise definitions of linear and non-linear models. Models that are linear in the parameters are called linear models. This is the important point: linear in the parameters, not in the regressors. You know the multiple linear regression model, y = β0 + β1 z1 + β2 z2 + ... + β_{k−1} z_{k−1} + ε, where each z_i can be any function of the basic regressor variables. What I mean by this is: if the basic regressors are x1 and x2, then y = β0 + β1 (x1 − x2) + β2 (x1 − x2)² + ε is a linear model, with z1 = x1 − x2 and z2 = (x1 − x2)². The z's can be any functions of the basic regressors, but the model is linear in the parameters β0, β1 and β2, so it is called a linear model. Then what do we mean by a non-linear model? Models that are non-linear in the parameters are called non-linear models. Let me give an example: y = e^(θ1 + θ2 t) + ε.
Here y is the response variable, and in the non-linear case we generally denote the regressor variable by t instead of x. This is a non-linear model because it is non-linear in θ1 and θ2. Let me give one more example: y = [θ1/(θ1 − θ2)] [e^(−θ2 t) − e^(−θ1 t)] + ε. It is very clear that this model is non-linear in the parameters θ1 and θ2; here t is the regressor variable and the θ's are the parameters. Now call these models (1) and (2). Both are non-linear in the sense that they involve the parameters θ1 and θ2 in a non-linear way, and that is why these two models are called non-linear. Now notice that model (1) can be transformed: taking log base e, ln y = θ1 + θ2 t + ε (treating the error term loosely). So once you take this transformation, the non-linear model becomes linear — linear in the data (ln y, t). Non-linear functions of this type are called intrinsically linear: model (1) is intrinsically linear because you can very easily transform it to a linear model by taking a suitable transformation. Well, so we understood what a linear model is and what a non-linear model is; next is the notation for the linear and non-linear cases, since given a non-linear model we are going to estimate its parameters. For the response variable in the linear case, we represent it by y.
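As a small check on the intrinsically linear idea, here is a sketch in Python. The data, true parameter values, and variable names are illustrative, not from the lecture: we generate noiseless data from y = e^(θ1 + θ2 t), take logs, and fit ln y against t by ordinary least squares to recover θ1 and θ2.

```python
import math

# Hypothetical data generated from y = exp(theta1 + theta2 * t)
# (noiseless here, so the transformed fit recovers the parameters exactly).
theta1_true, theta2_true = 0.5, -0.3
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(theta1_true + theta2_true * ti) for ti in t]

# Transform: ln y = theta1 + theta2 * t  -- now a simple linear regression.
ln_y = [math.log(yi) for yi in y]
n = len(t)
t_bar = sum(t) / n
l_bar = sum(ln_y) / n

# Ordinary least-squares slope and intercept on the transformed data.
theta2_hat = sum((ti - t_bar) * (li - l_bar) for ti, li in zip(t, ln_y)) \
             / sum((ti - t_bar) ** 2 for ti in t)
theta1_hat = l_bar - theta2_hat * t_bar
```

With noiseless data the transformed fit gives back θ1 = 0.5 and θ2 = −0.3 up to floating-point error; with real noisy data the log transform also changes the error structure, which is why the transformation is only used when that is acceptable.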
So y generally stands for the response variable in the linear case, and in the non-linear case y also stands for the response variable. For the subscript, in the linear case we write y_i with i from 1 to n; here, instead of y_i, I will say y_u, with u also from 1 to n, so n observations. In the case of linear models, the standard notation for the regressor variables is x1, x2, ..., x_{k−1}; here we will be using t1, t2, ..., tk. In the linear case the parameters are β0, β1, ..., β_{k−1}, and in the non-linear case we use θ1, θ2, ..., θp. Well, now let me start with the general non-linear model, y = f(t1, t2, ..., tk; θ1, θ2, ..., θp) + ε. This is a non-linear model: f is non-linear in the parameters θ1, θ2, ..., θp. In vector notation we write t = (t1, t2, ..., tk)′ and θ = (θ1, θ2, ..., θp)′, and then we can write y = f(t, θ) + ε, or also E(y) = f(t, θ), if we assume E(ε) = 0. We also assume that the errors are uncorrelated, that Var(ε) = σ², and that the error vector follows N(0, σ²I), so the errors are independent. These are the basic assumptions we make in the case of the simple or multiple linear regression model, and the same thing is true here: we make the same assumptions for the non-linear model. Suppose we have n observations (y_u, t_u), like the (y_i, x_i), i = 1 to n, we had before; the notational difference is that u runs from 1 to n.
Then we can write the u-th observation as y_u = f(t_u, θ) + ε_u. Now we have the model, and we have to fit it: given the set of n observations, fitting this model means estimating the parameters. Using the least-squares technique, we consider the error or residual sum of squares, call it S(θ) = Σ_u [y_u − f(t_u, θ)]². If you put a hat on θ, then y_u is the observed response and f(t_u, θ̂) is the estimated response, so y_u − f(t_u, θ̂) is the u-th residual and S(θ̂) is the sum of the squared residuals. This is the quantity we want to minimize in order to find the least-squares estimate. So to find the least-squares estimate θ̂, we need to differentiate S(θ) with respect to θ. Since θ is a vector with p components, we differentiate with respect to θ1, θ2, ..., θp, and from there we get p normal equations. The i-th normal equation is Σ_u [y_u − f(t_u, θ)] [∂f(t_u, θ)/∂θ_i] evaluated at θ = θ̂, set equal to 0. If you differentiate with respect to θ_j, you get the j-th normal equation, and this way you get all p normal equations. Now you should realize that when f(t_u, θ) is linear in θ, the partial derivative ∂f(t_u, θ)/∂θ_i is a function of t_u only.
If the model is linear in the θ's, then when you differentiate, the derivative does not involve θ_i: it is a function of the regressors only, independent of θ. But when the model is non-linear, meaning f is non-linear in the θ's, so are the normal equations. So if the model is linear, the partial derivatives are independent of θ, but if the model is non-linear, the normal equations are also non-linear. Let me give an example. Consider the model y = f(θ, t) + ε, where f(θ, t) = e^(−θt); that is, we are considering the model y = e^(−θt) + ε with only one regressor t. What is S(θ) here? It is S(θ) = Σ_u [y_u − e^(−θ t_u)]², and we want to estimate θ in such a way that this is minimum. So we differentiate: the single normal equation is obtained by differentiating S(θ) with respect to θ and setting it equal to 0, which gives Σ_u [y_u − e^(−θ t_u)] t_u e^(−θ t_u) = 0, since the derivative of e^(−θ t_u) with respect to θ is −t_u e^(−θ t_u) and the minus sign cancels. Expanding, the normal equation is Σ_u y_u t_u e^(−θ t_u) − Σ_u t_u e^(−2θ t_u) = 0. So what I want to say here is that if the model is non-linear, the normal equations are also non-linear: you can see that this normal equation is non-linear in θ. Finding the θ that satisfies this equation — that θ is θ̂ — is not easy.
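To see that the normal equation really has no convenient closed form, here is a sketch that solves it numerically for this exponential model. The data and the bisection bracket are illustrative assumptions; the point is only that the equation g(θ) = Σ_u (y_u − e^(−θ t_u)) t_u e^(−θ t_u) = 0 must be solved by a root-finding method rather than by algebra.

```python
import math

# Hypothetical noiseless data from y = exp(-theta * t) with theta = 0.8,
# so the normal equation should be satisfied exactly at theta = 0.8.
theta_true = 0.8
t = [0.5, 1.0, 1.5, 2.0, 2.5]
y = [math.exp(-theta_true * ti) for ti in t]

def normal_eq(theta):
    # g(theta) = sum_u (y_u - e^{-theta t_u}) * t_u * e^{-theta t_u}
    return sum((yu - math.exp(-theta * tu)) * tu * math.exp(-theta * tu)
               for tu, yu in zip(t, y))

# The equation is non-linear in theta; solve it by bisection on a bracket
# [0.1, 2.0] that is assumed (for this toy data) to contain the root.
lo, hi = 0.1, 2.0
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if normal_eq(lo) * normal_eq(mid) <= 0:
        hi = mid          # sign change in [lo, mid]: root is there
    else:
        lo = mid          # otherwise the root is in [mid, hi]
theta_hat = 0.5 * (lo + hi)
```

Because the data are noiseless, the bisection converges to the generating value θ = 0.8; with noisy data the root would sit near, but not at, the true parameter.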
So now, how do we estimate the parameters of non-linear systems? Notationally it may look very difficult, but the idea is simple. We are given the model y_u = f(t_u, θ) + ε_u, which is non-linear in θ. As you must have observed, if the model is non-linear then the normal equations are also non-linear, and solving a non-linear system of equations is difficult. So what we will do is approximate this non-linear function by a linear function using a Taylor series. For that, let me talk about the Taylor series. As you know, the Taylor series of a real or complex function f(x) that is infinitely differentiable in a neighborhood of a point a is f(x) = f(a) + [f′(a)/1!](x − a) + [f″(a)/2!](x − a)² + ... . Here we are expressing the function as a polynomial. Similarly, we will approximate our non-linear function by a linear function, keeping only the terms up to the linear one. So let θ1^0, θ2^0, ..., θp^0 be initial values for the parameters θ1, θ2, ..., θp, and carry out the Taylor series expansion of the non-linear function f(t_u, θ) about, that is, in the neighborhood of, the point θ^0 = (θ1^0, θ2^0, ..., θp^0)′.
We take the series only up to the linear term, because we are looking for a linear approximation of the non-linear function. So here is the Taylor series expansion: f(t_u, θ) ≈ f(t_u, θ^0) + Σ_{i=1}^{p} [∂f(t_u, θ)/∂θ_i] |_{θ = θ^0} (θ_i − θ_i^0), where each derivative is evaluated at the point θ = θ^0. You must have understood that this is the Taylor expansion of the non-linear function up to the linear term. Now let me use some more notation. Set f_u^0 = f(t_u, θ^0), β_i^0 = θ_i − θ_i^0, and z_{iu}^0 = [∂f(t_u, θ)/∂θ_i] |_{θ = θ^0}. If you use all this notation, the linear approximation can be written as f(t_u, θ) ≈ f_u^0 + Σ_{i=1}^{p} z_{iu}^0 β_i^0. So this is the linear approximation of the function that was non-linear in θ; here it is linear in terms of the β's. Well, recall the model we started with: y_u = f(t_u, θ) + ε_u. Now if I plug this linear approximation of the non-linear function in here, what I get is y_u − f_u^0 = Σ_{i=1}^{p} z_{iu}^0 β_i^0 + ε_u. Now you can see that this is a linear model in the β's, of the same form as the multiple linear regression model, so in this model we can estimate the transformed parameters β_i^0 using the least-squares technique.
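To make the linearization concrete, here is a sketch for the one-parameter model f(t, θ) = e^(−θt), with an assumed initial value θ^0 and a nearby θ. It compares the exact function value against the linear (first-order Taylor) approximation f(t, θ^0) + z·(θ − θ^0), where z is the derivative evaluated at θ^0; all numbers are illustrative.

```python
import math

# One-parameter model f(t, theta) = exp(-theta * t), expanded about theta0.
def f(t, theta):
    return math.exp(-theta * t)

def z(t, theta0):
    # z_u^0 = d f(t, theta)/d theta at theta = theta0  ->  -t * exp(-theta0*t)
    return -t * math.exp(-theta0 * t)

theta0 = 1.0          # assumed initial value
theta = 1.05          # a point near theta0
tu = 2.0              # one illustrative regressor value

# First-order Taylor approximation: f(t, theta0) + z * (theta - theta0)
approx = f(tu, theta0) + z(tu, theta0) * (theta - theta0)
exact = f(tu, theta)
error = abs(exact - approx)
```

Near θ^0 the linear approximation tracks the exact value closely, and in particular beats the zeroth-order guess f(t, θ^0) alone; farther from θ^0 the approximation degrades, which is why the method must iterate rather than stop after one expansion.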
So we can now estimate β_i^0 for i = 1 to p by applying the least-squares technique. Let us see how to do this. We have the linear model y_u − f_u^0 = Σ_{i=1}^{p} z_{iu}^0 β_i^0 + ε_u. We started with a non-linear model, obtained a linear model by the Taylor series approximation, and now we solve this linear model. We can write it in matrix form. Write the n × p coefficient matrix Z_0, whose first row is (z_{11}^0, ..., z_{p1}^0) and whose last row is (z_{1n}^0, ..., z_{pn}^0), the coefficient vector β^0 = (β_1^0, β_2^0, ..., β_p^0)′, and the response vector y_0 = (y_1 − f_1^0, ..., y_n − f_n^0)′. Using this matrix notation, the model is basically the multiple linear regression model, and I can write it in the form y_0 = Z_0 β^0 + ε. Then the least-squares estimate of β^0, which we know how to compute, is b_0 = (Z_0′ Z_0)^{−1} Z_0′ y_0. This is exactly the (X′X)^{−1} X′y formula from multiple linear regression, with Z_0 playing the role of X and y_0 the role of y. So, writing everything in vector notation, we now have the fitted model.
Of course, this is not the final estimate: we wanted to estimate θ, and we are estimating β, where β^0 is essentially θ − θ^0, and what we will do is improve this estimate iteratively. The vector b_0, the estimate of β^0, minimizes Σ_u [y_u − f_u^0 − Σ_{i=1}^{p} β_i^0 z_{iu}^0]², which is nothing but the (linearized) S(θ), with respect to the β_i^0, where β_i^0 = θ_i − θ_i^0. We need to understand this part. We want to estimate θ1, θ2, ..., θp. We started with some θ^0, used the Taylor expansion about that point to make the non-linear function linear, and after making the model linear we estimated the difference β_i^0 = θ_i − θ_i^0; that estimate is b_i^0. So we started with θ_i^0 as an estimate of θ_i, and we have estimated the difference θ_i − θ_i^0 by b_i^0. Now what I am trying to say is that we started with the initial point and then we try to improve the estimate iteratively. So let me put a superscript 1 here: the estimate of θ_i at the first iteration is θ_i^1 = θ_i^0 + b_i^0. We started with θ_i^0, and we improved it to θ_i^1, the revised best estimate of θ_i. So what we will do now is place it in the same role and repeat.
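One update step of this scheme can be sketched for the one-parameter model f(t, θ) = e^(−θt). With a single parameter, (Z_0′Z_0)^{−1} Z_0′ y_0 collapses to a scalar ratio of sums. The data and starting value below are assumed for illustration.

```python
import math

# One iteration of the linearization estimate for f(t, theta) = exp(-theta*t).
theta_true = 0.8
t = [0.5, 1.0, 1.5, 2.0, 2.5]
y = [math.exp(-theta_true * ti) for ti in t]    # noiseless toy data

theta0 = 0.5                                     # assumed initial guess
f0 = [math.exp(-theta0 * ti) for ti in t]        # f_u^0 = f(t_u, theta0)
z0 = [-ti * math.exp(-theta0 * ti) for ti in t]  # z_u^0 = df/dtheta at theta0

# b0 = (Z0'Z0)^{-1} Z0'(y - f0); scalar case, so just a ratio of sums.
b0 = sum(zi * (yi - fi) for zi, yi, fi in zip(z0, y, f0)) \
     / sum(zi * zi for zi in z0)
theta1 = theta0 + b0                             # first-iteration estimate
```

Starting from θ^0 = 0.5, the single step moves the estimate substantially toward the generating value 0.8, which is exactly the behavior the iteration relies on.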
So we now place θ^1 in the same role as θ^0 and go through the same procedure, which leads to another revised estimate θ^2, and so on. Let me give a little summary of what I am doing here. We have a model that is non-linear in θ. First we take an initial estimate θ^0; then we consider the Taylor series expansion of the non-linear function about θ^0 and approximate the non-linear function by a linear function. Once we have the transformation from non-linear to linear, we can use the results of linear regression: estimating the linear model by the least-squares technique takes us from θ^0 to θ^1. Now we put the revised estimate θ^1 in the same role θ^0 played initially: we again take the Taylor series expansion of the non-linear function, this time about θ^1, transform the non-linear model to a linear model, apply the results of linear regression, and from θ^1 we get θ^2, and so on. So after the j-th iteration, what we get is θ^{j+1} = θ^j + b_j; that is, at the (j+1)-th iteration we improve θ^j by the same technique, and the improved, revised estimate is θ^{j+1}. We know what b_j is: b_j = (Z_j′ Z_j)^{−1} Z_j′ (y − f_j).
Here Z_j is the n × p matrix with entries z_{iu}^j, where — I am sure you understand this notation; the j is a superscript label, not a power — z_{iu}^j is the derivative of the non-linear function, ∂f(t_u, θ)/∂θ_i, evaluated at θ = θ^j, so we are using the result of the j-th iteration; f_j = (f_1^j, f_2^j, ..., f_n^j)′ with f_u^j = f(t_u, θ^j); and θ^j = (θ_1^j, θ_2^j, ..., θ_p^j)′. So you understood: at every iteration we are improving the estimate, and there should be some stopping criterion. The iterative process continues until |θ_i^{j+1} − θ_i^j| < δ for every i, where δ is some small pre-specified quantity, say for example 0.0001. When you see that the difference between the results obtained from the (j+1)-th iteration and the j-th iteration is very small, there is no significant change, and you can stop there. So that is all about non-linear estimation. You understood what a non-linear model is — non-linear means non-linear in the parameters θ1, θ2, ..., θp — and given a non-linear model, you now know how to approximate it by a linear model using the Taylor series expansion, and you also know how to estimate the parameters of a non-linear model using the least-squares technique. That is all for today. Thank you.
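The whole iterative procedure, including the stopping criterion, can be sketched end to end for the one-parameter model f(t, θ) = e^(−θt). The data, the starting value θ^0, and the tolerance δ are illustrative assumptions; each pass of the loop is the linearize-and-solve step described above.

```python
import math

# Full iterative scheme for f(t, theta) = exp(-theta * t): repeat the
# linearize-and-solve step until successive estimates differ by less
# than a small pre-specified delta.
theta_true = 0.8
t = [0.5, 1.0, 1.5, 2.0, 2.5]
y = [math.exp(-theta_true * ti) for ti in t]     # noiseless toy data

delta = 1e-8                                      # stopping tolerance
theta_j = 0.3                                     # assumed initial estimate
for _ in range(100):                              # safety cap on iterations
    f_j = [math.exp(-theta_j * ti) for ti in t]   # f_u^j at current estimate
    z_j = [-ti * math.exp(-theta_j * ti) for ti in t]  # derivatives at theta^j
    # b_j = (Z_j' Z_j)^{-1} Z_j' (y - f_j); scalar case -> ratio of sums.
    b_j = sum(zi * (yi - fi) for zi, yi, fi in zip(z_j, y, f_j)) \
          / sum(zi * zi for zi in z_j)
    theta_next = theta_j + b_j                    # theta^{j+1} = theta^j + b_j
    converged = abs(theta_next - theta_j) < delta # |theta^{j+1} - theta^j| < delta
    theta_j = theta_next
    if converged:
        break
```

With noiseless data the iteration settles on θ = 0.8. In practice the starting value matters: a poor θ^0 can make this basic scheme diverge, which is why production solvers add safeguards such as step damping.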