So we have been looking at optimal state estimation, and we have arrived at this stochastic state-space model, where w_k and v_k are zero-mean white noise stochastic processes. This is the classic example of a system in which x_k will be a correlated stochastic process. Why is it correlated? Because x_k is correlated with x_{k-1} through the difference equation, x_{k-1} is correlated with x_{k-2}, x_{k-2} with x_{k-3}, and so on; there is correlation between x_k and x_{k-1}, x_k and x_{k-2}, x_k and x_{k-3}. So even though w_k and v_k are white noise processes, x_k is not. Here we are assuming that we know the characteristics of w_k and v_k: both are assumed to be zero-mean white noise processes, and we know their covariances. So this is my model. Given this model, I want to come up with optimal estimates of the states x in such a way that I use the information about these two noise sequences; I have the covariance of w_k, but I do not know the exact value of w_k. See, there is a difference between these two inputs: u_k is a known input, whereas w_k is an unknown input. We do not have any measurement of w_k, but we do have a characterization of it as a noise, in terms of its mean and variance. So we have some model for it; it is not that we are completely blind about what is happening in terms of unknown inputs.
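As a concrete sketch of this model class, here is a small simulation under assumed numbers (the matrices Phi, Gamma, C and the covariances Q, R below are illustrative choices, not values from the lecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2-state model (all numerical values are assumptions):
#   x_k = Phi x_{k-1} + Gamma u_{k-1} + w_{k-1},   y_k = C x_k + v_k
Phi = np.array([[0.9, 0.1],
                [0.0, 0.8]])
Gamma = np.array([[0.5], [1.0]])
C = np.array([[1.0, 0.0]])
Q = 0.05 * np.eye(2)   # covariance of the state noise w_k
R = np.array([[0.1]])  # covariance of the measurement noise v_k

def simulate(n_steps, x0):
    """Roll the stochastic difference equation forward; return states and measurements."""
    x = x0
    xs, ys = [], []
    for k in range(n_steps):
        u = np.array([[1.0]])  # known deterministic input u_k
        w = rng.multivariate_normal(np.zeros(2), Q).reshape(2, 1)  # unknown input w_k
        v = rng.multivariate_normal(np.zeros(1), R).reshape(1, 1)  # measurement noise v_k
        x = Phi @ x + Gamma @ u + w
        y = C @ x + v
        xs.append(x)
        ys.append(y)
    return np.hstack(xs), np.hstack(ys)

xs, ys = simulate(200, np.zeros((2, 1)))
```

Note that only `ys` would be available to an estimator; the states `xs` are kept here just so the simulation can be checked.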
So now, how do I systematically incorporate the information in these covariances while doing state estimation? How do I optimally estimate the states? My task is as follows. My measurements are corrupted with noise, and there is an input w_k entering the state dynamics that I have no measurement of. But remember, the effect of w_k will percolate to y_{k+1}, to y_{k+2}, and so on: the effect of the state disturbance is present in the future measurements. When I do state estimation, I want to somehow account for w_k, because it is an input driving the state dynamics; I had better account for it. At the same time, v_k is measurement noise; it is pure dirt in my data, and I want to remove it. So even though it is customary to call both of these "noise", the state noise w_k and the measurement noise v_k play different roles: v_k is unwanted, whereas for w_k you would like to have an estimate, even though you have no measurement of it, and compensate the state estimation for it. This is very important: I want to compensate my state estimate for w_k, and I want to use the information in Q and R while doing so. Q represents the uncertainty in the unknown input, and R quantifies the uncertainty, or the variability, of the measurement. The question is: how do I design an optimal estimator? That is what we are looking at.

First of all, w_k and v_k are stochastic processes, and by virtue of the fact that the difference equation is driven by stochastic inputs, x_k is also a stochastic process: x_k is a function of u_k and w_k, and since w_k is stochastic, there is randomness in x_k. Moreover, as I just argued, through the difference equation x_k is going to be colored noise. What is colored noise? Time-correlated noise. If you compute the autocorrelation between x_k and x_{k-1}, or x_k and x_{k-2}, it will be nonzero. Compare this with the autocorrelation function of white noise: it equals the covariance at lag 0, and at any other lag the correlation is zero. That is not going to be the case for x_k; x_k is a stochastic process that is correlated in time, and we have to uncover its characteristics. That is one of the main things here. So what I want to do is somehow link the statistical characteristics of x_k with those of w_k and v_k. That is my next task. Before we reach the final point I have to do a lot of preliminaries, and you have to bear with me for some time until the entire picture becomes clear: I am creating intermediate results, keeping them aside, and then I will combine everything into the Kalman filter development.

First, I define this set: {y_0, u_0, y_1, u_1, ..., y_k, u_k}, the set of measurements that I have collected over time. I am going to call this set Y superscript k. This is not y raised to the power k; it is just notation, meaning the set of all data collected from time 0 to time k. If I write Y^{k+1}, that means data collected from time 0 to time k+1; Y^{k-1} means data collected from time 0 to time k-1. That is the notation. Now, when we build an observer we use the measurements to correct the states, right? More precisely, I use the measurement information to correct the state estimate, not the states themselves; the true states cannot be corrected. The estimate of the state is corrected using feedback from the measurements. What you can show is that this estimate of x will be a function of both w_k and v_k.
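The whiteness-versus-color claim can be checked numerically. The sketch below uses an assumed scalar system x_k = phi * x_{k-1} + w_{k-1} with phi = 0.9 (an illustrative value) and estimates lag-1 autocorrelations: near zero for the white input w, near phi for the colored state x:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000
w = rng.standard_normal(n)  # zero-mean white noise w_k

# Scalar illustration (phi = 0.9 is an assumed value): x_k = phi * x_{k-1} + w_{k-1}
phi = 0.9
x = np.zeros(n)
for k in range(1, n):
    x[k] = phi * x[k - 1] + w[k - 1]

def lag1_corr(z):
    """Sample autocorrelation at lag 1."""
    z = z - z.mean()
    return float(np.dot(z[1:], z[:-1]) / np.dot(z, z))

print(lag1_corr(w))  # near 0: white noise is uncorrelated across time
print(lag1_corr(x))  # near phi: x_k is a colored (time-correlated) process
```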
Why is it a function of w_k? Because w_k is affecting the state dynamics. Why is it a function of v_k? Because in the observer there is a feedback correction based on the measurement. And just imagine why I am saying that the entire data set is required: when I estimate x_1 I use the information in y_0 and y_1; when I estimate x_2 I use y_0, y_1, y_2; when I estimate, say, x_10 I use y_0 through y_10. The measurements are used to reconstruct the state estimates.

Now, the estimate itself is a stochastic process. Differentiate between the two things: the true system state is a stochastic process, and the estimate of x is another, different stochastic process. We are saying that the best estimate you can construct is the conditional mean; that means generate the conditional density of x. Conditioned on what? Conditioned on the measurements you have obtained. How to generate this conditional density looks very abstract right now; I will work out the algebra, so that is something we will be doing. But if I can get this conditional density and take its mean, that is the best estimate of x_k. So somehow I want to arrive at this conditional mean.

We are going to develop a filter, the Kalman filter, with two steps: a prediction step and a correction step. For the prediction step I just use the difference equation, written at time k (forget about the output equation for the time being): x_k = Φ x_{k-1} + Γ u_{k-1} + w_{k-1}. Now suppose I use information up to Y^{k-1}, meaning all measurements up to k-1 have been used, and I take the conditional expectation on both sides. Look here: Φ is a matrix, so I have taken it outside, and I am writing Φ E[x_{k-1} | Y^{k-1}]; is everyone with me on this? Then the input term: this is a deterministic input, we know u_{k-1} and we know Γ, so this comes out; its expectation is the term itself, it is not a stochastic variable. And what about w_{k-1}? It is zero-mean white noise, so what is its best value? Zero. (Why? Because there is no information about w_{k-1} contained in Y^{k-1}: the effect of w_{k-1} appears in x_k, and hence in y_k, which is in the future, but not in y_{k-1}; because of that one-lag structure, this conditional expectation is zero.) So I get a recurrence relationship. What is it? The new conditional mean, x̂_{k|k-1} = E[x_k | Y^{k-1}] (k-1 appears here because we used the data set up to k-1), equals Φ times the conditional mean at k-1, x̂_{k-1|k-1}, plus the new input coming in: x̂_{k|k-1} = Φ x̂_{k-1|k-1} + Γ u_{k-1}.

Do you remember, some time back when we talked about stochastic processes, we said that for a general stochastic process the mean can be a function of time? This is an example: the mean is time-varying; x_k is a stochastic process whose mean evolves in time. Now see, to characterize a stochastic process I need two measures which are very, very important: mean and variance. Forget for the time being that this is a multivariate, time-varying process; think of a single random variable. What does the mean tell you? Loosely, it is the most probable value the random variable will take.
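The conditional-mean prediction step just derived can be sketched in a couple of lines (Phi, Gamma, and all numbers below are assumed for illustration):

```python
import numpy as np

# Prediction of the conditional mean:
#   xhat_{k|k-1} = Phi @ xhat_{k-1|k-1} + Gamma @ u_{k-1}
# The E[w_{k-1} | Y^{k-1}] = 0 term simply drops out.
Phi = np.array([[0.9, 0.1],
                [0.0, 0.8]])
Gamma = np.array([[0.5], [1.0]])

def predict_mean(xhat_upd, u):
    """One prediction step for the conditional mean."""
    return Phi @ xhat_upd + Gamma @ u

xhat = np.array([[1.0], [2.0]])   # xhat_{k-1|k-1}, an assumed current estimate
u = np.array([[0.5]])             # known input u_{k-1}
xpred = predict_mean(xhat, u)     # xhat_{k|k-1}
```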
What does the variance tell you? Roughly, physically (do not forget this physical interpretation), it tells you the spread. The larger the covariance, the larger the band around the mean in which x can plausibly lie; the smaller the covariance, or the variance, the closer x will be to the mean. The value of x that will actually occur and the expected value of x are different things; x is a random variable. When you say the mean is the best estimate, you are predicting: given the density function, that is the best value you can guess, and the covariance tells you the spread. Do not forget this particular idea, even though we are going to get into more complex mathematics.

So now, I have found the mean, and I want to find the covariance. To find the covariance, I have to subtract the mean. Here is the original process equation, and here is the equation for how the mean changes as a function of time, which we just found. I subtract the mean equation from the process equation. See what happens: the Γ u_{k-1} terms cancel, because the same deterministic quantity is added in both equations, while w_{k-1} remains. I define this difference as ε_{k|k-1}, and I am going to call it the prediction error, because this is the predicted value of x; I have not yet done any correction. The corresponding estimation error is the true value x_k minus the conditional estimate of x using information up to k-1. So I get a difference equation that relates the new error to the past error and w_{k-1}: ε_{k|k-1} = Φ ε_{k-1|k-1} + w_{k-1}. This error is itself a stochastic process. (On the slide, x̄ is nothing but x̂: I have written x̄ because if I write the full conditional-expectation notation it goes outside the slide; I have a practical difficulty. x̄ here means E[x_k | Y^{k-1}], the conditional mean.)

Now I keep that result aside and move to the update step; we will sort out how to get to the covariance shortly. What happens in the update step? I take the predicted estimate and merge it with the measurement. How? Using a gain matrix L_k: x̂_{k|k} = x̂_{k|k-1} + L_k e_k. How do I choose the gain matrix L_k? That is my design problem; I want to choose L_k in some best possible way, so that the estimate is optimal, and what "optimal" means is what we will see. Is everyone with me on this? For now I am writing this equation for some arbitrary L; my task is to decide L itself. I am fusing y_k, which is the measurement, with ŷ_{k|k-1}, the estimate of y using information up to k-1. How do you get this estimate? ŷ_{k|k-1} = C x̂_{k|k-1}. The quantity e_k is typically called the innovation, or residual: e_k = y_k - ŷ_{k|k-1}. What is y_k? y_k = C x_k + v_k. So e_k = C x_k + v_k - C x̂_{k|k-1}, which is nothing but C ε_{k|k-1} + v_k. What you can appreciate here is that these estimates are going to be functions of both the w's and the v's. Why a function of the w's? Because w is noise entering the real plant, so its effect is present in y_k: the effect of w_{k-1}, and of the past disturbances generally, is present in y_k.

So now what I have done is written x̂_{k|k} = x̂_{k|k-1} + L_k e_k, with e_k = C ε_{k|k-1} + v_k, and derived a combined expression; it is just algebra, and if you have the notes, just look at it. At the end of this we have two difference equations. Look here: the first equation relates ε_{k-1|k-1} to ε_{k|k-1}, and the second relates ε_{k|k-1} to ε_{k|k}. So I have found difference equations between successive errors; this is how the errors are governed. Is everyone with me? These two are coupled equations: ε_{k-1|k-1} gives me ε_{k|k-1}, ε_{k|k-1} gives me ε_{k|k}, ε_{k|k} will be used at time k+1 to give ε_{k+1|k}, then ε_{k+1|k+1}, and so on. Everything depends on ε_{0|0}: if you know ε_{0|0} you know ε_{1|0}; if you know ε_{1|0} you know ε_{1|1}; if you know ε_{1|1} you know ε_{2|1}; and you can start rolling. I have just combined these two equations, substituting one into the other, so that I now have ε_{k|k} as a function of ε_{k-1|k-1}. Pure algebra, nothing more; and with it I can now go after the mean value of the error.
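Collecting the error equations discussed above in one place (with ε denoting the errors and L_k the yet-to-be-chosen gain), the coupled recursions are:

```latex
\begin{aligned}
\varepsilon_{k|k-1} &= x_k - \hat{x}_{k|k-1} = \Phi\,\varepsilon_{k-1|k-1} + w_{k-1},\\
\varepsilon_{k|k}   &= \varepsilon_{k|k-1} - L_k e_k
                     = (I - L_k C)\,\varepsilon_{k|k-1} - L_k v_k\\
                    &= (I - L_k C)\bigl(\Phi\,\varepsilon_{k-1|k-1} + w_{k-1}\bigr) - L_k v_k.
\end{aligned}
```

The last line is the combined expression obtained by substituting the prediction-error equation into the update-error equation.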
Now here is a simplifying assumption. What I am saying is that the expected value of the initial guess equals the expected value of the true state at time 0. See, this is a tricky thing. To start the difference equation I have to supply an initial guess, and when I give that guess I am effectively creating a random variable. I am making the assumption that the guess comes from the same distribution as the truth: both have the same mean, and their covariances are also equal. So the initial guess is a random variable whose mean is the same as the true mean and whose covariance is the same as the true covariance. That is my simplifying assumption. You may say that we do not know these quantities; there is a problem there, but just assume this for the time being and see the beauty of this algorithm; this algorithm has, you could say, changed the world. The main thing in science, and in mathematics, is to make the right simplifying assumptions: when you model, there is always a disconnect between reality and the model, so you should know where to draw the boundary, and if you know that boundary, you are there.

So what I am assuming is that the expected value of the initial error is zero: E[ε_{0|0}] = 0. I am not saying the initial error is zero; I am saying the expected value of the initial error, the mean of the initial error, is zero. If all of you guess, and you take the mean of the errors those guesses commit, that mean will be zero. That is what I am saying.

(A student asks: x_0 is the true state, so why is x_0 a stochastic variable?) See, x is a stochastic process, and time 0 is some arbitrary time. Say the process started at some time minus infinity; you are in an imaginary world in which the stochastic process started when the universe started, and now, today, you decided to start doing the calculations, so x_0 is arbitrarily fixed to today. But something has happened in the past, right? Take a real example: x is the temperature in this room, and k counts days. I decide to start doing calculations using today, so today is my k = 0, k = 1 is tomorrow, k = 2 is the day after tomorrow. The true state today, x_0, is actually a function of all the randomness in the past; all that history is contained in x_0. So x_0 is a random variable, a correlated random variable, and we are assuming we have some information about it through P_0; in some sense we are saying that we know its mean value. There are different ways of interpreting this; different books will give you different interpretations. So now I start rolling this equation in time.

What is the consequence of the assumption? Just see: I use the combined difference equation together with the assumption. E[ε_{1|1}] = (I - L_1 C)(Φ E[ε_{0|0}] + E[w_0]) - L_1 E[v_1]. What is E[ε_{0|0}]? By our assumption, zero. What is E[w_0]? Zero. What is E[v_1]? Zero; it is white noise. So E[ε_{1|1}] = 0. What about ε_{2|2}? It depends on ε_{1|1}, whose mean is zero, so its mean is zero too. So what you can show is: if you make the assumption that the initial error has zero mean, then all subsequent errors also have zero mean. This is called an unbiased estimator.

(A student objects: but ε_{0|0} is not equal to zero.) The expected value of ε_{0|0} is zero. I am not saying there is no initial error; there is going to be an initial error, but that error is a random error whose distribution has the characteristic that its mean is zero and its covariance is something. The actual value of the error at time 0 will be drawn from that distribution, and the actual value is different from the expected value. Yes, that is an assumption, a simplifying assumption. A way to think about it: suppose I tell all of you that x_0 comes from a Gaussian distribution, and there is a method to draw a sample from a Gaussian distribution; Matlab will give you a sample if you ask for one from a Gaussian of a certain mean and covariance. I tell each one of you to generate one sample, and I take the difference between the true value, which I know, and each of your samples: your sample, his sample, her sample. Then I have all possible realizations of ε_{0|0}. It is like asking you to guess the temperature in this room: I know the true value, each of you gives a guess, and I find the errors. If I take the mean of all the possible error values between the truth, which I know, and the guesses you have given me, that mean is zero. Is that clear? Is everyone with me up to here?

So this error has mean value zero. What is the meaning of that? I am saying that ε_{k|k} = x_k - x̂_{k|k}, right? That is our definition, with x̂_{k|k} = E[x_k | Y^k], and I am saying E[ε_{k|k}] = 0; that is what I proved, under the assumption I made. How did I prove it? Using the difference equation together with the assumption E[ε_{0|0}] = 0; the result follows. Now let me interpret this: E[x_k - E[x_k | Y^k]] = 0, the zero vector, which implies E[x_k] = E[E[x_k | Y^k]]. So the conditional mean is going to be an unbiased estimate of x; no constant offset appears. Note that the conditional expectation E[x_k | Y^k] is itself a random variable, because it depends on the random measurements, whereas E[x_k] is a deterministic quantity; I am showing an equivalence between E[x_k] and the expectation of that conditional expectation. (A student asks: but an expectation of an expectation is the same value, since an expectation, once computed, is deterministic. Yes; that is exactly the identity being used.) What I would say is: do not expect to understand everything in one shot; these things will sink in slowly, and at some point you will have to accept and proceed.
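The unbiasedness argument can be checked by Monte Carlo. In this scalar sketch (all numbers, including the arbitrary gain L, are assumed), the true x_0 is drawn from a prior whose mean is used as the initial guess, so E[ε_{0|0}] = 0, and the average final error over many realizations comes out near zero for any fixed gain:

```python
import numpy as np

rng = np.random.default_rng(2)

# Monte Carlo check of unbiasedness (scalar sketch; all numbers are assumed):
# if E[eps_{0|0}] = 0, every later error eps_{k|k} also has zero mean, for ANY gain L.
phi, c, q, r, L = 0.9, 1.0, 0.2, 0.5, 0.3
x0_mean, p0 = 1.0, 1.0
n_runs, n_steps = 20_000, 10

final_errors = np.empty(n_runs)
for i in range(n_runs):
    x = x0_mean + np.sqrt(p0) * rng.standard_normal()  # true x0 drawn from the assumed prior
    xhat = x0_mean                                     # initial guess shares the prior mean
    for k in range(n_steps):
        w = np.sqrt(q) * rng.standard_normal()
        v = np.sqrt(r) * rng.standard_normal()
        x = phi * x + w                  # state propagation (no deterministic input here)
        xpred = phi * xhat               # prediction step
        e = (c * x + v) - c * xpred      # innovation e_k = y_k - yhat_{k|k-1}
        xhat = xpred + L * e             # correction step with an arbitrary gain L
    final_errors[i] = x - xhat           # eps_{n|n} for this realization

print(final_errors.mean())  # close to 0: the estimator is unbiased
```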
Another way to put it: the estimate is a random variable and the truth is a random variable, and the way I am constructing the estimate, both have the same mean. What this says is that the conditional expectation of x carries the same mean as x itself; so if I want the mean value of x, I could as well work with the conditional expectation. I am just showing the equivalence between the two. (A student asks a question.) Say it again? I am not following; let us discuss it later, and if it does not become clear, ask again.

Now, to find the covariance, what do you do? We showed that E[ε_{k|k}] = 0, and through the algebra relating them, E[ε_{k|k-1}] = 0 as well; if one is zero, the other is zero. Both errors have expected value zero. So I define the covariance matrix of the prediction error; see how a covariance matrix is defined: P_{k|k-1} = E[ε_{k|k-1} ε_{k|k-1}ᵀ]. Why does the mean not appear here? Because both errors are zero-mean, so I just have to take the expectation of ε_{k|k-1} times its transpose.

Now I am going to derive a recurrence relationship between the covariances. This is a stochastic process, and I just use the difference equation: ε_{k|k-1} ε_{k|k-1}ᵀ = (Φ ε_{k-1|k-1} + w_{k-1})(Φ ε_{k-1|k-1} + w_{k-1})ᵀ. Simple algebra: I am just multiplying the left-hand side and the right-hand side out, and then I take the expectation. Now just think about this: w_{k-1} and the error ε_{k-1|k-1} are not correlated; just go back and look at the equations (if you have the printout, have a look; I cannot keep going back in the slides). When you do this multiplication you get four terms: one sandwiched between Φ and Φᵀ, two cross terms between ε and w, and one term w_{k-1} w_{k-1}ᵀ. Out of these four, the two cross terms cancel, by virtue of the fact that w_{k-1} and ε_{k-1|k-1} are uncorrelated, so their expected value is zero, and only two terms remain. These are vector and matrix quantities, so you have to be careful when you take transposes; but if you take the transpose and take the expectation, you get this recurrence relationship: P_{k|k-1} = Φ P_{k-1|k-1} Φᵀ + Q. That is, the predicted covariance is Φ times the updated covariance at k-1 times Φᵀ, plus the covariance of w; that is why the Q term comes in, from the w_{k-1} w_{k-1}ᵀ term. A very, very simple derivation. So P_{k|k-1} is the covariance of the prediction error and P_{k|k} is the covariance of the estimation error, and just as I found a recurrence for the mean, where the previous mean is related to the new mean, I am now finding a recurrence relationship between the covariances. So I have characterized the stochastic process in terms of mean and covariances.
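The covariance prediction recurrence can be written directly (Phi and Q below are assumed values):

```python
import numpy as np

# Covariance prediction step:
#   P_{k|k-1} = Phi P_{k-1|k-1} Phi^T + Q,
# using that w_{k-1} and eps_{k-1|k-1} are uncorrelated (the cross terms vanish).
Phi = np.array([[0.9, 0.1],
                [0.0, 0.8]])
Q = 0.05 * np.eye(2)

def predict_cov(P_upd):
    """One covariance prediction step."""
    return Phi @ P_upd @ Phi.T + Q

P = np.eye(2)            # P_{k-1|k-1}, an assumed updated covariance
P_pred = predict_cov(P)  # P_{k|k-1}
```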
I now have to do one more step. Look here: e_k = y_k - ŷ_{k|k-1} = C ε_{k|k-1} + v_k. What is the mean value of e_k? Take the expectation and you get zero. Why? I am just doing substitutions: substitute ε_{k|k-1} = Φ ε_{k-1|k-1} + w_{k-1} (look at the previous slide); then E[e_k] = C Φ E[ε_{k-1|k-1}] + C E[w_{k-1}] + E[v_k], and each of these expectations is zero. This substitution also shows you what you wanted to know: the innovation is a function of w_{k-1}, and it is also a function of v_k.

So the first thing I want to convey is that e_k contains information about the disturbances, but it also contains v_k, and we now want to separate out the information about w_{k-1} and v_k: somehow compensate the estimate of the state using the information about w_{k-1}. I want to construct an estimate of w_{k-1} using e_k, and that is actually what you are doing when you apply L_k: this is the correction step, the corrected estimate equals the prediction estimate plus a correction, and that correction is the gain times e_k. Now e_k contains the information about w_{k-1}, but it also contains v_k, so I multiply it by a factor. What is the role of this factor? It will help you correct for w_{k-1} while trying to filter out v_k. That is an interpretation I am giving for what we are doing. How to decide L_k is the golden question; we have not answered it yet, and that is going to be my next task. So: the expected value of e_k is zero. Can you find the covariance of e_k? Just do it; tell me what the covariance of e_k is. (There should be a little break in between.)

(On a question about which estimate is which:) that one is an estimate of the noise at the previous instant using information up to the current instant. There are three kinds of estimate in estimation. The first, k given k-1, is the prediction estimate. The second, k given k, is the current estimate, or filtered estimate. And when you estimate something in the past using information up to the present or future, that is called a smoothed estimate. It is like this: you collected a time series of the temperature in this room from, say, the first of January until today, and you have a model. If I ask for an estimate of tomorrow's temperature, that is a prediction estimate. If I ask for an estimate of today's temperature using measurements up to today, that is x̂_{k|k}, the filtered estimate. I can also pose the question: what is the best estimate of yesterday's temperature, or of the temperature three days back, using information up to today? That is a smoothed estimate. Ultimately x̂_{k|k} is just an estimate, and if you have collected more information you can improve upon it; that is what smoothing does.

Now do this exercise: find the covariance of e_k, expressed in terms of the covariance of ε_{k|k-1}. What will it be? P_e = C P_{k|k-1} Cᵀ + R. You get it like this: I call E[e_k e_kᵀ] = P_e, and e_k = C ε_{k|k-1} + v_k, so take the expectation of this whole quantity times its transpose. The prediction error and v_k are uncorrelated, so the cross covariance between ε_{k|k-1} and v_k is zero, and that gives this relationship.

So I have three stochastic quantities to worry about: ε_{k|k-1}, ε_{k|k}, and e_k, and for all three of them I want to find the mean and the covariance. This is my preparation; we will derive the observer after this. Why am I doing it in this order? Because these are coupled equations, and I have to go sequentially, one after another. Since I know the mean of the first two, I can use the relationship between them and find the mean of the third, and then I find the covariance of ε_{k|k}. Is everyone with me on this? Here some cross-covariance terms come up which I cannot neglect. If you sit and work out the algebra (a little bit of algebra, and I have given the intermediate steps), you get an equation which says that the new, updated covariance is the predicted covariance plus L_k P_e L_kᵀ, minus the cross-covariance terms between e and ε: P_{k|k} = P_{k|k-1} - L_k C P_{k|k-1} - P_{k|k-1} Cᵀ L_kᵀ + L_k P_e L_kᵀ, where E[ε_{k|k-1} e_kᵀ] = P_{k|k-1} Cᵀ is that cross covariance. What does this equation tell you? That the updated covariance is going to be a function of L_k: how you choose L_k will decide what P_{k|k} is. Now forget the algebra for a moment: what does covariance tell you? Spread.
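The two covariance expressions just derived, the innovation covariance and the updated covariance for an arbitrary gain, can be sketched as follows (all numerical values are assumptions):

```python
import numpy as np

# Innovation covariance and updated covariance for an ARBITRARY gain L_k:
#   P_e     = C P_{k|k-1} C^T + R
#   P_{k|k} = P_{k|k-1} - L C P_{k|k-1} - P_{k|k-1} C^T L^T + L P_e L^T
# The update formula is valid for any L, not only the optimal one.
C = np.array([[1.0, 0.0]])
R = np.array([[0.1]])
P_pred = np.array([[0.87, 0.08],
                   [0.08, 0.69]])  # an assumed predicted covariance

def innovation_cov(P):
    return C @ P @ C.T + R

def update_cov(P, L):
    Pe = innovation_cov(P)
    return P - L @ C @ P - P @ C.T @ L.T + L @ Pe @ L.T

L = np.array([[0.4], [0.1]])               # an arbitrary trial gain
P_upd = update_cov(P_pred, L)
print(np.trace(P_upd) < np.trace(P_pred))  # this particular L shrinks the spread
```

Trying different values of `L` makes the point of the lecture concrete: P_{k|k} moves around as a function of the gain, and the design question is which L makes it smallest.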
spread should be maximum minimum small large what do you say as small as possible okay I want to choose lk I want to choose lk such that this covariance predicted covariance is as small as possible okay coming up to this point it just algebra you sit down with these equations patiently do the algebra you will get these equations I have given all intermediate steps okay it is nothing they look little complex but you know just doing patiently doing the algebra and you will get these expressions okay yeah no see there are this equation is there right kk k-1 into lk e k so you need to know both about e k and lk see why did I derive here pe here because I am going to get a term when I do epsilon k epsilon k transpose I will get a term which is e k e k transpose I should be ready for that that is why I have done this preparation which one which one huh slight 15 there can be a typo but which one you can derive it see here for this one no but I did I substitute for yk there this equation right talking about this equation okay so have I made a error here let me check there is a okay the way I derived this equation is simply by subtracting so you start with this equation you start with this equation yeah so I have just written this equation that is epsilon k-1 oh here it should be k-1 k-1 right okay this equation you are talking about this equation no no see these two are same these two are same equations see I can choose to use this equation or this equation see here what I have done is I have expanded e k and just written vk here okay these two are one and the same equations here e k is written in terms of c epsilon k k-1 k-1 vk then you get this equation okay but you can choose to work directly with this equation no so see there is lot of algebraic tricks involved in this okay so this equation and this equation are identical actually they are not different equation okay so I could I could have proceeded using this equation I could have proceeded using this equation I 
have decided to proceed using this equation okay so they are not different yeah thanks for pointing out there are two different expressions for the same thing they are inter convertible okay is so is this fine up to here last equation okay now what I want to do is I want to do device a minimum variance controller I want to device a minimum variance estimator I want to find out that gain lk which gives me smallest possible variance in the estimate of x okay I want to find out that gain lk which gives me smallest possible now what is the relationship between the predicted covariant updated covariance and the gain lk that is given here okay I want to choose lk in such a way that this is as small as possible now this is a matrix what is a small what is a matrix as small as possible is this a special matrix is this a matrix which is positive definite is covariance always positive definite matrix why is this matrix positive definite matrix just look at the terms is this a positive definite matrix covariance covariance matrix just go back and look at the derivation initial covariance covariance matrix p0 is always positive definite if p0 is positive definite you can show that p1 is positive definite if p1 is positive definite p10 is positive definite if p10 is positive definite you can show that p11 is positive definite and it follows that all these are sequences of positive definite matrices okay covariance covariance is always a positive definite matrix it could be positive semi definite but it can never be negative definite it is a positive definite matrix. So now it depends upon how we choose this LK so question is how do I choose LK and maintain the positive definiteness of pKK okay. 
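The positive-definiteness argument above (P(0|0) positive definite implies every subsequent covariance is positive definite) can be checked numerically. A minimal sketch, in which the matrices Phi, C, Q, R and the initial P are made-up illustrative values, not taken from the lecture:

```python
import numpy as np

# Illustrative system matrices (assumed values for this sketch)
Phi = np.array([[1.0, 0.1], [0.0, 0.9]])   # state transition matrix
C = np.array([[1.0, 0.0]])                 # measurement matrix
Q = 0.01 * np.eye(2)                       # state-noise covariance
R = np.array([[0.1]])                      # measurement-noise covariance

P = np.eye(2)                              # P(0|0), positive definite
for k in range(50):
    # Covariance prediction: P(k|k-1) = Phi P(k-1|k-1) Phi^T + Q
    P_pred = Phi @ P @ Phi.T + Q
    # Innovation covariance P_e = C P(k|k-1) C^T + R, and the optimal gain
    Pe = C @ P_pred @ C.T + R
    L = P_pred @ C.T @ np.linalg.inv(Pe)
    # Covariance update with the optimal gain
    P = (np.eye(2) - L @ C) @ P_pred
    # Every iterate stays symmetric positive definite
    assert np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0)
print("all covariance iterates positive definite")
```

Running the recursion for any positive definite starting P gives the same outcome, which is the inductive argument made in the lecture.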
So now what I am going to do is find the gain matrix L(k) such that the estimation error has minimum variance. For that I need some scalar quantity which measures the "volume" of this matrix, so that I can talk about a large matrix versus a small matrix. Suppose it is a diagonal matrix: if the diagonal entries are large, the covariance is large; if the diagonal entries are small, the covariance is small. And if it is a diagonal positive definite matrix, all the elements will be positive. So consider the trace: the sum of the diagonal elements, which is also related to the eigenvalues; in fact the trace equals the sum of the eigenvalues. What are the eigenvalues of a positive definite matrix? Are they always positive? They are positive. So minimizing the variance turns out to be equivalent to minimizing the trace of this matrix. And how do you minimize a function with respect to something? What is the necessary condition? The necessary condition for optimality is that the derivative of the objective function with respect to the decision variable is set equal to zero. So what I need to do is differentiate trace(P(k|k)) with respect to L(k) and set it equal to zero; then I will get the solution.
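The two facts used here, that the trace equals the sum of the eigenvalues and that a positive definite matrix has all positive eigenvalues, are easy to verify numerically. A small sketch with a made-up matrix (any A A^T + I is symmetric positive definite):

```python
import numpy as np

# A made-up symmetric positive definite matrix: A A^T is positive
# semi-definite, so A A^T + I is positive definite
A = np.array([[2.0, 1.0], [0.5, 1.5]])
P = A @ A.T + np.eye(2)

eigvals = np.linalg.eigvalsh(P)
# Eigenvalues of a positive definite matrix are all positive
assert np.all(eigvals > 0)
# Trace equals the sum of the eigenvalues (and of the diagonal entries)
assert np.isclose(np.trace(P), eigvals.sum())
print("trace:", np.trace(P), "sum of eigenvalues:", eigvals.sum())
```

This is why "make the covariance small" can be replaced by the scalar problem "minimize the trace": shrinking the trace shrinks the sum of the (positive) eigenvalues, i.e. the overall spread.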
I want you, guys and girls, to go back and do this algebra at home; do not just listen to this lecture, go back and try to derive these things. First I need some intermediate results. First of all, trace(C + D) = trace(C) + trace(D), and trace(C) = trace(C^T). Why do I need this? Because I am going to take the trace of P(k|k), which is the trace of the whole right-hand side, and the trace of the sum is the sum of the traces of the individual terms. Also, one of the cross terms is the transpose of the other, but their traces are equal, so I am going to use that as well.

The next thing I need is how to differentiate a scalar function of a matrix with respect to the matrix itself. Here X is some matrix. Well, I should not have used X; on this particular slide X does not represent the state, it is just an algebraic result which I am writing on the side. Y is a scalar function of this matrix, and ∂Y/∂X is defined element-wise, as shown. There are rules for this kind of differentiation: ∂ trace(AX)/∂X = A^T, and there are corresponding results for trace(X B X^T). You can prove these results with a little algebra: take the definition, write out AX meticulously, and it works out; I tried it. Why do I need all this? Because I have a term of the form L(k) P_e L(k)^T, and I need to differentiate it.

So what is the derivative of P(k|k-1) with respect to L(k)? Yes, you said it: zero. P(k|k-1) does not depend upon L(k); only the other three terms do, so those are the ones I have to differentiate. To differentiate them I need the two algebraic results: differentiating trace(AX) or trace(XA) with respect to X gives A^T, and differentiating trace(X B X^T) with respect to X gives 2XB (for symmetric B). This is just algebra; the final result looks almost exactly like scalar differentiation, except that you get A^T. Using these, differentiating the L(k) P_e L(k)^T term gives 2 L(k) P_e, and differentiating the two cross terms gives -2 P_εe. Is everyone with me on these two results? Setting the derivative of trace(P(k|k)) with respect to L(k) equal to zero, I finally get this result. The optimal gain is

L*(k) = P_εe P_e^{-1}.

If I use this value of the gain, I get the smallest possible variance P(k|k). This was the seminal result put forward by Kalman. And what is the optimal covariance? It turns out to be the smallest covariance that you can get; you cannot reduce the covariance below this. So this is the optimal estimator.

Let me summarize the Kalman filter. Remember I am working with a stochastic process, so I have to keep updating mean and covariance, mean and covariance. The prediction step gives the predicted mean and the predicted covariance. The Kalman gain computation is L(k) = P_εe P_e^{-1}; we computed these terms earlier and kept them aside, and I am just substituting them here. Go back to the slides, we have done these calculations. So if you know P(k|k-1), and you know Φ, R, and Q, you can compute the Kalman gain. And see what I have achieved: I started by saying I want to derive the gain by systematically using the stochastic information about w and v, and I have done that. My gain calculation is based on Q and R: it depends on Q through P(k|k-1), which is a function of Q, and L(k) is a function of R. So I have systematically used the Q and R information.

This is the celebrated Kalman filter: I do an update of the mean and then an update of the covariance, the updated mean and the updated covariance. You start with an initial guess x̂(0|0), and this is a recursive procedure by which you construct the subsequent optimal estimates. For a linear time-invariant system these are the estimates which give you minimum covariance; there is no better estimator you can construct. There are a lot of properties. This particular development, around 1964 or 65, about the time I was born, gave rise to a flurry of results; it has resulted in so many things that you just do not know all the places it is used. As I said, it is used in image reconstruction; it is used in underwater and oil-well drilling: you have measurements coming from instruments on the drill, it measures humidity and some other things, I do not know exactly what it measures, and you have a model for the reservoir, and from those measurements you construct the states. What is the size of the reservoir down there? How soft is the material on this side and on that side? You can estimate that: you have a model, data keeps coming, and you fuse the data with the model. That is exactly what you are doing here: this is my model-based estimate, my data y is coming in here, this is my predicted estimate, and I am fusing the data with the model.

Why is this a wonderful algorithm? Because it is recursive. When Gauss originally worked out least squares estimation, he worked on a batch of data. It is a brilliant solution, but in a computer control system where data is continuously coming in and the data size keeps increasing, you cannot work with a batch of data; you need a recursive solution. This is a recursive solution: old estimate plus correction gives me a new estimate. That was a landmark, because it could be used in a computer. It just says that the predicted mean is Φ times the previous mean plus this, and the corrected mean is the predicted mean (sorry, not updated, this is the predicted mean) plus a correction, where the correction is based on y - ŷ. A very, very powerful algorithm. We will look at its interpretations and all kinds of things in the next lecture, and what I will show you is that every time there is a covariance reduction, so every time you are getting a better and better estimate, and so on.
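The predict, gain, update recursion summarized in this lecture can be written out in a few lines of code. This is only a sketch under assumed values: the system matrices Phi, Gamma, C, Q, R, the initial guesses, and the simulated data below are all illustrative, not from the lecture slides.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative model: x(k+1) = Phi x(k) + Gamma u(k) + w(k), y(k) = C x(k) + v(k)
Phi = np.array([[1.0, 0.1], [0.0, 0.95]])
Gamma = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)      # cov(w), known a priori
R = np.array([[0.25]])    # cov(v), known a priori

def kalman_step(x_hat, P, u, y):
    """One recursion: prediction, gain computation, update."""
    # Prediction step (predicted mean and predicted covariance)
    x_pred = Phi @ x_hat + Gamma @ u
    P_pred = Phi @ P @ Phi.T + Q
    # Kalman gain: L = P(k|k-1) C^T [C P(k|k-1) C^T + R]^{-1}
    Pe = C @ P_pred @ C.T + R
    L = P_pred @ C.T @ np.linalg.inv(Pe)
    # Update step: old estimate plus a correction based on y - y_hat
    x_new = x_pred + L @ (y - C @ x_pred)
    P_new = (np.eye(len(x_hat)) - L @ C) @ P_pred
    return x_new, P_new

# Simulate the plant and run the filter recursively
x = np.array([1.0, 0.0])          # true initial state
x_hat = np.zeros(2)               # initial guess x_hat(0|0)
P = np.eye(2)                     # initial covariance P(0|0)
for k in range(200):
    u = np.array([1.0])
    y = C @ x + rng.multivariate_normal(np.zeros(1), R)   # noisy measurement
    x_hat, P = kalman_step(x_hat, P, u, y)
    x = Phi @ x + Gamma @ u + rng.multivariate_normal(np.zeros(2), Q)

print("final covariance trace:", np.trace(P))
```

Note the recursive structure emphasized in the lecture: only the current estimate and covariance are stored, never the growing batch of past data, and the trace of P shrinks from its initial value as measurements are fused in.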