Welcome everyone. In the previous lecture we looked at the Kalman filter, which gave us a way of estimating the state of a linear system: a system that evolves linearly and is disturbed by Gaussian noise, and where the observations are linear functions of the state, again corrupted by Gaussian noise. The reason we started looking at the Kalman filter, and at filtering problems more generally, was that we had reached the point of studying the linear quadratic problem with partial state information. There we found that the optimal controller is a linear function of the estimate of the state, the best estimate of the state given the information. The controller gain L_k could be computed recursively through the Riccati equations, but how to compute the state estimate itself was something we had not addressed at that time. The way to compute it recursively is through a filter; that is how we got into filtering, and specifically into the Kalman filter for the case of Gaussian noise. Putting all of this together, we can now describe the complete solution to the linear quadratic Gaussian (LQG) problem. The LQG problem is the one where the noises w_k and v_k are independent Gaussian with mean 0, and the initial state is also Gaussian. So in the LQG problem the state evolves as x_{k+1} = A_k x_k + B_k u_k + w_k, you get observations y_k = C_k x_k + v_k, and x_0, w_0, ..., w_{N-1}, v_0, ..., v_{N-1} are all independent and Gaussian. Remember that in the earlier, plain linear quadratic problem, there was no particular assumption about the distribution of these noises.
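To make the setup concrete, here is a minimal sketch of simulating one trajectory of the system x_{k+1} = A x_k + B u_k + w_k, y_k = C x_k + v_k. The particular matrices and noise covariances are my own illustrative choices (a double-integrator-like system), not values from the lecture.

```python
# Simulate one trajectory of the LQG system dynamics.
# All numeric values below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition matrix A_k (time-invariant here)
B = np.array([[0.0], [1.0]])             # control input matrix B_k
C = np.array([[1.0, 0.0]])               # we observe only the first state component
W = 0.01 * np.eye(2)                     # covariance of the process noise w_k
V = np.array([[0.1]])                    # covariance of the observation noise v_k

N = 20
x = rng.multivariate_normal(np.zeros(2), np.eye(2))  # Gaussian initial state x_0
xs, ys = [], []
for k in range(N):
    y = C @ x + rng.multivariate_normal(np.zeros(1), V)   # y_k = C x_k + v_k
    u = np.zeros(1)                                       # open loop for now; control comes later
    w = rng.multivariate_normal(np.zeros(2), W)
    x = A @ x + B @ u + w                                 # x_{k+1} = A x_k + B u_k + w_k
    ys.append(y); xs.append(x)

print(len(ys), ys[0].shape)
```

With the control left at zero, this just generates the observation sequence y_0, ..., y_{N-1} that the information vector I_k will be built from.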
We only assumed that the noises were independent with mean 0; we did not assume anything specific about the distribution. The initial state, too, was given to us as some distribution, again with nothing specific assumed about it. Now we will assume that all of these are independent and Gaussian, and that gives the linear quadratic Gaussian problem. As long as the noises are zero mean and independent, not necessarily Gaussian, we had already found that the optimal control is u_k* = mu_k*(I_k) = L_k E[x_k | I_k]. So this result holds for any noise, where by "any" I mean zero mean and independent; we do not need the Gaussian assumption for this particular result. We need the Gaussian assumption in order to compute the conditional expectation E[x_k | I_k] efficiently, that is, recursively. That is where the Gaussian assumption is used. And what is its implication? That this conditional expectation can in fact be computed through the Kalman filter, which itself has the very nice form we discussed in the previous class. The main thing we will discuss in today's class is the form of this expectation as a function of I_k. Remember that I_k is all the observations up until time k and all the actions up until the previous time instant. So what is the form of E[x_k | I_k]?
For this we need to go back and glance at some of the results we stated earlier. Remember we mentioned one particular fact: whenever you are estimating the state given some information, and the state and the information are jointly Gaussian, then the estimate turns out to be a linear function of the information. In our case the state and the information are indeed jointly Gaussian, because they are eventually determined by the noise in the system and the initial state: they are all linear combinations of the initial state and the noises, which are Gaussian. Consequently the state is Gaussian and the estimate is also Gaussian. This tells us that the conditional expectation E[x_k | I_k] is a linear function of I_k, which means it is a linear function of all the information that you have, that is, of this particular vector. Now if this is a linear function of I_k, remember that on top of it we apply a control which is a linear multiple of this estimate. So the control, being a linear transformation of a quantity that is itself linear in I_k, is also a linear function of I_k. Which means u_k*, if I view it directly as a function of I_k (which it eventually is, after all), is linear.
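Here is a small numerical check of the fact just used: for jointly Gaussian (X, Y), the conditional expectation E[X | Y = y] is the linear function mu_x + S_xy S_yy^{-1} (y − mu_y). The joint covariance below is my own illustrative choice; the check compares the formula's slope against the least-squares slope estimated from samples.

```python
# Verify numerically that the Gaussian conditional mean is linear in the
# conditioning variable.  The covariance matrix is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(1)

mu = np.array([0.0, 0.0])                # joint mean of (X, Y)
Sigma = np.array([[2.0, 1.2],
                  [1.2, 1.5]])           # joint covariance of (X, Y)

S_xy, S_yy = Sigma[0, 1], Sigma[1, 1]
slope = S_xy / S_yy                      # coefficient in E[X | Y = y] = slope * (y - mu_y) + mu_x

# Empirical check: regressing X on Y from samples recovers the same slope,
# i.e. the best estimator really is linear for jointly Gaussian variables.
samples = rng.multivariate_normal(mu, Sigma, size=200_000)
X, Y = samples[:, 0], samples[:, 1]
emp_slope = np.cov(X, Y)[0, 1] / np.var(Y)
print(slope, emp_slope)                  # the two values should nearly coincide
```

For non-Gaussian joint distributions this coincidence generally fails: the linear regression is still the best *linear* estimator, but E[X | Y] itself need not be linear.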
The control is written in terms of the conditional expectation of x_k given I_k, but the important point is this: because the conditional expectation is itself linear in I_k, and u_k* is L_k times that conditional expectation, u_k* is also a linear function of I_k. In other words, we can write u_k* = mu_k*(I_k) where mu_k* is a linear function; this is the beauty of this problem. What we have seen is that for a linear quadratic Gaussian problem, the optimal control is a linear function of the information, and in fact it is given in terms of the conditional expectation of the state given the information. To recap which assumption is used where: to get that the optimal control is linear in the conditional expectation of the state given the information, we do not need any Gaussian assumption on the noise; the highlighted equation u_k* = L_k E[x_k | I_k] holds for any noise distribution. If we further want to claim that u_k* is linear in I_k, then we need the Gaussian assumption: once we have it, E[x_k | I_k] is linear in I_k, and therefore u_k* is also linear in I_k. That u_k* is linear in the conditional expectation of the state given I_k holds for any noise distribution; I hope that is clear. So let me summarize. For any LQ problem, u_k* is a linear function of the conditional expectation of the state given the information. For the LQG problem, u_k* is moreover a linear function of I_k, since that conditional expectation is itself a linear function of I_k. That is the summary.
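For completeness, the gains L_k mentioned above come from the backward Riccati recursion of the LQ problem. A sketch, reusing the illustrative matrices from before and assumed cost weights Q and R (none of these numeric values are from the lecture):

```python
# Backward Riccati recursion producing the feedback gains L_k in
# u_k* = L_k E[x_k | I_k].  Matrices and cost weights are assumptions.
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)            # state cost weight
R = np.array([[1.0]])    # control cost weight
N = 20                   # horizon

K = Q.copy()             # terminal condition K_N = Q_N (taken equal to Q here)
gains = [None] * N
for k in range(N - 1, -1, -1):
    # L_k = -(R + B' K_{k+1} B)^{-1} B' K_{k+1} A
    L = -np.linalg.solve(R + B.T @ K @ B, B.T @ K @ A)
    # K_k = A' K_{k+1} (A + B L_k) + Q   (equivalent form of the Riccati update)
    K = A.T @ K @ (A + B @ L) + Q
    gains[k] = L

print(gains[0])          # gain applied at time 0
```

Note that this recursion uses only A, B, Q, R; it is entirely independent of the noise statistics, which is exactly the point that the gains L_k do not need the Gaussian assumption.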
With this summary in hand, we can now look at problems with partially observed state in a slightly more mature light. Notice that the way we approached problems with partial, or imperfect, state information was by considering the information vector as the state. We said we can think of I_k itself as the state, think of any new observations coming in as noise, and keep updating this state vector. That gave us something that looked like a perfectly observed system; we then wrote out the dynamic programming equations for that system and said the optimal control can be computed through them as a function of I_k. So the approach we have taken so far for partially observed problems is: take I_k to be the state; take the observation y_k (earlier my notation was z_k, either is fine) as the noise; obtain a perfectly observed system from this; apply the DP equations to it; and compute the optimal policy mu_k* as a function of the information. But we had seen a challenge pertaining to this in one of our earlier lectures. The difficulty this entails is that the state space keeps blowing up with time: the state vector itself gets longer with time, because we accumulate more and more observations and a longer and longer history as time goes on.
So the difficulty with this particular approach is that the state vector keeps growing with time. However, the result we have just obtained for linear quadratic problems hints that this approach can be simplified. From the linear quadratic problem, without even making the Gaussian noise assumption, we obtained that the optimal control is a linear function of the conditional expectation of the state given the information. We are not simply saying that the optimal control is a function of the information; we are saying it is a function of something much more specific, namely the conditional expectation of the state given the information. This is a piece of information that we have derived from I_k, and it turns out to be enough to define the optimal control. Morally the optimal control is only a function of I_k; that is as per the definition. But it turns out to be a much more specific function, in the sense that it depends only on some part of the information available in I_k, and that part is simply the conditional expectation of the state given I_k. This motivates the following concept, called a sufficient statistic. A sufficient statistic is that piece of information in I_k that is sufficient to compute u_k*, and then eventually also the cost-to-go functions J_k and, more generally, the optimal cost J_0 of the problem.
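A quick dimension count (my own illustration) makes the contrast vivid. The information vector is I_k = (y_0, ..., y_k, u_0, ..., u_{k-1}), so with y_k in R^p and u_k in R^m its dimension is (k+1)p + km, growing linearly in k, while the candidate sufficient statistic E[x_k | I_k] stays in R^n for all k. The dimensions n, p, m below are assumed values:

```python
# Compare the growing dimension of the information vector I_k with the
# fixed dimension of E[x_k | I_k].  n, p, m are assumed dimensions.
n, p, m = 2, 1, 1      # state, observation, and control dimensions

for k in [0, 1, 5, 50]:
    dim_Ik = (k + 1) * p + k * m       # y_0..y_k plus u_0..u_{k-1}
    print(f"k={k:3d}: dim(I_k)={dim_Ik:4d}, dim(E[x_k|I_k])={n}")
```

Any DP table indexed by I_k must therefore cope with a state whose dimension grows without bound, whereas one indexed by the sufficient statistic lives in a fixed-dimensional space.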
So a sufficient statistic is that bit of information within I_k which is sufficient for us to compute the optimal control, and thereby the value functions, or the cost-to-go, and the eventual optimal cost of the problem. It is the thing the control can really be made a function of. In general the control is a function of all of I_k, but if we can derive from I_k a sufficient statistic, then we can simplify our exploration by asking what the control can be made a function of: it can be made a function of that sufficient statistic alone. For the LQ problem it turns out that the sufficient statistic is E[x_k | I_k]: only the conditional mean of x_k given I_k is needed to compute u_k*. Remember that I_k, as a vector of random variables, can be used to derive many different other pieces of information, many different random variables. The conditional expectation E[x_k | I_k] is just one of the random variables you can derive from I_k, but it turns out to be sufficient to compute the optimal action in this particular problem. In the LQG problem it further happens that this sufficient statistic can itself be computed in a recursive form; that is an additional simplification that happens in the LQG case.
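That recursive computation is exactly the Kalman filter from the previous lecture: instead of recomputing E[x_k | I_k] from the ever-growing vector I_k, the filter carries only a fixed-size pair (estimate, covariance) and updates it at each step. A sketch, under the same illustrative matrices as before and with stand-in observations rather than data from a real system:

```python
# Propagate the sufficient statistic E[x_k | I_k] recursively with a
# Kalman filter.  System matrices and noise covariances are assumptions.
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
W = 0.01 * np.eye(2)     # process noise covariance
V = np.array([[0.1]])    # observation noise covariance

x_hat = np.zeros(2)      # prior mean E[x_0] (assumed)
P = np.eye(2)            # prior covariance of x_0 (assumed)

def kalman_step(x_hat, P, u, y):
    """One measurement update + time update; fixed size regardless of k."""
    # Measurement update: fold the new observation y_k into the estimate.
    S = C @ P @ C.T + V                     # innovation covariance
    K = P @ C.T @ np.linalg.inv(S)          # Kalman gain
    x_post = x_hat + K @ (y - C @ x_hat)    # E[x_k | I_k]
    P_post = (np.eye(2) - K @ C) @ P
    # Time update: predict E[x_{k+1} | I_k] through the dynamics.
    return A @ x_post + B @ u, A @ P_post @ A.T + W, x_post

rng = np.random.default_rng(2)
for k in range(10):
    y = rng.normal(size=1)                  # stand-in for a real observation y_k
    x_hat, P, x_post = kalman_step(x_hat, P, np.zeros(1), y)

print(x_post)                               # E[x_9 | I_9] under these assumptions
```

Plugging x_post into u_k = L_k x_post, with L_k from the Riccati recursion, would close the loop: that combination is the complete LQG controller.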
The sufficient statistic does not change: it is still the conditional expectation of the state given the information, but now it can be computed easily, thanks to the Gaussian nature of the noise. What is all of this telling us? It is telling us that there should be a way to approach problems with partial state information without having to take the information vector as the state, because then the state becomes too large and the problem too complicated. The question for us then is: what is the right sufficient statistic? The conditional expectation of the state given the information was the sufficient statistic for the LQ problem, but it is not necessarily the sufficient statistic for every problem. If you go back to the first lecture of our course, we saw that the mean does not always capture all the information we can get about a random variable, and the same lesson applies here: the conditional mean may not always be enough. It just happens that in linear quadratic problems the mean of the state given the information is the sufficient statistic. The question then arises: what is the sufficient statistic for every problem, and is there even such a thing as a universal sufficient statistic that can be used for every class of problems? This is something we will discuss in the next class.