Welcome, everyone. In the previous lecture, I talked to you about information structures in stochastic control. I mentioned that the problems we have considered so far all involve what we called a classical information structure. A classical information structure is one where the information known in the past is also available to us in the future. This is our definition of a classical information structure, and this is what we have studied so far. Any information structure which is not classical is called non-classical. What I argued then was that if you want to study, in a holistic manner, the optimal design of any system in which there is some amount of decentralization, that is, where data is either to be stored across time or transmitted across space, then in either case there will be the issue of noise in the communication medium. As a result, one has to confront the fact that the problem will end up having a non-classical information structure. Either the problem has a non-classical information structure built into the control actions themselves, that is, in the sensing and decision-making blocks; or, if you decide to counter the noise in the medium by introducing a transmitter and a receiver that perform some kind of encoding and decoding, then those transmitters and receivers have a non-classical information structure, and they need to be incorporated into your design problem, because the optimal design involves a joint design of all of these components. Of course, we have not yet shown that the joint design is better than a separate design; this is an issue we will come to in today's lecture.
Now, today's lecture is about, as I had mentioned in the previous class, a historic paper in the decision and control community. It is a paper by Witsenhausen; let me show you a glance of it here. The paper is called "A counterexample in stochastic optimum control". The author is Hans Witsenhausen, and it was published in 1968 in the SIAM Journal on Control. Let us go through the abstract for a moment. It reads: "It is sometimes conjectured that nothing is to be gained by using nonlinear controllers when the objective is to minimize the expectation of a quadratic criterion for a linear system subject to Gaussian noise and with unconstrained control variables. In fact, this statement has only been established for the case where all control variables are generated by a single station which has perfect memory. Without this qualification, this conjecture is false." So this is what Witsenhausen showed in this paper: once there is some kind of decentralization in the problem, the linearity of your optimal controllers can fail. What is this linearity that Witsenhausen is talking about? We discussed it a few lectures ago. Remember that in a linear quadratic Gaussian (LQG) problem, the optimal control takes the form of a superposition: you apply the same controller that you would apply in the absence of noise, but you apply it to the conditional expectation of the state given the information, and that conditional expectation is computed recursively from a filter. Now, when the observation noise and the system noise are Gaussian, the conditional expectation of the state given the information is also linear in the information. Consequently, the optimal action, viewed as a function of the information, is also linear.
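In symbols, the LQG structure just described can be sketched as follows; the shorthand is mine (L_k denotes the gain of the controller you would use in the noise-free problem; it is not defined in this lecture):

```latex
u_k^* \;=\; -L_k\,\hat{x}_k,
\qquad
\hat{x}_k \;=\; \mathbb{E}\!\left[x_k \mid I_k\right].
```

Under Gaussian noise the filter makes \hat{x}_k a linear function of the information I_k, so u_k^* is itself linear (affine) in the information.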
So, what Witsenhausen shows in this paper is that this result, that the optimal action is a linear function of the information, is contingent on the assumption of a classical information structure. If your information structure is not classical, this result fails. Let us see what he says about this in the introduction: "In a stochastic control problem, control actions have to be taken at various instants in time as a function of the data then available. One seeks functions for which the expected value of the cost under given noise distributions is minimized. It is usually assumed that all actions to be taken at a given time are based on the same data and that any data available at time t will be available at any later time t' > t. This situation is the classical information pattern." Remember, this is something we have talked about: we said that I_k is a subset of I_{k+1}, and that is the classical information pattern. He continues: "Considering in particular unconstrained control of linear systems with Gaussian noise and quadratic criteria, it is well known that the search for an optimum can safely be confined to the class of affine", that is, linear plus constant, "functions." This is also something we have seen. "This is the case for both discrete and continuous time systems with classical information pattern." In this course we have not looked at continuous-time systems, but the result is true in those systems as well. "In this paper, it is shown that the class of affine functions is not always adequate when the information pattern is not classical." So, when the information pattern ceases to be classical, there will be problems in which nonlinear controllers outperform all affine or linear controllers. "A counterexample is presented for which it is established that an optimal design exists and that no affine design is optimal."
He also notes that there does not appear to exist any counterexample involving fewer variables than the one presented there, and the practical importance of non-classical patterns is discussed. So this paper presents what is perhaps the simplest counterexample to the belief of the time that linear controllers are optimal whenever a linear system, a quadratic cost, and Gaussian noise are involved. What we will do in the coming few lectures is study this counterexample and understand in depth what it is teaching us. So, what is the counterexample? Let me start writing it out here. The Witsenhausen counterexample is the following. It comprises a simple scalar linear system. The state of the system at the initial instant, at time 0, is x0; let us draw it in a box here. Now, there is a first controller that acts on it, and the observation it receives is y0. Based on this observation the controller acts and produces an action u1. The action u1 changes the state to a new state x1; I will explain shortly how x1 is derived from u1. Then x1 gets corrupted by noise: let me add a noise block here, and this noise is denoted v. The noise gets added on, so what comes out of the block is x1 + v. This is your second observation: y1 = x1 + v.
And now, based on this observation, a second controller acts and takes an action u2. The state then changes to x2 = x1 - u2. I have drawn all of this in sequence; of course, one can draw it in a slightly different way as well. The observation out here, x1 + v, has been written in the same sequence because I want u2 to act on x1 + v. So u2 is a function of x1 + v, but u2 actually manipulates the state itself: x1 is changed to x1 - u2, and that becomes the new state. Let me summarize. We have an initial state x0. The state equations are x1 = x0 + u1 and x2 = x1 - u2. The observation equations are y0 = x0 and y1 = x1 + v. Now, the decision we need to make is to choose u1 and u2, and they are to be chosen to minimize the following cost: the expectation of k^2 u1^2 + x2^2. This is the cost we want to minimize. The main thing we need to observe here is the assumption we have on the information structure. The classical information structure would have been one where the information at any time k comprises all the observations up until that time and all the actions taken up until that time. That was our notion of the classical information pattern. I will not confuse you by writing out the classical information pattern here, but let us just keep it in mind.
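As a sanity check, the dynamics and cost above can be simulated directly. Here is a minimal Monte Carlo sketch; the parameter values (k = 0.2, x0 with standard deviation 5, unit-variance v) and the example policies are my own illustrative choices, not from the lecture:

```python
import numpy as np

def expected_cost(gamma1, gamma2, k=0.2, sigma0=5.0, sigmav=1.0,
                  n=200_000, seed=0):
    """Monte Carlo estimate of E[k^2 u1^2 + x2^2] for given policies."""
    rng = np.random.default_rng(seed)
    x0 = sigma0 * rng.standard_normal(n)   # initial state, Gaussian
    v = sigmav * rng.standard_normal(n)    # observation noise, independent Gaussian
    y0 = x0                                # first observation: y0 = x0
    u1 = gamma1(y0)                        # first action depends only on y0
    x1 = x0 + u1                           # state equation: x1 = x0 + u1
    y1 = x1 + v                            # second observation: y1 = x1 + v
    u2 = gamma2(y1)                        # second action depends only on y1
    x2 = x1 - u2                           # state equation: x2 = x1 - u2
    return float(np.mean(k**2 * u1**2 + x2**2))

# Example: u1 = 0 (do nothing), with a linear second stage.  With u1 = 0,
# x1 = x0 is Gaussian, so E[x1 | y1] = sigma0^2 / (sigma0^2 + sigmav^2) * y1.
cost = expected_cost(lambda y0: 0.0 * y0,
                     lambda y1: (25.0 / 26.0) * y1)
print(cost)   # roughly 25/26, the error of estimating x0 from x0 + v
```

The function takes the two policies as arguments, which will be convenient for comparing different choices of gamma1 and gamma2 later.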
So, what we will now do is look at the information pattern of this particular problem. The information assumed here is this: when we are taking action u1, the information that we have, call it I1, is simply the observation y0. We have only y0 at that time, and remember that y0 = x0. When we take the second action u2, the information I2 is only y1, and y1 = x1 + v. Now, what you notice here is that this is not a classical information pattern. Why is it not classical? Because I2 does not contain I1: I1 is not a subset of I2. So this is a non-classical information pattern, or structure. What about the other ingredients of the problem? One thing I forgot to mention: we are going to assume that x0 and v are Gaussian and independent. With this assumption, the problem now looks, on the face of it, like a linear quadratic problem: the state evolves linearly, the observations are linear functions, the noise is Gaussian, and the initial state is also Gaussian. What about the cost? The cost is quadratic, and it is separable: k^2 u1^2 is the stage-wise cost of stage 1, and x2^2 is the stage-wise cost of stage 2. The stage-1 cost does not even involve x1^2, and that is okay; the stage-2 cost does not involve u2, and that is also okay. But what, eventually, is this?
This is a quadratic cost that we are trying to minimize by choosing controllers, where the system state evolves linearly, the observations we get are linear, and the noise in the system is Gaussian. The only difference from the linear quadratic Gaussian problems we have studied so far is the information structure. The information structure here is that controller 1 knows only x0, while controller 2 knows only its observation y1 = x1 + v; importantly, controller 2 does not know x0. Now that we have absorbed this particular problem, let us try to understand what it is telling us. How do we understand what is going on here? Since this is a stochastic control problem, we want to find a policy, which we will denote by (gamma1, gamma2). What does it mean to find this policy? It means we want to find gamma1 and gamma2 such that u1 = gamma1(y0) and u2 = gamma2(y1), so as to minimize the cost, the expectation of k^2 u1^2 + x2^2, with respect to gamma1 and gamma2. Before we move forward, let us first try to understand the underlying tension in this problem. Look at the cost: k^2 u1^2 + x2^2. And what is x2? Remember, x2 = x1 - u2. If x2 = x1 - u2, then what is the second term? It is E[x2^2] = E[(x1 - u2)^2].
Now, remember that u2 is being chosen as a function of y1, which means this term is akin to the minimum mean-square estimation of x1 given y1. Using the information present in y1, you want to get the best estimate of x1; that is what this second stage seems to be about, and in fact that is exactly what it is attempting to do. So let me write out this cost in the following way, as a minimization in two stages: a minimization over gamma1 outside, and inside it a minimization over gamma2. I first minimize with respect to gamma2, then with respect to gamma1, and the inner minimization with respect to gamma2 is done keeping the outer gamma1 fixed. The inner minimization is over gamma2 of the expectation of k^2 u1^2 + x2^2. Now, inside this expectation, since gamma1 is fixed, u1^2 is influenced by gamma1 but not by gamma2. So, as far as the inner minimization is concerned, u1^2 is a constant with respect to gamma2, and it moves out. I can therefore write this as a minimization outside with respect to gamma1 of k^2 E[u1^2] plus the minimization with respect to gamma2 of E[x2^2]. Now let us plug in x2 = x1 - u2. So let me just erase this and write that here: the inner term becomes the minimization over gamma2 of E[(x1 - u2)^2], with the minimization over gamma1 outside.
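The two-stage decomposition just described can be written compactly as:

```latex
\min_{\gamma_1,\gamma_2}\,
\mathbb{E}\!\left[k^2 u_1^2 + x_2^2\right]
\;=\;
\min_{\gamma_1}\left(
  k^2\,\mathbb{E}\!\left[u_1^2\right]
  \;+\;
  \min_{\gamma_2}\,
  \mathbb{E}\!\left[(x_1 - u_2)^2\right]
\right),
```

where the inner minimization is performed with gamma1 held fixed, so that the distributions of x1 and y1 are also fixed.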
So, if I fix gamma1, then what I am really doing in the second stage is estimating x1 from whatever information is available to me in order to choose u2. Since u2 is chosen as a function of y1, this is basically estimating x1 from y1. We actually know what the optimal estimate here is: the optimal gamma2, gamma2*(y1), is the conditional expectation of x1 given y1. That is, gamma2*(y1) = E[x1 | y1]. What we are doing when we take the second action is simply estimating x1 given the information we have, and that information is y1. Now, this sounds very much like what we have been doing in the LQG problem so far: we estimate the state given the information. So this does not sound strange at all. The only trouble is the following. If you look at this particular expression, something subtle has happened. Notice that x1, whose expectation we are taking, is obtained from the initial state x0: remember x1 = x0 + u1, and u1 = gamma1(y0). So this x1 is in fact a function of gamma1. Once you fix the function gamma1, u1 gets defined, and from u1, x1 gets defined. So there is implicitly a presence of gamma1 here. But this also happens in all LQG problems; it is not a new thing that past control policies influence the future state.
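To preview why this information structure matters, here is a quick numerical experiment; the parameter values (k = 0.2, x0 ~ N(0, 25), v ~ N(0, 1)) are a commonly used benchmark for this problem and are my own choice, not from the lecture. It compares one affine first-stage policy (u1 = 0) against a nonlinear "signaling" policy that forces x1 = 5 sign(x0), each followed by its own optimal second stage u2 = E[x1 | y1]:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, sigma0 = 500_000, 0.2, 5.0
x0 = sigma0 * rng.standard_normal(n)   # x0 ~ N(0, 25)
v = rng.standard_normal(n)             # v ~ N(0, 1), independent of x0

def cost(u1, gamma2):
    """E[k^2 u1^2 + x2^2] when u2 = gamma2(y1) and y1 = x1 + v."""
    x1 = x0 + u1
    u2 = gamma2(x1 + v)                # u2 may use only y1 = x1 + v
    return float(np.mean(k**2 * u1**2 + (x1 - u2)**2))

# Affine first stage: u1 = 0, so x1 = x0 is Gaussian and E[x1|y1] is linear.
affine = cost(np.zeros(n), lambda y1: (sigma0**2 / (sigma0**2 + 1.0)) * y1)

# Nonlinear "signaling" first stage: u1 = 5*sign(x0) - x0 forces x1 = +/-5.
# For x1 in {-5, +5}, equally likely, with unit-variance noise,
# E[x1 | y1] = 5 * tanh(5 * y1).
u1_nl = sigma0 * np.sign(x0) - x0
nonlinear = cost(u1_nl, lambda y1: sigma0 * np.tanh(sigma0 * y1))

print(affine, nonlinear)   # the nonlinear policy achieves a lower cost
```

The affine policy pays the full estimation error of a Gaussian state, while the signaling policy spends some control energy in stage 1 to make x1 very easy to estimate from y1. That nonlinear policies can beat every affine one in this problem is exactly what the coming lectures examine.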
But the thing that is unique about this problem is that the past control policy not only appears in x1; it also appears inside y1. What is y1? Remember, y1 = x1 + v. So hidden inside y1 is gamma1 as well: the very information on which we condition is a function of gamma1. This is something that does not happen in a usual control problem, and it is something I will explain in the next class. It is the additional complexity coming from the information structure of the problem, namely that the previous step's policy also appears in the optimization of the next step. We will discuss this more in the next class.