When one studies information structures, one constantly has to contend with the question of what information a certain action is taken with. A way to state this formally is to say that there is a measurable function that maps the information random variables to the action random variables. This is a concept from mathematical analysis that needs measure theory, sigma algebras and so on, and I am not going to go deeper into it here. Everything I say today can be made formal using those concepts, but I will instead appeal to an intuitive understanding of what it means for the information of a certain random variable to be available while choosing a certain action. For our purposes it is enough to understand logically how the information of a random variable can be used. We will use the following notation: when we want to say that z is a function of a variable w, we will write z ∈ σ(w). This notation is borrowed from analysis. It says that z is a measurable function of w, or that z is adapted to the sigma algebra generated by w, or, in our simple language, that z is produced from the information of w. Now let us look at some variations of the Witsenhausen problem. We have x0 as the initial state and the observation y0 = x0. The first controller chooses an action u1 as a function of x0, and the state becomes x1 = x0 + u1. Using our notation, u1 ∈ σ(x0). The second controller observes y1 = x1 + v, where v is the observation noise, and chooses u2 as a function of y1, so u2 ∈ σ(y1).
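The information structure just described can be sketched in code. This is a minimal illustration, assuming, purely for concreteness, standard Gaussian x0 and v (the lecture does not fix distributions here); the function names are mine, not from the lecture.

```python
import random

def rollout(gamma1, gamma2, rng):
    """One realization of the two-stage system, making the information
    structure explicit: u1 is produced from x0 only (u1 in sigma(x0)),
    and u2 from y1 only (u2 in sigma(y1)). Gaussian x0 and v are an
    illustrative choice only."""
    x0 = rng.gauss(0, 1)      # initial state, observed as y0 = x0
    v = rng.gauss(0, 1)       # observation noise
    u1 = gamma1(x0)           # first action: a function of x0 alone
    x1 = x0 + u1              # state after the first action
    y1 = x1 + v               # the second controller's observation
    u2 = gamma2(y1)           # second action: a function of y1 alone
    return x0, v, u1, x1, u2
```

For instance, with gamma1(x) = -x the first controller drives the state to x1 = 0, and the second controller then observes pure noise, y1 = v.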
Then, thanks to u2, the state changes to x2 = x1 - u2, and that is the resulting state. The cost we want to minimize, over u1 chosen with the information of x0 and u2 chosen with the information of y1, is the expectation of k² u1² + (x1 - u2)². This is Witsenhausen's problem. Now let us look at variations of it. Variant 1 is the case of full noise observation. What does this mean? u1 is still a function of x0, but when you are choosing u2, u2 now has access to both x0 and the noise, so u2 is a function of (x0, v). The cost is kept the same; only the information structure has changed, so this is a new information pattern. Let us argue through what we can see here. Since u2 knows x0, it can reconstruct u1, because u1 is a function of x0 itself; and therefore it can reconstruct x1 = x0 + u1. So the knowledge of x0 implies knowledge of x1. And since u2 also knows v, it can reconstruct x1 + v as well. As a result, u2, which has the information of x0 and v, also knows x1 and x1 + v.
So our problem becomes: minimize the expectation of k² u1² + (x1 - u2)², subject to u1 being a function of x0 and u2 being a function of (x0, v); and, as we just argued, from (x0, v) the second controller can reconstruct x1. So what should this team of players do? The second controller's purpose is simply to come as close to x1 as possible, and from the information it already has it can reconstruct x1 exactly. So it can output x1 itself and make the estimation error term zero. Since that term can be made zero regardless of what the first controller does, the first controller has a very simple task: it only has to make the remaining term k² u1² zero, so the optimal choice is to take u1 identically equal to 0. And once u1 = 0, we have x1 = x0.
Now x1 = x0 and u2 can reproduce x0, so you can take u2 = x0, and the cost equals 0, which is in fact the optimal cost. So when full noise observation is available, you can choose your controllers in such a way that the cost hits the unconstrained global minimum, whose value is 0; this is your optimal control. Now let us look at another variant, variant 2, which is a classical information pattern. This is something we had seen earlier, but I want to emphasize something about it again. In this case u1 is a function of x0, and u2 is a function of x0 and y1, where y1 = x1 + v. Another way to write this is u1 ∈ σ(y0) and u2 ∈ σ(y0, y1). Now let us write this out. u2 is a function of (y0, y1); y0 itself is x0, so u2 is a function of (x0, y1), and y1 = x0 + u1 + v. But u1 is chosen as a function of y0, and σ(y0) = σ(x0), so from x0 the second controller can compute u1; and then, from x0 and x0 + u1 + v, it can compute v. So u2 can effectively be chosen as a function of (x0, v).
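As a quick sanity check on variant 1, here is a minimal sketch of the optimal policy u1 = 0, u2 = x0, showing that the realized cost is exactly zero; the function name and the Gaussian draws are illustrative assumptions, not from the lecture.

```python
import random

def variant1_cost(x0, v, k=0.2):
    """Per-realization cost under the variant-1 optimal policy.
    u2 observes (x0, v); v is available but not even needed, since
    with u1 = 0 the state is x1 = x0 and u2 = x0 matches it exactly."""
    u1 = 0.0                # first controller does nothing
    x1 = x0 + u1            # so x1 = x0
    u2 = x0                 # second controller reproduces x1 from its info
    return k**2 * u1**2 + (x1 - u2)**2

rng = random.Random(1)
assert all(variant1_cost(rng.gauss(0, 1), rng.gauss(0, 1)) == 0.0
           for _ in range(100))
```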
So we get back what we had earlier, but there is a slight distinction between the two cases. In variant 1 we were given the information of x0 and v directly while choosing u2, whereas here we deduced that u2 could be chosen as a function of just x0 and v, assuming that u1 is chosen as a function of x0. In the classical information pattern, knowing that the earlier controller is chosen in a certain way, you can effectively choose your controller as a function of the noise; you effectively have access to the noise. That is different from being given the noise observation in the first place. In any case, once u2 ∈ σ(x0, v), we can conclude that the previous controllers are optimal here as well. Notice that in either of these cases u2 is a function of x0 and v, so the information of u2 is independent of the policy γ1: it does not depend on the policy anymore. In variant 1 this was given from the very beginning, whereas in the second case it follows from the information pattern, because u1 is chosen as a function of y0 = x0.
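The deduction in the classical pattern, that u2 can effectively compute v, can be sketched as follows. This is an illustration under the assumption that the second controller knows the first-stage policy γ1, as the argument above requires; the names are mine.

```python
def recover_noise(x0, y1, gamma1):
    """Classical pattern: u2 knows x0 and y1 = x0 + u1 + v.
    Knowing the policy gamma1, it re-computes u1 = gamma1(x0)
    and solves for the noise v."""
    u1 = gamma1(x0)
    return y1 - x0 - u1   # = v
```

So access to (x0, y1) plus knowledge of γ1 gives effective access to (x0, v), even though v was never observed directly.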
So in both of these cases no dual effect is present. We say that there is a dual effect if the information of u2 depends on the policy γ1; in both of these cases there is no dual effect. Let us now look at a third variant, variant 3, which we can call past control observation: the second controller has access to the past control action. Here u1 is a function of y0, which is itself x0; and u2 is a function of y1, which is what it knew in the Witsenhausen problem, but it also knows u1. So u2 ∈ σ(y1, u1). Now let us go a little deeper. y1 = x1 + v, and x1 = x0 + u1, so y1 = x0 + u1 + v, and we are also given u1. Notice what is happening: I have been given u1 and x0 + u1 + v. Using the knowledge of u1, I can infer the value of x0 + v, but I still cannot determine x0 and v separately. So at best σ(y1, u1) = σ(u1, x0 + v); this is what I can determine from here. Now the claim is that there is a dual effect in this problem. How does one show a dual effect? We need to show that the choice of γ1 affects the available information. Let us look at this more closely.
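The reduction of the second controller's information in variant 3, from (y1, u1) down to (u1, x0 + v), can be sketched as a one-line computation; this is an illustration with names of my choosing.

```python
def reduce_information(u1, y1):
    """Variant 3: u2 observes (y1, u1) with y1 = x0 + u1 + v.
    Subtracting the known u1 leaves s = x0 + v, so the information
    is sigma(u1, x0 + v); x0 and v are not separately determined."""
    s = y1 - u1
    return u1, s
```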
To do this, I have to pick two values of γ1 and show that the information the second controller has under those two policies is different. First, suppose γ1 is one-to-one, that is, invertible. If γ1 is invertible, then from u1 = γ1(x0), knowing u1 we can find x0. Therefore σ(u1, x0 + v) = σ(x0, x0 + v) = σ(x0, v). Remember, we are still choosing u1 as a function of x0, as before; we are just going to choose different kinds of functions of x0. In this first case we chose a one-to-one, invertible function, and we find that the information of the second controller is x0 and v. In other words, the second controller is able to know the noise in the system when γ1 is invertible. Now let us take the other extreme: suppose γ1 is a constant, so u1 = γ1(x0) = c for some constant c. Then knowing u1 gives me no information; it is just a constant. Therefore σ(u1, x0 + v) = σ(x0 + v).
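The invertible case can be made concrete: given the inverse of γ1, the second controller recovers x0 from u1 and then v from x0 + v. This is a sketch under the assumption of a specific invertible policy γ1(x) = 2x chosen by me for illustration.

```python
def second_stage_info(gamma1_inv, u1, s):
    """Variant 3 with invertible gamma1: u2 observes (u1, s), s = x0 + v.
    Invert gamma1 to get x0 = gamma1^{-1}(u1), then v = s - x0.
    With a constant gamma1 this inversion is impossible and only
    s = x0 + v is known."""
    x0 = gamma1_inv(u1)
    v = s - x0
    return x0, v
```

Usage: with γ1(x) = 2x, the inverse is u/2, and (x0, v) comes back exactly.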
Notice the difference: in the first case, when γ1 was an invertible function of x0, you could know the noise separately, you would know what x0 is and what v is; but if γ1 is a constant function, you know only x0 + v. And in general σ(x0 + v) is not equal to σ(x0, v): knowing x0 + v does not in general give you (x0, v). So the information, and the available choices, of the second controller depend on the choice of the function, the policy, used at the first stage. As a result there is a dual effect in this problem, which is what we wanted to show. An interesting angle here is that there is not only a dual effect; it turns out there is in fact no optimal solution to this problem. Here is one argument. Suppose ε > 0 is some constant; we will construct γ1, γ2 to get cost less than or equal to a constant times ε². The simple way to do that is to take u1 = ε y0, which is simply ε x0, and, since we are in the past control observation case and u2 knows u1, to take u2 = u1 / ε; recall u1 = ε x0.
So u2 becomes x0 itself. Let us evaluate what the cost then becomes. The cost is k² u1² + (x0 + u1 - u2)². The first term is k² ε² x0². In the second term, u2 = u1 / ε = x0, so x0 + u1 - u2 = u1 = ε x0, and the second term is again ε² x0². In short the cost equals ε² (k² + 1) x0², and the expected cost is ε² (k² + 1) times E[x0²], in other words some constant times ε². We can take ε down to 0 and get as low a cost as we want, so the optimal cost equals 0. But it turns out there is no policy that achieves cost equal to 0; we can never actually get cost 0. Why is this the case? To get cost 0 we need u1 to be identically 0, and we need u2 to equal x1 = x0 + u1 = x0 since u1 = 0. So what we need is u1 = γ1(x0) identically equal to 0, and u2, which is a function of (u1, x0 + v), equal to x0. But u1 is then a constant, so, as we have seen, u2 is a function of just x0 + v, and we want this function of x0 + v to be equal to x0.
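The ε-policy argument can be checked numerically: the expected cost scales like ε² (k² + 1) E[x0²] and shrinks as ε does. A Monte Carlo sketch, assuming standard Gaussian x0 and v purely for illustration (the conclusion does not depend on the distribution):

```python
import random

def eps_policy_cost(eps, k=0.2, n=50_000, seed=0):
    """Variant 3 epsilon-policy: u1 = eps*x0, and since u2 observes u1
    it can take u2 = u1/eps = x0. Per-realization cost is then
    eps^2 * (k^2 + 1) * x0^2, so the average goes to 0 with eps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        u1 = eps * x0
        x1 = x0 + u1
        u2 = u1 / eps          # second controller recovers x0 exactly
        total += k**2 * u1**2 + (x1 - u2)**2
    return total / n
```

Shrinking ε drives the estimated cost toward 0, even though, as argued next, no single policy attains cost exactly 0.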
That means from x0 + v we should somehow be able to figure out x0 itself, and that is not possible; this is a contradiction. We have seen this before: when u1 is taken as a constant function, u2 only has access to x0 + v, and from there it cannot reproduce x0. So there is no policy that achieves cost equal to 0. What this means is that for this particular variant there is an optimal value to the control problem, but there is no optimal policy. This is one more of the subtleties that arise once you have non-classical information patterns: you can have problems for which there is an optimal cost but no optimal solution. We will see more about non-classical information patterns in the following lectures.