Welcome back. Let us now look at another problem in which the two agents have non-identical measurements, but this time neither agent has trivial information. So the problem is one of non-identical and non-trivial measurements. What are we going to assume? We assume that agent 1's information is i1 = σ({ω1}, {ω2, ω3}): the elements ω2 and ω3 have been put into one set for agent 1. For agent 2, the information is i2 = σ({ω1, ω2}, {ω3}). So agent 1 is not able to distinguish between ω2 and ω3; as far as agent 1 is concerned, ω2 and ω3 are identical, though he can tell whether or not ω1 has occurred. For agent 2, ω1 and ω2 are identical; he can only distinguish ω3 from ω1 or ω2. This was the case in one of our earlier examples as well. The policies for the two agents now take the following form. A typical policy for agent 1 is γ1(y1) = A if y1 = ω1 and B if y1 ∈ {ω2, ω3}: he takes an action A when he observes ω1 and an action B when he observes ω2 or ω3. Similarly, a typical policy for agent 2 is γ2(y2) = C if y2 ∈ {ω1, ω2} and D if y2 = ω3. Thanks to this, each agent now has four different policies: four for agent 1 and four for agent 2, as we had earlier.
With four policies for each agent, you combine them into a 4 × 4 matrix, that is, a matrix with 16 entries, and write out the cost for each pair of policies; that gives you your function J. Once you write out this 4 × 4 matrix, you look for the entry with the least possible cost, and that tells you the policies the agents should choose. I have done this for several examples now, so I will not go into the details of this particular example; I will just tell you the correct answer so that you can check it on your own. It turns out that the optimal choice for agent 1 is γ1*(y1) = U if y1 = ω1 and D otherwise, that is, if y1 ∈ {ω2, ω3}; and the optimal choice for agent 2 is γ2*(y2) = R if y2 ∈ {ω1, ω2} and L if y2 = ω3. These are the optimal choices for the two agents; you can compute the cost for every pair of policies and verify that they are in fact optimal. Now, what we see when the agents have non-identical information is that we essentially have to enumerate every pair of policies. There are a few things we can do a little more smartly, but apart from that this is pretty much the best we can do as far as computing optimal policies is concerned. In general, what happens in a static team problem can be seen in the following way; it happened in the previous problem with non-identical measurements as well as in this one, and you would see something of this form playing out in every problem with non-identical measurements. So what is that?
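The enumeration just described can be sketched in code. The prior (0.3, 0.3, 0.4) is the marginal on ω used in the lecture's running example, but the cost table below is purely hypothetical, since the actual cost numbers are not reproduced here; the point is only the mechanics of building the 16-entry matrix and picking its minimizer.

```python
from itertools import product

# Brute-force enumeration of all 4 x 4 policy pairs for the static team.
# Prior follows the lecture's running example; the cost table is hypothetical.
prior = {"w1": 0.3, "w2": 0.3, "w3": 0.4}
cost = {  # cost[(u1, u2, omega)] -- illustrative values only
    ("U", "L", "w1"): 0, ("U", "L", "w2"): 4, ("U", "L", "w3"): 3,
    ("U", "R", "w1"): 1, ("U", "R", "w2"): 5, ("U", "R", "w3"): 2,
    ("D", "L", "w1"): 6, ("D", "L", "w2"): 1, ("D", "L", "w3"): 0,
    ("D", "R", "w1"): 2, ("D", "R", "w2"): 0, ("D", "R", "w3"): 4,
}

def y1(w):  # agent 1's observation: which cell of {w1} / {w2, w3} occurred
    return "w1" if w == "w1" else "w23"

def y2(w):  # agent 2's observation: which cell of {w1, w2} / {w3} occurred
    return "w12" if w in ("w1", "w2") else "w3"

# Each agent maps 2 observation values to 2 actions: 2^2 = 4 policies each.
policies1 = [dict(zip(("w1", "w23"), a)) for a in product("UD", repeat=2)]
policies2 = [dict(zip(("w12", "w3"), a)) for a in product("LR", repeat=2)]

def J(g1, g2):  # expected cost of the policy pair (g1, g2)
    return sum(p * cost[(g1[y1(w)], g2[y2(w)], w)] for w, p in prior.items())

# The 16-entry "matrix" and its minimising entry.
table = {(i, j): J(g1, g2)
         for i, g1 in enumerate(policies1) for j, g2 in enumerate(policies2)}
best = min(table, key=table.get)
```

Whatever numbers replace the hypothetical cost table, the procedure is the same: fill all 16 entries of `table` and read off the argmin.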
Notice that what we want to do is minimize J, which is a function of γ1 and γ2, with respect to both γ1 and γ2 together. Now, we can choose to minimize with respect to one of them first and then the other: here I minimize first with respect to γ2 and then with respect to γ1, where J is the expectation of L(γ1(y1), γ2(y2), ψ). I can now use the law of iterated expectations to rewrite this inner expectation. Outside, I am still minimizing over γ2; inside, I take the expectation of a conditional expectation, conditioning on y2. That is, I take the conditional expectation of L(γ1(y1), γ2(y2), ψ) given y2. Since this is a conditional expectation, it fixes y2 and is taken with respect to the distribution of ψ given y2; in fact, we can be more explicit and say it is taken with respect to the distribution of (ψ, y1) given y2, knowing that both y1 and ψ are, after all, random variables. Now, the inner minimization is with respect to γ2, and the function inside the braces is a function of y2, one value for each fixed y2. So when we do the minimization with respect to γ2, we can effectively choose the action that γ2 assigns to each value of y2.
What I mean is that this minimization can actually go inside, because whatever is under the curly brackets is a function of y2. The minimization with respect to γ1 stays outside; the outer expectation is with respect to y2; and inside I now minimize not over the policy but over the action chosen by agent 2. So I minimize over u2 the inner conditional expectation of L(γ1(y1), u2, ψ) given y2. What this gives us is a finite-dimensional optimization with respect to u2, which can be computed for every value of y2, and from it we get u2* = γ2*(y2). But there is a little problem here: this has been done for a fixed value of γ1. We fixed a γ1, and that γ1 has made an appearance inside the conditional expectation. So what we get is actually not the optimal γ2* of the problem but agent 2's optimal response to the policy γ1 chosen by agent 1. This is what we call the best response to γ1, as a function of y2. So effectively, if one wants to use this logic to compute the optimal strategy, one now has to plug in this best response, view it as a function of γ1, and then minimize with respect to γ1 outside, knowing that γ2 itself is going to change with γ1.
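The interchange argued in the last two paragraphs can be written compactly in display form; this is just the lecture's derivation, with ψ the environmental random variable:

```latex
\min_{\gamma_1,\gamma_2} J(\gamma_1,\gamma_2)
  = \min_{\gamma_1}\min_{\gamma_2}
      \mathbb{E}\!\left[ L\big(\gamma_1(y_1),\gamma_2(y_2),\psi\big) \right]
  = \min_{\gamma_1}\min_{\gamma_2}
      \mathbb{E}_{y_2}\!\left[ \mathbb{E}\big[ L(\gamma_1(y_1),\gamma_2(y_2),\psi) \,\big|\, y_2 \big] \right]
  = \min_{\gamma_1}
      \mathbb{E}_{y_2}\!\left[ \min_{u_2} \mathbb{E}\big[ L(\gamma_1(y_1),u_2,\psi) \,\big|\, y_2 \big] \right].
```

The second equality is the law of iterated expectations; the third moves the minimization over γ2 inside the outer expectation, where it becomes a pointwise minimization over the action u2 for each fixed y2.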
So γ2 is now replaced by its best response, and that best response will itself keep changing with γ1. Once we do this, we can actually compute the optimal policy for the original problem: we get γ1*, and then the best response evaluated at γ1* gives the optimal policy for agent 2. But notice one of the things happening here: once again we have reached a stage where we need to know the entire function γ1 in order to compute even the best response to it. The problem has once again become what we saw in the Witsenhausen problem, where functions of functions come up in stochastic control problems. The same thing happens in a static team problem. Remember, I had warned you that static team problems are the simplest but not the easiest problems; they are simple but not necessarily easy. They do not have the complications of dynamic information structures, the dual effect and so on, but that does not mean finding optimal strategies for static teams is easy either. It does need some work, and that is because the best responses are all entangled in this sort of way. This is what one inherently encounters in any static team problem, and the way around it is simply to list out every pair of policies, one for agent 1 and one for agent 2, and compute the cost for each of those pairs.
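The best-response route can also be sketched in code, on the same kind of finite example. Again the prior follows the lecture's running marginal, while the cost table is hypothetical; the structure, fixing γ1, minimizing pointwise in y2 to get the best response, then minimizing over γ1, is what the lecture describes.

```python
from itertools import product

# Best-response construction: for each fixed gamma_1, compute agent 2's best
# response pointwise in y2, then minimise over gamma_1 with gamma_2 replaced
# by its best response.  Prior follows the lecture; cost is hypothetical.
prior = {"w1": 0.3, "w2": 0.3, "w3": 0.4}
cost = {
    ("U", "L", "w1"): 0, ("U", "L", "w2"): 4, ("U", "L", "w3"): 3,
    ("U", "R", "w1"): 1, ("U", "R", "w2"): 5, ("U", "R", "w3"): 2,
    ("D", "L", "w1"): 6, ("D", "L", "w2"): 1, ("D", "L", "w3"): 0,
    ("D", "R", "w1"): 2, ("D", "R", "w2"): 0, ("D", "R", "w3"): 4,
}

def y1(w):
    return "w1" if w == "w1" else "w23"

def y2(w):
    return "w12" if w in ("w1", "w2") else "w3"

def best_response(g1):
    """Agent 2's best response to g1: for each value v of y2, the action u2
    minimising E[L(g1(y1), u2, psi) | y2 = v]."""
    br = {}
    for v in ("w12", "w3"):
        ws = [w for w in prior if y2(w) == v]  # omegas consistent with y2 = v
        pv = sum(prior[w] for w in ws)         # P(y2 = v)
        br[v] = min(("L", "R"), key=lambda u2: sum(
            prior[w] / pv * cost[(g1[y1(w)], u2, w)] for w in ws))
    return br

def J(g1, g2):
    return sum(p * cost[(g1[y1(w)], g2[y2(w)], w)] for w, p in prior.items())

# Outer minimisation over gamma_1, with gamma_2 pinned to its best response.
all_g1 = [dict(zip(("w1", "w23"), a)) for a in product("UD", repeat=2)]
g1_star = min(all_g1, key=lambda g1: J(g1, best_response(g1)))
g2_star = best_response(g1_star)
```

Because the cost decomposes over the values of y2, minimizing over the whole function γ2 is the same as minimizing over the action separately for each y2, so this two-stage procedure recovers the same optimum as brute-force enumeration.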
Having said that, there may be occasions where things are easier. For instance, it may happen that the inner minimization yields a very simple answer, or the conditioning makes the problem much easier; if one of the agents has trivial information, the inner problem becomes rather easy, and the outer problem can then be tackled much more easily. These are ways of optimizing the computation when we encounter problems like these. Now let us go to another example. In this example, the agents' information is not described directly in terms of which parts of the sample space Ω they can see correctly; rather, it is described in terms of the value of another random variable that is correlated with ψ. You can think of this as a case where the agents get noisy measurements. Let us write this example down. As a simple instance, take the case where agent 2 does not get any measurement: agent 2's information is empty, y2 = ∅, or in other words i2 = σ({ω1, ω2, ω3}) if you like to put it that way. But y1 now equals a random variable z, and z itself takes values in a set of two different values, z1 and z2. Now, obviously you would ask: what is the relation between z and ω? That is essentially where the crux of this particular case lies: z and ψ are correlated. So let us write out these values.
On the columns of the table I have the values of ψ and on the rows the values of z. So ψ can take the values ω1, ω2, ω3, z can take the values z1 or z2, and each pair occurs with a certain probability; what I am writing in this table is therefore the joint distribution of ψ and z. Let me fill out the numbers:

      ω1     ω2     ω3
z1    0.12   0.21   0.12
z2    0.18   0.09   0.28

You can check that adding up the columns gives the marginal distribution with respect to ψ: the column sums are 0.3, 0.3, 0.4, which is in fact the same marginal distribution we had earlier. Effectively, therefore, we have been given a joint distribution P(ψ, z) whose ψ-marginal agrees with our earlier distribution; it is a sort of generalization of our earlier problem. Now, what one can do in this case is the following. Instead of thinking of the environmental random variable as ψ, with z being something different from it, we can absorb z into the environmental randomness. If you recall, I mentioned in our intrinsic model of stochastic control that ψ is supposed to include all the randomness in the problem. So when we have any additional randomness, coming from noisy measurements or something like that as we have here, we absorb it into the definition of ψ and effectively redefine ψ. What we will do is change ψ to a new variable, call it θ, where θ denotes the pair (ψ, z). So ψ has been replaced by θ, where θ is simply the pair (ψ, z).
Now, the probability distribution of θ is given through the joint distribution we have of (ψ, z). θ can take 6 different values; let us denote them by a pair of indices, θ_{i,j} = (ω_i, z_j), where i ranges from 1 to 3 and j from 1 to 2. So my sample space is now the set Θ, the collection of all the values θ_{i,j} = (ω_i, z_j) for i = 1, 2, 3 and j = 1, 2. If I want to evaluate the probability that θ = θ_{i,j}, all I need to do is look it up in the table: it is the probability that ψ = ω_i and z = z_j. For example, if I want the probability that θ = θ_{1,2}, I am looking for the probability of (ω1, z2), which you can easily read off from the table as 0.18. So how does this help? What we have done so far is just absorb z into the environmental randomness of the problem. Now we need to make sure the other elements of the problem also fit in, so let us look at how to describe the information of the agents. Notice that agent 2 cannot distinguish between ω1, ω2, and ω3; he has no way of telling them apart.
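The table and the relabelling θ = (ψ, z) can be written down directly as a check; the numbers here are exactly those from the table above.

```python
# The joint distribution of (psi, z) from the table, viewed as the
# distribution of the new environmental random variable theta = (psi, z).
p_theta = {
    ("w1", "z1"): 0.12, ("w1", "z2"): 0.18,
    ("w2", "z1"): 0.21, ("w2", "z2"): 0.09,
    ("w3", "z1"): 0.12, ("w3", "z2"): 0.28,
}

# Column sums of the table give the marginal distribution of psi,
# which should agree with the earlier prior (0.3, 0.3, 0.4).
marg_psi = {}
for (w, z), p in p_theta.items():
    marg_psi[w] = marg_psi.get(w, 0.0) + p
```

Looking up P(θ = θ_{1,2}) is then just `p_theta[("w1", "z2")]`, which returns 0.18 as read off the table.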
What does this mean, though, in terms of the new sample space, in terms of the value of z? The agent is getting no information at all, so as far as agent 2 is concerned, he cannot distinguish even between the values z1 and z2. In our earlier model we had written his information as σ({ω1, ω2, ω3}); but since he really gets no information, he cannot distinguish between any of the values of θ. So for agent 2, let us write this out: his information is simply the single set containing all the values θ_{i,j}, for i = 1 to 3 and j = 1 to 2. Agent 2 is unable to distinguish between any value of z and any value of ω, because he is getting really no information. What about agent 1? For agent 1, things are a little more subtle. Agent 1, remember, can see the value of z perfectly; his information is z. Because he can see the value of z, he is able to distinguish between values of one of the components of θ: θ comprises the two components ψ and z, and agent 1 can distinguish along the z-component. But he cannot distinguish further between the values of ω, the ψ-component. So we can write agent 1's information i1 in the following way: he can distinguish between two sets, the first being all those values of θ in which z = z1, namely (ω1, z1), (ω2, z1), (ω3, z1), and the second being those with z = z2, namely (ω1, z2), (ω2, z2), (ω3, z2).
So in this new space of environmental random variables, agent 1 is able to distinguish between the values of θ in the first set and the values of θ in the second set. But he cannot distinguish within the sets themselves: he cannot tell which θ inside a set has occurred, but he can tell which of the two sets has occurred. This, therefore, becomes the information of agent 1. What we have effectively done is move our problem to a different space: our environmental random variable is no longer ψ but θ, and everything is now written in terms of θ. The cost can also be expressed in terms of θ: we had L(u1, u2, ψ), but this can always be written as L(u1, u2, θ), where the only component of θ that is relevant to the cost is the ψ-component. And when we compute the expected cost, we take the expectation with respect to θ, since θ is now our random variable. Having done this, we can ask: how many policies does each agent have? Agent 2, having no information, has just 2 possible policies; and agent 1, because his information can take 2 different values and he can take 2 different actions, has 4 different policies. So once again we can write out the policies on this particular space and compute what the optimal policy would be.
For example, suppose agent 1's policy γ1 is DU, meaning he plays D if z1 occurs and U if z2 occurs, and γ2 is L, meaning agent 2 plays L regardless of anything else. What would J(γ1, γ2) be? We can write it as L(D, L, ω1) times the probability of ω1, but remember that the probability must now be written in terms of the distribution of θ, so it is actually P(ω1, z1). Continuing in this way, J(γ1, γ2) = L(D, L, ω1) P(ω1, z1) + L(D, L, ω2) P(ω2, z1) + L(D, L, ω3) P(ω3, z1) + L(U, L, ω1) P(ω1, z2) + L(U, L, ω2) P(ω2, z2) + L(U, L, ω3) P(ω3, z2). What we have done here is take a sum over the values of θ: the first three terms correspond to θ_{1,1}, θ_{2,1}, θ_{3,1} and the last three to θ_{1,2}, θ_{2,2}, θ_{3,2}. Once we write all this out, we can compute the cost for any policy pair; in this case we have computed it for the policy DU for agent 1 and the policy L for agent 2. This therefore gives us a way of computing the optimal policy even for problems in which we have noisy information.
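The expected-cost computation just written out can be checked numerically. The joint distribution is the one from the table; the cost values L(u1, u2, ω) are hypothetical placeholders, since the lecture does not give them.

```python
# Expected cost J(gamma_1, gamma_2) for the noisy-measurement example:
# gamma_1 plays D when z = z1 and U when z = z2; gamma_2, having no
# information, always plays L.  The joint distribution is the one from
# the table; the cost values are invented for illustration.
p_theta = {
    ("w1", "z1"): 0.12, ("w1", "z2"): 0.18,
    ("w2", "z1"): 0.21, ("w2", "z2"): 0.09,
    ("w3", "z1"): 0.12, ("w3", "z2"): 0.28,
}
cost = {  # cost[(u1, u2, omega)] -- hypothetical numbers
    ("D", "L", "w1"): 6, ("D", "L", "w2"): 1, ("D", "L", "w3"): 0,
    ("U", "L", "w1"): 0, ("U", "L", "w2"): 4, ("U", "L", "w3"): 3,
}

g1 = {"z1": "D", "z2": "U"}   # agent 1's policy, a function of z
g2 = "L"                      # agent 2's constant action

# Sum over the six values of theta = (omega, z), exactly the six terms above.
J = sum(p * cost[(g1[z], g2, w)] for (w, z), p in p_theta.items())
```

Repeating this for all 4 × 2 policy pairs and taking the minimum yields the optimal policies for this noisy-measurement problem.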
So the lesson here is that one does not really need to model noise separately. The original model of a static team problem, and the intrinsic model of stochastic control, have enough generality in them to absorb all the elements of noise in the problem into the environmental randomness. One does not need to have noisy observations or any of that listed out separately. This is in fact one of the benefits of the intrinsic model, and it also makes clear why the model is so elegant as far as analyzing information structures is concerned. In the remaining parts of the course we will discuss some other types of problems where dynamic information structures come up, in particular problems of communication.