So, let us now begin with an example of a static team problem. In this example we have two agents, agent 1 and agent 2. Agent 1 can take only two actions, which we will call up (U) and down (D). Agent 2 can similarly take only two actions, which we will call left (L) and right (R). The environmental noise, which comprises all the randomness in the system, can for simplicity take only three states: psi belongs to a set Omega = {omega_1, omega_2, omega_3}, so these are the three possible values of psi. These occur with the following probabilities: P(omega_1) = 0.3, P(omega_2) = 0.3, and P(omega_3) = 0.4. Now consider the cost function, or loss function, for this problem. Remember that in a static team the loss function has the form L(u_1, u_2, psi), so we need to write out the value of L for every pair of actions and every value of psi. For each fixed value of psi, u_1 and u_2 can each take two values, so there are four possible action combinations, which we can write out in matrix form: one 2x2 table for each value of psi, with the actions of agent 1 on the rows and the actions of agent 2 on the columns.
So, agent 1 can take actions U and D, and agent 2 can take actions L and R. When psi = omega_1, the losses are L(U, L, omega_1) = 1, L(U, R, omega_1) = 0, L(D, L, omega_1) = 3, and L(D, R, omega_1) = 1. Writing out similar tables for psi = omega_2 and psi = omega_3, with some specific values filled in, we get:

  psi = omega_1        psi = omega_2        psi = omega_3
        L   R                L   R                L   R
    U   1   0            U   2   3            U   1   2
    D   3   1            D   2   1            D   0   2

These three matrices together collectively define the function L: to read off the value of L for any pair of actions and any value of psi, pick the matrix corresponding to that value of psi and look up the entry for the chosen pair of actions. This loss function stays fixed across the various instances of this problem. What we will do now is consider different information structures for the two agents and see how the solution of the problem, and our reasoning about how to solve it, varies as the information of the agents changes. The first case we consider is the one where both agents have perfect measurements. Remember, all of these are static problems; they are just different types of static problems. In this first case, the information of agent 1, call it I_1, and the information of agent 2, call it I_2, are both equal to psi.
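To make the arithmetic that follows easy to check, here is a minimal Python sketch encoding the loss tables and the prior. The labels "w1", "w2", "w3" and the dictionary encoding are my own choices for illustration, not notation from the lecture.

```python
# Loss function L(u1, u2, psi) from the three tables above, keyed as
# (u1, u2, psi).  Agent 1 plays "U" or "D", agent 2 plays "L" or "R".
L = {
    # psi = omega_1
    ("U", "L", "w1"): 1, ("U", "R", "w1"): 0,
    ("D", "L", "w1"): 3, ("D", "R", "w1"): 1,
    # psi = omega_2
    ("U", "L", "w2"): 2, ("U", "R", "w2"): 3,
    ("D", "L", "w2"): 2, ("D", "R", "w2"): 1,
    # psi = omega_3
    ("U", "L", "w3"): 1, ("U", "R", "w3"): 2,
    ("D", "L", "w3"): 0, ("D", "R", "w3"): 2,
}

# Prior on the environmental randomness psi.
P = {"w1": 0.3, "w2": 0.3, "w3": 0.4}
```

Looking up L for a pair of actions and a value of psi is then just a dictionary access, e.g. `L[("U", "R", "w1")]` gives 0.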
So, both agents actually observe psi. How does one reason about this problem? Remember, we want to minimize the expectation of the cost, with each u_i a function of the information I_i. Since both agents know the value of psi, it is fairly obvious what they should do: look at each value of psi and pick the pair of actions that leads to the best outcome. For example, if psi = omega_1, the optimal choice is for agent 1 to choose U and agent 2 to choose R, because the cost they would then get is 0. So when psi = omega_1, the agents choose (U, R); writing this out, gamma_1*(omega_1) = U and gamma_2*(omega_1) = R. If psi = omega_2, the agents should choose the pair (D, R), since that gives them the least cost: gamma_1*(omega_2) = D and gamma_2*(omega_2) = R. Finally, when psi = omega_3, it is clear that the minimum-cost pair is (D, L), so gamma_1*(omega_3) = D and gamma_2*(omega_3) = L.
So, how do we evaluate the cost that comes from this? We are evaluating the expected cost, which equals P(omega_1) L(U, R, omega_1) + P(omega_2) L(D, R, omega_2) + P(omega_3) L(D, L, omega_3). Now L(U, R, omega_1) = 0, L(D, R, omega_2) = 1, and L(D, L, omega_3) = 0, so this quantity becomes 0 · P(omega_1) + 1 · P(omega_2) + 0 · P(omega_3) = P(omega_2) = 0.3. So the cost under perfect measurement is 0.3. What has happened here is that because they had perfect measurements, both agents knew which of the three matrices was actually realized: there are three possible matrices depending on the value of psi, and the agents knew exactly which one was in effect because they knew what the value of psi was. And because both of them knew the value of psi, they could both decide what the optimal pair of actions was; there was no possibility of any disagreement. So they chose the minimizing pair of actions in each matrix, and as a result they get their cost.
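The perfect-measurement reasoning above can be checked mechanically: for each omega, take the loss-minimizing action pair, then average over the prior. A sketch, reusing the same illustrative "w1"/"w2"/"w3" encoding as before:

```python
# Perfect measurement: both agents observe psi, so for each omega they can
# jointly pick the action pair that minimizes the realized loss.
L = {
    ("U", "L", "w1"): 1, ("U", "R", "w1"): 0, ("D", "L", "w1"): 3, ("D", "R", "w1"): 1,
    ("U", "L", "w2"): 2, ("U", "R", "w2"): 3, ("D", "L", "w2"): 2, ("D", "R", "w2"): 1,
    ("U", "L", "w3"): 1, ("U", "R", "w3"): 2, ("D", "L", "w3"): 0, ("D", "R", "w3"): 2,
}
P = {"w1": 0.3, "w2": 0.3, "w3": 0.4}

# omega -> loss-minimizing action pair (u1, u2)
best = {w: min(((u1, u2) for u1 in "UD" for u2 in "LR"),
               key=lambda a: L[(a[0], a[1], w)])
        for w in P}

# Average the minimal realized losses over the prior.
expected_cost = sum(P[w] * L[(best[w][0], best[w][1], w)] for w in P)
print(best)           # (U, R) on omega_1, (D, R) on omega_2, (D, L) on omega_3
print(expected_cost)  # 0.3, matching the hand computation
```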
Now, the important thing to note here is that when both agents have the same information, and here in fact perfect information, there is really no dilemma for these agents. Since they have the same information, one can just as well think of the two agents as coalesced into a single agent that takes a pair of actions. This is in fact the most trivial form of a static team: both agents have perfect and identical information. That is why it is worth starting with this case. Next we will consider a case in which both agents again have the same information, but the information is not perfect: imperfect but identical measurements. Remember that Omega = {omega_1, omega_2, omega_3}; these are the three possible values of the underlying uncertainty, the environmental randomness. What we will assume is that the agents cannot distinguish between omega_1 and omega_2: they just know that one of these two has been realized, but they cannot tell which. Therefore the information of agent 1 equals the information of agent 2, and we can write it as I_1 = I_2 = sigma({omega_1, omega_2}, {omega_3}). This means that when omega_3 is realized they know it is omega_3, but if either omega_1 or omega_2 is realized they have no way of telling whether it is omega_1 or omega_2.
So, as a result of this, each agent's action has to be the same regardless of whether omega_1 or omega_2 is realized; they have no way of differentiating their action between these two outcomes. The important thing here is that both agents have the same information, so they both have to act under the same level of ignorance. Consequently, the strategies, or policies, of these agents can be written in the following form: gamma_i(y_i) equals some action A if y_i is omega_1 or omega_2, and some action B if y_i is omega_3. You can see that the problem has now become a little more complicated: we have to think about what exactly the possible strategies are and what actions would result from them. First, how many strategies are there for each agent? Take agent 1. His information can take only two possible values: it either says that psi is omega_3 or that psi is not omega_3. For each of these pieces of information he has to choose an action, and that action can be either of his two: up or down for agent 1, and left or right for agent 2. What we will do now is write out the policies available to these agents and compute the expected cost that comes from each of them.
To make this precise: the way we capture imperfect measurements is by describing which parts of the set Omega an agent can observe, where Omega, remember, is the set in which the environmental randomness takes its values. In this particular case we are assuming that both agents cannot distinguish between omega_1 and omega_2: the occurrence of omega_1 and the occurrence of omega_2 give them the same information. They can, however, tell whether omega_3 has occurred; effectively, they can only distinguish between omega_3 occurring and omega_3 not occurring. That is why we write I_1 = I_2 = sigma of the two sets {omega_1, omega_2} and {omega_3}; we can think of these two sets as the two values that the information can take. The strategy of an agent is always a map from information to action. Here the information can be of two kinds: either it is "omega_1 or omega_2", or it is omega_3 itself, and for each of these cases we specify an action for each agent. Because an agent cannot distinguish between omega_1 and omega_2, his action must be the same regardless of which of the two occurs: his strategy, or policy, is constant on the set {omega_1, omega_2}. For example, remembering that agent 1 can take actions U and D, here is one strategy for agent 1: play U if y_1 is omega_1 or omega_2, and play D if y_1 is omega_3. Similarly, one strategy for agent 2 would be: play R if y_2 is omega_1 or omega_2, and play L if y_2 is omega_3. The two actions do not have to be different; if instead of L I had written R in the second case, that would also be a valid strategy, one in which agent 2 chooses R regardless of whether he observes omega_3 or not. We can denote these strategies compactly by simply writing the pair of actions chosen. The strategy above for agent 1 is denoted (U, D), or even shorter UD, meaning U is what is chosen if omega_1 or omega_2 occurs and D is what is chosen if omega_3 occurs. Similarly, the constant strategy for agent 2 would be denoted RR, where the first R is for omega_1 or omega_2 and the second R is for omega_3. Now let us ask ourselves how many strategies there are for each agent. Agent 1 has 2 actions and can get 2 possible values of information.
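A strategy under this information structure is just a map from the two-valued observation to an action, so it can be sketched as a two-entry lookup table. As before, the "w12"/"w3" labels and the `observe` helper are my own illustrative encoding:

```python
# A strategy maps the agent's observation to an action.  With the partition
# {omega_1, omega_2} vs {omega_3}, the observation takes one of two values,
# so a strategy such as "UD" is a two-entry lookup table.
def observe(w):
    """Imperfect measurement: omega_1 and omega_2 are indistinguishable."""
    return "w12" if w in ("w1", "w2") else "w3"

gamma1 = {"w12": "U", "w3": "D"}   # the strategy written UD above
gamma2 = {"w12": "R", "w3": "L"}   # the strategy written RL above

# If psi = omega_2, both agents only see "w12", so agent 1 plays U and
# agent 2 plays R; if psi = omega_3, they play D and L respectively.
print(gamma1[observe("w2")], gamma2[observe("w2")])   # U R
print(gamma1[observe("w3")], gamma2[observe("w3")])   # D L
```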
So, as a consequence, the number of strategies he has is 2 raised to the power 2, which is 4. Agent 1 has 4 possible strategies, and likewise agent 2 has 4 possible strategies, because he too has 2 actions and his information can take 2 different values. Let us now write out a table that lists all the strategies of both agents and the expected cost that comes from each pair. What we are writing is J(gamma_1, gamma_2), where J, remember, is the expected cost E[L(gamma_1(y_1), gamma_2(y_2), psi)]. On the columns I will put the 4 strategies of agent 2: LL, LR, RL and RR; on the rows, the 4 strategies of agent 1: UU, UD, DU and DD. These are to be understood in the shorthand introduced above: the first letter is the action taken on {omega_1, omega_2} and the second is the action taken on omega_3. Now, what is the value of J for each of these strategy pairs, and how do we evaluate it? J is the expectation of the loss under the chosen pair of strategies. Suppose, for example, that agent 1 uses the strategy UD and agent 2 uses the strategy RL.
So, in that case J(UD, RL) is the expectation P(omega_1) L(gamma_1(·), gamma_2(·), omega_1) + P(omega_2) L(gamma_1(·), gamma_2(·), omega_2) + P(omega_3) L(gamma_1(·), gamma_2(·), omega_3), where I have deliberately left the arguments of gamma_1 and gamma_2 unspecified. What do we need to fill in? When psi is omega_1, as far as the agents are concerned the observation y_i is only "omega_1 or omega_2"; they cannot distinguish between the two, so what enters is whatever action they map "omega_1 / omega_2" to. The same happens when omega_2 occurs: again both agents cannot distinguish omega_1 from omega_2, so again it is the action they map "omega_1 / omega_2" to that enters. But when omega_3 occurs they do know that omega_3 has occurred, so the action chosen for omega_3 is what enters. Let us now substitute and compute. P(omega_1) = 0.3, P(omega_2) = 0.3, and P(omega_3) = 0.4. The first term is 0.3 times L of what? The arguments are determined by the policy pair we are considering: our gamma_1 is UD and gamma_2 is RL, so gamma_1(omega_1 / omega_2) = U and gamma_2(omega_1 / omega_2) = R.
So the first term is 0.3 · L(U, R, omega_1), the second is P(omega_2) = 0.3 times the same pair of actions, L(U, R, omega_2), and the third is 0.4 · L(D, L, omega_3), since on omega_3 agent 1 chooses D and agent 2 chooses L. Where do we get these values of L from? We go back to the original tables. To look up L(U, R, omega_1), we take the omega_1 table and read off the entry for (U, R), which is 0, so the first term is 0.3 × 0. For the second term we again look up (U, R), but in the omega_2 table, which gives 3, so it is 0.3 × 3. For the third term we look up (D, L) in the omega_3 table, which gives 0, so that term is 0.4 × 0. The final cost then becomes 0.9: for the strategy pair (UD, RL), the value of J is 0.9. We can now fill in all the other entries by the same sort of logic (you can check that these are correct):

           LL    LR    RL    RR
     UU   1.3   1.7   1.3   1.7
     UD   0.9   1.7   0.9   1.7
     DU   1.9   2.3   1.0   1.4
     DD   1.5   2.3   0.6   1.4
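Since there are only 4 × 4 strategy pairs, the whole table can be brute-forced. A sketch, again under the same illustrative "w1"/"w12" encoding introduced earlier:

```python
from itertools import product

# Brute-force the table of J(gamma_1, gamma_2) over all strategy pairs.
L = {
    ("U", "L", "w1"): 1, ("U", "R", "w1"): 0, ("D", "L", "w1"): 3, ("D", "R", "w1"): 1,
    ("U", "L", "w2"): 2, ("U", "R", "w2"): 3, ("D", "L", "w2"): 2, ("D", "R", "w2"): 1,
    ("U", "L", "w3"): 1, ("U", "R", "w3"): 2, ("D", "L", "w3"): 0, ("D", "R", "w3"): 2,
}
P = {"w1": 0.3, "w2": 0.3, "w3": 0.4}

def observe(w):
    """omega_1 and omega_2 are indistinguishable; omega_3 is observed."""
    return "w12" if w in ("w1", "w2") else "w3"

def J(g1, g2):
    """Expected loss when agent i plays the map g_i: observation -> action."""
    return sum(P[w] * L[(g1[observe(w)], g2[observe(w)], w)] for w in P)

# 2 actions on 2 observation values gives 2**2 = 4 strategies per agent.
strats1 = [{"w12": a, "w3": b} for a, b in product("UD", repeat=2)]
strats2 = [{"w12": a, "w3": b} for a, b in product("LR", repeat=2)]

table = {(g1["w12"] + g1["w3"], g2["w12"] + g2["w3"]): round(J(g1, g2), 2)
         for g1 in strats1 for g2 in strats2}

print(table[("UD", "RL")])                            # 0.9, as computed by hand
print(min(table, key=table.get), min(table.values())) # ('DD', 'RL') 0.6
```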
So, once we fill this in, we can look for the optimal value, and we find that it is 0.6. Remember, our goal was to minimize J, the quantity in this table, and we have now effectively found the policy pair that minimizes it: the policy DD for agent 1 and RL for agent 2. This says that agent 1 should always play D regardless of the information he gets, while agent 2 should play R if the information is "omega_1 or omega_2" and L if the information is omega_3. That is what we find as the optimal solution, and this is how one can solve a problem with imperfect but identical information for both agents. We will consider further generalizations of this in the subsequent lectures.