So far we've been talking about different types of searches, specifically this idea that I have a starting point and a goal point. Everything we've done, you know, A*, breadth-first, depth-first, all of them use this notion that each action leads me to a new state. Say this is action one and this is action two, each moving to some new state, and eventually they lead us to our goal.

The big issue now is this idea of an adversarial search. That first move, sure, that was my move. But one of the things we have to think about is that there may be other agents in the environment, working in that same environment, and in fact we can think about this second action as their move. So we have to start to plan around someone else moving and taking actions, and specifically we have to consider that they may be working against us. They potentially have their own goals that they want to work toward.

A good way to see this is through different types of games, and we can model these with a few different terms. There are the ones we typically see in an AI course, the deterministic, perfect-information games. These are easy to map and easy to model. The idea is that you can see, for example, every potential move in chess: my agent makes a move, then the other agent makes their move, and because the information is perfect, all of those moves can be modeled. That starts to change when we get to imperfect information, because maybe I can't model all of the information. My agent, sure, has their battleships down on their board, but I don't know where the battleships of my opposing agent
are, so I would have to make some assumptions about their side of the board. Then when we get into the idea of chance, well, now we're adding probabilities: dice rolls, for example, in Monopoly, or a shuffled deck in poker. I don't know what card is going to be placed down next, and I don't know what the other agent's dice rolls are going to be, so it's a little harder to model those types of games.

But like I said, at least in this class, we're working with deterministic, fully observed games, and then a new fancy $5 word: zero-sum game. So what is a zero-sum game? The idea is that our agent, we'll call them agent one, is attempting to maximize an end value. If we think about something like the linear assignment problem that we talked about previously, this is: I want to get the biggest value out of the configuration, the biggest value out of my actions. But there's an adversarial agent also in play, and that agent works off that same value but wants to shrink it down as best they can. So again thinking linear assignment, they want the smallest value possible. It's this combat between our two agents over whether we get a big value or a small value. And when we think about our agent, we always look at it from our perspective: we want to maximize our value.

There is some other terminology you can learn, like this idea of a ply. Agent one does a ply, and then agent two does a reply. All right.
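As a minimal sketch of the zero-sum idea described above: both agents score the same end value, with agent one trying to make it large and agent two trying to make it small. The function names and the sample values below are made up for illustration and are not from the lecture.

```python
def payoffs(end_value):
    """Return (agent_one_payoff, agent_two_payoff) for one outcome.

    Assumption for illustration: agent two's payoff is the negation of
    agent one's, so the two payoffs always sum to zero.
    """
    return end_value, -end_value


def agent_one_choice(end_values):
    # Agent one (the maximizer) prefers the largest end value.
    return max(end_values)


def agent_two_choice(end_values):
    # Agent two (the minimizer) prefers the smallest end value.
    return min(end_values)


sample_values = [3, -2, 5]  # hypothetical end values, not from the lecture
print(agent_one_choice(sample_values))
print(agent_two_choice(sample_values))
print(sum(payoffs(5)))
```

With these sample values, agent one would pick 5 and agent two would pick -2, and the two payoffs for any single outcome add up to 0.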
Either way, let's start to formalize this. If we work off of a simple tic-tac-toe board, that'll be our easiest way to visualize it. The big idea is this blank board: you could consider that to be state zero, s0. The board is empty, and we know whose turn it is. We could ask that question: who is the player at s0? We might say that's player one, or, since we're playing tic-tac-toe, player X. Player X gets to go first. It could be player O; I think in tic-tac-toe it's whoever is oldest who picks. But for our sake, we'll say player X is going first.

Then we ask the next question: what are the legal actions? What are all the valid actions at s0? If it's X's turn, there are nine potential moves, right? We can go to the top corner, the top middle, the top corner on the other side, et cetera, until we get to the bottom corner. Those are all valid moves. So I'll call these a1, a2, a3, all the way to a9.

Now, given these actions, we're finally looking at the idea of a result. Think of this like when we were talking about perceiving the environment: when I do an action, what happens to the environment? If I'm at s0 and I give it the action a9, this produces some new state, s1, which in our case has the X in this position. Fair enough. But as you can see, the same song and dance happens again. Who is the player at s1? Well, now it's O's turn. All right, what are all the valid actions from this state?
From s1, all of them except a9 are available, so we'll scratch that one out. Then let's ask: what is the result of s1 with action, why not, a6? Well, that turns this into its own new state, state two, with an X at the bottom and an O here. You'd continue to play this out over and over again, and eventually you reach terminal states.

The idea of a terminal test is: there are no possible moves left, or a winning condition has occurred. You can think of this as, what makes the game over in tic-tac-toe? There are three ways: X wins, O wins, or we get into that situation where the board fills up and nobody has won, a draw. Otherwise the terminal test is false: the game is still being played, and this is not a terminal state.

But notice that second piece here, the idea of a utility. Looking at all of our different states, we may want to ask: what's the value of this particular state? Say we're looking at this first board, where X is here, O is here, X is here, O is here. We could look at the next action, placing our X. Let me draw out three different versions so we can see them all play out: what if I did this move, what if I did this move, and what if I did this move? We're looking at this from the utility's perspective: given this state, for this player, how good is this move?
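The four questions asked above (who is the player, what are the legal actions, what is the result, is this state terminal) can be sketched in code. This is only one possible encoding, under assumptions that are not from the lecture: the board is a tuple of nine cells read left to right and top to bottom, action a1 is the top-left cell and a9 the bottom-right, and all function names are hypothetical.

```python
EMPTY = " "

# Index triples (0-based) that make three in a row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def initial_state():
    # s0: the blank board.
    return (EMPTY,) * 9


def player(state):
    # X goes first, so it is X's turn whenever both have moved equally often.
    return "X" if state.count("X") == state.count("O") else "O"


def actions(state):
    # Legal actions are the empty cells, numbered 1..9 to match a1..a9.
    return [i + 1 for i, cell in enumerate(state) if cell == EMPTY]


def result(state, action):
    # The new state after the current player takes the given action.
    if action not in actions(state):
        raise ValueError(f"a{action} is not a legal action in this state")
    board = list(state)
    board[action - 1] = player(state)
    return tuple(board)


def winner(state):
    # Return "X" or "O" if someone has three in a row, else None.
    for a, b, c in WIN_LINES:
        if state[a] != EMPTY and state[a] == state[b] == state[c]:
            return state[a]
    return None


def terminal_test(state):
    # Game over: someone has won, or there are no moves left (a draw).
    return winner(state) is not None or EMPTY not in state


s0 = initial_state()
s1 = result(s0, 9)  # X plays a9
s2 = result(s1, 6)  # O plays a6
```

Following the walkthrough, `s1` has an X in the last cell and it is O's turn, a9 is no longer among the legal actions, and `s2` is still not a terminal state.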
So for our player X, the utility for this first one is a 1. We could assign it some other value, but let's say we give this winning state a 1. This one would be a 0 for our player, and this one would also be a 0, because those are draws. Again, you could score these differently depending on your game, but this is the general theme you're looking for.

That same idea comes into play, and let me switch colors real quick, if we play that same position, X here, O here, X here, O here, and for whatever reason the agent chose to go down this route. Okay, so now we're looking at the two different possibilities for X. If X takes here and O takes here, you could say this is a losing state for player X. This is a bad move, because my player X agent is going to lose. And again, you choose different values depending on what you're looking for.

This is where we start to get into something known as the minimax search, which we'll dig into a little later. The idea is to assume there are two agents: our agent, which we'll call max, and their agent, which we'll call min. Looking at the tic-tac-toe board, what are all the possible actions that max could take? We typically think about this as a recursive method, so we don't immediately make a decision; we're planning. It's like: if I do this, dot dot dot, then it's min's turn, and min will do this, dot dot dot. So max has placed in that top corner, and min places in the top middle, okay? So now what would I do, playing as max?
Well, in that case I do this, dot dot dot, and you keep playing this out. I won't do the whole thing; you can see that it just keeps on mapping out. What happens, though, is that each one of these is given a min or a max node. And just like we were saying, as we move toward terminal states, where the game is in fact over, if we find ourselves in a situation where our agent, aka max, is the loser, that's a state we give a -1. If our agent draws, that's a 0. That's okay; it's better than losing, and maybe it's the best we can work toward. But notice that -1 is less than 0. And if there is a winning condition where our agent won, we give that a +1.

So we've got the -1 when we lose, the 0 when we draw, and the +1 when we win. And it's through these combinations that, in our next video, we'll talk about the minimax search.
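The scoring just described (+1 when max wins, 0 for a draw, -1 when max loses) can be written as a small utility function. This is a sketch under assumptions that are not from the lecture: the board is a nine-character string read left to right and top to bottom, max plays X, the function is only called on finished games, and the names are hypothetical.

```python
# Index triples (0-based) that make three in a row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    # Return "X" or "O" if someone has three in a row, else None.
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def utility(board, max_player="X"):
    # Value of a terminal board from max's point of view.
    # Assumes the game is over, so "no winner" means a draw.
    w = winner(board)
    if w is None:
        return 0
    return 1 if w == max_player else -1


x_wins = "XXXOO    "  # X completes the top row
draw = "XOXXOOOXX"    # full board, nobody has three in a row
o_wins = "OOOXX X  "  # O completes the top row

print(utility(x_wins), utility(draw), utility(o_wins))
```

The three sample boards score 1, 0, and -1 for max, which matches the ordering in the lecture: a loss is worse than a draw, and a draw is worse than a win.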