So the next thing I'd like to talk about is what I like to call the elements of environments. Again, if we use my prior video as a reference, we have different types of agents depending on our goals: a Minecraft-playing agent versus a self-gardening agent versus a self-driving vehicle agent versus a tic-tac-toe agent. Each one of those is going to be operating in a different environment, and so we have to look at what those types of environments could be.

The first one that we're focusing in on is the terminology known as observability. Take something that is completely observable, in this case chess, for example. Let me change my color. If we're dealing with a chess-playing agent, each action can be directly mapped. So maybe I'm playing white, and I decide to move the king's pawn two spaces. I don't know if that's a good move or not. Assume it is, and if it's not, let me know in the comments down below. But all right, I can map that out. I've made a move, I can accurately predict what's going to happen if I do this, and I can then map out, oh, black moves the left bishop's pawn one space. Again, don't know if that's a good move or not. I'm gonna say it is, and there's nothing you can do about it. No, my point being, you can see every single one of these actions could then be mapped out into a tree-like form, and we could search that tree.

However, as we start to get into more advanced agents, that's where completely observable is not really a possibility. So for example, when we're dealing with self-driving cars, wherever that Luminar logo is. Let me change my color, I'll do a bright blue. Let's say our agent is right here on the bottom of that picture. Well, think about all the different sensors that we talked about. One of them, in this case,
I think Luminar does lidar sensors. It's projecting out laser pulses, shooting them out into the world, and when a pulse hits something, it bounces back. So in this case you can see, oh, I can perceive this side of the car, but I don't know what's going on right here. One of my pulses hit this van, but I still don't know what's going on on the other side. There could be a person over there, and if this is the only perception that I've gotten out of the environment, I don't know. That we call partially observable.

But then we also get into a different property of the world, specifically the idea of single or multiple agents. Just to make this a very, very simple drawing, let's imagine that we're in an Amazon fulfillment center with two sides. We'll call this the inventory side, and then over here we've got fulfillment. And all throughout we've got tiny little squares with wheels whose sole job is to go from inventory to fulfillment and then back. They just go back and forth. Now, inventory could be multiple boxes or multiple crates lined up, and an agent may need to go to a particular one. So in this case, I've got two agents that have intersecting paths. They probably communicate with each other so they don't crash into each other. But that gets into the question: if we're dealing with multiple agents, are they going to be cooperative or competitive? Think about this from, say, a video game perspective.
Oh, well, I have an agent that wants to attack the player character or another computer AI agent. In that sense they would be competitive, because one wants to win and the other one also wants to win. But that actually brings up an interesting point. Let's imagine that these green cars are all agents from a self-driving perspective. They may not be communicating with each other. They may not be owned by the same person or company. So in that sense, should our agent treat each of them like it's another agent? Or is it just a part of that environment that we were talking about in the prior video?

Then we get into some more terminology, something known as deterministic versus stochastic. It's partially similar to, how do I describe it, what we saw with completely observable, but it's not the same thing. When we think about an environment being deterministic, everything can be mapped out: based on the current state and whatever the action of the agent is going to be, we know exactly what the next state will be. Versus something that's stochastic. In that case the environment may change without the decisions of our agent. Maybe we're dealing with something like poker as an example. I can't control when my opponents, or table mates, are going to fold or bet or raise my initial bet, and these types of things.
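To make the deterministic-versus-stochastic split concrete, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the state is just a number, the action is added to it, and the `drift` term stands in for whatever the rest of the world does on its own (the other players at the table, for instance).

```python
import random


def deterministic_step(state, action):
    """Same state plus same action always gives the same next state."""
    return state + action


def stochastic_step(state, action, rng):
    """The environment can also change on its own, independent of the agent."""
    drift = rng.choice([-1, 0, 1])  # hypothetical outside influence
    return state + action + drift


rng = random.Random(0)  # seeded so the run is repeatable

# Repeat the exact same decision 100 times and collect the distinct outcomes.
deterministic_outcomes = {deterministic_step(5, 2) for _ in range(100)}
stochastic_outcomes = {stochastic_step(5, 2, rng) for _ in range(100)}

print("deterministic:", sorted(deterministic_outcomes))
print("stochastic:   ", sorted(stochastic_outcomes))
```

Repeating the same decision from the same state gives one outcome in the deterministic case and several in the stochastic case, even though the agent did exactly the same thing each time.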
So again, that's where stochastic comes into play. Regardless of which one you're focusing in on, the big idea is that we start to focus in on the strategy, or another term for this is policy: what do I do in certain situations? I may not be able to map out all the possible answers for, say, tic-tac-toe or chess. So in that case, what gambits should I work off of instead? If we're thinking of something like a Connect Four game, you might have some very similar approaches: I want to play in the center, I don't want to put it on the edges, I want to build up and I want to attack diagonally, those types of things.

Then we get into episodic versus sequential agents. Think about an episodic agent as something like face recognition, or that tomato harvester agent that we were talking about earlier. These would be what we consider to be episodic. My agent scans the tomato plant and is evaluating only this tomato plant. It does not care what happened with the plant before it. It'll keep statistics, but it's not going to make decisions based off that. The same kind of concept is going on with face recognition. If I map out this face, I don't care about it when I move to the next face.

Versus when we deal with something sequential: chess, tic-tac-toe, Connect Four. That's what we're dealing with when we think about sequential. If I move king's pawn, right, that has an impact on the environment.
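A rough Python sketch of the episodic-versus-sequential difference, under made-up assumptions: `evaluate_plant` and its 0.8 ripeness threshold are hypothetical stand-ins for the tomato harvester, and `play_moves` is a bare-bones tic-tac-toe board with no win checking.

```python
def evaluate_plant(ripeness):
    """Episodic: the decision depends only on the plant being scanned now."""
    return "harvest" if ripeness >= 0.8 else "skip"  # threshold is an assumption


def play_moves(moves):
    """Sequential: every move changes the board the next move has to deal with."""
    board = [" "] * 9  # tic-tac-toe squares 0..8
    player = "X"
    for square in moves:
        if board[square] != " ":
            raise ValueError(f"square {square} is already taken")
        board[square] = player
        player = "O" if player == "X" else "X"
    return board


# Episodic: each plant gets the same verdict no matter what came before it.
plants = [0.9, 0.3, 0.85]
decisions = [evaluate_plant(r) for r in plants]
print(decisions)

# Sequential: the same square is fine as a first move but illegal once taken.
print(play_moves([4, 0, 8]))
try:
    play_moves([4, 4])
except ValueError as err:
    print("rejected:", err)
```

The harvester could scan the plants in any order and reach the same verdicts, while the board game rejects a move purely because of what happened earlier in the sequence.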
So in a sequential setting, the environment changes, and the next stage has to occur. And that's actually where we get into a big fancy five-dollar word that we're going to be seeing a lot in our problems: the idea of something known as a time step. If we're thinking about this as actions taken by our agent, we're dealing with a t. At t zero, given whatever the environment is around me, and in this case you can see we're using a little example where the agent perceives the environment and is already facing a direction. All right, in that case it wants to move forward. At t zero: move forward. Okay, what happens? That action occurs, the environment is updated, and we are now at t one. What happens at t one? Again the agent perceives the world around it. It sees that it's a little further in. If you notice, here's a dirty tile, and it's now on that tile. Clean it. And then as we move ahead: move forward, forward. And then for our agent here, when it hits t three, oh, I can't move forward anymore. So rather than move forward, let's turn left. And then you can obviously guess what happens next: what happens at t four, what happens at t five. Each one of these time steps comes into play.

But just to give you a different example of that as well, another world you could think of is: I want to design an agent to trade stocks, because the economy's the economy. Okay, in that situation, how do you map out a time step? This is, in this case, stochastic: the environment may or may not change regardless of the actions of our agent. But more specifically, if we're using the S&P 500 as an example here, I'm mapping out individual weeks as the different time steps in the stock market. For example, here's when COVID happened.
Here's when everything started going wrong. So if we're thinking about that, at each one of these time steps I could have had my agent making a decision. Quite literally: let me perceive what's happened in the past, let me perceive what's going on right now, and what action should the agent take? Should it buy? Should it sell? Should it do... I don't know. More to the point, here's a little fun fact: most algo traders only work about 60 percent of the time. I don't know if you want to trust your life savings to that, but it's actually kind of interesting to think about.

Moving on. The same kind of concept that we were seeing with that stock market design and time steps is going on in the difference between a static environment and a dynamic environment. A stock agent, that's definitely a dynamic environment. The world's happening. Let me, since I say semi over here... dynamic. The world is constantly updating. Your agent decides to buy or sell, and the world is changing while your agent is making those decisions. What if your agent decides to hold off, or is still thinking? Time in the stock market is going to keep on going.

Versus a static environment, so tic-tac-toe. Chess, technically, sometimes. I'm using just a little example here: if you're playing with a clock, the environment is moving at a rate, and you need to plan things out before time runs out. But if you're playing a much more casual game, like when I play with a child, I'm just letting the child think. I'm waiting. I'm telling them to hurry up, but again, I'm waiting. I'm in an environment that is static as we decide what to do.

That actually gets us into the idea of discrete and continuous environments. Same kind of concept.
Like I said, I'm waiting for the kid I'm playing chess with to make a move. In that case, those moves are very discrete. Each one of those is, again, king's pawn one up or down, depending on which agent or side you're on, right? That's a very distinct move. Or move left, move right. Those are very distinct.

Then we've got something like a continuous environment. So, a self-driving car. When we're thinking turn left, what does turn left mean? Turn left could mean rotate 5 degrees, 10 degrees, 15 degrees, 15.1 degrees, 15.01 degrees, 15.001 degrees, etc. The same kind of concept is going on if we're thinking about that antenna-designing agent that I showed earlier. It had the ability to rotate on the x, y, and z axes and then move forward. Well, how many millimeters is it moving forward? And how many degrees is it rotating on each one of these axes? That's a little bit more of a continuous value, because you could move at a much finer angle if the hardware supports it.

And then we get into this last one, whether the environment is known or unknown. Again, super vague terminology here. If we're thinking about something being known, something like chess or Connect Four, these are environments where every tile is accounted for, and while I may not know exactly what my opposing player is going to be doing, I'm able to plan that out.

Versus something being unknown. This is getting a little bit more into exploration, with a little bit of knowledge representation coming in there. If, for example, I don't know what's behind a door: here's a door, and here's my agent, and it needs to go through the door and move throughout levels, or in this case stages. All right.
Well, again, it doesn't know what the other room is going to look like or what it's going to interact with. So this is where we would be dealing with that terminology of strategy.
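Since strategy (or policy) and time steps keep coming up, here is a small Python sketch that ties them together, loosely modeled on the tile-cleaning walkthrough from earlier. The layout is an assumption: a single row of tiles with a wall at the far end, where `True` marks a dirty tile. The names `perceive`, `policy`, and `run` are made up for this sketch.

```python
def perceive(world, pos):
    """What the agent can sense at this time step: only its own tile."""
    return {"dirty": world[pos], "blocked": pos == len(world) - 1}


def policy(percept):
    """The strategy: a rule that maps what the agent perceives to an action."""
    if percept["dirty"]:
        return "clean"
    if percept["blocked"]:
        return "turn_left"
    return "forward"


def run(world, steps):
    """Perceive, act, update the environment, then move on to the next t."""
    world = list(world)  # copy so the caller's list is left alone
    pos = 0
    log = []
    for t in range(steps):
        action = policy(perceive(world, pos))
        log.append((t, action))
        if action == "clean":
            world[pos] = False
        elif action == "forward":
            pos += 1
        # "turn_left" would change the heading; this one-row sketch only logs it
    return log, world


log, final_world = run([False, True, False, False], steps=5)
for t, action in log:
    print(f"t{t}: {action}")
```

The agent never sees the whole row, only the tile it is standing on, so the policy has to work from the current percept at each time step rather than from a full map.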