 Hi everyone, this is Alice Gao. This is a short video to explain the answer to the clicker question on slide 8 of lecture 19. For this question, we are given the expected utility of the optimal policy for each state of the grid world, shown in the table, and we are asked to determine the optimal action for state 1-3. To do this, all we need to do is calculate the Q values for each of the four possible actions.

Let's take a look at the calculation process. It is a matter of taking the formula and plugging the values into it. Let me go through the first two of the four calculations.

For the first one, our intended direction is going downward. With 80% chance we travel in the intended direction, but with 10% chance we're going to go to our left, and with another 10% chance we're going to go to our right, where left and right are as seen when facing downward. So I'm labeling these to figure out which utilities I'm going to get in each of the three directions. If I do travel in the intended direction, I will get the utility 0.660. If I travel to my left, I will get the utility 0.388. And if I travel to my right, I'll get the utility 0.655. Then we add these together, each weighted by its probability. This gives us an expected utility of 0.63 something.

Okay, let's look at another example where we might bump into a wall. Suppose we want to travel to our left, which means we're looking at the second line. If we want to travel to our left, then with 10% chance we are going to end up traveling downward, and with another 10% chance we'll try to travel upward. If we do travel to our left, which happens with 80% chance, we'll get a utility of 0.655. If we end up traveling downward with 10% chance, we'll get a utility of 0.660. This is straightforward. Now if we end up traveling upward, we're going to bump into a wall. And if we bump into a wall, then we'll stay in the current state.
So with another 10% chance, we will stay in the current state and get a utility of 0.611. In total, our expected utility is 0.6511.

You can use a similar approach, plugging the values into the formula for right and up, to determine their expected utilities. Then, for the optimal policy, we can simply look at the four numbers and choose the action that gives us the largest one. In this case, that's left.

So back to the clicker question. The answer is that the optimal action for state 1-3 is going left. If you want some extra practice, you can use this approach on the other states, try to figure out the optimal policy, and then compare it with what I've given you in the previous videos.

That's everything for this video. Thank you very much for watching. I will see you in the next video. Bye for now.
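
As a way to check the arithmetic from the video, here is a minimal Python sketch of the same calculation for state 1-3. It is not from the lecture itself. The names (`U_CURRENT`, `NEIGHBOUR_UTILITY`, `PERPENDICULAR`, `expected_utility`) are made up for illustration, and the directions in the dictionary are grid directions read off the second calculation above (the cell to the left has utility 0.655, the cell to the right 0.388). It only computes the probability-weighted utility of the next state, as the video does, and leaves out any immediate reward or discount factor, since the video does not mention them.

```python
# Utility of the current state (1-3), as quoted in the video.
U_CURRENT = 0.611

# Utilities of the neighbouring cells in grid directions, as quoted in
# the video. None marks the wall above state 1-3 (assumed layout).
NEIGHBOUR_UTILITY = {
    "down": 0.660,
    "left": 0.655,
    "right": 0.388,
    "up": None,
}

# For each intended direction, the two directions the agent may slip into.
PERPENDICULAR = {
    "up": ("left", "right"),
    "down": ("left", "right"),
    "left": ("up", "down"),
    "right": ("up", "down"),
}


def utility_after_move(direction):
    """Utility of the state reached by moving in `direction`.

    Bumping into a wall leaves the agent in the current state.
    """
    utility = NEIGHBOUR_UTILITY[direction]
    return U_CURRENT if utility is None else utility


def expected_utility(action):
    """80% intended direction, 10% for each perpendicular direction."""
    side_a, side_b = PERPENDICULAR[action]
    return (
        0.8 * utility_after_move(action)
        + 0.1 * utility_after_move(side_a)
        + 0.1 * utility_after_move(side_b)
    )


expected = {action: expected_utility(action) for action in NEIGHBOUR_UTILITY}
best_action = max(expected, key=expected.get)

for action, value in expected.items():
    print(f"{action}: {value:.4f}")
print("best action:", best_action)
```

Running this gives 0.6323 for down and 0.6511 for left, matching the two calculations in the video, along with 0.4375 for right and 0.5931 for up, so left has the largest value.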