Playing Pacman using AIXI Approximation

635 views

Uploaded on Jul 3, 2010

This video shows the mc-aixi(fac-ctw) agent playing a partially observable version of the classic Pac-Man game. The agent must navigate a 17x17 maze and eat the food pellets that are distributed randomly across it. Four ghosts roam the maze. They initially move at random until a ghost is at a Manhattan distance of 5 from PacMan, whereupon it aggressively pursues PacMan for a short duration. The maze structure and game are the same as in the original arcade game; however, the PacMan agent is hampered by partial observability.

PacMan is unaware of the maze structure and only receives a 4-bit observation describing the wall configuration at its current location. It also does not know the exact location of the ghosts, receiving only a 4-bit observation indicating whether a ghost is visible (via direct line of sight) in each of the four cardinal directions. In addition, the location of the food pellets is unknown except for a 3-bit observation that indicates whether food can be smelt within a Manhattan distance of 2, 3 or 4 from PacMan's location, and another 4-bit observation indicating whether there is food in its direct line of sight. A final single bit indicates whether PacMan is under the effects of a power pill.

At the start of each episode, a food pellet is placed with probability 0.5 at every empty location on the grid. The agent receives a penalty of 1 for each movement action, a penalty of 10 for running into a wall, a reward of 10 for each food pellet eaten, a penalty of 50 if it is caught by a ghost, and a reward of 100 for collecting all the food. If several such events occur in the same step, their rewards are summed; for example, running into a wall and being caught gives a penalty of 60. The episode resets if the agent is caught or if it collects all the food.
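The observation and reward scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the agent's actual code: the function names `pack_observation` and `step_reward`, the order in which the five fields are concatenated, and the boolean event flags are all hypothetical. The description also does not say whether the movement penalty of 1 still applies on a step that runs into a wall (its own example of 60 suggests it may not), so the sketch leaves that decision to the caller through the `moved` flag.

```python
# Bit widths taken from the description: wall configuration, ghost line of
# sight, food smell, food line of sight, power-pill flag.
FIELD_WIDTHS = (4, 4, 3, 4, 1)
OBSERVATION_BITS = sum(FIELD_WIDTHS)  # 16 bits per observation


def pack_observation(walls, ghosts_seen, food_smell, food_seen, powered):
    """Concatenate the five fields into one integer.

    The field order (walls in the highest bits, power pill in the lowest)
    is an assumption made for this sketch.
    """
    fields = (walls, ghosts_seen, food_smell, food_seen, int(powered))
    packed = 0
    for value, width in zip(fields, FIELD_WIDTHS):
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        packed = (packed << width) | value
    return packed


def step_reward(moved=False, hit_wall=False, ate_pellet=False,
                caught=False, cleared_maze=False):
    """Sum the rewards of every event that occurred in one step."""
    reward = 0
    if moved:
        reward -= 1    # penalty for a movement action
    if hit_wall:
        reward -= 10   # penalty for running into a wall
    if ate_pellet:
        reward += 10   # reward for eating a food pellet
    if caught:
        reward -= 50   # penalty for being caught by a ghost
    if cleared_maze:
        reward += 100  # reward for collecting all the food
    return reward


if __name__ == "__main__":
    # The example from the description: wall collision plus capture.
    print(step_reward(hit_wall=True, caught=True))
    # Walls on two sides, food smelt in one distance band, power pill active.
    print(format(pack_observation(0b1010, 0, 0b001, 0, True), "016b"))
```

Packing the fields this way makes the size of the agent's observation space explicit: 4 + 4 + 3 + 4 + 1 = 16 bits per step.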

Category: Science & Technology

License: Standard YouTube License
