Playing Montezuma's Revenge with Intrinsic Motivation




Published on Jun 8, 2016

This is a short video showcasing the paper "Unifying Count-Based Exploration and Intrinsic Motivation" by Bellemare, Srinivasan, Ostrovski, Schaul, Saxton, and Munos from Google DeepMind. https://arxiv.org/abs/1606.01868

The video depicts a DQN agent playing Montezuma's Revenge via the Arcade Learning Environment. The agent's reward function is augmented with an intrinsic reward based on a pseudo-count, itself computed from a sequential density model. This intrinsic reward allows the agent to explore a full two-thirds of the first level of the game and achieve significantly higher scores than anything previously reported.
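As a rough illustration of the idea described above, the following Python sketch derives a pseudo-count from a density model's probability of a state before and after one more observation of it, then turns that pseudo-count into an exploration bonus. The pseudo-count formula and the inverse-square-root form of the bonus follow the linked paper; the function names, the constants `beta` and `eps`, and the toy count-based density are illustrative assumptions, not the agent's actual code.

```python
import math


def pseudo_count(rho, rho_prime):
    """Pseudo-count from a density model.

    rho       -- probability the model assigns to state x before observing it
    rho_prime -- probability the model assigns to x after one more observation
    """
    if rho_prime <= rho:
        # The derivation needs a model whose probability for x rises after seeing x.
        raise ValueError("rho_prime must be greater than rho")
    return rho * (1.0 - rho_prime) / (rho_prime - rho)


def intrinsic_reward(n_hat, beta=0.05, eps=0.01):
    """Exploration bonus that shrinks as the pseudo-count grows.

    beta and eps are illustrative values (assumptions).
    """
    return beta / math.sqrt(n_hat + eps)


# Toy check with an empirical (count-based) density, for which the
# pseudo-count should recover the true visit count.
count, total = 3, 10                   # x seen 3 times in 10 observations
rho = count / total                    # density before observing x again
rho_prime = (count + 1) / (total + 1)  # density after observing x again
n_hat = pseudo_count(rho, rho_prime)   # approximately 3
bonus = intrinsic_reward(n_hat)        # added to the game reward
```

With an empirical density the pseudo-count equals the real visit count, which is why the toy check is useful. In the setting the description refers to, the density comes from a sequential model over game frames, so rarely seen screens receive a small pseudo-count and therefore a larger bonus.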

See also

Explored rooms during training: https://youtu.be/2q4Tv4WSj_s
Episodes at 50 million frames: https://youtu.be/qeeTok1qDZk
Episode at 100 million frames: https://youtu.be/EzQwCmGtEHs

