Mnih Supplementary Data video 2 R3

401,164 views

Uploaded on Feb 23, 2015

Caption: This ~1 minute video illustrates the improvement in the performance of DQN over training (i.e., after 100, 200, 400, and 600 episodes). After 600 episodes, DQN finds and exploits the optimal strategy in this game, which is to make a tunnel around the side and then allow the ball to hit blocks by bouncing behind the wall. Note: the score is displayed at the top left of the screen (the maximum for clearing one screen is 448 points), the number of lives remaining is shown in the middle (starting with 5 lives), and the “1” at the top right indicates that this is a 1-player game.
Credit: Google DeepMind (with permission from Atari Interactive Inc.)
