"Reinforcement Learning for Recommender Systems: A Case Study on Youtube," by Minmin Chen

4,904 views

Published on Mar 28, 2019

While reinforcement learning (RL) has achieved impressive advances in games and robotics, it has not been widely adopted in recommender systems. Framing recommendation as an RL problem offers new perspectives, but also faces significant challenges in practice. Industrial recommender systems deal with extremely large action spaces (many millions of items to recommend) and complex user state spaces (billions of users, each of whom is unique at any point in time). In this talk, I will discuss our work on scaling up a policy-gradient-based algorithm, namely REINFORCE, to a production recommender system at YouTube. We proposed algorithms to address data biases when deriving policy updates from logged implicit feedback. I will also discuss some follow-up work and outstanding research questions in applying RL, in particular off-policy optimization, to recommender systems.
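
The abstract does not spell out how the bias in logged feedback is corrected. As a rough illustration only, one standard way to reuse data logged by a different (behavior) policy in a REINFORCE update is to reweight each logged sample by the ratio of the new policy's probability to the logging policy's probability, with the ratio capped to limit variance. The sketch below assumes a toy softmax policy over a handful of items; the function names, the `weight_cap` parameter, and the `(action, reward, behavior_prob)` record layout are hypothetical and are not taken from the talk.

```python
import math


def softmax(logits):
    """Convert raw item scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def off_policy_reinforce_grad(logits, logged, weight_cap=10.0):
    """Importance-weighted REINFORCE gradient w.r.t. the softmax logits.

    logged: list of (action, reward, behavior_prob) tuples, where
    behavior_prob is the probability the logging policy gave to the
    action it took. All names here are illustrative assumptions.
    """
    pi = softmax(logits)
    grad = [0.0] * len(logits)
    for action, reward, behavior_prob in logged:
        # Importance weight corrects for the mismatch between the policy
        # being trained and the policy that produced the log; capping
        # trades a little bias for lower variance.
        weight = min(pi[action] / behavior_prob, weight_cap)
        for k in range(len(logits)):
            # d/d logit_k of log pi(action) for a softmax policy.
            indicator = 1.0 if k == action else 0.0
            grad[k] += weight * reward * (indicator - pi[k])
    return [g / len(logged) for g in grad]


# Toy usage: three items, two logged interactions.
logits = [0.0, 0.0, 0.0]
logged = [(0, 1.0, 0.5), (2, 0.0, 0.25)]
grad = off_policy_reinforce_grad(logits, logged)
# One gradient-ascent step on the logits.
logits = [l + 0.1 * g for l, g in zip(logits, grad)]
```

A production system with millions of items would not enumerate every action as this loop does; the sketch is only meant to show where the importance weight enters the update.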
