Q-Value Weighted Regression: Reinforcement Learning with Limited Data

Speaker(s)
Piotr Kozakowski
Date
May 6, 2021, 12:15 p.m.
Information about the event
meet.google.com/yew-oubf-ngi
Seminar
"Machine Learning" Seminar

Sample efficiency is a major challenge in current reinforcement learning (RL) systems. Another is robustness: it is hard to find a single RL algorithm that performs well across a variety of settings. I am going to present QWR, a novel RL algorithm that performs on par with Soft Actor-Critic (SAC) in continuous control tasks, works well in the offline RL setting (unlike SAC), and surpasses Rainbow in sample efficiency on the Atari benchmark, while being significantly simpler than both algorithms. I will also present several related RL methods that influenced the design of QWR.