Speaker(s): Piotr Kozakowski
Date: May 6, 2021, 12:15 p.m.
Information about the event: meet.google.com/yew-oubf-ngi
Seminar: Seminarium "Machine Learning"

Sample efficiency is a major challenge in current Reinforcement Learning (RL) systems. Another is robustness: it is hard to find a single RL algorithm that performs well across a variety of settings. I am going to present Q-Value Weighted Regression (QWR), a novel RL algorithm that performs on par with Soft Actor-Critic (SAC) on continuous control tasks, works well in the Offline RL setting (unlike SAC), and surpasses Rainbow in sample efficiency on the Atari benchmark, while being significantly simpler than both algorithms. I will also present several related RL methods that influenced the design of QWR.

Q-Value Weighted Regression: Reinforcement Learning with Limited Data
