Speaker(s): Piotr Kozakowski
Affiliation: University of Warsaw
Date: April 7, 2022, 12:15
Room: 3140
Seminar: "Machine Learning" Seminar

Recent work has shown the effectiveness of entropy regularization in Monte Carlo Tree Search (MCTS). In this presentation, I will first introduce the framework of Maximum Entropy Reinforcement Learning and show how it can be applied to MCTS. Then I will present several variants of entropy regularization. Next, I will explain how relative entropy regularization can be applied to a planning-learning system akin to MuZero and what particular benefits it can bring to planning with a learned model. Finally, I will show our preliminary results on the Atari 100K benchmark.

Entropy-Regularized Planning
