
Decision making in uncertainty: Introduction to Multi-Armed Bandits algorithms

Speaker(s)
Piotr Januszewski
Date
26 November 2020, 12:15
Event information
Google Meet (meet.google.com/yew-oubf-ngi)
Seminar
"Uczenie maszynowe" (Machine Learning) seminar

There are many reasons to care about bandit problems. Decision-making under uncertainty is a challenge we all face, and bandits provide a simple model of this dilemma that we can study. Bandit problems also have practical applications, such as configuring web interfaces, news recommendation, dynamic pricing, and ad placement. In this lecture, you will learn the theoretical framework of bandit problems. I will present both common-sense and state-of-the-art algorithms for solving them.
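The announcement does not specify which algorithms the talk covers. As a minimal, hypothetical sketch of the explore-exploit dilemma it addresses, here is an ε-greedy strategy on a simulated Bernoulli bandit; the arm means, step count, and ε value below are illustrative assumptions, not taken from the talk:

```python
import random

def epsilon_greedy(true_means, steps=10_000, eps=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit: with probability eps pull a
    random arm (explore), otherwise pull the arm with the best current
    estimated mean reward (exploit)."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms          # number of pulls per arm
    estimates = [0.0] * n_arms     # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < eps:
            arm = rng.randrange(n_arms)                           # explore
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0   # Bernoulli pull
        counts[arm] += 1
        # Incremental mean update avoids storing per-arm reward histories.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

# Illustrative run: three arms with success probabilities 0.1, 0.5, 0.8.
estimates, counts, total = epsilon_greedy([0.1, 0.5, 0.8])
```

After enough steps the best arm (index 2 here) receives the vast majority of pulls, while the ε fraction of random exploration keeps the estimates of the other arms from going stale.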
Bibliography:
- Lattimore, T., & Szepesvári, C. (2020). Bandit Algorithms. URL: https://tor-lattimore.com/downloads/book/book.pdf.
- Liaw, C. (2019). Introduction to Bandits. URL: https://www.cs.ubc.ca/labs/lci/mlrg/slides/2019_summer_4_intro_to_bandits.pdf.
- Kuciński, Ł. & Miłoś, P. (2020). Reinforcement Learning Course: Exploration and exploitation. URL: https://drive.google.com/drive/folders/11TMHI6iM0lMuhySz_eIpXnW6MokKAeic.
- Weng, L. (2018). The Multi-Armed Bandit Problem and Its Solutions. URL: https://lilianweng.github.io/lil-log/2018/01/23/the-multi-armed-bandit-problem-and-its-solutions.html.