search

actions - event

state: published

Algorithmes de bandits pour l'exploration d'arbres, avec applications aux jeux, à l'optimisation et au contrôle

old_uid	5330
title	Algorithmes de bandits pour l'exploration d'arbres, avec applications aux jeux, à l'optimisation et au contrôle
start_date	2008/10/06
schedule	13h30
online	no
summary	Bandit-based methods for tree search have recently gained popularity when applied to huge trees, e.g., in the game of go. Their efficient exploration policy of the tree enables to return rapidly a good value and improve precision if more time is provided. The UCT algorithm, a tree search method based on Upper Confidence Bounds (UCB) is believed to adapt locally to the effective smoothness of the tree. However, it is known that UCT may be over-optimistic for some problems, leading to high regret. I will describe alternative bandit-based approaches and analyze a Bandit Algorithm for Smooth Trees (BAST) which takes into account actual smoothness of the rewards for performing efficient cuts of sub-optimal branches with high confidence. I will then illustrate several fields of applications, such as minimax search in games, global optimization of Lipschitz functions, and discounted optimal control.
responsibles	Biau, Stoltz, Massart

hosted_by

Ecole normale supérieure - ENS

speakers

event_of

Apprentissage (séminaire du groupe de travail) (2008)

Event #158814 - latest update on 2022/05/17, created on 2008/09/24