Mathematics – Statistics Theory
Scientific paper
2011-05-29
COLT'11 (2011) 18
We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite support (not necessarily known beforehand), whose asymptotic regret matches the lower bound of Burnetas and Katehakis (1996). Our contribution is a finite-time analysis of this algorithm; we obtain bounds whose main terms are smaller than those of previously known algorithms with finite-time analyses (such as UCB-type algorithms).
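To make the idea of a Kullback-Leibler-based index concrete, here is a minimal sketch in Python. It is not the algorithm analysed in the paper: it is a simplified KL-UCB-style index restricted to Bernoulli rewards (support {0, 1}), shown only for illustration. The function names (`kl_finite`, `kl_bernoulli`, `kl_ucb_index`) and the exploration budget `log(t)` are assumptions made for this example.

```python
import math


def kl_finite(p, q):
    """KL(p || q) for two distributions given as probability lists
    over the same finite support (illustrative helper)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # convention: 0 * log(0 / q) = 0
        if qi == 0.0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    return kl_finite([1.0 - p, p], [1.0 - q, q])


def kl_ucb_index(mean, pulls, t, tol=1e-9):
    """Largest q in [mean, 1] such that pulls * KL(mean, q) <= log(t).

    Found by bisection, which works because KL(mean, q) is
    nondecreasing in q on [mean, 1]. The budget log(t) is an
    assumed choice for this sketch.
    """
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if kl_bernoulli(mean, mid) <= budget:
            lo = mid  # mid is still a plausible mean
        else:
            hi = mid  # mid is ruled out by the KL constraint
    return lo
```

At each round, a policy of this kind would pull the arm with the largest index. The index shrinks toward the empirical mean as the number of pulls grows, so arms that have been sampled often are explored less.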
Maillard Odalric-Ambrym
Munos Rémi
Stoltz Gilles
A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences