Exploiting structure of uncertainty for efficient matroid semi-bandits

Pierre Perrault 1 Vianney Perchet 2 Michal Valko 1, 3
1 SEQUEL - Sequential Learning
Inria Lille - Nord Europe, CRIStAL - Centre de Recherche en Informatique, Signal et Automatique de Lille (CRIStAL) - UMR 9189
Abstract : We improve the efficiency of algorithms for stochastic combinatorial semi-bandits. In most interesting problems, state-of-the-art algorithms take advantage of structural properties of rewards, such as independence. However, while being optimal in terms of asymptotic regret, these algorithms are inefficient. In our paper, we first reduce their implementation to a specific submod-ular maximization. Then, in case of matroid constraints , we design adapted approximation routines , thereby providing the first efficient algorithms that rely on reward structure to improve regret bound. In particular, we improve the state-of-the-art efficient gap-free regret bound by a factor √ m/ log m, where m is the maximum action size. Finally, we show how our improvement translates to more general budgeted combinato-rial semi-bandits.
Document type :
Conference papers
Complete list of metadatas

Cited literature [20 references]  Display  Hide  Download

https://hal.inria.fr/hal-02387478
Contributor : Michal Valko <>
Submitted on : Friday, November 29, 2019 - 6:11:03 PM
Last modification on : Sunday, December 1, 2019 - 1:13:23 AM

File

supplementary.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-02387478, version 1

Citation

Pierre Perrault, Vianney Perchet, Michal Valko. Exploiting structure of uncertainty for efficient matroid semi-bandits. International Conference on Machine Learning, 2019, Long Beach, United States. ⟨hal-02387478⟩

Share

Metrics

Record views

59

Files downloads

41