Non-Stochastic Multi-Player Multi-Armed Bandits - Optimal Rate With Collision Information, Sublinear Without

Yuanzhi Li
Mark Sellke

COLT, pp. 961-987, 2020.

Abstract:

We consider the non-stochastic version of the (cooperative) multi-player multi-armed bandit problem. The model assumes no communication at all between the players, and furthermore when two (or more) players select the same action this results in a maximal loss. We prove the first √T-type regret guarantee for this problem, under the feedba...
