# Multitask Bandit Learning through Heterogeneous Feedback Aggregation

Zhi Wang
Manish Kumar Singh
Laurel D. Riek

AISTATS, pp. 1531-1539, 2021.

Abstract:

In many real-world applications, multiple agents seek to learn how to perform highly related yet slightly different tasks in an online bandit learning protocol. We formulate this problem as the $\epsilon$-multi-player multi-armed bandit problem, in which a set of players concurrently interact with a set of arms, and for each arm, the re...
