Distilled Thompson Sampling: Practical and Efficient Thompson Sampling via Imitation Learning

Hongseok Namkoong
Samuel Daulton
Other Links: arxiv.org

Abstract:

Thompson sampling (TS) has emerged as a robust technique for contextual bandit problems. However, TS requires posterior inference and optimization for action generation, prohibiting its use in many internet applications where latency and ease of deployment are of concern. We propose a novel imitation-learning-based algorithm that distil...
