Distilled Thompson Sampling: Practical and Efficient Thompson Sampling via Imitation Learning
Abstract:
Thompson sampling (TS) has emerged as a robust technique for contextual bandit problems. However, TS requires posterior inference and optimization for action generation, prohibiting its use in many internet applications where latency and ease of deployment are of concern. We propose a novel imitation-learning-based algorithm that distil...