Causal Bandits: Learning Good Interventions via Causal Inference

Advances in Neural Information Processing Systems 29 (NIPS 2016)

Abstract
We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-arm bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than algorithms that do not use the additional causal information.
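
To make the feedback model concrete, below is a minimal sketch of the "parallel" causal bandit instance the paper uses as its running example: N independent binary causes of a binary reward Y, where an action is either an intervention do(X_i = j) or a purely observational pull do(). Because the learner observes the full realisation of all variables after every round, a single do() pull updates an estimate for every arm at once; that is the causal feedback the abstract refers to. The variable names (pull, q, mu_hat), the toy reward model, and all parameter values are illustrative assumptions, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parallel causal bandit: N independent binary causes
# X_0..X_{N-1} of a binary reward Y. Under observation, X_i ~ Bernoulli(q_i).
N = 5
q = rng.uniform(0.1, 0.9, size=N)

def pull(action):
    """One round of the bandit.

    action is None for the observational arm do(), or a pair (i, j)
    for the intervention do(X_i = j). Causal feedback: the learner
    sees the full vector x as well as the reward y.
    """
    x = (rng.random(N) < q).astype(int)
    if action is not None:
        i, j = action
        x[i] = j
    # Toy reward model (an assumption for illustration): only X_0 matters,
    # Y ~ Bernoulli(0.2 + 0.6 * X_0).
    y = int(rng.random() < 0.2 + 0.6 * x[0])
    return x, y

# In the parallel graph the causes are independent, so
# E[Y | do(X_i = j)] = E[Y | X_i = j]; observational rounds alone give
# unbiased estimates mu_hat[i, j] for all 2N interventional arms.
T = 2000
counts = np.zeros((N, 2))
sums = np.zeros((N, 2))
for _ in range(T):
    x, y = pull(None)                      # purely observational rounds
    counts[np.arange(N), x] += 1           # one count per (variable, value)
    sums[np.arange(N), x] += y

mu_hat = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
i, j = np.unravel_index(np.argmax(mu_hat), mu_hat.shape)
print(f"estimated best intervention: do(X_{i} = {j}), "
      f"estimated reward {mu_hat[i, j]:.3f}")
```

This sketch only shows the observational half of the story: with the reward model above it recovers do(X_0 = 1) as the best arm without ever intervening. The paper's algorithm additionally spends part of its budget on direct interventions for variables whose observed values are rare, which is what yields the simple-regret bound the abstract claims.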