Advances in Bandits with Knapsacks

arXiv (2021)

Abstract
"Bandits with Knapsacks" (BwK) is a general model for multi-armed bandits under supply/budget constraints. While worst-case regret bounds for BwK are well understood, we focus on logarithmic instance-dependent regret bounds. We largely resolve them for one limited resource other than time, and for known, deterministic resource consumption. We also bound regret within a given round ("simple regret"). One crucial technique analyzes the sum of the confidence terms of the chosen arms. This technique allows us to import insights from prior work on bandits without resources, which leads to several extensions.
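To make "the sum of the confidence terms of the chosen arms" concrete, here is a minimal Python sketch. It is not the paper's algorithm: the reward-per-cost UCB index, the radius sqrt(2 ln T / n), and all names (`run_ucb_with_budget`, `confidence_sum_bound`) are assumptions chosen for illustration. It simulates one limited resource with known, deterministic, strictly positive per-pull costs, and accumulates the confidence radius of whichever arm is chosen each round. The bound it compares against is the standard one from bandits without resources, which follows from sum_{n=1..N} 1/sqrt(n) <= 2 sqrt(N) and Cauchy-Schwarz.

```python
import math
import random


def confidence_sum_bound(num_arms, rounds, horizon):
    """Standard bound: sum of chosen-arm radii <= 2 * sqrt(2 ln(T) * K * rounds)."""
    return 2.0 * math.sqrt(2.0 * math.log(horizon) * num_arms * rounds)


def run_ucb_with_budget(means, costs, budget, horizon, seed=0):
    """Illustrative UCB-style play under one budget constraint (hypothetical helper).

    means:  Bernoulli reward means, unknown to the learner.
    costs:  known, deterministic, strictly positive cost of each pull.
    Returns (total_reward, rounds_played, confidence_sum, pulls).
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k
    reward_sums = [0.0] * k
    total_reward = 0.0
    confidence_sum = 0.0
    rounds = 0

    def radius(n):
        # Assumed confidence radius after n pulls of an arm.
        return math.sqrt(2.0 * math.log(horizon) / n)

    def index(a):
        # Assumed heuristic: optimistic reward estimate per unit of resource.
        ucb = min(1.0, reward_sums[a] / pulls[a] + radius(pulls[a]))
        return ucb / costs[a]

    while rounds < horizon:
        affordable = [a for a in range(k) if costs[a] <= budget]
        if not affordable:
            break  # the resource is exhausted
        untried = [a for a in affordable if pulls[a] == 0]
        arm = untried[0] if untried else max(affordable, key=index)

        reward = 1.0 if rng.random() < means[arm] else 0.0
        pulls[arm] += 1
        reward_sums[arm] += reward
        total_reward += reward
        budget -= costs[arm]
        rounds += 1
        # Accumulate the confidence term of the arm that was actually chosen.
        confidence_sum += radius(pulls[arm])

    return total_reward, rounds, confidence_sum, pulls


if __name__ == "__main__":
    total, rounds, conf_sum, pulls = run_ucb_with_budget(
        means=[0.3, 0.8, 0.5], costs=[0.5, 1.0, 0.25], budget=200.0, horizon=1000
    )
    print("rounds played:", rounds, "pulls per arm:", pulls)
    print("confidence sum:", conf_sum)
    print("bound:", confidence_sum_bound(len(pulls), rounds, 1000))
```

The bound holds for any sequence of chosen arms, which is why an argument phrased in terms of this sum can carry over from the setting without resources.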
Keywords
knapsacks, worst-case