High-dimensional Continuum Armed and High-dimensional Contextual Bandit: with Applications to Assortment and Pricing

Junhui Cai, Ran Chen, Martin J. Wainwright, Linda H. Zhao

ICLR 2023 (2023)

Abstract
Decision-makers often face bandit problems with both high-dimensional continuum arms and high-dimensional contextual covariates, yet the problem remains unsolved. Existing bandit algorithms are impractical because of this two-layer high dimensionality. We formulate the problem as a high-dimensional continuum-armed contextual bandit with high-dimensional covariates and propose a novel model that captures the effect of the arm and the context on the reward through a low-rank representation matrix. The representation matrix is both interpretable and predictive. We further propose an efficient bandit algorithm based on a low-rank matrix estimator, with theoretical justification. The generality of our model allows a wide range of applications, including business and healthcare. In particular, we apply our method to assortment and pricing, both of which are important decisions for firms such as online retailers. Our method solves the assortment and pricing problems jointly, whereas most existing methods address them separately. We demonstrate the effectiveness of our method in jointly optimizing assortment and pricing to maximize revenue for a giant online retailer.
Keywords
bandit, high-dimensional statistics, assortment, pricing, reinforcement learning
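
To make the idea of a low-rank representation matrix linking arm and context more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes, purely for illustration, a bilinear reward of the form a^T Theta x + noise with a low-rank Theta; the abstract does not state the exact model. The estimator here is ordinary least squares followed by a truncated SVD, a simple stand-in for the paper's low-rank matrix estimator, and all names (`fit_low_rank`, `greedy_arm`, the dimensions) are hypothetical.

```python
import numpy as np


def truncated_svd(M, rank):
    """Best rank-`rank` approximation of M in Frobenius norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def fit_low_rank(arms, contexts, rewards, rank):
    """Least squares on vec(a x^T) features, then projection to low rank.

    Assumed model (illustrative only): reward = a^T Theta x + noise.
    """
    n, d_a = arms.shape
    d_x = contexts.shape[1]
    # Each row is the flattened outer product of one arm and one context.
    Z = np.einsum("ni,nj->nij", arms, contexts).reshape(n, d_a * d_x)
    theta_vec, *_ = np.linalg.lstsq(Z, rewards, rcond=None)
    return truncated_svd(theta_vec.reshape(d_a, d_x), rank)


def greedy_arm(Theta_hat, x):
    """Arm on the unit sphere maximizing the estimated reward a^T Theta_hat x."""
    g = Theta_hat @ x
    norm = np.linalg.norm(g)
    return g / norm if norm > 0 else g


# Synthetic data under the assumed bilinear model.
rng = np.random.default_rng(0)
d_a, d_x, rank, n = 6, 5, 2, 400
Theta = rng.normal(size=(d_a, rank)) @ rng.normal(size=(rank, d_x))
arms = rng.normal(size=(n, d_a))
contexts = rng.normal(size=(n, d_x))
noise = 0.1 * rng.normal(size=n)
rewards = np.einsum("ni,ij,nj->n", arms, Theta, contexts) + noise

Theta_hat = fit_low_rank(arms, contexts, rewards, rank)
rel_err = np.linalg.norm(Theta_hat - Theta) / np.linalg.norm(Theta)
best_arm = greedy_arm(Theta_hat, contexts[0])
```

In this sketch the arm set is taken to be the unit ball, so the greedy choice has a closed form; a real assortment-pricing problem would impose its own constraints on the arm, and a bandit algorithm would also need an exploration step that is omitted here.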