Competing Bandits: The Perils of Exploration under Competition.

arXiv: Computer Science and Game Theory(2019)

引用 21|浏览55
暂无评分
摘要
empirically study the interplay between exploration and competition. Systems that learn from interactions with users often engage in exploration: making potentially suboptimal decisions in order to acquire new information for future decisions. However, when multiple systems are competing for the same market of users, exploration may hurt a systemu0027s reputation in the near term, with adverse competitive effects. In particular, a system may enter a death spiral, when the short-term reputation cost decreases the number of users for the system to learn from, which degrades its performance relative to competition and further decreases its market share. We ask whether better exploration algorithms are incentivized under competition. run extensive numerical experiments in a stylized duopoly model in which two firms deploy multi-armed bandit algorithms and compete for myopic users. find that duopoly and monopoly tend to favor a primitive greedy algorithm that does not explore and leads to low consumer welfare, whereas a temporary monopoly (a duopoly with an early entrant) may incentivize better bandit algorithms and lead to higher consumer welfare. Our findings shed light on the first-mover advantage in the digital economy by exploring the role that data can play as a barrier to entry in online markets.
更多
查看译文
关键词
bandits,exploration
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要