LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions.

arXiv: Learning(2017)

Abstract
We present LADDER, the first deep reinforcement learning agent that can successfully learn control policies for large-scale real-world problems directly from raw inputs composed of high-level semantic information. The agent is based on an asynchronous stochastic variant of DQN (Deep Q Network) named DASQN. The inputs of the agent are plain-text descriptions of states of a game of incomplete information, i.e. real-time large scale online auctions, and the rewards are auction profits of very large scale. We apply the agent to an essential portion of JD's online RTB (real-time bidding) advertising business and find that it easily beats the former state-of-the-art bidding policy that had been carefully engineered and calibrated by human experts: during JD.com's June 18th anniversary sale, the agent increased the company's ads revenue from the portion by more than 50%, while the advertisers' ROI (return on investment) also improved significantly.