Dynamic Bike Reposition: A Spatio-Temporal Reinforcement Learning Approach.

KDD(2018)

引用 118|浏览145
暂无评分
摘要
Bike-sharing systems are widely deployed in many major cities, while the jammed and empty stations in them lead to severe customer loss. Currently, operators try to constantly reposition bikes among stations when the system is operating. However, how to efficiently reposition to minimize the customer loss in a long period remains unsolved. We propose a spatio-temporal reinforcement learning based bike reposition model to deal with this problem. Firstly, an inter-independent inner-balance clustering algorithm is proposed to cluster stations into groups. Clusters obtained have two properties, i.e. each cluster is inner-balanced and independent from the others. As there are many trikes repositioning in a very large system simultaneously, clustering is necessary to reduce the problem complexity. Secondly, we allocate multiple trikes to each cluster to conduct inner-cluster bike reposition. A spatio-temporal reinforcement learning model is designed for each cluster to learn a reposition policy in it, targeting at minimizing its customer loss in a long period. To learn each model, we design a deep neural network to estimate its optimal long-term value function, from which the optimal policy can be easily inferred. Besides formulating the model in a multi-agent way, we further reduce its training complexity by two spatio-temporal pruning rules. Thirdly, we design a system simulator based on two predictors to train and evaluate the reposition model. Experiments on real-world datasets from Citi Bike are conducted to confirm the effectiveness of our model.
更多
查看译文
关键词
Bike-Sharing System,Dynamic Bike Reposition,Reinforcement Learning
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要