# Large-scale short-term urban taxi demand forecasting using deep learning

ASP-DAC, pp. 428-433, 2018.

Abstract:

The world has seen in recent years great successes in applying deep learning (DL) to many application domains. Though powerful, DL is not easy to use well. In this invited paper, we study an urban taxi demand forecasting problem using DL, and we show a number of key insights in modeling a domain problem as a suitable DL task. We also co…

Introduction

- Deep learning (DL), and deep neural networks (DNNs) in particular, has recently emerged as a powerful and seemingly universal machine learning tool for solving many traditional problems [1], with state-of-the-art results reported in areas such as image recognition [2], speech recognition [3], and machine translation [4]
- Because of these successes, as the law of the instrument suggests, there is a great tendency among machine learning practitioners to treat DL as a “golden hammer” and every problem as a “nail”.
- The authors focus on the transportation domain and present a systematic study on applying DL techniques to solve the large-scale short-term urban travel demand forecasting problem

Highlights

- Deep learning (DL), and deep neural networks (DNNs) in particular, has recently emerged as a powerful and seemingly universal machine learning tool for solving many traditional problems [1], with state-of-the-art results reported in areas such as image recognition [2], speech recognition [3], and machine translation [4]
- As the law of the instrument suggests, there is a great tendency among machine learning practitioners to treat DL as a “golden hammer” and every problem as a “nail”. Such a tendency may seem warranted at this moment, especially given the tremendous success of DL in solving so many diversified problems
- In addition to the temporal and spatial dependencies, taxi demands are affected by external factors such as weather conditions and city holidays. These factors can be modeled as a separate input to the aforementioned deep neural networks
- We report the forecasting results on the New York City Taxi dataset using the root-mean-squared error (RMSE) and training time, as shown in Table I
- While we show DNNs’ improvements in prediction accuracy, a more practical question is what such improvements would mean to transportation planners or operators
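The highlights above report accuracy as RMSE. As a minimal sketch (NumPy only; the toy demand grid and variable names are illustrative, not from the paper), RMSE over per-region, per-time-step demand predictions can be computed as:

```python
import numpy as np

def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Root-mean-squared error over all regions and time steps."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy example: demand counts for 4 regions over 3 time steps.
y_true = np.array([[10, 12, 9, 7], [11, 13, 8, 6], [9, 14, 10, 5]], dtype=float)
y_pred = np.array([[9, 13, 9, 8], [12, 12, 8, 7], [9, 15, 9, 6]], dtype=float)
print(rmse(y_true, y_pred))  # ≈ 0.866
```

Because the error is squared before averaging, RMSE penalizes large per-region misses more heavily than a mean absolute error would, which matters when a few hot zones dominate demand.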

Conclusion

- From the perspective of network architecture, it may seem that FCL-Net is a more powerful network than ST-ResNet for extracting both temporal and spatial dependencies, and that it should produce better results for taxi demand forecasting
- Since such a study was not available in the literature [5, 6], the authors conduct a systematic study in this paper on a common dataset to examine this claim.
- This shows the importance of incorporating domain knowledge into the design of the right DNN architecture

Summary


- Table I: Travel Demand Prediction on NYC Taxi Dataset

Related work

- A. DNN Architectures

Deep neural networks (DNNs) are the most prevalent deep learning techniques these days. Among its wide variety of applications, image and video processing is still by far the most successful one for the reasons we will explain shortly.

Convolutional neural networks (CNNs) [7] are a class of DNN designed in part to extract the spatial features of a static image by applying different convolutional kernels. Stacking convolutional layers in a DNN increases the chance of extracting higher-level spatial features that are more powerful for prediction. ResNet is one of the most popular CNN architectures; it stacks a very large number of convolutional layers while avoiding the difficulty of training such a deep network [8].
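ResNet sidesteps the deep-training difficulty with identity skip connections: each block learns a residual F(x) that is added back to its input, y = ReLU(x + F(x)). A minimal sketch of one residual block's forward pass (NumPy only; the single-channel toy convolution and kernel shapes are illustrative assumptions, not the paper's architecture):

```python
import numpy as np

def conv2d_same(x: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Single-channel 2D convolution with zero ('same') padding."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)

def residual_block(x: np.ndarray, k1: np.ndarray, k2: np.ndarray) -> np.ndarray:
    """y = ReLU(x + F(x)); the skip connection passes x through unchanged."""
    f = conv2d_same(relu(conv2d_same(x, k1)), k2)
    return relu(x + f)

# Toy 4x4 grid and two 3x3 kernels.
rng = np.random.default_rng(0)
x = rng.random((4, 4))
k1, k2 = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
y = residual_block(x, k1, k2)
assert y.shape == x.shape  # 'same' padding preserves the spatial size
```

The design point is that when F(x) is near zero (e.g., early in training), the block degenerates to the identity, so adding more blocks never has to hurt: gradients flow through the skip path even in very deep stacks.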

Key points

- The world has seen in recent years great successes in applying deep learning for many application domains
- Studies an urban taxi demand forecast problem using DL, and shows a number of key insights in modeling a domain problem as a suitable DL task
- The stacking of different convolutional layers in a DNN helps to increase the chance of extracting higher level spatial features that are more powerful for prediction
- Since sequential dependency may exist for different time horizons, input demand sequences are grouped into three different categories defined by short, medium, and long-term temporal ranges
- The three streams are fused as X_fuse = W_L ∘ X_L + W_M ∘ X_M + W_S ∘ X_S, where X_fuse is the fused output from the three separate streams of demand data, ∘ denotes element-wise multiplication by learnable weights W_L, W_M, and W_S, and X_L, X_M, and X_S are the outputs of the respective ResNets that handle the long-, medium-, and short-term temporal ranges of input data
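The element-wise fusion of the three temporal streams can be sketched as follows (NumPy only; the 2x2 grid, fixed weights, and names are illustrative assumptions, not the trained values):

```python
import numpy as np

def fuse(x_l, x_m, x_s, w_l, w_m, w_s):
    """Element-wise weighted fusion of long-, medium-, and short-term streams."""
    return w_l * x_l + w_m * x_m + w_s * x_s

# Toy 2x2 city grid: one demand map per temporal stream.
x_l = np.array([[1.0, 2.0], [3.0, 4.0]])  # long-term ResNet output
x_m = np.array([[0.5, 0.5], [0.5, 0.5]])  # medium-term ResNet output
x_s = np.array([[2.0, 0.0], [1.0, 1.0]])  # short-term ResNet output
w_l = np.full((2, 2), 0.2)                # learnable weights (fixed here)
w_m = np.full((2, 2), 0.3)
w_s = np.full((2, 2), 0.5)
x_fuse = fuse(x_l, x_m, x_s, w_l, w_m, w_s)
print(x_fuse)  # per-cell weighted combination of the three streams
```

Because the weights are per-cell rather than scalar, the network can learn, for example, that demand in a business district is driven mostly by the short-term stream while a residential zone leans on the long-term pattern.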

Reference

- Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
- G. Huang, Z. Liu, K. Q. Weinberger, and L. van der Maaten, “Densely connected convolutional networks,” arXiv preprint arXiv:1608.06993, 2016.
- G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al., “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal Processing Magazine, vol. 29, no. 6, pp. 82–97, 2012.
- I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence to sequence learning with neural networks,” in Advances in neural information processing systems, pp. 3104–3112, 2014.
- J. Zhang, Y. Zheng, and D. Qi, “Deep spatio-temporal residual networks for citywide crowd flows prediction,” in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), pp. 1655–1661, 2017.
- J. Ke, H. Zheng, H. Yang, and X. M. Chen, “Short-term forecasting of passenger demand under on-demand ride services: A spatio-temporal deep learning approach,” Trans. Research Part C: Emerging Technologies, vol. 85, pp. 591–608, 2017.
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradientbased learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 770–778, 2016.
- F. Porikli and A. Yilmaz, “Object detection and tracking,” Video Analytics for Business Intelligence, pp. 3–41, 2012.
- S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural computation, vol. 9, no. 8, pp. 1735–1780, 1997.
- L. Moreira-Matias, J. Gama, M. Ferreira, J. Mendes-Moreira, and L. Damas, “Predicting taxi–passenger demand using streaming data,” IEEE Transactions on Intelligent Transportation Systems, vol. 14, no. 3, pp. 1393–1402, 2013.
- G. E. Box and D. A. Pierce, “Distribution of residual autocorrelations in autoregressive-integrated moving average time series models,” Journal of the American statistical Association, vol. 65, no. 332, pp. 1509–1526, 1970.
- J. O. Berger, Statistical decision theory and Bayesian analysis. Springer Science & Business Media, 2013.
- T. K. Ho, “Random decision forests,” in Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on, vol. 1, pp. 278–282, IEEE, 1995.
- H. Yi, H. Jung, and S. Bae, “Deep neural networks for traffic flow prediction,” in Big Data and Smart Computing (BigComp), 2017 IEEE Intl. Conf. on, pp. 328–331, IEEE, 2017.
- NYC Taxi and Limousine Commission, “TLC trip record data.” http://www.nyc.gov/html/tlc/html/about/trip_record_data.html, 2017.
- X. Shi, Z. Chen, H. Wang, D.-Y. Yeung, W.-K. Wong, and W.-c. Woo, “Convolutional LSTM network: A machine learning approach for precipitation nowcasting,” in Advances in neural information processing systems, pp. 802–810, 2015.
- B. Donovan and D. B. Work, “Using coarse GPS data to quantify city-scale transportation system resilience to extreme events,” arXiv preprint arXiv:1507.06011, 2015.
