AutoST: Efficient Neural Architecture Search for Spatio-Temporal Prediction

KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Virtual Event, CA, USA, July 2020, pp. 794-802.

Keywords:
Mean Absolute Error, Spatio-temporal Prediction, Recurrent Neural Nets, AutoML, Beijing Academy of Artificial Intelligence
Weibo:
We propose a novel Neural Architecture Search network named AutoST, with an efficient search space tailored to the spatio-temporal prediction task, which can be generalized to multiple different scenes.

Abstract:

Spatio-temporal (ST) prediction (e.g., crowd flow prediction) is of great importance in a wide range of smart city applications, such as urban planning, intelligent transportation, and public safety. Recently, many deep neural network models have been proposed to make accurate predictions. However, manually designing neural networks requires amo...

Introduction
  • Advances in location-acquisition and wireless communication technologies have resulted in massive amounts of spatio-temporal (ST) data, enabling many ST prediction tasks in cities, which are critically important for smart city applications [34].
  • The authors of [20] argue that long-distance spatial dependency is increasingly important, but a stack of multi-layer convolutions [32] can only capture correlations between neighboring regions.
  • They propose a ConvPlus component to capture the long-range spatial dependency among regions and a multi-scale fusion network to fuse multi-level features.
  • The authors of [8] believe that information in different ranges reveals distinct traffic properties: for example, the neighborhood range indicates local dependency, while a long range tends to uncover the overall pattern.
  • They propose a multi-range attention network to model the diverse spatial distance dependencies in a graph.
  • For temporal correlations, [13] adopts a series of 3D convolutional networks to extract spatio-temporal features.
    [Figure panels: (a) TaxiGY, (b) CrowdBJ]
Highlights
  • Advances in location-acquisition and wireless communication technologies have resulted in massive amounts of spatio-temporal (ST) data, enabling many ST prediction tasks in cities, which are critically important for smart city applications [34].
  • How to find the optimal neural architecture for the various scenarios in cities is still an unsettled problem, because ST tasks are usually affected by multiple complex factors: (i) the spatio-temporal correlation is complex, including spatial dependency between regions and temporal correlation among timestamps; (ii) the spatio-temporal correlation differs from location to location; for example, rush hours differ greatly between a core city and a small city; (iii) the spatio-temporal correlation is heterogeneous across tasks; for example, local spatial correlation is important to crowd flow prediction, while global spatial correlation is significant to taxi flow prediction.
  • Can AutoST be applied to a wide range of spatio-temporal prediction tasks and steadily improve performance compared with state-of-the-art networks?
  • We study the problem of spatio-temporal prediction using a neural architecture search method.
  • We propose a novel Neural Architecture Search (NAS) network named AutoST, with an efficient search space tailored to the spatio-temporal prediction task, which can be generalized to multiple different scenes.
  • We evaluate AutoST on four real-world datasets, ranging from crowd flow to taxi flow prediction; it performs better than fixed architectures and is more efficient than other search methods.
Methods
  • As shown in Figure 2, the key module in the proposed AutoST is ST-NASNet (Spatio-Temporal Neural Architecture Search Net), which is used to automatically learn the spatio-temporal network architecture.
  • Unlike a neural network with a fixed architecture, the NAS network is composed of three modules, from simple to complex: (i) a candidate cell module, which defines the search unit; (ii) an operation block module, which performs a weighted sum over all possible operations to make the search space continuous; (iii) a NAS network module, which consists of a series of mixed operations.
  • The authors illustrate these three modules in detail.
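The weighted sum in module (ii) can be illustrated with a minimal sketch. It assumes a DARTS-style [23] softmax relaxation over architecture parameters; the summary above only says "weighted sum", so the softmax, the toy candidate operations (`identity`, `zero`, `neighbor_mean`), and every function name here are hypothetical illustrations, not AutoST's actual search space.

```python
import math


def softmax(logits):
    """Turn unconstrained architecture parameters into weights that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations acting on a 1-D list of region values.
def identity(x):
    return list(x)


def zero(x):
    return [0.0] * len(x)


def neighbor_mean(x):
    """Average each value with its immediate neighbors (edges are clamped)."""
    n = len(x)
    return [(x[max(i - 1, 0)] + x[i] + x[min(i + 1, n - 1)]) / 3.0 for i in range(n)]


def mixed_operation(x, candidate_ops, arch_logits):
    """Weighted sum over the outputs of all candidate operations."""
    weights = softmax(arch_logits)
    outputs = [op(x) for op in candidate_ops]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(x))
    ]


def derive_operation(candidate_ops, arch_logits):
    """After the search, keep only the operation with the largest weight."""
    best = max(range(len(arch_logits)), key=lambda i: arch_logits[i])
    return candidate_ops[best]


if __name__ == "__main__":
    ops = [identity, zero, neighbor_mean]
    flows = [2.0, 4.0, 6.0]
    # Equal logits give every candidate the same weight.
    print(mixed_operation(flows, ops, [0.0, 0.0, 0.0]))
    # The discrete architecture keeps the highest-weighted candidate.
    print(derive_operation(ops, [0.1, -1.0, 2.0]).__name__)
```

Because the output depends smoothly on `arch_logits`, the choice among operations becomes a continuous quantity that gradient-based optimization can adjust, instead of a discrete pick that has to be enumerated.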
Results
  • The authors conduct experiments on four real-world citywide traffic flow datasets to evaluate the network performance.
  • The authors answer the following questions:
  • Q1. Can AutoST be applied to a wide range of spatio-temporal prediction tasks and steadily improve performance compared with state-of-the-art networks?
  • How do the settings of AutoST, i.e., the number of layers and the number of channels, impact the prediction result?
  • A brief introduction to the datasets used is given in Table 2.
Conclusion
  • The authors study the problem of spatio-temporal prediction using a neural architecture search method.
  • The authors propose a novel NAS network named AutoST, with an efficient search space tailored to the spatio-temporal prediction task, which can be generalized to multiple different scenes.
  • The proposed AutoST can automatically search for an architecture that handles the multi-range and multi-scale problems in prediction.
  • The authors evaluate AutoST on four real-world datasets, ranging from crowd flow to taxi flow prediction; it performs better than fixed architectures and is more efficient than other search methods.
Summary
  • Objectives:

    The authors aim to predict future inflows and outflows from historical observations.
  • Unlike previous studies, which design complex networks based on domain knowledge, the authors aim to automatically learn the neural architecture for different data, in order to improve the generalization ability of the model and free humans from designing networks by hand.
Tables
  • Table 1: Notations
  • Table 2: Datasets
  • Table 3: Performance comparison of different methods on three datasets
  • Table 4: Performances on TaxiBJ
Related work
  • 5.1 Spatial-Temporal Prediction

    Spatio-temporal data, such as traffic flow and regional rainfall, are ubiquitous in the physical world. Accurately predicting their future dynamics from previous observations is essential to a wide range of real-world applications, such as traffic management and weather forecasting [26].

    Recently, deep learning has been successfully applied to various scenarios in the ST area. For example, convolutional neural network (CNN) architectures are widely used in grid-based data modeling. Typically, [18, 19, 29, 31, 32] design specific neural network structures for modeling or predicting crowd flow as well as taxi demand. However, most of them predict citywide traffic flow based on multi-view [29] or multi-task [33] learning, which requires incorporating a large amount of expert knowledge. In addition, some researchers formulate the ST prediction problem on graphs and build models with graph convolutional networks [6, 12]. However, data quality differs greatly between cities: some cities release multiple years of data, while others release only a few days. To tackle this problem, transfer learning and meta-learning [25, 28] are often used for more accurate prediction.

    [Figure panels: (a) CrowdBJ, (b) TaxiGY]
Funding
  • This work was supported by the National Key R&D Program of China (2019YFB2101805) and the Beijing Academy of Artificial Intelligence (BAAI).
Reference
  • [1] J. An, H. Xiong, J. Huan, and J. Luo. Ultrafast photorealistic style transfer via neural architecture search. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020.
  • [2] B. Zoph and Q. V. Le. Neural architecture search with reinforcement learning. In International Conference on Learning Representations (ICLR-17), 2017.
  • [3] G. Bender, P.-J. Kindermans, B. Zoph, V. Vasudevan, and Q. Le. Understanding and simplifying one-shot architecture search. In Thirty-fifth International Conference on Machine Learning (ICML-2018), 2018.
  • [4] A. Brock, T. Lim, J. Ritchie, and N. Weston. Smash: One-shot model architecture search through hypernetworks. arXiv:1711.00536, 2017.
  • [5] H. Cai, L. Zhu, and S. Han. ProxylessNAS: Direct neural architecture search on target task and hardware. In Proceedings of the Seventh International Conference on Learning Representations (ICLR-2019), 2019.
  • [6] C. Chen, K. Li, S. G. Teo, X. Zou, K. Wang, J. Wang, and Z. Zeng. Gated residual recurrent graph neural networks for traffic prediction. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.
  • [7] H. Chen, L. Zhuo, B. Zhang, X. Zheng, J. Liu, D. Doermann, and R. Ji. Binarized neural architecture search. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020.
  • [8] W. Chen, L. Chen, Y. Xie, W. Cao, Y. Gao, and X. Feng. Multi-range attentive bicomponent graph convolutional network for traffic forecasting. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI-20), 2020.
  • [9] K. Cho, B. Van Merriënboer, C. Gulcehre, D. Bahdanau, F. Bougares, H. Schwenk, and Y. Bengio. Learning phrase representations using rnn encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
  • [10] X. Chu, B. Zhang, H. Ma, R. Xu, J. Li, and Q. Li. Fast, accurate and lightweight super-resolution with neural architecture search. arXiv:1901.07261, 2019.
  • [11] X. Chu, T. Zhou, B. Zhang, and J. Li. Fair darts: Eliminating unfair advantages in differentiable architecture search. arXiv preprint arXiv:1911.12126, 2019.
  • [12] X. Geng, Y. Li, L. Wang, L. Zhang, Q. Yang, J. Ye, and Y. Liu. Spatiotemporal multigraph convolution network for ride-hailing demand forecasting. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.
  • [13] S. Guo, Y. Lin, S. Li, Z. Chen, and H. Wan. Deep spatial–temporal 3d convolutional neural networks for traffic data forecasting. IEEE Transactions on Intelligent Transportation Systems (TITS-2019), 2019.
  • [14] Z. Guo, X. Zhang, H. Mu, W. Heng, Z. Liu, Y. Wei, and J. Sun. Single path one-shot neural architecture search with uniform sampling. arXiv preprint arXiv:1904.00420, 2019.
  • [15] H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean. Efficient neural architecture search via parameter sharing. In International Conference on Learning Representations (ICLR-18), 2018.
  • [16] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
  • [17] Y. Liang, S. Ke, J. Zhang, X. Yi, and Y. Zheng. Geoman: Multi-level attention networks for geo-sensory time series prediction. In Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pages 3428–3434. IJCAI, July 2018.
  • [18] Y. Liang, K. Ouyang, L. Jing, S. Ruan, Y. Liu, J. Zhang, D. S. Rosenblum, and Y. Zheng. Urbanfm: Inferring fine-grained urban flows. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-19), 2019.
  • [19] Y. Liang, K. Ouyang, J. Zhang, Y. Zheng, and D. S. Rosenblum. Revisiting convolutional neural networks for urban flow analytics. arXiv:2003.00895, 2020.
  • [20] Z. Lin, J. Feng, Z. Lu, Y. Li, and D. Jin. Deepstn+: Context-aware spatial temporal neural network for crowd flow prediction in metropolis. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19), 2019.
  • [21] C. Liu, L.-C. Chen, F. Schroff, H. Adam, W. Hua, A. Yuille, and F. Li. Hierarchical neural architecture search for semantic image segmentation. arXiv:1901.02985, 2019.
  • [22] H. Liu, K. Simonyan, O. Vinyals, C. Fernando, and K. Kavukcuoglu. Hierarchical representations for efficient architecture search. In Proceedings of the Sixth International Conference on Learning Representations (ICLR-2018), 2018.
  • [23] H. Liu, K. Simonyan, and Y. Yang. Darts: Differentiable architecture search. arXiv preprint arXiv:1806.09055, 2018.
  • [24] H. Lu, J. Langford, R. Caruana, S. Mukherjee, E. Horvitz, and D. Dey. Efficient forward architecture search. arXiv:1905.13360, 2019.
  • [25] Z. Pan, Y. Liang, W. Wang, Y. Yu, Y. Zheng, and J. Zhang. Urban traffic prediction from spatio-temporal data using deep meta learning. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD-19), 2019.
  • [26] X. Shi and D.-Y. Yeung. Machine learning for spatiotemporal sequence forecasting: A survey. arXiv preprint arXiv:1808.06865, 2018.
  • [27] Y. Wang, M. Long, J. Wang, Z. Gao, and P. S. Yu. Predrnn: Recurrent neural networks for predictive learning using spatiotemporal lstms. In Advances in Neural Information Processing Systems, pages 879–888, 2017.
  • [28] H. Yao, Y. Liu, Y. Wei, X. Tang, and Z. Li. Learning from multiple cities: A meta-learning approach for spatial-temporal prediction. In Proceedings of the Web Conference (WWW-2019), 2019.
  • [29] H. Yao, F. Wu, J. Ke, X. Tang, Y. Jia, S. Lu, P. Gong, J. Ye, and Z. Li. Deep multi-view spatial-temporal network for taxi demand prediction. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2018.
  • [30] B. Yu, H. Yin, and Z. Zhu. Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-17, 2017.
  • [31] J. Zhang, Y. Zheng, D. Qi, R. Li, and X. Yi. Dnn-based prediction model for spatial-temporal data. In SIGSPATIAL, 2016.
  • [32] J. Zhang, Y. Zheng, and D. Qi. Deep spatio-temporal residual networks for citywide crowd flows prediction. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), pages 1655–1661, 2017.
  • [33] J. Zhang, Y. Zheng, J. Sun, and D. Qi. Flow prediction in spatio-temporal networks based on multitask deep learning. IEEE Transactions on Knowledge and Data Engineering (TKDE-2019), September 2019.
  • [34] Y. Zheng, L. Capra, O. Wolfson, and H. Yang. Urban computing: Concepts, methodologies, and applications. ACM Transactions on Intelligent Systems and Technology, October 2014.
  • [35] Z. Zhu, C. Liu, D. Yang, A. Yuille, and D. Xu. V-nas: Neural architecture search for volumetric medical image segmentation. In 3DV, 2019.
  • [36] B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le. Learning transferable architectures for scalable image recognition. arXiv:1707.07012, 2017.