Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks

SIGIR, pp. 95-104, 2018.

DOI: https://doi.org/10.1145/3209978.3210006

Abstract:

Multivariate time series forecasting is an important machine learning problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic jam situations. Temporal data arising in these real-world applications often involve a mixture of long-term and short-term patterns, for which traditional approaches fall short.

Introduction
  • Multivariate time series data are ubiquitous in everyday life, ranging from prices in stock markets and traffic flows on highways to the outputs of solar power plants and the temperatures across different cities, just to name a few.
  • The long-term patterns reflect the difference between day vs. night, summer vs. winter, etc., while the short-term patterns reflect the effects of cloud movements, wind direction changes, etc.
  • Without taking both kinds of recurrent patterns into account, accurate time series forecasting is not possible.
  • Addressing such limitations of existing methods in time series forecasting is the main focus of this paper, for which the authors propose a novel framework that takes advantage of recent developments in deep learning research.
Highlights
  • Multivariate time series data are ubiquitous in our everyday life, ranging from prices in stock markets and traffic flows on highways to the outputs of solar power plants and the temperatures across different cities, just to name a few.
  • Without taking both kinds of recurrent patterns into account, accurate time series forecasting is not possible. Traditional approaches such as the large body of work in autoregressive methods [2, 12, 22, 32, 35] fall short in this aspect, as most of them do not distinguish the two kinds of patterns nor model their interactions explicitly and dynamically. Addressing such limitations of existing methods in time series forecasting is the main focus of this paper, for which we propose a novel framework that takes advantage of recent developments in deep learning research.
  • We propose a deep learning framework designed for multivariate time series forecasting, namely the Long- and Short-term Time-series Network (LSTNet), as illustrated in Figure 2 (see the sketch after this list).
  • We presented a novel deep learning framework (LSTNet) for the task of multivariate time series forecasting.
  • By combining the strengths of convolutional and recurrent neural networks and an autoregressive component, the proposed approach significantly improved the state-of-the-art results in time series forecasting on multiple benchmark datasets.
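To make the combination of components concrete, below is a minimal PyTorch sketch of an LSTNet-style model: a convolution over time for short-term local patterns, a GRU for longer dependencies, a recurrent-skip GRU for periodic patterns, and a linear autoregressive (AR) part added to the neural output. This is an illustration rather than the authors' released implementation: the class name, the layer sizes, the 168-step window, the skip period of 24 (one day at hourly resolution), and the AR window are all assumed hyperparameters, and the attention variant (LSTNet-Attn) is omitted.

```python
# Minimal LSTNet-style sketch (illustrative hyperparameters, not the paper's).
import torch
import torch.nn as nn

class LSTNetSketch(nn.Module):
    def __init__(self, n_vars, window=168, hid_cnn=50, hid_rnn=50,
                 hid_skip=5, skip=24, kernel=6, ar_window=24):
        super().__init__()
        self.skip, self.hid_skip, self.ar_window = skip, hid_skip, ar_window
        # Convolution over time, spanning all variables at once
        self.conv = nn.Conv2d(1, hid_cnn, kernel_size=(kernel, n_vars))
        self.conv_len = window - kernel + 1      # time steps after convolution
        self.n_skip = self.conv_len // skip      # skip-GRU sequence length
        self.gru = nn.GRU(hid_cnn, hid_rnn, batch_first=True)
        self.gru_skip = nn.GRU(hid_cnn, hid_skip, batch_first=True)
        self.fc = nn.Linear(hid_rnn + skip * hid_skip, n_vars)
        self.ar = nn.Linear(ar_window, 1)        # AR weights shared across variables

    def forward(self, x):                        # x: (batch, window, n_vars)
        b = x.size(0)
        c = torch.relu(self.conv(x.unsqueeze(1)))    # (b, hid_cnn, conv_len, 1)
        c = c.squeeze(3).transpose(1, 2)             # (b, conv_len, hid_cnn)
        _, h = self.gru(c)                           # final state: (1, b, hid_rnn)
        h = h.squeeze(0)
        # Recurrent-skip: regroup features so steps one period apart are consecutive
        s = c[:, -self.n_skip * self.skip:, :]
        s = s.reshape(b, self.n_skip, self.skip, -1).permute(0, 2, 1, 3)
        s = s.reshape(b * self.skip, self.n_skip, -1)
        _, hs = self.gru_skip(s)
        hs = hs.squeeze(0).reshape(b, self.skip * self.hid_skip)
        out = self.fc(torch.cat([h, hs], dim=1))     # nonlinear part: (b, n_vars)
        z = x[:, -self.ar_window:, :].transpose(1, 2)  # (b, n_vars, ar_window)
        return out + self.ar(z).squeeze(2)           # add the linear AR part

x = torch.randn(32, 168, 8)                # 32 windows of 168 steps, 8 variables
print(LSTNetSketch(n_vars=8)(x).shape)     # -> torch.Size([32, 8])
```

The AR component makes the output scale track the input scale, which is the paper's stated remedy for the scale insensitivity of purely neural forecasters.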
Results
  • The total count of the bold-faced results is 17 for LSTNet-Skip, 7 for LSTNet-Attn, and between 0 and 3 for the rest of the methods.
  • The two proposed models, LSTNet-Skip and LSTNet-Attn, consistently improve over the state-of-the-art on the datasets with periodic patterns, especially in the settings of large horizons.
  • LSTNet outperforms the strong neural baseline RNN-GRU by 9.2%, 11.7%, and 22.2% in the RSE metric on the Solar-Energy, Traffic, and Electricity datasets, respectively, when the horizon is 24, demonstrating the effectiveness of the framework design for complex repetitive patterns (the RSE and CORR metrics are sketched after this list).
  • Why? Recall that in Section 4.3 and Figure 3 the authors used the autocorrelation curves of these datasets to show the existence of repetitive patterns.
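For reference, here is a minimal NumPy sketch of the two metrics reported above, following the definitions used in the paper: root relative squared error (RSE, lower is better) and empirical correlation (CORR, higher is better). The function names and the (time, variables) array layout are our own choices.

```python
# RSE and CORR for multivariate forecasts; y_true, y_pred: (time, n_vars) arrays.
import numpy as np

def rse(y_true, y_pred):
    # Scale-free variant of RMSE: error relative to a mean predictor
    return np.sqrt(((y_true - y_pred) ** 2).sum()) / \
           np.sqrt(((y_true - y_true.mean()) ** 2).sum())

def corr(y_true, y_pred):
    # Pearson correlation per variable over time, averaged across variables
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    r = (yt * yp).sum(axis=0) / np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return r.mean()
```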
Conclusion
  • The authors presented a novel deep learning framework (LSTNet) for the task of multivariate time series forecasting.
  • By combining the strengths of convolutional and recurrent neural networks and an autoregressive component, the proposed approach significantly improved the state-of-the-art results in time series forecasting on multiple benchmark datasets.
  • In the convolution layer the authors treat each variable dimension equally, but real-world datasets usually carry rich attribute information.
  • Integrating such information into the LSTNet model is another challenging problem.
Tables
  • Table 1: Dataset statistics, where T is the length of the time series, D is the number of variables, and L is the sampling rate
  • Table 2: Results summary (in RSE and CORR) of all methods on four datasets: 1) each row has the results of a specific method in a particular metric; 2) each column compares the results of all methods on a particular dataset with a specific horizon value; 3) bold face indicates the best result of each column in a particular metric; and 4) the total number of bold-faced results of each method is listed under the method name within parentheses
Study subjects and analysis
benchmark datasets: 4
The problem then becomes a regression task with a set of feature-value pairs {X_t, Y_{t+h}}, which can be solved by Stochastic Gradient Descent (SGD) or its variants such as Adam. We conducted extensive experiments with 9 methods (including our new methods) on 4 benchmark datasets for time series forecasting tasks. All the data and experiment code are available online.
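As an illustration of this regression formulation, the sketch below builds the {X_t, Y_{t+h}} pairs with a sliding window and fits a toy linear forecaster with Adam; the window length, horizon, and the model itself are placeholder choices, not the paper's setup.

```python
# Hypothetical sketch: sliding-window {X_t, Y_{t+h}} pairs + Adam training.
import torch
import torch.nn.functional as F

def make_pairs(series, window, horizon):
    """series: (T, D) tensor -> inputs (N, window, D), targets (N, D)."""
    ts = range(window, len(series) - horizon + 1)
    X = torch.stack([series[t - window:t] for t in ts])       # past window
    Y = torch.stack([series[t + horizon - 1] for t in ts])    # value h steps ahead
    return X, Y

series = torch.randn(1000, 8)                      # toy data: T=1000, D=8
X, Y = make_pairs(series, window=168, horizon=24)  # predict 24 steps ahead
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(168 * 8, 8))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for epoch in range(10):                            # full-batch for brevity
    opt.zero_grad()
    loss = F.mse_loss(model(X), Y)                 # squared-error objective
    loss.backward()
    opt.step()
print(loss.item())
```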

benchmark datasets: 4
4.3 Data. We used four benchmark datasets, which are publicly available. Table 1 summarizes the corpus statistics.

datasets: 4
(Figure 3 panels: (a) Traffic dataset; (b) Solar-Energy dataset.) In order to examine the existence of long-term and/or short-term repetitive patterns in time series data, we plot the autocorrelation graphs for some randomly selected variables from the four datasets in Figure 3. Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of the delay τ: R(τ) = E[(X_t − μ)(X_{t+τ} − μ)] / σ², where μ and σ² are the mean and variance of the series.
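As a small illustration of such an autocorrelation curve, the NumPy sketch below computes R(τ) for a synthetic series with a period of 24, mimicking the daily patterns discussed above; the data and lag choices are ours.

```python
# Autocorrelation R(tau) of a single series x, following the definition above.
import numpy as np

def autocorrelation(x, max_lag):
    x = np.asarray(x, dtype=float)
    mu, var = x.mean(), x.var()
    return np.array([((x[:len(x) - tau] - mu) * (x[tau:] - mu)).mean() / var
                     for tau in range(1, max_lag + 1)])

# Synthetic series with a period of 24: a repetitive pattern shows up as
# pronounced peaks at lags 24, 48, 72, ...
t = np.arange(1000)
x = np.sin(2 * np.pi * t / 24) + 0.3 * np.random.randn(1000)
acf = autocorrelation(x, max_lag=72)
print(acf[[23, 47, 71]])   # peaks at multiples of the period
```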

Reference
  • [1] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
  • [2] G. E. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung. Time series analysis: forecasting and control. John Wiley & Sons, 2015.
  • [3] G. E. Box and D. A. Pierce. Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American Statistical Association, 65(332):1509–1526, 1970.
  • [4] L.-J. Cao and F. E. H. Tay. Support vector machine with adaptive parameters in financial time series forecasting. IEEE Transactions on Neural Networks, 14(6):1506–1518, 2003.
  • [5] Z. Che, S. Purushotham, K. Cho, D. Sontag, and Y. Liu. Recurrent neural networks for multivariate time series with missing values. arXiv preprint arXiv:1606.01865, 2016.
  • [6] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
  • [7] J. Connor, L. E. Atlas, and D. R. Martin. Recurrent networks and NARMA modeling. In NIPS, pages 301–308, 1991.
  • [8] S. Dasgupta and T. Osogami. Nonlinear dynamic Boltzmann machines for time-series prediction. AAAI-17. Extended research report available at goo.gl/Vd0wna, 2016.
  • [9] J. L. Elman. Finding structure in time. Cognitive Science, 14(2):179–211, 1990.
  • [10] R. Frigola, F. Lindsten, T. B. Schön, and C. E. Rasmussen. Bayesian inference and learning in Gaussian process state-space models with particle MCMC. In Advances in Neural Information Processing Systems.
  • [13] N. Y. Hammerla, S. Halloran, and T. Ploetz. Deep, convolutional, and recurrent models for human activity recognition using wearables. arXiv preprint arXiv:1604.08880, 2016.
  • [14] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012.
  • [15] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
  • [16] A. Jain and A. M. Kumar. Hybrid neural network models for hydrologic time series forecasting. Applied Soft Computing, 7(2):585–592, 2007.
  • [18] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [19] A. Krizhevsky, I. Sutskever, and G. E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  • [20] C. Lea, R. Vidal, A. Reiter, and G. D. Hager. Temporal convolutional networks: A unified approach to action segmentation. In Computer Vision–ECCV 2016 Workshops, pages 47–54.
  • [21] Y. LeCun and Y. Bengio. Convolutional networks for images, speech, and time series. The Handbook of Brain Theory and Neural Networks, 3361(10):1995, 1995.
  • [22] J. Li and W. Chen. Forecasting macroeconomic time series: Lasso-based approaches and their forecast combinations with dynamic factor models. International Journal of Forecasting, 30(4):996–1015, 2014.
  • [23] Z. C. Lipton, D. C. Kale, C. Elkan, and R. Wetzell. Learning to diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677, 2015.
  • [26] I. Melnyk and A. Banerjee. Estimating structured vector autoregressive models. arXiv preprint arXiv:1602.06606, 2016.
  • [27] H. Qiu, S. Xu, F. Han, H. Liu, and B. Caffo. Robust estimation of transition matrices in high dimensional heavy-tailed vector autoregressive processes. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 1843–1851, 2015.
  • [28] S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson, and S. Aigrain. Gaussian processes for time-series modelling. Phil. Trans. R. Soc. A, 371(1984):20110550, 2013.
  • [29] R. K. Srivastava, K. Greff, and J. Schmidhuber. Highway networks. arXiv preprint arXiv:1505.00387, 2015.
  • [30] V. Vapnik, S. E. Golowich, A. Smola, et al. Support vector method for function approximation, regression estimation, and signal processing. In Advances in Neural Information Processing Systems, pages 281–287, 1997.
  • [31] J. B. Yang, M. N. Nguyen, P. P. San, X. L. Li, and S. Krishnaswamy. Deep convolutional neural networks on multichannel time series for human activity recognition. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, pages 25–31, 2015.
  • [32] H.-F. Yu, N. Rao, and I. S. Dhillon. Temporal regularized matrix factorization for high-dimensional time series prediction. In Advances in Neural Information Processing Systems, pages 847–855, 2016.
  • [33] R. Yu, Y. Li, C. Shahabi, U. Demiryurek, and Y. Liu. Deep learning: A generic approach for extreme condition traffic forecasting. In Proceedings of the 2017 SIAM International Conference on Data Mining, pages 777–785. SIAM, 2017.
  • [34] G. Zhang, B. E. Patuwo, and M. Y. Hu. Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14(1):35–62, 1998.