# Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks

SIGIR, pp. 95-104, 2018.

Abstract:

Multivariate time series forecasting is an important machine learning problem across many domains, including predictions of solar plant energy output, electricity consumption, and traffic jam situations. Temporal data arising in these real-world applications often involve a mixture of long-term and short-term patterns, for which traditional...

Introduction

- Multivariate time series data are ubiquitous in everyday life, ranging from stock market prices and highway traffic flows to solar power plant outputs and temperatures across different cities, to name a few.
- Long-term patterns reflect the difference between day and night, summer and winter, etc., while short-term patterns reflect the effects of cloud movements, wind direction changes, etc.
- Without taking both kinds of recurrent patterns into account, accurate time series forecasting is not possible.
- Addressing such limitations of existing methods in time series forecasting is the main focus of this paper, for which the authors propose a novel framework that takes advantage of recent developments in deep learning research.

Highlights

- Multivariate time series data are ubiquitous in our everyday life, ranging from stock market prices and highway traffic flows to solar power plant outputs and temperatures across different cities, to name a few.
- Without taking both kinds of recurrent patterns into account, accurate time series forecasting is not possible. Traditional approaches, such as the large body of work on autoregressive methods [2, 12, 22, 32, 35], fall short in this respect, as most of them neither distinguish the two kinds of patterns nor model their interactions explicitly and dynamically. Addressing such limitations of existing methods in time series forecasting is the main focus of this paper, for which we propose a novel framework that takes advantage of recent developments in deep learning research.
- We propose a deep learning framework designed for multivariate time series forecasting, namely the Long- and Short-term Time-series Network (LSTNet), as illustrated in Figure 2.
- We presented a novel deep learning framework (LSTNet) for the task of multivariate time series forecasting.
- By combining the strengths of convolutional and recurrent neural networks and an autoregressive component, the proposed approach significantly improved the state-of-the-art results in time series forecasting on multiple benchmark datasets.
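
The way an autoregressive component can sit next to a neural network may be easier to see in a small sketch. The code below is an illustration only and makes several assumptions that this summary does not state: that the autoregressive part is a plain linear model over the last `q` observations of each variable, that its coefficients are shared across variables, and that its output is simply added to the neural output. The function names are hypothetical, and the convolutional and recurrent parts are replaced by a placeholder array rather than a trained network.

```python
import numpy as np

def ar_component(window, weights, bias):
    """Linear autoregressive forecast, computed independently per variable.

    window:  (q, n) array with the last q observations of n variables (oldest first).
    weights: (q,) array of AR coefficients, shared by all variables (assumption).
    bias:    scalar offset.
    Returns an (n,) array holding one linear forecast per variable.
    """
    return weights @ window + bias

def combine(neural_output, window, weights, bias):
    """Final forecast = nonlinear (neural) part + linear (AR) part."""
    return neural_output + ar_component(window, weights, bias)

# Toy usage: 3 past steps, 2 variables, and a stand-in for the CNN/RNN output.
window = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0]])
weights = np.array([0.0, 0.0, 1.0])    # AR model that repeats the last value
neural_output = np.array([0.5, -0.5])  # placeholder, not a trained network
forecast = combine(neural_output, window, weights, bias=0.0)
```

Under these assumptions the linear path reacts directly to changes in the scale of the inputs, while the neural path is free to model the nonlinear, recurring structure.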

Results

- The total count of bold-faced results is 17 for LSTNet-Skip, 7 for LSTNet-Attn, and between 0 and 3 for the rest of the methods.
- The two proposed models, LSTNet-Skip and LSTNet-Attn, consistently improve over the state of the art on the datasets with periodic patterns, especially at large horizons.
- LSTNet outperforms the strong neural baseline RNN-GRU by 9.2%, 11.7%, and 22.2% in the RSE metric on the Solar-Energy, Traffic, and Electricity datasets, respectively, when the horizon is 24, demonstrating the effectiveness of the framework design for complex repetitive patterns.
- Why? Recall that in Section 4.3 and Figure 3 the authors used the autocorrelation curves of these
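
The results above are reported in RSE and CORR. The following is a minimal sketch of how such metrics might be computed, assuming the usual definitions: RSE as the root relative squared error (prediction error normalized by the spread of the targets around their mean) and CORR as the Pearson correlation computed per variable and then averaged over variables. The function names and array layout (rows are time steps, columns are variables) are assumptions made for illustration.

```python
import numpy as np

def rse(y_true, y_pred):
    """Root relative squared error: lower is better, 0 is a perfect forecast."""
    num = np.sqrt(np.sum((y_true - y_pred) ** 2))
    den = np.sqrt(np.sum((y_true - y_true.mean()) ** 2))
    return float(num / den)

def corr(y_true, y_pred):
    """Pearson correlation per variable (column), averaged over variables."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    num = (yt * yp).sum(axis=0)
    den = np.sqrt((yt ** 2).sum(axis=0) * (yp ** 2).sum(axis=0))
    return float(np.mean(num / den))

# Toy targets: 3 time steps, 2 variables.
y_true = np.array([[1.0, 2.0],
                   [2.0, 4.0],
                   [3.0, 6.0]])
perfect = y_true.copy()
constant = np.full_like(y_true, y_true.mean())  # always predict the global mean
```

With these definitions, a forecast that always predicts the global mean of the targets scores an RSE of 1, so values below 1 indicate an improvement over that trivial baseline.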

Conclusion

- The authors presented a novel deep learning framework (LSTNet) for the task of multivariate time series forecasting.
- By combining the strengths of convolutional and recurrent neural networks and an autoregressive component, the proposed approach significantly improved the state-of-the-art results in time series forecasting on multiple benchmark datasets.
- The convolution layer treats each variable dimension equally, whereas real-world datasets usually come with rich attribute information.
- Integrating this attribute information into the LSTNet model is another challenging problem.

- Table 1: Dataset statistics, where T is the length of the time series, D is the number of variables, and L is the sample rate.
- Table 2: Results summary (in RSE and CORR) of all methods on four datasets: 1) each row has the results of a specific method in a particular metric; 2) each column compares the results of all methods on a particular dataset with a specific horizon value; 3) bold face indicates the best result of each column in a particular metric; and 4) the total number of bold-faced results of each method is listed under the method name within parentheses.

Study subjects and analysis

benchmark datasets: 4

The problem then becomes a regression task with a set of feature-value pairs {X_t, Y_{t+h}}, and can be solved by Stochastic Gradient Descent (SGD) or its variants such as Adam. We conducted extensive experiments with 9 methods (including our new methods) on 4 benchmark datasets for time series forecasting tasks. All the data and experiment codes are available online.
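
The feature-value pairs described above can be sketched as a sliding-window construction. This is an illustration under stated assumptions rather than the authors' preprocessing: the input X_t is taken to be a fixed-length window of the most recent observations ending at time t, and the target Y_{t+h} is the observation h steps later. The function name `make_pairs` and the parameter names `window` and `horizon` are hypothetical.

```python
import numpy as np

def make_pairs(series, window, horizon):
    """Turn a (T, n) multivariate series into supervised regression pairs.

    Returns X with shape (N, window, n) and Y with shape (N, n), where
    X[i] holds `window` consecutive observations ending at some time t
    and Y[i] is the observation at time t + horizon.
    """
    T = series.shape[0]
    X, Y = [], []
    for t in range(window - 1, T - horizon):
        X.append(series[t - window + 1 : t + 1])  # inputs up to and including t
        Y.append(series[t + horizon])             # target h steps ahead
    return np.stack(X), np.stack(Y)

# Toy usage: one variable counting 0..9, window of 3 steps, horizon of 2 steps.
series = np.arange(10, dtype=float).reshape(10, 1)
X, Y = make_pairs(series, window=3, horizon=2)
```

Each pair can then be fed to any regressor trained with SGD or a variant such as Adam; a separate set of pairs would be built for every horizon of interest.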

4.3 Data. We used four benchmark datasets which are publicly available. Table 1 summarizes the corpus statistics

Figure 3 panels: (a) Traffic dataset, (b) Solar-Energy dataset.

In order to examine the existence of long-term and/or short-term repetitive patterns in time series data, we plot autocorrelation graphs for some randomly selected variables from the four datasets in Figure 3. Autocorrelation, also known as serial correlation, is the correlation of a signal with a delayed copy of itself as a function of the delay τ: R(τ) = E[(X_t − μ)(X_{t+τ} − μ)] / σ², where μ and σ² are the mean and variance of the signal.
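
Autocorrelation of this kind can be estimated from a finite sample. The sketch below is one common estimator, not necessarily the one used to produce Figure 3: it averages the product of mean-centered values over all available pairs at each lag and divides by the sample variance. The function name and the synthetic 24-step periodic signal are assumptions chosen for illustration.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D signal for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    mu = x.mean()
    var = x.var()
    acf = []
    for lag in range(max_lag + 1):
        # Average of (x_t - mu) * (x_{t+lag} - mu) over all valid t.
        cov = np.mean((x[: n - lag] - mu) * (x[lag:] - mu))
        acf.append(cov / var)
    return np.array(acf)

# Synthetic signal with a period of 24 steps (e.g. hourly data with a daily cycle).
t = np.arange(24 * 20)
signal = np.sin(2 * np.pi * t / 24)
acf = autocorrelation(signal, max_lag=48)
```

For a signal like this one, the curve peaks near lags that are multiples of the period and dips near half-period lags, which is the kind of repetitive structure an autocorrelation plot is meant to reveal.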

References

- [1] D. Bahdanau, K. Cho, and Y. Bengio. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473, 2014.
- [2] G. E. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung. Time series analysis: forecasting and control. John Wiley & Sons, 2015.
- [3] G. E. Box and D. A. Pierce. Distribution of residual autocorrelations in autoregressive-integrated moving average time series models. Journal of the American statistical Association, 65(332):1509–1526, 1970.
- [4] L.-J. Cao and F. E. H. Tay. Support vector machine with adaptive parameters in financial time series forecasting. IEEE Transactions on neural networks, 14(6):1506–1518, 2003.
- [5] Z. Che, S. Purushotham, K. Cho, D. Sontag, and Y. Liu. Recurrent neural networks for multivariate time series with missing values. arXiv preprint arXiv:1606.01865, 2016.
- [6] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
- [7] J. Connor, L. E. Atlas, and D. R. Martin. Recurrent networks and narma modeling. In NIPS, pages 301–308, 1991.
- [8] S. Dasgupta and T. Osogami. Nonlinear dynamic boltzmann machines for timeseries prediction. AAAI-17. Extended research report available at goo.gl/Vd0wna, 2016.
- [9] J. L. Elman. Finding structure in time. Cognitive science, 14(2):179–211, 1990.
- [10] R. Frigola, F. Lindsten, T. B. Schön, and C. E. Rasmussen. Bayesian inference and learning in gaussian process state-space models with particle mcmc. In Advances
- [13] N. Y. Hammerla, S. Halloran, and T. Ploetz. Deep, convolutional, and recurrent models for human activity recognition using wearables. arXiv preprint arXiv:1604.08880, 2016.
- [14] G. Hinton, L. Deng, D. Yu, G. E. Dahl, A.-r. Mohamed, N. Jaitly, A. Senior, V. Vanhoucke, P. Nguyen, T. N. Sainath, et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6):82–97, 2012.
- [15] S. Hochreiter and J. Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
- [16] A. Jain and A. M. Kumar. Hybrid neural network models for hydrologic time series forecasting. Applied Soft Computing, 7(2):585–592, 2007.
- [18] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
- [19] A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
- [20] C. Lea, R. Vidal, A. Reiter, and G. D. Hager. Temporal convolutional networks: A unified approach to action segmentation. In Computer Vision–ECCV 2016 Workshops, pages 47–54.
- [21] Y. LeCun and Y. Bengio. Convolutional networks for images, speech, and time series. The handbook of brain theory and neural networks, 3361(10):1995, 1995.
- [22] J. Li and W. Chen. Forecasting macroeconomic time series: Lasso-based approaches and their forecast combinations with dynamic factor models. International Journal of Forecasting, 30(4):996–1015, 2014.
- [23] Z. C. Lipton, D. C. Kale, C. Elkan, and R. Wetzell. Learning to diagnose with lstm recurrent neural networks. arXiv preprint arXiv:1511.03677, 2015.
- [26] I. Melnyk and A. Banerjee. Estimating structured vector autoregressive model. arXiv preprint arXiv:1602.06606, 2016.
- [27] H. Qiu, S. Xu, F. Han, H. Liu, and B. Caffo. Robust estimation of transition matrices in high dimensional heavy-tailed vector autoregressive processes. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15), pages 1843–1851, 2015.
- [28] S. Roberts, M. Osborne, M. Ebden, S. Reece, N. Gibson, and S. Aigrain. Gaussian processes for time-series modelling. Phil. Trans. R. Soc. A, 371(1984):20110550, 2013.
- [29] R. K. Srivastava, K. Greff, and J. Schmidhuber. Highway networks. arXiv preprint arXiv:1505.00387, 2015.
- [30] V. Vapnik, S. E. Golowich, A. Smola, et al. Support vector method for function approximation, regression estimation, and signal processing. Advances in neural information processing systems, pages 281–287, 1997.
- [31] J. B. Yang, M. N. Nguyen, P. P. San, X. L. Li, and S. Krishnaswamy. Deep convolutional neural networks on multichannel time series for human activity recognition. In Proceedings of the 24th International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, pages 25–31, 2015.
- [32] H.-F. Yu, N. Rao, and I. S. Dhillon. Temporal regularized matrix factorization for high-dimensional time series prediction. In Advances in Neural Information Processing Systems, pages 847–855, 2016.
- [33] R. Yu, Y. Li, C. Shahabi, U. Demiryurek, and Y. Liu. Deep learning: A generic approach for extreme condition traffic forecasting. In Proceedings of the 2017 SIAM International Conference on Data Mining, pages 777–785. SIAM, 2017.
- [34] G. Zhang, B. E. Patuwo, and M. Y. Hu. Forecasting with artificial neural networks: The state of the art. International journal of forecasting, 14(1):35–62, 1998.
