# An Enhanced Representation of Time Series Which Allows Fast and Accurate Classification, Clustering and Relevance Feedback

KDD, pp. 239-243 (1998)


Abstract

We introduce an extended representation of time series that allows fast, accurate classification and clustering in addition to the ability to explore time series data in a relevance feedback framework. The representation consists of piece-wise linear segments to represent shape and a weight vector that contains the relative importance of...

Introduction

- Time series account for much of the data stored in business, medical, engineering and social science databases.
- Much of the utility of collecting this data comes from the ability of humans to visualize the shape of the data, and classify it.
- Attempts to utilize classic machine learning and clustering algorithms on time series data have not met with great success.
- For an example of both these difficulties, consider the three time series in Figure 1.

Highlights

- Time series account for much of the data stored in business, medical, engineering and social science databases
- We use piece-wise linear segments to represent the shape of a time series, and a weight vector that contains the relative importance of each individual linear segment
- We introduced a new enhanced representation of time series and empirically demonstrated its utility for clustering, classification and relevance feedback
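The representation described above (linear segments plus one weight per segment) can be pictured as a small data structure. The sketch below is illustrative only: the `Segment` class, the `segment_fixed_width` helper, the equal-width windows and the default weight of 1.0 are all assumptions made for this example, and they are not the segmentation procedure used in the paper.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Segment:
    """One linear piece of a time series, with its relative importance."""
    x_left: float
    y_left: float
    x_right: float
    y_right: float
    weight: float = 1.0  # assumed neutral default; not a value from the paper


def segment_fixed_width(series, n_segments):
    """Fit one least-squares line per equal-width window (a simplification)."""
    series = np.asarray(series, dtype=float)
    if len(series) < 2 * n_segments:
        raise ValueError("need at least two points per segment")
    # Window boundaries as integer indices into the series.
    bounds = np.linspace(0, len(series), n_segments + 1).astype(int)
    segments = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        x = np.arange(lo, hi)
        slope, intercept = np.polyfit(x, series[lo:hi], deg=1)
        segments.append(
            Segment(
                x_left=float(x[0]),
                y_left=float(slope * x[0] + intercept),
                x_right=float(x[-1]),
                y_right=float(slope * x[-1] + intercept),
            )
        )
    return segments
```

Calling `segment_fixed_width(series, 5)` returns five `Segment` objects whose `weight` fields could later be adjusted to emphasise or de-emphasise parts of the shape.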

Results

- To test the above algorithm, the authors conducted the following experiment. They constructed 500 “Type A” and 500 “Type B” time series, defined as follows:

Type A: $\sin(x^3)$ normalized to lie between zero and one, plus Gaussian noise with $\sigma = 0.1$, for $-2 \le x \le 2$.

Type B: $\tan(\sin(x^3))$ normalized to lie between zero and one, plus Gaussian noise with $\sigma = 0.1$, for $-2 \le x \le 2$.

[Figure: panels A), B) and C) for Sin(x^3) and Tan(Sin(x^3))]
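The two synthetic classes can be generated along the following lines. This is a minimal sketch under stated assumptions: the number of samples per series (`n_points=256`), the function names and the random seed are hypothetical choices for illustration, since the summary gives only the formulas, the noise level and the range of x.

```python
import numpy as np


def normalize01(y):
    """Rescale values linearly so they span [0, 1]."""
    return (y - y.min()) / (y.max() - y.min())


def make_series(kind, n_points=256, sigma=0.1, rng=None):
    """Return one noisy "A" (sin(x^3)) or "B" (tan(sin(x^3))) series."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(-2.0, 2.0, n_points)  # n_points is an assumed resolution
    base = np.sin(x ** 3)
    if kind == "B":
        base = np.tan(base)
    elif kind != "A":
        raise ValueError("kind must be 'A' or 'B'")
    # Noise is added after normalization, following the wording of the text.
    return normalize01(base) + rng.normal(0.0, sigma, size=n_points)


def make_dataset(n_per_class=500, seed=0):
    """Build stacked arrays of Type A and Type B series."""
    rng = np.random.default_rng(seed)
    type_a = np.stack([make_series("A", rng=rng) for _ in range(n_per_class)])
    type_b = np.stack([make_series("B", rng=rng) for _ in range(n_per_class)])
    return type_a, type_b
```

With the defaults, `make_dataset()` yields two arrays of shape `(500, 256)`, one per class.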

- Twenty-five experimental runs were made.
- The initial query was made, and the quality of the ranked sequences was measured as defined below.
- To test the algorithm presented above the authors ran experiments on the following datasets.
- NN is a simple nearest neighbor algorithm that uses the raw data representation of the time series.
- NNS is the same algorithm as NN, except that it uses the sequence representation and the distance measure defined in Section 2.2.
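For reference, the NN baseline on raw data might look like the sketch below. Euclidean distance is an assumption here, because the summary does not state which distance NN uses on the raw representation, and `nn_classify` is a hypothetical name. The NNS variant is not shown because its distance measure (Section 2.2 of the paper) is not reproduced in this summary.

```python
import numpy as np


def nn_classify(train_X, train_y, query):
    """Label `query` with the class of its nearest raw training series.

    Euclidean distance is assumed; all series must have equal length.
    """
    train_X = np.asarray(train_X, dtype=float)
    query = np.asarray(query, dtype=float)
    # Distance from the query to every training series (one per row).
    dists = np.linalg.norm(train_X - query, axis=1)
    return train_y[int(np.argmin(dists))]
```

Each query costs one pass over the full training set at full series length, which is the expense that a compact segment-based representation aims to reduce.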

Conclusion

- The authors introduced a new enhanced representation of time series and empirically demonstrated its utility for clustering, classification and relevance feedback.

- Table 1: The merge algorithm
- Table 2: Results of relevance feedback experiments. The values recorded in parentheses are for the queries built just from positive feedback
- Table 3: The CTC learning algorithm
- Table 4: Experiment results of classification experiments

Related Work

- There has been no work on relevance feedback for time series. However, in the text domain there is an active and prolific research community. Salton and Buckley (1990) provide an excellent overview and comparison of the various approaches.

References

- Agrawal, R., Lin, K. I., Sawhney, H. S., & Shim, K. (1995). Fast similarity search in the presence of noise, scaling, and translation in time-series databases. In VLDB, September.
- Faloutsos, C., Ranganathan, M., & Manolopoulos, Y. (1994). Fast subsequence matching in time-series databases. SIGMOD Proceedings of Annual Conference, Minneapolis, May.
- Hagit, S., & Zdonik, S. (1996). Approximate queries and representations for large data sequences. Proc. 12th IEEE International Conference on Data Engineering. pp. 546-553, New Orleans, Louisiana, February.
- Keogh, E. (1997). Fast similarity search in the presence of longitudinal scaling in time series databases. Proceedings of the 9th International Conference on Tools with Artificial Intelligence. pp. 578-58, IEEE Press.
- Keogh, E., & Smyth, P. (1997). A probabilistic approach to fast pattern matching in time series databases. Proceedings of the 3rd International Conference of Knowledge Discovery and Data Mining. pp. 24-20, AAAI Press.
- Pavlidis, T., & Horowitz, S. (1974). Segmentation of plane curves. IEEE Transactions on Computers, Vol. C-23, No. 8.
- Salton, G., & Buckley, C. (1990). Improving retrieval performance by relevance feedback. JASIS 41. pp. 288-297.
- Shaw, S. W., & DeFigueiredo, R. J. P. (1990). Structural processing of waveforms as trees. IEEE Transactions on Acoustics, Speech, and Signal Processing. Vol. 38, No. 2, February.
- Rocchio, J. J., Jr. (1971). Relevance feedback in information retrieval: The Smart System - Experiments in Automatic Document Processing. Prentice-Hall Inc., pp. 337-354.
- Zebrowski, J. J. (1997). http://www.mpipks-dresden.mpg.de/~ntptsa/Data/Zebrowski-D/
