Abstract
There has been much recent interest in adapting data mining algorithms to time series databases. Many of these algorithms need to compare time series, and typically some variation or extension of Euclidean distance is used. However, as we demonstrate in this paper, Euclidean distance can be an extremely brittle distance measure. Dynamic time warping (DTW) has been suggested as a technique to allow more robust distance calculations; however, it is computationally expensive. In this paper we introduce a modification of DTW which operates on a higher-level abstraction of the data, in particular a piecewise linear representation. We demonstrate that our approach allows us to outperform DTW by one to three orders of magnitude. We experimentally evaluate our approach on medical, astronomical, and sign language data.
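To make the contrast in the abstract concrete, the following is a minimal sketch of the classic quadratic-time DTW recurrence next to plain Euclidean distance. It is not the piecewise linear variant the paper introduces; the function names `euclidean` and `dtw`, the squared-error local cost, and the toy sequences `x` and `y` are all assumptions chosen for illustration.

```python
import math


def euclidean(a, b):
    """Point-by-point Euclidean distance; both sequences must have equal length."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def dtw(a, b):
    """Classic dynamic time warping distance in O(len(a) * len(b)) time.

    Illustrative only: uses a squared-error local cost and no warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative squared error for aligning
    # the first i points of a with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance together
            )
    return math.sqrt(cost[n][m])


# Hypothetical toy data: the same bump, shifted by one time step.
x = [0, 0, 1, 2, 1, 0, 0, 0]
y = [0, 0, 0, 1, 2, 1, 0, 0]

print(euclidean(x, y))  # 2.0
print(dtw(x, y))        # 0.0
```

On this toy pair a one-step shift is enough to give a Euclidean distance of 2.0, while DTW aligns the two bumps and reports 0.0. The nested loops also show where the cost comes from: every pair of points is visited, which is the expense that a coarser representation of the series aims to reduce.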
Keywords
- Anchor Point
- Dynamic Time Warping
- High Level Representation
- Subsequence Match
- Time Series Classification
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Keogh, E.J., Pazzani, M.J. (1999). Scaling up Dynamic Time Warping to Massive Datasets. In: Żytkow, J.M., Rauch, J. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1999. Lecture Notes in Computer Science, vol 1704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-48247-5_1
DOI: https://doi.org/10.1007/978-3-540-48247-5_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66490-1
Online ISBN: 978-3-540-48247-5
eBook Packages: Springer Book Archive