Skip to main content

Advertisement

Log in

Time series indexing by dynamic covering with cross-range constraints

  • Regular Paper
  • Published:
The VLDB Journal Aims and scope Submit manuscript

Abstract

Time series indexing plays an important role in querying and pattern mining of big data. This paper proposes a novel structure for tightly covering a given set of time series under the dynamic time warping similarity measurement. The structure, referred to as dynamic covering with cross-range constraints (DCRC), enables more efficient and scalable indexing to be developed than current hypercube-based partitioning approaches. In particular, a lower bound of the DTW distance from a given query time series to a DCRC-based cover set is introduced. By virtue of its tightness, which is proven theoretically, the lower bound can be used for pruning when querying on an indexing tree. If the DCRC-based lower bound (LB_DCRC) of an upper node in an index tree is larger than a given threshold, all child nodes can be pruned yielding a significant reduction in computational time. A hierarchical DCRC (HDCRC) structure is proposed to generate the DCRC-tree-based indexing and used to develop time series indexing and insertion algorithms. Experimental results for a selection of benchmark time series datasets are presented to illustrate the tightness of LB_DCRC, as well as the pruning efficiency on the DCRC-tree, especially when the time series have large deformations.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13

Similar content being viewed by others

References

  1. Agrawal, R., Faloutsos, C., Swami, A.: Efficient similarity search in sequence databases. In: Proceedings of International Conference on Foundations of Data Organization and Algorithms, pp. 69–84. Springer, Boston, MA (1993)

  2. Chen, C.L.P., Zhang, C.Y.: Data-intensive applications, challenges, techniques and technologies: a survey on big data. Inf. Sci. 275, 314–347 (2014)

    Article  Google Scholar 

  3. Dau, H.A., Keogh, E., Kamgar, K., Yeh, C.C.M., Zhu, Y., Gharghabi, S., Ratanamahatana, C.A., Yanping, Hu, B., Begum, N., Bagnall, A., Mueen, A., Batista, G., Hexagon, M.L.: The UCR time series classification archive. https://www.cs.ucr.edu/~eamonn/time_series_data_2018 (2018)

  4. Edstrom, J., Chen, D., Gong, Y., Wang, J., Gong, N.: Data-pattern enabled self-recovery low-power storage system for big video data. IEEE Trans. Big Data 5(1), 95–105 (2019)

    Article  Google Scholar 

  5. Esling, P., Agon, C.: Time-series data mining. ACM Comput. Surv. 45(1), 12:1–34 (2012)

    Article  Google Scholar 

  6. Fu, T.C.: A review on time series data mining. Eng. Appl. Artif. Intell. 24(1), 164–181 (2011)

    Article  Google Scholar 

  7. Grabocka, J., Wistuba, M., Schmidt-Thieme, L.: Fast classification of univariate and multivariate time series through shapelet discovery. Knowl. Inf. Syst. 49(2), 429–454 (2016)

    Article  Google Scholar 

  8. Guttman, A.: (1984) R-trees: A dynamic index structure for spatial searching. In: ACM Sigmod International Conference on Management of Data, pp. 47–57. ACM, New York, NY (2018)

  9. He, H., Tan, Y.: Unsupervised classification of multivariate time series using VPCA and fuzzy clustering with spatial weighted matrix distance. IEEE Trans. Cybern. 50(3), 1096–1105 (2020)

    Article  Google Scholar 

  10. Hu, J., Yang, B., Guo, C., Jensen, C.S.: Risk-aware path selection with time-varying, uncertain travel costs: A time series approach. VLDB J. 27(2), 179–200 (2018)

    Article  Google Scholar 

  11. Ignatov, A.: Real-time human activity recognition from accelerometer data using convolutional neural networks. Appl. Soft Comput. 62, 915–922 (2018)

    Article  Google Scholar 

  12. Itakura, F.: Minimum prediction residual principle applied to speech recognition. IEEE Trans. Acoust. Speech Signal Process. 23(1), 67–72 (1975)

    Article  Google Scholar 

  13. Kacprzyk, J., Wilbik, A., Zadrożny, S.: Linguistic summarization of time series using a fuzzy quantifier driven aggregation. Fuzzy Sets Syst. 159(12), 1485–1499 (2008)

    Article  MathSciNet  Google Scholar 

  14. Keogh, E., Ratanamahatana, C.A.: Exact indexing of dynamic time warping. Knowl. Inf. Syst. 7(3), 358–386 (2005)

    Article  Google Scholar 

  15. Keogh, E., Wei, L., Xi, X., Vlachos, M., Lee, S.H., Protopapas, P.: Supporting exact indexing of arbitrarily rotated shapes and periodic time series under Euclidean and warping distance measures. VLDB J. 18(3), 611–630 (2009)

    Article  Google Scholar 

  16. Lemire, D.: Faster retrieval with a two-pass dynamic-time-warping lower bound. Pattern Recogn. 42, 2169–2180 (2009)

    Article  Google Scholar 

  17. Li, H., Yang, L.: Extensions and relationships of some existing lower-bound functions for dynamic time warping. J., Intell. Inf. Syst. 43(1), 59–79 (2014)

    Article  Google Scholar 

  18. Li, Q., Chen, Y., Wang, J., Chen, Y., Chen, H.C.: Web media and stock markets: A survey and future directions from a big data perspective. IEEE Trans. Knowl. Data Eng. 30(2), 381–399 (2018)

    Article  Google Scholar 

  19. Lin, S.C., Yeh, M.Y., Chen, M.S.: Non-overlapping subsequence matching of stream synopses. IEEE Trans. Knowl. Data Eng. 30(1), 101–114 (2018)

    Article  Google Scholar 

  20. Liu, M., Zhang, X., Xu, G.: Continuous motion classification and segmentation based on improved dynamic time warping algorithm. Int. J. Pattern Recognit Artif Intell. 32(2), 1850,002 (2018)

    Article  MathSciNet  Google Scholar 

  21. Mikalsen, K.Ø., Bianchi, F.M., Soguero-Ruiz, C., Jenssen, R.: Time series cluster kernel for learning similarities between multivariate time series with missing data. Pattern Recogn. 76, 569–581 (2018)

    Article  Google Scholar 

  22. Mondal, T., Ragot, N., Ramel, J.Y., Pal, U.: Comparative study of conventional time series matching techniques for word spotting. Pattern Recogn. 73, 47–64 (2018)

    Article  Google Scholar 

  23. Mori, U., Mendiburu, A., Lozano, J.A.: Similarity measure selection for clustering time series databases. IEEE Trans. Knowl. Data Eng. 28(1), 181–195 (2016)

    Article  Google Scholar 

  24. Mueen, A., Chavoshi, N., Abu-El-Rub, N., Hamooni, H., Minnich, A., MacCarthy, J.: Speeding up dynamic time warping distance for sparse time series data. Knowl. Inf. Syst. 54(1), 237–263 (2018)

    Article  Google Scholar 

  25. Mueen, A., Keogh, E.: Extracting optimal performance from dynamic time warping. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 2129–2130. ACM, New York, NY (2016)

  26. Park, S., Lee, D., Chu, W.W.: Fast retrieval of similar subsequences in long sequence databases. In: Proceedings of 1999 Workshop on Knowledge and Data Engineering Exchange, pp. 60–67. IEEE, Chicago, IL (1999)

  27. Rakthanmanon, T., Campana, B., Mueen, A., Batista, G., Westover, B., Zhu, Q., Zakaria, J., Keogh, E.: Searching and mining trillions of time series subsequences under dynamic time warping. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 262–270. ACM, New York, NY (2012)

  28. Sakoe, H., Chiba, S.: Dynamic programming algorithm optimization for spoken word recognition. IEEE Trans. Acoust. Speech Signal Process. 26(1), 43–49 (1978)

    Article  Google Scholar 

  29. Shen, Y., Chen, Y., Keogh, E., Jin, H.: Accelerating time series searching with large uniform scaling. In: Proceedings of the 2018 SIAM International Conference on Data Mining, pp. 234–242. SIAM, Bologna, Italy (2018)

  30. Son, N.T., Anh, D.T.: Discovery of time series \(k\)-motifs based on multidimensional index. Knowl. Inf. Syst. 46(1), 59–86 (2016)

    Article  Google Scholar 

  31. Sun, T., Liu, H., Yu, H., Chen, C.L.P.: Degree-pruning dynamic planning approaches to central time series through minimizing dynamic time warping distance. IEEE Trans. Cybern. 47(7), 1719–1729 (2017)

    Article  Google Scholar 

  32. Tan, C.W., Petitjean, F., Webb, G.: Elastic bands across the path: a new framework and method to lower bound DTW. In: Proceedings of the 2019 SIAM International Conference on Data Mining, pp. 522–530. SIAM, Alberta, Canada (2019)

  33. Tan, C.W., Webb, G.I., Petitjean, F.: Indexing and classifying gigabytes of time series under time warping. In: Proceedings of the 2017 SIAM International Conference on Data Mining, pp. 282–290. SIAM, Houston, TX (2017)

  34. Tan, Z., Wang, Y., Zhang, Y., Zhou, J.: A novel time series approach for predicting the long-term popularity of online videos. IEEE Trans. Broadcast. 62(2), 436–445 (2016)

    Article  Google Scholar 

  35. Tang, J., Cheng, H., Zhao, Y., Guo, H.: Structured dynamic time warping for continuous hand trajectory gesture recognition. Pattern Recogn. 80, 21–31 (2018)

    Article  Google Scholar 

  36. Wu, X., Zhu, X., Wu, G., Ding, W.: Data mining with big data. IEEE Trans. Knowl. Data Eng. 26(1), 97–107 (2014)

    Article  Google Scholar 

  37. Wu, Y., Tong, Y., Zhu, X., Wu, X.: NOSEP: Nonoverlapping sequence pattern mining with gap constraints. IEEE Trans. Cybern. 48(10), 2809–2822 (2018)

    Article  Google Scholar 

  38. Yi, B.K., Jagadish, H.V., Faloutsos, C.: Efficient retrieval of similar time sequences under time warping. In: Proceedings of the 14th International Conference on Data Engineering, pp. 201–208. IEEE, Orlando, FL (1998)

  39. Zhou, M., Wong, M.H.: Boundary-based lower-bound functions for dynamic time warping and their indexing. Inf. Sci. 181(19), 4175–4196 (2011)

    Article  Google Scholar 

  40. Zoumpatianos, K., Lou, Y., Ileana, I., Palpanas, T., Gehrke, J.: Generating data series query workloads. VLDB J. 27(6), 823–846 (2018)

    Article  Google Scholar 

Download references

Acknowledgements

The authors sincerely thank the editors and the anonymous reviewers for the very helpful and kind comments that have enhanced the presentation of our paper. The authors would also like to thank the UCR time series classification archive and Prof. Keogh for providing the datasets used in the study. This work is supported in part by the National Natural Science Foundation of China (Grant Nos. 61751205, 91746209, 61772102)

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Hongbo Liu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Sun, T., Liu, H., McLoone, S. et al. Time series indexing by dynamic covering with cross-range constraints. The VLDB Journal 29, 1365–1384 (2020). https://doi.org/10.1007/s00778-020-00614-9

Download citation

  • Received:

  • Revised:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00778-020-00614-9

Keywords

Navigation