Optimizing dynamic time warping’s window width for time series data mining applications

Dau, Hoang Anh; Silva, Diego Furtado; Petitjean, François; Forestier, Germain; Bagnall, Anthony; Mueen, Abdullah; Keogh, Eamonn

doi:10.1007/s10618-018-0565-y

Published: 09 April 2018

Volume 32, pages 1074–1120, (2018)

Data Mining and Knowledge Discovery

Hoang Anh Dau ORCID: orcid.org/0000-0003-2439-5185¹,
Diego Furtado Silva²,
François Petitjean³,
Germain Forestier⁴,
Anthony Bagnall⁵,
Abdullah Mueen⁶ &
Eamonn Keogh¹

Abstract

Dynamic Time Warping (DTW) is a highly competitive distance measure for most time series data mining problems. Obtaining the best performance from DTW requires setting its only parameter, the maximum amount of warping (w). In the supervised case with ample data, w is typically set by cross-validation in the training stage. However, this method is likely to yield suboptimal results for small training sets. For the unsupervised case, learning via cross-validation is not possible because we do not have access to labeled data. Many practitioners have thus resorted to assuming that “the larger the better”, and they use the largest value of w permitted by the computational resources. However, as we will show, in most circumstances this is a naïve approach that produces inferior clusterings. Moreover, the best warping window width is generally not transferable between the two tasks, i.e., for a single dataset, practitioners cannot simply apply the best w learned for classification to clustering, or vice versa. In addition, we will demonstrate that the appropriate amount of warping depends not only on the data structure, but also on the dataset size. Thus, even if a practitioner knows the best setting for a given dataset, they will likely be at a loss if they apply that setting to a larger version of that data. All these issues seem largely unknown, or at least unappreciated, in the community. In this work, we demonstrate the importance of setting DTW’s warping window width correctly, and we also propose novel methods to learn this parameter in both supervised and unsupervised settings. The algorithms we propose to learn w can produce significant improvements in classification accuracy and clustering quality. We demonstrate the correctness of our novel observations and the utility of our ideas by testing them with more than one hundred publicly available datasets. Our forceful results allow us to make a perhaps unexpected claim: an underappreciated “low hanging fruit” in optimizing DTW’s performance can produce improvements that make it an even stronger baseline, closing most or all of the improvement gap of the more sophisticated methods proposed in recent years.
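
To make the role of w concrete, the following is a minimal Python sketch of the conventional baseline that the abstract describes: DTW constrained to a band of half-width w, with w chosen by leave-one-out cross-validation of a 1-nearest-neighbor classifier. It is not the method proposed in the article. The function names, the squared-difference local cost, the final square root, and the tie-breaking rule (prefer the smallest w) are all assumptions made for illustration.

```python
import math


def dtw_distance(a, b, w):
    """DTW distance between sequences a and b, with warping limited to |i - j| <= w.

    Assumption: squared-difference local cost, square root of the total at the end.
    """
    n, m = len(a), len(b)
    w = max(w, abs(n - m))  # the band must be wide enough to reach the last cell
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def loocv_accuracy(series, labels, w):
    """Leave-one-out accuracy of a 1-NN classifier under DTW with window w."""
    correct = 0
    for i, query in enumerate(series):
        best_dist, best_label = math.inf, None
        for j, candidate in enumerate(series):
            if i == j:
                continue  # hold out the query itself
            d = dtw_distance(query, candidate, w)
            if d < best_dist:
                best_dist, best_label = d, labels[j]
        correct += best_label == labels[i]
    return correct / len(series)


def select_window(series, labels, candidate_windows):
    """Return (w, accuracy) with the highest leave-one-out accuracy.

    Assumption: ties are broken in favor of the smallest w.
    """
    best_w, best_acc = None, -1.0
    for w in sorted(candidate_windows):
        acc = loocv_accuracy(series, labels, w)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc
```

With w = 0 the measure reduces to the Euclidean distance for equal-length series, and increasing w can only keep the distance the same or lower it. With only a handful of labeled series, the accuracy estimate inside `select_window` takes few distinct values, which is one way to see why the abstract warns that cross-validation may be unreliable on small training sets.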

Notes

“Essentially,” since some clustering algorithms are not defined (or lose certain guarantees) for non-metric distance measures.
NMI (normalized mutual information) is an information-theoretic measure of clustering quality. It takes values in the range [0, 1]; higher is better.
For conditional entropy, smaller is better.
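
As an illustration of the two measures mentioned in these notes, the sketch below computes conditional entropy and NMI for a pair of labelings. The notes do not say which normalization the article uses, so the arithmetic-mean normalization here is an assumption, as are the function names and the use of natural logarithms.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a labeling."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(labels, given):
    """H(labels | given): remaining uncertainty about labels once given is known."""
    n = len(labels)
    joint = Counter(zip(given, labels))
    marginal = Counter(given)
    return -sum((c / n) * math.log(c / marginal[g]) for (g, _), c in joint.items())


def nmi(truth, predicted):
    """Mutual information normalized by the mean of the two entropies.

    Assumption: arithmetic-mean normalization; other variants exist.
    """
    h_t, h_p = entropy(truth), entropy(predicted)
    if h_t == 0.0 and h_p == 0.0:
        return 1.0  # both labelings are a single cluster, hence identical
    mutual_info = h_t - conditional_entropy(truth, predicted)
    return 2.0 * mutual_info / (h_t + h_p)
```

A clustering that matches the true classes up to a renaming of cluster ids gives an NMI of 1 and a conditional entropy of 0, while a clustering that is independent of the classes gives an NMI of 0.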

Acknowledgements

This material is based upon work supported by the Air Force Office of Scientific Research, Asian Office of Aerospace Research and Development (AOARD) under award number FA2386-16-1-4023. The Australian Research Council under grant DE170100037 and the UK Engineering and Physical Sciences Research Council (EPSRC) under grant number EP/M015807/1 have also supported this work. Finally, we acknowledge the funding from NSF IIS-1161997 II and NSF IIS-1510741. We also wish to take this opportunity to thank the donors of the data to the UCR Time Series Archive.

Author information

Authors and Affiliations

University of California, Riverside, Riverside, USA
Hoang Anh Dau & Eamonn Keogh
Universidade Federal de São Carlos, São Carlos, Brazil
Diego Furtado Silva
Monash University, Melbourne, Australia
François Petitjean
University of Haute-Alsace, Mulhouse, France
Germain Forestier
University of East Anglia, Norwich, UK
Anthony Bagnall
University of New Mexico, Albuquerque, USA
Abdullah Mueen

Corresponding author

Correspondence to Hoang Anh Dau.

Additional information

Responsible editor: Jian Pei.

About this article

Cite this article

Dau, H.A., Silva, D.F., Petitjean, F. et al. Optimizing dynamic time warping’s window width for time series data mining applications. Data Min Knowl Disc 32, 1074–1120 (2018). https://doi.org/10.1007/s10618-018-0565-y

Received: 28 September 2017
Accepted: 30 March 2018
Published: 09 April 2018
Issue Date: July 2018
DOI: https://doi.org/10.1007/s10618-018-0565-y
