Enumeration of time series motifs of all lengths

  • Regular Paper
Knowledge and Information Systems

Abstract

Time series motifs are repeated patterns in long and noisy time series. Motifs are typically used to understand the dynamics of the source, because highly similar repeated patterns are unlikely to be produced by noise alone. Recently, time series motifs have also been used as features for clustering, summarization, rule discovery and compression. All of these uses benefit from many high-quality motifs of various lengths, which gives rise to the problem of enumerating motifs over a wide range of lengths. Existing algorithms find motifs for a single given length. A trivial way to enumerate motifs is to run one of these algorithms for every length in the range. However, such a parameter sweep is computationally infeasible for large real datasets. In this paper, we describe an exact algorithm, called \({\textit{MOEN}}\), to enumerate motifs. The algorithm is an order of magnitude faster than the naive algorithm. It frees us from re-discovering the same motif at different lengths and from tuning multiple data-dependent parameters. The speedup comes from a novel bound on the similarity function across lengths. Unlike other motif discovery algorithms, the algorithm uses only linear space. We also describe an approximate extension of the \({\textit{MOEN}}\) algorithm that is faster and suitable for larger datasets. We describe five case studies in entomology, sensor fusion, power consumption monitoring and activity recognition where \({\textit{MOEN}}\) enumerates several high-quality motifs.
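To make the trivial approach concrete, the following is a minimal sketch of the naive parameter sweep only; it is not MOEN and does not use the bound the abstract refers to. It assumes, for illustration, that a motif is the closest pair of non-overlapping subsequences under z-normalized Euclidean distance, which the abstract does not state. The function names `znorm`, `motif_pair` and `naive_enumerate` are hypothetical.

```python
import math


def znorm(x):
    """Z-normalize a sequence; a constant sequence maps to all zeros."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        return [0.0] * n
    return [(v - mean) / std for v in x]


def motif_pair(series, m):
    """Brute-force search for the closest pair of length-m subsequences.

    Returns (distance, i, j) with j >= i + m, or None if no such pair exists.
    """
    subs = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    best = None
    for i in range(len(subs)):
        # Starting j at i + m skips overlapping (trivial) matches.
        for j in range(i + m, len(subs)):
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            if best is None or d < best[0]:
                best = (d, i, j)
    return best


def naive_enumerate(series, min_len, max_len):
    """Parameter sweep: rerun the fixed-length search for every length."""
    result = {}
    for m in range(min_len, max_len + 1):
        pair = motif_pair(series, m)
        if pair is not None:
            result[m] = pair
    return result


if __name__ == "__main__":
    # Toy series with the pattern [0, 1, 3, 1, 0] planted at offsets 0 and 8.
    series = [0, 1, 3, 1, 0, 5, 2, 7, 0, 1, 3, 1, 0, 4]
    for length, (dist, i, j) in naive_enumerate(series, 4, 5).items():
        print(length, round(dist, 3), i, j)
```

Each call to `motif_pair` compares on the order of n² pairs at a cost of m operations each, and nothing is shared between lengths. That repeated work is what makes the sweep expensive on large datasets.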

Author information

Correspondence to Abdullah Mueen.

About this article

Cite this article

Mueen, A., Chavoshi, N. Enumeration of time series motifs of all lengths. Knowl Inf Syst 45, 105–132 (2015). https://doi.org/10.1007/s10115-014-0793-4

  • DOI: https://doi.org/10.1007/s10115-014-0793-4
