Weighted dynamic time warping for time series classification
Introduction
There has been long-standing interest in time series classification and clustering in diverse applications such as pattern recognition, signal processing, biology, aerospace, finance, medicine, and meteorology [1], [2], [8], [12], [14], [18], [23], [25], [26]. Notable techniques developed for these tasks include nearest neighbor classifiers with a given distance measure, support vector machines, and neural networks [2], [4], [20]. Nearest neighbor classifiers with dynamic time warping (DTW) have been shown to be effective for time series classification and clustering because of DTW's non-linear mapping capability [7], [18], [25]. DTW finds an optimal match between two sequences by allowing a non-linear mapping of one sequence onto the other and minimizing the distance between them [7], [8], [12], [22]. The sequences are "warped" non-linearly so that their similarity can be determined independently of non-linear variations in the time dimension. The technique was originally developed for speech recognition, but several researchers have evaluated its application in other domains and have developed variants such as derivative DTW (DDTW) [11], [21], [22]. Fig. 1 shows an example of aligning two out-of-phase sequences by DTW.
The methodology for DTW is as follows. Assume a sequence A of length m, $A = a_1, a_2, \ldots, a_i, \ldots, a_m$, and a sequence B of length n, $B = b_1, b_2, \ldots, b_j, \ldots, b_n$. We create an m-by-n path matrix whose (i, j)th element contains the distance between the two points $a_i$ and $b_j$, such that $d(a_i, b_j) = \lVert a_i - b_j \rVert_p$, where $\lVert \cdot \rVert_p$ represents the $l_p$ norm. The warping path is typically subject to several constraints, such as the following [22]:
Endpoint constraint: the starting and ending points of the warping path must be the first and the last points of the path matrix, that is, $u_1 = (a_1, b_1)$ and $u_K = (a_m, b_n)$, where K is the number of elements in the path.
Continuity constraint: the path can advance only one step at a time. That is, when $u_k = (a_i, b_j)$ and $u_{k+1} = (a_{i'}, b_{j'})$, then $i' - i \le 1$ and $j' - j \le 1$.
Monotonicity constraint: the path does not go backward, i.e., when $u_k = (a_i, b_j)$ and $u_{k+1} = (a_{i'}, b_{j'})$, then $i' \ge i$ and $j' \ge j$.
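The three constraints can be checked mechanically for any candidate path. The following is a minimal Python sketch, assuming a path is given as a list of 1-based index pairs (i, j). The function name `is_valid_warping_path` and the requirement that every step changes at least one index are illustrative assumptions, not taken from the paper.

```python
def is_valid_warping_path(path, m, n):
    """Check a candidate warping path against the endpoint, continuity,
    and monotonicity constraints.

    path: list of 1-based (i, j) index pairs into sequences of length m and n.
    """
    # Endpoint constraint: start at (1, 1) and end at (m, n).
    if not path or path[0] != (1, 1) or path[-1] != (m, n):
        return False
    for (i0, j0), (i1, j1) in zip(path, path[1:]):
        di, dj = i1 - i0, j1 - j0
        # Monotonicity: neither index may decrease.
        if di < 0 or dj < 0:
            return False
        # Continuity: each index may advance by at most one.
        if di > 1 or dj > 1:
            return False
        # Assumption: a step must move somewhere (no repeated cell).
        if di == 0 and dj == 0:
            return False
    return True
```

For example, the diagonal path [(1, 1), (2, 2), (3, 3)] is valid for two sequences of length 3, while [(1, 1), (3, 3)] violates continuity.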
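The best match between two sequences is the one with the lowest-distance path after aligning one sequence to the other. Therefore, the optimal warping path can be found by using the recursive formula given by

$$\mathrm{DTW}_p(A, B) = \sqrt[p]{\gamma(m, n)} \qquad (1)$$

where $\gamma(i, j)$ is the cumulative distance described by

$$\gamma(i, j) = |a_i - b_j|^p + \min\{\gamma(i-1, j-1),\ \gamma(i-1, j),\ \gamma(i, j-1)\} \qquad (2)$$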
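As an illustration, the cumulative distance can be filled in by dynamic programming. This is a minimal sketch assuming scalar points and the standard recursion γ(i, j) = |a_i − b_j|^p + min{γ(i−1, j−1), γ(i−1, j), γ(i, j−1)}, with the p-th root taken at γ(m, n). The name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(a, b, p=2):
    """Unconstrained DTW distance between two sequences of numbers."""
    m, n = len(a), len(b)
    # gamma[i][j] holds the cumulative cost of the best path ending at (i, j).
    # Row 0 and column 0 are padding so the recursion needs no special cases.
    gamma = [[math.inf] * (n + 1) for _ in range(m + 1)]
    gamma[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(a[i - 1] - b[j - 1]) ** p
            gamma[i][j] = cost + min(
                gamma[i - 1][j - 1],  # diagonal step
                gamma[i - 1][j],      # advance in a only
                gamma[i][j - 1],      # advance in b only
            )
    return gamma[m][n] ** (1.0 / p)
```

The double loop visits every cell of the m-by-n matrix once, which matches the O(mn) time complexity of unconstrained DTW.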
As seen from Eq. (1), given a search space defined by two time series, $\mathrm{DTW}_p$ is guaranteed to find the warping path with the minimum cumulative distance among all valid warping paths in that search space. Thus, $\mathrm{DTW}_p$ can be seen as the minimization of a warped $l_p$ distance, with time complexity O(mn). Restricting the search space with constraint techniques such as the Sakoe–Chiba band [22] or the Itakura parallelogram [7] reduces the time complexity of DTW. Fig. 2 shows the warping matrix and the optimal warping path between two sequences obtained by DTW; in Fig. 2, a band of width w is used to constrain the warping.
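A band constraint can be sketched by evaluating only the cells with |i − j| ≤ w. The code below is a minimal illustration under that assumption; the name `dtw_band` is hypothetical, and widening the band to at least |m − n| so that the end cell stays reachable is a choice made here, not something stated in the paper.

```python
import math


def dtw_band(a, b, w, p=2):
    """DTW restricted to cells (i, j) with |i - j| <= w."""
    m, n = len(a), len(b)
    # Assumption: widen the band if needed so that (m, n) lies inside it.
    w = max(w, abs(m - n))
    gamma = [[math.inf] * (n + 1) for _ in range(m + 1)]
    gamma[0][0] = 0.0
    for i in range(1, m + 1):
        # Only columns inside the band around the diagonal are visited.
        for j in range(max(1, i - w), min(n, i + w) + 1):
            cost = abs(a[i - 1] - b[j - 1]) ** p
            gamma[i][j] = cost + min(
                gamma[i - 1][j - 1],
                gamma[i - 1][j],
                gamma[i][j - 1],
            )
    return gamma[m][n] ** (1.0 / p)
```

With w = 0 and equal-length sequences, only the diagonal is allowed and the result reduces to the $l_p$ (for p = 2, Euclidean) distance; each row touches at most 2w + 1 cells, which is where the saving over the full matrix comes from.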
However, conventional DTW computes the distance between all points of the two series with equal weight for each point, regardless of the phase difference between a reference point and a testing point. This may lead to misclassification, especially in applications such as image retrieval, where the shape similarity between two sequences is a major consideration for accurate recognition and neighboring points of the two sequences are therefore more important than distant ones. In other words, the relative significance of each pair of points, which depends on the phase difference between them, should be considered.
Therefore, this paper proposes a novel distance measure, called weighted dynamic time warping (WDTW), which weights nearer neighbors more heavily depending on the phase difference between a reference point and a testing point. Because WDTW takes into consideration the relative importance of the phase difference between two points, it can prevent a point in one sequence from being mapped to distant points in the other and can reduce unexpected singularities, which are alignments of a single point of one series with multiple points of the other series. Some practical examples are presented to illustrate graphically the situations in which WDTW is clearly the better approach.
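One way to picture the idea is to scale each local difference by a weight that depends only on the phase difference |i − j|. The sketch below is an illustration under that assumption, not the paper's exact formulation: the local cost is taken to be (weight(|i − j|) · |a_i − b_j|)^p, the weight acts as a penalty that the caller chooses, and the name `wdtw_distance` and the `weight` callable are hypothetical.

```python
import math


def wdtw_distance(a, b, weight, p=2):
    """DTW with a phase-dependent weight on each local difference.

    weight: callable mapping the phase difference |i - j| (a non-negative
    integer) to a non-negative weight. Assumed form, for illustration only.
    """
    m, n = len(a), len(b)
    gamma = [[math.inf] * (n + 1) for _ in range(m + 1)]
    gamma[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Scale the pointwise difference by the phase-dependent weight.
            cost = (weight(abs(i - j)) * abs(a[i - 1] - b[j - 1])) ** p
            gamma[i][j] = cost + min(
                gamma[i - 1][j - 1],
                gamma[i - 1][j],
                gamma[i][j - 1],
            )
    return gamma[m][n] ** (1.0 / p)
```

With a constant weight of 1 this reduces to conventional DTW; a weight that grows with |i − j| makes alignments between distant points more expensive, which discourages the far-reaching mappings and singularities described above.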
In addition, a new weight function, called the modified logistic weight function (MLWF), is proposed to assign weights as a function of the phase difference between a reference point and a testing point. The proposed weight function extends the properties of the logistic function to give more flexibility in setting bounds on the weights. By applying different weights to adjacent points, the proposed algorithm can enhance the detection of similarity between series.
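This excerpt does not give the formula for the MLWF. Purely as an illustration, the sketch below assumes a logistic curve centred on the midpoint of the sequence, with an upper bound `w_max` and a steepness parameter `g`; the function name, the parameter names, and the default values are all assumptions.

```python
import math


def modified_logistic_weight(d, length, g=0.05, w_max=1.0):
    """Assumed logistic-shaped weight for a phase difference d.

    d:      phase difference |i - j| between two points
    length: sequence length; the curve is centred at length / 2
    g:      steepness of the curve (g = 0 gives a constant weight)
    w_max:  upper bound on the weight
    """
    midpoint = length / 2.0
    return w_max / (1.0 + math.exp(-g * (d - midpoint)))
```

Under this assumed form, the weight is bounded between 0 and `w_max`, increases with the phase difference when g > 0, and is the same for every phase difference when g = 0.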
Finally, we extend the proposed idea to other variants of DTW such as derivative dynamic time warping (DDTW) and propose a weighted version of DDTW (WDDTW). We compare the performance of the proposed procedures with that of other popular approaches, using public data sets available through the UCR Time Series Data Mining Archive [13], for both time series classification and clustering problems. The experimental results show that the proposed procedures achieve improved accuracy on both kinds of problems.
The remainder of the paper is organized as follows. In Section 2, we review related literature on time series classification and its methodologies. Section 3 explains the rationale for the advantages of the proposed idea. In Section 4, we describe the proposed WDTW and the modified logistic weight function for automatic time series classification. The experimental results are presented and discussed in Section 5. The paper ends with concluding remarks and future work in Section 6.
Related works
As a result of the increasing importance of time series classification in diverse fields, many algorithms have been proposed for different applications. Husken and Stagge [6] utilized recurrent neural networks for time series classification, and Guler and Ubeyli [4] presented a wavelet-based adaptive neuro-fuzzy inference system model for classification of electroencephalogram (EEG) signals. Rath and Manmatha [21] used DTW for word image matching and compared the performance of DTW with other
Rationale for the performance advantages of WDTW
In this section, we will present the rationale underlying the proposed WDTW with practical examples to graphically illustrate situations where WDTW shows better performance than conventional DTW. The first example deals with automatic classification of defect patterns on semiconductor wafer maps. Fig. 3(a)–(d) shows four common classes of defect patterns on wafer maps. Jeong et al. [9] presented the effectiveness of using spatial correlograms (i.e., time series data) as new features for the
Proposed algorithm for time series classification
This section presents the proposed WDTW measure and a new weighting function, the so-called modified logistic weight function (MLWF), for time series data.
Performance comparison for time series classification
In this section, we perform extensive experiments to verify the effectiveness of the proposed algorithm for time series classification and clustering. All data sets, which include real-life time series, synthetic time series, and generic time series, come from different application domains and are obtained from “UCR Time Series Data Mining Archive” [13]. For the detailed descriptions of the data sets, please see Ratanamahatana and Keogh [20].
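Distance measures of this kind are typically evaluated inside a nearest neighbor classifier, as noted in the Introduction. The following is a minimal sketch of a 1-nearest-neighbor classifier with a pluggable distance function; it is not the paper's experimental code, and the names `nearest_neighbor_classify` and `euclidean` as well as the toy data are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbor_classify(train, query, distance=euclidean):
    """Return the label of the training series closest to the query.

    train:    list of (series, label) pairs
    distance: any callable taking two series and returning a number,
              for example a DTW-style measure
    """
    best_label, best_dist = None, math.inf
    for series, label in train:
        d = distance(series, query)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


# Toy data, made up for illustration only.
train = [([0, 0, 1, 0], "peak"), ([0, 0, 0, 0], "flat")]
predicted = nearest_neighbor_classify(train, [0, 0, 0.9, 0.1])
```

Swapping the `distance` argument is all that is needed to compare different measures on the same training and test split.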
Euclidean distance, conventional DTW, and DDTW
Conclusion
New distance measures for time series data, WDTW and WDDTW, are proposed to classify or cluster time series data sets in diverse applications. Compared with conventional DTW and DDTW, the proposed algorithm weights each point according to the phase difference between a test point and a reference point. The proposed method is a generalization of Euclidean distance, DTW, and DDTW, and its effectiveness is maximized by choosing the optimal value of g for each application. A new
Acknowledgements
The authors acknowledge the support of Dr. Eamonn Keogh in providing the experimental data sets. The authors would also like to thank the anonymous reviewers for their valuable comments, which improved the paper considerably. Part of this work was supported by the National Science Foundation (NSF) Grant no. CMMI-0853894. Dr. Olufemi A. Omitaomu acts in his own independent capacity and not on behalf of UT-Battelle, LLC, or its affiliates or successors.
Young-Seon Jeong is now working toward his Ph.D. degree in the Department of Industrial and Systems Engineering, Rutgers University, New Brunswick, NJ. His research interests include spatial modeling of wafer map data, wavelet application for functional data analysis, and statistical modeling for intelligent transportation system
References (28)
- et al., Classification of bioacoustic time series based on the combination of global and local decision, Pattern Recognition (2004)
- et al., Adaptive neuro-fuzzy inference system for classification of EEG signals using wavelet coefficient, Journal of Neuroscience Methods (2005)
- et al., A time series representation model for accurate and fast similarity detection, Pattern Recognition (2009)
- et al., Recurrent neural networks for time series classification, Neurocomputing (2003)
- Faster retrieval with a two-pass dynamic-time-warping lower bound, Pattern Recognition (2009)
- Wavelet/mixture of experts network structure of ECG signals classification, Expert Systems with Applications (2008)
- et al., Genetic algorithms and support vector machines for time series classification, Proceeding SPIE (2002)
- Real Analysis. Modern Techniques and their Applications (1999)
- F. Itakura, Minimum prediction residual principle applied to speech recognition, in: Proceedings of the IEEE...
- et al., Automatic diatom identification using contour analysis by morphological curvature scale spaces, Machine Vision and Applications (2005)
- Automatic identification of defect patterns in semiconductor wafer maps using spatial correlogram and dynamic time warping, IEEE Transactions on Semiconductor Manufacturing
- Clustering of time series subsequences is meaningless: implications for previous and future research, Knowledge and Information Systems
- Exact indexing of dynamic time warping, Knowledge and Information Systems
Myong K. Jeong is an Assistant Professor in the Department of Industrial and Systems Engineering and the Center for Operation Research, Rutgers University, New Brunswick, NJ. His research interests include statistical data mining, recommendation systems, machine health monitoring, and sensor data analysis. He is currently an Associate Editor of IEEE Transactions on Automation Science and Engineering and International Journal of Quality, Statistics and Reliability.
Olufemi A. Omitaomu is a Research Scientist in the Geographic Information Science & Technology Group, Computational Sciences and Engineering Division, at Oak Ridge National Laboratory, Oak Ridge, TN. He is also an Adjunct Assistant Professor in the Department of Industrial and Information Engineering at the University of Tennessee, Knoxville, TN. His research areas include streaming and real-time data mining, signal processing, optimization techniques in data mining, infrastructure modeling and analysis, and disaster risk analysis in space and time.