Abstract
Clustering of time series data is frequently hampered by noise components within the signal. This noise impairs the ability of a clustering algorithm to detect similarities across signals, which may lead to poor clustering results. We propose a method that first suppresses such noise using wavelet decomposition and thresholding, then reconstructs the signal (with minimised noise), and finally clusters the reconstructed signals. We experimentally evaluate the proposed method on 250 signals generated from five classes of signals. The proposed method achieves improved clustering results.
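The three stages described in the abstract (decompose and threshold, reconstruct, cluster) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the Haar wavelet, the soft thresholding rule, the universal threshold with a median-based noise estimate, and the use of k-means are all assumptions made here for the sake of a runnable example, and every function name is hypothetical. The sketch also assumes that the signal length is divisible by 2 raised to the number of decomposition levels.

```python
import math
import random
import statistics

def haar_dwt(signal, levels):
    """Multi-level orthonormal Haar transform (assumed wavelet).
    Returns (approximation, list of detail bands ordered coarsest to finest)."""
    approx, details = list(signal), []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        details.insert(0, [(a - b) / math.sqrt(2) for a, b in pairs])
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
    return approx, details

def haar_idwt(approx, details):
    """Inverse of haar_dwt: rebuild the signal from coarsest band to finest."""
    out = list(approx)
    for band in details:
        rebuilt = []
        for s, w in zip(out, band):
            rebuilt += [(s + w) / math.sqrt(2), (s - w) / math.sqrt(2)]
        out = rebuilt
    return out

def soft_threshold(w, t):
    """Shrink a coefficient towards zero by t; values within [-t, t] become zero."""
    return math.copysign(max(abs(w) - t, 0.0), w)

def denoise(signal, levels=3):
    """Decompose, threshold the detail coefficients, and reconstruct."""
    approx, details = haar_dwt(signal, levels)
    # Assumed noise estimate: median absolute finest-scale coefficient / 0.6745.
    sigma = statistics.median(abs(w) for w in details[-1]) / 0.6745
    # Assumed threshold choice: the universal threshold sigma * sqrt(2 ln n).
    t = sigma * math.sqrt(2 * math.log(len(signal)))
    details = [[soft_threshold(w, t) for w in band] for band in details]
    return haar_idwt(approx, details)

def kmeans(rows, k, iters=50, seed=0):
    """Plain k-means with Euclidean distance (assumed clustering algorithm)."""
    centers = random.Random(seed).sample(rows, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(r, centers[c])) for r in rows]
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Under these assumptions, a collection of equal-length signals would be clustered with something like `kmeans([denoise(s) for s in signals], k=5)`, so that distances are computed on the reconstructed signals rather than on the noisy originals.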
Notes
1. Although we know the class to which each signal belongs, we first ignore the class labels to simulate a real-life clustering scenario in which class labels are unknown. After clustering the signals, we use the class labels as ground truth for clustering evaluation.
2. Clustering result tables for the Simple Moving Average of order 3 and for Hard and Soft thresholding are not shown; the same applies to clustering on the raw wavelet coefficients.
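The evaluation procedure in Note 1 (cluster without labels, then compare against the withheld class labels) can be illustrated with an external validity measure. The text above does not say which measure the authors use; purity is assumed here only as one common choice, and the function name is hypothetical.

```python
from collections import Counter

def purity(cluster_ids, true_labels):
    """Fraction of items that belong to the majority class of their cluster.
    The class labels are used only here, after clustering has finished."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)
```

A value of 1.0 means every cluster contains signals from a single class; lower values indicate that classes are mixed within clusters.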
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Grant, P., Islam, M.Z. (2019). Clustering Noisy Temporal Data. In: Li, J., Wang, S., Qin, S., Li, X., Wang, S. (eds) Advanced Data Mining and Applications. ADMA 2019. Lecture Notes in Computer Science, vol 11888. Springer, Cham. https://doi.org/10.1007/978-3-030-35231-8_13
DOI: https://doi.org/10.1007/978-3-030-35231-8_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35230-1
Online ISBN: 978-3-030-35231-8
eBook Packages: Computer Science, Computer Science (R0)