Author ORCID Identifier

https://orcid.org/0000-0003-2749-3063

Date of Award

5-8-2020

Degree Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Rafal Angryk

Second Advisor

Rafal Angryk

Third Advisor

Petrus Martens

Fourth Advisor

Pete Riley

Fifth Advisor

Yanqing Zhang

Abstract

Clustering is an essential branch of data mining and statistical analysis that helps us explore the distribution of data and extract knowledge. With the broad accumulation and application of time series data, the study of time series clustering is a natural extension of existing unsupervised learning heuristics. We discuss the components that make up time series clustering: the similarity measure, the clustering heuristic, the evaluation of cluster quality, and the applications of these heuristics. Because the similarity measure is the groundwork for data analysis, we propose a scalable and efficient time series similarity measure, segmented-Dynamic Time Warping. For time series clustering, we formulate the Distance Density Clustering heuristic, a deterministic clustering algorithm that adopts concepts from both density and distance separation. In addition, we explore the characteristics and discuss the limitations of existing cluster evaluation methods. Finally, all of these components are directed toward real-world applications.

DOI

https://doi.org/10.57709/20253148
