ABSTRACT
A central task in data mining is identifying agglomerations of data points in spatial or spatio-temporal databases, and many applications make use of such clustering algorithms. However, some applications require not only that dense areas be identified but also that each cluster conform to a specific shape, e.g. a circle. This is the case for eddy detection in marine science, where eddies are characterized not only by their density but also by their circular rotation. Traditional clustering algorithms cannot take such aspects into account.
In this paper, we introduce Vortex Correlation Clustering (VoCC), which aims to identify correlated groups of objects oriented along a vortex. This is achieved by adapting the Circle Hough Transformation, known from image analysis. The presented adaptations allow objects to be clustered not only by their proximity to each other but also by their individual orientation, which yields a more precise clustering. A multi-step approach analyzes and aggregates cluster candidates so that the final result also includes clusters that do not perfectly satisfy the shape condition.
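The idea of letting each object's orientation constrain its Hough vote can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`vote_for_centers`, `make_vortex`), the grid cell size, the candidate radii, and the synthetic data are all assumptions made for illustration. The sketch assumes that a particle circling a vortex moves tangentially, so the vortex center should lie on the line perpendicular to the particle's velocity. Each particle therefore votes only for accumulator cells along that normal, rather than for a full circle of candidate centers as in the plain Circle Hough Transformation.

```python
import math
from collections import Counter


def vote_for_centers(particles, radii, cell=1.0):
    """Accumulate center votes for particles given as (x, y, u, v).

    (x, y) is the position and (u, v) the velocity. Each particle votes
    for the cells at distance r along both directions of the normal to
    its velocity, for every candidate radius r (hypothetical scheme).
    """
    accumulator = Counter()
    for x, y, u, v in particles:
        speed = math.hypot(u, v)
        if speed == 0.0:
            continue  # a resting particle carries no orientation
        # Unit normal to the velocity vector.
        nx, ny = -v / speed, u / speed
        for r in radii:
            for sign in (1.0, -1.0):  # rotation sense is unknown
                cx = x + sign * r * nx
                cy = y + sign * r * ny
                accumulator[(round(cx / cell), round(cy / cell))] += 1
    return accumulator


def make_vortex(cx, cy, r, n):
    """Synthetic test data: n particles rotating counterclockwise."""
    particles = []
    for k in range(n):
        a = 2.0 * math.pi * k / n
        particles.append(
            (cx + r * math.cos(a), cy + r * math.sin(a), -math.sin(a), math.cos(a))
        )
    return particles


particles = make_vortex(5.0, 5.0, 3.0, 12)
accumulator = vote_for_centers(particles, radii=[1.0, 2.0, 3.0, 4.0])
best_cell, votes = accumulator.most_common(1)[0]
print(best_cell, votes)
```

On this synthetic vortex, all twelve particles agree on the cell containing the true center, while votes cast with the wrong radius or rotation sense are scattered over other cells. A full method would still need to extract several peaks, assign particles to them, and tolerate imperfect circles, which this sketch does not attempt.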
We evaluate our approach on a real-world application: clustering particle simulations that form such shapes. For this application, our approach outperforms comparable clustering methods in terms of both effectiveness and efficiency.
Index Terms
- VoCC: Vortex Correlation Clustering Based on Masked Hough Transformation in Spatial Databases