High-Dimensional Data Clustering Algorithm Based on Stacked-Random Projection

Sun, Yujia; Platoš, Jan

doi:10.1007/978-3-030-57796-4_38

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1263)

Included in the following conference series:

International Conference on Intelligent Networking and Collaborative Systems

Abstract

This study focuses on high-dimensional data, which are characterized by sparsity, redundancy, and high computational complexity. Because of the “curse of dimensionality”, traditional clustering algorithms fail to deliver the expected results on such data. We propose a Stacked-Random Projection dimensionality reduction framework, together with an evaluation index for dimensionality reduction that is based on distance preservation. The algorithm first uses Stacked-Random Projection to reduce the dimensionality of the data, and then clusters the reduced data with spectral clustering and with clustering by fast search and find of density peaks. The algorithm is validated on two high-dimensional data sets. Experimental results show that it significantly improves clustering performance.
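The abstract does not spell out how the projections are stacked or how the distance-preservation index is defined, so the following is only a minimal sketch under stated assumptions. It assumes that "stacking" means applying several Gaussian random projections in sequence, and it uses the mean relative error of pairwise Euclidean distances as a stand-in for the evaluation index. The function names, the target dimensions, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def random_projection(X, k, rng):
    """Project the rows of X to k dimensions with a Gaussian random matrix."""
    d = X.shape[1]
    # Entries drawn from N(0, 1/k), so squared distances are preserved in expectation.
    R = rng.normal(0.0, 1.0 / np.sqrt(k), size=(d, k))
    return X @ R


def stacked_random_projection(X, dims, seed=0):
    """Apply one random projection per entry of dims, in order (assumed stacking scheme)."""
    rng = np.random.default_rng(seed)
    Z = X
    for k in dims:
        Z = random_projection(Z, k, rng)
    return Z


def pairwise_distances(X):
    """Euclidean distance matrix between all rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives caused by rounding


def distance_distortion(X, Z):
    """Mean relative error of pairwise distances (assumed proxy for the paper's index)."""
    iu = np.triu_indices(X.shape[0], k=1)
    d_orig = pairwise_distances(X)[iu]
    d_proj = pairwise_distances(Z)[iu]
    mask = d_orig > 0  # skip duplicate points to avoid division by zero
    return float(np.mean(np.abs(d_proj[mask] - d_orig[mask]) / d_orig[mask]))


# Synthetic data: two well-separated Gaussian blobs in 2000 dimensions.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(30, 2000)),
    rng.normal(3.0, 1.0, size=(30, 2000)),
])

Z = stacked_random_projection(X, dims=[500, 100])
print("reduced shape:", Z.shape)
print("mean relative distance error:", distance_distortion(X, Z))
```

A lower distortion value means the reduced data keep the pairwise geometry that distance-based methods such as spectral clustering and density peak clustering depend on. The reduced matrix `Z` would then be passed to the clustering step in place of `X`.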

Author information

Authors and Affiliations

Technical University of Ostrava, 17. listopadu 2172/15, 70800, Ostrava-Poruba, Czech Republic
Yujia Sun & Jan Platoš
Hebei GEO University, No. 136 East Huai’an Road, Shijiazhuang, 050031, Hebei, China
Yujia Sun

Corresponding author

Correspondence to Yujia Sun.

Editor information

Editors and Affiliations

Department of Information and Communication Engineering, Faculty of Information Engineering, Fukuoka Institute of Technology, Fukuoka, Japan
Leonard Barolli
Department of Electrical and Computer Engineering, University of Victoria, Victoria, BC, Canada
Kin Fun Li
School of Science and Technology, Kwansei Gakuin University, Sanda, Japan
Hiroyoshi Miwa

About this paper

Cite this paper

Sun, Y., Platoš, J. (2021). High-Dimensional Data Clustering Algorithm Based on Stacked-Random Projection. In: Barolli, L., Li, K., Miwa, H. (eds) Advances in Intelligent Networking and Collaborative Systems. INCoS 2020. Advances in Intelligent Systems and Computing, vol 1263. Springer, Cham. https://doi.org/10.1007/978-3-030-57796-4_38

DOI: https://doi.org/10.1007/978-3-030-57796-4_38
Published: 21 August 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57795-7
Online ISBN: 978-3-030-57796-4
eBook Packages: Intelligent Technologies and Robotics (R0)
