How to speed Connected Component Labeling up with SIMD RLE algorithms

Published: 04 March 2020

ABSTRACT

Research on Connected Component Labeling (CCL), although long-standing, is still very active: several efficient algorithms for CPUs and GPUs have emerged in recent years, and performance keeps improving. This article introduces a new SIMD run-based algorithm for CCL. We show how run-length encoding (RLE) compression can be SIMDized and used to accelerate scalar run-based CCL algorithms. A benchmark on Intel, AMD and ARM processors shows that the new algorithm outperforms the state of the art by an average factor of 1.7 on AVX2 machines and 1.9 on Intel Xeon Skylake with AVX512.
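
As a rough illustration of what "run-based" means here, the sketch below encodes one binary image row as a list of foreground runs, first with a scalar loop and then with a word-wide XOR that finds every run boundary in a single operation. It is not the algorithm from the article: the function names `rle_row` and `rle_bits`, the half-open `[start, end)` run convention, and the bit packing (bit j holds column j) are assumptions made for this example only. The second variant merely hints at why edge detection lends itself to data-parallel hardware.

```python
def rle_row(row):
    """Scalar RLE: return half-open (start, end) foreground runs of a binary row."""
    edges = []
    prev = 0                      # pixels left of the row count as background
    for j, px in enumerate(row):
        px = 1 if px else 0
        if px != prev:            # value changes here: a run starts or ends
            edges.append(j)
            prev = px
    if prev:                      # close a run that touches the right border
        edges.append(len(row))
    # even-indexed edges are run starts, odd-indexed edges are run ends
    return list(zip(edges[0::2], edges[1::2]))


def rle_bits(bits, width):
    """Same result from a packed row (bit j = column j, an assumed layout)."""
    # Bit j of the XOR is set when column j differs from column j-1;
    # bit `width` is set when the last column is foreground.
    edges = (bits ^ (bits << 1)) & ((1 << (width + 1)) - 1)
    pos = []
    while edges:
        low = edges & -edges              # isolate the lowest set bit
        pos.append(low.bit_length() - 1)  # its index is the edge position
        edges ^= low                      # clear it and continue
    return list(zip(pos[0::2], pos[1::2]))


row = [0, 1, 1, 0, 1, 0, 0, 1]
packed = sum(px << j for j, px in enumerate(row))
print(rle_row(row))
print(rle_bits(packed, len(row)))
```

Both calls describe the same three runs. A run-based labeling algorithm then works on these intervals instead of individual pixels, which is where the compression pays off on rows with long runs.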

  • Published in

    WPMVP'20: Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing
    February 2020
    29 pages
    ISBN: 9781450375207
    DOI: 10.1145/3380479

    Copyright © 2020 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • research-article
    • Research
    • Refereed limited

    Acceptance Rates

    Overall Acceptance Rate: 20 of 30 submissions, 67%
