How to speed Connected Component Labeling up with SIMD RLE algorithms

Authors:
Florian Lemaitre (LIP6 - Sorbonne University, CNRS), Arthur Hennequin (LIP6 - Sorbonne University, CNRS), Lionel Lacassagne (LIP6 - Sorbonne University, CNRS)

WPMVP'20: Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing, February 2020, Article No.: 2, Pages 1–8, https://doi.org/10.1145/3380479.3380481

Published: 04 March 2020

ABSTRACT

Research on Connected Component Labeling (CCL), although long-standing, is still very active: several efficient algorithms for CPUs and GPUs have emerged in recent years, and their performance keeps improving. This article introduces a new SIMD run-based algorithm for CCL. We show how run-length encoding (RLE) compression can be SIMDized and used to accelerate scalar run-based CCL algorithms. A benchmark on Intel, AMD and ARM processors shows that this new algorithm outperforms the state of the art by an average factor of ×1.7 on AVX2 machines and ×1.9 on Intel Xeon Skylake with AVX512.
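To make the idea of run-length encoding a binary image row concrete, here is a minimal scalar sketch in Python. It is not the authors' SIMD algorithm, which this page does not describe. The function names and the half-open `[start, end)` run convention are assumptions chosen for illustration. The second function expresses the same encoding as a neighbor comparison (an edge test), which is the kind of element-wise operation that tends to map well onto vector instructions.

```python
def rle_encode_row(row):
    """Return half-open runs (start, end) of consecutive foreground pixels.

    `row` is a sequence of 0/1 (or falsy/truthy) pixel values.
    Hypothetical helper for illustration only.
    """
    runs = []
    start = None
    for x, pixel in enumerate(row):
        if pixel and start is None:
            start = x                  # a run begins here
        elif not pixel and start is not None:
            runs.append((start, x))    # the run ended just before x
            start = None
    if start is not None:
        runs.append((start, len(row)))  # run touching the right border
    return runs


def rle_encode_row_edges(row):
    """Same result, computed by detecting transitions between neighbors.

    The row is padded with background on both sides, so edges alternate
    between run starts and run ends (exclusive).
    """
    bits = [1 if p else 0 for p in row]
    padded = [0] + bits + [0]
    edges = [x for x in range(len(padded) - 1) if padded[x] != padded[x + 1]]
    return list(zip(edges[0::2], edges[1::2]))


row = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
print(rle_encode_row(row))        # [(1, 3), (4, 5), (7, 10)]
print(rle_encode_row_edges(row))  # [(1, 3), (4, 5), (7, 10)]
```

A run-based labeling algorithm then works on these runs rather than on individual pixels, so rows with long foreground segments are reduced to a handful of intervals.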

Published in
WPMVP'20: Proceedings of the 2020 Sixth Workshop on Programming Models for SIMD/Vector Processing, February 2020, 29 pages
ISBN: 9781450375207
DOI: 10.1145/3380479
Editors: Jan Eitzinger (University Erlangen-Nuremberg, Germany), Lionel Lacassagne (Sorbonne University, France)
Copyright © 2020 ACM
Publisher: Association for Computing Machinery, New York, NY, United States
Overall acceptance rate: 20 of 30 submissions, 67%