
Box-counting algorithm on GPU and multi-core CPU: an OpenCL cross-platform study

The Journal of Supercomputing


Abstract

In this paper, we present the analysis and development of a cross-platform OpenCL implementation of the box-counting algorithm, one of the most widely used methods for estimating the Fractal Dimension. The Fractal Dimension is a relevant image analysis method used in several disciplines, but computing it is generally time-consuming, especially for 3D images. Unlike parallel programming models that depend strictly on the hardware type and manufacturer, such as CUDA, OpenCL allows us to provide an implementation that runs on both GPUs and multi-core CPUs, regardless of the hardware manufacturer. Sorting is a key part of the fast box-counting algorithm, and the final speedup depends heavily on the efficiency of the sorting algorithm used. Our study reveals that current OpenCL implementations of sorting algorithms are clearly slower than both CUDA implementations for GPUs and dedicated multi-core CPU implementations. Our OpenCL algorithm has been specifically optimized according to the type of target device, and the results show average speedups of up to 7.46× on the GPU and 4× on the multi-core CPU, both relative to the single-threaded (sequential) CPU implementation.
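To make the role of sorting concrete, the following is a minimal sequential sketch in Python of a sort-based box count. It is not the OpenCL implementation described in the abstract. It assumes the input is a set of integer voxel coordinates on a 2^bits grid, and it assumes bit-interleaved (Morton) keys as one common way to make the points of each box contiguous after a single sort. The function names `morton3`, `box_counts` and `fractal_dimension` are hypothetical and chosen only for illustration.

```python
import math


def morton3(x, y, z, bits):
    """Interleave the low `bits` bits of x, y and z into one integer key."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (3 * i)
        key |= ((y >> i) & 1) << (3 * i + 1)
        key |= ((z >> i) & 1) << (3 * i + 2)
    return key


def box_counts(points, bits):
    """Return [(box_size, occupied_boxes)] for box sizes 1, 2, ..., 2**(bits - 1).

    The keys are sorted once. Dropping the lowest 3*level bits of a key gives
    the index of the box of side 2**level that contains the point, and the
    sorted order keeps equal box indices adjacent, so counting occupied boxes
    is a single linear scan per level.
    """
    keys = sorted(morton3(x, y, z, bits) for x, y, z in points)
    counts = []
    for level in range(bits):
        shift = 3 * level
        occupied = 0
        previous = None
        for key in keys:
            box = key >> shift
            if box != previous:
                occupied += 1
                previous = box
        counts.append((1 << level, occupied))
    return counts


def fractal_dimension(counts):
    """Least-squares slope of log N(s) against log(1/s)."""
    xs = [math.log(1.0 / size) for size, _ in counts]
    ys = [math.log(n) for _, n in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

For a completely filled 8×8×8 cube, `box_counts(cube, 3)` yields 512, 64 and 8 occupied boxes for box sizes 1, 2 and 4, and the fitted slope is 3, as expected for a solid volume. In this sketch the sort dominates the cost, which is consistent with the abstract's point that the overall speedup depends on the efficiency of the sorting step.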





Acknowledgements

This work has been partially supported by the University of Jaén, the Caja Rural de Jaén, the Andalusian Government and the European Union (via ERDF funds) through the research projects UJA2009/13/04 and PI10-TIC-5807.

Author information

Corresponding author

Correspondence to Juan Ruiz de Miras.

About this article

Cite this article

Jiménez, J., Ruiz de Miras, J. Box-counting algorithm on GPU and multi-core CPU: an OpenCL cross-platform study. J Supercomput 65, 1327–1352 (2013). https://doi.org/10.1007/s11227-013-0885-z

  • DOI: https://doi.org/10.1007/s11227-013-0885-z
