Abstract
The multi-swarm particle swarm optimization (MPSO) algorithm incorporates multiple independent PSO swarms that cooperate by periodically exchanging information. In spite of its embarrassingly parallel nature, MPSO is memory bound, which limits its performance on data-parallel GPUs. Recently, heterogeneous multi-core architectures such as the AMD Accelerated Processing Unit (APU) have fused the CPU and GPU on a single die, eliminating the traditional PCIe bottleneck between them. In this paper, we report our experience developing an OpenCL-based MPSO algorithm for the task scheduling problem on the APU architecture. We use the AMD A8-3530MX APU, which packs four x86 computing cores and 80 four-way processing elements. We make effective use of hardware features such as the hierarchical memory structure of the APU, its four-way very long instruction word (VLIW) feature for vectorization, and global-to-local memory DMA transfers. We observe a 29% decrease in overall execution time over our baseline implementation.
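To make the cooperation scheme concrete, the following is a minimal sequential sketch of a multi-swarm PSO loop. It is not the OpenCL implementation described in the paper. Several choices are assumptions made only for illustration: the toy `sphere` objective stands in for the task scheduling cost, the velocity rule is the common inertia-weight form, the exchange step uses a ring in which each swarm adopts its neighbour's best position when that is better, and every name and parameter value (`mpso`, `exchange_every`, `w`, `c1`, `c2`) is hypothetical.

```python
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def mpso(objective, dim=4, n_swarms=3, n_particles=8, iters=60,
         exchange_every=10, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Run several PSO swarms that share their best positions periodically.

    Returns (best fitness after initialisation, best fitness at the end).
    """
    rng = random.Random(seed)
    swarms = []
    for _ in range(n_swarms):
        pos = [[rng.uniform(-5, 5) for _ in range(dim)]
               for _ in range(n_particles)]
        pbest_f = [objective(p) for p in pos]
        g = min(range(n_particles), key=pbest_f.__getitem__)
        swarms.append({
            "pos": pos,
            "vel": [[0.0] * dim for _ in range(n_particles)],
            "pbest": [p[:] for p in pos],
            "pbest_f": pbest_f,
            "gbest": pos[g][:],
            "gbest_f": pbest_f[g],
        })
    initial_best = min(s["gbest_f"] for s in swarms)

    for it in range(1, iters + 1):
        # Each swarm advances independently (the embarrassingly parallel part).
        for s in swarms:
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    s["vel"][i][d] = (
                        w * s["vel"][i][d]
                        + c1 * r1 * (s["pbest"][i][d] - s["pos"][i][d])
                        + c2 * r2 * (s["gbest"][d] - s["pos"][i][d])
                    )
                    s["pos"][i][d] += s["vel"][i][d]
                f = objective(s["pos"][i])
                if f < s["pbest_f"][i]:
                    s["pbest"][i], s["pbest_f"][i] = s["pos"][i][:], f
                    if f < s["gbest_f"]:
                        s["gbest"], s["gbest_f"] = s["pos"][i][:], f

        # Periodic exchange (assumed ring topology): take the neighbour's
        # best position if it beats the swarm's own.
        if it % exchange_every == 0:
            snapshot = [(s["gbest"][:], s["gbest_f"]) for s in swarms]
            for k, s in enumerate(swarms):
                nb, nb_f = snapshot[k - 1]
                if nb_f < s["gbest_f"]:
                    s["gbest"], s["gbest_f"] = nb[:], nb_f

    final_best = min(s["gbest_f"] for s in swarms)
    return initial_best, final_best
```

Calling `mpso(sphere)` returns the best fitness before and after the run. Because each swarm's best value is only ever replaced by a better one, the second number cannot exceed the first. In a GPU or APU version the per-swarm loop is the part that would map onto parallel work-groups, while the exchange step is where the swarms need to synchronise.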
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Franz, W., Thulasiraman, P., Thulasiram, R.K. (2014). Optimization of an OpenCL-Based Multi-swarm PSO Algorithm on an APU. In: Wyrzykowski, R., Dongarra, J., Karczewski, K., Waśniewski, J. (eds) Parallel Processing and Applied Mathematics. PPAM 2013. Lecture Notes in Computer Science(), vol 8385. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55195-6_13
DOI: https://doi.org/10.1007/978-3-642-55195-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-55194-9
Online ISBN: 978-3-642-55195-6
eBook Packages: Computer Science (R0)