Skip to main content

Efficient Parallel Implementation of Cellular Automata and Stencil Computations in Current Processors

  • Chapter
  • First Online:
Advances in Computing, Informatics, Networking and Cybersecurity

Abstract

A Cellular Automaton is a bio-inspired discrete model of computation with multiple applications, consisting of a regular grid of cells that have different states along time. Our research group at the University of Seville (Spain) collaborated in the past with Prof. Mohammad S. Obaidat in the fusion of Cellular Automata (CA) with another bio-inspired approach, the Address-Event-Representation (AER) neuromorphic communication protocol, for implementing a vision filter based on 3\(\,\times \,\)3 convolutions [22]. In the last years, our group has continued working on the optimization of CA implementations in current high-performance computational systems. In this chapter, several of these optimizations will be described, focusing on those especially aimed at current processors, including hardware alternatives (e.g. GPUs, BTBs, etc.), different forms of parallelism such as instruction-level parallelism, thread-level parallelism, data-level parallelism, as well as software approaches (such as ‘if-else’ statement elimination, loop unrolling, data pipelining and blocking, etc.). The effect of these optimizations will be qualitatively illustrated by means of the Roofline model, considering simple CAs such as the well-known Game-of-Life (GOL), which has been extensively used to explore CA characteristics. This CA was invented by John Horton Conway, an English mathematician that recently died of complications from COVID-19, so we want this case-study to serve also as a tribute to him.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 109.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 139.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Hoekstra, A.G., Kroc, J., Sloot, P. (eds.): Simulating Complex Systems by Cellular Automata. Springer, Berlin, Heidelberg (2010)

    MATH  Google Scholar 

  2. Bajzát, T., Hajnal, E.: Cell automaton modelling algorithms: implementation and testing in GPU systems. In: INES 2011, 15th International Conference on Intelligent Engineering Systems (2011)

    Google Scholar 

  3. Balasalle, J., Lopez, M., Rutherford, M.: Optimizing Memory Access Patterns for Cellular Automata on GPUs, pp. 67–75. Elsevier–Morgan Kaufmann–NVIDIA (2011)

    Google Scholar 

  4. Bandman, O.: Using multi core computers for implementing cellular automata systems. Lect.ure Notes Comput. Sci. 6873(1), 140–151 (2011)

    Google Scholar 

  5. Cagigas-Muñiz, D., Diaz-del Rio, F., López-Torres, M., Jiménez-Morales, F., Guisado, J.L.: Developing efficient discrete simulations on multicore and GPU architectures. Electronics 9, 189 (2020). https://doi.org/10.3390/electronics9010189

    Article  Google Scholar 

  6. Cattaneo, R., Natale, G., Sicignano, C., Sciuto, D., Santambrogio, M.D.: On how to accelerate iterative stencil loops: a scalable streaming-based approach. ACM Trans. Archit. Code Optim. 12(4), 1–26 (2015)

    Article  Google Scholar 

  7. Chopard, B., Droz, M.: Cellular Automata Modeling of Physical Systems. Cambridge University Press, Cambridge, MA, USA (1998)

    Book  Google Scholar 

  8. Duesterwald, E., Gupta, R., Soffa, M.L.: Register pipelining: an integrated approach to register allocation for scalar and subscripted variables. In: Kastens, U., Pfahler, P. (eds.) Compiler Construction, pp. 192–206. Springer, Berlin, Heidelberg (1992)

    Chapter  Google Scholar 

  9. Berlekamp, E.R., Conway, J.H., Guy, R.K.: Winning Ways for your Mathematical Plays, 2nd edn. A K Peters/CRC Press, New York, USA (2001)

    MATH  Google Scholar 

  10. Gardner, M.: Mathematical games: the fantastic combinations of John Conway’s new solitaire game & “Life’’. Sci. Am. 223(4), 120–123 (1970)

    Article  Google Scholar 

  11. Gibson, M.J., Keedwell, E.C., Savić, D.A.: An investigation of the efficient implementation of cellular automata on multi-core CPU and GPU hardware. J. Parallel Distrib. Comput. 77, 11–25 (2015)

    Article  Google Scholar 

  12. Guisado, J., Jiménez-Morales, F., Fernández-de Vega, F.: Cellular automata and cluster computing: an application to the simulation of laser dynamics. Adv. Complex Syst. 10(Suppl. 1), 167–190 (2007)

    Article  Google Scholar 

  13. Hennessy, J.L., Patterson, D.A.: Computer Architecture, Sixth Edition: A Quantitative Approach, 6th edn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2017)

    Google Scholar 

  14. Hwu, W.m.: GPU Computing Gems Jade Edition, 1st edn. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA (2011)

    Google Scholar 

  15. Ilachinski, A.: Cellular Automata: A Discrete Universe. World Scientific, Singapore (2001)

    Book  Google Scholar 

  16. Ilic, A., Pratas, F., Sousa, L.: Cache-aware roofline model: upgrading the loft. IEEE Comput. Archit. Lett. 13(1), 21–24 (2014). https://doi.org/10.1109/L-CA.2013.6

    Article  Google Scholar 

  17. Intel: Intel intrinsics guide. https://software.intel.com/sites/landingpage/IntrinsicsGuide/

  18. Kari, J.: Theory of cellular automata: a survey. Theor. Comput. Sci. 334(1–3), 3–33 (2005)

    Article  MathSciNet  Google Scholar 

  19. Kirk, D.B., Hwu, W.m.W.: Programming Massively Parallel Processors: A Hands-on Approach. Morgan Kaufmann Publishers, Burlington, MA (2010)

    Google Scholar 

  20. Koskela, T., Matveev, Z., Yang, C., Adedoyin, A., Belenov, R., Thierry, P., Zhao, Z., Gayatri, R., Shan, H., Oliker, L., Deslippe, J., Green, R., Williams, S.: A novel multi-level integrated roofline model approach for performance characterization. In: Yokota, R., Weiland, M., Keyes, D., Trinitis, C. (eds.) High Performance Computing, pp. 226–245. Springer, Cham (2018)

    Chapter  Google Scholar 

  21. Li, Z., Song, Y.: Automatic tiling of iterative stencil loops. ACM Trans. Progr. Lang. Syst. 26(6), 975–1028 (2004)

    Article  Google Scholar 

  22. Linares-Barranco, A., Sevillano, J., Obaidat, M.S.: AER filtering using glider: VHDL cellular automata description. In: 15th IEEE International Conference on Electronics, Circuits and Systems, pp. 614–617 (2008)

    Google Scholar 

  23. Lopez-Torres, M., Guisado, J., Jimenez-Morales, F., Diaz-del Rio, F.: GPU-based cellular automata simulations of laser dynamics. In: Proceedings of the XXIII Jornadas de Paralelismo: Jornadas SARTECO 2012, pp. 261–266. SARTECO, Elche (2012). http://www.jornadassarteco.org/js2012/papers/paper_151.pdf

  24. Matsumura, K., Zohouri, H., Wahib, M., Endo, T., Matsuoka, S.: AN5D: automated stencil framework for high-degree temporal blocking on GPUS. In: International Symposium on Code Generation and Optimization, pp. 199–211 (2020). https://doi.org/10.1145/3368826.3377904

  25. Millñin, E., Martínez, P., Gil Costa, G., Piccoli, M., Printista, A., Bederian, C., García Garino, C., Bringa, E.: Parallel implementation of a cellular automata in a hybrid CPU/GPU environment. In: XVIII Congreso Argentino de Ciencias de la Computación, pp. 184–193 (2013)

    Google Scholar 

  26. von Neumann, J.: Theory of Self-reproducing Automata. University of Illinois Press, Urbana (1966)

    Google Scholar 

  27. Nguyen, A.D., Satish, N., Chhugani, J., Kim, C., Dubey, P.: 3.5-D blocking optimization for stencil computations on modern CPUS and GPUS . In: SC, pp. 1–13. IEEE (2010)

    Google Scholar 

  28. Oxman, G., Weiss, S., Be’ery, Y.: Computational methods for Conway’s Game of Life cellular automaton. J. Comput. Sci. 5(1), 24–31 (2014)

    Article  MathSciNet  Google Scholar 

  29. Bryant, R.E., O’Hallaron, D.R.: Computer Systems: A Programmer’s Perspective, 3rd edn. Pearson, London, UK (2016)

    Google Scholar 

  30. Rybacki, S., Himmelspach, J., Uhrmacher, A.: CPU and GPU based simulation of cellular automata—a performance comparison. In: Proceedings of the 1st SIMUL, pp. 62–67 (2009)

    Google Scholar 

  31. Song, Y., Li, Z.: New tiling techniques to improve cache temporal locality. In: Proceedings of the ACM SIGPLAN 1999 Conference on Programming Language Design and Implementation, PLDI ’99, pp. 215–228. Association for Computing Machinery, New York, NY, USA (1999). https://doi.org/10.1145/301618.301668

  32. Stengel, H., Treibig, J., Hager, G., Wellein, G.: Quantifying performance bottlenecks of stencil computations using the execution-cache-memory model. In: Proceedings of the 29th ACM on International Conference on Supercomputing, ICS ’15, pp. 207–216. Association for Computing Machinery, New York, NY, USA (2015). https://doi.org/10.1145/2751205.2751240

  33. Williams, S., Waterman, A., Patterson, D.: Roofline: an insightful visual performance model for multicore architectures. Commun. ACM 52(4), 65–76 (2009)

    Article  Google Scholar 

  34. Yang, C., Kurth, T., Williams, S.: Hierarchical Roofline analysis for GPUS: accelerating performance optimization for the NERSC-9 Perlmutter system. Concurr. Comput. 32(20), 1–12 (2020)

    Article  Google Scholar 

Download references

Acknowledgements

This research was funded by the following research project of Ministerio de Economía, Industria y Competitividad, Gobierno de España (MINECO) and the Agencia Estatal de Investigación (AEI) of Spain, cofinanced by FEDER funds (EU): MABICAP (Bio-inspired machines on High Performance Computing platforms: a multidisciplinary approach, TIN2017-89842P), Par-HoT (Parallel Data Processing based on Homotopy Connectivity: Applications to Stereoscopic Vision and Biomedical Data, PID2019-110455GB-I00).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Jose Luis Sevillano-Ramos .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Check for updates. Verify currency and authenticity via CrossMark

Cite this chapter

Diaz-del-Rio, F., Cagigas-Muñiz, D., Guisado-Lizar, J.L., Sevillano-Ramos, J.L. (2022). Efficient Parallel Implementation of Cellular Automata and Stencil Computations in Current Processors. In: Nicopolitidis, P., Misra, S., Yang, L.T., Zeigler, B., Ning, Z. (eds) Advances in Computing, Informatics, Networking and Cybersecurity. Lecture Notes in Networks and Systems, vol 289. Springer, Cham. https://doi.org/10.1007/978-3-030-87049-2_4

Download citation

Publish with us

Policies and ethics