research-article

Accelerating throughput-aware runtime mapping for heterogeneous MPSoCs

Published: 16 January 2013

Abstract

Modern embedded systems, which often employ multiprocessor systems-on-chip (MPSoCs), need to support multiple time-constrained multimedia applications. Such systems need to be optimized for resource usage and energy consumption. It is well understood that a design-time approach cannot provide timing guarantees for all applications because it cannot cater for dynamism in applications. A purely runtime approach, however, demands substantial computation at runtime and hence may not lend itself well to constraint-aware mapping.

In this article, we present a hybrid approach for efficient mapping of applications in such systems. For each application to be supported in the system, the approach performs extensive design-space exploration (DSE) at design time to derive multiple design points representing throughput and energy consumption at different resource combinations. One of these points is selected efficiently at runtime, depending upon the desired throughput, while optimizing for energy consumption and resource usage. While most existing DSE strategies consider a fixed multiprocessor platform architecture, our DSE considers a generic architecture, making DSE results applicable to any target platform. All the compute-intensive analysis is performed during DSE, which leaves minimal computation for runtime. The approach is capable of handling dynamism in applications by considering their runtime aspects and providing timing guarantees.
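The runtime selection step described above can be illustrated with a minimal sketch. It is not the article's algorithm: the `DesignPoint` fields, the `select_point` function, the tie-breaking rule, and all numeric values are hypothetical and chosen only to show how a table of precomputed design points reduces runtime work to a simple lookup.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DesignPoint:
    """One precomputed design-time result (all fields are hypothetical)."""
    processors: int    # number of processing elements the point occupies
    throughput: float  # achieved throughput, e.g. iterations per second
    energy: float      # energy per iteration, arbitrary units


def select_point(points: Sequence[DesignPoint],
                 required_throughput: float,
                 free_processors: int) -> Optional[DesignPoint]:
    """Pick the feasible point with the lowest energy.

    A point is feasible if it meets the required throughput and fits in the
    currently free processors. Ties on energy are broken by using fewer
    processors. Returns None when no point is feasible.
    """
    feasible = [p for p in points
                if p.throughput >= required_throughput
                and p.processors <= free_processors]
    if not feasible:
        return None
    return min(feasible, key=lambda p: (p.energy, p.processors))


# Made-up design points for one application, as DSE might store them.
EXAMPLE_POINTS = [
    DesignPoint(processors=1, throughput=10.0, energy=5.0),
    DesignPoint(processors=2, throughput=18.0, energy=6.5),
    DesignPoint(processors=3, throughput=25.0, energy=9.0),
    DesignPoint(processors=4, throughput=26.0, energy=12.0),
]

chosen = select_point(EXAMPLE_POINTS, required_throughput=15.0, free_processors=3)
print(chosen)
```

In this sketch the runtime cost is a single linear scan over the stored points, which reflects the idea that the expensive analysis has already been done at design time. A real system would also have to account for which processor types are free and for communication resources, which this sketch ignores.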

The presented approach is used to carry out a DSE case study for models of real-life multimedia applications: H.263 decoder, H.263 encoder, MPEG-4 decoder, JPEG decoder, sample rate converter, and MP3 decoder. At runtime, the design points are used to map the applications on a heterogeneous MPSoC. Experimental results reveal that the proposed approach provides faster DSE, better design points, and efficient runtime mapping when compared to other approaches. In particular, we show that DSE is faster by 83% and runtime mapping is accelerated by 93% for some cases. Further, we study the scalability of the approach by considering applications with large numbers of tasks.


          • Published in

            ACM Transactions on Design Automation of Electronic Systems, Volume 18, Issue 1
            Special section on adaptive power management for energy and temperature-aware computing systems
            January 2013
            319 pages
            ISSN: 1084-4309
            EISSN: 1557-7309
            DOI: 10.1145/2390191

            Copyright © 2013 ACM

            Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

            Publisher

            Association for Computing Machinery

            New York, NY, United States

            Publication History

            • Published: 16 January 2013
            • Accepted: 1 August 2012
            • Revised: 1 April 2012
            • Received: 1 September 2011


            Qualifiers

            • research-article
            • Research
            • Refereed
