Abstract
Reconfigurable architectures (RAs), which provide extremely high energy efficiency for certain application domains, have one problem: current mapping algorithms for them do not scale well with the number of cores. One approach to this problem is to use the SIMD (Single Instruction, Multiple Data) paradigm. However, SIMD can complicate the mapping problem by adding another dimension, iteration mapping, to the already interdependent problems of data mapping and operation mapping, and it can significantly affect performance through memory bank conflicts. In this paper we introduce a SIMD reconfigurable architecture, which allows SIMD mapping at multiple levels of granularity, and investigate ways to minimize bank conflicts in such an architecture while taking the related sub-problems into consideration. We further present data tiling and evaluate a conflict-free scheduling algorithm as a way to eliminate bank conflicts for a certain class of iteration and data mappings.
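To make the interaction between iteration mapping and bank conflicts concrete, the following is a minimal illustrative sketch and not the paper's algorithm. It assumes a word-interleaved multi-bank memory in which each bank serves one access per cycle, and a loop that reads a[stride * i]. The function names (bank_of, conflict_stalls, lane_addresses) and the parameters are hypothetical, introduced only for this example.

```python
from collections import Counter


def bank_of(addr, num_banks):
    # Assumed word-interleaved banking: consecutive words map to consecutive banks.
    return addr % num_banks


def conflict_stalls(addresses, num_banks):
    # Each bank is assumed to serve one access per cycle, so accesses issued
    # together take as many cycles as the most heavily loaded bank receives.
    # The stall count is that number minus the one cycle needed anyway.
    if not addresses:
        return 0
    per_bank = Counter(bank_of(a, num_banks) for a in addresses)
    return max(per_bank.values()) - 1


def lane_addresses(base_iter, num_lanes, stride, iter_step):
    # SIMD lane k executes iteration base_iter + k * iter_step of a loop
    # that accesses a[stride * i]; return the addresses issued in one cycle.
    return [stride * (base_iter + k * iter_step) for k in range(num_lanes)]


if __name__ == "__main__":
    lanes, banks = 4, 4
    # Interleaved iteration mapping: lanes run iterations i, i+1, i+2, i+3.
    interleaved = lane_addresses(0, lanes, stride=1, iter_step=1)
    # Blocked iteration mapping: each lane owns a block of 16 iterations.
    blocked = lane_addresses(0, lanes, stride=1, iter_step=16)
    print("interleaved stalls:", conflict_stalls(interleaved, banks))
    print("blocked stalls:", conflict_stalls(blocked, banks))
```

Under these assumptions, the interleaved mapping spreads the four lanes over four distinct banks, while the blocked mapping sends every lane to the same bank. This is the kind of dependence between iteration mapping and data mapping that the abstract refers to.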
This work was supported in part by the Korea Science and Engineering Foundation (KOSEF) NRL Program grant funded by the Korea government (MEST) (No. 2011-0018609), the Engineering Research Center of Excellence Program of Korea Ministry of Education, Science and Technology (MEST) / Korea Science and Engineering Foundation (KOSEF) (Grant 2011-0000975), IDEC, and in part by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by MEST, under grant 2010-0011534.
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kim, Y., Lee, J., Lee, J., Mai, T.X., Heo, I., Paek, Y. (2012). Exploiting Both Pipelining and Data Parallelism with SIMD Reconfigurable Architecture. In: Choy, O.C.S., Cheung, R.C.C., Athanas, P., Sano, K. (eds) Reconfigurable Computing: Architectures, Tools and Applications. ARC 2012. Lecture Notes in Computer Science, vol 7199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28365-9_4
DOI: https://doi.org/10.1007/978-3-642-28365-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28364-2
Online ISBN: 978-3-642-28365-9
eBook Packages: Computer Science, Computer Science (R0)