Connection of H.264/AVC hardware IPs using a specific Networks-on-Chip

https://doi.org/10.1016/j.micpro.2015.08.010

Abstract

Real-time, high-quality video coding, as provided by recent codecs such as the H.264/AVC, is attracting wide interest in the research and industrial communities for a range of applications. Several hardware implementations (IPs) have been proposed for the various processing elements of video codecs. At the same time, several works have integrated these IPs into a single System-on-Chip (SoC), aiming to provide multi-processor systems (MPSoC) based on a shared-bus architecture. These systems generally suffer from signal propagation delays, signal integrity problems and limited scalability as the number of cores grows. Network-on-Chip is a recently introduced paradigm that addresses the communication problems of System-on-Chip architectures. In this paper we propose an optimized hardware architecture for the H.264/AVC using a Network-on-Chip. First, we implement a 3 × 3 mesh Network-on-Chip, and then we connect the hardware IPs proposed for the H.264/AVC using a specific Network-Interface. We use only the information available on the H.264/AVC to manually choose the position of the IPs on the network. The synthesis results of our implementations are compared with those of an ordinary shared-bus based network. Finally, we embed the Network-on-Chip based system in a complete System-on-Chip using the MicroBlaze processor and we compare the results.

Introduction

Some years ago, researchers predicted that future systems would contain many processing elements (PEs) or hardware IPs (Intellectual Property cores) and would be built from billions of transistors integrated on the same chip. With newer semiconductor fabrication technologies (50 nm, for example), such systems require a new communication system together with new interconnections between the components inside the chip. Currently, semiconductor fabrication technology is around 28 nm, which increases the number of transistors on the same silicon area and makes it possible to integrate further IPs on the same chip. To meet the growing demands of customers, various complex systems integrating heterogeneous processing elements and applications have been proposed. These software/hardware IPs generally come from various disciplines and require a more adequate communication system than the shared bus used in most existing systems. For instance, in aircraft control systems [1], hardware IPs can be sensors, analogue-to-digital converters, digital comparators, processing modules, etc. Traditionally, the shared bus is widely used in System-on-Chip (SoC) designs with a small number of cores. However, it is well known that shared-bus performance decreases in proportion to the number of cores used. Consequently, shared buses are not considered appropriate for future multi-core systems with a large number of cores [2], and new architectures and scalable design approaches are needed. To cope with the growing interest in interconnection infrastructure, the Network-on-Chip (NoC) concept has been introduced, which benefits computer systems by providing higher levels of performance and reliability [1].

Network-on-Chip is a recently introduced paradigm that addresses the communication problems of SoC architectures. Replacing the shared-bus network with a NoC offers numerous advantages: it reduces SoC manufacturing cost, time to market, time to volume and design risk, and it increases SoC performance. NoC performance differs from one topology to another. As a result, topology selection is the very first step in designing NoC systems, and the mesh is the commonly accepted topology in several studies [3]. However, mapping applications represented by weighted task graphs onto the mesh architecture is an NP-hard problem [4], especially for a large number of IPs. Application mapping is one of the most important dimensions in NoC research. It assigns the cores of the application to the routers of the NoC topology, which affects the overall performance and power requirements of the system [5].
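The mapping problem described above can be made concrete with a small sketch. It is not the placement method used in this paper (the IPs are positioned manually); it only illustrates a common cost model, assumed here, in which each task-graph edge is weighted by its traffic and multiplied by the Manhattan hop distance between the two tiles. The core names, the traffic weights and the function names are hypothetical, and the exhaustive search is only practical for a handful of cores, which is consistent with the problem being NP-hard in general.

```python
from itertools import permutations

def hops(tile_a, tile_b, width):
    """Manhattan distance between two tiles numbered row by row."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)

def mapping_cost(placement, traffic, width):
    """Sum of (traffic weight x hop count) over all task-graph edges."""
    return sum(weight * hops(placement[src], placement[dst], width)
               for (src, dst), weight in traffic.items())

def best_mapping(cores, traffic, width, height):
    """Exhaustive search over placements; feasible only for small graphs."""
    best_cost, best_placement = None, None
    for tiles in permutations(range(width * height), len(cores)):
        placement = dict(zip(cores, tiles))
        cost = mapping_cost(placement, traffic, width)
        if best_cost is None or cost < best_cost:
            best_cost, best_placement = cost, placement
    return best_cost, best_placement

# Hypothetical task graph: a five-stage chain with illustrative weights.
CORES = ["DCT", "Q", "IQ", "IDCT", "DF"]
TRAFFIC = {("DCT", "Q"): 16, ("Q", "IQ"): 16,
           ("IQ", "IDCT"): 16, ("IDCT", "DF"): 16}

if __name__ == "__main__":
    row_major = dict(zip(CORES, range(len(CORES))))
    print("row-major cost:", mapping_cost(row_major, TRAFFIC, 3))
    print("best cost:", best_mapping(CORES, TRAFFIC, 3, 3)[0])
```

Placing the chain in row-major order forces one edge to wrap from the end of the first row to the start of the second, whereas the search finds a placement in which every pair of communicating cores sits on neighbouring tiles.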

On the other hand, the research and industrial communities are showing wide interest in real-time, high-quality video coding for different applications. The H.264/AVC [6] is a recent standard for high-performance video coding; it can be successfully exploited in several scenarios, including digital video broadcasting, high-definition TV and DVD-based systems, which require sustaining up to tens of Mbits/s [7]. While hardware architectures for the H.264/AVC and its elementary modules have often been studied, very little work [8], [9] has been published on connecting these modules to the communication system of a SoC, especially using a NoC. This paper presents a detailed hardware implementation of a video coding application using an optimized NoC architecture. We demonstrate the benefits of parallel communication using a case study of the H.264 video encoder application. Our proposed hardware implementations of the elementary modules of the H.264/AVC, together with modules proposed by other authors, are used to compare connections based on a shared bus and on an N × N mesh NoC.

The rest of this paper is organized as follows. In Section 2, we introduce the Network-on-Chip as a new communication system. In Section 3, we examine several hardware implementations of the H.264/AVC modules and the integration of these modules into a System-on-Chip using a standard bus. Section 4 summarizes the proposed communication system for the H.264/AVC using a specific Network-on-Chip. Then, in Section 5, we present synthesis results with comparisons. Finally, in Section 6, we present the conclusions and perspectives for future work.

Section snippets

Network on Chip (NoC)

On-chip communications have traditionally been provided by direct connections and shared buses. In a direct connection, the transmitter is physically connected to the receiver. Direct connections provide the best performance in terms of bandwidth. However, they are difficult to design and to reuse in complex systems that have a large number of processing units [2]. A bus, on the contrary, may be easily reused, but it allows only a single communication at a given time between modules
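The contrast between a shared bus and concurrent links can be illustrated with a toy timing model. This is a minimal sketch under stated assumptions, not a model taken from the paper: every transfer has a fixed duration in cycles, the bus carries one transfer at a time, and the idealized alternative lets transfers overlap whenever they do not share a source or destination module. The module names, the durations and both function names are hypothetical.

```python
def bus_completion_time(transfers):
    """Shared bus: transfers are serialized, so durations simply add up."""
    return sum(cycles for _, _, cycles in transfers)

def concurrent_completion_time(transfers):
    """Idealized dedicated links: a transfer starts as soon as both of its
    endpoints are free, so independent transfers overlap in time."""
    free_at = {}   # module name -> first cycle at which its port is free
    finish = 0
    for src, dst, cycles in transfers:
        start = max(free_at.get(src, 0), free_at.get(dst, 0))
        end = start + cycles
        free_at[src] = free_at[dst] = end
        finish = max(finish, end)
    return finish

# Hypothetical transfers: (source, destination, duration in cycles).
TRANSFERS = [
    ("DCT", "Q", 16),
    ("IQ", "IDCT", 16),
    ("Q", "IQ", 16),
    ("IDCT", "DF", 16),
]

if __name__ == "__main__":
    print("shared bus:", bus_completion_time(TRANSFERS), "cycles")
    print("concurrent:", concurrent_completion_time(TRANSFERS), "cycles")
```

The model ignores arbitration overhead, routing latency and link contention, so it only bounds the gain that parallel communication could offer; it does not predict the figures of any real interconnect.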

The H.264 Advanced Video Coding

In this section, we present a case study of the H.264 encoder in which we show several hardware implementations of its elementary modules. We start by describing the H.264/AVC, with details of the hardware implementations of the elementary modules. We then continue with a hardware implementation of this encoder in a complete SoC based on the MicroBlaze processor. We use the PLB (Processor Local Bus) to build the SoC and we use specific interfaces to connect various modules

The proposed communication system for the H.264 encoder

In Fig. 2, we summarize the H.264 encoder. Some elementary modules are common to the intra-coding and inter-coding chains. This common part is mainly composed of the direct DCT, the direct quantization, the inverse quantization, the inverse DCT and the deblocking filter. This involves multiple data at the inputs and outputs of these processing modules, as specified in [14]. Several solutions have been proposed to solve this problem in the case of a hardware implementation
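To give a sense of what the first module of this common part computes, the following sketch applies the 4 × 4 integer core transform defined by the H.264/AVC standard to one residual block, as Y = Cf · X · Cf^T. It is a software illustration only and does not reflect the hardware architecture used in this paper; the post-scaling that the standard folds into quantization is deliberately left out, and the function names are hypothetical.

```python
# Forward 4x4 core transform matrix of H.264/AVC (integer DCT approximation).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def matmul4(a, b):
    """Multiply two 4x4 integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose4(m):
    """Transpose a 4x4 matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]

def forward_core_transform(block):
    """Return Cf * X * Cf^T for a 4x4 residual block X (no scaling)."""
    return matmul4(matmul4(CF, block), transpose4(CF))

if __name__ == "__main__":
    flat = [[1] * 4 for _ in range(4)]   # constant residual block
    for row in forward_core_transform(flat):
        print(row)
```

A constant block concentrates all of its energy in the top-left (DC) coefficient, which is the behaviour expected of a DCT-like transform. Because the matrix holds only the values ±1 and ±2, a hardware version needs only additions and shifts.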

Results and discussions

The hardware architectures used for the H.264/AVC modules are described in VHDL at the RTL level. These architectures are simulated and synthesized to validate their performance and to calculate the processing time for each block of pixels. The proposed NoC is also described in VHDL, and then simulated and synthesized separately. Several adaptations are required before the IPs can be connected. We use ModelSim 6.1 for simulation and Xilinx ISE 12.2 for synthesis. The EDK tool is used to design the

Conclusion

Most System-on-Chip applications use a shared-bus interconnection to ensure communication between their integrated IPs. The bus is used because of its low cost and its simple control characteristics. However, such a shared-bus interconnection has limited scalability. Because only one master can use the bus at a time, all bus accesses must be serialized by the arbiter. Network-on-Chip is a recently introduced paradigm to overcome the bus communication problems of SoC

Acknowledgements

This research was supported by the Tassily project (2011–2014), CMEP financing contract No. 11MDU844 and EGIDE contract No. 24427QD. All synthesis results presented in this paper were obtained using FPGA-based boards from the LE2I laboratory, Burgundy University. We would also like to thank Professor Bourennane El-Bay, from Burgundy University, and Pr. Toumi Salah, from Annaba University, for their useful comments. The authors would also like to thank all the anonymous reviewers, for all their

References (30)

  • Dragomir Milojevic et al., Power Dissipation of the Network-on-Chip in Multi-Processor System-on-Chip Dedicated for Video Coding Applications, J. Signal Process. Syst. (2009)
  • D. Milojevic, L. Montperrus, D. Verkest, Power dissipation of the network-on-chip in a system-on-chip for MPEG-4 video...
  • Davide Bertozzi, Antoine Jalabert, Srinivasan Murali, Rutuparna Tamhankar, Stergios Stergiou, Luca Benini, Giovanni De...
  • Y.W. Huang et al., Analysis, fast algorithm, and VLSI architecture design for H.264/AVC Intra Frame Coder, IEEE Transact. Circuits Syst. Video Tech (2005)
  • Kamel Messaoudi, El-Bay Bourennane, Salah Toumi, Gilberto Ochoa, Performance Comparison of Two Hardware Implementations...

Kamel Messaoudi received the engineering degree in automatic control from the University of Annaba, Algeria (1997), and the magister degree in industrial computing and image processing from the University of Guelma, Algeria (2000). He received the PhD degree in electronics from the Badji Mokhtar University, Annaba (2012) and the PhD degree in instrumentation and image processing from Burgundy University, Dijon, France (2012). He is an associate professor in the department of electronics, University Constantine 1, Algeria. His research interests include image processing, embedded systems, FPGA design and real-time implementation.

Hichem Mayache received the engineering degree in automatic control from the University of Annaba, Algeria. He is currently pursuing the PhD in electronics at the University of Annaba. He is an Assistant Professor in the department of Electronics, University of Tebessa, Algeria. His research interests include Network-on-Chip, telecommunication, embedded systems, FPGA design and real-time implementation.

Atef Benhaoues received the engineering degree in automatic control from the University of Annaba, Algeria. He is currently pursuing the PhD in electronics at the University of Annaba. He is an Assistant Professor in the department of Electronics, University of Djelfa, Algeria. His research interests include Network-on-Chip, telecommunication, embedded systems, FPGA design and real-time implementation.

El Bay Bourennane is Professor of Electronics at the LE2I laboratory (Laboratory of Electronics, Computer Science and Image), University of Burgundy, Dijon, France.

Salah Toumi is Professor of Electronics and director of the LERICA laboratory, Badji Mokhtar University, Annaba, Algeria. His research interests include telecommunication, embedded systems and micro-electronics.
