Parallel, Pipelined and Folded Architectures for Computation of 1-D and 2-D DCT in Image and Video Codec

Journal of VLSI signal processing systems for signal, image and video technology

Abstract

Several parallel, pipelined and folded architectures with different throughput rates are presented for computing the DCT, one of the fundamental operations in image/video coding. The paper begins with a new decomposition algorithm for the 1-D DCT coefficient matrix. The 2-D DCT problem is then converted into a corresponding 1-D problem through a regular index mapping technique. Depending on the trade-off between hardware complexity and speed, the derived decomposition algorithm is then mapped onto different parallel-pipelined and folded architectures that realize the butterfly operations and the post-processing operations. Compared to other DCT processors, the proposed parallel-pipelined architectures need no intermediate transpose memory and offer modularity, regularity, locality, scalability, and pipelinability, with arithmetic hardware cost proportional to the logarithm of the transform length.
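For readers unfamiliar with the butterfly operations mentioned in the abstract, the following is a minimal sketch of the general idea. It is not the paper's decomposition or index mapping, which the abstract does not spell out. It assumes the unnormalized DCT-II definition and shows one textbook butterfly stage, in which sums of mirrored inputs give the even-indexed coefficients and differences give the odd-indexed ones. The 2-D routine uses the conventional row-column method with an explicit transpose, which is exactly the step the proposed architectures are said to avoid. All function names are hypothetical.

```python
import math


def dct_1d(x):
    """Direct O(N^2) unnormalized DCT-II (assumed definition):
    X[k] = sum_n x[n] * cos(pi * (2n + 1) * k / (2N))."""
    n_len = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_len))
            for n in range(n_len))
        for k in range(n_len)
    ]


def dct_1d_butterfly(x):
    """Same transform computed with one butterfly stage (even length only)."""
    n_len = len(x)
    half = n_len // 2
    # Butterfly: add and subtract mirrored input samples.
    sums = [x[n] + x[n_len - 1 - n] for n in range(half)]
    diffs = [x[n] - x[n_len - 1 - n] for n in range(half)]
    out = [0.0] * n_len
    # Even-indexed outputs are a half-length DCT-II of the sums.
    for k, value in enumerate(dct_1d(sums)):
        out[2 * k] = value
    # Odd-indexed outputs use the differences with odd-frequency cosines.
    for k in range(half):
        out[2 * k + 1] = sum(
            diffs[n] * math.cos(math.pi * (2 * n + 1) * (2 * k + 1) / (2 * n_len))
            for n in range(half)
        )
    return out


def dct_2d(block):
    """Conventional row-column 2-D DCT (uses a transpose between passes)."""
    rows = [dct_1d(list(r)) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

Applying the butterfly split recursively to the half-length transform is what yields a number of stages that grows with the logarithm of the transform length.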

Cite this article

Hsiao, SF., Tseng, JM. Parallel, Pipelined and Folded Architectures for Computation of 1-D and 2-D DCT in Image and Video Codec. The Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology 28, 205–220 (2001). https://doi.org/10.1023/A:1011165524744
