Novel algorithm for complex bit reversal: employing vector permutation and branch reduction methods

Journal of Zhejiang University-SCIENCE A

Abstract

We present novel vector permutation and branch reduction methods to minimize the number of execution cycles for bit reversal algorithms. The new methods are applied to single instruction multiple data (SIMD) parallel implementation of complex data floating-point fast Fourier transform (FFT). The number of operational clock cycles can be reduced by an average factor of 3.5 by using our vector permutation methods and by 1.1 by using our branch reduction methods, compared with conventional implementations. Experiments on MPC7448 (a well-known SIMD reduced instruction set computing processor) demonstrate that our optimal bit-reversal algorithm consistently takes fewer than two cycles per element in complex array operations.
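For orientation, the conventional scalar bit-reversal reordering that such methods are compared against can be sketched as follows. This is a minimal illustration, not the authors' vector permutation or branch reduction method; the function names `bit_reverse_index` and `bit_reverse_permute` are hypothetical, and the array length is assumed to be a power of two. The `if i < j` test is the kind of per-element branch that a branch reduction method would aim to avoid.

```python
def bit_reverse_index(i, bits):
    """Return i with its lowest `bits` bits written in reverse order."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)  # shift the lowest bit of i into r
        i >>= 1
    return r


def bit_reverse_permute(data):
    """Reorder a list of complex values in place into bit-reversed order.

    Assumption for this sketch: len(data) is a power of two.
    """
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    bits = n.bit_length() - 1
    for i in range(n):
        j = bit_reverse_index(i, bits)
        if i < j:  # swap each pair once; self-paired indices are skipped
            data[i], data[j] = data[j], data[i]
    return data


if __name__ == "__main__":
    samples = [complex(k, -k) for k in range(8)]
    print(bit_reverse_permute(samples))
```

Because bit reversal is its own inverse, applying `bit_reverse_permute` twice restores the original order, which is a convenient sanity check for any optimized variant.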

Author information

Corresponding author

Correspondence to Feng Yu.

About this article

Cite this article

Yu, F., Wang, Zk. & Ge, Rf. Novel algorithm for complex bit reversal: employing vector permutation and branch reduction methods. J. Zhejiang Univ. Sci. A 10, 1492–1499 (2009). https://doi.org/10.1631/jzus.A0920290

  • DOI: https://doi.org/10.1631/jzus.A0920290
