A variable-precision square root implementation for field programmable gate arrays

The Journal of Supercomputing

Abstract

Applications requiring variable-precision arithmetic often rely on software implementations because custom hardware is either unavailable or too costly to build. Exploiting the flexibility of Xilinx XC4010 field programmable gate arrays, we present a hardware implementation of square root that is easily tailored to any desired precision. Our design consists of three types of modules: a control logic module, a data path module that extends the precision in 4-bit increments, and an interface module that spans multiple chips. The data path design avoids the common problem of large fan-out delay in the critical path. Cycle time is independent of precision, and operation latency can be independent of interchip communication delays.

Abbreviations

  • $s_j$: square root digit of weight $2^{-j}$, with $s_j \in \{-1, 0, 1\}$

  • $S[j]$: computed square root value as of step $j$

  • $s_{sj}$: sign bit in the representation of $s_j$ in sign and magnitude form

  • $s_{mj}$: magnitude bit in the representation of $s_j$ in sign and magnitude form

  • $w[j]$: residual at step $j$ in two's complement carry-save representation

  • $a$: sum vector in the carry-save representation of $2w[j]$

  • $b$: carry vector in the carry-save representation of $2w[j]$

  • $a_i$: bit of weight $2^{-i}$ in the sum vector $a$

  • $b_i$: bit of weight $2^{-i}$ in the carry vector $b$

  • $T[j]$: $T[j] = -S[j-1]\,s_j - s_j^2\,2^{-(j+1)}$

  • $T_i$: bit of weight $2^{-i}$ in $T$
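
The symbols listed above (digits in {−1, 0, 1}, partial result S[j], residual w[j], and the term T[j]) match the usual radix-2 digit-recurrence formulation of square root. The following Python sketch is an illustration under stated assumptions, not the paper's hardware design. It assumes the recurrence w[j] = 2w[j−1] + 2T[j], an operand x in [1/4, 1), the initial values S[0] = 1 and w[0] = x − 1, and a common textbook digit-selection rule based on 2w[j−1] truncated to one fractional bit. It uses exact rationals in place of the carry-save vectors a and b, and the function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def select_digit(y):
    """Choose a digit in {-1, 0, 1} from y = 2*w[j-1].

    Assumed selection rule: truncate y to one fractional bit and
    compare against the constants 0 and -1/2.
    """
    y_hat = Fraction(floor(2 * y), 2)  # truncation toward -infinity
    if y_hat >= 0:
        return 1
    if y_hat == Fraction(-1, 2):
        return 0
    return -1


def sqrt_digit_recurrence(x, n):
    """Run n radix-2 steps; return (S[n], w[n], digits)."""
    x = Fraction(x)
    if not Fraction(1, 4) <= x < 1:
        raise ValueError("this sketch assumes 1/4 <= x < 1")
    S = Fraction(1)   # assumed S[0]
    w = x - 1         # assumed w[0] = x - S[0]^2
    digits = []
    for j in range(1, n + 1):
        s = select_digit(2 * w)
        # T[j] = -S[j-1]*s_j - s_j^2 * 2^-(j+1)
        T = -S * s - Fraction(s * s, 2 ** (j + 1))
        w = 2 * w + 2 * T          # residual update
        S += Fraction(s, 2 ** j)   # append digit of weight 2^-j
        digits.append(s)
    return S, w, digits
```

With these assumptions the residual satisfies w[j] = 2^j (x − S[j]^2) at every step, so after n steps S[n] lies within 2^−n of the true square root. The precision is set only by the iteration count n, which is the property the variable-precision design relies on.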

Cite this article

Louie, M.E., Ercegovac, M.D. A variable-precision square root implementation for field programmable gate arrays. J Supercomput 9, 315–336 (1995). https://doi.org/10.1007/BF01212874
