Abstract
Addition is the most frequent floating-point operation in modern microprocessors. Because of its complex shift-add-shift-round dataflow, floating-point addition can have a long latency. To maximize system performance, the floating-point adder must be designed for minimum latency while still providing maximum throughput. This paper proposes a new floating-point addition algorithm that exploits the ability of dynamically scheduled processors to use functional units that complete in variable time. Because certain operand combinations do not require every step of the addition dataflow, those additions can complete early, which reduces the average latency. Simulation on SPECfp92 applications demonstrates that this algorithm achieves a speedup of 1.33 in average addition latency while maintaining single-cycle throughput.
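The idea of average latency in a variable-latency unit can be illustrated with a short calculation. The sketch below is a minimal model, not the paper's simulator: the function names, the three latency classes, and the operand fractions are all hypothetical and chosen only to show how a weighted mean latency and the resulting speedup over a fixed-latency adder would be computed.

```python
import math


def average_latency(paths):
    """Return the mean latency in cycles.

    `paths` is a list of (fraction_of_additions, latency_in_cycles) pairs.
    The fractions are assumed to cover all additions, so they must sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in paths)
    if not math.isclose(total_fraction, 1.0):
        raise ValueError("path fractions must sum to 1")
    return sum(fraction * cycles for fraction, cycles in paths)


def speedup(fixed_latency, paths):
    """Speedup in average latency relative to a fixed-latency adder."""
    return fixed_latency / average_latency(paths)


# Hypothetical operand mix (NOT measured data from the paper):
# 20% of additions finish in 1 cycle, 30% in 2 cycles, 50% need all 3 cycles.
example_paths = [(0.2, 1), (0.3, 2), (0.5, 3)]

if __name__ == "__main__":
    print("average latency:", average_latency(example_paths))
    print("speedup over a 3-cycle adder:", speedup(3, example_paths))
```

In this model the speedup depends only on how often the shorter paths are taken, which is why the abstract reports a result measured on a specific benchmark suite rather than a fixed figure.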
This work was supported by NSF under grant MIP93-13701.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oberman, S.F., Flynn, M.J. (1996). A variable latency pipelined floating-point adder. In: Bougé, L., Fraigniaud, P., Mignotte, A., Robert, Y. (eds) Euro-Par'96 Parallel Processing. Euro-Par 1996. Lecture Notes in Computer Science, vol 1124. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0024701
DOI: https://doi.org/10.1007/BFb0024701
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61627-6
Online ISBN: 978-3-540-70636-6
eBook Packages: Springer Book Archive