DOI: 10.1145/3577193.3593731
research-article

Using Additive Modifications in LU Factorization Instead of Pivoting

Published: 21 June 2023

ABSTRACT

Direct solvers for dense systems of linear equations commonly use partial pivoting to ensure numerical stability. However, pivoting can introduce significant performance overheads, such as synchronization and data movement, particularly on distributed systems. To improve the performance of these solvers, we present an alternative to pivoting in which numerical stability is obtained through additive updates. We implemented this approach using SLATE, a GPU-accelerated numerical linear algebra library, and evaluated it on the Summit supercomputer. Our approach provides better performance (up to 5-fold speedup) than Gaussian elimination with partial pivoting for comparable accuracy on most of the tested matrices. It also provides better accuracy (up to 15 more digits) than Gaussian elimination with no pivoting for comparable performance.
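
To make the idea of replacing pivoting with additive updates concrete, here is a minimal sketch in Python with NumPy. It is not the implementation evaluated in the paper and does not use SLATE. It assumes one simple scheme: during unpivoted elimination, any pivot whose magnitude falls below a threshold is pushed away from zero by adding a small value to it, which amounts to factoring A plus a low-rank diagonal perturbation, and the solve then undoes that perturbation with the Sherman-Morrison-Woodbury formula. The function names (`lu_additive`, `lu_solve`, `solve`) and the threshold `tol` are hypothetical choices made for this illustration.

```python
import numpy as np

def lu_additive(A, tol=1e-4):
    """Unpivoted LU of A; pivots smaller than tol * max|A| get an additive shift.

    Returns the packed factors and the list of (index, shift) modifications.
    Assumption for this sketch: A is square and not identically zero.
    """
    LU = np.array(A, dtype=float)
    n = LU.shape[0]
    tau = tol * np.abs(LU).max()
    mods = []
    for k in range(n):
        if abs(LU[k, k]) < tau:
            # Shifting pivot k is the same as factoring A + delta * e_k e_k^T.
            delta = tau if LU[k, k] >= 0 else -tau
            LU[k, k] += delta
            mods.append((k, delta))
        LU[k + 1:, k] /= LU[k, k]
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    return LU, mods

def lu_solve(LU, B):
    """Solve (L U) X = B with L unit lower and U upper triangular, both packed in LU."""
    X = np.array(B, dtype=float)
    n = LU.shape[0]
    for i in range(1, n):                # forward substitution
        X[i] -= LU[i, :i] @ X[:i]
    for i in range(n - 1, -1, -1):       # back substitution
        X[i] = (X[i] - LU[i, i + 1:] @ X[i + 1:]) / LU[i, i]
    return X

def solve(A, b, tol=1e-4):
    """Solve A x = b using the modified factors plus a Woodbury correction."""
    LU, mods = lu_additive(A, tol)
    y = lu_solve(LU, b)                  # solution of the modified system
    if not mods:
        return y
    idx = [k for k, _ in mods]
    deltas = np.array([d for _, d in mods])
    E = np.zeros((len(b), len(idx)))
    E[idx, np.arange(len(idx))] = 1.0    # columns e_k for each modified pivot
    Z = lu_solve(LU, E)
    # A = A_mod - E D E^T, so
    # A^{-1} = A_mod^{-1} + Z (D^{-1} - E^T Z)^{-1} E^T A_mod^{-1}.
    C = np.diag(1.0 / deltas) - Z[idx, :]
    return y + Z @ np.linalg.solve(C, y[idx])

# The zero in the (0, 0) position would break unpivoted elimination.
A = np.array([[0.0, 1.0, 2.0],
              [1.0, 1.0, 1.0],
              [2.0, 1.0, 1.0]])
b = np.array([1.0, 2.0, 3.0])
x = solve(A, b)
print(np.linalg.norm(A @ x - b))
```

The shift size trades off two effects in this sketch: a larger shift limits element growth in the factors, while the correction step becomes responsible for a larger perturbation. A production solver would typically follow the correction with iterative refinement, which is omitted here.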

          Published in

          ICS '23: Proceedings of the 37th International Conference on Supercomputing
          June 2023
          505 pages
          ISBN: 9798400700569
          DOI: 10.1145/3577193

          Copyright © 2023 ACM

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 584 of 2,055 submissions, 28%