An extra-component method for evaluating fast matrix-vector multiplication with special functions

Terekhov, Andrew V.

doi:10.1007/s11075-022-01383-y

An extra-component method for evaluating fast matrix-vector multiplication with special functions

Original Paper
Published: 10 August 2022

Volume 92, pages 2189–2217, (2023)
Cite this article

Numerical Algorithms Aims and scope Submit manuscript

Andrew V. Terekhov ORCID: orcid.org/0000-0002-6585-9057¹

222 Accesses
1 Altmetric
Explore all metrics

Abstract

In calculating integral or discrete transforms, use has been made of fast algorithms for multiplying vectors by matrices whose elements are specified as values of special (Chebyshev, Legendre, Jacobi, Laguerre, etc.) functions. To achieve higher efficiency, a convenient general approach for calculating matrix-vector products for some class of problems is proposed. A series of fast simple-structure algorithms developed under this approach can be efficiently implemented with software based on modern microprocessors. Computational experiments show that the method has a precomputation complexity of $O(N^2 \log N)$ and an execution complexity of $O(N \log N)$ in general. This agrees with the analytical estimates obtained for trigonometric functions. In addition, for trigonometric functions, the number of arithmetic operations at the preliminary stage can be decreased to $O(N \log N)$. The results of computational experiments with the algorithms show that these procedures can decrease the calculation time by several orders of magnitude compared with a conventional direct method of matrix-vector multiplication.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

An Algorithm for the Vandermonde Matrix-Vector Multiplication with Reduced Multiplicative Complexity

Polynomial Evaluation and Interpolation and Transformations of Matrix Structures

An Extension and Efficient Calculation of the Horner’s Rule for Matrices

Data availability

Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.

References

Cooley, J., Tukey, J.: An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation 19(90), 297–301 (1965)
MathSciNet MATH Google Scholar
Dutt, A., Rokhlin, V.: Fast Fourier transforms for nonequispaced data. SIAM Journal on Scientific Computing 14(6), 1368–1393 (1993)
MathSciNet MATH Google Scholar
Greengard, L., Lee, J.-Y.: Accelerating the nonuniform fast Fourier transform. SIAM Review 46(3), 443–454 (2004)
MathSciNet MATH Google Scholar
Plonka, G., Potts, D., Steidl, G., Tasche, M.: Numerical Fourier Analysis. Springer International Publishing, New York (2018)
MATH Google Scholar
Jackson, J.I., Meyer, C.H., Nishimura, D.G., Macovski, A.: Selection of a convolution function for Fourier inversion using gridding (computerised tomography application). IEEE Transactions on Medical Imaging 10(3), 473–478 (1991)
Google Scholar
Boyd, J.P.: A fast algorithm for Chebyshev, Fourier, and sinc interpolation onto an irregular grid. Journal of Computational Physics 103(2), 243–257 (1992)
MathSciNet MATH Google Scholar
Beylkin, G.: On the fast Fourier transform of functions with singularities. Applied and Computational Harmonic Analysis 2(4), 363–381 (1995)
MathSciNet MATH Google Scholar
Liu, Q.H., Nguyen, N.: An accurate algorithm for nonuniform fast Fourier transforms (NUFFT’s). IEEE Microwave and Guided Wave Letters 8(1), 18–20 (1998)
Google Scholar
Fessler, J.A., Sutton, B.P.: Nonuniform fast Fourier transforms using min-max interpolation. IEEE Transactions on Signal Processing 51(2), 560–574 (2003)
MathSciNet MATH Google Scholar
Anderson, C., Dahleh, M.D.: Rapid computation of the discrete Fourier transform. SIAM Journal on Scientific Computing 17(4), 913–919 (1996)
MathSciNet MATH Google Scholar
Bailey, D.H., Swarztrauber, P.N.: The fractional Fourier transform and applications. SIAM Review 33(3), 389–404 (1991)
MathSciNet MATH Google Scholar
Ruiz-Antolín, D., Townsend, A.: A nonuniform fast Fourier transform based on low rank approximation. SIAM Journal on Scientific Computing 40(1), A529–A547 (2018)
MathSciNet MATH Google Scholar
O’Neil, M., Woolfe, F., Rokhlin, V.: An algorithm for the rapid evaluation of special function transforms. Applied and Computational Harmonic Analysis 28(2), 203–226 (2010)
MathSciNet MATH Google Scholar
NIST Digital Library of Mathematical Functions. http://dlmf.nist.gov/, Release 1.1.2 of 2021-06-15. Olver, F.W.J., Olde Daalhuis, A.B., Lozier, D.W., Schneider, B.I., Boisvert, R.F., Clark, C.W., Miller, B.R., Saunders, B.V., Cohl, H.S., McClain, M.A. (eds)
Ahmed, N., Natarajan, T., Rao, K.R.: Discrete cosine transform. IEEE Transactions on Computers C–23(1), 90–93 (1974)
MathSciNet MATH Google Scholar
Chen, W.H., Smith, C., Fralick, S.: A fast computational algorithm for the discrete cosine transform. IEEE Transactions on Communications 25(9), 1004–1009 (1977)
MATH Google Scholar
Driscoll, T.A., Hale, N., Trefethen, L.N.: Chebfun Guide. Pafnuty Publications (2014)
Alpert, B., Rokhlin, V.: A fast algorithm for the evaluation of Legendre expansions. SIAM Journal on Scientific and Statistical Computing 12(1), 158–179 (1991)
MathSciNet MATH Google Scholar
Hale, N., Townsend, A.: A fast, simple, and stable Chebyshev-Legendre transform using an asymptotic formula. SIAM Journal on Scientific Computing 36(1), A148–A167 (2014)
MathSciNet MATH Google Scholar
Shen, J., Wang, Y., Xia, J.: Fast structured Jacobi-Jacobi transforms. Mathematics of Computation 88, 1743–1772 (2019)
MathSciNet MATH Google Scholar
Slevinsky, R.M.: On the use of Hahn’s asymptotic formula and stabilized recurrence for a fast, simple and stable Chebyshev-Jacobi transform. IMA Journal of Numerical Analysis 38(1), 102–124 (2017)
MathSciNet MATH Google Scholar
Micheli, E., Viano, G.A.: The expansion in Gegenbauer polynomials: A simple method for the fast computation of the Gegenbauer coefficients. Journal of Computational Physics 239, 112–122 (2013)
MathSciNet MATH Google Scholar
Bremer, J., Yang, H.: Fast algorithms for Jacobi expansions via nonoscillatory phase functions. IMA Journal of Numerical Analysis 40(3), 2019–2051 (2019)
MathSciNet MATH Google Scholar
Townsend, A., Webb, M., Olver, S.: Fast polynomial transforms based on Toeplitz and Hankel matrices. Math. Comput. 87 (2016)
Bostan, A., Salvy, B., Schost, E.: Fast conversion algorithms for orthogonal polynomials. Linear Algebra and its Applications 432(1), 249–258 (2010)
MathSciNet MATH Google Scholar
Tygert, M.: Recurrence relations and fast algorithms. Applied and Computational Harmonic Analysis 28(1), 121–128 (2010)
MathSciNet MATH Google Scholar
Potts, D., Steidl, G., Tasche, M.: Fast algorithms for discrete polynomial transforms. Mathematics of Computations 67, 1577–1590 (1998)
MathSciNet MATH Google Scholar
Keiner, J.: Computing with expansions in Gegenbauer polynomials. SIAM Journal on Scientific Computing 31(3), 2151–2171 (2009)
MathSciNet MATH Google Scholar
Leibon, G., Rockmore, D.N., Park, W., Taintor, R., Chirikjian, G.S.: A fast Hermite transform. Theor. Comput. Sci. 409(2), 211–228 (2008) Symbolic-Numerical Computations
Beylkin, G., Coifman, R., Rokhlin, V.: Fast wavelet transforms and numerical algorithms I. Communications on Pure and Applied Mathematics 44(2), 141–183 (1991)
MathSciNet MATH Google Scholar
Aharoni, G., Averbuch, A., Coifman, R., Israeli, M.: Local cosine transform - a method for the reduction of the blocking effect in JPEG. Journal of Mathematical Imaging and Vision 3(1), 7–38 (1993)
MathSciNet MATH Google Scholar
Matviyenko, G.: Optimized local trigonometric bases. Applied and Computational Harmonic Analysis 3(4), 301–323 (1996)
MathSciNet MATH Google Scholar
Mohlenkamp, M.J.: A fast transform for spherical harmonics. Journal of Fourier Analysis and Applications 5(2), 159–184 (1999)
MathSciNet MATH Google Scholar
Michielssen, E., Boag, A.: A multilevel matrix decomposition algorithm for analyzing scattering from large structures. IEEE Transactions on Antennas and Propagation 44(8), 1086–1093 (1996)
Google Scholar
Yin, F., Wu, J., Song, J., Yang, J.: A high accurate and stable Legendre transform based on block partitioning and butterfly algorithm for NWP. Mathematics 7(10), (2019)
Wedi, N.P., Hamrud, M., Mozdzynski, G.: A fast spherical harmonics transform for global nwp and climate models. Monthly Weather Review 141(10), 3450–3461 (2013)
Google Scholar
Boyd, J.P.: Chebyshev and Fourier Spectral Methods. Dover, New York (2001)
MATH Google Scholar
Harris, F.J.: On the use of windows for harmonic analysis with the discrete Fourier transform. Proceedings of the IEEE 66(1), 51–83 (1978)
Google Scholar
Prabhu, K.M.M.: Window functions and their applications in signal processing. CRC Press, Boca Raton (2018)
Google Scholar
Kaiser, J., Schafer, R.: On the use of the I0-sinh window for spectrum analysis. IEEE Transactions on Acoustics, Speech, and Signal Processing 28(1), 105–107 (1980)
Google Scholar
Mallat, S.: A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way, 3rd edn. Academic Press Inc, USA (2008)
Google Scholar
Schaeffer, N.: Efficient spherical harmonic transforms aimed at pseudospectral numerical simulations. Geochemistry, Geophysics, Geosystems 14(3), 751–758 (2013)
Google Scholar
Terekhov, A.V.: The Laguerre finite difference one-way equation solver. Computer Physics Communications 214, 71–82 (2017)
MathSciNet MATH Google Scholar
Terekhov, A.V.: The stabilization of high-order multistep schemes for the Laguerre one-way wave equation solver. Journal of Computational Physics 368, 115–130 (2018)
MathSciNet MATH Google Scholar
Mikhailenko, B.G.: Spectral Laguerre method for the approximate solution of time dependent problems. Applied Mathematics Letters 12, 105–110 (1999)
MathSciNet MATH Google Scholar
Abate, J., Choudhury, G., Whitt, W.: On the Laguerre method for numerically inverting Laplace transforms. Informs J. on Computing 8(4), 413–427 (1996)
MATH Google Scholar
Terekhov, A.V.: Generating the Laguerre expansion coefficients by solving a one-dimensional transport equation. Numerical Algorithms 89(1), 303–322 (2021)
MathSciNet MATH Google Scholar
Mikhailenko, B.G.: Simulation of seismic wave propagation in heterogeneous media. Siberian J. of Numer. Mathematics 6, 415–429 (2003)
MATH Google Scholar
Nussbaumer, H.J.: Fast Fourier Transform and Convolution Algorithms. Springer-Verlag, New York (1982)
MATH Google Scholar
Gil, A., Segura, J., Temme, N.M.: Fast and reliable high-accuracy computation of Gauss-Jacobi quadrature. Numerical Algorithms 87(4), 1391–1419 (2020)
MathSciNet MATH Google Scholar

Download references

Acknowledgements

The study of proposed algorithms was financially supported by RFBR and Novosibirsk region (Project No. 20-41-540003).

Author information

Authors and Affiliations

Novosibirsk State Technical University, 630073, Novosibirsk, Russia
Andrew V. Terekhov

Authors

Andrew V. Terekhov
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Andrew V. Terekhov.

Ethics declarations

Conflict of interest

The author declares no competing interests.

Appendix: An efficient precomputation procedure for trigonometric transforms

In the case of calculation of nonuniform trigonometric transformations, the precomputation costs of Algorithms 1 and 2 can be reduced from $O(N^2\log N)$ to $O(N \log N +Nq)$ arithmetic operations, where q is determined by the required computation accuracy. As an example, let us consider an efficient procedure for calculating a row of the compressed matrix (7).

First, according to the inverse convolution theorem, for any $\mathbf {X},\mathbf {Y}\in \mathbb {C}^{N}$ we have

$$\begin{aligned} \mathcal {F}\left( \mathbf {X} \odot \mathbf {Y} \right) = \mathcal {F}\mathbf {X} *\mathcal {F}\mathbf {Y}=\tilde{\mathbf {X}} *\tilde{\mathbf {Y}}, \end{aligned}$$

where $\odot$ denotes the component-wise product and

$$\begin{aligned} \left( \tilde{\mathbf {X}} *\tilde{\mathbf {Y}}\right) _n\triangleq \frac{1}{\sqrt{N}}\sum \limits _{m=0}^{N-1}\tilde{x}_m\tilde{y}_{(n-m)\text {mod}\ N}, \quad n=0,\ldots ,N-1. \end{aligned}$$

(A.1)

Figure 2 shows that the Kaiser window can be approximated by a small number of coefficients of the Fourier series denoted here as $\tilde{x}_m$. Since in the discrete case for the Kaiser window no efficient formula for calculating individual Fourier components is known, the calculation of several components of the spectrum $\tilde{x}_m$ using FFT will require $O(N\log N)$ operations. This operation is performed once for all columns or rows of the transformation matrix.

Second, Fig. 3b shows that there is no need to calculate all elements of a compressed matrix row by formula (A.1). It is sufficient to calculate about 24 convolutions of the form (A.1) for each row to ensure an accuracy of transformation of the order $10^{-15}$.

Third, depending on the trigonometric transformation type given by the matrix A, the individual DFT components for the rows of the matrix A can be calculated by one of the following formulas:

$$\begin{aligned}&\tilde{y}^{\mathrm {cos}}_k(\theta )=\frac{1}{\sqrt{N}}\sum\nolimits_{j=0}^{N-1}\cos \left( j\theta \right) \omega _N^{jk}\nonumber \\= & {} \frac{1}{\sqrt{N}}\frac{\omega _N^k\left( (\left( \cos \left( N \theta \right) -1)\cos \left( \theta \right) +\sin \left( \theta \right) \sin \left( N \theta \right) \right) -(\cos \left( N \theta \right) -1)\omega _N^{k}\right) }{\omega _N^{2k}-2 \omega _N^{k} \cos \left( \theta \right) +1}, \end{aligned}$$

(A.2)

$$\begin{aligned} \tilde{y}^{\mathrm {sin}}_k(\theta )= & {} \frac{1}{\sqrt{N}}\sum \nolimits _{j=0}^{N-1}\sin \left( j\theta \right) \omega _N^{jk}\nonumber \\= & {} \frac{1}{\sqrt{N}}\frac{\omega _N^{k}\left( \left( \sin \left( \theta \right) (\cos \left( N \theta \right) -1)-\cos \left( \theta \right) \sin \left( N\theta \right) \right) +\sin \left( N \theta \right) \omega _N^{k}\right) }{-\omega _N^{2k}+2 \omega _N^{k} \cos \left( \theta \right) -1},\end{aligned}$$

(A.3)

$$\begin{aligned} \tilde{y}^{\mathrm {exp}}_k(\theta )= & {} \frac{1}{\sqrt{N}}\sum \nolimits _{j=0}^{N-1}{\mathrm e}^{\mathrm {i} j\theta }\omega _N^{jk}=\frac{1}{\sqrt{N}}\frac{{\mathrm e}^{\mathrm {i} N \theta }-1}{{\mathrm e}^{\mathrm {i}\theta }\omega _N^{k}-1}, \end{aligned}$$

(A.4)

where $\omega _N=\exp {\left( -{2\pi \mathrm {i}}/{N}\right) }$. The variables $\theta$ and k in formulas (A.2)-(A.4) are separated, which reduces the precomputation time. As a result, the total cost of calculating all rows of the matrix (7) will be proportional to $O(N \log N+Nq)$, where the first term is the number of operations required for a single FFT calculation for the Kaiser window, and the second term is the number of operations of N convolutions of the form (A.1) to calculate the compressed matrix elements.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Terekhov, A.V. An extra-component method for evaluating fast matrix-vector multiplication with special functions. Numer Algor 92, 2189–2217 (2023). https://doi.org/10.1007/s11075-022-01383-y

Download citation

Received: 04 May 2021
Accepted: 22 July 2022
Published: 10 August 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11075-022-01383-y

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

An extra-component method for evaluating fast matrix-vector multiplication with special functions

Abstract

Access this article

Similar content being viewed by others

An Algorithm for the Vandermonde Matrix-Vector Multiplication with Reduced Multiplicative Complexity

Polynomial Evaluation and Interpolation and Transformations of Matrix Structures

An Extension and Efficient Calculation of the Horner’s Rule for Matrices

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Appendix: An efficient precomputation procedure for trigonometric transforms

Rights and permissions

About this article

Cite this article

Keywords

Navigation

An extra-component method for evaluating fast matrix-vector multiplication with special functions

Abstract

Access this article

Similar content being viewed by others

An Algorithm for the Vandermonde Matrix-Vector Multiplication with Reduced Multiplicative Complexity

Polynomial Evaluation and Interpolation and Transformations of Matrix Structures

An Extension and Efficient Calculation of the Horner’s Rule for Matrices

Data availability

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Ethics declarations

Conflict of interest

Appendix: An efficient precomputation procedure for trigonometric transforms

Appendix: An efficient precomputation procedure for trigonometric transforms

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation