
Randomized Kaczmarz algorithm with averaging and block projection


Abstract

The randomized Kaczmarz algorithm is a simple iterative method for solving linear systems of equations. This study proposes a variant of the randomized Kaczmarz algorithm that combines block projection with weighted averaging: block projection rapidly decreases the iteration error, while averaging reduces randomness and enables parallel computation. Their combination balances the convergence rate, convergence horizon, and computational complexity. In addition, three adaptive weights are designed to balance the computations across blocks and to accelerate the proposed method. Exponential convergence is established for general linear systems (overdetermined or underdetermined, full rank or rank deficient, and consistent or inconsistent). Numerical simulations illustrate and verify the results.
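To fix ideas, the core update can be sketched as follows. This is a minimal illustrative sketch, not the paper's Algorithm 1: within each of \(\tau \) row blocks a single row is sampled with probability proportional to its squared norm, a single-row Kaczmarz projection step is computed, and the \(\tau \) steps are combined with weights \(\omega _j\) and a relaxation parameter \(\alpha \), matching the recursion used in the proof of Theorem 3.5 below. All function and variable names are hypothetical.

import numpy as np

def averaged_rk_step(A_blocks, b_blocks, x, weights, alpha=1.0, rng=None):
    # One iteration: each block draws one row (probability proportional to its
    # squared norm), computes a single-row Kaczmarz projection step, and the
    # steps are combined with the given weights and relaxation parameter alpha.
    rng = np.random.default_rng() if rng is None else rng
    step = np.zeros_like(x)
    for Aj, bj, wj in zip(A_blocks, b_blocks, weights):
        p = np.sum(Aj**2, axis=1)
        p = p / p.sum()                                 # row-sampling probabilities
        i = rng.choice(Aj.shape[0], p=p)
        ai, bi = Aj[i], bj[i]
        step += wj * (bi - ai @ x) / (ai @ ai) * ai     # row projection direction
    return x + alpha * step

With the weights \(\omega _j=\Vert {\textbf{A}}^j\Vert _F^2/\Vert {\textbf{A}}\Vert _F^2\) appearing in Theorem 3.5, the per-block computations are independent and can be executed in parallel before averaging.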


References

  1. Bai, Z.Z., Wu, W.T.: On convergence rate of the randomized Kaczmarz method. Linear Algebra Appl. 553, 252–269 (2018). https://doi.org/10.1016/j.laa.2018.05.009


  2. Bai, Z.Z., Wu, W.T.: On greedy randomized Kaczmarz method for solving large sparse linear systems. SIAM J. Sci. Comput. 40(1), A592–A606 (2018). https://doi.org/10.1137/17m1137747


  3. Bai, Z.Z., Wu, W.T.: On greedy randomized augmented Kaczmarz method for solving large sparse inconsistent linear systems. SIAM J. Sci. Comput. 43(6), A3892–A3911 (2021). https://doi.org/10.1137/20m1352235


  4. Censor, Y., Eggermont, P.P.B., Gordon, D.: Strong underrelaxation in Kaczmarz’s method for inconsistent systems. Numer. Math. 41(1), 83–92 (1983). https://doi.org/10.1007/bf01396307


  5. Chen, J.Q., Huang, Z.D.: On a fast deterministic block Kaczmarz method for solving large-scale linear systems. Numer. Algorithms (2021). https://doi.org/10.1007/s11075-021-01143-4


  6. Du, K., Si, W.T., Sun, X.H.: Randomized extended average block Kaczmarz for solving least squares. SIAM J. Sci. Comput. 42(6), A3541–A3559 (2020). https://doi.org/10.1137/20m1312629


  7. Dumitrescu, B.: On the relation between the randomized extended Kaczmarz algorithm and coordinate descent. BIT Numer. Math. 55(4), 1005–1015 (2014). https://doi.org/10.1007/s10543-014-0526-9


  8. Eggermont, P., Herman, G., Lent, A.: Iterative algorithms for large partitioned linear systems, with applications to image reconstruction. Linear Algebra Appl. 40, 37–67 (1981). https://doi.org/10.1016/0024-3795(81)90139-7


  9. Gordon, D., Gordon, R.: Component-averaged row projections: a robust, block-parallel scheme for sparse linear systems. SIAM J. Sci. Comput. 27(3), 1092–1117 (2005). https://doi.org/10.1137/040609458


  10. Gordon, D., Gordon, R.: CARP-CG: a robust and efficient parallel solver for linear systems, applied to strongly convection dominated PDEs. Parallel Comput. 36(9), 495–515 (2010). https://doi.org/10.1016/j.parco.2010.05.004


  11. Gower, R.M., Richtárik, P.: Randomized iterative methods for linear systems. SIAM J. Matrix Anal. Appl. 36(4), 1660–1690 (2015). https://doi.org/10.1137/15m1025487


  12. Gower, R.M., Molitor, D., Moorman, J., Needell, D.: On adaptive sketch-and-project for solving linear systems. SIAM J. Matrix Anal. Appl. 42(2), 954–989 (2021). https://doi.org/10.1137/19m1285846


  13. Haddock, J., Needell, D.: On Motzkin’s method for inconsistent linear systems. BIT Numer. Math. 59(2), 387–401 (2018). https://doi.org/10.1007/s10543-018-0737-6


  14. Kaczmarz, S.: Approximate solution of systems of linear equations. Int. J. Control 57(6), 1269–1271 (1993). https://doi.org/10.1080/00207179308934446


  15. Kamath, G., Ramanan, P., Song, W.Z.: Distributed randomized Kaczmarz and applications to seismic imaging in sensor network. In: 2015 International Conference on Distributed Computing in Sensor Systems, pp. 169–178 (2015). https://doi.org/10.1109/DCOSS.2015.27

  16. Li, J., Hu, Z.C.: Toeplitz lemma, complete convergence, and complete moment convergence. Commun. Stat. Theory Methods 46(4), 1731–1743 (2017)


  17. Lin, J., Zhou, D.X.: Learning theory of randomized Kaczmarz algorithm. J. Mach. Learn. Res. 16(103), 3341–3365 (2015)


  18. Liu, Y., Gu, C.Q.: On greedy randomized block Kaczmarz method for consistent linear systems. Linear Algebra Appl. 616, 178–200 (2021). https://doi.org/10.1016/j.laa.2021.01.024


  19. Loera, J.A.D., Haddock, J., Needell, D.: A sampling Kaczmarz–Motzkin algorithm for linear feasibility. SIAM J. Sci. Comput. 39(5), S66–S87 (2017). https://doi.org/10.1137/16m1073807


  20. Lorenz, D.A., Rose, S., Schöpfer, F.: The randomized Kaczmarz method with mismatched adjoint. BIT Numer. Math. 58(4), 1079–1098 (2018). https://doi.org/10.1007/s10543-018-0717-x


  21. Ma, A., Molitor, D.: Randomized Kaczmarz for tensor linear systems. BIT Numer. Math. 62(1), 171–194 (2021). https://doi.org/10.1007/s10543-021-00877-w


  22. Moorman, J.D., Tu, T.K., Molitor, D., Needell, D.: Randomized Kaczmarz with averaging. BIT Numer. Math. 61(1), 337–359 (2020). https://doi.org/10.1007/s10543-020-00824-1


  23. Morijiri, Y., Aishima, K., Matsuo, T.: Extension of an error analysis of the randomized Kaczmarz method for inconsistent linear systems. JSIAM Lett. 10, 17–20 (2018). https://doi.org/10.14495/jsiaml.10.17


  24. Necoara, I.: Faster randomized block Kaczmarz algorithms. SIAM J. Matrix Anal. Appl. 40(4), 1425–1452 (2019). https://doi.org/10.1137/19m1251643


  25. Needell, D.: Randomized Kaczmarz solver for noisy linear systems. BIT Numer. Math. 50(2), 395–403 (2010). https://doi.org/10.1007/s10543-010-0265-5


  26. Needell, D., Tropp, J.A.: Paved with good intentions: Analysis of a randomized block Kaczmarz method. Linear Algebra Appl. 441, 199–221 (2014). https://doi.org/10.1016/j.laa.2012.12.022


  27. Needell, D., Ward, R.: Batched stochastic gradient descent with weighted sampling. arXiv:1608.07641 (2017)

  28. Needell, D., Srebro, N., Ward, R.: Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. Math. Program. 155(1–2), 549–573 (2015). https://doi.org/10.1007/s10107-015-0864-7


  29. Needell, D., Zhao, R., Zouzias, A.: Randomized block Kaczmarz method with projection for solving least squares. Linear Algebra Appl. 484, 322–343 (2015). https://doi.org/10.1016/j.laa.2015.06.027


  30. Pantelimon, I., Popa, C.: Constraining by a family of strictly nonexpansive idempotent functions with applications in image reconstruction. BIT Numer. Math. 53, 527–544 (2013). https://doi.org/10.1007/s10543-012-0414-0


  31. Rebrova, E., Needell, D.: On block Gaussian sketching for the Kaczmarz method. Numer. Algorithms 86(1), 443–473 (2020). https://doi.org/10.1007/s11075-020-00895-9


  32. Richtárik, P., Takáč, M.: Stochastic reformulations of linear systems: algorithms and convergence theory. SIAM J. Matrix Anal. Appl. 41(2), 487–524 (2020). https://doi.org/10.1137/18m1179249


  33. Steinerberger, S.: Randomized Kaczmarz converges along small singular vectors. SIAM J. Matrix Anal. Appl. 42(2), 608–615 (2021). https://doi.org/10.1137/20m1350947


  34. Steinerberger, S.: A weighted randomized Kaczmarz method for solving linear systems. Math. Comput. 90(332), 2815–2826 (2021). https://doi.org/10.1090/mcom/3644


  35. Strohmer, T., Vershynin, R.: A randomized Kaczmarz algorithm with exponential convergence. J. Fourier Anal. Appl. 15(2), 262–278 (2008). https://doi.org/10.1007/s00041-008-9030-4


  36. Yi, P., Lei, J., Chen, J., Hong, Y., Shi, G.: Distributed linear equations over random networks. IEEE Trans. Autom. Control (2022). https://doi.org/10.1109/TAC.2022.3187379


  37. Zeng, Y., Han, D., Su, Y., Xie, J.: Randomized Kaczmarz method with adaptive stepsizes for inconsistent linear systems. Numer. Algorithms (2023). https://doi.org/10.1007/s11075-023-01540-x


  38. Zouzias, A., Freris, N.M.: Randomized extended Kaczmarz for solving least squares. SIAM J. Matrix Anal. Appl. 34(2), 773–793 (2013). https://doi.org/10.1137/120889897


  39. Zouzias, A., Freris, N.M.: Randomized gossip algorithms for solving Laplacian systems. In: 2015 European Control Conference (ECC), pp. 1920–1925 (2015). https://doi.org/10.1109/ECC.2015.7330819


Acknowledgements

We would like to thank the anonymous reviewers for their valuable comments, which have simplified the proofs and tremendously improved the content and quality of this paper.

Author information

Authors and Affiliations

Authors

Contributions

Both authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Zeyi Zhang and Dong Shen. The first draft of the manuscript was written by Zeyi Zhang and both authors prepared the revised version. Both authors approved the final submission.

Corresponding author

Correspondence to Dong Shen.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Communicated by Gunnar Martinsson.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported by the Beijing Natural Science Foundation (Z210002) and the National Natural Science Foundation of China (62173333).

Appendices


Proof of Lemma 3.1

For a real number sequence \(\{a_i\}\), we can verify that

$$\begin{aligned} \sum _{t=n}^k \Big ( \prod _{i=t+1}^k (1-a_i)\Big ) a_t = 1- \prod _{i=n}^k (1-a_i)\quad (k\ge n). \end{aligned}$$
(7.1)

As \(k\) goes to infinity, the left-hand side of (7.1) converges to one, since \(\prod _{i=n}^k (1-a_i)\rightarrow 0\) (as noted below for \(\varPhi (k,t)\)).

Denote \(\varPhi (k+1,t)= (1-a_k) \varPhi (k,t)\), \(\varPhi (t,t)=1\), \(\forall k\ge t\). Using (7.1), we have

$$\begin{aligned} x_{k+1}&= \varPhi (k+1,0) x_0 + \sum _{t=0}^k \varPhi (k+1,t+1)\, a_t \frac{b_t}{a_t}\nonumber \\&=\varPhi (k+1,0) x_0 + \sum _{t=0}^k \Big (\prod _{i=t+1}^k (1-a_i)\Big )\, a_t \frac{b_t}{a_t}. \end{aligned}$$
(7.2)

It is easy to see that \(\lim _{k\rightarrow \infty }\varPhi (k,t)=0\), \(\forall t\). Applying the Toeplitz Lemma [16] to the second term on the right-hand side of (7.2), we obtain that \(x_{k}\) converges to \(\lim _{k\rightarrow \infty } \frac{b_k}{a_k}\) if the limit exists. \(\square \)
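A small numerical check of identity (7.1) and of the conclusion of Lemma 3.1 can be run in NumPy, assuming the scalar recursion \(x_{k+1}=(1-a_k)x_k+b_k\) that (7.2) unrolls; the sequences chosen below are illustrative only.

import numpy as np

rng = np.random.default_rng(0)
K = 2000
a = rng.uniform(0.05, 0.5, size=K)         # a_k in (0, 1) with divergent sum
c = 3.0                                    # intended limit of b_k / a_k
b = a * (c + 1.0 / (1.0 + np.arange(K)))   # so that b_k / a_k -> c

# identity (7.1): sum_{t=n}^{k} (prod_{i=t+1}^{k} (1 - a_i)) a_t = 1 - prod_{i=n}^{k} (1 - a_i)
n, k = 5, 500
lhs = sum(np.prod(1.0 - a[t + 1:k + 1]) * a[t] for t in range(n, k + 1))
rhs = 1.0 - np.prod(1.0 - a[n:k + 1])
print(abs(lhs - rhs))                      # agrees up to machine precision

# Lemma 3.1: the iterates x_{k+1} = (1 - a_k) x_k + b_k converge to lim b_k / a_k
x = 0.0
for t in range(K):
    x = (1.0 - a[t]) * x + b[t]
print(x)                                   # close to c = 3.0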

Proof of Theorem 3.5

From (3.5), we have

$$\begin{aligned} {\textbf{x}}_k-{\textbf{x}}_*=\left( {\textbf{I}}-\alpha \sum _{j=1}^\tau \omega _j {\textbf{X}}_j\right) \left( {\textbf{x}}_{k-1}-{\textbf{x}}_*\right) +\alpha \sum _{j=1}^\tau \omega _j \left( {\textbf{A}}^j\right) ^T{\textbf{Z}}_j\theta ^j\nonumber . \end{aligned}$$

The disturbance term is

$$\begin{aligned} \sum _{j=1}^\tau \omega _j ({\textbf{A}}^j)^T{\textbf{Z}}_j\theta ^j =&\begin{pmatrix} \omega _1({\textbf{A}}^1)^T{\textbf{Z}}_1,\cdots ,\omega _\tau ({\textbf{A}}^\tau )^T{\textbf{Z}}_\tau \end{pmatrix} \begin{pmatrix} (\theta ^1)^T,\cdots ,(\theta ^\tau )^T \end{pmatrix}^T\\ =&\begin{pmatrix} \omega _1({\textbf{A}}^1)^T{\textbf{Z}}_1,\cdots ,\omega _\tau ({\textbf{A}}^\tau )^T{\textbf{Z}}_\tau \end{pmatrix} \theta . \end{aligned}$$


Note that

$$\begin{aligned} {\mathbb {E}}[{\textbf{Z}}_j]=&\sum _{i=1}^{l_j}p_{j,i} {\textbf{I}}_{\{i\}}^T \left( {\textbf{A}}^j_{\{i\}}({\textbf{A}}^j_{\{i\}})^T\right) ^{-1}{\textbf{I}}_{\{i\}} =\sum _{i=1}^{l_j}\frac{\Vert {\textbf{A}}^j_{\{i\}}\Vert ^2}{\Vert {\textbf{A}}^j\Vert _F^2} \frac{{\textbf{I}}_{\{i\}}^T{\textbf{I}}_{\{i\}}}{\Vert {\textbf{A}}^j_{\{i\}}\Vert ^2} =\frac{{\textbf{I}}}{\Vert {\textbf{A}}^j\Vert _F^2}, \end{aligned}$$

where \({\textbf{A}}^j_{\{i\}}({\textbf{A}}^j_{\{i\}})^T\) is a nonzero scalar because \({\textbf{A}}^j_{\{i\}}\) is a nonzero row vector.

Taking the expectation of the disturbance term, we have

$$\begin{aligned}&{\mathbb {E}}\left[ \sum _{j=1}^\tau \omega _j \left( {\textbf{A}}^j\right) ^T{\textbf{Z}}_j\theta ^j\right] \\&\quad =\begin{pmatrix} \omega _1\left( {\textbf{A}}^1\right) ^T{\mathbb {E}}\left[ {\textbf{Z}}_1\right] ,\omega _2 \left( {\textbf{A}}^2\right) ^T{\mathbb {E}}\left[ {\textbf{Z}}_2\right] ,\ldots ,\omega _\tau \left( {\textbf{A}}^\tau \right) ^T{\mathbb {E}}\left[ {\textbf{Z}}_\tau \right] \end{pmatrix}\theta \\&\quad =\Vert {\textbf{A}}\Vert _F^{-2} \begin{pmatrix} \left( {\textbf{A}}^1\right) ^T, \left( {\textbf{A}}^2\right) ^T,\ldots , \left( {\textbf{A}}^\tau \right) ^T \end{pmatrix}\theta =\Vert {\textbf{A}}\Vert _F^{-2} {\textbf{A}}^T\theta ={\textbf{0}}. \end{aligned}$$

Finally, noting that \({\textbf{X}}_j=({\textbf{A}}^j)^T{\textbf{Z}}_j{\textbf{A}}^j\) and \(\omega _j=\Vert {\textbf{A}}^j\Vert _F ^2/\Vert {\textbf{A}}\Vert _F ^2\), we have

$$\begin{aligned} \begin{aligned} {\mathbb {E}}[{{\textbf {x}}}_k-{{\textbf {x}}}_*] =&\left( {{\textbf {I}}}-\alpha \sum _{j=1}^\tau \frac{\Vert {{\textbf {A}}}^j\Vert _F^2 }{\Vert {{\textbf {A}}}\Vert _F^2} {\mathbb {E}}[{{\textbf {X}}}_j]\right) {\mathbb {E}}[{{\textbf {x}}}_{k-1}-{{\textbf {x}}}_*]\\ =&\left( {{\textbf {I}}}-\frac{\alpha }{\Vert {{\textbf {A}}}\Vert _F^2} {{\textbf {A}}}^T{{\textbf {A}}}\right) {\mathbb {E}}[{{\textbf {x}}}_{k-1}-{{\textbf {x}}}_*]. \end{aligned} \end{aligned}$$

Then, if \(\alpha \le 1\), we obtain \(\Vert {\mathbb {E}}[{{\textbf {x}}}_k-{{\textbf {x}}}_*]\Vert \le (1-\frac{\alpha \sigma _r({{\textbf {A}}}^T{{\textbf {A}}})}{\Vert {{\textbf {A}}}\Vert _F^2})\Vert {\mathbb {E}}[{{\textbf {x}}}_{k-1}-{{\textbf {x}}}_*]\Vert \). \(\square \)
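The identity \({\mathbb {E}}[{\textbf{Z}}_j]={\textbf{I}}/\Vert {\textbf{A}}^j\Vert _F^2\) is the key step of the argument above; the following NumPy snippet verifies it for a randomly generated block (an illustrative check only).

import numpy as np

rng = np.random.default_rng(1)
l_j, n = 6, 4
Aj = rng.standard_normal((l_j, n))          # one block A^j with l_j rows
fro2 = np.linalg.norm(Aj, "fro")**2

EZ = np.zeros((l_j, l_j))
for i in range(l_j):
    p_i = (Aj[i] @ Aj[i]) / fro2            # p_{j,i} = ||A^j_{i}||^2 / ||A^j||_F^2
    Zi = np.zeros((l_j, l_j))
    Zi[i, i] = 1.0 / (Aj[i] @ Aj[i])        # I_{i}^T (A^j_{i} (A^j_{i})^T)^{-1} I_{i}
    EZ += p_i * Zi

print(np.allclose(EZ, np.eye(l_j) / fro2))  # True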

Proof of Lemma 4.1

Denote

$$\begin{aligned} V&=\frac{\sum _{i=1}^\tau a_i^{q+2}}{\sum _{j=1}^\tau a_j^q} -\left( \frac{\sum _{i=1}^\tau a_i^{q+1}}{\sum _{j=1}^\tau a_j^q}\right) ^2\\&= \sum _{i=1}^\tau \frac{ a_i^{q}}{\sum _{j=1}^\tau a_j^q}a_i^2 -\big (\sum _{i=1}^\tau \frac{ a_i^{q}}{\sum _{j=1}^\tau a_j^q}a_i\big )^2\ge 0. \end{aligned}$$

The last inequality holds because \(V\) is the variance of a random variable that takes the value \(a_i\) with probability \(a_i^{q}/\sum _{j=1}^\tau a_j^q\), \(i=1,\ldots ,\tau \). Then, we obtain

$$\begin{aligned} u(q+1)&= \sum _{i=1}^\tau \frac{a_i^{q+1}}{\sum _{j=1}^\tau a_j^{q+1}} a_i = \frac{\sum _{j=1}^\tau a_j^{q+2}}{\sum _{j=1}^\tau a_j^{q+1}}\\&= \frac{\frac{\sum _{j=1}^\tau a_j^{q+2}}{\sum _{j=1}^\tau a_j^{q}} \sum _{j=1}^\tau a_j^{q} -\left( \frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}}\right) ^2\sum _{j=1}^\tau a_j^{q}+\left( \frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}}\right) ^2\sum _{j=1}^\tau a_j^{q}}{\sum _{j=1}^\tau a_j^{q+1}}\\&= \frac{\left( \frac{\sum _{j=1}^\tau a_j^{q+2}}{\sum _{j=1}^\tau a_j^{q}} -\left( \frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}}\right) ^2\right) \sum _{j=1}^\tau a_j^{q}+\left( \frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}}\right) ^2\sum _{j=1}^\tau a_j^{q}}{\sum _{j=1}^\tau a_j^{q+1}}\\&=\frac{V\sum _{j=1}^\tau a_j^{q}+\left( \sum _{j=1}^\tau a_j^{q+1}\right) ^2/\left( \sum _{j=1}^\tau a_j^{q}\right) }{\sum _{j=1}^\tau a_j^{q+1}}\\&=\frac{V\left( \sum _{j=1}^\tau a_j^{q}\right) ^2+\left( \sum _{j=1}^\tau a_j^{q+1}\right) ^2}{\left( \sum _{j=1}^\tau a_j^{q+1}\right) ^2}\frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}}\\&=\left( \frac{V\left( \sum _{j=1}^\tau a_j^{q}\right) ^2}{\left( \sum _{j=1}^\tau a_j^{q+1}\right) ^2} +1\right) \frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}}\\&\ge \frac{\sum _{j=1}^\tau a_j^{q+1}}{\sum _{j=1}^\tau a_j^{q}} =\sum _{i=1}^\tau \frac{a_i^{q}}{\sum _{j=1}^\tau a_j^{q}}a_i =u(q). \end{aligned}$$

\(\square \)
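The proof above shows \(u(q+1)\ge u(q)\), where \(u(q)=\sum _{i=1}^\tau a_i^{q+1}/\sum _{j=1}^\tau a_j^{q}\); a quick numerical illustration of this monotonicity, with an arbitrary positive vector (illustrative only), is the following.

import numpy as np

rng = np.random.default_rng(2)
a = rng.uniform(0.1, 2.0, size=8)          # arbitrary positive numbers a_1, ..., a_tau

def u(q):
    return np.sum(a**(q + 1)) / np.sum(a**q)

qs = np.arange(0.0, 11.0)
vals = np.array([u(q) for q in qs])
print(np.all(np.diff(vals) >= -1e-12))     # True: u(q) is nondecreasing
print(vals[0], vals[-1])                   # grows from the mean of a toward max(a)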

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Zhang, Z., Shen, D.: Randomized Kaczmarz algorithm with averaging and block projection. BIT Numer. Math. 64, 1 (2024). https://doi.org/10.1007/s10543-023-01002-9

