A greedy average block Kaczmarz method for large-scale consistent systems of linear equations

Abstract: This paper presents a greedy average block Kaczmarz (GABK) method for solving large-scale consistent systems of linear equations. The GABK method introduces an extrapolation strategy to improve the GBK algorithm and to avoid computing the Moore-Penrose inverse of the submatrix of the coefficient matrix determined by the block index set. The GABK method is proved to converge linearly to the least-norm solution of the consistent linear system. Numerical examples show that the GABK method is the most efficient and effective among all the methods compared.


Introduction
We are concerned with the numerical solution of large-scale consistent systems of linear equations of the form

Ax = b, (1.1)

where A ∈ R m×n and b ∈ R m are known and x ∈ R n is the unknown to be determined. The Kaczmarz method in [1], which was revised for image reconstruction in [2] and is known there as the algebraic reconstruction technique (ART), is a simple and highly effective solver for the large-scale linear system (1.1), and has many applications such as image reconstruction in computerized tomography [2][3][4][5] and parallel computing [4,6].
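For reference, the classical Kaczmarz iteration can be sketched in a few lines. This is an illustrative Python/NumPy sketch, not code from the paper, and the cyclic row order shown here is one common choice among several:

```python
import numpy as np

def kaczmarz(A, b, n_iters=5000, x0=None):
    """Classical Kaczmarz: at each step, project the iterate onto the
    hyperplane {x : A^(i) x = b^(i)} of one row, cycling through the rows."""
    m, n = A.shape
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    row_norms = np.einsum('ij,ij->i', A, A)  # ||A^(i)||_2^2 for each row
    for k in range(n_iters):
        i = k % m                            # cyclic row selection
        x += (b[i] - A[i] @ x) / row_norms[i] * A[i]
    return x
```

Each step touches only a single row of A, which is what makes the method attractive for very large systems.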
Block Kaczmarz (BK) methods have received much attention for their high efficiency in solving (1.1). Elfving [7] and Eggermont et al. [8] first presented block iterative methods for (1.1). Needell et al. [9] proposed a randomized block Kaczmarz (RBK) algorithm for the linear least-squares problem that chooses a subsystem at random from pre-determined partitions and converges to the least-norm solution of (1.1) with an expected linear rate of convergence. Needell et al. [10] further presented a randomized double block Kaczmarz (RDBK) method for the inconsistent linear system (1.1). Chen and Huang [11] improved the error estimate in expectation and obtained a much sharper upper bound than that in [10]. Gower and Richtárik [12] presented a Gaussian Kaczmarz (GK) method for (1.1). Necoara [13] developed a unified framework of randomized average block Kaczmarz (RABK) algorithms for the consistent system (1.1), with iterations of the form

x_{k+1} = x_k − α_k Σ_{i∈J_k} ω_i^k ((A^(i) x_k − b^(i)) / ‖A^(i)‖_2^2) (A^(i))^T, (1.2)

where A^(i) denotes the ith row of the matrix A, b^(i) denotes the ith entry of the vector b, and ω_i^k is the weight of the ith row of A at the kth iteration. Niu and Zheng [14] simplified the greedy strategy in [15] to produce a greedy probability criterion with a parameter η ∈ (0, 1), and proposed a greedy block Kaczmarz (GBK) algorithm with the iteration

x_{k+1} = x_k + A_{J_k}^† (b_{J_k} − A_{J_k} x_k), (1.5)

where A_{J_k} and b_{J_k} are the row submatrix of A and the subvector of b indexed by the greedy index set J_k. The GBK method is proved to converge linearly to the unique minimum-norm least-squares solution of (1.1). We refer to [15][16][17][18] for more recent work on block Kaczmarz methods. The GBK algorithm in [14] needs to compute the Moore-Penrose inverse A_{J_k}^† of A_{J_k} at each step, which may be expensive when |J_k| is large.
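For concreteness, one GBK-type step of the form (1.5) can be sketched as follows. The residual-threshold rule used here to build J_k is a simplified stand-in for the greedy criterion of [14], and the pseudoinverse is applied through a least-squares solve rather than formed explicitly; both choices are illustrative assumptions.

```python
import numpy as np

def gbk_step(A, b, x, eta=0.5):
    """One GBK-type step: pick a greedy row set J_k, then project onto the
    solution set of the subsystem, i.e. x <- x + A_Jk^dagger (b_Jk - A_Jk x)."""
    r = b - A @ x                                   # current residual
    row_norms = np.einsum('ij,ij->i', A, A)         # ||A^(i)||_2^2
    gains = r**2 / row_norms                        # per-row residual gain
    Jk = np.nonzero(gains >= eta * gains.max())[0]  # simplified greedy set
    # lstsq applies the Moore-Penrose inverse of A_Jk to the subresidual
    d, *_ = np.linalg.lstsq(A[Jk], r[Jk], rcond=None)
    return x + d
```

Each step requires an inner least-squares solve over A_{J_k}; this is exactly the per-iteration cost that the GABK method of this paper is designed to avoid.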
In this paper, we improve the GBK method of [14] by introducing the extrapolation strategy proposed in [13] to avoid computing A_{J_k}^† in (1.5). The proposed method is called the greedy average block Kaczmarz algorithm, abbreviated as GABK. Numerical examples in Section 3 show that the GABK method is the most efficient, in terms of running time and number of iterations, and the most effective, in terms of the convergence of the relative solution error, among all the methods compared.
The rest of this paper is organized as follows. Section 2 presents a greedy average block Kaczmarz algorithm and proves its convergence. Several examples are shown in Section 3 and some conclusions are drawn in Section 4.

A greedy average block Kaczmarz algorithm
We introduce the extrapolation strategy proposed in [13] into the GBK iteration (1.5) to avoid computing A_{J_k}^†, and propose the greedy average block Kaczmarz (GABK) method. Algorithm 1 summarizes the GABK algorithm. Steps 4 and 5 determine the control index set.
Step 6 computes the stepsize α_k in (2.1), which is chosen adaptively, and Step 7 performs the averaged update, which avoids computing A_{J_k}^† in (1.5); the weights ω_i^k in (2.1) are chosen such that 0 < ω_i^k < 1 and Σ_{i∈J_k} ω_i^k = 1. We set ω_i^k = 1/|J_k| in Section 3. We now consider the convergence of Algorithm 1 (GABK). We first need the following result, presented in [17], to prove the convergence of Algorithm 1.
Output: the approximate solution x_{k+1} of the consistent system (1.1).
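Before turning to the analysis, one GABK iteration (Steps 4-7) can be sketched as follows. The threshold rule for the control set and the exact form of the extrapolated stepsize below are illustrative assumptions in the spirit of (2.1) and of the adaptive stepsizes in [13]; they are not the paper's exact formulas.

```python
import numpy as np

def gabk_step(A, b, x, zeta=0.2, delta=1.0):
    """One GABK-type iteration: greedy control set (Steps 4-5), adaptive
    stepsize (Step 6), and a weighted-average update (Step 7) that never
    forms the pseudoinverse of A_Jk."""
    r = b - A @ x
    if not np.any(r):
        return x                                   # already a solution
    row_norms = np.einsum('ij,ij->i', A, A)        # ||A^(i)||_2^2
    gains = r**2 / row_norms                       # per-row residual gain
    Jk = np.nonzero(gains >= zeta * gains.max())[0]   # control set (assumed rule)
    w = 1.0 / len(Jk)                              # uniform weights 1/|J_k|
    d = (w * r[Jk] / row_norms[Jk]) @ A[Jk]        # weighted average direction
    alpha = delta * w * gains[Jk].sum() / (d @ d)  # extrapolated stepsize
    return x + alpha * d
```

With δ = 1 this stepsize is the exact minimizer of ‖x_{k+1} − x_*‖_2 along the averaged direction d, so for a consistent system the error norm decreases monotonically; each iteration costs only matrix-vector products, with no inner least-squares solve.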
Lemma 2.1. If A ∈ R m×n is a nonzero real matrix, then it holds that

σ_min^2(A) ‖x‖_2^2 ≤ ‖Ax‖_2^2 ≤ σ_max^2(A) ‖x‖_2^2 for any x ∈ range(A^T),

where σ_min(A) and σ_max(A) denote the smallest nonzero and the largest singular values of A, respectively.

Theorem 2.1. Assume the system of linear equations (1.1) is consistent and let x_* be a solution of (1.1). Let {x_k}_{k≥0} be generated by Algorithm 1 with x_0 ∈ range(A^T). Assume that the weights satisfy ω_i^k ∈ (0, 1) for all i ∈ J_k and all k, and denote ω_min = min_{i∈J_k, k≥0} {ω_i^k}. Then the error estimate (2.2) holds.

Proof. By the consistency assumption on the system (1.1), we have Ax_* = b. According to the update rule of GABK, where α_k is defined in (2.1), we obtain the error recursion (2.3). Now we consider the bound of L_k. According to (2.1), we have (2.4)

Substituting the bound (2.4) into (2.3) results in (2.5), where the inequality holds because of the choice of the index set in Step 5 of Algorithm 1. The subsequent bound (2.6) holds by Lemma 2.1. Thus, substituting (2.6) into (2.5) results in (2.2). This completes the proof.
We remark that, under the conditions of Theorem 2.1, if the matrix A is normalized, i.e., every row of A has unit Euclidean norm, then the convergence factor ρ_GABK of GABK determined by (2.2) satisfies 0 < ρ_GABK < 1, which means that Algorithm 1 converges linearly. We give a special selection of the parameters in Theorem 2.1 as follows, which will be used in Section 3.
In particular, if δ = 1 and ω_i^k = 1/|J_k| for all k and i, then substituting these values into the right-hand side of (2.2) yields the corresponding simplified error estimate. This completes the proof.
We complete this section by analyzing the arithmetic complexity of Algorithm 1. The kth iteration of the GABK method needs about nnz(A) + nnz(A_{J_k}) + m + 2n + 2 flops, where nnz(A) and nnz(A_{J_k}) denote the numbers of nonzero elements of A and of the submatrix A_{J_k}, respectively. Moreover, the approximate cost of the kth iteration of the GBK method is (2 nnz(A_{J_k}) + 3n + 2|J_k|)K + n + 2n|J_k| flops, where |J_k| is the cardinality of the set J_k and, in the GBK method, the application of the pseudoinverse is approximated by K iterations of the CGLS (conjugate gradient least squares) method.
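The two per-iteration counts above can be compared directly; the following sketch simply encodes the formulas stated in this section, with illustrative sample values (not measurements from the paper):

```python
def gabk_flops(nnz_A, nnz_AJ, m, n):
    """Per-iteration flop estimate of GABK: nnz(A) + nnz(A_Jk) + m + 2n + 2."""
    return nnz_A + nnz_AJ + m + 2 * n + 2

def gbk_flops(nnz_AJ, n, J, K):
    """Per-iteration flop estimate of GBK with K inner CGLS iterations:
    (2 nnz(A_Jk) + 3n + 2|J_k|) K + n + 2n |J_k|."""
    return (2 * nnz_AJ + 3 * n + 2 * J) * K + n + 2 * n * J
```

For instance, with nnz(A) = 10^4, nnz(A_{J_k}) = 10^3, m = 5000, n = 500, |J_k| = 50 and K = 20 inner CGLS iterations, the estimate gives 17002 flops per GABK iteration against 122500 for GBK, reflecting the cost of the repeated inner solves.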

Numerical experiments
In this section, several different kinds of examples are given to show the efficiency and effectiveness of Algorithm 1 (GABK) for solving the consistent system of linear equations (1.1). The GABK method is compared with the RABK method in [13], the GBK method in [14] and the fast deterministic block Kaczmarz (FDBK) method in [16]. All examples are run in Matlab R2019b on a personal computer with a 2.0 GHz Intel(R) Core(TM) i7-8565U CPU, 8 GB memory, and the 64-bit Windows 10 operating system.
For the RABK method, we consider the two cases used in [13], with the same sampling methods and choices of blocks, stepsizes and weights, which are listed in Table 1.

Table 1. The sampling methods and parameters of the two cases of the RABK method in [13].

The partition sampling in Table 1 selects J_k at random from a pre-determined partition of the row indices {1, 2, . . . , m} into s blocks, and |J_k| denotes the size of the block control set J_k at the kth iteration.
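The partition sampling just described can be sketched as follows. This is a minimal Python sketch; the random shuffle before splitting and the nearly equal block sizes are assumptions about the partitioning scheme:

```python
import numpy as np

def partition_rows(m, s, seed=0):
    """Pre-determined partition: split the row indices {0,...,m-1} into s
    nearly equal blocks after a random shuffle."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(m), s)

def sample_block(blocks, rng):
    """Partition sampling: at each iteration, pick one block J_k uniformly."""
    return blocks[rng.integers(len(blocks))]
```

The partition is fixed once before the iteration starts; only the choice of which block to use changes from step to step.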
For the GBK method, we use the parameter η given by (1.4), and for the FDBK method we use the block control set of [16]. The CGLS algorithm is used to apply the Moore-Penrose inverse A_{J_k}^† at each iteration of both algorithms. For the GABK method, we set ζ = 0.2 in (1.3) so as to grasp more rows of the matrix A, δ = 1, and the weights ω_i^k = 1/|J_k| in (2.1). Three types of coefficient matrices A are considered to construct the consistent systems (1.1): overdetermined or underdetermined dense matrices with normally distributed entries produced by the Matlab function randn(m,n), large full-rank sparse matrices, and rank-deficient sparse matrices from the SuiteSparse Matrix Collection [19]. We let b ∈ R m in (1.1) be generated as b = Ax_*, where x_* ∈ R n is the exact solution produced by the Matlab function randn. The performance of GABK and the other methods is evaluated in terms of efficiency and effectiveness. The efficiency is measured by the number of iterations, denoted by 'IT', and the CPU time in seconds, denoted by 'CPU'. The effectiveness is measured by the relative solution error RSE = ‖x_k − x_*‖_2^2 / ‖x_*‖_2^2. The initial solution x_0 is set to 0 in all experiments, and every algorithm is stopped once RSE < 10^{-6}. All numerical results reported below are arithmetic averages over 50 repeated trials of each method. The speed-up of GABK against another method is defined as the ratio of the CPU time of that method to the CPU time of GABK.

Example 1. consistent overdetermined systems

In this example, we consider the solution of the consistent overdetermined system of linear equations (1.1) with a dense overdetermined (m ≥ n) matrix A ∈ R m×n with normally distributed entries. The coefficient matrices A ∈ R m×n of different sizes, with m = i * 10^3 (i = 1, 2, ..., 5) and n = j * 10^2 (j = 1, 5), are produced by the Matlab function randn(m,n).
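The dense test problems and the stopping rule can be reproduced with a short sketch (Python/NumPy standing in for the Matlab randn setup; the squared form of the RSE below is our reading of the usual definition, stated here as an assumption):

```python
import numpy as np

def make_consistent_system(m, n, seed=0):
    """Build a consistent system as in the experiments: A ~ randn(m, n),
    exact solution x_star ~ randn(n), right-hand side b = A @ x_star."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, n))
    x_star = rng.standard_normal(n)
    return A, A @ x_star, x_star

def rse(x, x_star):
    """Relative solution error ||x - x_star||_2^2 / ||x_star||_2^2 (assumed form)."""
    return np.linalg.norm(x - x_star) ** 2 / np.linalg.norm(x_star) ** 2
```

A run stops once rse(x, x_star) < 1e-6; the reported IT and CPU values would then be averaged over 50 such runs.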
Table 2 shows IT and CPU together with the speed-up of Algorithm 1 for solving (1.1), compared with the RABK method in [13], the GBK method in [14] and the FDBK method in [16]. We can see from Table 2 that the GABK method needs the fewest iterations and the least CPU time among all methods. This result benefits from the larger block control set and the extrapolated stepsize used in each iteration of Algorithm 1. Figure 1 shows that GABK has the fastest convergence of RSE, both versus IT and versus CPU, among all methods.

Example 2. consistent underdetermined systems

We use Algorithm 1 to solve the consistent underdetermined system of linear equations (1.1) with different underdetermined (m < n) dense matrices A ∈ R m×n, and compare it with RABK [13], GBK [14] and FDBK [16]. Table 3 lists the different sizes of A ∈ R m×n, with m = i * 10^2 (i = 1, 5) and n = j * 10^3 (j = 1, 2, ..., 5), together with IT, CPU and the speed-up of the different methods. Figure 2 plots RSE versus IT (left) and CPU (right) of the different methods for solving (1.1) with the matrix A = randn(500, 4000). Results on IT, CPU and the speed-up in Table 3 are similar to those of Table 2 in Example 1 for Algorithm 1. It is observed from Figure 2 that the GABK method has the fastest convergence among all methods.

Table 3. IT, CPU and the speed-up of Algorithm 1 compared with RABK [13], GBK [14] and FDBK [16] for solving the consistent system of linear equations (1.1) with a dense underdetermined matrix A.

Example 3. linear systems with sparse full rank matrices
This example shows the application of Algorithm 1 to the solution of the consistent linear systems (1.1) with full-rank sparse overdetermined or underdetermined matrices A ∈ R m×n from the SuiteSparse Matrix Collection [19].
Tables 4 and 5 display the size, the density and the Euclidean condition number (cond(A)) of each matrix A ∈ R m×n, where the density of A is defined as the ratio of the number of nonzeros of A to the total number of entries of A. The IT and CPU together with the speed-up of Algorithm 1 for solving (1.1) with different sparse overdetermined full-rank matrices A ∈ R m×n and with different sparse underdetermined full-rank matrices A ∈ R m×n are listed in Tables 4 and 5, respectively. Tables 4 and 5 show results for the GABK method very similar to those of Table 2 in Example 1. It can be seen from Figures 3 and 4 that the GABK method converges fastest among all methods in both the overdetermined and the underdetermined cases.

Table 4. IT, CPU and the speed-up of Algorithm 1 (GABK) compared with RABK [13], GBK [14] and FDBK [16] for solving the consistent system of linear equations (1.1) with a sparse overdetermined matrix A with full rank.
GABK is compared with the RABK [13], GBK [14] and FDBK [16] methods. Figures 3 and 4 show RSE versus IT (left) and CPU (right) of the different methods for solving (1.1) with df2177 and df2177^T, respectively.

Table 5. IT, CPU and the speed-up of Algorithm 1 (GABK) compared with RABK [13], GBK [14] and FDBK [16] for solving the consistent system of linear equations (1.1) with a sparse underdetermined matrix A with full rank.

Example 4. linear systems with sparse rank-deficient matrices
We test the consistent system of linear equations (1.1) with sparse rank-deficient matrices from [19], which originate from different fields of application such as directed graphs and combinatorial problems. We remove the zero rows of the matrices relat6, GD00_a and GL7d26 before running all methods. Table 6 shows IT, CPU and the speed-up of Algorithm 1 for solving (1.1) compared with those of the RABK [13], GBK [14] and FDBK [16] methods. Figure 5 plots RSE versus IT (left) and CPU (right) of the different methods for solving (1.1) with the sparse rank-deficient matrix relat6.

Table 6. IT, CPU and the speed-up of Algorithm 1 compared with RABK [13], GBK [14] and FDBK [16] for solving the consistent system of linear equations (1.1) with a sparse matrix A with deficient rank.

Table 6 illustrates that Algorithm 1 needs the least CPU time and the fewest iterations among all methods. Figure 5 shows that the GABK method converges fastest among all methods for solving (1.1) with the sparse rank-deficient matrix relat6.

Conclusions
We propose a greedy average block Kaczmarz (GABK) method for solving large-scale consistent systems of linear equations. We consider both dense and sparse systems; the tested systems are overdetermined and underdetermined, with full-rank or rank-deficient coefficient matrices. The GABK method is proved to converge linearly to the least-norm solution of the underlying linear system. Numerical examples show that the GABK method is the most efficient and effective among all the methods compared.