Fast Algorithms for Structured Least Squares and Total Least Squares Problems

We consider the problem of solving least squares problems involving a matrix M of small displacement rank with respect to two matrices Z_1 and Z_2. We develop formulas for the generators of the matrix M^H M in terms of the generators of M and show that the Cholesky factorization of the matrix M^H M can be computed quickly if Z_1 is close to unitary and Z_2 is triangular and nilpotent. These conditions are satisfied for several classes of matrices, including Toeplitz, block Toeplitz, Hankel, and block Hankel, and for matrices whose blocks have such structure. Fast Cholesky factorization enables fast solution of least squares problems, total least squares problems, and regularized total least squares problems involving these classes of matrices.

[ ] and results in a fast solution algorithm for the problem considered in [8,9,10,7].
The core of the algorithm in [6], based on a more general algorithm of [11], relies on two results: the representation of the generators for the matrix A^H A that appears in the normal equations when A is Toeplitz, and a fast factorization of a matrix derived from these generators. We therefore begin in Sec. 2 with a construction of the generators for M^H M when M is any matrix of small displacement rank and Z_1 is close to unitary. In Sec. 3 we show that it is inexpensive to form a Cholesky factorization of M^H M whenever Z_2 is triangular and nilpotent. Section 4 concerns applications of this algorithm, and we conclude in Sec. 5.
The results in this work are derived from [4,5].

Generators of M^H M
One way to solve a least squares problem involving a matrix M of full column rank is to use the normal equation formulation: we factor M^H M = L L^H, where L is a square lower triangular matrix, and then solve the linear system

    L L^H x = M^H y    (2)

by two triangular solves. Thus, we can solve this problem fast if we can compute a Cholesky factorization of M^H M fast. To do this, we will first derive a generator for the matrix M^H M when M ∈ C^{m×n} (m ≥ n) has low displacement rank.
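As a concrete illustration, the normal-equations route can be sketched in a few lines. This is a dense stand-in: the random matrix and the use of np.linalg.solve in place of dedicated triangular solves are our simplifications, and the point of the fast algorithm is precisely to obtain L cheaply from structure instead.

```python
import numpy as np

# Dense sketch of the normal-equations solve (illustrative data only).
rng = np.random.default_rng(0)
m, n = 8, 5
M = rng.standard_normal((m, n))
y = rng.standard_normal(m)

L = np.linalg.cholesky(M.T @ M)     # M^H M = L L^H, L lower triangular
z = np.linalg.solve(L, M.T @ y)     # forward solve:  L z = M^H y
x = np.linalg.solve(L.T, z)         # backward solve: L^H x = z

# agrees with a standard least squares solver
assert np.allclose(x, np.linalg.lstsq(M, y, rcond=None)[0])
```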
Suppose that M has low displacement rank relative to the matrices Z_1 ∈ C^{m×m} and Z_2 ∈ C^{n×n}, which means that if we define

    N = M − Z_1 M Z_2^H,

then rank(N), which we denote by ρ_1, is small relative to n. Suppose also that Ẑ_1 = Z_1 + W is a unitary matrix, where W is assumed to be small. For example, if M is Toeplitz, let Z_1 be the shift-down matrix, with ones on its subdiagonal and zeros elsewhere; then W is the matrix with a one in the last position of row 1, and Ẑ_1 is a circulant shift, which is unitary.
Then M^H M also has low displacement rank relative to Z_2, as we can see from the identity

    M^H M − Z_2 M^H M Z_2^H = N^H N + N^H Z_1 M Z_2^H + Z_2 M^H Z_1^H N + Z_2 M^H (Z_1^H Z_1 − I) M Z_2^H,

obtained by substituting M = N + Z_1 M Z_2^H. Since Ẑ_1 = Z_1 + W is unitary, Z_1^H Z_1 − I = −(Z_1^H W + W^H Z_1 + W^H W), so the last term involves only the small matrix W M Z_2^H and its conjugate transpose.
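The two displacement-rank statements above are easy to verify numerically. The following sketch (sizes and data are arbitrary illustrative choices) builds a real Toeplitz matrix and checks that rank(N) = 2 and that the displacement of M^H M has rank at most 4:

```python
import numpy as np

# Numerical check of the displacement ranks for a Toeplitz example.
rng = np.random.default_rng(1)
m, n = 8, 5
c, r = rng.standard_normal(m), rng.standard_normal(n)
M = np.array([[c[i - j] if i >= j else r[j - i] for j in range(n)]
              for i in range(m)])          # Toeplitz: entry depends on i - j

Z1 = np.eye(m, k=-1)                       # shift-down matrices
Z2 = np.eye(n, k=-1)

N = M - Z1 @ M @ Z2.T                      # displacement of M
assert np.linalg.matrix_rank(N) == 2       # rank 2 for Toeplitz M

T = M.T @ M
D = T - Z2 @ T @ Z2.T                      # displacement of M^H M
assert np.linalg.matrix_rank(D) <= 4       # at most 2 * rank(N)
```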

Determining a Factorization of M^H M From the Generators

Suppose that the generators h_i, i = 1, ..., ρ, of M^H M satisfy

    M^H M − Z_2 M^H M Z_2^H = Σ_{i=1}^{ρ} s_i h_i h_i^H,    (3)

where each s_i equals plus or minus 1. When Z_1 and Z_2 are shift-down matrices, it has been shown [6,1,2] that this implies that

    M^H M = L̂ S L̂^H,  with  L̂ = [L(h_1), ..., L(h_ρ)],

where S = diag(S_i), S_i = s_i I (each n × n), and L(h_i) is the lower triangular Toeplitz matrix with first column equal to h_i. We now generalize this result somewhat.
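This representation can be checked numerically for a real Toeplitz example. In the sketch below, the generators are obtained from an eigendecomposition of the displacement (one of several valid generator choices, and our own, not the paper's construction):

```python
import numpy as np

# Verify M^H M = sum_i s_i L(h_i) L(h_i)^H for a Toeplitz example.
rng = np.random.default_rng(2)
m, n = 8, 5
c, r = rng.standard_normal(m), rng.standard_normal(n)
M = np.array([[c[i - j] if i >= j else r[j - i] for j in range(n)]
              for i in range(m)])
Z2 = np.eye(n, k=-1)

T = M.T @ M
D = T - Z2 @ T @ Z2.T
w, V = np.linalg.eigh(D)                   # D = sum_i w_i v_i v_i^T
keep = np.abs(w) > 1e-10
hs = V[:, keep] * np.sqrt(np.abs(w[keep])) # columns are the generators h_i
ss = np.sign(w[keep])                      # signs s_i = +-1

def lower_toeplitz(h):
    """L(h): lower triangular Toeplitz matrix with first column h."""
    L = np.zeros((len(h), len(h)))
    for k in range(len(h)):
        L[k:, k] = h[: len(h) - k]
    return L

recon = sum(s * lower_toeplitz(h) @ lower_toeplitz(h).T
            for s, h in zip(ss, hs.T))
assert np.allclose(recon, T)
```

The check works because Z_2 is nilpotent, so T telescopes to Σ_k Z_2^k D (Z_2^H)^k, and Σ_k Z_2^k h h^H (Z_2^H)^k = L(h) L(h)^H for the shift-down Z_2.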
If Z_1 and Z_2 are shift-down matrices, then [6] shows how to do this reduction fast. Using Corollary 1, we will see that it can be done fast whenever Z_2 is triangular and nilpotent. We present the algorithm for the case in which Z_2 is lower triangular; the upper triangular case is analogous but works with the columns in reverse order.
The algorithm proceeds by columns, putting zeros below the main diagonal. Suppose we determine a rotation involving the first row h_1^H that zeros the first element of another row. Now since Z_2 is nilpotent, Z_2^p = 0 for some p ≤ n, and each column of L(h_i) is the previous column premultiplied by Z_2, so the same rotation implicitly zeros the corresponding elements in all later columns. Therefore, by introducing one zero into our matrix, we have implicitly introduced n − 1 more, so we can put zeros below the main diagonal in column 1 by using only ρ − 1 rotations, independent of the size of n.
We then use the resulting second row, equal to the first row postmultiplied by Z_2^H, to zero the second element of row n + 1. Again this implicitly introduces additional zeros, n − 2 of them, and we complete the operations on column 2 by using ρ − 1 rotations.
If we repeat this for each column, we accomplish our reduction. Let H be the ρ × n matrix whose rows are h_i^H. We can thus reduce L to upper triangular form just by operating on the matrix H.
We design our algorithm to use Givens rotations as often as possible, minimizing the number of hyperbolic rotations in order to preserve stability. A Givens rotation can be used between row i and row j whenever s_i and s_j (see equation (3)) have the same sign; if they have different signs, then we must use a hyperbolic rotation. We assume that we have ordered the generators so that the rows of H with s_i = 1 come first and the remaining ones have s_i = −1. The cost of this reduction is at most O(ρn^2), ignoring sparsity, plus the cost of the multiplications by Z_2. Sometimes sparsity can reduce this cost significantly [7]. Without exploiting the structure, the cost would be O(ρn^3). Once the factors L L^H are computed, they can be used to solve (2).
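A minimal dense sketch of this reduction, for the case in which Z_2 is the shift-down matrix and the data are real, is given below. The function and variable names are ours, and the eigendecomposition-based generators are our own choice; Givens rotations act within each sign group and a single hyperbolic rotation couples the two leading rows, after which the leading row yields a column of the Cholesky factor and is shifted by Z_2 (the Schur step).

```python
import numpy as np

def generator_cholesky(H, s, Z2):
    """Reduce generators (H, s) of T = M^T M, with
    T - Z2 T Z2^T = H^T diag(s) H, to the Cholesky factor L of T.
    Real arithmetic; rows of H with s = +1 are assumed to come first."""
    H = H.copy()
    rho, n = H.shape
    q = int(np.sum(s > 0))              # index of the leading negative row
    L = np.zeros((n, n))
    for j in range(n):
        # Givens rotations concentrate column j into each group's leading row
        for lead, group in ((0, range(1, q)), (q, range(q + 1, rho))):
            for i in group:
                a, b = H[lead, j], H[i, j]
                r = np.hypot(a, b)
                if r > 0:
                    c, sn = a / r, b / r
                    H[lead], H[i] = c * H[lead] + sn * H[i], -sn * H[lead] + c * H[i]
        # one hyperbolic rotation between the two leading rows
        if q < rho and abs(H[q, j]) > 0:
            a, b = H[0, j], H[q, j]
            r = np.sqrt(a * a - b * b)  # a^2 > b^2 since T is positive definite
            c, sh = a / r, b / r
            H[0], H[q] = c * H[0] - sh * H[q], -sh * H[0] + c * H[q]
        if H[0, j] < 0:
            H[0] = -H[0]
        L[:, j] = H[0]                  # column j of the Cholesky factor
        H[0] = Z2 @ H[0]                # Schur step: shift the leading row
    return L

# usage on a Toeplitz example (synthetic data)
rng = np.random.default_rng(3)
m, n = 8, 5
c0, r0 = rng.standard_normal(m), rng.standard_normal(n)
M = np.array([[c0[i - j] if i >= j else r0[j - i] for j in range(n)]
              for i in range(m)])
Z2 = np.eye(n, k=-1)
T = M.T @ M
w, V = np.linalg.eigh(T - Z2 @ T @ Z2.T)
keep = np.abs(w) > 1e-10
w, V = w[keep], V[:, keep]
order = np.argsort(-np.sign(w))         # positive-signature rows first
H = (V * np.sqrt(np.abs(w))).T[order]
s = np.sign(w)[order]
L = generator_cholesky(H, s, Z2)
assert np.allclose(L @ L.T, T)
assert np.allclose(L, np.tril(L))
```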

Some Applications
Our fast algorithm enables us to solve least squares, total least squares, and regularized total least squares problems involving matrices for which Z_1 is close to unitary and Z_2 is triangular and nilpotent. This includes several important classes of structured matrices.

Consider first the least squares problem

    min_x || M x − y ||_2,

where M ∈ C^{m×n} has full column rank. We can use our algorithm in the following cases:

• M is Toeplitz. Then Z_1 and Z_2 are the shift-down matrices of appropriate size and the displacement rank is 2.
• M is block Toeplitz: M = (M_{i−j}), where each block has dimension m/γ × n/δ. Then choosing Z_1 ∈ C^{m×m} to be the block shift-down matrix, with identity blocks of dimension m/γ × m/γ on its subdiagonal, and choosing Z_2 ∈ C^{n×n} similarly but with blocks of dimension n/δ × n/δ, gives a displacement rank of m/γ + n/δ.

Journal of Research of the National Institute of Standards and Technology

There is an analogous algorithm, FTriang, in [6], for the special case in which M is Toeplitz, but it has some typographical errors. In the statement following "if i < m", g3 on the left-hand side should be g4. In 12 places on p. 552, "m+n" should be "m+n−1". Also, the numbering of the phases of the computation is off by one compared with the description in the paper ("Initialization" should be "Phase 1", etc.).
• M has Toeplitz blocks: M = (M_{ij}), where each block M_{ij} is Toeplitz. Then choosing Z_1 ∈ C^{m×m} as a block-diagonal matrix consisting of shift-down matrices of dimensions matching the row dimensions of the diagonal blocks of M, and choosing Z_2 ∈ C^{n×n} similarly but with blocks matching the column dimensions, gives a displacement rank of 2(γ + δ).
• M is Hankel, block Hankel, or has Hankel blocks. These cases are analogous to those considered above.
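The displacement-rank claims above are easy to check numerically. A sketch for the block Toeplitz case (the block sizes and grid dimensions are arbitrary illustrative choices):

```python
import numpy as np

# Displacement-rank check for a block Toeplitz matrix with p x q blocks
# on a gamma x delta block grid.
rng = np.random.default_rng(4)
p, q, gamma, delta = 2, 2, 4, 3             # m = 8, n = 6
blocks = {d: rng.standard_normal((p, q)) for d in range(-(delta - 1), gamma)}
M = np.block([[blocks[i - j] for j in range(delta)] for i in range(gamma)])

m, n = gamma * p, delta * q
Z1 = np.eye(m, k=-p)                        # block shift-down matrices
Z2 = np.eye(n, k=-q)

N = M - Z1 @ M @ Z2.T
assert np.linalg.matrix_rank(N) <= p + q    # = m/gamma + n/delta
```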

Solving Structured Total Least Squares Problems
In some cases the matrix of the problem can be estimated only with error, and we need to determine not just the parameters x of the least squares fit but also the corrections to the operator. This problem can be formulated as

    min_{E,x} || E ||_F^2 + || r ||_2^2    (4)

subject to the constraint r = (A + E) x − y, with A and E having the same structure.
Suppose that the matrix E can be specified by p parameters α_1, ..., α_p. For example, if E is Toeplitz, then E is determined by its first column and first row, and p = m + n − 1. We rewrite our problem as

    min_{α,x} α^H α + β^H β,    (5)

where β = (A + E) x − y. Following [6], we have replaced the term || E ||_F^2 by α^H α; the two are equivalent except for scalings of the entries α_i. We define the matrix X ∈ C^{m×p} by the equation

    X α = E x,

so that X depends linearly on the entries of x. Following [11], we form a quadratic approximation to (5) by using linear approximations in ∆α and ∆x: the updated residual satisfies

    β(α + ∆α, x + ∆x) ≈ β + X ∆α + (A + E) ∆x,

so that the objective is approximated by

    || α + ∆α ||^2 + || β + X ∆α + (A + E) ∆x ||^2.

If we minimize this with respect to ∆α and ∆x, then we can form a new approximation α = α + ∆α, x = x + ∆x to the solution of (5) and then repeat the procedure until convergence. As noted by [11], this is a Gauss-Newton algorithm applied to (5). Therefore, the main computational task is to solve linear least squares problems of the form

    min_{∆α,∆x} || M [∆α; ∆x] + [α; β] ||_2,    (6)

where M has block rows [I, 0] and [X, A + E]. If A is in one of the classes considered in Sec. 4.1, then the matrix M has low displacement rank and we can solve the problem fast.
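The Gauss-Newton iteration can be sketched for the Toeplitz case as follows. Dense least squares solves stand in for the fast structured solver, the data are synthetic, and the helper names (toeplitz_from, x_matrix) are ours:

```python
import numpy as np

# Sketch of the Gauss-Newton iteration for Toeplitz STLS.
rng = np.random.default_rng(5)
m, n = 8, 4
p = m + n - 1                                 # parameters of a Toeplitz E

def toeplitz_from(alpha):
    """E with E[i, j] = alpha[i - j + n - 1]."""
    return np.array([[alpha[i - j + n - 1] for j in range(n)]
                     for i in range(m)])

def x_matrix(x):
    """X with X @ alpha = toeplitz_from(alpha) @ x."""
    X = np.zeros((m, p))
    for i in range(m):
        for k in range(p):
            j = i + n - 1 - k
            if 0 <= j < n:
                X[i, k] = x[j]
    return X

A = toeplitz_from(rng.standard_normal(p))
y = A @ rng.standard_normal(n) + 0.01 * rng.standard_normal(m)

alpha = np.zeros(p)
x = np.linalg.lstsq(A, y, rcond=None)[0]      # start from the LS solution
obj = []
for _ in range(10):
    E = toeplitz_from(alpha)
    beta = (A + E) @ x - y
    obj.append(alpha @ alpha + beta @ beta)
    # linearized subproblem (6): block rows [I, 0] and [X, A + E]
    Mhat = np.block([[np.eye(p), np.zeros((p, n))],
                     [x_matrix(x), A + E]])
    rhs = -np.concatenate([alpha, beta])
    d = np.linalg.lstsq(Mhat, rhs, rcond=None)[0]
    alpha += d[:p]
    x += d[p:]

assert np.allclose(x_matrix(x) @ alpha, toeplitz_from(alpha) @ x)
assert obj[-1] <= obj[0]                      # objective does not increase
```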

Solving Regularized Total Least Squares Problems
In many deblurring problems and other discretized problems involving integral equations of the first kind, the matrix A is so ill-conditioned that noise in the observations y is magnified in solving the STLS problem and a meaningful solution cannot be obtained.
In this case it is necessary to add a regularization constraint to the problem. One common choice is to restrict the size of the solution, or of some linear transformation of the solution:

    || C x ||_2 ≤ u,

where u is a given scalar and C is commonly chosen to be the identity matrix or a difference operator. If C has low displacement rank relative to Z_2 and an appropriate Ẑ_1, our algorithm can easily be modified to incorporate regularization. In this case, our problem (5) can be reformulated as

    min_{α,x} α^H α + β^H β + λ^2 x^H C^H C x,    (7)

where β = (A + E) x − y and λ, the regularization parameter, plays the role of a Lagrange multiplier for the new constraint. Using a derivation similar to that above, the linearization of (7) results in the following problem to be solved at each step of the iteration:

    min_{∆α,∆x} || M [∆α; ∆x] + [α; β; λCx] ||_2,

where M has block rows [I, 0], [X, A + E], and [0, λC]. Thus, our new M matrix is the matrix M from the previous section augmented by the extra rows [0, λC], and the only change necessary in the algorithm is to find the generators of this matrix rather than the old one.
The displacement structure of this matrix is greatly simplified if C is upper triangular and Z_1 and Z_2 are shift-down matrices. As noted before, W is zero except for a one in the last position of the first row, and thus W M is zero except for a λ in the last position of the first row. Therefore W M Z_2^H = 0, so, applying Theorem 1, we have the following result.
Theorem 3. If C is upper triangular and Z_1 and Z_2 are shift-down matrices, then

    M^H M − Z_2 M^H M Z_2^H = N^H M + Z_2 M^H Z_1^H N,    (8)

which has rank at most 2ρ_1, where ρ_1 is the rank of the displacement N = M − Z_1 M Z_2^H. More generally, (8) holds whenever W M Z_2^H = 0.

Conclusions
We have derived the generators for M^H M when M is any matrix of small displacement rank relative to Z_1 and Z_2. We have shown that it is inexpensive to form a Cholesky factorization of M^H M whenever Z_1 is close to unitary and Z_2 is triangular and nilpotent, and we have extended this algorithm to problems with a regularization constraint.