Abstract

This study presents a matrix iterative method for computing the sign of a square complex matrix. It is shown that the sequence of iterates converges to the sign and is asymptotically stable, provided that the initial matrix is appropriately chosen. Some illustrations are presented to support the theory.

1. Introduction

The generic matrix function of a given matrix $A \in \mathbb{C}^{n \times n}$ is defined formally by the integral representation
$$f(A) = \frac{1}{2\pi i} \oint_{\Gamma} f(z)\,(zI - A)^{-1}\,dz, \quad (1)$$
where $f$ is an analytic function and $\Gamma$ is a closed curve which encircles all eigenvalues of $A$ ($\Gamma$ should be contained in the domain of analyticity of $f$). The integral representation (1) is known as the Cauchy integral formula [1]. The integral of a matrix should be understood as the matrix whose entries are the integrals of the entries. However, this mathematically appealing formula for computing matrix functions is complicated and requires complex analysis to be fully understood. Hence, several other important strategies for computing matrix functions have been proposed and investigated, such as the Jordan canonical form and iterative methods for applied numerical problems (see, e.g., [1-3]).
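To make (1) concrete, the following Mathematica sketch (our illustration, not part of the original text) evaluates the Cauchy integral numerically for $f(z) = e^z$ on a small test matrix; the matrix, the contour radius, and the helper name integrand are assumptions made for this example. The contour is a circle of radius 5, which encloses the eigenvalues 1 and 3.

n = 2; A = {{1., 2.}, {0., 3.}}; r = 5.;
(* parameterize the circle z = r Exp[I th]; then dz = I z dth *)
integrand[th_?NumericQ] := Module[{z = r Exp[I th]},
  Exp[z] Inverse[z IdentityMatrix[n] - A] I z];
fA = (1/(2 Pi I)) NIntegrate[integrand[th], {th, 0, 2 Pi}];
Chop[fA - MatrixExp[A]]   (* approximately the zero matrix *)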

In 1971, Roberts in [4] introduced the matrix sign function as a tool for model reduction and for solving Lyapunov and algebraic Riccati equations. He defined the sign function as a Cauchy integral and obtained the following integral representation of $\operatorname{sign}(A)$:
$$\operatorname{sign}(A) = \frac{2}{\pi}\, A \int_0^{\infty} \left(t^2 I + A^2\right)^{-1} dt. \quad (2)$$
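As a quick numerical check of (2), the following sketch (ours; the $2 \times 2$ test matrix is a hypothetical example whose eigenvalues 3 and $-2$ avoid the imaginary axis) evaluates the integral by quadrature:

n = 2; A = {{3., 1.}, {0., -2.}};
signA = (2/Pi) A . NIntegrate[Inverse[t^2 IdentityMatrix[n] + A . A], {t, 0, Infinity}];
Chop[signA]           (* {{1., 0.4}, {0., -1.}} *)
Chop[signA . signA]   (* the identity matrix, since sign(A)^2 = I *)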

The matrix sign function is widely exploited in numerical linear algebra, especially in the computation of invariant subspaces and solutions of Riccati equations [5-7]. Note that this function enables a matrix to be decomposed into two components whose spectra lie on opposite sides of the imaginary axis. The matrix sign function is a valuable tool for the numerical solution of Sylvester and Lyapunov matrix equations (see, e.g., [8]). An application of a generalization of the Newton iteration for the matrix sign function to the solution of the generalized algebraic Bernoulli equations was considered in [9].

Another application of this matrix function, as a simple and direct method to derive some fundamental results in the theory of surface waves in anisotropic materials, was presented in [10]. The authors of [11] investigated some practical iterations for the matrix sector function, which is a generalization of the matrix sign function.

Due to the applicability of the matrix sign function, along with the difficulty of evaluating the representation (2), stable iterative methods have become viable choices for its computation.

The most important general family of matrix iterations for computing the matrix sign function was introduced in [12] using Padé approximants to $h(\xi) = (1 - \xi)^{-1/2}$ and the following characterization:
$$\operatorname{sign}(z) = \frac{z}{(z^2)^{1/2}} = \frac{z}{(1 - \xi)^{1/2}}, \quad (3)$$
where $\xi = 1 - z^2$ is less than 1 in magnitude. Let the $[k/m]$ Padé approximant to $h(\xi)$ be $P_{km}(\xi)/Q_{km}(\xi)$. The iteration
$$x_{n+1} = x_n\, \frac{P_{km}\left(1 - x_n^2\right)}{Q_{km}\left(1 - x_n^2\right)} \quad (4)$$
has been proved to converge to $\operatorname{sign}(z)$ with order of convergence $k + m + 1$ for $k \ge m - 1$. Generally speaking, the iterations of Kenney and Laub (4) generated by the $[m/m]$ and $[(m-1)/m]$ Padé approximants are globally convergent. A list of different iterations (in the scalar form) is given in Table 1.
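The family (4) can be generated mechanically. The following sketch (ours, using the built-in PadeApproximant; the helper name padeIteration is an assumption) constructs the scalar iteration function for a given pair $(k, m)$ and recovers the fifth order member used in (6) below:

padeIteration[k_, m_] := Module[{xi, h},
  h = PadeApproximant[(1 - xi)^(-1/2), {xi, 0, {k, m}}];
  Function[x, Evaluate[Simplify[x (h /. xi -> 1 - x^2)]]]];
phi = padeIteration[2, 2];
phi[x]                         (* x (5 + 10 x^2 + x^4)/(1 + 10 x^2 + 5 x^4) *)
NestList[phi, 0.1 + 2. I, 4]   (* approaches sign(0.1 + 2 I) = 1 *)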

Note that Iannazzo in [13] pointed out that these iterations can be obtained from the general König family (which goes back to Schröder [14, 15]) applied to the equation $x^2 - 1 = 0$. For a recent method in this area, one may refer to [16].

A lot of known methods can be extracted from the Padé family (4). For example, Newton's iteration
$$X_{k+1} = \frac{1}{2}\left(X_k + X_k^{-1}\right), \qquad X_0 = A, \quad (5)$$
can be deduced as the reciprocal of the iteration corresponding to the case $k = 0$ and $m = 1$ in Table 1.
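A minimal Mathematica sketch of (5) follows (our code; the function name, tolerance, and iteration cap are arbitrary choices for the illustration):

newtonSign[A_, tol_: 10.^-10, kmax_: 100] :=
  FixedPoint[(# + Inverse[#])/2 &, N[A], kmax,
    SameTest -> (Norm[#1 - #2, 1]/Norm[#1, 1] < tol &)];
newtonSign[{{3., 1.}, {0., -2.}}]   (* returns {{1., 0.4}, {0., -1.}} *)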

Choosing $k = m = 2$ yields the following fifth order method:
$$x_{n+1} = \frac{x_n\left(5 + 10 x_n^2 + x_n^4\right)}{1 + 10 x_n^2 + 5 x_n^4}. \quad (6)$$

Similarly, three further choices of $k$ and $m$ with $k + m = 4$ result in the fifth order methods (7), (8), and (9), respectively.

The remaining sections of this work are organized as follows. Section 2 presents the construction and derivation of a new matrix iteration for finding $\operatorname{sign}(A)$, based on a new nonlinear equation solver, labeled (10), which is a combination of Jarratt's method [17] and a secant approach [18].

Note that (10) is a novel three-step iterative method (for the scalar case). Section 2 also studies the stability of the method and verifies its asymptotical stability. In Section 3, we discuss some other aspects of the new method that are useful in its implementation. Therein, we derive an inversion-free variant as well as a scaled variant for finding $\operatorname{sign}(A)$. Section 4 is devoted to numerical examples illustrating the convergence behavior of the new method against the existing ones. We emphasize that our solver possesses the same global convergence rate as (6), but the numerical examples will reveal that it requires an equal or smaller number of iterations for many randomly generated matrices. This is a clear advantage of our solver in contrast to (6). Finally, concluding remarks are drawn in Section 5.

2. A Novel Iterative Method

Throughout this work it is assumed that $A$ has no eigenvalues on the imaginary axis, so that $\operatorname{sign}(A)$ is defined. Note that this assumption implies that $A$ is nonsingular.

As discussed in the fundamental article of Kenney and Laub in 1995 [19], the construction of new matrix methods for the matrix sign is related to iterative methods for finding the solution of nonlinear scalar equations. Let us apply (10) to the following nonlinear matrix equation:
$$X^2 - I = 0, \quad (11)$$
in which $I$ is the identity matrix. In fact, $S = \operatorname{sign}(A)$ is a solution of the matrix equation (11). After applying (10) to (11) and simplifying the reciprocal of the resulting iteration, we obtain the new matrix method, labeled (12), with $X_0 = A$.
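For reference, the sketch below implements only the classical fourth order Jarratt step [17] in scalar form on $f(x) = x^2 - 1$ (our illustration; the secant stage that completes (10), and hence the coefficients of (12), are not reproduced here, and the helper name jarrattStep is an assumption):

f[x_] := x^2 - 1; fp[x_] := 2 x;
jarrattStep[x_] := Module[{y = x - (2/3) f[x]/fp[x]},
  x - (f[x]/fp[x]) (3 fp[y] + fp[x])/(6 fp[y] - 2 fp[x])];
NestList[jarrattStep, 0.5 + 1. I, 5]   (* the iterates approach sign(0.5 + 1. I) = 1 *)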

Iannazzo in [20] observed that the convergence of a matrix iteration is governed by the convergence of the corresponding scalar iteration. Since the method (10) satisfies an error equation of the form $e_{n+1} = C e_n^5 + O\left(e_n^6\right)$, wherein $e_n = x_n - s$ is the error at the $n$th step and $C$ is the asymptotic error constant, its corresponding matrix method (12) converges with fifth order of convergence. But the question is whether this convergence is local or global. To answer this question, we draw the basins of attraction for the new scheme along with the existing methods of various orders.

It is shown in Figures 1-3 that methods (5), (6), and (12) are globally convergent, while methods (7), (8), and (9) are only locally convergent (if one chooses a matrix, as in Figure 2 (left), with one eigenvalue with negative real part lying in a green petal, then the matrix iteration will not converge to $\operatorname{sign}(A)$).

The higher order of convergence of (12) makes its basins larger and lighter in contrast to (5). Hence, the new method can be of interest due to its global fifth order of convergence for finding $\operatorname{sign}(A)$.

An important remaining challenge is to prove the stability of the new method (12) for finding the matrix sign function. This is done formally in what follows.

Definition 1 (stability, see [1]). Consider an iteration $X_{k+1} = g(X_k)$ with a fixed point $X$. Assume that $g$ is Fréchet differentiable at $X$. The iteration is stable in a neighborhood of $X$ if the Fréchet derivative $L_g(X)$ has bounded powers; that is, there exists a constant $c$ such that $\left\| L_g(X)^k \right\| \le c$ for all $k > 0$.

We investigate the stability of (12) for finding $\operatorname{sign}(A)$ in a neighborhood of the solution of (11). In fact, we analyze how a small perturbation at the $k$th iterate is amplified or damped along the subsequent iterates. Stability concerns behavior close to convergence and so is an asymptotic property.

Lemma 2. The sequence $\{X_k\}_{k=0}^{\infty}$ generated by (12) with $X_0 = A$ is asymptotically stable.

Proof. If $X_0$ is a function of $A$, then the iterates from (12) are all functions of $A$ and hence commute with $A$. Commutativity properties are frequently used when deriving a matrix iteration for finding $\operatorname{sign}(A)$.
In this study, we restrict the analysis to asymptotically small perturbations; that is, we use the differential error analysis.
Let $\Delta_k$ denote the numerical perturbation introduced at the $k$th iterate of (12). Next, one has the perturbed iterate
$$\tilde{X}_k = X_k + \Delta_k.$$
Here, we perform a first order error analysis; that is, we formally use the approximations $(\Delta_k)^i \approx 0$, $i \ge 2$, since $\Delta_k$ is close to the zero matrix. This formal manipulation is meaningful if $\Delta_k$ is sufficiently small. Expanding $\tilde{X}_{k+1}$ accordingly, we use the following identity (for any nonsingular matrix $B$ and a perturbation $E$ of sufficiently small norm):
$$(B + E)^{-1} \approx B^{-1} - B^{-1} E B^{-1}.$$
Note that after some algebraic manipulations, and using $X_k \approx S = \operatorname{sign}(A)$ (assuming $k$ is large enough), we have
$$\Delta_{k+1} \approx \frac{1}{2}\left(\Delta_k - S \Delta_k S\right).$$
We used the following facts on the matrix sign function: $S^2 = I$ and $S^{-1} = S$. We can now conclude that the perturbation at the $(k+1)$th iterate is bounded; that is,
$$\left\| \Delta_{k+1} \right\| \le \frac{1}{2}\left( \left\| \Delta_k \right\| + \left\| S \right\| \left\| \Delta_k \right\| \left\| S \right\| \right).$$
Therefore, the sequence generated by (12) is asymptotically stable. This ends the proof.
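The boundedness of the powers required by Definition 1 can also be seen from the fact that the first order perturbation map above is idempotent. A quick numerical check (our sketch, reusing the sign of the $2 \times 2$ test matrix from Section 1; the name pertMap is an assumption):

S = {{1., 0.4}, {0., -1.}};                (* sign of the test matrix {{3,1},{0,-2}} *)
pertMap[W_] := (W - S . W . S)/2;
W0 = RandomReal[{-1, 1}, {2, 2}];
Norm[pertMap[pertMap[W0]] - pertMap[W0]]   (* ~ 0: the map is idempotent *)

Idempotency gives $L^k = L$ for all $k \ge 1$, so the constant $c$ in Definition 1 can be taken as $\|L\|$.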

Theorem 3. Let $A \in \mathbb{C}^{n \times n}$ have no pure imaginary eigenvalues. Then, the matrix sequence $\{X_k\}_{k=0}^{\infty}$ defined by (12) with $X_0 = A$ converges to the matrix sign $S = \operatorname{sign}(A)$.

Proof. Let $A$ have a Jordan canonical form arranged as
$$T^{-1} A T = \Lambda = \begin{bmatrix} C & 0 \\ 0 & N \end{bmatrix},$$
where $T$ is a nonsingular matrix and $C$, $N$ are square Jordan blocks corresponding to eigenvalues lying in $\mathbb{C}^{-}$ (open left half complex plane) and $\mathbb{C}^{+}$ (open right half complex plane), respectively. Denote by $\lambda_1, \ldots, \lambda_p$ and $\lambda_{p+1}, \ldots, \lambda_n$ the values lying on the main diagonals of the blocks $C$ and $N$, respectively.
Since $S = \operatorname{sign}(A)$ is involutory ($S^2 = I$), it is diagonalizable with eigenvalues $\pm 1$ [1]. Therefore,
$$\operatorname{sign}(A) = T \begin{bmatrix} -I_p & 0 \\ 0 & I_{n-p} \end{bmatrix} T^{-1}.$$
On the other hand, if we define $D_k = T^{-1} X_k T$, then from the equation (12) we obtain a recursion of the same form for the sequence $\{D_k\}$, with $D_0 = \Lambda$; we label it (22). Notice that if $D_0$ is a diagonal matrix, then all successive $D_k$ are diagonal too. From (22), it is enough to prove that $\{D_k\}$ converges to $\operatorname{sign}(\Lambda)$, in order to ensure the convergence of the sequence generated by (12) to $\operatorname{sign}(A)$.
We can write (22) as $n$ uncoupled scalar iterations to solve $x^2 - 1 = 0$, given by
$$d_{k+1}^{i} = \varphi\left(d_k^{i}\right), \quad (23)$$
where $d_0^{i} = \lambda_i$, $1 \le i \le n$, and $\varphi$ is the scalar form of (12). From (22) and (23), it is enough to study the convergence of $d_k^{i}$ to $\operatorname{sign}(\lambda_i)$, for all $1 \le i \le n$.
From (23), and since the eigenvalues of $A$ are not pure imaginary, we have $s_i = \operatorname{sign}(\lambda_i) = \pm 1$, and the quotients $q_k^{i} = \left(d_k^{i} - s_i\right)/\left(d_k^{i} + s_i\right)$ are well defined. Thus, we attain
$$\left| q_{k+1}^{i} \right| \le \left| q_k^{i} \right|^{5}.$$
From the other point of view, since $\operatorname{Re}(\lambda_i) \ne 0$, we obtain $\left| q_0^{i} \right| < 1$ and $\lim_{k \to \infty} q_k^{i} = 0$. This shows that $\{d_k^{i}\}$ is convergent to $s_i$. Now, one concludes that $\lim_{k \to \infty} D_k = \operatorname{sign}(\Lambda)$.
Recalling $X_k = T D_k T^{-1}$, we have $\lim_{k \to \infty} X_k = T \operatorname{sign}(\Lambda)\, T^{-1} = \operatorname{sign}(A)$, and subsequently the convergence is established. The proof is complete.
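The similarity argument can be checked numerically in a few lines (our sketch, for a diagonalizable test matrix):

A = {{3., 1.}, {0., -2.}};
{T, J} = JordanDecomposition[A];
T . DiagonalMatrix[Sign[Re[Diagonal[J]]]] . Inverse[T]   (* {{1., 0.4}, {0., -1.}} = sign(A) *)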

Theorem 4. Consider the same conditions as in Lemma 2 and Theorem 3. Then the proposed method (12) has fifth order of convergence to the sign matrix $S = \operatorname{sign}(A)$.

Proof. Clearly, the iterates $X_k$ are rational functions of $A$ and hence, like $A$ itself, commute with $S$. On the other hand, we know that $S^2 = I$, $S^{-1} = S$, and $S X_k = X_k S$. Using the replacement $E_k = X_k - S$ (for the sake of simplicity), one can express $X_{k+1} - S$ in terms of $(X_k - S)^5$; we label the resulting relation (27). Now, using any matrix norm on both sides of (27), we attain
$$\left\| X_{k+1} - S \right\| \le \alpha \left\| X_k - S \right\|^{5},$$
for a suitable constant $\alpha$. This reveals the fifth order of convergence for the new method (12). The proof is complete.

3. Multiplication-Rich and Scaled Variants of the New Method

In general, reducing a problem in numerical linear algebra to a matrix inversion problem is not an advisable technique. The iterations (5)-(9), as well as (12), require the explicit computation of a matrix inverse in each iterative step. Since explicit use of the inverse matrix is relatively rare in numerical analysis, there is a natural desire to avoid or approximate it. Such discussions and variants of (12) are given in the following subsections.

3.1. Solve a Matrix Equation Instead of the Matrix Inverse

One of the ways to avoid explicit use of the inverse matrix is to solve a corresponding linear matrix equation instead of computing the matrix inverse. This is done in Algorithm 1 (a code sketch follows the listing).

(1) Given a suitable $X_0 = A$
(2) for $k = 0, 1, 2, \ldots$ until convergence do
(3)  Form the matrices $M_k$ and $N_k$ appearing in (12), so that the step reads $X_{k+1} = N_k M_k^{-1}$
(4)  Solve the linear system $X_{k+1} M_k = N_k$ for $X_{k+1}$
(5) end for
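In the spirit of Algorithm 1, the sketch below (ours) replaces the explicit inverse in the Newton step (5) by a LinearSolve call; the same device applies to the factors of (12), whose coefficients are not reproduced here. Since $(X_k + X_k^{-1})/2 = X_k^{-1}(X_k^2 + I)/2$, one step amounts to solving $X_k X_{k+1} = (X_k^2 + I)/2$:

newtonSignSolve[A_, tol_: 10.^-10, kmax_: 100] :=
  FixedPoint[
    LinearSolve[#, (# . # + IdentityMatrix[Length[#]])/2] &,
    N[A], kmax,
    SameTest -> (Norm[#1 - #2, 1]/Norm[#1, 1] < tol &)];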

3.2. Use Approximation of the Matrix Inverse

Another tendency is to derive matrix multiplication-rich iterations which retain the convergence rate of the method. For example, the inverse $X_k^{-1}$ in (5) can be replaced by one step of the Schulz method for the matrix inverse (seeded with $X_k$ itself), which has the form $X_k\left(2I - X_k^2\right)$. This replacement produces the Newton-Schulz iteration
$$X_{k+1} = \frac{1}{2} X_k \left(3I - X_k^2\right),$$
which is multiplication-rich and retains the quadratic convergence of Newton's method. However, it is only locally convergent, with convergence guaranteed for $\left\| I - A^2 \right\| < 1$ (see [1]). We apply a similar idea to (12). For the sake of simplicity, we denote by $Z_k$ the one-step Schulz approximation to the inverse that appears in (12). Replacing the exact inverse with $Z_k$ in (12), we get the iterative rule, labeled (30), for computing the matrix sign function.

Formula (30) is multiplication-rich, with convergence guaranteed for $\left\| I - A^2 \right\| < 1$.
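A minimal sketch of the Newton-Schulz iteration above (our code, not the paper's; it must be started from an iterate already satisfying the local convergence condition):

newtonSchulz[X0_, tol_: 10.^-10, kmax_: 100] :=
  FixedPoint[(# . (3 IdentityMatrix[Length[#]] - # . #))/2 &, N[X0], kmax,
    SameTest -> (Norm[#1 - #2, 1] < tol &)];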

Note that inverse-free algorithms are suitable for implementation on vector and parallel computers. The iterative scheme (30) requires 9 matrix multiplications per step, while (12) requires 6 matrix multiplications and one matrix inversion per step, to achieve fifth order of convergence.

3.3. A Hybrid Method

An efficient algorithm for finding the sign while avoiding the computation of matrix inverses is to use (12) or (33) in the initial iterations, until $X_k$ is close enough to $\operatorname{sign}(A)$, and in a subsequent stage to apply (30). Such a switching approach was proposed originally in [19]. A similar idea is exploited in Algorithm 2 (a code sketch follows the listing).

(1) Given a suitable $X_0 = A$
(2) use (12) until $\left\| I - X_k^2 \right\| < 1$
(3) set $X_0 = X_k$
(4) for $k = 0, 1, 2, \ldots$ until convergence do
(5)  Form the Schulz-type approximate inverse $Z_k$ from $X_k$
(6)  Compute $X_{k+1}$ by the inverse-free rule (30)
(7) end for
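A hedged sketch of the switching strategy of Algorithm 2 follows (ours), with the Newton step (5) standing in for (12) in the first phase, since the coefficients of (12) are not reproduced here:

hybridSign[A_, tol_: 10.^-10] := Module[{X = N[A], n = Length[A], id},
  id = IdentityMatrix[n];
  While[Norm[id - X . X, 1] >= 1,          (* phase 1: inversion-based steps *)
   X = (X + Inverse[X])/2];
  FixedPoint[(# . (3 id - # . #))/2 &, X,  (* phase 2: inverse-free steps *)
   100, SameTest -> (Norm[#1 - #2, 1] < tol &)]];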

3.4. Scaling Method

For lower order methods, such as (5), the convergence is slow at the beginning. In fact, once the error is sufficiently small (in practice, less than, say, 0.5), successive errors decrease rapidly, each being approximately the square of the previous one. However, initial convergence can be slow if the iteration has a large eigenvalue, that is, in the case $\rho(X_0) \gg 1$. Hence, a scaling approach that accelerates this beginning phase is necessary and can be carried out as follows [7] for Newton's method:
$$X_{k+1} = \frac{1}{2}\left(\mu_k X_k + \mu_k^{-1} X_k^{-1}\right), \qquad X_0 = A, \quad (31)$$
wherein $\mu_k$ is a suitable scaling parameter, for example,
$$\mu_k = \left|\det\left(X_k\right)\right|^{-1/n} \ \text{(determinantal scaling)} \qquad \text{or} \qquad \mu_k = \sqrt{\frac{\left\| X_k^{-1} \right\|}{\left\| X_k \right\|}} \ \text{(norm scaling)}. \quad (32)$$
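A sketch of the scaled Newton iteration (31)-(32) with determinantal scaling (our code; the common refinement of switching the scaling off once the iteration enters its fast phase is omitted for brevity):

scaledNewtonSign[A_, tol_: 10.^-10, kmax_: 100] :=
  Module[{X = N[A], n = Length[A], mu, Xold, k = 0},
   While[k++ < kmax,
    mu = Abs[Det[X]]^(-1/n);               (* determinantal scaling (32) *)
    Xold = X; X = (mu X + Inverse[mu X])/2;
    If[Norm[X - Xold, 1]/Norm[X, 1] < tol, Break[]]];
   X];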

Such an approach refines the initial matrix and provides a much more robust starting point, so that the convergence phase is reached rapidly.

The new iteration (12) is quite fast and reliable, as discussed in Sections 2 and 3. However, the way is open for speeding up its initial phase of convergence via the concept of scaling.

An effective way to enhance the initial speed of convergence is to scale the iterates prior to each iteration; that is, $X_k$ is replaced by $\mu_k X_k$ before the iteration step is applied. Doing this for (12) yields a scaled variant, labeled (33), where $\mu_k$ is chosen as in (32) and $X_0 = A$. Algorithm 3 illustrates an efficient method based on (33) for finding $\operatorname{sign}(A)$.

(1) Given a suitable $X_0 = A$
(2) use (33) until $\left\| I - X_k^2 \right\| < 1$
(3) set $X_0 = X_k$
(4) for $k = 0, 1, 2, \ldots$ until convergence do
(5)  Form the Schulz-type approximate inverse $Z_k$ from $X_k$
(6)  Compute $X_{k+1}$ by the inverse-free rule (30)
(7) end for

4. Experimental Results

There are basically two general ways to terminate a matrix iteration for computing $\operatorname{sign}(A)$: to stop when the iterate $X_k$ has relative error below a tolerance, or to stop when some suitable residual (such as $\left\| X_k^2 - I \right\|$) is below the tolerance. The relevant aim will in general be problem dependent. In any case, the stopping criterion is really important in matrix iterations.

The stopping criterion considered in this section is the safe strategy introduced in [1]:
$$\frac{\left\| X_{k+1} - X_k \right\|}{\left\| X_{k+1} \right\|} \le \varepsilon^{1/2}, \quad (34)$$
where $\varepsilon$ is the prescribed tolerance. This choice is safe because, close to convergence, the difference $\left\| X_{k+1} - X_k \right\|$ is of the order of the error in $X_k$.
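Expressed as a reusable test, the criterion (34) as reconstructed above reads (our sketch, matching the FixedPoint-based listings of Section 3):

stopTest[eps_] := (Norm[#2 - #1, 1]/Norm[#2, 1] <= Sqrt[eps] &);
(* e.g., FixedPoint[step, N[A], 100, SameTest -> stopTest[10.^-12]] *)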

For comparisons, we have used the globally convergent matrix methods (5), (6), and (12), implemented in Mathematica 8 with the built-in (machine) precision [21]. We used the Mathematica function Inverse to compute the required matrix inverses; its implementation is based on efficient numerical row reduction (Gaussian elimination).

Example 5. In this test, we examine the behavior of different iterative methods for finding the matrix sign function of 10 randomly generated complex square matrices as follows:
n = 70; number = 10; SeedRandom[...];
Table[A[l] = RandomComplex[{...}, {n, n}];, {l, number}];

The results of the comparisons, in terms of the number of iterations, are reported in Table 2. We remark that the closer the eigenvalues of a matrix are to the imaginary axis, the slower the convergence of the different methods becomes, and the greater the risk of encountering (nearly) singular iterates $X_k$ whose inverses cannot be computed.

In this example, we have used the stopping criterion (34) with a fixed tolerance $\varepsilon$, the matrix 1-norm, and $X_0 = A$ as the initial matrix.

Example 6. In order to confirm the acceleration via scaling, we have rerun Example 5 with the norm scaling. The results are summarized in Table 3. The numerical results again verify the effectiveness of the new method (12) in finding the matrix sign.

5. Summary

Interest in the sign function grew steadily starting from the 1970s and 1980s, initially among engineers and later among numerical analysts. Following this trend, in this study we proposed a new fifth order iterative method for finding the matrix sign function.

By applying basins of attraction in the complex plane, based on the results from [20], we concluded that the introduced method (12) has global convergence. We then showed theoretically that it is asymptotically stable. Some numerical experiments were studied in order to demonstrate the faster convergence, on the basis of a smaller number of iterations.

Several modifications of the introduced method have been established, such as a new inversion-free variant, a hybrid method, and a scaled version.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The second author gratefully acknowledges support from the Research Project 174013 of the Serbian Ministry of Science. The authors are grateful to an anonymous reviewer for carefully reading their paper and helping to improve the presentation.