An Alternating Sequence Iteration's Method for Computing Largest Real Part Eigenvalue of Essentially Positive Matrices: Collatz and Perron-Frobenius' Approach

This paper describes a new numerical method for computing the eigenvalue with the largest real part of an essentially positive matrix. The process consists of computing lower and upper bounds that monotonically approach the eigenvalue. A numerical discussion derives the required number of mathematical operations of the new method, and comparisons between the new method and several well-known ones, such as the Power and QR methods, are discussed. Citation: Oepomo TS (2016) An Alternating Sequence Iteration's Method for Computing Largest Real Part Eigenvalue of Essentially Positive Matrices: Collatz and Perron-Frobenius' Approach. J Appl Computat Math 5: 334. doi: 10.4172/2168-9679.1000334


Introduction
A variety of numerical methods for finding eigenvalues of non-negative irreducible matrices have been reported over the last decades, and the mathematical and numerical aspects of most of these methods are well documented. A recent article by Tedja [19] presented the mathematical aspects of Collatz's eigenvalue inclusion theorem for non-negative irreducible matrices. It is the purpose of this manuscript to present the numerical implementation of [19]. Indeed, there is the hope that developing a new numerical method could lead to discovering properties responsible for better numerical methods for finding and estimating eigenvalues of non-negative irreducible matrices. Birkhoff and Varga [2] observed that the results of the Perron-Frobenius theory, and consequently Collatz's theorem, could be slightly generalized by allowing the matrices considered to have negative diagonal elements. They introduced the terms "essentially non-negative" for matrices whose off-diagonal elements are non-negative, and "essentially positive matrices" for essentially non-negative, irreducible matrices. The only significant change is that wherever Perron-Frobenius theory and Collatz's theorem refer to the spectral radius of a non-negative matrix A, the corresponding quantity for an essentially non-negative matrix Ã is the (real) eigenvalue of maximal real part in the spectrum of Ã, also denoted by Λ[Ã]. Of course, Λ[Ã] need not be positive, and it is not necessarily dominant among the eigenvalues in the absolute-value sense.

Definition
Incidentally, if A is an essentially positive matrix (a real matrix with positive off-diagonal entries), then A + aI has positive entries for sufficiently large positive a, so A + aI has a Perron eigenvalue, say p, with a corresponding eigenvector v having positive entries. But p is the only eigenvalue of A + aI for which there is a corresponding eigenvector with positive entries. Thus p - a is the only eigenvalue of A with a corresponding eigenvector having all its entries non-negative; p - a is real but need not be positive (since a could be greater than p). In this manuscript, the eigenvalue corresponding to a positive eigenvector is real. There may already be a term in the literature for "essentially positive": for example, Z-matrix, tau-matrix, M-matrix, and Metzler matrix all refer to matrices with off-diagonal entries of the same sign, but with extra conditions.
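The shift argument above can be checked numerically. In the sketch below, the matrix entries and the shift a are illustrative, not from the paper: we form A + aI, take its Perron eigenvalue p, and confirm that Λ[A] = p - a agrees with the eigenvalue of A with largest real part.

```python
import numpy as np

# Illustrative essentially positive matrix: negative diagonal allowed,
# strictly positive off-diagonal entries.
A = np.array([[-5.0,  1.0,  2.0],
              [ 1.0, -3.0,  1.0],
              [ 2.0,  1.0, -4.0]])

a = 6.0                      # any shift making A + aI entrywise positive
B = A + a * np.eye(3)
assert (B > 0).all()

# Perron eigenvalue p of the positive matrix B (real, simple, maximal).
p = max(np.linalg.eigvals(B), key=lambda z: z.real).real
Lam = p - a                  # Lambda[A]: eigenvalue of A with largest real part

Lam_direct = max(np.linalg.eigvals(A), key=lambda z: z.real).real
print(abs(Lam - Lam_direct) < 1e-10)
```

Since shifting by aI translates every eigenvalue by a, the eigenvalue of maximal real part is preserved up to the shift, which is exactly what the check confirms.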

Background
Let A be an n x n essentially positive matrix. The new method can be used to numerically determine the eigenvalue λ_A with the largest real part and the corresponding positive eigenvector x[A] of an essentially positive matrix. This numerical method is based on a previous manuscript [16]. A matrix A=(a_ij) is an essentially positive matrix if a_ij ≥ 0 for all i ≠ j, 1 ≤ i, j ≤ n, and A is irreducible.
Let x > 0 be any positive n-component vector [19]. The following theorem is an application of Corollary 2.3 from [16] to the design of a numerical method using the Perron-Frobenius-Collatz mini-max principle for the calculation of x[A] [10].

Numerical Implication of Theorem 1
We will now define a group of sequences, the "decreasing-sequence."

Decreasing-sequence
Let Y_r(x) (r=1, 2, 3,…, n) be an n-component vector-valued function such that the following equation is valid: Here Ω_r(x) (r=1, 2, 3,…, n) are scalar-valued functions with the following properties: • Ω_r(x) is a continuous positive-valued function which maps the set of positive vectors V+ into the set of real numbers R.
• Equality in (c) may hold only if Y_r(x_{r-1}) converges to x; this yields f_r(x)=M(x).
• If for some x > 0, Ω_r(x)=x_r for each r ∈ {1, 2, 3,…, n}, then this implies that f_n(x)=λ according to Collatz, hence x=x[A]. In Faddeev [25], the n-component vector-valued function Y_r(x) defined in equation (6) will be referred to as the decreasing functions. A sequence {x_p} (p=0, 1, 2, 3,…) of positive n-vectors is constructed which satisfies the conditions of Theorem 1. The terms of the sequence {x_p} are generated by the following recursive formula: If x_0 is given, the sequence {x_p} is completely defined. x_{p+1} and x_p differ only in the r-th component. Such a sequence will be called a decreasing maximum-ratio sequence, or briefly a decreasing-sequence.
Corollary 1: Any decreasing-sequence converges to x_A.
Note: An increasing sequence that is bounded above always has a finite limit: it is bounded above by its supremum, and it converges to that supremum. The same criterion holds for a decreasing sequence that is bounded below: it converges to its infimum. (Care is needed with the term "bounded," which means bounded both from above and from below; these statements hold for sequences of real numbers.) We will now define a second group of sequences, the "increasing-sequence".
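The mini-max principle behind these sequences is the Collatz inclusion: for a positive vector x, the ratios f_r(x) = (Ax)_r / x_r satisfy m(x) = min_r f_r(x) ≤ Λ[A] ≤ max_r f_r(x) = M(x). A minimal sketch, with an illustrative matrix not taken from the paper:

```python
import numpy as np

def collatz_bounds(A, x):
    """Collatz ratios f_r(x) = (A x)_r / x_r for positive x;
    m(x) = min_r f_r(x) and M(x) = max_r f_r(x) bracket Lambda[A]."""
    f = (A @ x) / x
    return f.min(), f.max()

# Illustrative non-negative irreducible matrix.
A = np.array([[1.0, 2.0, 1.0],
              [1.0, 3.0, 1.0],
              [2.0, 1.0, 2.0]])
x = np.ones(3)
m, M = collatz_bounds(A, x)          # here: row-sum bounds 4.0 and 5.0

Lam = max(np.linalg.eigvals(A), key=lambda z: z.real).real
print(m <= Lam <= M)
```

With x = (1, 1, 1) the ratios are just the row sums, so the bounds reduce to the familiar row-sum inclusion for the Perron root.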

Increasing-sequence
Let y_r(x) (r=1, 2, 3,…, n) be an n-component vector-valued function such that the following equation is valid: Here ω_r(x) (r=1, 2, 3,…, n) are scalar-valued functions with the following properties:
Equality in (c) may hold only if y_r(x_{r-1}) converges to x; this yields f_r(x)=m(x).
If for some x > 0, ω_r(x)=x_r for each r ∈ {1, 2, 3,…, n}, then this implies that f_n(x)=λ according to Collatz, hence x=x[A]. In Faddeev [25], the n-component vector-valued function y_r(x) defined in equation (9) will be referred to as the increasing functions. A sequence {x_p} (p=0, 1, 2, 3,…) of positive n-vectors is constructed which satisfies the conditions of Theorem 1. The terms of the sequence {x_p} are generated by the following recursive formula:
where y_k(x)=y_{k+n}(x) and y(x) is as defined in equation (9) (k=1, 2, 3,…). If x_0 is given, the sequence {x_p} is completely defined. Such a sequence will be called an increasing minimum-ratio sequence, or briefly an increasing-sequence.

Corollary 2: Any increasing-sequence converges to x_A.
Proof: Since ω_r(x) ≥ x_r, we have y_r(x) ≥ x. This implies that, starting with x_0 ≥ 0, we have x_{p+1}=y_{p+1}(x_p) ≥ x_p ≥ … ≥ x_0 ≥ 0, so that x_p converges as long as sup_p ||x_p|| < ∞.
Numerical tests indicate that alternating the application of the decreasing and increasing sequences converges faster than either sequence applied separately. Therefore, we define a sequence of vectors {x_p} constructed by alternating decreasing- and increasing-type functions.
We will describe a sequence of n steps which generates x_{j+1},…, x_{j+n} (j=0, n, 2n,…) in an iteration. If the decreasing functions (Y_r(x), r=1, 2,…, n) are used to generate the n terms of the sequence {x_p} during an iteration, as defined in equation (8), then we say the iteration is in decreasing mode. Similarly, the iteration is in increasing mode if the increasing functions are used, as defined in equation (12). Successive terms of the sequence {x_p} can be defined recursively as follows: where k=0 corresponds to the input vector. The first iteration may be in either the increasing or the decreasing mode. We also define the sequences of real numbers {t_ℓ} and {T_ℓ} as follows: t_0=m(x_0) and T_0=M(x_0). At the end of each iteration we consider the following inequalities: where ℓ (=1, 2, 3,…) indexes the iterations. If inequalities (14-1) and (14-2) are both met, we set the following, and the mode of the (ℓ+1)st iteration will differ from that of the ℓth iteration. If either inequality (14-1) or (14-2) is not satisfied, then the mode of the (ℓ+1)st iteration remains the same as that of the ℓth iteration, and we set: A sequence with the above-mentioned properties will be called an alternating sequence iteration.

Corollary 3: Any alternating sequence iteration converges to x_A.
Proof: If inequalities (14-1) and (14-2) are satisfied, the estimated result lies between the upper bound M(x_{n+1}) and the lower bound m(x_{n+1}), so no change in the iteration mode is required. Otherwise, the iteration mode must be switched, since if only one of those inequalities is met, the result does not lie between m(x_{n+1}) and M(x_{n+1}). The conditions stated in inequalities (14-1) and (14-2) ensure that the iteration results remain between the upper and lower bounds. As the iterations continue, M(x_{n+1}) and m(x_{n+1}) draw closer and closer together, which accelerates the rate of convergence; in the end, the quantity in equation (4) is almost zero.
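The squeezing of the bounds can be illustrated with the classical Collatz iterate x ← Ax. This is not the paper's alternating scheme (the Ω_r, ω_r updates are given in Theorems 2 and 3), but it already shows the lower bound m rising and the upper bound M falling toward Λ[A] for an illustrative positive matrix:

```python
import numpy as np

# Illustrative positive matrix (an essentially positive matrix can be
# shifted by aI first, which moves both bounds by the same a).
A = np.array([[2.0, 1.0, 1.0],
              [1.0, 3.0, 1.0],
              [1.0, 1.0, 4.0]])
x = np.ones(3)

lowers, uppers = [], []
for _ in range(20):
    f = (A @ x) / x
    lowers.append(f.min())       # m(x): lower bound for Lambda[A]
    uppers.append(f.max())       # M(x): upper bound for Lambda[A]
    x = A @ x
    x /= x.sum()                 # renormalize to avoid overflow

monotone_up = all(a <= b + 1e-12 for a, b in zip(lowers, lowers[1:]))
monotone_down = all(a >= b - 1e-12 for a, b in zip(uppers, uppers[1:]))
print(monotone_up and monotone_down)
```

For a non-negative irreducible matrix, these min and max ratios are classically known to be monotone under power iteration, which is the same squeezing behavior the alternating scheme is designed to accelerate.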
Corollaries 1, 2, and 3 above lay the foundation of an iterative method for determining the positive eigenvector of an essentially positive matrix. The choice of the functions Ω_r(x) and ω_r(x) is open, subject to the restrictions specified in connection with the decreasing and increasing sequences. In Theorems 2 and 3, which follow, a possible choice for Ω_r(x) and ω_r(x) is given.

Theorem 2
Let H_r(x) (r=1, 2, 3,…, n) denote continuous, positive-valued functions which map the set of positive vectors V+ into the set of real numbers R such that m(x) ≤ H_r(x) ≤ M(x) (16), where equality may hold on either side of equation (16). Here H_r ≡ H_r(x), f_r ≡ f_r(x), and all notations are as defined in equations (1)-(5). Then the functions Ω_r(x) (together with a starting vector x_0) define a decreasing sequence.
Proof: We first show that if f_r < H_r, the term (H_r - a_rr) in equation (17) is always positive. By the definitions in equation (1), for an essentially positive matrix not all the off-diagonal elements can be zero, and consequently for any vector x > 0, z_r > 0. Therefore (18) becomes 0 < H_r - a_rr. (19) It is clear from equations (17) and (19) that Ω_r(x) is a positive-valued function. Equation (17) can also be used to derive estimates for the function θ_r ≡ θ_r(x)=f_r(Y_r(x)). With the abbreviations used before, the above inequalities are equivalent to θ_r=max[f_r, H_r]. θ_r is the maximum of two continuous functions and is therefore continuous. By definition, f_r and H_r are each less than or equal to M; therefore, from equation (21), θ_r ≤ M. Thus Ω_r(x) has property (c) of equation (7). Equation (20) implies θ_r > f_r. (23) From equation (22), therefore, Ω_r(x) ≤ x_r. Thus Ω_r(x) has property (b) of equation (7). As stated before, for an essentially positive matrix z_r is positive, and therefore it is obvious from (22) that θ_r > a_rr; thus the denominator in equation (24) is positive. Therefore, by the established continuity of θ_r(x), Ω_r(x) is also continuous. Thus Ω_r(x) has property (a) of equation (7). Thus Ω_r(x) has property (d) in equation (12) of decreasing sequences.
Finally, suppose that Ω_r(x)=x_r for r=1, 2, 3,…, n, and thus θ_r(x)=f_r(x). From equation (20), it follows that Ω_r(x) has property (e) of equation (7) of decreasing sequences. This completes the proof of Theorem 2.

Theorem 3
Let h_r (r=1, 2, 3,…, n) denote continuous bounded functions mapping the set of positive vectors V+ into the set of real numbers R, where equality may hold on either side of equation (24). For an essentially positive matrix not all the off-diagonal elements can be zero, and accordingly for any vector x > 0, z_r > 0; therefore equation (28) becomes 0 > h_r - a_rr. It is clear from equations (26) and (28) that ω_r(x) is a positive-valued function. Equation (26) can also be used to derive estimates for the function θ_r ≡ θ_r(x)=f_r(y_r(x)). With the abbreviations used before, θ_r=f_r if: The above inequalities are equivalent to: θ_r is the minimum of two continuous functions and is thus continuous. By definition, f_r and h_r are each greater than or equal to m; therefore, from equation (30), θ_r ≥ m. Thus ω_r(x) has property (c) of equation (11) of increasing sequences. From the definitions of θ_r, ω_r(x), and z_r we get ω_r(x) ≥ x_r; thus ω_r(x) has property (b) of equation (11). As stated before, for an essentially positive matrix z_r is positive, and hence it is obvious from (31) that θ_r < a_rr; thus the denominator in equation (33) is negative. Accordingly, by the established continuity of θ_r(x), ω_r(x) is also continuous. Thus ω_r(x) has property (f) of equation (11) of increasing sequences. From equation (30), θ_r=m(x) is possible only if min(f_r, h_r)=m(x). As h_r > m(x) by assumption unless f_r(x)=m(x), in any case θ_r(x)=m(x) implies f_r(x)=m(x). Thus ω_r(x) has property (i) of equation (11) of increasing sequences. Finally, suppose that ω_r(x)=x_r for r=1, 2, 3,…, n, and thus θ_r(x)=f_r(x). From equation (29), θ_r ≤ h_r; as θ_r=f_r, we get f_r ≤ h_r. The last inequality is equivalent to min(f_r, h_r)=f_r.

The Requirements of Functions H r (x), and h r (x)
The functions H_r(x) and h_r(x) can be selected in many ways. The following are a few of the possible choices:

New Alternating Iteration Method
A step of the alternating sequence iteration method consists of modifying a single component x_r of x. As a result, z_i, f_i, ||x||, and υ(x) have to be recomputed at each step. Computing z_i and ||x|| from their definitions in equations (1) and (5) will be referred to as recalculating. A considerable reduction in computation can be accomplished if, instead of recalculating, these quantities are merely updated according to the following steps: These steps will be referred to as the updating iteration. The updating equations are obtained easily from equations (1)-(5). To prevent the accumulation of round-off errors, after a number of iterations the variables have to be recalculated instead of updated. In double precision, our previous experience indicates that it is more than sufficient to recalculate after every twenty-five iterations.
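The updating idea can be sketched for z = Ax: when only the component x_r changes, z needs only a single column update, O(n) work instead of the O(n²) recalculation. The matrix and indices below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
A = rng.random((n, n)) + 0.1     # illustrative positive test matrix
x = np.ones(n)
z = A @ x                        # recalculating: O(n^2) work

# Change a single component x_r; updating z costs only O(n):
r, new_xr = 2, 1.7
z += A[:, r] * (new_xr - x[r])   # updating step: one column, one scalar
x[r] = new_xr

# In practice one recalculates (z = A @ x) every ~25 steps to curb round-off.
print(np.allclose(z, A @ x))
```

The updated z matches a full recalculation to machine precision, which is why periodic recalculation (rather than per-step) suffices to control accumulated round-off.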

Over-Relaxation Method
Of the various choices for the functions H_r(x), h_r(x), and υ(x), the one indicated in equation (34) seems to give rapid convergence, at least for full matrices. The purpose of this section is to present a variant of equation (35) by introducing the over-relaxation technique. We consider the following equation, where γ is the over-relaxation factor and 1 ≤ γ ≤ 2. Equation (36) may be useful in the case of banded matrices. The over-relaxation method contains the following cases: • γ=1, the simultaneous over-relaxation method, and • 1 < γ < 2, the over-relaxation method.
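As a sketch of the general form of such an update (the function name and vectors are illustrative; the paper's equation (36) applies this idea to the component-wise iterate):

```python
import numpy as np

def relaxed_step(x, y, gamma):
    """Over-relaxed update in the style of equation (36): move from the
    current iterate x toward the proposed iterate y, overshooting by gamma."""
    return x + gamma * (y - x)

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.5, 2.5, 2.0])

print(np.allclose(relaxed_step(x, y, 1.0), y))          # gamma = 1: plain step
print(np.allclose(relaxed_step(x, y, 2.0), 2 * y - x))  # gamma = 2: full overshoot
```

With γ=1 the relaxed step reduces to the unrelaxed iterate, and values 1 < γ < 2 overshoot in the direction of the update, which is what can pay off for banded matrices.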
Error vector: In all methods, the quantity Δ(x)=M(x)-m(x), as indicated in (4), is used as the measure of accuracy.

Discussion
Before we go any further, the following issues should be understood. Are both eigenvalues and eigenvectors required, or are eigenvalues alone enough? Is only the eigenvalue of largest modulus of interest? Does the matrix have special characteristics, such as being real symmetric or essentially positive? If all eigenvalues and eigenvectors are required, then this new method cannot be used. If a matrix A is essentially positive and the positive eigenvector x_A and the corresponding eigenvalue λ_A are of particular interest, then the new method can be used. Each step of the numerical method requires n² + O(n) computations if the parameters are chosen for the best rate of convergence. It is reasonable to assume that in half the steps practically no computations are needed, resulting in n²/2 + O(n) computations per iteration. As previously stated, after some iterations the variables have to be recalculated instead of updated. Recalculation needs n² + O(n) additional computations. If the computations are performed in double precision, recalculation need not be performed often; as a result, recalculation does not increase the total number of computations significantly.
For our numerical comparisons, all three methods (Power, the new method, and QR) were used to solve the eigenvalue problem for the following matrices. First, all three methods were used to estimate the eigenvalue of Hilbert matrices of various orders. Let H_n be the Hilbert matrix of order n; the elements of the Hilbert matrix are defined according to the following relation: The results of the three methods can be seen in Tables 1-3.
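The paper's defining relation is elided above; the standard Hilbert matrix has entries H[i, j] = 1/(i + j - 1) in 1-based indexing. A sketch comparing a plain power-method estimate of the dominant eigenvalue against a dense solver:

```python
import numpy as np

def hilbert(n):
    """Hilbert matrix: H[i, j] = 1 / (i + j - 1) with 1-based indices."""
    i, j = np.indices((n, n))
    return 1.0 / (i + j + 1)         # 0-based indices, hence +1

def power_largest(A, iters=200):
    """Plain power-method estimate of the dominant eigenvalue."""
    x = np.ones(A.shape[0])
    for _ in range(iters):
        x = A @ x
        x /= np.linalg.norm(x)
    return x @ A @ x                 # Rayleigh quotient of the final iterate

H5 = hilbert(5)
est = power_largest(H5)
exact = np.linalg.eigvalsh(H5).max()
print(abs(est - exact) < 1e-10)
```

Hilbert matrices are symmetric positive definite with a well-separated dominant eigenvalue, so the power method converges quickly here even though the matrices are notoriously ill-conditioned for linear solves.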

• We would like to find the efficiency of the three numerical methods when a matrix has eigenvalues of nearly the same modulus, so it was decided to pick a matrix of order n that is almost cyclic (C_n). Consider the matrix below: If the elements A_{1,1} and A_{2,2} were replaced by zero, then the matrix would be nearly cyclic. For comparison, the results of the three methods can be seen in Tables 4-6.
• Introducing a proper shift of origin can speed up the convergence of the power method [9], so it was decided to try a matrix for which introducing a shift of origin would not help the speed of convergence. Such a matrix of order n (Q_n) is given by the following relations.
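Why near-cyclic matrices are a hard case can be seen from the exactly cyclic limit. The paper's C_n is elided; the circulant permutation matrix below is an illustrative stand-in whose eigenvalues are all roots of unity, so their moduli coincide and there is no dominant eigenvalue to separate:

```python
import numpy as np

n = 6
C = np.roll(np.eye(n), 1, axis=1)    # cyclic permutation matrix
moduli = np.abs(np.linalg.eigvals(C))

# All eigenvalues are n-th roots of unity: equal moduli, no dominant one,
# which is the regime where modulus-ratio-based iterations stall.
print(np.allclose(moduli, 1.0))
```

Perturbing a couple of diagonal entries away from zero, as in the text, breaks the exact tie but leaves the separation between moduli very poor.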
The results of our tests are indicated in Tables 7-10 for Arnoldi.
We will assume that we are interested in the positive eigenvector and the corresponding eigenvalue of the essentially positive matrix. From our trials it is clear that in all three cases the rate of convergence of the new method is better than, or at least as fast as, that of the power method [9]. The QR method [26] converges very slowly in the last two cases, when the separation between the eigenvalues is poor. Consider the results of case b, when the matrix is nearly cyclic. For a cyclic matrix there are some eigenvalues of equal modulus, so for a matrix that is "nearly cyclic" it is plausible to assume the separation between the moduli is very poor. The new method takes about 5,700 multiplications and divisions to reach an accuracy of 8 digits, which is about five times the computations of the power method; for comparable effort, the power method and the QR method reach an accuracy of only 2 digits and 4 digits, respectively. We should remember that the QR method is not specifically designed to calculate just one eigenvalue; therefore, comparable efficiency cannot be expected. Thus, from our experience, we can conclude that the new method shows a good speed of convergence even when the separation of the eigenvalues is poor. However, in the case of banded matrices the new method converges slowly. The new method was tried on various banded matrices arising from finite-difference approximations to boundary value problems of ordinary differential equations. A computer code was written specially for banded matrices, to take advantage of the large number of zero elements in a banded matrix. We summarize here the results of our computer runs with the following (20,20) matrix. The over-relaxation method, as described in equation (36), was tried on the previously mentioned matrix with values of γ ranging from 1 to 1.99. The speed of convergence did not show a remarkable dependence on γ.
An accuracy of 8 digits was obtained in 168 iterations for γ=1.73, whereas for full matrices the new method gave an accuracy of 9 digits in 21 steps.
We now return our attention to full matrices. Let R_n be a matrix (of order n) with pseudo-random entries. The new method and the power method were tried on each family of matrices (R_n, C_n, H_n) of order n=20, 40, and 80. The speed of convergence is almost the same for the two methods, remembering that each iteration step of the power method requires about twice as many computations. Within one method, it is somewhat surprising that the number of iteration steps required for a given accuracy hardly depends on the order of the Hilbert matrix at all.
No convergence rates are presented, but the experiments shown indicate linear convergence comparable to classical methods such as the power method. Of course, other methods, such as Rayleigh quotient iteration, exhibit faster convergence rates (usually cubic).
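For contrast, a minimal Rayleigh quotient iteration sketch for a symmetric matrix (the example matrix and starting vector are illustrative; this is the cubically convergent method mentioned above, not the paper's):

```python
import numpy as np

def rqi(A, x, iters=8):
    """Rayleigh quotient iteration; converges cubically for symmetric A."""
    for _ in range(iters):
        mu = (x @ A @ x) / (x @ x)               # current Rayleigh quotient
        try:
            x = np.linalg.solve(A - mu * np.eye(len(x)), x)
        except np.linalg.LinAlgError:
            break                                # mu hit an eigenvalue exactly
        x /= np.linalg.norm(x)
    return (x @ A @ x) / (x @ x)

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
mu = rqi(A, np.array([1.0, 0.1]))
print(min(abs(mu - lam) for lam in np.linalg.eigvalsh(A)) < 1e-8)
```

The trade-off is that each RQI step requires a linear solve with a shifted, nearly singular matrix, whereas the methods compared in this paper use only matrix-vector products and component updates.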
A table of computational results is presented for a special class of dense matrices, namely Hilbert matrices and others of similar structure. The newly proposed method is about 10% faster (in number of operations) than the power method (compare 24,000 operations in Table 2 to 27,880 operations in Table 1 for an error of the order of 10^-8). For sparse matrices stemming from standard PDE problems, the new method is inferior.