On angles, projections and iterations

We investigate connections between the geometry of linear subspaces and the convergence of the alternating projection method for linear projections. The aim of this article is twofold: in the first part, we show that even in Euclidean spaces the convergence of the alternating method is not determined by the principal angles between the subspaces involved. In the second part, we investigate the properties of the Oppenheim angle between two linear projections. We discuss, in particular, the question of existence and uniqueness of "consistency projections" in this context.


Introduction
The interest in the convergence of sequences of iterates of projections of various types goes back at least to the mid-twentieth century. J. von Neumann's article [17] from 1949 can be considered one of the starting points of these investigations. In this article he shows that given a Hilbert space H and two closed subspaces M, N ⊂ H, with corresponding orthogonal projections P_M and P_N, respectively, the sequence defined by x_n := (P_N P_M)^n x_0 converges in norm to P_{M∩N} x_0 for every initial point x_0 ∈ H. An elementary geometric proof of von Neumann's theorem can be found in [14]. This result was later generalised to the case of more than two subspaces by I. Halperin in [8]. In [9], S. Kayalar and H. L. Weinert showed that the speed of convergence is determined by the Friedrichs numbers between the subspaces involved. This can be considered a geometric condition controlling the convergence behaviour. Note that in all these cases the order in which the projections are iterated is of crucial importance. In [1], I. Amemiya and I. Ando asked the question of whether convergence in norm can always be achieved provided that each projection appears infinitely often. This question was finally answered negatively by E. Kopecká and A. Paszkiewicz in [13], where they give an example of three subspaces and an iteration order without convergence in norm. More information concerning this phenomenon can be found in [11] and [12].
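Von Neumann's theorem is easy to observe numerically. The following sketch (in Python with numpy; the two planes and the iteration count are arbitrary choices for illustration) iterates P_M P_N on a starting point and recovers the orthogonal projection onto M ∩ N:

```python
import numpy as np

def orth_proj(A):
    """Orthogonal projection onto the column space of A."""
    Q, _ = np.linalg.qr(A)
    return Q @ Q.T

# Two planes in R^3 whose intersection is the z-axis.
M = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])  # span{e1, e3}
N = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # span{e1 + e2, e3}
PM, PN = orth_proj(M), orth_proj(N)

x = np.array([1.0, 2.0, 3.0])
for _ in range(200):          # iterate x -> P_M P_N x
    x = PM @ (PN @ x)

# M ∩ N is the z-axis, so the limit is the projection (0, 0, 3) of x_0.
assert np.allclose(x, [0.0, 0.0, 3.0], atol=1e-6)
```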
In Banach spaces, there are at least two natural generalisations of orthogonal projections: metric projections and linear projections. For the first one, the image is the point inside the subspace minimising the distance to the argument. It turns out that for iterations of metric projections, one cannot expect convergence of the iterates to the metric projection onto the intersection; see, for example, [20].
Recall that a linear mapping P on a Banach space X is called a linear projection if it satisfies the condition that P^2 = P. In this case, there are many positive results under the additional assumption that the projections are of norm one. For example, if the Banach space X is uniformly convex, convergence of iterates of norm-one projections was established by R. E. Bruck and S. Reich in [5]. This result was later generalised, for example, to further classes of Banach spaces by C. Badea and Y. I. Lyubich in [2]. A dichotomy for the speed of convergence of iterations of projections in Banach spaces which are uniformly convex of some power type has been exhibited by C. Badea and D. Seifert in [3]. More results on the convergence of the alternating algorithm for norm-one projections can be found in [7].
In the context of property (T) for certain groups, I. Oppenheim introduced in [19] an angle between linear projections in Banach spaces. This concept was developed further in [18], where a number of sufficient conditions for convergence of iterates of projections are given.
Iterations of non-orthogonal projections in Hilbert spaces, which then necessarily have norm larger than one, are of interest in the context of discrete linear inclusions and of Skorokhod problems; see, for example, [15, 21].
Since in most of the above results the convergence behaviour of the iterates of projections is determined, or at least influenced, by some kind of angle, one might hope that for non-orthogonal projections in Hilbert spaces the situation could be similar. More precisely, in the case of two linear projections, these projections are determined by two subspaces each: the range and the kernel. Moreover, in the case of Euclidean spaces, the concept of principal angles allows one to determine the relative position of two subspaces up to an isometry. Therefore one could hope that these data might determine the convergence of the iterates of these projections.
The aim of this article is twofold: in the first part, we show that even in Euclidean spaces the convergence of the iterates is not determined by the principal angles between the subspaces involved. In the second part, we investigate the properties of the Oppenheim angle between two linear projections and provide an example which shows that the modification of the definition of this angle introduced in [18] is indeed necessary.

Principal Angles
Principal angles are used to describe the geometric configuration of two subspaces of a real Hilbert space H up to orthogonal mappings. Given two finite-dimensional subspaces S_1, S_2 ⊆ H and denoting by q the minimum of the dimensions of S_1 and S_2, the principal angles θ_1, ..., θ_q ∈ [0, π/2] and the corresponding principal vectors u_k, v_k are defined inductively by

cos θ_k = max{⟨u, v⟩ : u ∈ S_1, v ∈ S_2, ∥u∥ = ∥v∥ = 1, ⟨u, u_i⟩ = ⟨v, v_i⟩ = 0 for i < k} = ⟨u_k, v_k⟩    (1)

for k = 1, ..., q. The principal angles can also be represented in terms of the orthogonal projections P_{S_1} and P_{S_2} onto S_1 and S_2, respectively. More precisely,

cos θ_k = √λ_k,

where λ_1 ≥ ... ≥ λ_q are the first q eigenvalues of the restriction of P_{S_2}P_{S_1} to S_2. This formula allows for a direct computation of the principal angles, thus avoiding the optimisation problems in (1). Moreover, it has the advantage that it also makes sense for infinite-dimensional subspaces; see, for example, [10]. The principal angles between two subspaces determine them completely up to a simultaneous rotation. This means that every rotation-invariant function of two subspaces can be written as a function of the principal angles between them; for example, the Dixmier angle and the Friedrichs angle are the smallest and the smallest non-zero principal angle, respectively.
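The eigenvalue representation lends itself to direct computation. The following sketch (numpy; the random subspaces are arbitrary choices) computes the principal angles both via the standard equivalent formulation as singular values of Q_1^T Q_2 and via the eigenvalues of the restriction of P_{S_2}P_{S_1} to S_2:

```python
import numpy as np

rng = np.random.default_rng(0)
S1 = rng.standard_normal((5, 2))   # basis of a 2-dimensional subspace of R^5
S2 = rng.standard_normal((5, 3))   # basis of a 3-dimensional subspace of R^5
Q1, _ = np.linalg.qr(S1)           # orthonormal bases
Q2, _ = np.linalg.qr(S2)
q = min(Q1.shape[1], Q2.shape[1])

# cos(theta_k) are the singular values of Q1^T Q2 (largest first)
cosines = np.linalg.svd(Q1.T @ Q2, compute_uv=False)

# Equivalently: cos(theta_k) = sqrt(lambda_k), where lambda_1 >= ... are the
# eigenvalues of the restriction of P_{S2} P_{S1} to S2.
P1 = Q1 @ Q1.T                     # orthogonal projection onto S1
restricted = Q2.T @ P1 @ Q2        # matrix of P_{S2} P_{S1}|_{S2} in basis Q2
lam = np.sort(np.linalg.eigvalsh(restricted))[::-1][:q]
assert np.allclose(cosines, np.sqrt(lam))
```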
Since the function which maps subspaces to their orthogonal complements commutes with rotations, knowing all the angles between two subspaces S_1 and S_2 is equivalent to knowing all the angles between S_1 and S_2^⊥. We denote by Θ(S_1, S_2) the ordered tuple of the principal angles between two subspaces S_1 and S_2. A more detailed exposition of these angles, in which the relations between various approaches to angles between subspaces—including principal angles and directed distances—are examined, can be found in [16, Chapter 5.15].

The Cross-Ratio of projective points
For four distinct points a_1, a_2, a_3, a_4 of the projective line P^1(R), the cross-ratio of these points, denoted by [a_1, a_2, a_3, a_4] ∈ R ∪ {∞}, is defined by

[a_1, a_2, a_3, a_4] = ((λ_1μ_3 − λ_3μ_1)(λ_2μ_4 − λ_4μ_2)) / ((λ_2μ_3 − λ_3μ_2)(λ_1μ_4 − λ_4μ_1)),

with the convention y/0 = ∞, where a_i = p(λ_i, μ_i), that is, λ_i, μ_i are the homogeneous coordinates of a_i. In our application, the denominator will never be zero, and so we can always assume that [a_1, a_2, a_3, a_4] ∈ R. We will need the following formula for the cross-ratio of one-dimensional subspaces of R^2.

Lemma 1. Let x, y, z, w ∈ R^2 \ {0} span pairwise distinct one-dimensional subspaces. Then

[span(x)^⊥, span(y)^⊥, span(z), span(w)] = (⟨x, z⟩⟨y, w⟩) / (⟨y, z⟩⟨x, w⟩).

Proof. Using (x_2, −x_1) as homogeneous coordinates for span(x)^⊥ and the Leibniz formula for the determinant, we can directly obtain this assertion from the definition.
The behaviour of the cross-ratio function with respect to permutations is well known. Since we will use the following lemma later, we state it at this point without proof (which can be found, for example, in [4, pp. 123-126]).

Lemma 2. The cross-ratio satisfies

[b, a, c, d] = [a, b, d, c] = [a, b, c, d]^{−1}

and

[a, c, b, d] = 1 − [a, b, c, d],

where a, b, c, d are pairwise distinct one-dimensional subspaces of R^2.
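Both lemmata can be checked numerically. In the sketch below (numpy; the four lines and the determinant normalisation of the cross-ratio are illustrative choices), each line is represented by a spanning vector:

```python
import numpy as np

def cross_ratio(a, b, c, d):
    """Cross-ratio of four lines in R^2, each given by a spanning vector,
    via the determinant formula [a,b,c,d] = (|a c| |b d|) / (|b c| |a d|)."""
    det = lambda u, v: u[0] * v[1] - u[1] * v[0]
    return (det(a, c) * det(b, d)) / (det(b, c) * det(a, d))

a, b, c, d = (np.array(v, float) for v in ([1, 0], [1, 1], [0, 1], [1, -2]))
r = cross_ratio(a, b, c, d)

# permutation identities of Lemma 2 (for this normalisation)
assert np.isclose(cross_ratio(b, a, c, d), 1 / r)
assert np.isclose(cross_ratio(a, b, d, c), 1 / r)
assert np.isclose(cross_ratio(a, c, b, d), 1 - r)

# Lemma 1: cross-ratio of two orthogonal complements and two lines
# expressed through inner products
perp = lambda v: np.array([v[1], -v[0]])   # spans the orthogonal complement
assert np.isclose(cross_ratio(perp(a), perp(b), c, d),
                  (a @ c) * (b @ d) / ((b @ c) * (a @ d)))
```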

The Oppenheim angle between linear projections
Let P_1, P_2 be two bounded linear projections in a Banach space. Assume that there is a bounded linear projection P_12 onto the intersection of the images of P_1 and P_2 satisfying P_12P_1 = P_12 and P_12P_2 = P_12. The cosine of the Oppenheim angle between P_1 and P_2 is then defined by

cos(∠(P_1, P_2)) = max{∥P_1(P_2 − P_12)∥, ∥P_2(P_1 − P_12)∥}.
In the case of two orthogonal projections P 1 , P 2 in a Hilbert space, the above angle coincides with the Friedrichs angle between the images of P 1 and P 2 . The subtraction of the projection P 12 in the definition above plays the role of the quotient in the definition of the Friedrichs angle. There need not always be such a projection P 12 . Moreover, the intersection of the images of P 1 and P 2 need not even be complemented. Two projections P 1 and P 2 are called consistent if such a projection does exist. We also call a projection P 12 with the above properties a consistency projection. The main interest in this angle lies in the fact that a large Oppenheim angle, that is, a small cosine in the above definition, implies that the iterations (P 1 P 2 ) n converge uniformly to a (consistency) projection onto im P 1 ∩ im P 2 . For a detailed discussion of these angles, we refer the reader to [18].
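As a sketch of how this cosine can be evaluated numerically (numpy; the function name and the concrete projections are illustrative choices), consider two orthogonal projections onto lines in R^2 whose images intersect only in {0}, so that P_12 = 0 is the consistency projection and the Oppenheim angle reduces to the Friedrichs angle:

```python
import numpy as np

def oppenheim_cos(P1, P2, P12):
    """Cosine of the Oppenheim angle w.r.t. a consistency projection P12
    (spectral norm)."""
    return max(np.linalg.norm(P1 @ (P2 - P12), 2),
               np.linalg.norm(P2 @ (P1 - P12), 2))

theta = 0.3
u = np.array([1.0, 0.0])
v = np.array([np.cos(theta), np.sin(theta)])
P1, P2 = np.outer(u, u), np.outer(v, v)   # orthogonal projections onto lines
P12 = np.zeros((2, 2))                    # projection onto im P1 ∩ im P2 = {0}

# for orthogonal projections the cosine of the Friedrichs angle between
# the two lines is |<u, v>| = cos(theta)
assert np.isclose(oppenheim_cos(P1, P2, P12), np.cos(theta))
```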

Classification of the Convergence Behaviour Through Principal Angles
In all that follows, let X be a Euclidean space. All geometric characterisations of the convergence behaviour we provide in this article are based on the two-dimensional case. Given two subspaces M, N ⊂ R^n, we use the notation M ⊕ N for the direct sum of M and N; that is, this notation indicates that M ∩ N = {0}.

The Two-Dimensional Case
For two projections P_1, P_2 : R^2 → R^2, the convergence behaviour of ((P_1P_2)^n)_{n=1}^∞ is determined by the geometric relation between the nullspaces and ranges of P_1 and P_2. The question of the convergence behaviour of this sequence of iterates is trivial if either of the projections is the identity or the zero mapping. Therefore we restrict ourselves to the case where the ranges of both projections are one-dimensional subspaces of R^2.

Proposition 1. Let P_1, P_2 : R^2 → R^2 be two non-trivial projections, that is, neither of them is the identity or zero. We set R_1 = im P_1, N_1 = ker P_1, R_2 = im P_2 and N_2 = ker P_2. The composition P_1P_2 has at most one non-zero eigenvalue λ, and it satisfies

λ = [N_1, N_2, R_2, R_1].

Proof. For k ∈ {1, 2}, we may write P_k x = ⟨v_k, x⟩ w_k, where w_k spans R_k and v_k spans N_k^⊥. In particular, we have

w_k = P_k w_k = ⟨v_k, w_k⟩ w_k

since P_k is a projection, and thus 1 = ⟨v_k, w_k⟩. Since R_1 = span(w_1), any non-zero eigenvector of P_1P_2 must be a multiple of w_1. Calculating P_1P_2w_1, we get

P_1P_2w_1 = ⟨v_2, w_1⟩ P_1w_2 = ⟨v_1, w_2⟩⟨v_2, w_1⟩ w_1,

and hence, using Lemma 1, we obtain

λ = (⟨v_1, w_2⟩⟨v_2, w_1⟩) / (⟨v_1, w_1⟩⟨v_2, w_2⟩) = [N_1, N_2, R_2, R_1].

In particular, we see that convergence occurs if and only if the modulus of the above number is smaller than one.
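Proposition 1 can be verified numerically. In the sketch below (numpy; the concrete ranges and kernels are arbitrary choices, and the determinant normalisation of the cross-ratio is an illustrative convention), a projection is built from spanning vectors of its range and kernel, and the non-zero eigenvalue of P_1P_2 is compared with the cross-ratio:

```python
import numpy as np

def proj(r, n):
    """Projection on R^2 with range span(r) and kernel span(n):
    P x = <v, x> r / <v, r>, where v is orthogonal to n."""
    v = np.array([n[1], -n[0]])
    return np.outer(r, v) / (v @ r)

def cross_ratio(a, b, c, d):
    det = lambda u, v: u[0] * v[1] - u[1] * v[0]
    return (det(a, c) * det(b, d)) / (det(b, c) * det(a, d))

r1, n1 = np.array([1.0, 0.0]), np.array([1.0, 1.0])
r2, n2 = np.array([0.0, 1.0]), np.array([2.0, 1.0])
P1, P2 = proj(r1, n1), proj(r2, n2)

# the unique non-zero eigenvalue of P1 P2 equals [N1, N2, R2, R1]
eigs = np.linalg.eigvals(P1 @ P2)
lam = eigs[np.argmax(np.abs(eigs))]
assert np.isclose(lam, cross_ratio(n1, n2, r2, r1))
```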
Remark 1. Note that for two projections P_1 and P_2 in R^2 with distinct one-dimensional images, the zero mapping is the uniquely determined projection onto the intersection of these ranges. Using the above result, we see that

|λ| = |[N_1, N_2, R_2, R_1]| ≤ ∥P_1P_2∥,

that is, the iterates converge whenever the "cosine" of the Oppenheim angle between the projections is smaller than one. Moreover, the above discussion shows that, in this particular case, there is a relation between the Oppenheim angle and the principal angles.

The three-dimensional case: Some additional information might be needed
Also in the three-dimensional case, we restrict ourselves to non-trivial projections, that is, we exclude both the identity mapping and the zero mapping. In this section, let P_1, P_2 be two non-trivial projections in R^3. In order to simplify the notation, we use the abbreviations R_k = im P_k and N_k = ker P_k for k ∈ {1, 2}. We start our investigation of the convergence behaviour with the observation that the problem is at its core still a two-dimensional problem. We call an eigenvector of the composition P_1P_2 non-trivial if the corresponding eigenvalue is neither zero nor one.
Lemma 3. For every non-trivial eigenvector v of P 1 P 2 , there exists a two-dimensional subspace E v such that the intersections of E v with R 1 , R 2 , N 1 , N 2 are all one-dimensional with trivial pairwise intersection and v ∈ E v .
Proof. Let λ be the eigenvalue corresponding to v, that is, P_1P_2v = λv. Set w = P_2(v) and E_v = span{v, w}. Note that w ∈ R_1 or v ∈ R_2 would mean that w = v = λv, while w ∈ N_1 or v ∈ N_2 would mean that P_1P_2v = 0, both contrary to the assumption that v is non-trivial. Therefore v and w are linearly independent and E_v is two-dimensional. Moreover, since v = λ^{−1}P_1w ∈ R_1, w = P_2v ∈ R_2, P_2(v − w) = 0 and P_1(w − λv) = 0, all four intersections are one-dimensional. Using this, we can easily deduce the triviality of the pairwise intersections.
The above lemma implies, in particular, that we still can interpret the ranges and kernels of the projections under consideration as projective points. Moreover, the crossratio of these points still carries the vital information regarding the convergence.

Lemma 4.
Let v be an eigenvector corresponding to the non-trivial eigenvalue λ of the operator P_1P_2 and E_v the associated subspace established in Lemma 3. Set R′_k = R_k ∩ E_v and N′_k = N_k ∩ E_v for k ∈ {1, 2}. Then

λ = [N′_1, N′_2, R′_2, R′_1],

where the cross-ratio is meant to be taken on E_v.
Proof. First, we show that P_k(E_v) ⊆ E_v for k ∈ {1, 2}. To this aim, first observe that, with w = P_2(v) as in Lemma 3,

P_1v = v, P_1w = λv, P_2v = w and P_2w = w.

Hence the mappings P′_k : E_v → E_v, x ↦ P_k(x), are well-defined projections for k ∈ {1, 2} with im P′_k = R′_k and ker P′_k = N′_k. Since, by construction, λ is the non-trivial eigenvalue of P′_1P′_2, we can use Proposition 1 to complete the proof:

λ = [N′_1, N′_2, R′_2, R′_1].

A plane with the properties of E_v in Lemma 3 is conversely always associated to a non-trivial eigenvector:

Lemma 5. Let P_1, P_2 be projections and let E be a two-dimensional subspace of R^3 such that the intersections of E with R_1, R_2, N_1, N_2 are all one-dimensional with trivial pairwise intersection. Then E ∩ R_1 is an eigenspace of P_1P_2 corresponding to a non-trivial eigenvalue.
Proof. As in the proof of Lemma 4 above, we obtain the well-definedness of the two projections P′_k : E → E, x ↦ P_k(x), which satisfy im P′_k = R_k ∩ E =: R′_k and ker P′_k = N_k ∩ E =: N′_k. By Proposition 1, the composition P′_1P′_2 has the non-zero eigenvalue

λ = [N′_1, N′_2, R′_2, R′_1]

with corresponding eigenspace R′_1. By construction, R′_1 is also an eigenspace of P_1P_2 corresponding to λ, and since the arguments in the cross-ratio function are pairwise distinct, λ ∉ {0, 1} (see, for example, [4, Proposition 6.1.3]).
In order to formulate a characterisation of convergence in three dimensions, we need a geometric lemma on the connection between angles and directed distances of subspaces. Recall that for subspaces M, N ⊆ X, the directed distance δ(M, N) is defined by

δ(M, N) = sup{dist(x, N) : x ∈ M, ∥x∥ = 1}.

A simple computation shows that δ(M, N) = sup{∥P_{N^⊥}x∥ : x ∈ M, ∥x∥ = 1}.

Lemma 6. Let H be a real Hilbert space, let S_1, S_2, V be three one-dimensional, pairwise distinct subspaces of H such that V ⊆ S_1 ⊕ S_2, and let W be another subspace of H such that (S_1 ⊕ S_2) ∩ W = {0}. Then

δ(S_1, V ⊕ W) / δ(S_2, V ⊕ W) = δ(S_1, V) / δ(S_2, V).

Proof. Let k ∈ {1, 2} and s_k ∈ S_k such that ∥s_k∥ = 1. Note that δ(S_k, V) = ∥P_{V^⊥}s_k∥ because both S_k and V are one-dimensional. From V ⊆ V ⊕ W we may conclude that P_{(V⊕W)^⊥} = P_{(V⊕W)^⊥}P_{V^⊥}. Since the subspaces S_1, S_2, V are pairwise distinct and V ⊆ S_1 ⊕ S_2, we have S_1 ⊆ S_2 ⊕ V, and so we may pick c ∈ R and v ∈ V such that s_1 = cs_2 + v. Then,

P_{V^⊥}(s_1) = c P_{V^⊥}(s_2) and P_{(V⊕W)^⊥}(s_1) = c P_{(V⊕W)^⊥}(s_2).

Comparing the norms of the first pair of expressions, we see that |c| = δ(S_1, V)/δ(S_2, V). Therefore, comparing norms in the second identity, we may conclude that

δ(S_1, V ⊕ W) = (δ(S_1, V)/δ(S_2, V)) δ(S_2, V ⊕ W).

Finally, as (S_1 ⊕ S_2) ∩ W = {0} and hence S_2 ⊄ V ⊕ W, we have δ(S_2, V ⊕ W) ≠ 0, which finishes the proof.
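The statement of Lemma 6 can be tested numerically. In this sketch (numpy; the concrete subspaces of R^4 are arbitrary choices satisfying the hypotheses), the directed distance is computed via orthonormal bases:

```python
import numpy as np

def delta(M, N):
    """Directed distance delta(M, N) = sup ||P_{N^perp} x|| over unit x in M,
    where M and N are given by matrices whose columns span them."""
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    P_Nperp = np.eye(N.shape[0]) - QN @ QN.T
    return np.linalg.norm(P_Nperp @ QM, 2)

# one-dimensional S1, S2, V in R^4 with V inside S1 + S2, and a subspace W
# with (S1 + S2) ∩ W = {0}
S1 = np.array([[1.0, 0.0, 0.0, 0.0]]).T
S2 = np.array([[0.0, 1.0, 0.0, 0.0]]).T
V = S1 + 2 * S2                          # V ⊆ S1 ⊕ S2
W = np.array([[0.0, 1.0, 1.0, 2.0]]).T
VW = np.hstack([V, W])                   # basis of V ⊕ W

# Lemma 6: delta(S1, V ⊕ W) / delta(S2, V ⊕ W) = delta(S1, V) / delta(S2, V)
lhs = delta(S1, VW) / delta(S2, VW)
rhs = delta(S1, V) / delta(S2, V)
assert np.isclose(lhs, rhs)
```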
Proposition 2. Let P_1, P_2 be two non-trivial projections in R^3 with two-dimensional ranges. There is at most one non-trivial eigenvalue λ of P_1P_2, and if it exists it satisfies the equation

|λ| = (δ(N_1, R_2) δ(N_2, R_1)) / (δ(N_1, R_1) δ(N_2, R_2)).

In particular, the iterates (P_1P_2)^n converge if and only if

δ(N_1, R_2) δ(N_2, R_1) < δ(N_1, R_1) δ(N_2, R_2),

that is, the convergence is determined by the angles between the ranges and kernels.
Proof. Assuming the existence of at least one non-trivial eigenvector v with corresponding eigenvalue λ, let E_v be the plane provided by Lemma 3. Since the kernels are one-dimensional, we have N_k ∩ E_v = N_k for k ∈ {1, 2}. We set R′_k = R_k ∩ E_v and Z = R_1 ∩ R_2, and note that Z ∩ E_v = {0}. Then, by comparing dimensions, we observe that R_1 = R′_1 ⊕ Z and R_2 = R′_2 ⊕ Z. Using Lemmata 4 and 6, we conclude that

|λ| = |[N_1, N_2, R′_2, R′_1]| = (δ(N_1, R′_2) δ(N_2, R′_1)) / (δ(N_1, R′_1) δ(N_2, R′_2)) = (δ(N_1, R_2) δ(N_2, R_1)) / (δ(N_1, R_1) δ(N_2, R_2)),

as claimed. Since the right-hand side does not depend on v, there is at most one non-trivial eigenvalue.
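A numerical sketch of Proposition 2 (numpy; the ranges and kernels below are arbitrary choices in general position):

```python
import numpy as np

def proj(R, N):
    """Projection with range = column span of R and kernel = column span of N."""
    B = np.hstack([R, N])
    E = np.diag([1.0] * R.shape[1] + [0.0] * N.shape[1])
    return B @ E @ np.linalg.inv(B)

def delta(M, N):
    QM, _ = np.linalg.qr(M)
    QN, _ = np.linalg.qr(N)
    return np.linalg.norm((np.eye(N.shape[0]) - QN @ QN.T) @ QM, 2)

# two projections on R^3 with two-dimensional ranges, one-dimensional kernels
R1 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
N1 = np.array([[1.0], [1.0], [2.0]])
R2 = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
N2 = np.array([[0.0], [1.0], [1.0]])
P1, P2 = proj(R1, N1), proj(R2, N2)

# the non-trivial eigenvalue (neither 0 nor 1) of P1 P2
eigs = np.linalg.eigvals(P1 @ P2)
lam = [z for z in eigs if not (np.isclose(z, 0) or np.isclose(z, 1))]

predicted = (delta(N1, R2) * delta(N2, R1)) / (delta(N1, R1) * delta(N2, R2))
assert len(lam) == 1 and np.isclose(abs(lam[0]), predicted)
```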
Remark 2. We conclude this section with a few observations concerning the validity of the above characterisation of convergence of the iterates.
1. Using the behaviour of the cross-ratio with respect to permutations of its arguments stated in Lemma 2, we obtain that, for two linear projections P_1, P_2 on R^3 with one-dimensional ranges, there is also at most one non-trivial eigenvalue and, if it exists, it satisfies the equation

|λ| = (δ(R_1, N_2) δ(R_2, N_1)) / (δ(R_1, N_1) δ(R_2, N_2)).

In particular, the iterates (P_1P_2)^n converge if and only if

δ(R_1, N_2) δ(R_2, N_1) < δ(R_1, N_1) δ(R_2, N_2).

This shows that also in this case, convergence of the iterates is determined by the angles between the ranges and the kernels.
2. Copying the above arguments, it is possible to show the same characterisation for projections on Hilbert spaces, where both have either one-dimensional images or one-dimensional kernels.
3. The "mixed case", that is, the case where one projection has a one-dimensional range and the other projection has a one-dimensional kernel, is more complicated. Let P_1 and P_2 be two projections on R^3 and assume that P_1 has a one-dimensional range and P_2 has a two-dimensional one. Using the results of this section, we may conclude that there is a unique non-trivial eigenvalue μ of P(im P_1, ker P_1)P(ker P_2, im P_2) and that the unique non-trivial eigenvalue of P_1P_2 is λ = 1 − μ. So, if

δ(im P_1, ker P_2) δ(im P_2, ker P_1) / (δ(im P_1, ker P_1) δ(im P_2, ker P_2)) < 1    (3)

and the cross-ratio [im P_1, ker P_2, im P_2, ker P_1] > 0, we have convergence of the iterates (P_1P_2)^n. In order to show that condition (3) alone is not enough, one can choose one-dimensional subspaces S_1, S_2, S_3, S_4 of R^3 together with a perturbed subspace S′_4 in such a way that the sequence ((P(S_1, S_4)P(S_3, S_2))^n)_{n=1}^∞ converges whereas ((P(S_1, S′_4)P(S_3, S_2))^n)_{n=1}^∞ does not. Since all the principal angles are the same in both cases, they do not determine the convergence behaviour on their own.
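The relation λ = 1 − μ in the mixed case can be observed numerically. In this sketch (numpy; the subspaces are arbitrary choices in general position), P(ker P_2, im P_2) is realised as I − P_2:

```python
import numpy as np

def proj(R, N):
    """Projection with range = span of the columns of R, kernel = span of N."""
    B = np.hstack([R, N])
    E = np.diag([1.0] * R.shape[1] + [0.0] * N.shape[1])
    return B @ E @ np.linalg.inv(B)

r1 = np.array([[1.0], [0.0], [1.0]])                  # 1-dim range of P1
n1 = np.array([[0.0, 1.0], [1.0, 0.0], [0.0, -1.0]])  # 2-dim kernel of P1
R2 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])   # 2-dim range of P2
n2 = np.array([[1.0], [1.0], [2.0]])                  # 1-dim kernel of P2
P1, P2 = proj(r1, n1), proj(R2, n2)

def nontrivial_eig(T):
    """Eigenvalues that are neither (numerically) zero nor one."""
    return [z for z in np.linalg.eigvals(T)
            if not (np.isclose(z, 0) or np.isclose(z, 1))]

# P(im P1, ker P1) = P1 and P(ker P2, im P2) = I - P2
(lam,), (mu,) = nontrivial_eig(P1 @ P2), nontrivial_eig(P1 @ (np.eye(3) - P2))
assert np.isclose(lam, 1 - mu)
```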

Higher dimensions: Angles are not enough
For dimensions higher than three, a characterisation using only the principal angles cannot work. We show this by giving a counterexample. It is somewhat similar to the one given in Remark 2, but in contrast to the situation there, in higher dimensions it seems to be unclear how to overcome the problem. The counterexample is built by combining two two-dimensional examples in a specific way. In order to do this, we first need a simple observation on operator matrices. For two operators T_1 : H_1 → H_1 and T_2 : H_2 → H_2 on Hilbert spaces H_1 and H_2, we denote by T_1 ⊕ T_2 : H_1 ⊕ H_2 → H_1 ⊕ H_2 the operator matrix

T_1 ⊕ T_2 = ( T_1 0 ; 0 T_2 ).

We consider the case of four projections P_1 and P_2 on H_1, and P_3 and P_4 on H_2. Then P_1 ⊕ P_3 and P_2 ⊕ P_4 are projections on H_1 ⊕ H_2 satisfying im(P_2 ⊕ P_4) = im P_2 ⊕ im P_4 and ker(P_2 ⊕ P_4) = ker P_2 ⊕ ker P_4 (and analogously for P_1 ⊕ P_3), as can be seen by a direct computation. Moreover, the spectrum of (P_1 ⊕ P_3)(P_2 ⊕ P_4) = P_1P_2 ⊕ P_3P_4 is just the union of the spectra of P_1P_2 and P_3P_4. The final observation needed for the example is that principal angles between two direct sums of subspaces are nothing but the combined principal angles between the individual subspaces in both summands.
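These observations on operator matrices can be checked directly. A sketch (numpy; the four sample projections are hypothetical choices for illustration, not the ones of the construction that follows):

```python
import numpy as np

def proj(R, N):
    B = np.hstack([R, N])
    E = np.diag([1.0] * R.shape[1] + [0.0] * N.shape[1])
    return B @ E @ np.linalg.inv(B)

def direct_sum(T1, T2):
    """Operator matrix diag(T1, T2) on H1 ⊕ H2."""
    return np.block([[T1, np.zeros((T1.shape[0], T2.shape[1]))],
                     [np.zeros((T2.shape[0], T1.shape[1])), T2]])

col = lambda *v: np.array(v, float).reshape(-1, 1)
P1, P2 = proj(col(1, 0), col(1, 1)), proj(col(0, 1), col(2, 1))
P3, P4 = proj(col(1, 1), col(1, 0)), proj(col(0, 1), col(1, 1))

Q1, Q2 = direct_sum(P1, P3), direct_sum(P2, P4)
assert np.allclose(Q1 @ Q1, Q1) and np.allclose(Q2 @ Q2, Q2)  # projections

# spectrum of Q1 Q2 is the union of the spectra of P1 P2 and P3 P4
spec = lambda T: np.sort_complex(np.linalg.eigvals(T))
union = np.concatenate([np.linalg.eigvals(P1 @ P2), np.linalg.eigvals(P3 @ P4)])
assert np.allclose(spec(Q1 @ Q2), np.sort_complex(union))
```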
We combine these projections by setting P_1 := P^1_1 ⊕ P^2_1 and P_{2,s} := P^1_{2,s} ⊕ P^2_{2,s} for s = ±1. Since the principal angles between direct sums of subspaces are just the combination of the principal angles between the individual subspaces, the principal angles between the ranges and kernels of P_1 and P_{2,−1} are the same as the ones between the ranges and kernels of P_1 and P_{2,1}. On the other hand, for the spectral radius we have ρ(P_1P_{2,1}) = 0 and ρ(P_1P_{2,−1}) = 2; that is, although the principal angles agree, the convergence behaviour is vastly different.

Some remarks on angles between linear projections
Let P_1, P_2 be two bounded linear projections in a Banach space. Recall that in order to define the Oppenheim angle between P_1 and P_2, we need a projection P_12 which satisfies P_12P_1 = P_12 and P_12P_2 = P_12. As noted in Remark 2.6 of [18, p. 346], such a projection need not be unique. Since no example is given in [18], we now give a simple example illustrating this phenomenon. Example 2. We consider two projections P_1 and P_2 on R^3 whose images intersect precisely in the z-axis, together with two projections P_12 and P′_12 onto im P_1 ∩ im P_2 which satisfy P_12P_1 = P_12, P_12P_2 = P_12, P′_12P_1 = P′_12 and P′_12P_2 = P′_12. So both projections are admissible in the definition of the Oppenheim angle. A direct computation shows that

∥P_1(P_2 − P_12)∥_1 = ∥P_2(P_1 − P_12)∥_1 = 1, but ∥P_1(P_2 − P′_12)∥_1 = ∥P_2(P_1 − P′_12)∥_1 = 2.

This shows that these projections result in different values for the Oppenheim angle. For the Euclidean norm, these two projections result in the same Oppenheim angle; taking, on the other hand, a different admissible pair of consistency projections, one obtains different angles for the Euclidean norm as well. Note that in the first case, we even have ∥P_12∥_1 = ∥P′_12∥_1 = 1.
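A concrete instance of this non-uniqueness can be sketched as follows (numpy; these matrices are an illustrative choice and not necessarily the ones of the example above): P_1 and P_2 share the kernel span{e_1}, their images are span{e_2, e_3} and span{e_1 + e_2, e_3}, which intersect in the z-axis, and two different projections onto the z-axis are admissible:

```python
import numpy as np

P1 = np.array([[0.0, 0.0, 0.0],     # orthogonal projection onto span{e2, e3}
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])
P2 = np.array([[0.0, 1.0, 0.0],     # projection onto span{e1 + e2, e3}
               [0.0, 1.0, 0.0],     # along span{e1}
               [0.0, 0.0, 1.0]])

# two different projections onto the z-axis = im P1 ∩ im P2
P12 = np.array([[0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0],
                [0.0, 0.0, 1.0]])   # kernel span{e1, e2}
P12p = np.array([[0.0, 0.0, 0.0],
                 [0.0, 0.0, 0.0],
                 [0.0, -1.0, 1.0]]) # kernel span{e1, e2 + e3}

# both are consistency projections: Q^2 = Q, Q P1 = Q and Q P2 = Q
for Q in (P12, P12p):
    assert np.allclose(Q @ Q, Q)
    assert np.allclose(Q @ P1, Q) and np.allclose(Q @ P2, Q)
```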
In infinite-dimensional Banach spaces, even the question of whether two projections P_1 and P_2 are consistent, that is, whether there is a projection P_12 onto the intersection of the ranges of P_1 and P_2 such that P_12P_1 = P_12P_2 = P_12, is of interest. Note that there are complemented subspaces with the property that their intersection is no longer complemented. In other words, it might happen not only that there is no projection satisfying the above condition, but that there is no bounded projection onto the intersection at all.
On the positive side, we can mention the following result of R. E. Bruck and S. Reich:

Proposition 3 (Theorem 2.1 in [5, p. 464]). Let X be a uniformly convex space and let P_1, ..., P_k be linear norm-one projections onto subspaces Y_1, ..., Y_k. Then the strong limit lim_{n→∞} (P_kP_{k−1} ··· P_1)^n x exists for each x ∈ X and defines a norm-one projection onto the intersection Y_1 ∩ ... ∩ Y_k.
Using this proposition together with the uniqueness of norm-one projections in smooth spaces, we obtain the following simple result: Proposition 4. Let X be a uniformly convex and smooth Banach space and let P 1 and P 2 be two norm-one projections in X. Then these projections are consistent, that is, there is a projection P 12 onto the intersection of the ranges of P 1 and P 2 with the property that P 12 P 1 = P 12 P 2 = P 12 .
Note that we cannot drop the assumption that X is smooth. This can be seen in the following four-dimensional example.