A rigorous version of R. P. Brent's model for the binary Euclidean algorithm

The binary Euclidean algorithm is a modification of the classical Euclidean algorithm for computation of greatest common divisors which avoids ordinary integer division in favour of division by powers of two only. The expectation of the number of steps taken by the binary Euclidean algorithm when applied to pairs of integers of bounded size was first investigated by R. P. Brent in 1976 via a heuristic model of the algorithm as a random dynamical system. Based on numerical investigations of the expectation of the associated Ruelle transfer operator, Brent obtained a conjectural asymptotic expression for the mean number of steps performed by the algorithm when processing pairs of odd integers whose size is bounded by a large integer. In 1998 B. Vallée modified Brent's model via an induction scheme to rigorously prove an asymptotic formula for the average number of steps performed by the algorithm; however, the relationship of this result with Brent's heuristics remains conjectural. In this article we establish previously conjectural properties of Brent's transfer operator, showing directly that it possesses a spectral gap and preserves a unique continuous density. This density is shown to extend holomorphically to the complex right half-plane and to have a logarithmic singularity at zero. By combining these results with methods from classical analytic number theory we prove the correctness of three conjectured formulae for the expected number of steps, resolving several open questions promoted by D. E. Knuth in The Art of Computer Programming.


Introduction
The classical Euclidean algorithm for the computation of the greatest common divisor (GCD) of a pair of natural numbers has been described as the oldest nontrivial algorithm which remains in use to the present day [22, p.335]. The investigation of the number of division steps required by the Euclidean algorithm dates back at least to the 16th century, when it was observed that pairs of consecutive Fibonacci numbers result in particularly long running times [38]. The mathematically rigorous analysis of the number of division steps began in the mid-19th century with P.-J.-É. Finck's demonstration in [13] that the number of division steps required for the algorithm to process a pair of integers is bounded by a constant multiple of the logarithm of the larger of the two integers (see [39] for historical details).
Asymptotic expressions for the mean number of division steps required to process a pair of natural numbers (u, v) such that 1 ≤ u ≤ v ≤ n were obtained in the twentieth century by J. D. Dixon [9] and H. Heilbronn [17] and were subsequently refined by J. W. Porter [36]. In 1994 it was shown by D. Hensley [19] that the distribution of the number of division steps about its mean is asymptotically normal in the limit as n → ∞, and this result has been extended and generalised by V. Baladi and B. Vallée [2,6].
The binary Euclidean algorithm, proposed in 1967 by J. Stein [41] but possibly used in 1st-century China [22, p.340], is a variant of the Euclidean algorithm which is adapted to the requirements of binary arithmetic, and is one of the fundamental algorithms for the computation of greatest common divisors. In sharp contrast to the classical Euclidean algorithm it is one of the least well-understood algorithms for GCD computation [44, §3]. Early heuristic investigations by R. P. Brent [3] led to a conjectured asymptotic expression for the mean number of steps performed by the binary Euclidean algorithm which remains unproved: B. Vallée has shown rigorously that the mean number of steps performed by the algorithm grows logarithmically with the size of the input [42], but the relationship of her result to the heuristic formulae given in earlier research remains conjectural. The purpose of this article is to directly transform the heuristic investigations of R. P. Brent into a rigorous argument and to prove the validity of the various conjectured asymptotic expressions for the mean number of steps, resolving a number of open questions promoted by D. E. Knuth.

Overview of previous results
Let us now describe in detail the binary Euclidean algorithm and the current state of its analysis. The binary Euclidean algorithm begins with the following observation: given an arbitrary pair of natural numbers $(u, v)$ it is sufficient to compute the greatest common divisor of the odd parts of $u$ and $v$ respectively, since if $(u, v) = (2^k a, 2^\ell b)$ for odd numbers $a$ and $b$ then $\gcd(u, v) = 2^{\min\{k,\ell\}} \gcd(a, b)$. Given a pair of odd natural numbers $(u, v)$ with $u \le v$, the algorithm operates as follows. If $u$ and $v$ are equal then their common value is returned as the value of the greatest common divisor. Otherwise, since $u$ and $v$ are odd, their difference $v - u$ is even, and there exists a greatest natural number $k$ such that $v - u$ is divisible by $2^k$. The pair $(u, v)$ is replaced with the new pair of odd natural numbers $(u, 2^{-k}(v - u))$, and if the former of these two numbers is greater than the latter then the two are exchanged. This sequence of steps is repeated until a pair of equal numbers is obtained, whereupon their common value is returned as the GCD. Since the maximum of the two integers is strictly decreased by every iteration it is clear that the algorithm eventually terminates.
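The procedure just described can be sketched in a few lines of Python. This is an illustrative sketch only: the function names `odd_part_split`, `binary_gcd_odd` and `binary_gcd` are hypothetical and do not come from the article, and the three counters anticipate the quantities (steps, divisions by two, exchanges) discussed later in the text.

```python
def odd_part_split(n: int):
    """Write n = 2**k * a with a odd; return (k, a)."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k, n


def binary_gcd_odd(u: int, v: int):
    """Main loop on a pair of odd positive integers.

    Returns (gcd, steps, shifts, exchanges), where steps counts
    subtractions, shifts counts divisions by two, and exchanges
    counts swaps performed after a subtraction.
    """
    assert u > 0 and v > 0 and u % 2 == 1 and v % 2 == 1
    if u > v:
        u, v = v, u  # normalise so that u <= v before the loop starts
    steps = shifts = exchanges = 0
    while u != v:
        k, d = odd_part_split(v - u)  # v - u is even because u, v are odd
        steps += 1
        shifts += k
        v = d
        if u > v:  # keep the smaller number first
            u, v = v, u
            exchanges += 1
    return u, steps, shifts, exchanges


def binary_gcd(u: int, v: int) -> int:
    """GCD of arbitrary positive integers via their odd parts."""
    k, a = odd_part_split(u)
    l, b = odd_part_split(v)
    g, _, _, _ = binary_gcd_odd(a, b)
    return (1 << min(k, l)) * g
```

For the pair (3, 7) the loop first replaces 7 by (7 − 3)/4 = 1 and exchanges, then replaces 3 by (3 − 1)/2 = 1 and stops, so two subtractions, three divisions by two and one exchange are recorded.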
The analysis of the mean number of steps required for the algorithm to terminate was first attempted by R. P. Brent [3] using a heuristic argument which we now describe (see footnote 2). We first note that the number of steps required to process the pair of odd numbers $(u, v)$ is unaffected if both numbers are divided by their GCD, and by identifying the pair of numbers with the result of that operation we may view the algorithm as acting instead on fractions $u/v \in (0, 1]$ with odd numerator and denominator, which we will refer to as odd fractions. In this representation each iteration of the algorithm transforms the odd fraction $u/v$ to the odd fraction $T_k(u/v)$, where $k$ is the maximum integer such that $2^k$ divides $v - u$ and
$$T_k(x) := \begin{cases} \dfrac{2^k x}{1-x} & \text{if } 0 < x \le \dfrac{1}{1+2^k}, \\[2mm] \dfrac{1-x}{2^k x} & \text{if } \dfrac{1}{1+2^k} < x < 1. \end{cases}$$
The exact number of steps required to process the pair of odd natural numbers $(u, v)$ is thus equal to the least integer $n \ge 0$ such that
$$T_{k_n} \circ \cdots \circ T_{k_1}(u/v) = 1,$$
where for each $i = 1, \ldots, n$ the integer $k_i$ is equal to the number of factors of 2 which divide the difference between the numerator and the denominator of the odd fraction $T_{k_{i-1}} \circ \cdots \circ T_{k_1}(u/v)$. In the set of all odd fractions $u/v \in (0, 1]$ such that $v \le n$, the probability that the integer $k_1$ defined above is equal to a fixed natural number $k$ converges to $2^{-k}$ in the limit as $n \to \infty$.
Footnote 1. The Art of Computer Programming uses a scale from 0 to 50 to rank the difficulty of exercises, where 0 denotes triviality and 50 indicates a formidable unsolved research problem. The problems solved in this article (exercises 31 and 34 of [22, §4.5.2]) are rated at difficulties of 46 and 49 respectively. To place these figures in perspective, examples of "exercises" rated 50 include the Diophantine equation $a^n + b^n + c^n = d^n$ in integers with $n > 4$, the equidistribution of $(3/2)^n$ modulo 1, and the existence of infinitely many Mersenne primes (see respectively pages xi, 180 and 413 of [22]).
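The action on odd fractions can be explored with exact rational arithmetic. The sketch below assumes that the branch map sends x = u/v to the ratio of the smaller to the larger of u and (v − u)/2^k, as the description of the algorithm implies; the names `T`, `steps_to_one` and `first_shift_frequency` are illustrative and hypothetical. The last function estimates empirically how often the first shift k_1 equals a given k among reduced odd fractions with denominator at most n.

```python
from fractions import Fraction
from math import gcd


def two_adic_valuation(n: int) -> int:
    """Largest k such that 2**k divides the positive integer n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def T(k: int, x: Fraction) -> Fraction:
    """One step on an odd fraction x in (0, 1), for the shift k of x."""
    y = (2**k * x) / (1 - x)  # ratio u / ((v - u) / 2**k)
    return y if y <= 1 else 1 / y  # exchange when the ratio exceeds 1


def steps_to_one(x: Fraction) -> int:
    """Number of iterations needed to carry the odd fraction x to 1."""
    n = 0
    while x != 1:
        k = two_adic_valuation(x.denominator - x.numerator)
        x = T(k, x)
        n += 1
    return n


def first_shift_frequency(n: int, k: int) -> float:
    """Share of reduced odd fractions u/v < 1, v <= n, with k_1 = k."""
    hits = total = 0
    for v in range(3, n + 1, 2):
        for u in range(1, v, 2):
            if gcd(u, v) == 1:
                total += 1
                hits += two_adic_valuation(v - u) == k
    return hits / total
```

Calling `first_shift_frequency(201, 1)` and `first_shift_frequency(201, 2)` gives values that can be compared with the limiting probabilities 1/2 and 1/4 stated above.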
Brent's model for the binary Euclidean algorithm, published in [3], makes the heuristic assumption that for all sufficiently large $n$, the behaviour of the algorithm when applied to the set of all odd fractions $u/v \in (0, 1]$ with denominator bounded by $n$ is well modelled by considering instead the effect of the maps $T_k$ defined above on the uniform probability measure on $(0, 1]$, with each map $T_k$ being applied with probability $2^{-k}$ independently at each step. After a single iteration of this random dynamical system the expectation of an absolutely continuous probability measure on $(0, 1]$ with density $f \in L^1([0, 1])$ is thus given by the absolutely continuous probability measure with density equal to
$$(Lf)(x) := \sum_{k=1}^{\infty} \left[ \frac{1}{(x+2^k)^2}\, f\!\left(\frac{x}{x+2^k}\right) + \frac{1}{(1+2^k x)^2}\, f\!\left(\frac{1}{1+2^k x}\right) \right] \tag{1}$$
almost everywhere (see [3,22] for further details). Based on computer experiments Brent hypothesised, but was unable to prove, that the constant density 1 converges exponentially fast under the application of $L^n$ to a continuous limit density $\xi : (0, 1] \to \mathbb{R}$. Under the heuristic approximation that this limit distribution is exactly attained after a bounded number of iterations, the expected decrease in the value of $\log(u + v)$ under one application of the algorithm to the fraction $u/v$ can then be calculated to equal a constant $\beta$, given by an explicit expression involving the iterated integral $\int_0^x \xi(t)\,dt\,dx$ of the limit density, and hence the expected number of iterations required to reduce the odd fraction $u/v$ to 1, where $1 \le u \le v \le n$, was anticipated in [3] to grow asymptotically as $\frac{1}{\beta} \log n$ in the limit as $n \to \infty$. An alternative calculation sharing the same underlying assumptions but based on the rate of growth of $\log \sqrt{uv}$ leads instead to a second coefficient $\tilde{\beta}$ in place of $\beta$, and this version of Brent's argument is presented in [5,22].
Footnote 2. The reader is cautioned that where some other authors' analyses use logarithms to base 2, we will use natural logarithms unless otherwise specified and therefore some constants may superficially vary.
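A quick numerical experiment can make the role of the transfer operator more concrete. The sketch below assumes that L has the form obtained by weighting the two inverse branches of each T_k, namely x/(x + 2^k) and 1/(1 + 2^k x), by their derivatives and by the probability 2^{-k}; this form is an assumption reconstructed from the description of the model, and the names `apply_L` and `midpoint_integral` are hypothetical. Under that assumption L should send probability densities to probability densities, which the midpoint rule can check approximately.

```python
def apply_L(f, x: float, terms: int = 50) -> float:
    """Truncated series for (L f)(x) under the assumed form of L."""
    total = 0.0
    for k in range(1, terms + 1):
        p = 2.0**k
        # branch x -> x / (x + 2**k), weight 1 / (x + 2**k)**2
        total += f(x / (x + p)) / (x + p) ** 2
        # branch x -> 1 / (1 + 2**k x), weight 1 / (1 + 2**k x)**2
        total += f(1.0 / (1.0 + p * x)) / (1.0 + p * x) ** 2
    return total


def midpoint_integral(g, n: int = 4000) -> float:
    """Midpoint-rule approximation of the integral of g over (0, 1)."""
    h = 1.0 / n
    return h * sum(g((i + 0.5) * h) for i in range(n))


def uniform_density(t: float) -> float:
    return 1.0


def linear_density(t: float) -> float:
    return 2.0 * t
```

Evaluating `midpoint_integral(lambda x: apply_L(uniform_density, x))` should return a value close to 1, and the same holds with `linear_density`, up to quadrature and truncation error.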
In order to convert Brent's heuristic into a rigorous argument it would be natural to begin by investigating the operator $L$ with the aim of constructing the hypothesised limit density $\xi$. Since $L$ does not have good spectral properties when acting on $L^1([0, 1])$ this might naturally be attempted by studying $L$ on a smaller space of functions as undertaken in standard texts on transfer operators such as [1,33,37], but this is complicated by the fact that $L$ does not preserve the space of continuous functions on $[0, 1]$: when $L$ is applied to the constant function 1, for example, one may see that a singularity near 0 of roughly logarithmic magnitude arises, since for very large $N > 0$ the size of the quantity $\sum_{k=1}^{\infty} \frac{1}{(1+2^k \cdot 2^{-N})^2}$ which arises in the series defining $(L1)(2^{-N})$ is of the order of magnitude of $N$. As such the operator $L$ cannot be analysed by considering its action on spaces of functions which are bounded on $[0, 1]$.
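The logarithmic growth mentioned above is easy to observe numerically. The following sketch evaluates a truncation of the sum of 1/(1 + 2^k · 2^{-N})^2 over k for several values of N; the function name `singular_sum` and the truncation level are illustrative choices, not part of the article.

```python
def singular_sum(N: int, terms: int = 200) -> float:
    """Truncated sum over k of 1 / (1 + 2**k * 2**(-N))**2."""
    x = 2.0 ** (-N)
    return sum(1.0 / (1.0 + 2.0**k * x) ** 2 for k in range(1, terms + 1))


if __name__ == "__main__":
    for N in (10, 20, 40):
        # Each value stays within a bounded distance of N.
        print(N, singular_sum(N))
```

Roughly speaking, the terms with k well below N are close to 1 and the terms with k well above N are negligible, so the sum increases by about one each time N increases by one. Since 2^{-N} = x corresponds to N = log_2(1/x), this is the logarithmic singularity at 0.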
In the 1998 article [42] B. Vallée addressed the problem of making Brent's argument rigorous with the introduction of several innovations. Vallée noted that the singular behaviour of $L$ close to 0 can be accommodated by working in a Hardy space of holomorphic functions defined on an open disc $D \subset \mathbb{C}$ and having square-integrable extension to the boundary circle, where the disc $D$ is chosen such that $(0, 1] \subset D$ and 0 lies on the boundary of $D$. On the other hand, in this environment the fact that the transformations $z \mapsto \frac{z}{z+2^k}$ fix the point 0 significantly complicates the spectral behaviour of the operator $L$. Vallée circumvented the problem of studying the spectrum of $L$ by considering instead the family of operators $V_s$ on Hardy space defined by
$$(V_s f)(z) := \sum_{k=1}^{\infty} \sum_{\substack{a \text{ odd} \\ 0<a<2^k}} \frac{1}{(a + 2^k z)^{2s}}\, f\!\left(\frac{1}{a + 2^k z}\right) \tag{2}$$
for all $z \in D$, where $s$ is allowed to be any complex number in the region $\Re(s) > \frac{1}{2}$. The operator $V_1$ is related to the operator $L$ by an induction process: a single iteration of $V_1$ models the effect of applying the main loop of the binary Euclidean algorithm to the fraction $u/v$ several times until the first point at which the numerator and denominator are exchanged. Since this operator is defined only in terms of transformations $z \mapsto \frac{1}{a+2^k z}$ which lack fixed points in the boundary of the disc, it can be shown that each $V_s$ is a compact operator on the Hardy space associated to the disc $D$. The existence of an analytic function $\eta$ taking positive values on $(0, 1]$ and fixed by $V_1$ can then be demonstrated using classical fixed-point theorems for compact operators. Vallée derived a rigorous result from the spectral analysis of the operator by proving that the number of exchanges $E(u, v)$ taken by the binary algorithm to process the pair $(u, v)$ satisfies an identity (3), relating a Dirichlet series in $E(u, v)$ to the operators $V_s$, when $s \in \mathbb{C}$ with $\Re(s) > 1$.
Vallée also derived related functional-analytic formulae for the total number of steps S(u, v) and the total number of divisions by two T (u, v) performed by the algorithm, and using Tauberian theory was able to rigorously derive asymptotic expressions for the mean of each of these three quantities taken over all odd pairs (u, v) with 1 ≤ u ≤ v ≤ n. The following statement summarises Vallée's results: There exists a unique analytic function η : (0, 1] → (0, +∞) such that V 1 η = η and 1 0 η(x)dx = 1. If for each n ≥ 1 we define and similarly forΩ n in place of Ω n .
Vallée's theorem thus proves that the mean number of steps in the binary Euclidean algorithm is asymptotically logarithmic, but its relationship to Brent's model is indirect and many questions remain open. Prior to the present work no proof has been given that the constant in (4) is genuinely equal to the constants $1/\beta$ and $1/\tilde{\beta}$ conjectured by Brent and Knuth in [3,5,22]. The existence of the continuous density $\xi : (0, 1] \to \mathbb{R}$ preserved by $L$ and the exponential convergence under $L$ of the uniform measure to the measure of density $\xi$ also remain unproven. In this article we prove all of these conjectures, showing furthermore that the invariant density $\xi$ is real-analytic and admits an analytic continuation to the complex right half-plane $\Re(z) > 0$. We apply these results to give a direct proof that Brent's model correctly describes the asymptotic mean running time of the binary Euclidean algorithm for both odd and general natural number inputs, answering an open problem from The Art of Computer Programming which was first listed in 1981 (see [21, p.339] and [22, p.355]).
The constants in the heuristic formulae derived by Brent and Knuth are appreciably more amenable to computation than the rigorous expressions obtained by Vallée. The exponentially increasing number of summations involved in the definition of $V_s$ and the necessity of summing over all odd integers in the second and third expressions in Theorem 1 make approximate computation of Vallée's constants problematic, and to the author's knowledge no computation of these constants based on Vallée's definitions has yet been attempted. On the other hand, in [42, §4] Vallée conjectured that if the continuous invariant density $\xi$ exists then the constant in (4) satisfies the simpler expression This latter quantity is far easier to approximate accurately: Brent ([5], also reported in [22, p.350]) has computed the approximation which is believed to be correct to the number of decimal places shown. The verification of the useful identity (5) was therefore also listed as an open problem by Knuth [22, p.355]. In this article we will prove the correctness of this conjectured identity.

Statement of results
In establishing specific results on the mean number of exchanges, subtractions and dyadic divisions performed by the algorithm we work within a general framework defined in terms of the cost of processing the pair $(u, v)$, following the approach of V. Baladi and B. Vallée [2]. We attach a non-negative real weight to each of the fundamental actions which the algorithm may perform at each step, namely: for each natural number $k$ the algorithm might subtract $u$ from $v$ and then divide by $2^k$; or for each natural number $k$ we might subtract $u$ from $v$, divide by $2^k$ and then exchange $u$ and $v$. Clearly the application of the algorithm to a pair $(u, v)$ consists precisely in a particular sequence of repetitions of these fundamental actions. Formally, let us say that a cost function associated to the binary Euclidean algorithm is a non-negative function $c : \{1, 2\} \times \mathbb{N} \to \mathbb{R}$ which is not identically zero. A cost function will be called regular if there exists $C > 0$ such that $c(i, k) \le Ck$ for every $(i, k) \in \{1, 2\} \times \mathbb{N}$. We consider the quantity $c(1, k)$ to represent the cost associated to subtraction followed by division by $2^k$ and then exchange, and the quantity $c(2, k)$ to represent the cost associated to subtraction followed by division by $2^k$ without exchange. We define the total cost $C(u, v)$ associated to the odd pair $(u, v)$ to be the sum of the costs of the fundamental actions performed when processing $(u, v)$. Since the final step of the algorithm results in a pair of the form $(n, n)$ it is a priori ambiguous whether or not an exchange is performed in the final step, so by convention we shall always consider that the final step involves no exchange. We define the cost of a general pair of natural numbers to be the cost of the pair formed from the odd parts of the two numbers.
The reader may note that, for example, the total number of exchanges E(u, v) may be obtained as the total cost C(u, v) when c is given by c(1, k) ≡ 1 and c(2, k) ≡ 0, to obtain C(u, v) ≡ T (u, v) one takes c(i, k) ≡ k, and to obtain C(u, v) ≡ S(u, v) one simply takes c(i, k) ≡ 1.
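The cost framework can be sketched directly in code. In the following illustration the function `total_cost` and the three sample cost functions are hypothetical names chosen for this example; the convention that the final step involves no exchange is respected automatically, because the last subtraction produces a pair of equal numbers.

```python
def odd_part_split(n: int):
    """Write n = 2**k * a with a odd; return (k, a)."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k, n


def total_cost(u: int, v: int, c) -> float:
    """Total cost C(u, v) for a cost function c(i, k).

    i = 1 marks a step followed by an exchange, i = 2 a step without one.
    General pairs are replaced by the pair of their odd parts.
    """
    _, u = odd_part_split(u)
    _, v = odd_part_split(v)
    if u > v:
        u, v = v, u
    cost = 0
    while u != v:
        k, d = odd_part_split(v - u)
        exchange = d < u  # strict inequality: no exchange on the final step
        cost += c(1, k) if exchange else c(2, k)
        u, v = (d, u) if exchange else (u, d)
    return cost


def exchange_cost(i: int, k: int) -> int:
    return 1 if i == 1 else 0  # yields E(u, v)


def shift_cost(i: int, k: int) -> int:
    return k  # yields T(u, v)


def step_cost(i: int, k: int) -> int:
    return 1  # yields S(u, v)
```

For example, `total_cost(3, 7, exchange_cost)`, `total_cost(3, 7, shift_cost)` and `total_cost(3, 7, step_cost)` recover the number of exchanges, divisions by two and subtractions for the pair (3, 7).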
For each n ≥ 1 let us define n . We prove the following theorem on the mean cost of the binary Euclidean algorithm: Lebesgue almost everywhere. This function may be realised as a real-analytic function ξ : (0, 1] → (0, +∞) which extends analytically to a holomorphic function defined on the right half-plane ℜ(z) > 0. If c : {1, 2}×N → R is a regular cost function and In particular the following asymptotic results hold. If S(u, v) denotes the number of subtractions performed when processing the pair (u, v), then for each i = 1, 2, 3, 4. If T (u, v) denotes the total number of divisions by 2 performed when processing the pair (u, v), then and if E(u, v) denotes the number of exchanges performed when processing the pair (u, v) then for each i = 1, 2, 3, 4.
The equation (7) proves the original heuristic conjecture of R. P. Brent [3, §6]. The alternative expression (8) was conjectured by R. P. Brent [5] and D. E. Knuth [22,, the latter in the equivalent form 2 which may be derived from the expression above using integration by parts. The equivalence of (7) with (8) ξ(x)dx which appears in the expressions for the mean number of exchanges, but based on empirical investigations of the number of exchanges conducted by Vallée in [42] it would appear that this constant slightly exceeds one half. G. Maze [29] has previously proved the existence of a unique function ξ ∈ L 1 ([0, 1]) such that 1 0 ξ(x)dx = 1 and Lξ = ξ but was not able to establish stronger regularity properties of ξ such as continuity, nor any of the spectral properties of L which we require in our proof of Theorem 2. In particular Maze's result does not imply the existence of ξ(1) as a well-defined quantity as is clearly necessary in order to establish (9).
The results in this article are rooted in a deep study of an extension of Brent's transfer operator $L$, and this analysis comprises more than half of the paper. Let us briefly introduce some essential notation. Throughout this article we let $D$ denote the translated complex unit disc $D := \{z \in \mathbb{C} : |z - 1| < 1\}$. The notation $H^2(D)$ denotes the Hilbert space of holomorphic functions $D \to \mathbb{C}$ which extend to square-integrable functions along the boundary circle, and $H^\infty(D)$ denotes the Banach space of bounded holomorphic functions $D \to \mathbb{C}$. When $X$ is a Banach space we let $B(X)$ and $K(X)$ denote the sets of bounded and compact operators on $X$ respectively. We recall that a function from an open subset $U$ of $\mathbb{C}^2$ to $X$ is called holomorphic if it is Fréchet differentiable at every point, and this is the case if and only if it is locally expressible as the limit of a convergent power series with coefficients in $X$. A function from $U$ to $X$ is holomorphic if and only if its composition with every element of $X^*$ is holomorphic in the usual sense. A brief review of the concepts and properties from spectral theory and the theory of Banach spaces of holomorphic functions which are used in this article may be found in §4 below.
The following theorem summarises our investigation of Brent's operator: . The corresponding operator-valued maps (s, ω) → L s,ω and (s, ω) → D s,ω are holomorphic functions from U to B(H 2 (D)). The following additional properties hold: (a) For each s ∈ C with ℜ(s) > 1 2 the operator L s,0 has essential spectral radius not greater than The operator L 1,0 has spectral radius equal to one, has a simple eigenvalue at 1, and has no other spectrum on the unit circle. (c) There exists a function ξ ∈ H 2 (D) such that for all z ∈ D. More generally, if L s,0ξ = λξ for someξ ∈ H 2 (D) and λ ∈ C such that |λ| > for all z ∈ D. Ifξ ∈ H 2 (D) is an eigenfunction of L s,ω which corresponds to a nonzero eigenvalue then it admits an analytic continuation to the right half-plane ℜ(z) > 0. (d) The operator L s,0 has spectral radius strictly less than one when ℜ(s) ≥ 1 and s is not equal to one. (e) There exist an open set V ⊂ C 2 containing the point (1, 0), holomorphic functions (s, ω) → P s,ω and (s, ω) → N s,ω from V to B(H 2 (D)), and a holomorphic function λ : V → C such that for all (s, ω) ∈ V: (i) The identity L s,ω = λ(s, ω)P s,ω + N s,ω holds in the space of bounded operators on H 2 (D). (ii) We have P s,ω N s,ω = N s,ω P s,ω = 0. (iii) The spectral radius of N s,ω is strictly less than one. (iv) The operator P s,ω is a projection with rank equal to one.
The functions λ and P also satisfy λ(1, 0) = 1 and P 1, The proof of Theorem 3 is quite protracted and is undertaken in several stages which together comprise the greater part of this article. Let us briefly describe the steps involved. The first stage of the proof of Theorem 3 consists in showing that L s,ω and D s,ω are well-defined bounded operators which depend holomorphically on the parameters (s, ω), and that the former operator has small essential spectral radius as described in (a). This is the most straightforward part of the proof and is somewhat similar to the arguments used by Vallée in studying the operator family V s . This part of the proof comprises §5 below.
The detailed spectral properties of L s,0 described in Theorem 3(b)-(d) are more difficult to establish and between them their proofs occupy over a third of this article. The proof of these parts of Theorem 3 comprises §6 below. In constructing the invariant function ξ we use a quasicompact extension of the Kreȋn-Rutman theorem due to R. Nussbaum [32]; though versatile and concise this result does not seem to be widely appreciated in the existing literature on transfer operators. (Since our operator is quasicompact rather than compact, the classical results of M. A. Krasnoselskiȋ [23] used by Vallée in the analysis of V s do not apply.) In proving the other parts of Theorem 3(b)-(d) we must demonstrate that L 1,0 has no other spectrum on the unit circle, and that L 1+it,0 has no spectrum at all on the unit circle when t is real and nonzero. The essential spectral estimate in Theorem 3(a) reduces this to the problem of establishing the absence of additional eigenfunctions corresponding to eigenvalues of unit modulus. Direct solutions to this problem such as are used in [12,43] involve comparing a presumed eigenfunction with the known positive eigenfunction ξ, but in our case this comparison is inhibited by the fact the putative eigenfunction may have a higher order of singularity at 0 than does the positive invariant function ξ. (In the case of Vallée's operators V s it can be shown very early in the proof that all eigenfunctions must have logarithmic singularities at zero and so in [42] this problem does not arise.) This same issue also prevents the use of the projective cone-contraction arguments favoured for such tasks by C. Liverani [26]. To circumvent this obstacle we temporarily abandon the space H 2 (D) and instead study L s,0 on a smaller space of functions X among whose elements the only possible singularity at 0 is a logarithmic one. At the end of §6 we digress slightly from the proof of Theorem 3 to prove a minor conjecture of Brent ([3, Conjecture 2.1]). 
Moving back to the proof of Theorem 3 we then face the problem that the space X is too restrictive to accommodate the action of the operator L s,ω when ω is nonzero, and for this reason the final stage of the proof of Theorem 3 consists in transferring our results for the action of L s,0 on X back to the action of L s,0 on H 2 (D). This final stage and the proof of (e)-(f) are undertaken in §7.
The fact that the eigenfunctions of $L_{s,0}$ extend analytically to the right half-plane suggests the possibility of replacing the space $H^2(D)$ considered in Theorem 3 (and perhaps also the space $X$ considered in §6) with a Banach space of holomorphic functions defined in the entire right half-plane. An analysis along these lines has been conducted in the case of the classical Euclidean algorithm by D. Mayer [28]; however, at the present time we have not been successful in identifying a suitable candidate Banach space. In order for such an analysis to result in a proof of Theorem 2 the candidate Banach space would have to contain the constant function 1, but this is not the case for the spaces considered by Mayer.
The remainder of this article is structured as follows. In §4 we briefly summarise the ideas from functional analysis and spectral theory which are used in this paper, and as was indicated earlier sections §5-7 between them comprise the proof of Theorem 3. In §8 we establish some properties of the derivatives of the function λ considered in Theorem 3 which are useful in describing the quantity µ(c), and in §9 we prove a series of technical results which allow us to relate Dirichlet series of cost functions to the family of operators L s,ω via the equation which is our analogue of (3). In §10 we apply these results to derive Theorem 2 via a Tauberian argument. and H 2 (D) is a Hilbert space with respect to the inner product ·, · which clearly generates the norm · H 2 (D) . The Hardy space H 2 (D) admits the following alternative description which will be used heavily in this article: f : D → C belongs to H 2 (D) if and only if there exists a sequence of complex numbers (a n ) ∞ n=0 ∈ ℓ 2 such that for all z ∈ D and when this is the case we have f H 2 (D) = ∞ n=0 a 2 n 1 2 . The following standard estimate will be used frequently in the sequel: In particular we have Proof. Let f (z) = ∞ n=0 a n (z − 1) n for all z ∈ D. By the Cauchy-Schwarz inequality, Lemma 4.1 implies in particular that for each z ∈ D the map f → f (z) is a bounded linear functional on H 2 (D). We shall also make use of the Hardy space H ∞ (D) which is defined to be the set of bounded holomorphic functions D → C equipped with the complete norm f H ∞ (D) := sup{|f (z)| : z ∈ D}. The theory of Hardy spaces is described in detail in numerous textbooks, of which we mention [10,27,40]; all of the properties of Hardy spaces listed above may be found in any of those texts.

Essential spectrum.
Recall that a linear operator acting on a complex Banach space is called Fredholm if its kernel has finite dimension and its range is closed and has finite codimension. If the codimension of the range is equal to the dimension of the kernel then the operator is said to be Fredholm of index zero. For the purposes of this article we shall say that $\lambda \in \mathbb{C}$ belongs to the essential spectrum of a bounded linear operator $L : X \to X$ if $L - \lambda\,\mathrm{Id}_X$ is not a Fredholm operator of index zero. A discussion of the relationship between this and other definitions of the essential spectrum may be found in [11, §I].
Let (X, d) be a metric space. The Kuratowski measure of noncompactness of a set A ⊆ X is defined to be the quantity ψ(A) := inf {δ > 0 : A can be covered by finitely many sets of diameter ≤ δ} .
Clearly ψ(A) = 0 if and only if A is compact. If L is a bounded linear operator on a Banach space (X, · ) then we define the Hausdorff measure of noncompactness of the operator L to be the quantity where ψ is calculated according to the metric on X induced by the norm · . It is likewise clear that L ∈ K(X) if and only if L χ = 0, and furthermore · χ is in fact a seminorm on B(X). If L ∈ B(X) then we also define The above definitions are related in the following result which originates in work of R. Nussbaum [31] and Lebow and Schechter [24]. A complete exposition of this result and the concepts outlined above may be found in [11, §I]. Our interest in the essential spectrum is largely due to the following fact which will be frequently invoked without comment: if λ ∈ C belongs to the spectrum of L ∈ B(X) but does not belong to the essential spectrum, then λ is an eigenvalue of L of finite multiplicity and is an isolated point of the spectrum of L (see e.g. [11, p.40]). Since the spectrum of L is closed and bounded it follows in particular that if ρ ess (L) < ρ(L) then L has an eigenvalue of modulus ρ(L).

Separation of spectrum.
Results of the following type are widely used in applications of the theory of transfer operators but the hypotheses have on occasion been unclearly stated. For this reason we include an indication of the proof.
Proposition 4.2. Let $(X, \|\cdot\|)$ be a Banach space and $L \in B(X)$ a bounded operator. Suppose that $\lambda$ is an isolated point of the spectrum of $L$, that $L - \lambda\,\mathrm{Id}_X$ is Fredholm, that every other element of the spectrum of $L$ lies in a closed disc about the origin of radius strictly less than $|\lambda|$, and that $\lambda$ is a simple eigenvalue of $L$ in the sense that $\dim \ker (L - \lambda\,\mathrm{Id}_X)^n = 1$ for every integer $n \ge 1$. Let $\Gamma$ be an anticlockwise-oriented closed curve in $\mathbb{C}$ which encloses $\lambda$ and does not enclose or intersect any other points of the spectrum of $L$. Then the integral
$$P := \frac{1}{2\pi i} \oint_\Gamma (z\,\mathrm{Id}_X - L)^{-1}\,dz$$
defines a bounded operator on $X$ with rank one such that $P^2 = P$ and $LP = PL$.
Proof. By [20, Theorem III.6.17] the operator $P$ is bounded and satisfies $P^2 = P$ and $LP = PL$. Let $X_1$ and $X_2$ denote its image and kernel respectively. Since $P$ is continuous $X_2$ is closed, and since $X_1 = \ker(\mathrm{Id}_X - P)$, $X_1$ is also closed. Since $L$ and $P$ commute we have $LX_1 \subseteq X_1$ and $LX_2 \subseteq X_2$. By the result just cited, the spectrum of $L$ restricted to $X_1$ is precisely $\{\lambda\}$, and the spectrum of $L$ restricted to $X_2$ equals the spectrum of $L$ acting on $X$ with the element $\lambda$ removed; in particular the spectral radius of $L$ restricted to $X_2$ is strictly less than $|\lambda|$ and it follows easily that the spectral radius of $N := L - LP$ is strictly less than $|\lambda|$. The identity $NP = PN = 0$ follows directly from the properties already stated.
Since $L - \lambda\,\mathrm{Id}_X$ is Fredholm its range is closed and its kernel is finite-dimensional. Using [20, Lemma IV.5.29] it follows that the restriction of $L - \lambda\,\mathrm{Id}_X$ to $X_1$ also has closed range and finite-dimensional kernel, and by the combination of [20, Theorem IV.5.30] and [20, Theorem IV.5.10] it follows that the dimension of $X_1$ must be finite. The restriction of $L - \lambda\,\mathrm{Id}_X$ to $X_1$ is thus a linear transformation on a finite-dimensional space with spectrum equal to $\{\lambda\}$, and since $\lambda$ is a simple eigenvalue in the sense described above $X_1$ must be one-dimensional. In particular we have $Lx = \lambda x$ for every $x \in X_1$ and the rank of $P$ is equal to one as claimed. Since $L = LP + N$ by the definition of $N$ it follows that $L = \lambda P + N$ as claimed.
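A finite-dimensional toy example may help to fix ideas about the decomposition L = λP + N. The sketch below uses a hypothetical 2 × 2 matrix with eigenvalues 1 and 1/2, chosen purely for illustration; it is not an operator from the article, and the proposition itself concerns general Banach spaces. The projection P is built from the right eigenvector (1, 0) and the left eigenvector (1, 2) for the eigenvalue 1.

```python
def matmul(A, B):
    """Product of two small matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matpow(A, n):
    """n-th power of a 2 x 2 matrix by repeated multiplication."""
    R = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n):
        R = matmul(R, A)
    return R


# Hypothetical example: simple eigenvalue 1, remaining spectrum {1/2}.
L = [[1.0, 1.0], [0.0, 0.5]]

# Rank-one spectral projection for the eigenvalue 1: P = v w^T with
# right eigenvector v = (1, 0) and left eigenvector w = (1, 2).
P = [[1.0, 2.0], [0.0, 0.0]]

# Remainder N = L - 1 * P, which has spectral radius 1/2.
N = [[L[i][j] - P[i][j] for j in range(2)] for i in range(2)]

ZERO = [[0.0, 0.0], [0.0, 0.0]]
```

Here P is idempotent, commutes with L, and annihilates N on both sides, so L^n = P + N^n and the powers of L converge to P at the geometric rate (1/2)^n. This is the mechanism by which a spectral gap yields exponential convergence to the invariant density.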

Beginning of the proof of Theorem 3
We now start upon the route towards the proof of Theorem 3. In this and all subsequent sections we shall assume that a regular cost function c : {1, 2} × N → R has been specified. In this section we shall show that L 1,0 preserves integrals along the interval (0, 1), prove that the families of operators L s,ω and D s,ω are bounded and holomorphic on H 2 (D), and estimate the essential spectral radius of L s,0 . We begin with the following simple result.
Lemma 5.1. Let f : (0, 1] → C be Lebesgue integrable. Then the series converges Lebesgue almost everywhere and defines a function In particular the sum which defines L 1,0 g converges almost everywhere to a finite value. The result for a general integrable function f : (0, 1] → C follows by writing f as a complex linear combination of integrable non-negative functions. The next result proves Theorem 3 up to and including clause (a).
There exists an open set U ⊂ C 2 which contains the region {(s, ω) ∈ C 2 : ℜ(s) > 2 3 and ω = 0} such that for all (s, ω) ∈ U the formulae ,ω are holomorphic, and the essential spectral radius of L s,0 is less than or equal to Proof. Since c is a regular cost function we may choose C > 0 such that so that when (s, ω) ∈ U we have ℜ(s) > 2 3 and |ω|c(i, k) ≤ k 6 log 2 for all k ≥ 1 and for i = 1, 2 as desired. To prove that L s,ω is a well-defined element of B(H 2 (D)) and that the corresponding function (s, ω) → L s,ω is holomorphic it is clearly sufficient to prove that these properties hold for G s,ω and D s,ω , since the corresponding properties of L s,ω then follow from the identity L s,ω = G s,ω + D s,ω . We begin by recalling the following classical result which may be found in [10,27,40] [27,40]).
It is clear from (14) that so that in particular each $G_{s,\omega,k}$ and each $D_{s,\omega,k}$ belongs to $B(H^2(D))$. Since each of the maps $z \mapsto 1/(1 + 2^k z)$ takes the closure of $D$ into the interior of $D$, the operators $G_{s,\omega,k}$ are all compact. It is furthermore not difficult to see that each of these operators may be locally written as a convergent power series in $(s,\omega)$ with coefficients in $B(H^2(D))$, and hence the operator-valued functions $(s,\omega) \mapsto G_{s,\omega,k}$ and $(s,\omega) \mapsto D_{s,\omega,k}$ are holomorphic. To show that $G_{s,\omega}$ and $D_{s,\omega}$ are well-defined operators which depend holomorphically on $(s,\omega)$ it is therefore sufficient to show that the series $\sum_{k=1}^{\infty} G_{s,\omega,k}$ and $\sum_{k=1}^{\infty} D_{s,\omega,k}$ converge in $B(H^2(D))$ in a locally uniform manner with respect to $(s,\omega)$. Since the sum of a convergent series of compact operators is compact, this will also suffice to show that $G_{s,\omega}$ is compact for every $(s,\omega) \in \mathcal{U}$.
Let us therefore prove that these series converge in the required manner. The case of $D_{s,\omega}$ is straightforward: we have for each $k \geq 1$, and since also it follows from (15) and (16) so that the series $\sum_{k=1}^{\infty} D_{s,\omega,k}$ converges locally uniformly in $(s,\omega)$ to the limit $D_{s,\omega}$, which is well-defined and depends holomorphically on $(s,\omega)$.
In order to bound the norms of the operators $G_{s,\omega,k}$ we use an alternative estimate suggested by the analysis of B. Vallée [42], based on the following theorem of R. M. Gabriel [14]: if $U \subset \mathbb{C}$ is an open ball, $g \colon U \to \mathbb{C}$ is holomorphic, $\Gamma$ is a circular contour in $U$, and $\gamma$ is a rectifiable convex Jordan curve enclosed by $\Gamma$, then Our interest is in the case where $\gamma$ is also circular, and in this case (17) could also be deduced from a related theorem in which the integrand is taken to be positive and subharmonic [15]. For a modern treatment and related results see [16].
For each $k \geq 1$ let us define $\varphi_k(z) := \frac{1}{1 + 2^k z}$ for every $z \in D$. Using the substitution $u = \varphi_k(z)$ together with the estimate $|\omega|\,c(1,k) \leq \tfrac{k}{6}\log 2$ which follows from the definition of $\mathcal{U}$ we may obtain Choose a circular contour $\Gamma$ in $D$ which is centred at $1$ and has radius large enough that $\Gamma$ encloses the curve $\varphi_k(\partial D)$. Combining (18), (19) and (17) we find that , where the last inequality follows from the definition of $\|\cdot\|_{H^2(D)}$ given in (13). We conclude from this estimate that for each $(s,\omega) \in \mathcal{U}$ the sum $G_{s,\omega} = \sum_{k=1}^{\infty} G_{s,\omega,k}$ is a convergent series of compact operators, and hence defines an element of $K(H^2(D))$. Since this convergence is locally uniform with respect to $(s,\omega)$, the function $(s,\omega) \mapsto G_{s,\omega}$ is holomorphic.
To complete the proof of the proposition it remains to show that when $(s,0) \in \mathcal{U}$ the essential spectral radius of $L_{s,0}$ is bounded above by $\frac{\sqrt{2}}{4^{\Re(s)} - \sqrt{2}}$. The composition of a bounded operator with a compact operator is compact, and it follows that for each $n \geq 1$ the expression $L^n_{s,0} = (G_{s,0} + D_{s,0})^n$ expands into a sum of $2^n - 1$ compact operators (which arise from products which involve at least one instance of $G_{s,0}$) and a single possibly noncompact operator, $D^n_{s,0}$. We therefore have for every $n \geq 1$, and it follows from Theorem 4 that the essential spectral radius of $L_{s,0}$ is bounded by the ordinary spectral radius of $D_{s,0}$. To prove the proposition we will show that this latter quantity is bounded by and $z \in D$ we may write the sum defining the function $D_{s,0}f$ alternatively as and in this manner we may for each $n \geq 1$ write $D^n$ An easy inductive argument establishes the relation We may thus compute in a similar manner to our earlier calculation of the bounds on $\|D_{s,\omega,k}\|_{H^2(D)}$. It follows that and this clearly yields The proof is complete.

Analysis of Brent's operator on X
As was indicated in §3, in order to prove those parts of Theorem 3 which pertain to the point spectrum of L s,0 we will find it necessary to work on a smaller function space than H 2 (D). This quite lengthy process is undertaken in the current section.
Let $X$ be the set of all holomorphic functions $f \colon D \to \mathbb{C}$ with the property that there exist $\alpha \in \mathbb{C}$ and $g \in H^{\infty}(D)$ such that $f(z) = \alpha \log_2 z + g(z)$ for all $z \in D$. Clearly every $f \in X$ has a unique representation in this form. If $f \in X$ has the form $f(z) = \alpha \log_2 z + g(z)$ for all $z \in D$ where $\alpha \in \mathbb{C}$ and $g \in H^{\infty}(D)$ then we define . It is clear that $X$ is a Banach space with respect to this norm. The objective of this section is to prove the following result: defines a bounded linear operator $L_{s,0} \in B(X)$. This family of operators satisfies the following properties:
(i) For each $s$ the essential spectral radius of $L_{s,0}$ acting on $X$ is less than or equal to $\frac{\sqrt{2}}{4^{\Re(s)} - \sqrt{2}}$.
(ii) The operator $L_{1,0}$ acting on $X$ has spectral radius equal to one, has a simple isolated eigenvalue at $1$, and has no other spectrum on the unit circle.
(iii) There exists a unique function $\xi \in X$ such that $L_{1,0}\xi = \xi$, $\int_0^1 \xi(x)\,dx = 1$, and $\xi(x)$ is real and strictly positive for all $x \in (0,1]$. There exists $\chi \in H^{\infty}(D)$ such that for all $z \in D$, More generally, if $L_{s,0}\hat{\xi} = \lambda\hat{\xi}$ for some $\hat{\xi} \in X$ and complex number $\lambda \neq \frac{1}{4^s - 1}$
(iv) If $\Re(s) \geq 1$ and $s \neq 1$ then the spectral radius of $L_{s,0}$ acting on $X$ is strictly less than $1$.
The proof of Theorem 5 is quite prolonged and is divided into a series of stages: the boundedness of the operator is proved below in Corollary 6.11, property (i) is proved in Proposition 6.12, and properties (ii)-(iv) are proved in Proposition 6.16. With somewhat more effort one may show that the function $s \mapsto L_{s,0}$ is a holomorphic mapping into $B(X)$, but this fact is not needed in order to prove the main results of this article. In any case we see no reason to believe that $L_{s,\omega}$ should preserve $X$ when $\omega$ is nonzero and $c$ is an arbitrary cost function, and this circumstance renders $X$ an unsuitable space in which to attempt to prove the full statement of Theorem 3.
By working in $H^p(D)$ in place of $H^2(D)$ for some $p \in (2, +\infty)$ throughout this and the previous section it would be possible to sharpen the estimate for the essential spectral radius of $L_{s,0}$ acting on $X$ to By taking $p$ arbitrarily large we could in this manner obtain a bound of $\frac{1}{4^{\Re(s)} - 1}$ when $\Re(s) \geq 1$. Since we shall have no use for such a sharpened estimate in this document we omit this analysis.
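To give a feel for the size of the two bounds mentioned here, the following sketch evaluates the $H^2(D)$ bound $\sqrt{2}/(4^{\Re(s)} - \sqrt{2})$ of Proposition 5.2 and the limiting bound $1/(4^{\Re(s)} - 1)$ at a few real values of $\Re(s)$. The function names are illustrative, and the intermediate $H^p(D)$ bound is not computed.

```python
import math


def ess_radius_bound_h2(sigma):
    """The H^2(D) bound sqrt(2) / (4**sigma - sqrt(2)), where sigma = Re(s)."""
    return math.sqrt(2.0) / (4.0 ** sigma - math.sqrt(2.0))


def ess_radius_bound_limit(sigma):
    """The limiting bound 1 / (4**sigma - 1) quoted for p arbitrarily large."""
    return 1.0 / (4.0 ** sigma - 1.0)


if __name__ == "__main__":
    for sigma in (1.0, 1.5, 2.0):
        print(sigma, ess_radius_bound_h2(sigma), ess_radius_bound_limit(sigma))
```

At $\Re(s) = 1$ the two values are roughly $0.547$ and $1/3$, so either bound leaves room between the essential spectrum and the eigenvalue $1$.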
A byproduct of the analysis in this section is that we may rigorously verify the following minor conjecture of R. P. Brent: for all $x \in [0,1]$ for every integer $n \geq 0$. Then there exist a real analytic function $F_{\infty} \colon (0,1] \to \mathbb{R}$ and real numbers $K > 0$, $\theta \in (0,1)$ such that for all $x \in (0,1]$ and $n \geq 1$ Since the proof of this result is tangential to the main thrust of this section we postpone it to subsection 6.6 below.
6.1. Elementary estimates. We will begin the proof of Theorem 5 by listing some elementary but useful results which will be repeatedly applied in this and the following section.
Proof. We may write

Proof. Let $f(z) = \sum_{n=0}^{\infty} a_n (z-1)^n$ for all $z \in D$. Using the Cauchy-Schwarz inequality we have for all $z \in D$ as required.
Proof. In view of the power series $\log z = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}(z-1)^n$, which is valid for all $z \in D$, we have $\|\log\|^2_{H^2(D)} = \frac{\pi^2}{6}$. Given $f \in X$ let us write $f(z) = \alpha \log_2 z + g(z)$ where $g \in H^{\infty}(D)$ and $\alpha \in \mathbb{C}$. Clearly

Lemma 6.5. Let $M > 0$ and $s \in \mathbb{C}$. Then there exists a constant $K \geq 0$ such that for all $z \in \mathbb{C}$ with $\Re(z) > 0$ and $|z| \leq M$ and all integers $k \geq 1$,

Proof. When $z_1, z_2 \in \mathbb{C}$ with $\Re(z_1), \Re(z_2) \geq 0$ the mean value theorem implies that Using the elementary inequality $|e^{\omega} - 1| \leq |\omega| e^{|\omega|}$, which is valid for all $\omega \in \mathbb{C}$, we obtain so we may take $K := |2s| e^{M|s|}$.
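Two of the elementary facts used in the proofs above are easy to check numerically. The sketch below assumes that the squared $H^2(D)$ norm of a function equals the sum of the squared moduli of its Taylor coefficients about $1$ (the standard Hardy-space convention; the definition (13) is not reproduced here), and spot-checks the inequality $|e^{\omega} - 1| \leq |\omega| e^{|\omega|}$. The function names are illustrative.

```python
import cmath
import math


def log_norm_squared(terms):
    """Partial sum of |a_n|^2 for log z = sum_{n>=1} (-1)^(n+1) (z-1)^n / n."""
    return sum(1.0 / (n * n) for n in range(1, terms + 1))


def exp_inequality_holds(w, slack=1e-12):
    """Check |e^w - 1| <= |w| e^{|w|} at one complex point w."""
    return abs(cmath.exp(w) - 1.0) <= abs(w) * math.exp(abs(w)) + slack


if __name__ == "__main__":
    print(log_norm_squared(100000), math.pi ** 2 / 6)
    print(all(exp_inequality_holds(complex(a, b))
              for a in (-2.0, 0.0, 0.5, 3.0) for b in (-1.0, 0.0, 2.0)))
```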
6.2. Auxiliary operator estimates. In this subsection we investigate the action on $X$ of the operator $G_{s,0}$ which was considered in the proof of Proposition 5.2. Our analysis centres around the observation by B. Vallée in [42, Prop. 3] that functions in the image of $G_{s,0}$ may be decomposed into three parts with very particular properties. However, where Vallée decomposes a single function $G_{s,0}f \in H^2(D)$ into a sum of three elements of $H^2(D)$, we wish to decompose $G_{s,0}$ itself into a sum of three bounded operators from $H^2(D)$ to $X$, and our analysis is correspondingly more intricate.

Lemma 6.6. For each $f \in H^2(D)$ and $s, z \in \mathbb{C}$ such that $\Re(z) > 0$ and $\Re(s) > \tfrac{2}{3}$

Proof. Let $M \geq 1$ and $s \in \mathbb{C}$ with $\Re(s) > \tfrac{2}{3}$. Let $z \in \mathbb{C}$ with $\Re(z) > 0$ and $|z| \leq M$, and let $k$ be a non-negative integer. By Lemma 6.2 we have which in particular implies that $1/(1 + 2^{-k}z) \in D$ and therefore $f(1/(1 + 2^{-k}z))$ is well-defined. Using Lemma 6.3 together with (20) it follows that By Lemma 6.5 there exists a constant $K > 0$ depending on $M$ and $s$ such that and this clearly implies in particular We have $|f(1)| \leq \|f\|_{H^2(D)}$ by Lemma 4.1, and using this together with (21), (22) and (23) we obtain say, for all $z \in \mathbb{C}$ such that $\Re(z) > 0$ and $|z| \leq M$, and all integers $k \geq 0$, where $C_1 \geq 1$ depends only on $s$ and on the constant $M \geq 1$. We deduce that the series defining $(B_s f)(z)$ converges uniformly with respect to $z$ in this region and hence defines a holomorphic function in its interior, which clearly satisfies the bound specified in the statement of the lemma. Since $M$ is arbitrary we conclude that for each fixed $s$, $B_s f$ is a holomorphic function defined for all $z \in \mathbb{C}$ such that $\Re(z) > 0$.

Proof. Fix $f$, $m$, $M$ and $s$ throughout the proof. If $k \geq 1$ and $z \in \mathbb{C}$ with $\Re(z) > 0$ and $m \leq |z| \leq M$, then $1/(1 + 2^k z) \in D$ by Lemma 6.2 and therefore $f(1/(1 + 2^k z))$ is well-defined. Using the elementary estimate together with Lemma 4.1 we may obtain the inequality Now, since additionally it follows that for all $z \in \mathbb{C}$ such that $\Re(z) > 0$ and $m \leq |z| \leq M$ say, as required. Since the series defining $(G_{s,0}f)(z)$ converges absolutely uniformly over this region it defines a holomorphic function in the interior of the region, and this function satisfies the bound claimed in the statement of the proposition. Since $m$ and $M$ are arbitrary it follows that for each fixed $s$, the function $G_{s,0}f$ is holomorphic throughout the region $\Re(z) > 0$.
Lemma 6.8. Let $f \in H^2(D)$ and $s \in \mathbb{C}$ with $\Re(s) > \tfrac{2}{3}$. Then the expression converges absolutely at each $z$ in the half-plane $\Re(z) > 0$ and defines a function which is holomorphic in that region. For all $z \in \mathbb{C}$ such that $\Re(z) > 0$ we have $(C_s f)(z) = (C_s f)(2z)$, and there exists a constant $C_3 > 0$ depending only on $s$ such that

Proof. Let $f \in H^2(D)$ and let $0 < m \leq 1 < M$. By Lemma 6.6 there exists $C_1 > 0$ depending on $M$ and $s$ but not on $f$ such that the series When $\Re(z) > 0$ and $1 \leq |z| \leq 2$ we have $|\log z| \leq |\log|z|| + |\arg z| \leq \log 2 + \frac{\pi}{2}$ and therefore It follows that $|(C_s f)(z)|$ is everywhere bounded by $(C_1 + C_2 + 4)\|f\|_{H^2(D)}$ as required.
By combining the previous three lemmas we obtain the following result which underpins much of our analysis of the action of $L_{s,0}$ on $X$:

Proposition 6.9. For each $s \in \mathbb{C}$ with $\Re(s) > \tfrac{2}{3}$ and each $f \in H^2(D)$ the function $G_{s,0}f$ defined in Lemma 6.7 belongs to $X$, and the function $G_{s,0} \colon H^2(D) \to X$ thus defined is a bounded linear map. For each $f \in H^2(D)$ there exists $g \in H^{\infty}(D)$ such that $(G_{s,0}f)(z) = -f(1)\log_2 z + g(z)$ for all $z \in D$.
Proof. Define an operator $A \colon H^2(D) \to X$ by setting $(Af)(z) := -f(1)\log_2 z$ for all $z \in D$ for every $f \in H^2(D)$. Since $|f(1)| \leq \|f\|_{H^2(D)}$ for all $f \in H^2(D)$ by Lemma 4.1 it is clear that $A$ is a bounded linear map from $H^2(D)$ to $X$. Now define two more operators $B_s, C_s \colon H^2(D) \to X$ by taking the function $f \in H^2(D)$ to the functions $B_s f$ and $C_s f$ defined in Lemmas 6.6 and 6.8 respectively. It is clear from Lemmas 6.6 and 6.8 that $B_s$ and $C_s$ are well-defined bounded linear maps from $H^2(D)$ to $X$, and since clearly $G_{s,0}f = Af + B_s f + C_s f$ for all $f \in H^2(D)$ we conclude that $G_{s,0} \colon H^2(D) \to X$ is a bounded linear map. To derive the expression $(G_{s,0}f)(z) = -f(1)\log_2 z + g(z)$ we simply define $g := (B_s + C_s)f \in H^{\infty}(D)$.

6.3. Boundedness of Brent's operator on $X$.
Proposition 6.10. Let $s \in \mathbb{C}$ with $\Re(s) > \tfrac{2}{3}$. For each $f \in X$ the series defines a function $D_{s,0}f \in X$, and the function $D_{s,0} \colon X \to X$ thus defined is a bounded linear map with spectral radius not greater than $\frac{1}{4^{\Re(s)} - 1}$.

Proof. Fix $s$ throughout the proof. We have seen in the proof of Proposition 5.2 that $D_{s,0}$ acts on $H^2(D)$, and since $X \subset H^2(D)$ by Lemma 4.1 it follows that for every $f \in X$ the above formula for $D_{s,0}f$ converges to a well-defined holomorphic function $D_{s,0}f \colon D \to \mathbb{C}$. We begin by proving the following claim: there exists a constant $K > 0$ depending on $s$ such that for all $f \in H^{\infty}(D)$ and $n \geq 1$ we have $D_{s,0}f \in H^{\infty}(D)$ and Let $\phi_k(z) := \frac{z}{z + 2^k}$ for all $k \in \mathbb{N}$ and $z \in D$. Arguing in the same manner as in the proof of Proposition 5.2, for each $n \geq 1$ and $f \in H^{\infty}(D)$ we may write for all $z \in D$, and for each choice of integers $k_1, \ldots, k_n \geq 1$ the inequality is satisfied. It follows easily that for all $z \in D$ we have which implies the validity of (25). We next assert that the holomorphic function $h \colon D \to \mathbb{C}$ defined by $h(z) := (D_{s,0}\log_2)(z) - \frac{1}{4^s - 1}\log_2 z$ belongs to $H^{\infty}(D)$. We begin by noting that for all $z \in D$, By Lemma 6.5 there exists a constant $K > 0$ depending on $s$ such that for each $k \geq 1$ and $z \in D$ it follows that we may estimate for every $z \in D$. On the other hand, to bound the second of the two sums we observe that By combining (26), (27) and (28) we conclude that $h \in H^{\infty}(D)$ as claimed.
We may now prove the results asserted in the statement of the proposition. If $f \in X$ satisfies $f(z) = \alpha\log_2 z + g(z)$ for all $z \in D$ where $\alpha \in \mathbb{C}$ and $g \in H^{\infty}(D)$, then we have
$$(D_{s,0}f)(z) = \frac{\alpha}{4^s - 1}\log_2 z + \alpha h(z) + (D_{s,0}g)(z) \qquad (29)$$
for all $z \in D$, where $\alpha h + D_{s,0}g \in H^{\infty}(D)$. This shows that $D_{s,0}f$ has the form claimed in the statement of the proposition, and furthermore using (25) which shows that $D_{s,0}$ is a bounded linear operator on $X$. More generally, by iterating (29) we find that for each $n \geq 1$ for all $z \in D$, and therefore using (25) again Since $f$ is arbitrary it follows by Gelfand's formula that the spectral radius of $D_{s,0}$ acting on $X$ is not greater than $\frac{1}{4^{\Re(s)} - 1}$. This completes the proof of the proposition.
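The boundedness of $h$ can be illustrated numerically in the special case $s = 1$ and real $x \in (0,1]$. The sketch below assumes that $D_{s,0}$ is the part of the series for $L_{s,0}$ built from the maps $z \mapsto z/(z + 2^k)$, as in the definition of $\phi_k$ above. The comparison value $-4/9$ for the limit as $x \to 0$ is not taken from the text: it is obtained here from the elementary sum $\sum_{k \geq 1} k\,4^{-k} = 4/9$.

```python
import math


def h_value(x, terms=60):
    """h(x) = (D_{1,0} log_2)(x) - (1/3) log_2 x for real x in (0, 1], with s = 1.

    Assumption: (D_{1,0} f)(x) = sum_k f(x / (x + 2^k)) / (x + 2^k)^2.
    """
    total = 0.0
    for k in range(1, terms + 1):
        p = 2.0 ** k
        total += math.log2(x / (x + p)) / (x + p) ** 2
    return total - math.log2(x) / 3.0


if __name__ == "__main__":
    for x in (1.0, 1e-3, 1e-6, 1e-9):
        print(x, h_value(x))  # the values stay bounded although log_2 x does not
```

The individual pieces $(D_{1,0}\log_2)(x)$ and $\tfrac{1}{3}\log_2 x$ both diverge as $x \to 0$; only their difference remains bounded.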
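Corollary 6.11. For each $s \in \mathbb{C}$ with $\Re(s) > \tfrac{2}{3}$, $L_{s,0}$ is a bounded linear operator on $X$. If $L_{s,0}f = \lambda f$ for some $f \in X$ and complex number $\lambda \neq \frac{1}{4^s - 1}$ then there exists $g \in H^{\infty}(D)$ such that
$$f(z) = -\frac{f(1)}{\lambda - \frac{1}{4^s - 1}}\log_2 z + g(z)$$
for all $z \in D$.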
Proof. Since $L_{s,0} = G_{s,0} + D_{s,0}$ and $G_{s,0}, D_{s,0} \in B(X)$ by Propositions 6.9 and 6.10 it is clear that $L_{s,0} \in B(X)$ as claimed. If $f \in X$ satisfies $L_{s,0}f = \lambda f$ and $f(z) = \alpha\log_2 z + g(z)$ for all $z \in D$ where $\alpha \in \mathbb{C}$ and $g \in H^{\infty}(D)$, then by Propositions 6.9 and 6.10 there exist $g_1, g_2 \in H^{\infty}(D)$ such that for all $z \in D$. It follows that $\lambda\alpha = -f(1) + \frac{\alpha}{4^s - 1}$, and since $\lambda \neq \frac{1}{4^s - 1}$ this implies the result claimed.
6.4. Essential spectrum of Brent's operator on X. The principle underlying the following proposition is similar to that behind a theorem of H. Hennion [18]. The author wishes to thank O. Butterley for describing to him some extensions of Hennion's argument.
Proposition 6.12. The essential spectral radius of $L_{s,0}$ acting on $X$ is less than or equal to $\frac{\sqrt{2}}{4^{\Re(s)} - \sqrt{2}}$.

Proof. Let $B_{H^2(D)}$ and $B_X$ denote the closed unit balls of $H^2(D)$ and $X$ respectively, and note that $B_X \subseteq 2B_{H^2(D)}$ by Lemma 6.4. Let $\varepsilon > 0$ be small enough that $\frac{(1+\varepsilon)\sqrt{2}}{4^{\Re(s)} - \sqrt{2}} < 1$. By Proposition 6.9 there exists a constant $K_1 > 0$ such that $\|G_{s,0}f\|_X \leq K_1\|f\|_{H^2(D)}$ for all $f \in H^2(D)$, so in particular for all $f \in H^2(D)$ and $n \geq 1$ By Proposition 6.10 the spectral radius of $D_{s,0}$ acting on $X$ is not greater than $\frac{1}{4^{\Re(s)} - 1}$, so there clearly exists $K_2 > 0$ such that for every $n \geq 0$. By Proposition 5.2 the essential spectral radius of $L_{s,0}$ acting on $H^2(D)$ is not greater than $\frac{\sqrt{2}}{4^{\Re(s)} - \sqrt{2}}$, so using Theorem 4 we may find a constant $K_3 > 0$ such that for every integer $n \geq 0$ the Hausdorff measure of noncompactness of $L^n_{s,0}$ acting on $H^2(D)$ is strictly less than $K_3$ In particular, for each $n \geq 0$ there exist an integer $\ell_n \geq 1$ and a finite sequence $U^n_1, \ldots, U^n_{\ell_n}$ of subsets of whenever $f$ and $g$ both belong to the same set $U^n_i$. We will use these sets to construct a finite covering of $L^n_{s,0}B_X$ by sets of small diameter with respect to $\|\cdot\|_X$.
We claim that this collection of sets forms a cover of $B_X$. To see this suppose that Since $i$ is arbitrary it follows that $f$ belongs to at least one of the sets $V_{(k_0,\ldots,k_{n-1})}$. Since $f$ is arbitrary we conclude that
$$\bigcup_{(k_0,\ldots,k_{n-1}) \in I} V_{(k_0,\ldots,k_{n-1})} = B_X \qquad (33)$$
as claimed. Now let $J$ denote the collection of all sets of the form $L^n_{s,0}V_{(k_0,\ldots,k_{n-1})}$ for $(k_0,\ldots,k_{n-1}) \in I$. In view of (33) it is clear that the union of the elements of $J$ is equal to $L^n_{s,0}B_X$. Let us bound the diameters of the elements of $J$. Suppose that $f, g \in L^n_{s,0}V_{(k_0,\ldots,k_{n-1})} \in J$. By definition there exist $\hat{f}, \hat{g} \in V_{(k_0,\ldots,k_{n-1})}$ such that $f = L^n_{s,0}\hat{f}$ and $g = L^n_{s,0}\hat{g}$, and we trivially have $\|\hat{f} - \hat{g}\|_X \leq 2$ since both functions belong to $B_X$. It follows from the definition of $V_{(k_0,\ldots,k_{n-1})}$ that for each $i = 0, \ldots, n-1$ the functions $\frac{1}{2}L^i_{s,0}\hat{f}$ and $\frac{1}{2}L^i_{s,0}\hat{g}$ both belong to $U^i_{k_i}$, and therefore (32) and (30). Now, the relation is easily seen to hold for all integers $m \geq 1$, since the case $m = 1$ is simply the identity $L_{s,0} = G_{s,0} + D_{s,0}$ and the same identity facilitates the induction step Using (35) followed by (34) and (31) we may write $L^n_{s,0}(\hat{f} - \hat{g})$ whenever $f, g \in L^n_{s,0}B_X$ belong to the same element of $J$. We have shown that the collection $J$ of subsets of $X$ forms a finite cover of $L^n_{s,0}B_X$ whose elements have diameter bounded by the quantity above. This last expression is therefore an upper bound for the Hausdorff measure of noncompactness of $L^n_{s,0}$ acting on $X$. Since $n$ is arbitrary we deduce using Theorem 4 that the essential spectral radius of $L_{s,0}$ acting on $X$ is less than or equal to $\frac{(1+\varepsilon)\sqrt{2}}{4^{\Re(s)} - \sqrt{2}}$, and since $\varepsilon$ is arbitrary the conclusion of the proposition follows.
6.5. Point spectrum of Brent's operator on $X$. The result of Proposition 6.12 renders it a straightforward undertaking to bound the spectral radius of $L_{s,0}$ as follows.
Proof. Since the essential spectral radius of $L_{s,0}$ is strictly less than one it suffices to bound the moduli of the eigenvalues of $L_{s,0}$. To this end suppose that $L_{s,0}\xi_s = \lambda\xi_s$ for some $\lambda \in \mathbb{C}$ and nonzero $\xi_s \in X$. We first consider the case in which $\Re(s) > 1$. Since $\xi_s$ is holomorphic but is not the zero function we have $|\xi_s(x)| > 0$ for all but countably many $x \in (0,1]$. In particular, for all but countably many $x \in (0,1]$ both of the quantities are nonzero for all $k \geq 1$. Using the inequalities $2\Re(s) > 2$ and $0 < \frac{1}{1 + 2^k x} < 1$ it follows that for such an $x$ where $|\xi_s|$ is understood as an element of $L^1([0,1])$ and $L_{1,0}|\xi_s|$ is understood in the sense of Lemma 5.1, since obviously $|\xi_s| \notin X$. By integration we deduce using Lemma 5.1, which implies that $|\lambda| < 1$ as claimed. If instead $\Re(s) = 1$ then a similar analysis shows that $|\lambda\xi_s(x)| \leq (L_{1,0}|\xi_s|)(x)$ for all $x \in (0,1]$ and by integration we deduce that $|\lambda| \leq 1$.
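The pointwise domination $|(L_{s,0}f)(x)| \leq (L_{1,0}|f|)(x)$ for $\Re(s) \geq 1$ and real $x \in (0,1]$, which drives the argument above, can be spot-checked numerically. The sketch below assumes the series for $L_{s,0}$ written out in the proof of Proposition 8.1 and truncates both sides at the same index, so the comparison holds term by term. The test function and the sample points are arbitrary illustrative choices.

```python
import cmath


def brent_operator_s(f, s, x, terms=40):
    """Truncated series for (L_{s,0} f)(x), with x real in (0, 1] and s complex."""
    total = 0j
    for k in range(1, terms + 1):
        p = 2.0 ** k
        total += (1.0 + p * x) ** (-2 * s) * f(1.0 / (1.0 + p * x))
        total += (x + p) ** (-2 * s) * f(x / (x + p))
    return total


def dominated(f, s, x, slack=1e-12):
    """Check |(L_{s,0} f)(x)| <= (L_{1,0} |f|)(x) at one point."""
    lhs = abs(brent_operator_s(f, s, x))
    rhs = brent_operator_s(lambda t: abs(f(t)), 1.0, x).real
    return lhs <= rhs + slack


if __name__ == "__main__":
    sample = lambda t: cmath.exp(3j * t) + t  # hypothetical test function
    print(all(dominated(sample, s, x)
              for s in (1 + 2j, 1.5 - 1j, 1 + 0j)
              for x in (0.01, 0.3, 1.0)))
```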
To proceed further we will use a generalisation of the Kreȋn-Rutman theorem due to R. Nussbaum [32]. Following the conventions of Nussbaum's article we shall say that a subset $K$ of a real Banach space $X$ is a cone if it is closed and convex, satisfies $\lambda x \in K$ for all $x \in K$ and $\lambda \geq 0$, and for every

Theorem 6 (Nussbaum). Let $(X, \|\cdot\|)$ be a real Banach space, $K \subset X$ a cone, and $L \colon X \to X$ a bounded linear operator such that $LK \subseteq K$. Let $B_K$ denote the intersection of the closed unit ball of $X$ with the cone $K$, and define the spectral radius of $L$ relative to $K$ to be the quantity where $\psi(Z)$ is the Kuratowski measure of noncompactness of the set $Z \subset X$. If $\rho^K_{\mathrm{ess}}(L) < \rho_K(L)$ then there exists a nonzero function $u \in K$ such that $Lu = \rho_K(L)u$.

We use this theorem to obtain the following:

Lemma 6.14. There exists $\xi \in X$ such that $L_{1,0}\xi = \xi$, $\int_0^1 \xi(x)\,dx = 1$, and $\xi(x) > 0$ for all $x \in (0,1]$. There exists $\chi \in H^{\infty}(D)$ such that for all $z \in D$

Proof. Let $X_{\mathbb{R}}$ denote the real Banach space of functions $f \in X$ such that $f(x)$ is real for every $x \in (0,1]$, equipped with the same norm as $X$, and let $K$ denote the set of all $f \in X_{\mathbb{R}}$ such that $f(x) \geq 0$ for all $x \in (0,1]$. It is straightforward to verify that $K$ is a cone in $X_{\mathbb{R}}$ in the sense defined above and that $L_{1,0}K \subseteq K$. It is clear from Gelfand's formula that the quantity $\rho_K(L_{1,0})$ is bounded above by the spectral radius of the operator $L_{1,0}$ acting on $X$, and by Lemma 6.13 this in turn is bounded above by $1$. Conversely, observe that the constant function $\mathbf{1}$ belongs to $B_K$. For each $n \geq 1$ we may choose $\theta_n \in \mathbb{C}$ and $g_n \in H^{\infty}(D)$ such that $(L^n_{1,0}\mathbf{1})(z) = \theta_n\log_2 z + g_n(z)$ for all $z \in D$, which by Lemma 5.1 implies and since $n$ is arbitrary we deduce that $\rho_K(L_{1,0}) \geq 1$.
Lastly, it is obvious that for each $n \geq 1$ the quantity $\psi(L^n_{1,0}B_K)$ is not greater than the Kuratowski measure of noncompactness of the image under $L^n_{1,0}$ of the closed unit ball of $X$, and we know by Proposition 6.12 that this quantity decreases to zero with exponential speed as $n \to \infty$. We conclude that $\rho^K_{\mathrm{ess}}(L_{1,0}) < \rho_K(L_{1,0}) = 1$, and by Theorem 6 it follows that there exists a nonzero function $\xi \in K$ such that $L_{1,0}\xi = \xi$. It is clear that every nonzero element of $K$ has positive integral along the interval $[0,1]$, so by multiplying $\xi$ by a real scalar if necessary we may without loss of generality suppose that $\int_0^1 \xi(x)\,dx = 1$. We note that Corollary 6.11 immediately yields the validity of the formula (36).
For the remainder of the article we let $\xi$ denote the function constructed in Lemma 6.14 above.

Lemma 6.15. Let $t \in \mathbb{R}$ and suppose that $L_{1+it,0}\xi_t = \lambda\xi_t$ for some nonzero function $\xi_t \in X$ and some $\lambda \in \mathbb{C}$ such that $|\lambda| = 1$. Then $\lambda = 1$, $t = 0$, and $\xi_t$ is a scalar multiple of $\xi$.
Proof. By multiplying $\xi_t$ by a complex number of unit modulus if required, we may assume without loss of generality that $\xi_t(1)$ is real and nonnegative. By Corollary 6.11 and Lemma 6.14 there exist $\chi_t, \chi \in H^{\infty}(D)$ such that for all $z \in D$ and (38) and it follows in particular that the quantity $\sup_{x \in (0,1]} |\xi_t(x)|\,\xi(x)^{-1}$ is finite. Multiplying $\xi_t$ by a positive real number if necessary we may assume that this supremum is equal to one. To prove the lemma we will show that under this hypothesis $\lambda = 1$, $t = 0$ and $\xi_t = \xi$. Let us investigate the scalar factor which arises in (37). Since $|\lambda| = 1$ we have If $4^{1+it} \neq 4$ then $|4^{1+it} - 1| > 3$ and therefore the second inequality above is strict. If $4^{1+it} = 4$ then $|4^{1+it} - 1| = 3$, but if additionally $\lambda \neq 1$ then the first inequality must be strict since the terms inside the summation have different arguments and will partially cancel one another. We conclude that with equality if and only if both $\lambda = 1$ and $4^{it} = 1$. It follows that and if the limit is equal to one then necessarily $\lambda = 1$ and $4^{it} = 1$. Let us show that the limit in (39) must equal one. For a contradiction let us suppose otherwise. In this case the supremum of $|\xi_t(x)|\,\xi(x)^{-1}$ over $x \in (0,1]$ is necessarily attained at some point $x_0 \in (0,1]$. Since by hypothesis the limit (39) is strictly less than one we necessarily have for all sufficiently large integers $k$, and hence contradicting our hypothesis that $|\lambda| = 1$. We conclude that the limit in (39) is equal to one and hence in particular $4^{it} = 1$, $\lambda = 1$, and $\xi_t(1) = \xi(1)$. In view of the last two identities we have (1), where we have simplified the expression for $(L^2_{1+it,0}\xi_t)(1)$ by taking advantage of the fact that the four functions which appear in the sum defining $(L^2_{1+it,0}\xi_t)(z)$, one of which is $\frac{1}{(1 + 2^k + 2^{k+\ell}z)^{2+2it}}\,\xi_t\!\left(\frac{1}{1 + 2^k + 2^{k+\ell}z}\right)$, take only two distinct values when evaluated at $z = 1$, and we have used a similar simplification for $(L^2_{1,0}\xi)(1)$.
Since the first and final expressions in the chain of inequalities (40) are identical, the inequalities in between must necessarily be equations. For this to be possible the expressions must have the same argument as one another and must also have constant argument with respect to the choice of $k, \ell \geq 1$, since otherwise the first inequality in (40) would be strict due to partial cancellations between terms. Similarly, since $|\xi_t(x)| \leq \xi(x)$ for all $x \in (0,1]$ the identities $\left|\xi_t\!\left(\frac{1 + 2^{\ell}}{2^k + 2^{\ell} + 1}\right)\right| = \xi\!\left(\frac{1 + 2^{\ell}}{2^k + 2^{\ell} + 1}\right)$ and $\left|\xi_t\!\left(\frac{1}{1 + 2^k + 2^{k+\ell}}\right)\right| = \xi\!\left(\frac{1}{1 + 2^k + 2^{k+\ell}}\right)$ must hold for every $k, \ell \geq 1$ since otherwise the second inequality in (40) would be strict. It follows that we may choose $\theta \in \mathbb{R}$ such that for all $k, \ell \geq 1$ Taking $k = 1$ and recalling that $4^{it} = 1$ it is clear that this implies and so in fact $e^{i\theta} = 1$. If $r \in \mathbb{N}$ is any integer then taking instead $\ell \equiv k + r$ we similarly find $1 = e^{i\theta} = \lim$ Since the sequence takes values in $D$ and converges to a limit in $D$, the validity of the identity $\xi_t\!\left(\frac{2^r}{1 + 2^r}\right) = \left(\frac{2^r}{1 + 2^r}\right)^{-2it}\xi\!\left(\frac{2^r}{1 + 2^r}\right)$ for all integers $r \geq 1$ implies that $\xi_t(z) = z^{-2it}\xi(z)$ for every $z \in D$. By (37) and (38) it follows that for real $x \in (0,1]$ but this limit fails to exist when $t \neq 0$. We conclude that $t = 0$ and therefore $\xi_t(z) = \xi(z)$ for all $z \in D$, which completes the proof of the lemma.
Collating together the results of this subsection we obtain the following result which, in combination with Corollary 6.11 and Proposition 6.12, completes the proof of Theorem 5.
Proof. All of these properties follow from the combination of Lemmas 6.13, 6.14 and 6.15 except for the simplicity of the eigenvalue of $L_{1,0}$ at $1$. Specifically, while Lemmas 6.14 and 6.15 together show that $\ker(L_{1,0} - \mathrm{Id}_X)$ is one-dimensional, it remains to show that $\ker(L_{1,0} - \mathrm{Id}_X)^{n+1}$ is one-dimensional for every $n \geq 1$. Suppose for a contradiction that this is not the case, and let $n \geq 1$ be the smallest integer such that the dimension of $\ker(L_{1,0} - \mathrm{Id}_X)^{n+1}$ exceeds one. If $\hat{\xi} \in \ker(L_{1,0} - \mathrm{Id}_X)^{n+1}$ then necessarily $(L_{1,0} - \mathrm{Id}_X)\hat{\xi} \in \ker(L_{1,0} - \mathrm{Id}_X)^n$ and so we have $(L_{1,0} - \mathrm{Id}_X)\hat{\xi} = \lambda\xi$ for some $\lambda \in \mathbb{C}$, since $\ker(L_{1,0} - \mathrm{Id}_X)^n$ is one-dimensional and contains $\xi$. However, using Lemma 5.1 we may calculate so that in fact $(L_{1,0} - \mathrm{Id}_X)\hat{\xi} = 0$. We conclude that $\hat{\xi} \in \ker(L_{1,0} - \mathrm{Id}_X)$, and since $\hat{\xi}$ was arbitrary it follows that $\ker(L_{1,0} - \mathrm{Id}_X)^{n+1} = \ker(L_{1,0} - \mathrm{Id}_X)$, contradicting the hypothesis that $\dim\ker(L_{1,0} - \mathrm{Id}_X)^{n+1} > 1$. The proof is complete.
Proof. We assert that $F_n(x) = \int_0^x (L^n_{1,0}\mathbf{1})(t)\,dt$ for all $x \in (0,1]$ and $n \geq 0$, which we will prove by induction on $n$. The case $n = 0$ is clearly trivial. To prove the induction step, suppose that $F_n(x) = \int_0^x (L^n_{1,0}\mathbf{1})(t)\,dt$ for all $x \in (0,1]$ and some integer $n \geq 0$ and note that for each $x \in (0,1]$ we may write Since for each $x \in (0,1]$ using the change of variable $u = t/(t + 2^k)$ and $v = 1/(1 + 2^k t)$ respectively, we have for all $x \in (0,1]$ as required to complete the induction step. By Theorem 5, $1$ is an isolated point of the spectrum of $L_{1,0}$ which does not belong to the essential spectrum and is a simple eigenvalue in the sense of Proposition 4.2, and the remainder of the spectrum of $L_{1,0}$ acting on $X$ lies inside a disc about the origin of radius strictly smaller than $1$. It follows from Proposition 4.2 that there exist $P, N \in B(X)$ such that $L_{1,0} = P + N$, $PN = NP = 0$, $PL_{1,0} = L_{1,0}P$, $P^2 = P$ and $\rho(N) < 1$. For each $n \geq 1$ we therefore have
$$L^n_{1,0}\mathbf{1} = P\mathbf{1} + N^n\mathbf{1}. \qquad (41)$$
As a particular consequence $\lim_{n \to \infty} L^n_{1,0}\mathbf{1} = P\mathbf{1}$. Since $X$ embeds continuously in $L^1([0,1])$, $L_{1,0}P\mathbf{1} = P^2\mathbf{1} = P\mathbf{1}$ which by Theorem 5 implies that $P\mathbf{1}$ is a scalar multiple of $\xi$. On the other hand $\int_0^1 (L^n_{1,0}\mathbf{1})(x)\,dx = \int_0^1 \mathbf{1}(x)\,dx = 1$ for every $n \geq 1$ and therefore $\int_0^1 (P\mathbf{1})(x)\,dx = 1$, and we conclude that $P\mathbf{1} = \xi$.
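The geometric convergence of the functions $F_n$ can be observed numerically. The sketch below assumes the recursion $F_{n+1}(x) = 1 + \sum_{k \geq 1} 2^{-k}\bigl(F_n(x/(x+2^k)) - F_n(1/(1+2^k x))\bigr)$ with $F_0(x) = x$, which is what the identity $F_n(x) = \int_0^x (L^n_{1,0}\mathbf{1})(t)\,dt$ and the two changes of variable above give when the series for $L_{1,0}$ is the one written out in the proof of Proposition 8.1. The uniform grid, the linear interpolation and the truncation index are illustrative discretisation choices and introduce their own small errors.

```python
def interpolate(values, t):
    """Piecewise-linear interpolation of samples on the grid j/N, for 0 <= t <= 1."""
    n = len(values) - 1
    u = t * n
    i = min(int(u), n - 1)
    w = u - i
    return (1.0 - w) * values[i] + w * values[i + 1]


def next_distribution(values, terms=30):
    """One step F -> 1 + sum_k 2^-k (F(x/(x+2^k)) - F(1/(1+2^k x))) on the grid."""
    n = len(values) - 1
    out = []
    for j in range(n + 1):
        x = j / n
        total = 1.0
        for k in range(1, terms + 1):
            p = 2.0 ** k
            total += (interpolate(values, x / (x + p))
                      - interpolate(values, 1.0 / (1.0 + p * x))) / p
        out.append(total)
    return out


def successive_gaps(grid_size=400, steps=8):
    """Grid sup-norm gaps max |F_{n+1} - F_n|, starting from F_0(x) = x."""
    current = [j / grid_size for j in range(grid_size + 1)]
    gaps = []
    for _ in range(steps):
        following = next_distribution(current)
        gaps.append(max(abs(a - b) for a, b in zip(following, current)))
        current = following
    return gaps, current


if __name__ == "__main__":
    gaps, final = successive_gaps()
    for n, gap in enumerate(gaps):
        print(n, gap)  # the gaps should shrink roughly geometrically
```

The ratio of successive gaps gives a rough empirical impression of the constant $\theta$ in the statement, subject to the discretisation error of the grid.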

Conclusion of the proof of Theorem 3
In this short section we derive clauses (b) to (d) of Theorem 3 from the corresponding parts of Theorem 5 and proceed to prove Theorem 3(e). These actions finally complete the proof of Theorem 3.

Proof. By Lemma 6.4 every element of $X$ belongs to $H^2(D)$, so the 'if' part of the lemma is trivial. To prove the converse direction we must therefore prove that $\dim\ker(L_{s,0} - \lambda\,\mathrm{Id}_{H^2(D)})^n \leq \dim\ker(L_{s,0} - \lambda\,\mathrm{Id}_X)^n$.
Since $|\lambda|$ exceeds the essential spectral radius of $L_{s,0}$ acting on $H^2(D)$ the operator Since every power of a Fredholm operator of index zero is also Fredholm of index zero, $d$ is precisely the codimension of the image of $(L_{1,0} - \lambda\,\mathrm{Id}_{H^2(D)})^n$, which in turn is equal to the dimension of the kernel of the adjoint operator $\left((L_{1,0} - \lambda\,\mathrm{Id}_{H^2(D)})^n\right)^*$ acting on $H^2(D)^*$. If $\ell \colon H^2(D) \to \mathbb{C}$ is a nonzero element of this kernel then by definition and $|\ell(f)| \leq C_{\ell}\|f\|_{H^2(D)}$ for every $f \in H^2(D)$, where $C_{\ell}$ is a constant depending on $\ell$. It follows from Lemma 6.4 that for every $f \in X$ the quantity $\ell(f)$ is well-defined and satisfies $|\ell(f)| \leq C_{\ell}\|f\|_{H^2(D)} \leq 2C_{\ell}\|f\|_X$, so $\ell$ belongs to $X^*$ and therefore for every $f \in X$. Since $X$ contains $H^{\infty}(D)$, and $H^{\infty}(D)$ is dense in $H^2(D)$, $\ell$ cannot be the zero element of $X^*$, and we conclude that $\ell$ is a nonzero element of $\ker\left((L_{1,0} - \lambda\,\mathrm{Id}_X)^n\right)^*$. Since $\ell$ is arbitrary it follows that this kernel has dimension at least $d$. Since $|\lambda|$ exceeds the essential spectral radius of $L_{1,0}$ acting on $X$ the operator $(L_{1,0} - \lambda\,\mathrm{Id}_X)^n$ is also Fredholm of index zero and the image of $(L_{1,0} - \lambda\,\mathrm{Id}_X)^n$ is closed. The codimension of this image equals the dimension of $\ker\left((L_{1,0} - \lambda\,\mathrm{Id}_X)^n\right)^*$ and hence is also at least $d$. By the Fredholm property of $(L_{1,0} - \lambda\,\mathrm{Id}_X)^n$ it follows that $\ker(L_{1,0} - \lambda\,\mathrm{Id}_X)^n$ has dimension at least $d$, and this proves the lemma.
By Proposition 5.2 the essential spectral radius of $L_{s,0}$ acting on $H^2(D)$ is bounded by $\frac{\sqrt{2}}{4^{\Re(s)} - \sqrt{2}}$, so every point of the spectrum of $L_{s,0}$ with modulus greater than that quantity is an eigenvalue. The combination of Theorem 5 and Lemma 7.1 immediately yields:

Corollary 7.2. The operator $L_{1,0} \in B(H^2(D))$ has a simple eigenvalue at $1$ and has no other eigenvalues on the unit circle. There exists $\xi \in H^2(D)$ such that $L_{1,0}\xi = \xi$, $\int_0^1 \xi(x)\,dx = 1$ and $\xi(x) > 0$ for all $x \in (0,1]$. If $\lambda \in \mathbb{C}$, $L_{s,0}\hat{\xi} = \lambda\hat{\xi} \in H^2(D)$ and $|\lambda| >$ for all $z \in D$. If $s \in \mathbb{C}$ with $\Re(s) \geq 1$, then $\rho(L_{s,0}) \leq 1$ with equality if and only if $s = 1$.
Together with the following proposition the above completes the proof of Theorem 3(b)-(d).
(3) The spectral radius of $N_{s,\omega}$ is strictly less than one.
(4) The operator $P_{s,\omega}$ is a projection with rank equal to one.
The functions $\lambda$ and $P$ also satisfy $\lambda(1,0) = 1$ and $P_{1,0}$

Proof. By Corollary 7.2, $1$ is an isolated point of the spectrum of $L_{1,0}$, so we may choose a counterclockwise-oriented closed curve $\Gamma$ in $\mathbb{C}$ which encloses $1$ but does not enclose any other points of the spectrum of $L_{1,0}$. By Proposition 5.2 the essential spectral radius of $L_{1,0}$ is less than one and so the operator $L_{1,0} - \mathrm{Id}_{H^2(D)}$ is Fredholm of index zero, and it follows from Corollary 7.2 that the remainder of the spectrum of $L_{1,0}$ lies in a disc about the origin of radius strictly less than one. By [20, Theorem IV.3.16] there exists an open ball $\mathcal{V}$ containing $(1,0)$ such that for all $(s,\omega) \in \mathcal{V}$, the spectrum of $L_{s,\omega}$ does not intersect $\Gamma$. For all $(s,\omega) \in \mathcal{V}$ let us define which is a projection by [20, Theorem III.6.17] and clearly commutes with $L_{s,\omega}$.
Since $\left(zL_{s,\omega} - \mathrm{Id}_{H^2(D)}\right)^{-1}$ depends holomorphically on $(s,\omega)$ within its domain of definition for each fixed $z$, it is easily seen that $P_{s,\omega}$ depends holomorphically on $(s,\omega)$. Define $N_{s,\omega} := L_{s,\omega} - L_{s,\omega}P_{s,\omega}$ for each $(s,\omega)$; this operator clearly also depends holomorphically on $(s,\omega)$. The identity $N_{s,\omega}P_{s,\omega} = P_{s,\omega}N_{s,\omega}$ follows from the definitions and the fact that $P_{s,\omega}$ is a projection. By Proposition 4.2 the rank of $P_{1,0}$ is $1$ and we have $L_{1,0} = P_{1,0} + N_{1,0}$ and $\rho(N_{1,0}) < 1$. By [20, Theorem IV.3.16] the rank of $P_{s,\omega}$ is equal to that of $P_{1,0}$ for all $(s,\omega) \in \mathcal{V}$, and since $L_{s,\omega}$ clearly commutes with $P_{s,\omega}$ the image of $P_{s,\omega}$ is invariant under $L_{s,\omega}$ and hence is a one-dimensional eigenspace. Let $\lambda(s,\omega)$ denote the corresponding eigenvalue; since $L_{1,0} = P_{1,0} + N_{1,0}$ we have $\lambda(1,0) = 1$. By Corollary 7.2 it follows that the image of $P_{1,0}$ is the one-dimensional subspace of $H^2(D)$ spanned by $\xi$.
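The construction of a spectral projection by integrating a resolvent around an isolated eigenvalue can be illustrated in finite dimensions. The sketch below is only a stand-in for the operator-theoretic argument: it assumes the standard convention $P = \frac{1}{2\pi i}\oint_{\Gamma} (z\,\mathrm{Id} - L)^{-1}\,dz$, uses a hypothetical $2 \times 2$ matrix with eigenvalues $1$ and $0.2$ in place of $L_{1,0}$, and approximates the contour integral by the trapezoidal rule on a circle.

```python
import cmath
import math


def resolvent(matrix, z):
    """(z*Id - matrix)^(-1) for a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = matrix
    det = (z - a) * (z - d) - b * c
    return (((z - d) / det, b / det), (c / det, (z - a) / det))


def riesz_projection(matrix, centre, radius, nodes=200):
    """Trapezoidal approximation of (1 / (2 pi i)) * contour integral of the resolvent."""
    acc = [[0j, 0j], [0j, 0j]]
    for j in range(nodes):
        step = radius * cmath.exp(2j * math.pi * j / nodes)
        res = resolvent(matrix, centre + step)
        for r in range(2):
            for q in range(2):
                # dz = i * step * dtheta, so dz / (2 pi i) = step / nodes
                acc[r][q] += res[r][q] * step / nodes
    return acc


if __name__ == "__main__":
    toy = ((1.0, 0.3), (0.0, 0.2))  # hypothetical matrix, eigenvalues 1 and 0.2
    print(riesz_projection(toy, 1.0, 0.5))
```

For this matrix the result is the rank-one projection with entries $1$, $0.375$, $0$, $0$, and the remainder $N = L - LP$ has eigenvalues $0$ and $0.2$, mirroring the decomposition $L_{1,0} = P_{1,0} + N_{1,0}$ with $\rho(N_{1,0}) < 1$.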

The derivatives of the leading eigenvalue
We now take our first steps towards the proof of Theorem 2 by investigating the derivatives of the function $\lambda$ defined in Theorem 3. This will be applied in the following two sections when we relate the operator $L_{s,\omega}$ to the quantity $\mu(c)$ defined in Theorem 2 via the equation (12).
As well as providing the important information that the derivative of $\lambda(s,0)$ at $s = 1$ is nonzero, the following result is crucial in unifying several of the expressions for the asymptotic number of subtraction steps which were stated in Theorem 2. In this and all subsequent sections we use the notation $\lambda_s$ and $\lambda_{\omega}$ to refer to the partial derivatives of $\lambda$ with respect to the first and second variables respectively.
Proposition 8.1. Let $\mathcal{V} \subset \mathbb{C}^2$ and $\lambda \colon \mathcal{V} \to \mathbb{C}$ be as given in Theorem 3. Then

Proof. We begin the proof with a calculation of a type which is rather standard in the theory of Ruelle operators (see for example [33]). Let $V := \{s \in \mathbb{C} : (s,0) \in \mathcal{V}\}$ and define $\xi_s := P_{s,0}\xi$ for every $s \in V$. Clearly the function from $V$ to $H^2(D)$ defined by $s \mapsto \xi_s$ is holomorphic and satisfies $\xi_1 = \xi$, and we have
\[ L_{s,0}\xi_s = L_{s,0}P_{s,0}\xi = \lambda(s,0)P_{s,0}\xi = \lambda(s,0)\xi_s \tag{42} \]
for every $s \in V$. For each $s \in V$ let $\xi_s' \in H^2(D)$ denote the first derivative of the function $s \mapsto \xi_s$ evaluated at $s$.
For each $s \in V$ and $z \in D$ we may use (42) to write
\[ \lambda(s,0)\,\xi_s(z) = \sum_{k=1}^{\infty} \left( \frac{1}{(1+2^k z)^{2s}}\,\xi_s\!\left(\frac{1}{1+2^k z}\right) + \frac{1}{(z+2^k)^{2s}}\,\xi_s\!\left(\frac{z}{z+2^k}\right) \right) \]
and for each fixed $z \in D$ this series converges absolutely in a manner which is locally uniform with respect to $s$. It follows that for each $z \in D$ we may differentiate termwise with respect to $s$ at $s = 1$ to obtain of which the right-hand side simplifies to Integrating along the interval $(0,1)$, applying Lemma 5.1 and eliminating the term $\int_0^1 \xi_1'(x)\,dx$ from both sides of the equation we derive the identity Using the substitution $u = 1/(1+2^k x)$ for each $k$ we may obtain so by combining these results we may obtain which is the first of the three identities claimed.
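The termwise differentiation described above can be written out explicitly. The display below is a reconstruction offered as a sketch: it is obtained by differentiating the series for $\lambda(s,0)\xi_s(z)$ in $s$ and then setting $s = 1$, using $\xi_1 = \xi$ and $\lambda(1,0) = 1$. The grouping of the terms, and the reading of the second sum as $(L_{1,0}\xi_1')(z)$, are assumptions based on the form of the series and are not taken from the original displays.

```latex
% Sketch: differentiate lambda(s,0) xi_s(z) = sum_k [...] in s, then set s = 1.
% Uses d/ds a^{-2s} = -2 log(a) a^{-2s}, xi_1 = xi and lambda(1,0) = 1.
\begin{align*}
\lambda_s(1,0)\,\xi(z) + \xi_1'(z)
 &= -2\sum_{k=1}^{\infty}\left(
      \frac{\log(1+2^k z)}{(1+2^k z)^{2}}\,\xi\!\left(\frac{1}{1+2^k z}\right)
    + \frac{\log(z+2^k)}{(z+2^k)^{2}}\,\xi\!\left(\frac{z}{z+2^k}\right)\right) \\
 &\quad + \sum_{k=1}^{\infty}\left(
      \frac{1}{(1+2^k z)^{2}}\,\xi_1'\!\left(\frac{1}{1+2^k z}\right)
    + \frac{1}{(z+2^k)^{2}}\,\xi_1'\!\left(\frac{z}{z+2^k}\right)\right).
\end{align*}
% Assumption: the second sum equals (L_{1,0} xi_1')(z), i.e. the same series
% with s = 1 applied to xi_1' in place of xi_s.
```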
We now make the following general assertion: if $f \colon (0,1) \to \mathbb{R}$ is a measurable function such that Viewed as a statement about the random dynamical system determined by the family of maps $T_k \colon [0,1] \to [0,1]$, this assertion equates to the statement that the product of the probability measure with respect to which the maps are chosen with the absolutely continuous measure on $[0,1]$ with density $\xi$ is stationary with respect to the skew product transformation. Let us prove the claim. Given such a function $f$, using the substitution $u = (1-x)/(2^k x)$ yields and the substitution $v = 2^k x/(1-x)$ similarly yields Since by definition $\xi(x) = (L_{1,0}\xi)(x)$ for every $x \in (0,1)$ it follows that indeed as was claimed.
Let us now apply the claim with $f(x) := 2\log(1+x)$, which clearly satisfies the integrability hypothesis. In this case the claim results in the identity and by adding this to the already-established identity (43) we obtain $\lambda_s(1,0) + 2$ or more simply which is the second identity asserted in the statement of the proposition. Finally let us apply the claim with $f(x) := \log x$, which meets the integrability hypothesis since $|f(x)\xi(x)| \le C(1 + |\log x|^2)$ for all $x \in (0,1)$ for some positive constant $C$. In this case the claim yields Adding this equation to the previously-established identity (43) results in the identity which simplifies to and this is the third identity asserted in the statement of the proposition. The proof is complete.
The following result allows us to relate the expression µ(c) defined in the statement of Theorem 2 to the function λ.
Lemma 8.2. Let $\mathcal{V} \subset \mathbb{C}^2$ and $\lambda \colon \mathcal{V} \to \mathbb{C}$ be as given in Theorem 3. Then

Proof. Similarly to the proof of Proposition 8.1 let $W := \{\omega \in \mathbb{C} : (1,\omega) \in \mathcal{V}\}$ and define $\xi_\omega := P_{1,\omega}\xi$ for every $\omega \in W$. Clearly the function from $W$ to $H^2(D)$ defined by $\omega \mapsto \xi_\omega$ is holomorphic and satisfies $\xi_0 = \xi$, and $L_{1,\omega}\xi_\omega = \lambda(1,\omega)\xi_\omega$ for every $\omega \in W$. For each $\omega \in W$ let $\xi_\omega' \in H^2(D)$ denote the first derivative of the function $\omega \mapsto \xi_\omega$ evaluated at $\omega$. For each $\omega \in W$ and $z \in D$ we have and for each fixed $z \in D$ this series converges absolutely in a manner which is locally uniform with respect to $\omega$. It follows that for each $z \in D$ we may differentiate termwise with respect to $\omega$ at $\omega = 0$ to obtain Integrating along the interval $(0,1)$ and subtracting the quantity $\int_0^1 \xi_0'(x)\,dx$ from either side yields in a straightforward manner.

Properties of the Dirichlet series
In this section we establish the equation (12) which relates the subject of Theorem 3 with that of Theorem 2, and apply it to study Dirichlet series in one variable which describe the moments of the distribution of $C(u,v)$ on $\Xi^{(1)}_n$ and $\Xi^{(2)}_n$. The desired correspondence rests on the following dull but necessary technical lemma:

Lemma 9.1. Let $(s,\omega) \in \mathcal{U}$ where $\mathcal{U}$ is as defined in Theorem 3. For each $n \ge 1$ let $\Theta_n$ denote the set of all pairs of coprime odd natural numbers $(u,v)$, where $u \le v$, which are mapped to $(1,1)$ by exactly $n$ steps of the binary Euclidean algorithm. Then for each $(s,\omega) \in \mathcal{U}$ and $n \ge 1$, and furthermore so for each $n \ge 1$ we have We will show that this last sum matches the second expression given in the statement of the lemma, and to do this we must characterise the sets $\Theta_n$ in terms of the functions $h \in H$. Clearly we have $\Theta_0 = \{(1,1)\}$ and $\Theta_1 = \{(1, 1+2^k) : k \ge 1\}$, and $\Theta_n \cap \Theta_m = \emptyset$ when $m \ne n$.

We make the following claim: for each $n \ge 1$ we have $(u,v) \in \Theta_n$ if and only if there exists a finite sequence $h_1, \ldots, h_n \in H$ such that $h_1 \in H_D$ and
\[ \frac{u}{v} = (h_n \circ \cdots \circ h_1)(1), \tag{45} \]
and to each $(u,v) \in \Theta_n$ there corresponds a unique such sequence $h_1, \ldots, h_n$; furthermore, when (45) is satisfied with $h_1 \in H_D$ we have $C(u,v) = \sum_{i=1}^{n} c(h_i)$.

We first consider the case $n = 1$. We have $(u,v) \in \Theta_1$ if and only if $u = 1$ and $v = 1 + 2^k$ for some integer $k \ge 1$. In this case a single step of the algorithm subtracts $u$ from $v$, divides by $2^k$ and does not perform an exchange, so the appropriate cost is $c(2,k)$. It is clear that $u/v = h(1)$ for $h(z) = z/(z+2^k)$ and that this relation does not hold when $h$ is replaced with a different element of $H_D$, and we have $c(h) = c(2,k) = C(u,v)$ as required. Conversely if $u/v = h(1)$ in least terms for some $h \in H_D$ then $(u,v) = (1, 1+2^k)$ for some integer $k$ and therefore $(u,v) \in \Theta_1$. This completes the proof in the case $n = 1$.
Let us now suppose that case $n$ of the claim has been proved and deduce case $n+1$. It is sufficient to show that if $(u,v) \in \Theta_n$ and $u/v = (h_n \circ \cdots \circ h_1)(1)$ then the numerator and denominator of $h(u/v)$ form a pair belonging to $\Theta_{n+1}$ for every $h \in H$, that for every $(u,v) \in \Theta_{n+1}$ there exist a unique $(p,q) \in \Theta_n$ and a unique $h \in H$ such that $u/v = h(p/q)$, and that $C(u,v) = c(h) + C(p,q)$. The first assertion is straightforward. If $(u,v) \in \Theta_n$ and $h(z) = 1/(1+2^k z)$, then $h(u/v) = v/(v+2^k u)$ in least terms, so the numerator and the denominator are odd and coprime. It is clear that the pair $(v, v+2^k u)$ is mapped to $(u,v)$ by one step of the algorithm so that $(v, v+2^k u) \in \Theta_{n+1}$ as claimed. Similarly, if $h(z) = z/(z+2^k)$ then $h(u/v) = u/(u+2^k v)$ in least terms with odd numerator and denominator and we may easily check that $(u, u+2^k v) \in \Theta_{n+1}$.
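The first assertion of the induction step lends itself to a quick numerical check. The following Python sketch is an illustration only: the function names `binary_step`, `children` and `check_children` are hypothetical, and the step rule is our reading of the algorithm as described in the text (subtract, divide out the exact power of two, then order the pair so that the smaller entry comes first). It applies the two maps $h(z) = 1/(1+2^k z)$ and $h(z) = z/(z+2^k)$ to $u/v$ in exact rational arithmetic and confirms that the resulting pairs are sent back to $(u,v)$ by one step.

```python
from fractions import Fraction


def binary_step(u, v):
    """One step on an odd coprime pair u <= v with (u, v) != (1, 1).

    Assumed step rule: subtract u from v, divide by the exact power of two,
    and return the new pair with the smaller entry first.
    """
    d = v - u
    k = (d & -d).bit_length() - 1   # exponent of the largest power of 2 dividing d
    w = d >> k                      # odd part of v - u
    return (u, w) if w >= u else (w, u)


def children(u, v, k):
    """Pairs read off from h(u/v) for h(z) = 1/(1 + 2^k z) and h(z) = z/(z + 2^k)."""
    z = Fraction(u, v)
    a = 1 / (1 + 2**k * z)          # Fraction reduces to least terms automatically
    b = z / (z + 2**k)
    return (a.numerator, a.denominator), (b.numerator, b.denominator)


def check_children(u, v, max_k=6):
    """Check, for k = 1..max_k, the two formulas and the one-step property."""
    for k in range(1, max_k + 1):
        first, second = children(u, v, k)
        assert first == (v, v + 2**k * u)
        assert second == (u, u + 2**k * v)
        assert binary_step(*first) == (u, v)
        assert binary_step(*second) == (u, v)
    return True
```

For example, `check_children(3, 7)` runs the check for the pair $(3,7)$; note that for $(u,v) = (1,1)$ the two maps produce the same pair $(1, 1+2^k)$, which is consistent with the restriction $h_1 \in H_D$ in the claim.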
Let us prove the second assertion. If $(u,v) \in \Theta_{n+1}$ where $v - u$ is divisible by 2 exactly $k$ times and $2^{-k}(v-u) \ge u$, then a single iteration of the binary algorithm takes $(u,v)$ to $(u, 2^{-k}(v-u))$ and this operation contributes a cost of $c(2,k)$. The pair $(p,q) := (u, 2^{-k}(v-u))$ is clearly also a pair of coprime odd natural numbers with the second term being greater than or equal to the first and hence belongs to $\Theta_n$. Furthermore we may write say, where $h \in H$ is given by $h(z) = z/(z+2^k)$ and thus $C(u,v) = c(2,k) + C(p,q) = c(h) + C(p,q)$ as desired. If on the other hand $v - u$ is divisible by 2 exactly $k$ times and $u > 2^{-k}(v-u)$, then in a similar fashion a single step of the binary algorithm takes $(u,v)$ to $(p,q) := (2^{-k}(v-u), u) \in \Theta_n$ contributing a cost of $c(1,k)$, and we may write where $h \in H$ is given by $h(z) := 1/(1+2^k z)$ so that $C(u,v) = c(1,k) + C(p,q) = c(h) + C(p,q)$ as required.
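The decomposition used in this argument can be illustrated concretely. The Python sketch below is not taken from the paper: the names `decompose` and `recompose` are hypothetical, and labelling a step by the index pair $(j,k)$ of its cost $c(j,k)$ is a convention chosen here for illustration. It records the sequence of steps taken from $(u,v)$ down to $(1,1)$ and then rebuilds $u/v$ by composing the corresponding maps starting from $1$, last step first.

```python
from fractions import Fraction
from math import gcd


def decompose(u, v):
    """Steps of the binary algorithm from (u, v) to (1, 1), first step first.

    Each step is recorded as (j, k): j = 2 when no exchange occurs (cost c(2, k)),
    j = 1 when the pair is exchanged (cost c(1, k)). Requires u <= v odd, coprime.
    """
    if u > v or u % 2 == 0 or v % 2 == 0 or gcd(u, v) != 1:
        raise ValueError("expected coprime odd natural numbers with u <= v")
    steps = []
    while (u, v) != (1, 1):
        d = v - u
        k = (d & -d).bit_length() - 1   # v - u is divisible by 2 exactly k times
        w = d >> k
        if w >= u:
            steps.append((2, k))
            u, v = u, w
        else:
            steps.append((1, k))
            u, v = w, u
    return steps


def recompose(steps):
    """Rebuild u/v by applying the maps to 1, starting with the last step taken."""
    z = Fraction(1)
    for j, k in reversed(steps):
        z = 1 / (1 + 2**k * z) if j == 1 else z / (z + 2**k)
    return z
```

For instance, `decompose(3, 7)` gives the two steps `(1, 2)` and `(2, 1)`, and `recompose` applied to that list returns `Fraction(3, 7)`. The length of the list is the number of steps, so with the constant cost $c \equiv 1$ the sum $\sum_i c(h_i)$ is simply `len(decompose(u, v))`.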
The following proposition, alluded to in §3, relates the cost functions to be studied in Theorem 2 to the operators considered in Theorem 3. We state this result in a somewhat more general form than is strictly required for the purposes of this article, in case the full statement is found useful in future investigations into the asymptotic distribution of costs.

Proposition 9.2. Let $\mathcal{V} \subset \mathbb{C}^2$, $\lambda \colon \mathcal{V} \to \mathbb{C}$ and $P_{(\cdot,\cdot)} \colon \mathcal{V} \to B(H^2(D))$ be as in Theorem 3. Then there exists a holomorphic function $R \colon \mathcal{V} \to \mathbb{C}$ such that for all $(s,\omega) \in \mathcal{V} \cap \mathcal{W}$
\[ \sum_{(u,v) \in \Xi^{(1)}} \frac{\exp(\omega C(u,v))}{v^{2s}} = \frac{(D_{s,\omega}P_{s,\omega}\mathbf{1})(1)}{1 - \lambda(s,\omega)} + R(s,\omega). \tag{47} \]
The following result comprises those parts of Proposition 9.2 which will be used in the present paper.
p : U → C such that for all $s \in U$ with $\Re(s) > 1$ where each $R^{(i)}_p$ has a pole at 1 of order not greater than $p$ and is otherwise holomorphic in $U$.
The following entirely number-theoretic lemma will also be useful in this and the following sections.

Proof. For each s we may write (u,v)∈Ξ (2) 1

Finally, we apply the results of this section to obtain a fourth formula for the derivative $\lambda_s(1,0)$ using an argument similar to one employed by B. Vallée [42, Prop. 6].

\[ = -\frac{\pi^2\,\xi(1)}{16\,\lambda_s(1,0)}. \]
Identifying the rightmost term of each line proves the corollary.

Proof of Theorem 2
The existence and properties of $\xi$ mentioned in the statement of Theorem 2 were of course proved in Theorem 3, so in this section we have only to establish the general asymptotic formula for $C(u,v)$ and apply it to the specific cost measurements $E(u,v)$, $S(u,v)$ and $T(u,v)$. We require the following Tauberian theorem due to H. Delange ([8, Th. III]; see also [30, pp. 121-122]), which has been found useful in related works on Euclidean algorithms [7,25,43] as well as in other investigations of asymptotic phenomena via transfer operators [34,35].
Theorem 7 (Delange). Let $\alpha \in \mathbb{R}$ and $k \in \mathbb{N}$, and let $(a_n)$ be a sequence of non-negative real numbers such that the Dirichlet series $\sum_{n=1}^{\infty} n^{-s}a_n$ converges absolutely for all $s \in \mathbb{C}$ such that $\Re(s) > \alpha$. Suppose that $f$ and $g$ are holomorphic functions defined on an open subset of $\mathbb{C}$ which includes the half-plane $\{s \in \mathbb{C} : \Re(s) \ge \alpha\}$ such that $g(\alpha) \ne 0$ and such that when $\Re(s) > \alpha$,
\[ \sum_{n=1}^{\infty} \frac{a_n}{n^s} = \frac{g(s)}{(s-\alpha)^k} + f(s). \]