Orthogonal decomposition of composition operators on the $H^2$ space of Dirichlet series

Let $\mathscr{H}^2$ denote the Hilbert space of Dirichlet series with square-summable coefficients. We study composition operators $\mathscr{C}_\varphi$ on $\mathscr{H}^2$ which are generated by symbols of the form $\varphi(s) = c_0s + \sum_{n\geq1} c_n n^{-s}$, in the case that $c_0 \geq 1$. If only a subset $\mathbb{P}$ of prime numbers features in the Dirichlet series of $\varphi$, then the operator $\mathscr{C}_\varphi$ admits an associated orthogonal decomposition. Under sparseness assumptions on $\mathbb{P}$ we use this to asymptotically estimate the approximation numbers of $\mathscr{C}_\varphi$. Furthermore, in the case that $\varphi$ is supported on a single prime number, we affirmatively settle the problem of describing the compactness of $\mathscr{C}_\varphi$ in terms of the ordinary Nevanlinna counting function. We give detailed applications of our results to affine symbols and to angle maps.

Let $T$ be a bounded operator on a Hilbert space. The $n$th approximation number $a_n(T)$ is the distance in the operator norm from $T$ to the operators of rank $< n$. Studying the decay of approximation numbers is relevant for compact operators $T$. Indeed, $T$ is compact if and only if $a_n(T) \to 0$ as $n \to \infty$.
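The definition above can be written out as a formula. This is only a sketch for orientation; the symbol $R$ for the approximating operator is our notation and does not appear in the text.

```latex
\[
  a_n(T) = \inf\bigl\{ \|T - R\| : \operatorname{rank} R < n \bigr\},
  \qquad n = 1, 2, 3, \ldots
\]
% The only operator of rank < 1 is R = 0, so a_1(T) = \|T\|.
% Enlarging n enlarges the set of admissible R, so
% a_1(T) \ge a_2(T) \ge a_3(T) \ge \cdots \ge 0.
```

In particular, the sequence $(a_n(T))_{n\geq1}$ is non-increasing and starts at the operator norm, so decay estimates only carry information for large $n$.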
Since the proof of Theorem 1.1 is fairly short, we will present it in our preliminary section. Note that the asymptotic estimate $p_n \sim n \log n$ as $n \to \infty$, where $p_n$ denotes the $n$th prime number, is a direct corollary of the prime number theorem.
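To get a feeling for how slowly $p_n/(n\log n)$ approaches $1$, the ratio can be tabulated numerically. The following is a minimal sketch; the helper names `first_primes` and `pnt_ratio` are ours, and the sieve bound relies on the classical estimate $p_n < n(\log n + \log\log n)$ for $n \geq 6$.

```python
import math


def first_primes(count):
    """Return the first `count` primes using a sieve of Eratosthenes."""
    if count < 1:
        return []
    # Sieve limit: p_n < n (log n + log log n) for n >= 6; small cases by hand.
    if count < 6:
        limit = 15
    else:
        limit = int(count * (math.log(count) + math.log(math.log(count)))) + 1
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            # Cross out the multiples of i starting from i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    primes = [i for i in range(limit + 1) if sieve[i]]
    return primes[:count]


def pnt_ratio(n):
    """Return p_n / (n log n) for n >= 2."""
    p_n = first_primes(n)[-1]
    return p_n / (n * math.log(n))


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, round(pnt_ratio(n), 4))
```

The ratio stays noticeably above $1$ in this range, which is consistent with the asymptotic statement but shows that the convergence is slow.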
To give an example, suppose that $\varphi(s) = c_0 s + c_1$ for some $c_0 \geq 1$ and $c_1 \in \mathbb{C}_0$. Then $\vartheta = \operatorname{Re} c_1$, and if $e_n(s) = n^{-s}$ for $n = 1, 2, 3, \ldots$ denotes the standard basis of $\mathscr{H}^2$, then $\mathscr{C}_\varphi e_n = n^{-c_1} e_{n^{c_0}}$.
Hence $a_n(\mathscr{C}_\varphi) = n^{-\operatorname{Re} c_1} = n^{-\vartheta}$ in this case, coinciding with the upper bound of (1.2). Note that for all other symbols, where $\varphi_0(s) \not\equiv c_1$, the maximum principle implies that $\vartheta < \operatorname{Re} c_1$.
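The computation behind this example can be spelled out as follows. This is a sketch under the assumption, implicit in the index $n^{c_0}$, that $c_0$ is a positive integer.

```latex
\begin{align*}
  (\mathscr{C}_\varphi e_n)(s)
    &= e_n(\varphi(s)) = n^{-(c_0 s + c_1)}
     = n^{-c_1}\,(n^{c_0})^{-s} = n^{-c_1}\, e_{n^{c_0}}(s), \\
  \|\mathscr{C}_\varphi e_n\|_{\mathscr{H}^2}
    &= |n^{-c_1}| = n^{-\operatorname{Re} c_1}.
\end{align*}
% The map n -> n^{c_0} is injective on the positive integers, so the
% vectors C_phi e_n are mutually orthogonal.
```

Since $\mathscr{C}_\varphi$ sends the orthonormal basis $(e_n)_{n\geq1}$ to mutually orthogonal vectors whose norms $n^{-\operatorname{Re} c_1}$ are non-increasing in $n$, these norms are the approximation numbers of $\mathscr{C}_\varphi$.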
One of the main goals of the present paper is to improve on the estimates (1.2) for certain symbols $\varphi$. Specifically, we shall place restrictions on the prime numbers appearing in the Dirichlet series $\varphi_0$. Let $\mathbb{P}$ denote a set of prime numbers and set $M(\mathbb{P}) = \{n \in \mathbb{N} : p \mid n \implies p \in \mathbb{P}\}$. We say that a Dirichlet series $f$ is supported on $\mathbb{P}$ if $f(s) = \sum_{n \in M(\mathbb{P})} b_n n^{-s}$.
A set of prime numbers $\mathbb{P}$ is called sparse if $\sum_{p \in \mathbb{P}} p^{-1} < \infty$. Our first main result is the following improvement of the lower bound in Theorem 1.1.
Our proof of Theorem 1.2 relies on an orthogonal decomposition of $\mathscr{C}_\varphi$ that is made available by the assumptions that $c_0 \geq 1$ and that $\varphi_0$ is supported on $\mathbb{P}$; see Lemma 3.1. Let $\mathbb{P}^\perp$ denote the set of prime numbers not in $\mathbb{P}$. To apply the orthogonal decomposition effectively, we require that $\mathbb{P}$ is sparse, so that the set $M(\mathbb{P}^\perp)$ has positive density in $\mathbb{N}$.
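The sets $M(\mathbb{P})$ and $M(\mathbb{P}^\perp)$ are easy to enumerate when $\mathbb{P}$ is finite (and hence trivially sparse). The sketch below uses hypothetical helper names of our own and the illustrative choice $\mathbb{P} = \{2, 3\}$; it lists both sets up to a cutoff and computes the proportion of integers lying in $M(\mathbb{P}^\perp)$.

```python
def is_supported_on(n, primes):
    """True if every prime factor of n lies in `primes`, i.e. n is in M(P)."""
    for p in primes:
        while n % p == 0:
            n //= p
    return n == 1


def complement_integers(primes, limit):
    """Integers 1 <= j <= limit with no prime factor in `primes`.

    For a finite set P this is the part of M(P^perp) below `limit`.
    """
    return [j for j in range(1, limit + 1) if all(j % p != 0 for p in primes)]


if __name__ == "__main__":
    P = (2, 3)  # illustrative finite set of primes
    limit = 600
    in_M_P = [n for n in range(1, 21) if is_supported_on(n, P)]
    j = complement_integers(P, limit)
    print("M(P) up to 20:", in_M_P)
    print("M(P^perp) up to 20:", [x for x in j if x <= 20])
    print("proportion of M(P^perp) up to", limit, ":", len(j) / limit)
    # The n-th element j_n grows at most linearly in n.
    print("max of j_n / n:", max(x / (i + 1) for i, x in enumerate(j)))
```

For this choice of $\mathbb{P}$ the elements of $M(\mathbb{P}^\perp)$ are the integers coprime to $6$, which make up one third of all integers, and the $n$th such integer $j_n$ satisfies $j_n \leq 3n$ in the computed range. Linear growth of this kind is what positive density provides.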
We also have a more refined result. We say that a set of prime numbers $\mathbb{P}$ is $\nu$-sparse for some $0 < \nu \leq 1$ if $\sum_{p \in \mathbb{P}} p^{-\nu} < \infty$. In particular, a set of prime numbers is $1$-sparse if and only if it is sparse.
Theorem 1.3. Consider a symbol $\varphi \in \mathscr{G}_{\geq 1}$ and suppose that $\varphi_0$ is supported on $\mathbb{P}$.
(a) If $\mathbb{P}$ is sparse, then there is a constant $C_1 = C_1(\varphi_0)$ such that $a_n(\mathscr{C}_\varphi) \geq C_1 \|\mathscr{C}_\varphi e_n\|_{\mathscr{H}^2}$.
To exemplify the type of estimates which can be obtained from Theorem 1.3, let $\mathbb{P}$ be a set of prime numbers and consider the affine symbol
\[ \varphi(s) = c_0 s + c_1 + \sum_{p \in \mathbb{P}} c_p p^{-s}. \tag{1.3} \]
The approximation numbers of composition operators generated by affine symbols $\varphi \in \mathscr{G}_0$ have been investigated by Queffélec
Note that the case $\vartheta = 0$ is omitted from Corollary 1.4. In this case the estimate from Theorem 1.3 (a) fails to be sharp, since it follows from [4, Thm. 1] that $\mathscr{C}_\varphi$ is not compact, and thus that $a_n(\mathscr{C}_\varphi) \asymp 1$ for $n \geq 1$. In Theorem 4.2 we shall also consider some examples of affine symbols supported on infinite but very sparse sets of prime numbers.
In the second part of the paper, we will investigate when the composition operator $\mathscr{C}_\varphi$ is compact on $\mathscr{H}^2$. Suppose that $\varphi \in \mathscr{G}_{\geq 1}$, and consider the Nevanlinna counting function
\[ N_\varphi(w) = \sum_{s \in \varphi^{-1}(\{w\})} \operatorname{Re} s, \tag{1.4} \]
defined for every $w \in \mathbb{C}_0$. Bayart [3, Prop. 3] employed the classical Littlewood inequality for the Nevanlinna counting function in the unit disc to establish the Littlewood-type estimate
\[ N_\varphi(w) \leq \frac{\operatorname{Re} w}{c_0}. \tag{1.5} \]
On account of J. Shapiro's characterization of the compact composition operators on the Hardy space of the unit disc [17], and the Littlewood-type estimate (1.5), it seems plausible that the compactness of $\mathscr{C}_\varphi$ on $\mathscr{H}^2$ is related to the requirement that
\[ \lim_{\operatorname{Re} w \to 0^+} \frac{N_\varphi(w)}{\operatorname{Re} w} = 0. \tag{1.6} \]
Bayart [3, Thm. 2] proved that if $\operatorname{Im} \varphi_0$ is bounded and (1.6) holds, then $\mathscr{C}_\varphi$ is compact on $\mathscr{H}^2$. Conversely, Bailleul [1, Thm. 6] established that if $\varphi_0$ is supported on a finite set of prime numbers, $\varphi$ is finitely valent, and $\mathscr{C}_\varphi$ is compact on $\mathscr{H}^2$, then (1.6) holds. We give a complete description in the case that $\varphi_0$ is supported on a single prime.
To prove this theorem, we will exploit the fact that such functions $\varphi_0$ are periodic (with period $2\pi i/\log p$), in addition to the orthogonal decomposition discussed earlier. Accordingly, we will decompose the Nevanlinna counting function (1.4) into an infinite number of restricted counting functions. To handle these restricted counting functions we will rely on some ideas and techniques from our recent paper [7], where the compactness of $\mathscr{C}_\varphi$ was characterized in the case that $\varphi \in \mathscr{G}_0$. Each restricted counting function comes with a change of variable formula, also known as a Stanton formula, that allows us to express $\|\mathscr{C}_\varphi f\|_{\mathscr{H}^2}$ for Dirichlet series $f$ of a certain form; see Lemma 6.2.
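The periodicity used here comes from the identity $p^{-(s + 2\pi i/\log p)} = p^{-s} e^{-2\pi i} = p^{-s}$, which carries over to every power $p^{-ks}$. The following sketch checks this numerically for a truncated series in $2^{-s}$. The function names and the coefficients are purely illustrative assumptions of ours, and no claim is made that this particular polynomial defines a symbol in $\mathscr{G}_{\geq1}$.

```python
import cmath
import math


def dirichlet_monomial(p, k, s):
    """Evaluate (p**k)**(-s) = exp(-k * s * log p) for complex s."""
    return cmath.exp(-k * s * math.log(p))


def evaluate_single_prime_series(coeffs, p, s):
    """Evaluate the finite sum  sum_k coeffs[k] * p**(-k*s).

    `coeffs` is a hypothetical truncation of a Dirichlet series
    supported on the single prime p.
    """
    return sum(c * dirichlet_monomial(p, k, s) for k, c in enumerate(coeffs))


if __name__ == "__main__":
    p = 2
    coeffs = [0.5, 0.25 + 0.1j, -0.05]  # illustrative values only
    period = 2j * math.pi / math.log(p)
    s = 0.3 + 1.7j
    a = evaluate_single_prime_series(coeffs, p, s)
    b = evaluate_single_prime_series(coeffs, p, s + period)
    print(abs(a - b))  # agreement up to rounding error
```

Because of this periodicity, the behaviour of $\varphi_0$ in the whole half-plane is determined by its values in a single horizontal half-strip of height $2\pi/\log p$, which is what makes a decomposition of the counting function into strip-wise pieces natural.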
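To conclude the paper we will provide a detailed study of angle maps. For $c_0 \geq 1$, $\vartheta \geq 0$ and $0 < \alpha < 1$, consider the symbol $\varphi_{\alpha,\vartheta}(s) = c_0 s + \vartheta + \Phi_\alpha(p^{-s})$, where $\Phi_\alpha$ is a univalent map from the unit disc onto an angle (see Section 8). If $\vartheta > 0$, then Theorem 1.3 immediately implies that $a_n(\mathscr{C}_{\varphi_{\alpha,\vartheta}}) \asymp n^{-\vartheta} (\log n)^{-\frac{1}{2\alpha}}$ for $n \geq 2$; see Corollary 8.1. Similarly to the case of affine maps discussed above, Theorem 1.3 (a) does not provide the correct lower bound when $\vartheta = 0$. In this case we shall instead proceed via the change of variable formula of Lemma 6.2 and detailed analysis of the restricted counting function.
Theorem 1.6. For a positive integer $c_0$ and a real number $0 < \alpha < 1$, let $\varphi_\alpha(s) = c_0 s + \Phi_\alpha(p^{-s})$. Then $\varphi_\alpha$ is in $\mathscr{G}_{\geq 1}$ and $a_n(\mathscr{C}_{\varphi_\alpha}) \asymp (\log n)^{\frac{\alpha - 1}{2\alpha}}$ for $n \geq 2$.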
In the classical setting of $H^2(\mathbb{D})$, detailed studies of the approximation numbers of composition operators generated by symbols that map into an angle are carried out in [12] and [16]. Via the transference principle of [15, Sec. 9], these results also yield estimates for the approximation numbers of composition operators $\mathscr{C}_{\psi_\alpha} \colon \mathscr{H}^2 \to \mathscr{H}^2$ generated by angle maps $\psi_\alpha(s) = 1/2 + \Phi_\alpha(p^{-s})$.
Organization. In the preliminary Section 2 we give the proof of Theorem 1.1, and discuss the notion of vertical limit functions. In Section 3 we analyze the orthogonal decomposition of C ϕ and prove Theorem 1.2 and Theorem 1.3. In Section 4 we apply Theorem 1.3 to affine symbols, and in Section 5 to membership in the Schatten classes. In Section 6 we introduce and study restricted counting functions and their associated Stanton formulas. In Section 7 we provide the proof of Theorem 1.5. In Section 8 we study the example of angle maps.
Notation. We will sometimes use the notation f (x) ≪ g(x) to indicate that there is a constant C such that f (x) ≤ Cg(x) for all relevant x. The notation ≫ indicates the reverse estimate, and f (x) ≍ g(x) means that f (x) ≪ g(x) and g(x) ≪ f (x).

Preliminaries
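We will have use for two additional characterizations of the approximation numbers of a bounded operator $T$ on a Hilbert space $H$,
\[ a_n(T) = \sup_{\dim E = n} \ \inf_{\substack{f \in E \\ \|f\| = 1}} \|Tf\|, \tag{2.1} \]
\[ a_n(T) = \inf_{\dim E = n-1} \ \sup_{\substack{f \in E^\perp \\ \|f\| = 1}} \|Tf\|. \tag{2.2} \]
See for example [9, Sec. II.7]. Recall also that approximation numbers satisfy the ideal property
\[ a_n(S_1 T S_2) \leq \|S_1\| \, a_n(T) \, \|S_2\| \tag{2.3} \]
for bounded operators $S_1$, $T$, and $S_2$ on a Hilbert space $H$.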
The following demonstration of Theorem 1.1, adapted from [5], illustrates the use of (2.1) and (2.3). In the proof, we also make use of the following result from [10, p. 329].
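Proof of Theorem 1.1. We begin with the upper bound in (1.2). Set $\psi(s) = s + \vartheta$. Note, by the definition (1.1) of $\vartheta$, that $\varphi - \vartheta$ is in $\mathscr{G}_{\geq 1}$. Since $\mathscr{C}_{\varphi - \vartheta} \mathscr{C}_\psi = \mathscr{C}_\varphi$, the ideal property (2.3) with $S_1 = \mathscr{C}_{\varphi - \vartheta}$, $T = \mathscr{C}_\psi$, and $S_2 = I$ therefore yields $a_n(\mathscr{C}_\varphi) \leq \|\mathscr{C}_{\varphi - \vartheta}\| \, a_n(\mathscr{C}_\psi) = n^{-\vartheta}$, where the final equality follows from Lemma 2.1 and the trivial analysis of $\mathscr{C}_\psi$ presented in the introduction.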
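The factorization used in this step can be verified directly. The sketch below only uses the convention $\mathscr{C}_\varphi f = f \circ \varphi$ together with two facts quoted in the text: $\|\mathscr{C}_{\varphi-\vartheta}\| = 1$ (Lemma 2.1) and $a_n(\mathscr{C}_\psi) = n^{-\vartheta}$ (the introductory example with $c_0 = 1$ and $c_1 = \vartheta$).

```latex
\begin{align*}
  (\mathscr{C}_{\varphi-\vartheta}\,\mathscr{C}_\psi f)(s)
    &= (f \circ \psi)\bigl(\varphi(s) - \vartheta\bigr)
     = f\bigl(\varphi(s) - \vartheta + \vartheta\bigr)
     = (\mathscr{C}_\varphi f)(s), \\
  a_n(\mathscr{C}_\varphi)
    &= a_n(\mathscr{C}_{\varphi-\vartheta}\,\mathscr{C}_\psi)
     \leq \|\mathscr{C}_{\varphi-\vartheta}\| \, a_n(\mathscr{C}_\psi)
     = 1 \cdot n^{-\vartheta}.
\end{align*}
```

The point of the factorization is that the whole decay is carried by the translation $\psi$, while the remaining factor is a contraction.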
For the lower bound in (1.2), we choose $E = \operatorname{span}(\{e_2, e_3, \ldots, e_{p_n}\})$ as the $n$-dimensional subspace of $\mathscr{H}^2$ in (2.1). To estimate the infimum of $\|\mathscr{C}_\varphi f\|_{\mathscr{H}^2}$, for $f(s) = \sum_{j=1}^{n} b_j p_j^{-s}$ of unit norm, we consider the auxiliary subspace $F = \operatorname{span}(\{e_{2^{c_0}}, e_{3^{c_0}}, \ldots, e_{p_n^{c_0}}\})$ and deduce from the fundamental theorem of arithmetic, orthogonality, and the Cauchy-Schwarz inequality that
Taking the infimum on the right-hand side, over all $f \in E$ of unit norm, we obtain the stated lower bound $a_n(\mathscr{C}_\varphi) \geq p_n^{-\operatorname{Re} c_1}$.
We will now briefly recall a few facts about vertical limit functions and generalized boundary values. Let T ∞ denote the countable infinite Cartesian product of the unit circle T in the complex plane, endowed with its Haar measure µ ∞ . Via prime factorization, we may view any χ ∈ T ∞ as a character, For a Dirichlet series f (s) = n≥1 b n n −s and a character χ ∈ T ∞ , consider the vertical limit function If f converges uniformly in C θ for some θ ∈ R, then {f χ } χ∈T ∞ consists precisely of the functions which can be obtained as uniform limits in C θ of vertical translates f (· + iτ k ), where (τ k ) k≥1 is a sequence of real numbers. Despite the fact that a function f ∈ H 2 need only converge in C 1/2 , the Dirichlet series f χ actually converges in C 0 for almost every χ ∈ T ∞ (see e.g. [11,Thm. 4.1]). Moreover, the generalized boundary value f * (χ) = lim σ→0 + f χ (σ) exists for almost every χ ∈ T ∞ , and The following result can be extracted from [6,Sec. 2].

Orthogonal decomposition and approximation numbers
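We now fix a subset $\mathbb{P}$ of the full set of prime numbers. For each $j \in M(\mathbb{P}^\perp)$, we let $\mathscr{H}^2_j$ denote the subspace of $\mathscr{H}^2$ comprised of Dirichlet series of the form $e_j f$, where $f$ is supported on $\mathbb{P}$. Since $\mathscr{H}^2_{j_1} \perp \mathscr{H}^2_{j_2}$ if $j_1 \neq j_2$, we have the orthogonal decomposition
\[ \mathscr{H}^2 = \bigoplus_{j \in M(\mathbb{P}^\perp)} \mathscr{H}^2_j. \tag{3.1} \]
The following simple observation is the starting point of the present paper.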
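Lemma 3.1. Let $\varphi \in \mathscr{G}_{\geq 1}$ and suppose that $\varphi_0$ is supported on $\mathbb{P}$. For every $j \in M(\mathbb{P}^\perp)$, let $\mathscr{C}_{\varphi,j}$ denote the operator obtained by restricting $\mathscr{C}_\varphi$ to $\mathscr{H}^2_j$. Then
\[ \mathscr{C}_\varphi = \bigoplus_{j \in M(\mathbb{P}^\perp)} \mathscr{C}_{\varphi,j}. \]
Proof. In view of (3.1), it is sufficient to prove that $\mathscr{C}_\varphi$ maps $\mathscr{H}^2_j$ to $\mathscr{H}^2_{j^{c_0}}$, since the map $j \mapsto j^{c_0}$ is injective on $M(\mathbb{P}^\perp)$. Consider the action of $\mathscr{C}_\varphi$ on $e_n$, where $n = jk$ for $j \in M(\mathbb{P}^\perp)$ and $k \in M(\mathbb{P})$:
\[ \mathscr{C}_\varphi e_n(s) = n^{-c_0 s} \, n^{-\varphi_0(s)} = e_{j^{c_0}}(s) \, k^{-c_0 s} \, n^{-\varphi_0(s)}. \]
We see that $\mathscr{C}_\varphi e_n \in \mathscr{H}^2_{j^{c_0}}$, as a consequence of the assumption that $\varphi_0$ is supported on $\mathbb{P}$.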
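The last step of the proof rests on the fact that $n^{-\varphi_0(s)}$ is again supported on $\mathbb{P}$. One way to see this, at least formally, is through the exponential series. This is our own sketch of the reasoning and not a formula taken from the text.

```latex
\[
  n^{-\varphi_0(s)} = \exp\bigl(-(\log n)\,\varphi_0(s)\bigr)
  = \sum_{m=0}^{\infty} \frac{(-\log n)^m}{m!}\,\varphi_0(s)^m .
\]
% M(P) is closed under multiplication, so a product of Dirichlet series
% supported on P is supported on P. Hence every power \varphi_0(s)^m, and
% therefore the whole sum, only involves integers from M(P).
% The factor k^{-c_0 s} = (k^{c_0})^{-s} has k^{c_0} in M(P) as well.
```

Consequently $\mathscr{C}_\varphi e_{jk}$ is $e_{j^{c_0}}$ times a Dirichlet series supported on $\mathbb{P}$, and $j^{c_0} \in M(\mathbb{P}^\perp)$ whenever $j \in M(\mathbb{P}^\perp)$.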
In view of Lemma 3.1 there is for every $n \geq 1$ some $m \geq 1$ and $j \in M(\mathbb{P}^\perp)$ such that $a_n(\mathscr{C}_\varphi) = a_m(\mathscr{C}_{\varphi,j})$. We first apply this to obtain a lower bound for the approximation numbers of $\mathscr{C}_\varphi$ which will immediately imply our first main result.

Lemma 3.2.
Let ϕ ∈ G ≥1 and suppose that ϕ 0 is supported on a sparse set of prime numbers P. There is then a positive integer m = m(P) such that Proof. By definition, any f j ∈ H 2 j can be written f j = e j f for a function f supported on P, and f j H 2 = f H 2 . By Lemma 2.2 (ii) and the composition rule (2.6), we have that ) for almost every χ ∈ T ∞ . This formula is at first valid for polynomials f , but by a density argument it continues to hold for general f supported on P, if we interpret f χ c 0 (ϕ * 0 (χ)) as a generalized boundary value when needed. By (2.4) we therefore have that In particular, j → C ϕ,j = a 1 (C ϕ,j ) is decreasing for j ∈ M (P ⊥ ). Letting (j n ) n≥1 denote the increasing sequence of integers in M (P ⊥ ), we conclude that since e jn ∈ H 2 jn and e jn H 2 = 1. The hypothesis that P is sparse means that and thus that there is a positive integer m such that j n ≤ mn for every n ≥ 1. Therefore Proof of Theorem 1.2. Since ϕ 0 is supported on a sparse set of prime numbers, Lemma 3.2 yields that a n (C ϕ ) ≥ C ϕ e mn H 2 (2.5), and accordingly This gives the stated estimate with We now turn toward proving Theorem 1.3 (a).
for every integer n ≥ 1.
Proof. As before, we compute the norms on T ∞ , so that For any ε > 0, consider the set it follows, by interpreting each side of the inequality as an average, that By the definition of X ε , we find that Extending the final integral from X ε to T ∞ , we conclude that

Proof of Theorem 1.3 (a). Combining Lemma 3.2 and Lemma 3.3 yields that
where m is as in Lemma 3.2 and C is from Lemma 3.3.
The remainder of this section is devoted to the proof of Theorem 1.3 (b). For notational reasons, we introduce the partial zeta function
Lemma 3.4. Suppose that $\mathbb{P}$ is a $\nu$-sparse set of prime numbers for some $0 < \nu \leq 1$. Then
for every $K \in M(\mathbb{P})$ and every $2\sigma \geq \nu$.
Proof. We estimate where (k m ) m≥1 are the integers of M (P) in increasing order and j ∈ M (P ⊥ ).
Proof. We apply the min-max principle This gives us that Accordingly, suppose that f ∈ E ⊥ with f H 2 = 1. If Re s ≥ ϑ, the Cauchy-Schwarz inequality and Lemma 3.4 imply that f (s) converges absolutely, and that Of course the same estimate also holds if f is replaced by f χ c 0 for any χ ∈ T ∞ . Since s = Re ϕ * 0 (χ) ≥ ϑ for almost every χ, we may therefore apply this estimate in conjunction with Lemma 2.2 (ii), (2.4) and (2.6) to see that Together with the min-max principle, this gives the claimed estimate.
is strictly decreasing, onto (by the assumption ϑ > 0), continuous and enjoys the estimate Φ(xy) ≤ y −ϑ Φ(x) for every x, y ≥ 1. Hence Φ has an inverse function Φ −1 : (0, 1] → [1, ∞) satisfying the same properties and enjoying the estimate for every 0 < x, y ≤ 1. Fix some 0 < x ≤ 1. The orthogonal decomposition of Lemma 3.1 allows us to rewrite We now apply Lemma 3.5 to bound the right-hand side from above. Note that the hypotheses of Lemma 3.5 certainly hold, since we are working under the stronger assumptions that 0 < ν < 1 and 2ϑ ≥ ν/(1 − ν). We obtain that Counting for each m the number of positive integers j (not only those in M (P ⊥ )) which satisfy the inequality where the second inequality comes from (3.2) applied with holds for every 0 < x ≤ 1.
Since ϑ > 0, there is a smallest positive integer N such that N 2ϑ ≥ ζ P (ν). By the upper bound in Theorem 1.1 it follows that a n (C ϕ ) ≤ ζ P (ν) for every n ≥ N .
for every n ≥ N . Following the proof of Lemma 3.3 verbatim with ε = ϑ yields that for every n ≥ N , which completes the proof.

Composition operators generated by affine symbols
To exemplify Theorem 1.3 we consider affine symbols, which we recall from (1.3) to have the form
\[ \varphi(s) = c_0 s + c_1 + \sum_{p \in \mathbb{P}} c_p p^{-s}. \]
At first, we assume that $\varphi$ is supported on a set of $|\mathbb{P}| = d < \infty$ prime numbers. In particular, $c_p \neq 0$ for every $p \in \mathbb{P}$. Note from (2.5) that
Before proving Corollary 1.4, let us quickly recall the known results about $a_n(\mathscr{C}_\varphi)$ in this setting. We begin with the case $c_0 = 0$, in which case we must require that $\vartheta \geq 1/2$ in order for $\mathscr{C}_\varphi$ to be bounded. Queffélec
If $\vartheta > 1/2$, then by [13, Thm. 4.1] we have that
where the implied constant depends on $\operatorname{Re} c_1$ and $\vartheta > 1/2$, but not on $d$. Actually, the estimate is stated and proved only for $d = 1$ in [13]. However, it can be extended to general $d \geq 1$ by applying the max-min principle (2.1) and the subordination principle for affine symbols from [6, Thm. 5].
Suppose instead that c 0 ≥ 1. If ϑ > 0, then the best previously known estimates were from Theorem 1.1. As mentioned in the introduction, if ϑ = 0 for an affine symbol ϕ, then a n (C ϕ ) ≍ 1 for n ≥ 1, and so this case is not of interest.
To prove Corollary 1.4, we require the following version of Hankel's asymptotic estimate for the modified Bessel function of the second kind with parameter 0. It will be convenient for us to have explicit constants; we have made no attempt to optimize these.
Proof of Corollary 1.4. In view of Lemma 2.3 we may without loss of generality replace ϕ by ϕ χ for any χ ∈ T ∞ This allows us to assume that ϕ 0 is of the form where τ ∈ R and γ p > 0 for p ∈ P. By Theorem 1.3 (a) and (b), we need to estimate as n → ∞. Suppose that n is large enough that γ p log n ≥ 1 8 for every p ∈ P. Then, by applying Lemma 4.1 with x = γ p log n, We finish this section by discussing a class of affine symbols with |P| = ∞. For any affine symbol with absolutely convergent coefficients, the image ϕ * 0 (T ∞ ) is an annulus (see e.g. [19,Sec. XI.5]). Hence ϕ * 0 (T ∞ ) touches the line Re w = ϑ tangentially. However, the examples of this section show that the interaction between different prime numbers is essential in determining the behavior of the approximation numbers a n (C ϕ ). When c 0 = 0, symbols with |P| = ∞ have previously been considered in [15,Thm. 1.3].

Schatten classes
For $1 \leq q < \infty$, a linear operator $T$ on a Hilbert space $H$ belongs to the Schatten class $S_q$ if $(a_n(T))_{n \geq 1} \in \ell^q$, in which case its Schatten norm is given by
\[ \|T\|_{S_q} = \Bigl( \sum_{n \geq 1} a_n(T)^q \Bigr)^{1/q}. \]
Let $(x_n)_{n \geq 1}$ be an orthonormal basis of $H$. If $T \in S_q$, then
On the other hand, if $1 \leq q \leq 2$ and $\sum_{n \geq 1} \|T x_n\|^q < \infty$, then $T \in S_q$, and
The necessary and sufficient conditions (5.1) and (5.2) for Schatten membership can be found for example in [9, pp. 94-95].
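Let us now return to composition operators $\mathscr{C}_\varphi$ with symbols $\varphi \in \mathscr{G}_{\geq 1}$. Note from Theorem 1.1 that if $\vartheta > 1/q$ for some $1 \leq q < \infty$, then $\mathscr{C}_\varphi \in S_q$. We can use (5.1) to obtain the following converse.
Proof. As in the proof of Theorem 1.1, we exploit that $\varphi - \vartheta$ also belongs to $\mathscr{G}_{\geq 1}$; by Lemma 2.1 we have that $\|\mathscr{C}^*_{\varphi - \vartheta}\| = \|\mathscr{C}_{\varphi - \vartheta}\| = 1$. Hence the ideal property (2.3) implies that $\|\mathscr{C}^*_{\varphi - \vartheta} \mathscr{C}_\varphi\|_{S_q} \leq \|\mathscr{C}_\varphi\|_{S_q}$. We then apply (5.1) with $T = \mathscr{C}^*_{\varphi - \vartheta} \mathscr{C}_\varphi$ and $x_n = e_n$ as the standard basis of $\mathscr{H}^2$ to conclude that
Observing that
a measure-theoretic argument then shows that $2 \operatorname{Re} \varphi_0^*(\chi) \geq \vartheta + 1/q$ for almost every $\chi \in \mathbb{T}^\infty$. Since $\operatorname{ess\,inf}_{\chi \in \mathbb{T}^\infty} \operatorname{Re} \varphi_0^*(\chi) = \vartheta$ by (2.5), we conclude that $\vartheta \geq 1/q$.
Therefore only the case $\vartheta = 1/q$ is of further interest. In this setting, we have the following corollary of Theorem 1.3.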
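Corollary 5.2. Let $1 \leq q < \infty$. Suppose that $\varphi \in \mathscr{G}_{\geq 1}$ with $\vartheta = 1/q$ and that $\varphi_0$ is supported on a sparse set of prime numbers $\mathbb{P}$.
(a) If $1 \leq q \leq 2$, then $\mathscr{C}_\varphi \in S_q$ if and only if $\bigl(\|\mathscr{C}_\varphi e_n\|_{\mathscr{H}^2}\bigr)_{n \geq 1} \in \ell^q$.
(b) If $2 < q < \infty$ and $\mathbb{P}$ is $2/(2+q)$-sparse, then again we have that $\mathscr{C}_\varphi \in S_q$ if and only if $\bigl(\|\mathscr{C}_\varphi e_n\|_{\mathscr{H}^2}\bigr)_{n \geq 1} \in \ell^q$.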
Proof. If 1 ≤ q ≤ 2, then the statement is a consequence of Theorem 1.3 (a) and the general sufficient condition (5.2). If 2 < q < ∞, then the statement follows directly from Theorem 1.3.
When $q = 2$, Theorem 5.1 has previously been observed by Finet, Queffélec and Volberg [8, p. 279]. We finish this section by pointing out the following curiosity of the Hilbert-Schmidt case. If $\varphi \in \mathscr{G}_{\geq 1}$, then
In other words, $\mathscr{C}_\varphi$ is Hilbert-Schmidt if and only if $\mathscr{C}_{\varphi_0}$ is Hilbert-Schmidt. Note from the examples of Section 4 and Section 8 that the approximation numbers of $\mathscr{C}_{\varphi_0}$ tend to have much more rapid decay than those of $\mathscr{C}_\varphi$, even when $\vartheta = 1/2$. This is compensated for by the fact that if $c_0 \geq 1$, then $a_1(\mathscr{C}_\varphi) = \|\mathscr{C}_\varphi\| = 1$, while it always holds that $\|\mathscr{C}_\varphi\| > 1$ when $c_0 = 0$.

Restricted counting functions
We shall now begin to work towards Theorem 1.5. For notational simplicity we will assume that p = 2 throughout. We therefore consider symbols From (6.1) it is easy to establish the Carlson-type formula valid for all f ∈ H 2 j and j ∈ O. From (6.2) and a direct computation we obtain the following Littlewood-Paley formula. The proof is very similar to those of [ Proof. Swap the order of integration in (6.3) and then apply (6.2) for each σ > 0. The proof is finished by using the formula Define the restricted counting function N ϕ by For technical reasons, we have included Im s = −π/ log 2 in the definition of N ϕ (see (6.10) below). This only affects the value of N ϕ on a set of measure zero. The following version of the Stanton formula follows immediately from Lemma 6.1 and a change of variables.
Our next goal is to obtain a version of the Littlewood-type inequality (1.5) for the restricted counting function. Our main tool will be the classical Littlewood inequality for the Hardy space of the unit disc (see e.g. [18, Sec. 10.4]). Recall that if $\psi$ is an analytic self-map of the unit disc $\mathbb{D}$, then the Nevanlinna counting function is defined as
\[ N^{\mathbb{D}}_\psi(\eta) = \sum_{z \in \psi^{-1}(\{\eta\})} \log \frac{1}{|z|} \]
for $\eta \in \mathbb{D} \setminus \{\psi(0)\}$. The Littlewood-type inequality for $N^{\mathbb{D}}_\psi$ takes the form
\[ N^{\mathbb{D}}_\psi(\eta) \leq \log \left| \frac{1 - \overline{\psi(0)}\,\eta}{\psi(0) - \eta} \right| \tag{6.6} \]
for $\eta \neq \psi(0)$. The proof of the following result is inspired by [7, Lem. 2.4].
Lemma 6.3. Suppose that $\varphi \in \mathscr{G}_{\geq 1}$ and that $\varphi_0$ is supported on $\mathbb{P} = \{2\}$. There is a constant $C = C(\varphi)$ such that
Proof. Let $\Theta \colon \mathbb{D} \to S$ denote the Riemann map of $\mathbb{D}$ onto the half-strip
normalized so that $\Theta(0) = 1$, $\Theta'(0) > 0$, and for $T > 0$, let $\Theta_T = T\Theta$. For any $w \in \mathbb{C}_0$ and any $T > 0$, the function
is an analytic map from $\mathbb{D}$ to $\mathbb{D}$. Fix a number $T \geq 2 \operatorname{Re} w / c_0$. Then $\operatorname{Re} \varphi(\Theta_T(0)) = \operatorname{Re} \varphi(T) \geq 2 \operatorname{Re} w$, since $\operatorname{Re} \varphi_0(s) \geq 0$ for every $s \in \mathbb{C}_0$. Hence it is evident that $\psi(0) \neq 0$. Using (6.6) with $\eta = 0$ we find that
Noting that $\psi(z) = 0$ if and only if $\varphi(\Theta_T(z)) = w$, and substituting $s = \Theta_T(z)$, we rewrite this estimate as
(6.7)
By standard regularity results for conformal maps there is an absolute constant $C > 0$ such that $\operatorname{Re} s \leq C T \log \frac{1}{|\Theta_T^{-1}(s)|}$ whenever $|\operatorname{Im} s| \leq T/2$ and $0 < \operatorname{Re} s \leq T/2$. Since $\operatorname{Re} \varphi_0(s) \geq 0$ for every $s \in \mathbb{C}_0$, we of course have that if $\varphi(s) = w$, then $\operatorname{Re} s \leq \operatorname{Re} w / c_0$. This implies that if $T \geq 2 \max(\operatorname{Re} w / c_0, \pi / \log 2)$, then
Noting that both $\varphi(T)$ and $w$ are in the half-plane $\mathbb{C}_0$, a basic estimate for the pseudo-hyperbolic metric (see e.g. [7, Lem. 2.3]) yields that
Combining (6.7), (6.8), and (6.9), and recalling that $\operatorname{Re} \varphi(T) \geq 2 \operatorname{Re} w$, we conclude that
as long as $T \geq 2 \max(\operatorname{Re} w / c_0, \pi / \log 2)$. For the purposes of the present lemma, where we restrict our attention to $0 < \operatorname{Re} w \leq c_0 \pi / \log 2$, it is sufficient to choose $T = 2\pi / \log 2$.
While Theorem 1.5 is stated in terms of the Nevanlinna counting function $N_\varphi$, we shall prove it via the restricted counting function $\mathcal{N}_\varphi$, which is natural in view of Lemma 6.2. To bridge the gap, the remainder of this section is devoted to the following result.
Theorem 6.4. Suppose that $\varphi \in \mathscr{G}_{\geq 1}$ and that $\varphi_0$ is supported on $\mathbb{P} = \{2\}$. Let $N_\varphi$ denote the Nevanlinna counting function (1.4) and $\mathcal{N}_\varphi$ the restricted counting function (6.5). Then
Three preliminary results are required for the proof of Theorem 6.4. We first decompose the Nevanlinna counting function as
where $N_{\varphi,k}(w)$ denotes the restricted counting function
corresponding to the half-strip
Note in particular that $N_{\varphi,0} = \mathcal{N}_\varphi$. We begin our study of $N_{\varphi,k}$ with the following Littlewood-type estimate.
Lemma 6.5. Suppose that $\varphi \in \mathscr{G}_{\geq 1}$ and that $\varphi_0$ is supported on $\mathbb{P} = \{2\}$. There is a constant $C = C(\varphi) > 0$ such that the estimate
holds uniformly for $0 < \operatorname{Re} w \leq c_0 \pi / \log 2$ and $k \in \mathbb{Z}$.
Proof. If $\varphi(s) = w$ for some $s \in S_k$, then the periodicity of $\varphi_0$ implies that
Hence we get from Lemma 6.3 that
For $0 < \operatorname{Re} w \leq c_0 \pi / \log 2$, we can combine Lemma 6.5 and (6.10) to obtain a less precise version of Bayart's estimate (1.5). Specifically,
It is reasonable to expect that the main contribution to $N_\varphi(w)$ in the decomposition (6.10) arises from the $k$ such that $w / c_0 \in S_k$. This is the main idea in the proof of Theorem 6.4. Before proceeding, we record the following simple fact.
Lemma 6.6. Suppose that $\varphi \in \mathscr{G}_{\geq 1}$ and that $\varphi_0$ is supported on $\mathbb{P} = \{2\}$. Let $j_1, j_2, k_1, k_2$ be integers satisfying the equation $k_1 - j_1 = k_2 - j_2$. For every $w_1$ such that $w_1 / c_0 \in S_{j_1}$, there is a $w_2$ such that $w_2 / c_0 \in S_{j_2}$, $\operatorname{Re} w_2 = \operatorname{Re} w_1$, and
Proof. Given $w_1$, define
Clearly $w_2 / c_0 \in S_{j_2}$ and $\operatorname{Re} w_2 = \operatorname{Re} w_1$. Consider $s_1 \in S_{k_1}$ such that $\varphi(s_1) = w_1$. Define
If $k_1 - j_1 = k_2 - j_2$, then $k_2 - k_1 = j_2 - j_1$, and thus $\varphi(s_2) = w_2$ (with the same multiplicity as $\varphi(s_1) = w_1$), by the periodicity of $\varphi_0$. This demonstrates that
but by symmetry we also have the reverse estimate.
A direct consequence of Lemma 6.6 is that the property $N_{\varphi,k}(w) = o(\operatorname{Re} w)$ does not depend on $k$.
Lemma 6.7. Suppose that $\varphi \in \mathscr{G}$ with $c_0 \geq 1$ and that $\varphi_0$ is supported on
We now prove the main result of this section.
Proof of Theorem 6.4. The implication $\implies$ is trivial since $\mathcal{N}_\varphi(w) \leq N_\varphi(w)$. For the converse implication $\impliedby$, we assume that
for every sequence $(w_j)_{j \geq 1}$ in $\mathbb{C}_0$ such that $\operatorname{Re} w_j \to 0$. We may without loss of generality assume that $0 < \operatorname{Re} w_j \leq c_0 \pi / \log 2$, so that Lemma 6.5 applies. Fix $\varepsilon > 0$. We need to prove that there is some $J > 0$ such that
for every $j \geq J$. For each $j \geq 1$, define $k_j$ by the requirement that $w_j / c_0 \in S_{k_j}$. By Lemma 6.5 and the decomposition (6.10) there is some non-negative integer $K$ (which does not depend on $j$) such that
where the points $\widetilde{w}_j$ arise from Lemma 6.6, whence $\operatorname{Re} \widetilde{w}_j = \operatorname{Re} w_j \to 0$ as $j \to \infty$.
We can now appeal to Lemma 6.7 to choose $J$ so large that
whenever $|k| \leq K$ and $j \geq J$. Hence (6.11) holds for $j \geq J$.
j such that f k H 2 ≤ 1 for every k ≥ 1 and which converges weakly to 0. For every ε > 0 and θ > 0 there is some constant K = K(ε, θ) such that Proof. Since f is in H 2 j , we can use the periodicity condition (6.1) and (6.2) to see that for every k ∈ Z and every σ > 0. The estimate easily follows.
We shall also require the following pointwise estimate for the derivative of a function in H 2 j .
Proof. Applying the Cauchy-Schwarz inequality we find that where we in the final estimate used that j ≥ 3 and Re w ≥ θ.
Proof of Theorem 1.5: Sufficiency. We assume now that where N ϕ is the Nevanlinna counting function (1.4). By Theorem 6.4 this is actually equivalent to the weaker assumption that where N ϕ is the restricted counting function (6.5). Our goal is to prove that C ϕ is compact on H 2 , which by Lemma 3.1 is equivalent to proving that (i) C ϕ,j is compact for every j ∈ O, (ii) C ϕ,j → 0 as j → ∞. We begin by establishing an estimate that is of relevance to the proofs of both claims. Fix 0 < δ < 1. For every ε > 0, there is some 0 < θ = θ(ε) ≤ c 0 π/ log 2 such that if 0 < Re w ≤ θ, then Suppose for the sake of contradiction that (7.4) does not hold along some sequence (w k ) k≥1 in C 0 with c 0 π/ log 2 ≥ Re w k → 0 as k → ∞. If | Im w k | is unbounded, we obtain a contradiction to Lemma 6.3. However, if | Im w k | is bounded we have a contradiction to (7.3). Combining (7.4) with Lemma 7.2 we find that if f (s) = n≥1 a n n −s is any function in H 2 j with f H 2 ≤ 1, then where we made use of the identity (6.4) to assert the final inequality. Let us prove the validity of (i). Fix j ∈ O and suppose that (f k ) k≥1 is a sequence in H 2 j which converges weakly to 0 and satisfies f k H 2 ≤ 1 for every k. We then choose θ > 0 such that (7.5) holds for each f k . Next, appealing to Lemma 7.1, we choose K such that for k ≥ K. By Lemma 6.2, combining (7.5) and (7.6) yields that Since ε was arbitrary, we conclude that C ϕ f k H 2 → 0 as k → ∞, and thus that C ϕ,j is compact.
For the proof of necessity, we require the following sub-mean value property of the Nevanlinna counting function for the unit disc. It is convenient to introduce the notation $D(w, r) = \{\xi \in \mathbb{C} : |\xi - w| < r\}$.
Lemma 7.4. Suppose that $\psi$ is an analytic self-map of the unit disc $\mathbb{D}$ and let $N^{\mathbb{D}}_\psi$ denote the Nevanlinna counting function associated with $\psi$. If $g$ is an analytic map from a domain $\Omega$ to $\mathbb{D}$, $D(w, r) \subseteq \Omega$, and $\psi(0) \notin g(D(w, r))$, then
Proof. For a short proof we refer to [17, Sec. 4.6].
Proof of Theorem 1.5: Necessity. We assume now that $\mathscr{C}_\varphi$ is compact on $\mathscr{H}^2$ and seek to establish that
where $N_\varphi$ is the Nevanlinna counting function (1.4). In view of Theorem 6.4, we may equivalently establish that
where $\mathcal{N}_\varphi$ is the restricted counting function (6.5).
Since C ϕ is compact on H 2 , it is certainly compact on the subspace H 2 1 . We shall make use of a version of Lemma 6.2 adapted to a larger half-strip. If we first write down the Littlewood-Paley formula (6.3) with respect to the half-strip | Im s| < 2π/ log 2, and then make a change of variables, we obtain the formula for every f ∈ H 2 1 . Here the counting function has been restricted to the larger strip, so that At every point in w ∈ C 0 , the subspace H 2 1 has a reproducing kernel which is bounded in C 0 . A direct computation shows that the normalized reproducing kernel at w ∈ C 0 is given by If (w k ) k≥1 is any sequence in C 0 with Re w k → 0 + , then (K w k ) k≥1 converges weakly to 0 in H 2 1 , and thus C ϕ K w k → 0 as k → ∞, since C ϕ is compact. From (7.8) and (7.9) we therefore conclude that (7.10) lim Let us for the moment restrict our attention to a single w = w k , assuming without loss of generality that 0 < Re w ≤ c 0 π/ log 2. As in the proof of Lemma 6.3, we define where we now fix T = 2π/ log 2. We also let g(ξ) = (ξ − w)/(ξ + w), so that g is a conformal map from C 0 to D. Clearly, ψ(z) = g(ξ) if and only if ϕ(Θ T (z)) = ξ. Hence If ξ ∈ D(w, Re w/2), then Since Re ϕ 0 (s) ≥ 0, we see that if ϕ(s) = ξ, then certainly Re s ≤ (3/4)T . Set S T = {s ∈ C 0 : | Im s| < T }. If s ∈ S T , then it follows from a Kellogg-Warschawski type theorem (see e.g. [14,Thm. 3.9]) that It is of course also possible to establish this estimate by direct computation. Since Re ϕ(T ) ≥ c 0 T > Re ξ for every ξ ∈ D(w, Re w/2), we can use the estimate together with Lemma 7.4 for Ω = C 0 to conclude that Next we recall that g(w) = 0 and return to formula (7.11). If we restrict ourselves to solutions of ϕ(s) = w satisfying | Im s| ≤ π/ log 2 = T /2, then (6.8) says that ). Applying these two estimates for every w = w k , we have finally proven that (7.10) implies the desired conclusion,

Approximation numbers for angle maps
As in Section 6 and Section 7, we will assume without loss of generality that we are working with the prime number $p = 2$. Fix a real number $0 < \alpha < 1$ and let
The function $\Phi_\alpha$ is a univalent map from the unit disc $\mathbb{D}$ onto the angle
The following result is an immediate consequence of Theorem 1.3.
Corollary 8.1. Let $\varphi_{\alpha,\vartheta}(s) = c_0 s + \vartheta + \Phi_\alpha(2^{-s})$. Then $\varphi_{\alpha,\vartheta}$ is in $\mathscr{G}_{\geq 1}$ and $a_n(\mathscr{C}_{\varphi_{\alpha,\vartheta}}) \asymp n^{-\vartheta} (\log n)^{-\frac{1}{2\alpha}}$ for $n \geq 2$. The implied constants depend on $\alpha$ and $\vartheta$.
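Proof. It is evident that $\varphi_{\alpha,\vartheta}$ is in $\mathscr{G}_{\geq 1}$. Appealing to both parts of Theorem 1.3 and computing the $\mathscr{H}^2$-norm with (2.4), we get for $n \geq 2$ that
\[ a_n(\mathscr{C}_{\varphi_{\alpha,\vartheta}}) \asymp \|\mathscr{C}_{\varphi_{\alpha,\vartheta}} e_n\|_{\mathscr{H}^2} = n^{-\vartheta} \left( \int_{\mathbb{T}} n^{-2 \operatorname{Re} \Phi_\alpha(e^{i\theta})} \, \frac{d\theta}{2\pi} \right)^{1/2} \asymp n^{-\vartheta} (\log n)^{-\frac{1}{2\alpha}}, \]
where a straightforward estimate has been carried out in the final step.
Theorem 1.3 (a) no longer yields the correct behavior of the approximation numbers in the more intricate case when $\vartheta = 0$. To prove Theorem 1.6, we will instead base our analysis on the change of variables formula from Lemma 6.2 and on estimates of the restricted counting function.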

Lemma 8.2.
There is β = β(c 0 , α), α < β < 1, such that Proof. In view of the maximum principle, it is sufficient to show that there is β such that , and a numerical choice of β.
By Lemma 3.1, it is sufficient to establish (8.4) for f j in H 2 j satisfying f j (+∞) = 0, as long as we do so uniformly for every j ∈ O.
To estimate C ϕα f j 2 H 2 we use the change of variables formula from Lemma 6.2 and Lemma 8.3 (ii). Let η 2 be as in the latter result and split the integral from Lemma 6.2 at Re w = η 2 to obtain We begin with I 1 . We appeal to Lemma 8.3 (ii), then extend the integral in σ from (0, η 2 ) to (0, ∞) to see that Writing m = j2 k ≥ 2 and n = j2 l ≥ 2, (8.8) 1 log m + log n ≤ 1 log 2 (k + l) −1 if j = 1, Note that if j = 1, we are only summing over k, l ≥ 1, while we need to consider k, l ≥ 0 if j ∈ O \ {1}. In either case, we insert (8.7) and (8.8) into (8.6) and appeal to Hilbert's inequality to conclude that The integral I 2 is easier to estimate. We can for instance use the coarse upper bound N ϕα (w) ≤ N ϕα (w) ≤ Re w/c 0 from (1.5) and argue as above to see that Thus I 2 clearly satisfies the same upper bound as I 1 (up to a constant), since η 2 > 0.
We insert this estimate into the double integral in (8.10), then use the substitution σ = x/ log(j 2 2 k+l ) and that (log 2)(k − l)/ log(j 2 2 k+l ) ≤ 1 to obtain By choosing 0 < β 1 < α sufficiently small, we can make B as small as we wish. In particular, we can ensure that Hence C ϕα f j 2 H 2 / f j 2 H 2 ≍ (log j) 1−1/α , completing the proof of (8.9).