Summability and speed of convergence in an ergodic theorem

Given an irrational vector $\alpha$ in $\mathbb{R}^{d}$, a continuous function $f(x)$ on the torus $\mathbb{T}^{d}$ and suitable weights $\Phi(N,n)$ such that $\sum_{n=-\infty}^{+\infty}\Phi(N,n)=1$, we estimate the speed of convergence to the integral $\int_{\mathbb{T}^{d}}f(y)dy$ of the weighted sum $\sum_{n=-\infty}^{+\infty}\Phi(N,n) f(x+n\alpha)$ as $N\rightarrow +\infty$. Whereas for the arithmetic means $N^{-1}\sum_{n=1}^{N}f(x+n\alpha)$ the speed of convergence is never faster than $cN^{-1}$, for other means such speed can be accelerated. We estimate the speed of convergence in two theorems with different flavor. The first result is a metric one, and it provides an estimate of the speed of convergence in terms of the Fourier transform of the weights $\Phi(N,n)$ and the smoothness of the function $f(x)$ which holds for almost every $\alpha$. The second result is a deterministic one, and the speed of convergence is estimated also in terms of the Diophantine properties of the given irrational vector $\alpha\in\mathbb R^d$.


Introduction
The motivation for this work comes from an attempt to estimate the speed of convergence in a classical ergodic theorem, which we now describe. There are several results in the literature concerning this problem, and here we cite only a few.
A classical result of L. Kronecker states that if $\alpha=(\alpha_1,\ldots,\alpha_d)\in\mathbb{R}^d$ is an irrational vector, that is, if $1,\alpha_1,\ldots,\alpha_d$ are linearly independent over the rationals, then the sequence $\{n\alpha\}_{n=1}^{+\infty}$ is dense in the torus $\mathbb{T}^d=\mathbb{R}^d/\mathbb{Z}^d$. This implies that for every continuous nonconstant function $f(x)$ on the torus the sequence $\{f(x+n\alpha)\}_{n=1}^{+\infty}$ does not have a limit as $n\to+\infty$. Another classical result, obtained independently by P. Bohl, W. Sierpiński and H. Weyl, states that the sequence $\{n\alpha\}_{n=1}^{+\infty}$ is uniformly distributed in the torus, and that the arithmetic means of the sequence $\{f(x+n\alpha)\}_{n=1}^{+\infty}$ converge to the integral of the function, $\lim_{N\to+\infty}N^{-1}\sum_{n=1}^{N}f(x+n\alpha)=\int_{\mathbb{T}^d}f(y)\,dy$. For such classical facts we refer the reader, for instance, to [16] and [23, Chapter 6]. The map $x\mapsto x+\alpha$ is a measure preserving ergodic transformation whenever $\alpha\in\mathbb{R}^d$ is an irrational vector, and the above results are particular cases of classical ergodic theorems. It is known that no general statement can be made about the rate of convergence in these theorems. In [13] and [15] it is proved that if $T$ is a measure preserving ergodic transformation of the interval $[0,1]$ and if $\{\varepsilon_n\}_{n=1}^{+\infty}$ is a positive sequence converging to $0$, then there exists a continuous function $f(x)$ such that, for almost every $x$, one has lim sup Confirming a conjecture of Erdős and Szüsz, in [14] and in [18] it is proved that if $f(x)$ is the characteristic function of an interval $\{a\le x\le b\}$, with $0<b-a<1$, then the quantity is bounded in $N$ if and only if $b-a=h\alpha-k$ for some integers $h$ and $k$. Therefore, for a characteristic function the speed of convergence $cN^{-1}$ is the exception, not the rule. For multidimensional analogues of such results see [10, 11].
In [12] it is proved that if $f(x)$ is a continuously differentiable function on $\{0\le x\le1\}$ with $df(x)/dx$ Lipschitz continuous and with $f(0)\neq f(1)$, then, for every $\alpha$, one has lim sup It is also proved that if $f(0)=f(1)$, hence $f(x)$ is continuous as a function on the torus $\mathbb{T}$, but the derivative may have a jump discontinuity, then, for almost every $\alpha$, one has lim sup Observe that a discontinuous function cannot have an absolutely convergent Fourier expansion. On the other hand, the assumptions $f(0)=f(1)$ and $df(x)/dx$ Lipschitz continuous in $\{0\le x\le1\}$ imply that $|\hat f(m)|\le c|m|^{-2}$. More generally, if $df(x)/dx$ is Hölder continuous with exponent $\varepsilon>0$, then $|\hat f(m)|\le c|m|^{-1-\varepsilon}$. In [6] it is proved that if $\{\alpha_n\}_{n=0}^{+\infty}$ is a van der Corput sequence on the interval $\{0\le x\le1\}$ and if the Fourier coefficients of the function $f(x)$ have decay $|\hat f(m)|\le c|m|^{-1-\varepsilon}$ for some $\varepsilon>0$, then sup In [1] it is proved that the expected speed of convergence of Weyl sums of continuous, or, more generally, square integrable functions, is slightly less than $N^{-1/2}$. More precisely, it is proved there that if $f(x)$ is square integrable and if $\nu<1/2$ then, for almost every $(\alpha,x)\in\mathbb{T}^d\times\mathbb{T}^d$, one has lim sup It is also proved that the exponent $-1/2$ is best possible and that there exist continuous functions $f(x)$ such that, for almost every $(\alpha,x)\in\mathbb{T}^d\times\mathbb{T}^d$, one has lim sup In conclusion, the rate of convergence of the means $N^{-1}\sum_{n=1}^{N}f(x+n\alpha)$ to the integral $\int_{\mathbb{T}^d}f(y)\,dy$ can be arbitrarily slow, and it is also quite easy to see that this rate of convergence cannot be faster than $cN^{-1}$; see the proof of Corollary 1.7. The goal in this paper is to show that, with suitable smoothness assumptions on the function $f(x)$, the speed of summability of the divergent sequence $\{f(x+n\alpha)\}_{n=1}^{+\infty}$ can be improved if instead of the arithmetic means one considers smoother means such as, for instance, See [7] and [24] for references about summation methods.
Let us now fix some notation for what follows. Denote by $\|t\|$ the distance of a real number $t$ to the nearest integer, that is, $\|t\|=\inf_{n\in\mathbb{Z}}\{|t-n|\}$. Functions on the torus $\mathbb{T}^d=\mathbb{R}^d/\mathbb{Z}^d$ are identified with periodic functions on $\mathbb{R}^d$ with period $\mathbb{Z}^d$. The Fourier transform and the Fourier expansion of an integrable function on the torus are defined respectively by $\hat f(m)=\int_{\mathbb{T}^d}f(x)e^{-2\pi i m\cdot x}\,dx$ and $f(x)=\sum_{m\in\mathbb{Z}^d}\hat f(m)e^{2\pi i m\cdot x}$. The Sobolev space $W^{\delta,2}(\mathbb{T}^d)$, $\delta>0$, is the space of distributions on $\mathbb{T}^d$ defined by the norm $\|f\|_{W^{\delta,2}(\mathbb{T}^d)}=\big\{\sum_{m\in\mathbb{Z}^d}(1+|m|^2)^{\delta}|\hat f(m)|^2\big\}^{1/2}$.
In what follows $\Phi(N,n)$ denotes a complex valued function of the positive integer variable $N\ge1$ and the integer variable $n\in\mathbb{Z}$, with the property that for every $N$ the function $n\mapsto\Phi(N,n)$ has bounded support, and that
(1) $\sum_{n=-\infty}^{+\infty}\Phi(N,n)=1$.
The weighted discrepancy associated to the weights $\{\Phi(N,n)\}_{n=-\infty}^{+\infty}$ and to the Kronecker sequence $\{n\alpha\}_{n=-\infty}^{+\infty}$, with $\alpha\in\mathbb{R}^d$, or equivalently with $\alpha\in\mathbb{T}^d$, is defined by $D^{\Phi,\alpha}_Nf(x)=\sum_{n=-\infty}^{+\infty}\Phi(N,n)f(x+n\alpha)-\int_{\mathbb{T}^d}f(y)\,dy$. An example to keep in mind is where $\Psi(t)$ is a suitable bounded function with compact support. In this case $N$ is roughly the size of the support of the function $n\mapsto\Phi(N,n)$. The assumption of compact support could be weakened by assuming a suitably fast decay at infinity. Our first main result is related to the results in [4, 5] and it reads as follows.
(i) There exist constants $K>0$ and $\vartheta>0$ such that for every $N\ge1$ and every $t\in\mathbb{R}$ one has Then, for almost every $\alpha$ there exists a positive constant $c(f,\alpha)$ such that for every positive integer $N$ one has Observe that if the assumption (i) holds true with an exponent $\vartheta_0$, then it also holds true for every $\vartheta_1\le\vartheta_0$. If the function $f(x)$ has a degree of smoothness $d/2<\delta\le d\vartheta_0-d/2$ with $\vartheta_0>1$, then one cannot guarantee a speed of convergence $cN^{-\vartheta_0}$, but at least one can guarantee a speed $cN^{-\vartheta_1}$ for every 1 The above result is a metric one and it holds true for almost every $\alpha$. Our second main result is a deterministic one and it holds for a specific $\alpha$.
Theorem 1.2. Let $D^{\Phi,\alpha}_N$ be the operator defined as above with $\Phi(N,n)$ satisfying (1). Assume the following.
(i) There exist constants $K>0$ and $\vartheta>0$ such that for every positive integer $N$ and every $t\in\mathbb{R}$ one has (ii) The vector $\alpha\in\mathbb{R}^d$ is irrational and there exist constants $H>0$ and $\sigma\ge d$ such that $\|\alpha\cdot m\|\ge H|m|^{-\sigma}$ for every $m\in\mathbb{Z}^d\setminus\{0\}$. (iii) Finally assume that $\delta>d/2$ and set Then there exists a positive constant $c=c(H,K,d,\delta,\vartheta,\sigma)$ such that for every function $f(x)$ in the Sobolev space $W^{\delta,2}(\mathbb{T}^d)$ and every positive integer $N$ one has
Observe that both Theorem 1.1 and Theorem 1.2 guarantee a speed of convergence $cN^{-\vartheta}$, up to some possible logarithmic transgressions, but the smoothness assumptions on the functions in these theorems are different. The index of smoothness $\delta>d\vartheta-d/2$ in Theorem 1.1 is allowed to be smaller than the index $\delta>\vartheta\sigma\ge d\vartheta$ in Theorem 1.2. On the other hand, the conclusion in Theorem 1.1 holds for almost every $\alpha$, with the admissible set of $\alpha$ depending on the given function one is considering, whereas in Theorem 1.2 the vector $\alpha$ is independent of the function. Anyhow, both theorems are essentially sharp. The following theorem shows that in Theorem 1.1 and in Theorem 1.2 the speed of convergence $cN^{-\vartheta}$ cannot be accelerated for any nonconstant function, provided that the assumption Then, for every function $f(x)$, every $m\in\mathbb{Z}^d\setminus\{0\}$ and every $\alpha\in\mathbb{T}^d$ one has lim sup In particular, if $C(t)>0$ for a set of $t\in\mathbb{T}$ of measure $0<\eta\le1$, then $C(m\cdot\alpha)>0$ for a set of $\alpha\in\mathbb{T}^d$ of measure $\eta$; see Lemma 2.1.
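Notice that in this theorem the smoothness index of the function plays no role. Nonetheless, some smoothness is necessary. Indeed, since the Sobolev space $W^{\frac d2,2}(\mathbb{T}^d)$ contains unbounded functions, it easily follows that the smoothness assumption $\delta>d/2$ in Theorem 1.1 and Theorem 1.2 is necessary.
Theorem 1.4. There exists a function $f(x)$ in the Sobolev space $W^{\frac d2,2}(\mathbb{T}^d)$ such that for every irrational vector $\alpha$ and every $N$ one has Moreover, if the sequence $\{\Phi(N,n)\}_{n=-\infty}^{+\infty}$ is non-negative, then the above discrepancy is infinite for every $\alpha$.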
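The gain promised by smoother means can be seen numerically in the simplest case. The following Python sketch compares the arithmetic means with Fejér-type means for the test function $f(x)=\cos(2\pi x)$, whose integral over the torus is $0$. The rotation number `ALPHA`, the test function and the normalisation $\Phi(N,n)=N^{-1}(1-|n|/N)_+$ of the Fejér-type weights are assumptions made for illustration and are not taken from the statements above.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumed test rotation (golden-ratio type)


def f(x):
    """Test function cos(2*pi*x); its integral over the torus is 0."""
    return math.cos(2 * math.pi * x)


def arithmetic_weights(N):
    """Phi(N, n) = 1/N for 1 <= n <= N (arithmetic means)."""
    return {n: 1.0 / N for n in range(1, N + 1)}


def fejer_weights(N):
    """Phi(N, n) = (1 - |n|/N)/N for |n| < N (assumed normalisation, sums to 1)."""
    return {n: (1.0 - abs(n) / N) / N for n in range(-N + 1, N)}


def discrepancy(weights, x=0.0, alpha=ALPHA):
    """|sum_n Phi(N, n) f(x + n*alpha) - integral of f|, the integral being 0 here."""
    return abs(sum(w * f(x + n * alpha) for n, w in weights.items()))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, discrepancy(arithmetic_weights(N)), discrepancy(fejer_weights(N)))
```

For this single frequency the two discrepancies are bounded by $(N|\sin(\pi\alpha)|)^{-1}$ and $(N\sin(\pi\alpha))^{-2}$ respectively, which matches the exponents $\vartheta=1$ and $\vartheta=2$ attached to these two families of weights in Corollaries 1.7 and 1.8.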
The following theorem shows that the index δ > dϑ−d/2 in Theorem 1.1 is sharp, provided that the assumption (i) in the theorem can be reversed.
Theorem 1.5. Assume that for an infinite sequence of $N$'s there exists $H>0$ such that such that, for every $\alpha$, one has (ii) There exists a function $f(x)$ in the Sobolev space $W^{d\vartheta-\frac d2,2}(\mathbb{T}^d)$ such that, for almost every $\alpha$, one has lim sup
The following theorem shows that the smoothness index $\delta>\vartheta\sigma$ in Theorem 1.2 is sharp, provided that the assumptions (i) and (ii) in the theorem can be reversed.
Theorem 1.6. Assume that for an infinite sequence of $N$'s there exists $H>0$ such that Assume also that for a given $\alpha$ there exist $L>0$ and an infinite subset $\Omega$ of
A crucial assumption in Theorem 1.2 is the behavior of the sequence $\{\|\alpha\cdot m\|\}_{m\in\mathbb{Z}^d}$. We recall some results in Diophantine approximation. It follows from these results that the assumption (ii) in Theorem 1.2 is not empty and that $\sigma\ge d$ is necessary. The following corollaries show that the assumption (i) in Theorem 1.1 and Theorem 1.2 is nonempty as well.
Then Theorem 1.1 and Theorem 1.2 can be applied with every $\vartheta$, but with $[\sqrt N]$ instead of $N$, that is, the relations between the indexes $d$, $\delta$, $\vartheta$ and $\sigma$ are the ones in the theorems, but the speed of convergence is cN The next corollary shows that, up to a small logarithmic transgression, Kronecker sequences associated to vectors $\alpha$ which satisfy the hypothesis (ii) in Theorem 1.2 with $\sigma=d$ give optimal quadrature rules for Sobolev functions. See [3] for results about quadrature rules for Sobolev functions. See also [8] and [9] for results about existence of optimal quadrature rules.
Let $\alpha$ be an irrational vector in $\mathbb{R}^d$ and assume also that there exist constants Then there exists a positive constant $c$ such that When $\sigma=d$ the speed of convergence $cN^{-\delta/d}$ cannot be improved, in the sense that there exists $c>0$ such that for every distribution of points $\{z(n)\}_{n=1}^{N}$ and weights $\{\omega(n)\}_{n=1}^{N}$ there exist nonconstant functions in $W^{\delta,2}(\mathbb{T}^d)$ with
The above corollaries show that it is quite easy to exhibit examples of weights $\Phi(N,n)$ that satisfy the assumptions in Theorem 1.1 and Theorem 1.2. It is less immediate to construct weights $\Phi(N,n)$ that satisfy the reverse assumptions, in particular the ones in Theorem 1.5 and Theorem 1.6. However, such weights exist; see Remark 3.1. We include in the paper an appendix where we consider the logarithmic means defined by the weights Although these logarithmic means do not satisfy exactly the assumptions in Theorem 1.1 and Theorem 1.2, the proofs of these theorems can be adapted.
To conclude, the above results may have continuous analogues where the discrete means are replaced by continuous means, We plan to investigate such operators in future work.
In the next section we provide the proofs of our main theorems, whereas in Section 3 we conclude with some final remarks.

Proofs of the main results
To prove Theorem 1.1 we need an elementary lemma.
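Lemma 2.1. If $g(t)$ is a periodic locally integrable function on $\mathbb{T}$, then, for every $m\in\mathbb{Z}^d\setminus\{0\}$, one has $\int_{\mathbb{T}^d}g(m\cdot\alpha)\,d\alpha=\int_{\mathbb{T}}g(t)\,dt$. More precisely, if $g(t)$ is a measurable function on $\mathbb{T}$, then, for every $m\in\mathbb{Z}^d\setminus\{0\}$, the functions $g(t)$, $t\in\mathbb{T}$, and $g(m\cdot\alpha)$, $\alpha\in\mathbb{T}^d$, have the same distribution function. Namely, for every $-\infty<s<+\infty$, $|\{\alpha\in\mathbb{T}^d:\ g(m\cdot\alpha)>s\}|=|\{t\in\mathbb{T}:\ g(t)>s\}|$.
Proof. By periodicity and a change of variables, for every non-zero integer $h$ and real number $u$, one has $\int_0^1g(ht+u)\,dt=\int_0^1g(t)\,dt$. Integrating first in a variable $\alpha_j$ with $m_j\neq0$, with $h=m_j$ and $u=\sum_{k\neq j}m_k\alpha_k$, gives the first statement. The distribution functions of $g(t)$ and $g(m\cdot\alpha)$ are seen to be equal by applying the above identity to the characteristic function of the upper level set $\chi_{\{t\in\mathbb{T},\ g(t)>s\}}(t)$.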
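A discrete analogue of Lemma 2.1 can be checked exactly on a finite grid. In the following Python sketch the torus $\mathbb{T}^d$ is replaced by the grid $\{j/M\}^d$, and one counts how often $m\cdot\alpha$ modulo $1$ takes each value $r/M$. The grid size `M`, the vectors `m` and the function `g` used in the checks are hypothetical choices, and the equal counts rely on the assumption that the entries of `m` and `M` have no common factor.

```python
from collections import Counter


def residue_counts(m, M):
    """Count how often (m . j) mod M takes each value as j runs over (Z/MZ)^d."""
    d = len(m)
    counts = Counter()
    for index in range(M ** d):
        # Decode index into the d base-M digits j_1, ..., j_d of a grid point.
        rest, total = index, 0
        for coordinate in m:
            rest, digit = divmod(rest, M)
            total += coordinate * digit
        counts[total % M] += 1
    return counts


def grid_average(g, m, M):
    """Average of g(m . alpha) over the grid alpha = j / M."""
    counts = residue_counts(m, M)
    return sum(c * g(r / M) for r, c in counts.items()) / M ** len(m)


if __name__ == "__main__":
    print(residue_counts((2, 3), 12))
```

When every residue occurs the same number of times, the grid average of $g(m\cdot\alpha)$ coincides with the grid average of $g(t)$, which is the discrete counterpart of the equality of the two distribution functions.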
We now prove our first main result.
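Proof of Theorem 1.1. First observe that, by Hölder's inequality, for every $0<p<2$ one has $\big\{\sum_{m\in\mathbb{Z}^d}|\hat f(m)|^p\big\}^{1/p}\le\big\{\sum_{m\in\mathbb{Z}^d}(1+|m|^2)^{\delta}|\hat f(m)|^2\big\}^{1/2}\big\{\sum_{m\in\mathbb{Z}^d}(1+|m|^2)^{-p\delta/(2-p)}\big\}^{(2-p)/(2p)}$.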
The first factor is the Sobolev norm of $f(x)$, whereas the second series converges provided that $2p\delta/(2-p)>d$. In particular, for $p=1$ and $\delta>d/2$ one sees that the Fourier expansion of $f(x)$ converges absolutely. This fact and the compact support of $n\mapsto\Phi(N,n)$ assure the pointwise identity Hence, thanks to (i), one has In order to show that the constants $c(f,\alpha)$ are finite for almost every $\alpha$ it suffices to show that the series defining these constants converges absolutely for almost every $\alpha$. By the previous lemma the functions $\alpha\mapsto\|m\cdot\alpha\|^{-\vartheta}$ are in $L^p(\mathbb{T}^d)$ for every $p<1/\vartheta$ with norm independent of $m$, If $0<\vartheta<1$, then the functions $\alpha\mapsto\|m\cdot\alpha\|^{-\vartheta}$ are integrable and the series As observed before, this holds true if $\delta>d/2$.
If $\vartheta\ge1$, then $0<p<1/\vartheta\le1$, and, by the inequality $|a+b|^p\le|a|^p+|b|^p$, the series converges for almost every $\alpha$ and in the $L^p(\mathbb{T}^d)$ quasinorm provided that As observed at the beginning of the proof this happens for every $0<p<2$ whenever $2p\delta/(2-p)>d$, from which one obtains
A key ingredient in the proof of Theorem 1.2 is a classical result in Diophantine approximation. Let $\gamma$ be an irrational number. If the sequence $\{\|\gamma n\|\}_{n=1}^{N}$ is as well distributed in $0\le t\le1/2$ as the sequence $\{n/(2N)\}_{n=1}^{N}$, then one can guess that Under suitable Diophantine assumptions on $\gamma$ the above conjectured estimate is correct. The following lemma is a variant of known results (see e.g. [17, Chapter 3]).
Lemma 2.2. Assume that $\alpha=(\alpha_1,\ldots,\alpha_d)\in\mathbb{R}^d$ is an irrational vector, that is, $1,\alpha_1,\ldots,\alpha_d$ are linearly independent over the rationals, and assume that there exist constants $H>0$ and $\sigma\ge d$ such that $\|\alpha\cdot m\|\ge H|m|^{-\sigma}$ for every $m\in\mathbb{Z}^d\setminus\{0\}$. Then there exists a positive constant $c$ such that, for every $R\ge1$,
Proof. By the assumptions $\|\alpha\cdot m\|\ge H|m|^{-\sigma}$ and $|m|<R$, the interval $[0,H/(2R)^{\sigma})$ does not contain any term of the sequence $\{\|\alpha\cdot m\|\}_{0<|m|<R}$. Moreover, for every integer $n$ such that $0<n<2^{\sigma-1}H^{-1}R^{\sigma}$ the interval $I^{\sigma}_{n,R}=[nH/(2R)^{\sigma},(n+1)H/(2R)^{\sigma})$ contains at most one term of such sequence. Indeed, if there are two terms in the interval, then there are integer points $p\neq q$ with $|p|,|q|<R$, and integers $u$ and $v$ such that The sign is minus if $\alpha\cdot p$ and $\alpha\cdot q$ approximate the nearest integers $u$ and $v$ both from above or both from below, and the sign is plus if one approximation is from above and the other from below. Hence, But $|p\pm q|<2R$, and this contradicts the assumption $\|\alpha\cdot m\|\ge H|m|^{-\sigma}$. Notice that the number of intervals $I^{\sigma}_{n,R}$ is of the order of $cR^{\sigma}$, whereas the number of integer points in the punctured ball $\{0<|m|<R\}$ is about $cR^{d}$, and recall also that $\sigma\ge d$. Observe that one has the worst estimate when the terms of the sequence $\{\|\alpha\cdot m\|\}_{0<|m|<R}$ are concentrated in the first $0<n\le cR^{d}$ intervals.
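The separation argument in this proof can be tested numerically in the simplest case. The Python sketch below takes $d=\sigma=1$, restricts to positive $m$ (so that $m$ and $-m$ are not counted twice), and checks that the bins of width $H/(2R)$ contain at most one of the values $\|m\alpha\|$, $0<m<R$. The number `ALPHA` and the constant `H = 0.38` are assumptions chosen for illustration; the constant is only verified numerically in the range tested. The function `worst_case_bound` puts one value at the left end of each of the first bins, which is the worst configuration described in the proof, with exponent $1$.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2  # assumed example of a badly approximable number
H = 0.38                        # assumed constant with ||m * ALPHA|| >= H / m


def dist(t):
    """Distance of the real number t to the nearest integer."""
    return abs(t - round(t))


def bin_occupancy(R, alpha=ALPHA, h=H):
    """Count the values ||m*alpha||, 0 < m < R, in the bins [k*w, (k+1)*w), w = h/(2R)."""
    width = h / (2 * R)
    bins = {}
    for m in range(1, R):
        k = int(dist(m * alpha) / width)
        bins[k] = bins.get(k, 0) + 1
    return bins


def inverse_sum(R, alpha=ALPHA):
    """Sum of 1/||m*alpha|| over 0 < m < R."""
    return sum(1.0 / dist(m * alpha) for m in range(1, R))


def worst_case_bound(R, h=H):
    """Sum obtained with one value at the left end of each of the bins 1, ..., R-1."""
    width = h / (2 * R)
    return sum(1.0 / (k * width) for k in range(1, R))


if __name__ == "__main__":
    R = 200
    print(max(bin_occupancy(R).values()), inverse_sum(R), worst_case_bound(R))
```

Since the first bin is empty and every other bin holds at most one value, the $k$-th smallest value is at least $kH/(2R)$, and this is what makes `inverse_sum(R)` stay below `worst_case_bound(R)`.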
In conclusion,
Proof of Theorem 1.2. As in the proof of Theorem 1.1 one has the pointwise identity Hence, by Cauchy's inequality and assumption (i), one has the estimate By Lemma 2.2, for every positive integer $M$ one has Observe that if $\sigma=d$, then $\delta>\sigma/2$. For the last case $\vartheta>1/2$, the choice Collecting the above estimates one obtains that
Proof of Theorem 1.3. Recall that the Fourier coefficients are bounded by the $L^1(\mathbb{T}^d)$ norm of the function, so that sup Therefore, lim sup
The proof of Theorem 1.4 is straightforward. We include a proof for the sake of completeness.
Proof of Theorem 1.4. It suffices to recall that the Sobolev space $W^{\frac d2,2}(\mathbb{T}^d)$ contains unbounded functions. If $f(x)$ is unbounded in just one point and if $\alpha$ is an irrational vector, or if the weights $\Phi(N,n)$ are non-negative and $\alpha$ is arbitrary, then in the sum $\sum_{n=-\infty}^{+\infty}\Phi(N,n)f(x+n\alpha)$ the possible infinite terms do not cancel. Hence sup An explicit example of a function in $W^{\frac d2,2}(\mathbb{T}^d)$ unbounded in a neighborhood of the origin and bounded elsewhere is given by the series
The following lemma is a main ingredient in the proof of Theorem 1.5. [19, Chapter III, Theorem 3A] and [2, 20]. Hence, for almost every $\alpha$ there are infinitely many terms larger than $1$ in the given series, so that such series diverges.
Proof of Theorem 1.5. Observe that if $\vartheta\le1$ then $d\vartheta-d/2\le d/2$, and this case is already covered by Theorem 1.4. In order to prove (i), define The norm of this function in the Sobolev space $W^{\delta,2}(\mathbb{T}^d)$ is Then, by part (i) of Lemma 2.3, it follows that, for every $\alpha$, one has lim sup The proof of (ii) is similar. Define If $\vartheta>1/2$ this function is in the Sobolev space $W^{d\vartheta-\frac d2,2}(\mathbb{T}^d)$, and Then, by part (ii) of Lemma 2.3, it follows that, for almost every $\alpha$, one has lim sup
Finally, we prove Theorem 1.6.
Proof of Theorem 1.6. Let $A$ be a subset of $\Omega$ of cardinality $|A|<+\infty$, and let . Moreover, for the $N$'s in the theorem and under the assumption that $\|\alpha\cdot m\|\le L|m|^{-\sigma}$ for every $m\in A$, Hence, if $\delta\le\vartheta\sigma$ one has lim sup Letting $|A|\to+\infty$, it follows that the family of operators $\{D^{\Phi,\alpha}_N\}_{N=1}^{+\infty}$ is not uniformly bounded from $W^{\vartheta\sigma,2}(\mathbb{T}^d)$ into $L^{\infty}(\mathbb{T}^d)$. Therefore, by the resonance theorem of Banach and Steinhaus, there exists a function $f\in W^{\vartheta\sigma,2}(\mathbb{T}^d)$ such that lim sup
We conclude the section by proving the corollaries.
Up to a factor $1/(2N+1)$ one recognizes the Dirichlet kernel and easily verifies that $\frac{\sin((2N+1)\pi t)}{(2N+1)\sin(\pi t)}$ Hence Theorem 1.1 and Theorem 1.2 apply with $\vartheta=1$. In order to prove that the speed of convergence $cN^{-1}$ cannot be accelerated one can apply Theorem 1.3. However, there is also a more elementary and general argument that applies to every nonconstant function $f(x)$. Assume that there exists a pair $\{N,N+1\}$ such that Then the triangle inequality gives a contradiction,
Proof of Corollary 1.8. In this case we have Up to a factor $1/N$ one recognizes the Fejér kernel, and checks that Theorem 1.1 and Theorem 1.2 apply with $\vartheta=2$. To prove that the speed of convergence $cN^{-2}$ cannot be accelerated observe that $\sin^2(\pi t)$.
If $t=0$, then $C(0)=+\infty$. If $t\neq0$ is rational, then $\sin^2(\pi Nt)$ takes a finite number of values for $N\to+\infty$, hence lim sup
Proof of Corollary 1.9. The function $\Psi(t)$ is related to the Bochner-Riesz kernel. Recall the integral representation of Bessel functions, The Poisson summation formula gives the series expansion See [22, Chapter 4, Theorem 4.15 and Chapter 7, Theorem 2.4]. Observe that the use of the Poisson summation formula is legitimate since both above series are absolutely and uniformly convergent (see [21, Lemmas 4 and 5]). Also recall that the Bessel function $J_{\gamma+\frac12}(z)$ has the asymptotic expansions Assume for simplicity that $0<t<1/2$. Then the above sum has a main term of the form The remainder is the sum over all $k$'s with $|t+k|\ge1/2$ and it can be estimated as The main term can be estimated from above by Observe that the estimates of the main term dominate the remainder. Also notice that In conclusion, Hence, Theorem 1.1 and Theorem 1.2 apply with $\vartheta=\gamma+1$. To apply Theorem 1.3 let us show that there exist $\varepsilon>0$ and $\eta>0$ such that for every $t$ in a set of measure $\eta$ one has $C(t)>\varepsilon$. Observe that Again assume that $t+k$ is not an integer for every $k$. The asymptotic expansion of Bessel functions gives Let $0<\lambda<1/2$. Then for every $t$ such that $\lambda/2\le t\le\lambda$ one has In conclusion, if $\lambda$ is suitably small, for every $N$ suitably large one has By the Poisson summation formula, It follows that, for every $t\neq0$, Hence, for this $\Phi(N,n)$ assumption (i) in Theorem 1.2 holds with arbitrary $\vartheta$, and for fixed $\sigma$ and $\delta$ one can choose a $\vartheta$ that optimizes the estimates in Theorem 1.2. In particular, if $\delta/\sigma\ge1/2$ one can choose $\vartheta=\delta/\sigma$, whereas if $\delta/\sigma<1/2$ one can choose $(\delta-d/2)/(\sigma-d)<\vartheta<1/2$.
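For the Dirichlet-type and Fejér-type weights appearing in the proofs of Corollaries 1.7 and 1.8, the closed forms of the kernels $\sum_{n}\Phi(N,n)e^{2\pi int}$ and the resulting decay in $N\|t\|$ can be checked numerically. The following Python sketch assumes the normalisations $\Phi(N,n)=(2N+1)^{-1}$ for $|n|\le N$ and $\Phi(N,n)=N^{-1}(1-|n|/N)$ for $|n|<N$; these are plausible readings of the corollaries and are not confirmed by the text, and the function names are hypothetical.

```python
import cmath
import math


def dist(t):
    """Distance of the real number t to the nearest integer."""
    return abs(t - round(t))


def kernel(weights, t):
    """Trigonometric sum  sum_n Phi(N, n) exp(2*pi*i*n*t)  for weights {n: Phi(N, n)}."""
    return sum(w * cmath.exp(2j * math.pi * n * t) for n, w in weights.items())


def dirichlet_type_weights(N):
    """Assumed weights 1/(2N+1) on |n| <= N."""
    return {n: 1.0 / (2 * N + 1) for n in range(-N, N + 1)}


def fejer_type_weights(N):
    """Assumed weights (1 - |n|/N)/N on |n| < N."""
    return {n: (1.0 - abs(n) / N) / N for n in range(-N + 1, N)}


def dirichlet_closed_form(N, t):
    return math.sin((2 * N + 1) * math.pi * t) / ((2 * N + 1) * math.sin(math.pi * t))


def fejer_closed_form(N, t):
    return (math.sin(math.pi * N * t) / (N * math.sin(math.pi * t))) ** 2


if __name__ == "__main__":
    N, t = 20, 0.37
    print(kernel(dirichlet_type_weights(N), t), dirichlet_closed_form(N, t))
    print(kernel(fejer_type_weights(N), t), fejer_closed_form(N, t))
```

Combined with the inequality $2\|t\|\le|\sin(\pi t)|$, the closed forms give the bounds $(2(2N+1)\|t\|)^{-1}$ and $(2N\|t\|)^{-2}$, which correspond to the exponents $\vartheta=1$ and $\vartheta=2$.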

Concluding remarks
Remark 3.1. The above corollaries show that the assumptions in Theorem 1.1 and Theorem 1.2 are not void. We now show that the assumptions in Theorem 1.5 and Theorem 1.6 are not void as well. Let us prove that for every $\vartheta>0$ there exists a positive weight $\Phi(N,n)$ which satisfies (1), with the property that $n\mapsto\Phi(N,n)$ has compact support for every $N$, and with the property that there exist constants $H>0$ and $K>0$ such that for every $t$ one has Let $j$ be a positive integer, and let It is easily verified that $G_N(t)$ is a trigonometric polynomial of degree $(N-1)j$. Hence, the convolution $F_N*G_N(t)$ is a trigonometric polynomial as well. From the inequalities $2\|t\|\le|\sin(\pi t)|\le\pi\|t\|$ one deduces that cannot be replaced by any index $\delta<3/2$. Indeed, for every $\delta<3/2$ there exists a function $f(x)\in W^{\delta,2}(\mathbb{T})$ such that lim sup for almost every $\alpha$. In order to show that this is true, observe that In Petersen [18] it is proved that if $0<\alpha,\beta<1$ the following are equivalent: It is easy to verify that (ii) is also equivalent to Let $\beta\in(0,1)$ be an algebraic number and set Such a function $f(x)$ belongs to $W^{\delta,2}(\mathbb{T})$ for every $\delta<3/2$. Since for every transcendental number $\alpha\in(0,1)$ condition (i) does not hold, (iii) does not hold either. This is exactly what we wanted to prove, thanks to (2) and the fact that almost every $\alpha\in(0,1)$ is a transcendental number.
Remark 3.3. It is curious to compare the above results on the speed of convergence in ergodic theorems with the approximation properties of Fourier series. Whereas our results suggest that stronger summation methods guarantee faster convergence, the approximation properties of partial sums and Fejér means of Fourier series seem to go in the opposite direction. Assume $d=1$ and denote by $S_Nf(x)$ and $F_Nf(x)$ the partial sums and the arithmetic means of the partial sums of the Fourier expansion of a function $f(x)$, The partial sums $S_Nf(x)$ may not converge, but the approximation is close to optimal. Indeed, if $\|S_N\|$ denotes the operator norm of the partial sums, the Lebesgue constant, if $E_N(f)$ denotes the best approximation in the supremum norm of $f(x)$ with trigonometric polynomials of degree at most $N$, and if $P_N(x)$ is the trigonometric polynomial of best approximation, then Finally, the means $F_Nf(x)$ always converge, but the approximation is never better than $c/N$, In particular, the partial sums may converge faster than the Fejér means.
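The contrast described in this remark is already visible for a single frequency. The Python sketch below evaluates the partial sums and the Fejér means of $f(x)=\cos(2\pi x)$ from its Fourier coefficients. It assumes the conventions $S_Nf(x)=\sum_{|m|\le N}\hat f(m)e^{2\pi imx}$ and $F_Nf(x)=N^{-1}\sum_{k=0}^{N-1}S_kf(x)=\sum_{|m|<N}(1-|m|/N)\hat f(m)e^{2\pi imx}$; the indexing used in the paper may differ, and the function names are hypothetical.

```python
import cmath
import math

COS_COEFFS = {1: 0.5, -1: 0.5}  # Fourier coefficients of cos(2*pi*x)


def partial_sum(coeffs, N, x):
    """S_N f(x): sum of fhat(m) exp(2*pi*i*m*x) over |m| <= N."""
    total = sum(c * cmath.exp(2j * math.pi * m * x)
                for m, c in coeffs.items() if abs(m) <= N)
    return complex(total).real


def fejer_mean(coeffs, N, x):
    """F_N f(x): sum of (1 - |m|/N) fhat(m) exp(2*pi*i*m*x) over |m| < N."""
    total = sum((1.0 - abs(m) / N) * c * cmath.exp(2j * math.pi * m * x)
                for m, c in coeffs.items() if abs(m) < N)
    return complex(total).real


if __name__ == "__main__":
    for N in (2, 4, 10):
        exact = math.cos(0.0)
        print(N, abs(partial_sum(COS_COEFFS, N, 0.0) - exact),
              abs(fejer_mean(COS_COEFFS, N, 0.0) - exact))
```

For this function the partial sums reproduce $f$ exactly as soon as $N\ge1$, while the Fejér means equal $(1-1/N)f$, so their error at $x=0$ is exactly $1/N$.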

Appendix
In this appendix we deal with logarithmic means. Such means are defined by the weights and the associated logarithmic discrepancy is See [7, Section 2.2] for references about these means and discrepancy. Although Theorem 1.1 and Theorem 1.2 do not immediately apply in this setting, due to the fact that the assumption (i) on the kernels $\sum_{n=-\infty}^{+\infty}\Phi(N,n)e^{2\pi int}$ is not satisfied, the proofs can be adapted to obtain some analogues of the above results. In particular, the above theorems apply to functions in Sobolev classes $W^{\delta,2}(\mathbb{T}^d)$ with $\delta>d/2$. The main ingredient in the proofs of both theorems is an estimate for the kernels $\sum_{n=-\infty}^{+\infty}\Phi(N,n)e^{2\pi int}$.
Lemma 4.3. For every $t\in\mathbb{T}$, Moreover, if $N$ is large enough and if $\|t\|\ge c/N$, the reverse inequality holds true as well.
Proof. A direct and explicit proof goes as follows. The inequality $\le1$ is obvious. In order to prove the other inequality it suffices to assume that $N>1$ and $|t|\le1/2$. An integration by parts gives The inequality $|e^{i\pi(n+1)t}-1|\le\pi(n+1)|t|$ implies that the sum over the set $\{1\le n\le N-1,\ n\le1/|t|\}$ is also uniformly bounded. Indeed,

$W^{\frac d2,2}(\mathbb{T}^d)$ contains unbounded functions, it easily follows that the smoothness assumption $\delta>d/2$ in Theorem 1.1 and Theorem 1.2 is necessary.
Theorem 1.4. There exists a function $f(x)$ in the Sobolev space $W^{\frac d2,2}(\mathbb{T}^d)$ such that for every irrational vector $\alpha$ and every $N$ one has

Theorem 4.1. If the function $f(x)$ has an absolutely convergent Fourier expansion, $\sum_{m\in\mathbb{Z}^d}|\hat f(m)|<+\infty$, then, for almost every $\alpha$, there exists a positive constant $c(f,\alpha)$ such that, for every positive integer $N$, one has

Theorem 4.2. If $\alpha$ is not a Liouville vector, that is, if there exist positive constants $H$ and $\sigma$ such that $\|\alpha\cdot m\|\ge H|m|^{-\sigma}$ for every $m\in\mathbb{Z}^d\setminus\{0\}$, then there exists a positive constant $c=c(d,H,\sigma)$ such that, for every positive integer $N$, $\sup_{x\in\mathbb{T}^d}\big|D^{\Phi,\alpha}_Nf(x)\big|\le c\,\log^{-1}(1+N)\sum_{m\in\mathbb{Z}^d}|\hat f(m)|\log(1+|m|)$.
) $n\sin(\pi t)$ = I + II + III. The inequalities $2\|t\|\le|\sin(\pi t)|\le\pi\|t\|$ imply that $\frac{\sin(\pi nt)}{n\sin(\pi t)}$ estimate for the term I is immediately obtained. In order to estimate II one can separately consider the sum where the index $n$ varies in the set $\{1\le n\le N-1,\ n\le1/|t|\}$ and the sum where the index varies in the set $\{1\le n\le N-1,\ n>1/|t|\}$. The latter set is empty if $|t|<1/(N-1)$. Otherwise, there is a uniform bound. Indeed,