Uniformity in the Wiener-Wintner theorem for nilsequences

We prove a uniform extension of the Wiener-Wintner theorem for nilsequences due to Host and Kra and a nilsequence extension of the topological Wiener-Wintner theorem due to Assani. Our argument is based on (vertical) Fourier analysis and a Sobolev embedding theorem.


INTRODUCTION
Let $(X, \mu)$ be a probability space and let $T : X \to X$ be an invertible measure-preserving transformation. The classical Wiener-Wintner theorem [WW41] asserts that for every $f \in L^1(X, \mu)$ there exists a subset $X' \subset X$ of full measure such that the weighted averages

(1.1) $\frac{1}{N} \sum_{n=1}^{N} f(T^n x) \lambda^n$

converge as $N \to \infty$ for every $x \in X'$ and every $\lambda$ in the unit circle $\mathbb{T}$. Over the years this theorem has been improved and generalized in many directions. For example, Lesigne [Les90, Les93] proved that the weights $(\lambda^n)$ can be replaced by polynomial sequences of the form $(\lambda_1^{p_1(n)} \cdots \lambda_k^{p_k(n)})$, $\lambda_j \in \mathbb{T}$, $p_j \in \mathbb{Z}[X]$ (or, equivalently, $(e^{2\pi i p(n)})$, $p \in \mathbb{R}[X]$). More recently Host and Kra [HK09, Theorem 2.22] showed that this can be enlarged to the class of nilsequences.
In a different direction, Bourgain's uniform Wiener-Wintner theorem [Bou90] asserts convergence of the averages (1.1) to zero for f orthogonal to the Kronecker factor uniformly in λ, cf. Assani [Ass03]. A joint extension of this result and Lesigne's polynomial Wiener-Wintner theorem has been obtained by Frantzikinakis [Fra06]. In the same spirit, our main result is a uniform version of the Wiener-Wintner theorem for nilsequences.
Let $G$ be a nilpotent Lie group with a cocompact lattice $\Gamma$. The compact manifold $G/\Gamma$ together with the Haar measure on it is called a nilmanifold. Using the universal covering we may and will assume that the connected component of the identity $G^o$ is simply connected. Let further $G_\bullet$ be a $\Gamma$-rational filtration of length $l$ on $G$ and $P(\mathbb{Z}, G_\bullet)$ be the group of $G_\bullet$-polynomials (we recall these notions in Section 2). Then for every polynomial $g \in P(\mathbb{Z}, G_\bullet)$ and $F \in C(G/\Gamma)$ we call the sequence $(F(g(n)\Gamma))_n$ a basic $l$-step nilsequence. An $l$-step nilsequence is a uniform limit of basic $l$-step nilsequences (which are allowed to come from different nilmanifolds and filtrations).
Nilsystems (i.e. rotations on nilmanifolds) and nilsequences appear naturally in connection with norm convergence of multiple ergodic averages [HK05]. The 1-step nilsequences are exactly the almost periodic sequences. For examples and a complete description of 2-step nilsequences see Host, Kra [HK08]. For a characterization of nilsequences of arbitrary step in terms of their local properties see [HKM10, Theorem 1.1]. Although it is possible to express basic nilsequences as basic nilsequences of the same step associated to "linear" sequences of the form $(g^n)_n$ (this is essentially due to Leibman [Lei05b], see e.g. Chu Lie groups), "polynomial" nilsequences, in addition to being formally more general, seem to be better suited for inductive purposes. This has been observed recently and utilized in connection with additive number theory, see e.g. Green, Tao, Ziegler [GTZ10] and Green, Tao [GT10].
From now on we fix a tempered Følner sequence $(\Phi_N)_N$ in $\mathbb{Z}$. For an ergodic system $(X, \mu, T)$ we denote the Host-Kra factor of order $l$, defined in [HK05], by $\mathcal{Z}_l(X)$. We also denote the Sobolev spaces on $G/\Gamma$ by $W^{j,p}(G/\Gamma)$. All these notions are recalled in Section 2. Our main result, Theorem 4.1, has the following consequence.
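Theorem 1.2 (Uniform Wiener-Wintner for nilsequences). Assume that $(X, \mu, T)$ is ergodic and let $f \in L^1(X)$ be such that $\mathbb{E}(f | \mathcal{Z}_l(X)) = 0$. Let further $G/\Gamma$ be a nilmanifold with a $\Gamma$-rational filtration $G_\bullet$ on $G$ of length $l$. Then for a.e. $x \in X$ we have

(1.3) $\lim_{N\to\infty} \sup_{g \in P(\mathbb{Z}, G_\bullet),\, F \in W^{k,2^l}(G/\Gamma)} \|F\|_{W^{k,2^l}(G/\Gamma)}^{-1} \left| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(T^n x) F(g(n)\Gamma) \right| = 0$,

where $k = \sum_{r=1}^{l} (d_r - d_{r+1}) \binom{l}{r-1}$ with $d_i = \dim G_i$. If in addition $(X, T)$ is a uniquely ergodic topological dynamical system and $f \in C(X)$ then we have

(1.4) $\lim_{N\to\infty} \sup_{g \in P(\mathbb{Z}, G_\bullet),\, F \in W^{k,2^l}(G/\Gamma),\, x \in X} \|F\|_{W^{k,2^l}(G/\Gamma)}^{-1} \left| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(T^n x) F(g(n)\Gamma) \right| = 0$.

In view of a counterexample in Section 5 the Sobolev norm cannot be replaced by the $L^\infty$ norm. On the other hand, we have not investigated whether the above order $k$ is optimal and believe that it is not.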
The conclusion (1.3) differs from the uniform polynomial Wiener-Wintner theorem of Frantzikinakis [Fra06] in several aspects. First, our class of weights is considerably more general, comprising all nilsequences rather than polynomial phases (a polynomial phase $f(p(n)\mathbb{Z})$, $f \in C(\mathbb{R}/\mathbb{Z})$, $p \in \mathbb{R}[X]$, is also a nilsequence of step $\deg p$ with the filtration $\mathbb{R} = \dots = \mathbb{R} \ge \{0\}$ of length $\deg p$ and cocompact lattice $\mathbb{Z}$). Also, our result does not require total ergodicity, an assumption that cannot be omitted in the result of Frantzikinakis. The price for these improvements is that we have to assume the function to be orthogonal to the Host-Kra factor and not only to the Abramov factor of order $l$ (i.e. the factor generated by the generalized eigenfunctions of order $\le l$).
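To make the parenthetical remark concrete, here is a small worked check for the illustrative choice $p(n) = \alpha n^2$. The polynomial, the derivative notation $D_k p(n) := p(n+k) - p(n)$ and the function $f(t + \mathbb{Z}) = e^{2\pi i t}$ are assumptions made for this example and are not taken from the text. Each discrete derivative lowers the degree by one, so three derivatives annihilate $p$, as one expects of a polynomial sequence for the filtration $\mathbb{R} = \mathbb{R} = \mathbb{R} \ge \{0\}$ of length $2 = \deg p$:

```latex
% Illustrative example: p(n) = \alpha n^2, D_k p(n) := p(n+k) - p(n)
\begin{align*}
D_k p(n)         &= \alpha (n+k)^2 - \alpha n^2 = 2\alpha k n + \alpha k^2, \\
D_j D_k p(n)     &= 2\alpha k (n+j) - 2\alpha k n = 2\alpha j k, \\
D_i D_j D_k p(n) &= 2\alpha j k - 2\alpha j k = 0 .
\end{align*}
```

With this choice $e^{2\pi i \alpha n^2} = f(p(n)\mathbb{Z})$ is a basic $2$-step nilsequence on $\mathbb{R}/\mathbb{Z}$ in the sense described above.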
The conclusion (1.4) generalizes a result of Assani [Ass03, Theorem 2.10], which corresponds essentially to the case $l = 1$. Note that without the orthogonality assumption on the function, everywhere convergence can fail even for averages (1.1) for some $\lambda \in \mathbb{T}$. For more information on this phenomenon we refer to Robinson [Rob94], Assani [Ass03] and Lenz [Len09].
Let $G_\bullet$ be a $\Gamma$-rational filtration on $G$ and $g \in P(\mathbb{Z}, G_\bullet)$ be a polynomial sequence. By Leibman [Lei05b, Theorem B] the sequence $g(n)\Gamma$ is contained and equidistributed in a finite union $\tilde Y$ of sub-nilmanifolds of $G/\Gamma$. For a Riemann integrable function $F : \tilde Y \to \mathbb{C}$ we call the bounded sequence $(F(g(n)\Gamma))_n$ a basic generalized $l$-step nilsequence (one obtains the same notion upon replacing the polynomial $g(n)$ by a "linear" polynomial $(g^n)_n$). A generalized $l$-step nilsequence is a uniform limit of basic generalized $l$-step nilsequences.
A concrete example of a generalized nilsequence is $(e^{i[n\alpha]n\beta})$ for $\alpha, \beta \in \mathbb{R}$ or, more generally, bounded sequences of the form $(p(n))$ and $(e^{ip(n)})$ for a generalized polynomial $p$, i.e., a function obtained from conventional polynomials using addition, multiplication, and taking the integer part, see Bergelson, Leibman [BL07].
We also obtain an extension of the Wiener-Wintner theorem for nilsequences due to Host and Kra [HK09, Cor. 2.23 and its proof] to non-ergodic systems.

Theorem 1.5 (Wiener-Wintner for generalized nilsequences). For every $f \in L^1(X, \mu)$ there exists a set $X' \subset X$ of full measure such that for every $x \in X'$ the averages

(1.6) $\frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} a_n f(T^n x)$

converge for every generalized nilsequence $(a_n)$. If in addition $(X, T)$ is a uniquely ergodic topological dynamical system, $f \in C(X)$ and the projection $\pi : X \to \mathcal{Z}_l(X)$ is continuous for some $l$ then the averages (1.6) converge for every $x \in X$ and every $l$-step generalized nilsequence $(a_n)$.

See Host, Kra and Maass [HKM12, remarks following Theorem 3.5] for examples of systems for which the additional hypothesis is satisfied.
A consequence of this result concerning norm convergence of weighted polynomial multiple ergodic averages due to Chu [Chu09], cf. Host, Kra [HK09] for the linear case, is discussed in Section 7.
Acknowledgment. The work on the paper began during the first author's research visit to the University of California, Los Angeles. She is deeply grateful to her host Terence Tao for many helpful and motivating discussions without which the paper would not have been written. She thanks UCLA and its analysis group for perfect working conditions and friendly and pleasant atmosphere. The authors thank Bernard Host and Bryna Kra for their comments, Idris Assani for references and Example 5.1, Nikos Frantzikinakis for a hint regarding non-ergodic systems and the anonymous referees for corrections and helpful suggestions.

NOTATION AND TOOLS
We begin with the notions and tools needed. Throughout the paper we assume an $L^\infty$-function to be defined everywhere.
Definition 2.1 (Følner sequence). A sequence $(\Phi_n)$ of finite subsets of a discrete group $G$ is called Følner if for every $g \in G$ we have $|g\Phi_n \triangle \Phi_n| / |\Phi_n| \to 0$ as $n \to \infty$. Moreover, a Følner sequence is called tempered (or said to satisfy Shulman's condition) if there exists $C > 0$ such that for every $n \in \mathbb{N}$ one has

$\left| \bigcup_{i<n} \Phi_i^{-1} \Phi_n \right| \le C |\Phi_n|$.

Recall that the maximal function is defined by

$Mf(x) := \sup_{N} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} |f(T^n x)|$.

Lindenstrauss' maximal inequality [Lin01, Theorem 3.2] asserts that for every $f \in L^1(X)$ and every $\lambda > 0$ we have

(2.2) $\mu\{Mf > \lambda\} \lesssim \lambda^{-1} \|f\|_1$,

where the implied constant depends only on the constant in the temperedness condition.
Definition 2.3 (Generic point). Let $(\Phi_N)$ be a tempered Følner sequence in $\mathbb{Z}$, $(X, \mu, T)$ be an ergodic system, and let $f \in L^\infty(X, \mu)$. We call $x \in X$ generic for $f$ with respect to $(\Phi_N)$ if

$\lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(T^n x) = \int_X f \, d\mu$.

We call $x \in X$ fully generic for $f$ w.r.t. $(\Phi_N)$ if it is generic for every function $g$ in the (separable) $T$-invariant subalgebra generated by $f$.
By a generalization by Lindenstrauss [Lin01, Theorem 1.2] of Birkhoff's ergodic theorem to tempered Følner sequences, generic and hence fully generic points form a set of full measure. The temperedness assumption cannot be dropped even for sequences of intervals with growing length in $\mathbb{Z}$, see del Junco, Rosenblatt [dJR79] and Rosenblatt, Wierdl [RW92]. We refer to Butkevich [But01] for an overview on pointwise convergence of ergodic averages along Følner sequences in $\mathbb{Z}$ and general groups, examples and further references.
A measure-preserving system $(X, \mu, T)$ is called regular if $X$ is a compact metric space, $\mu$ is a Borel probability measure and $T$ is continuous. Every measure-preserving system is measurably isomorphic to a regular measure-preserving system upon restriction to a separable $T$-invariant sub-$\sigma$-algebra [Fur81, §5.2]. The ergodic decomposition of the measure on a regular measure-preserving system $(X, \mu, T)$ is a measurable map $x \mapsto \mu_x$ from $X$ to the space of $T$-invariant ergodic Borel probability measures on $X$, unique up to equality $\mu$-a.e., such that $\mu$-a.e. $x \in X$ is generic for every $f \in C(X)$ with respect to $\mu_x$.

Definition 2.4 (Gowers-Host-Kra seminorms). For a probability measure preserving system $(X, \mu, T)$ and $f \in L^\infty(X, \mu)$, the Gowers-Host-Kra seminorms are defined recursively by

$\|f\|_{U^0(X,\mu)} := \int_X f \, d\mu, \qquad \|f\|_{U^{l+1}(X,\mu)}^{2^{l+1}} := \limsup_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \|T^n f \, \bar f\|_{U^l(X,\mu)}^{2^l}$.

We will write $U^l(X)$ instead of $U^l(X, \mu)$ if no confusion is possible.
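The recursive definition lends itself to inductive estimates. As an illustration, the following sketch bounds the seminorms by $L^p$ norms in the ergodic case. It assumes the recursion $\|f\|_{U^{l+1}}^{2^{l+1}} = \limsup_N \frac{1}{N}\sum_{n=1}^N \|T^n f\,\bar f\|_{U^l}^{2^l}$ with $\|f\|_{U^0} = \int f\,d\mu$, which is our reading of Definition 2.4; the inductive hypothesis is $\|h\|_{U^l} \le \|h\|_{L^{2^{l-1}}}$ for all bounded $h$.

```latex
% Sketch under the stated assumptions (ergodic system, recursion as in the lead-in).
\begin{align*}
\|f\|_{U^1(X)}^{2}
  &= \lim_{N\to\infty}\frac1N\sum_{n=1}^N \int_X T^n f\,\bar f\,d\mu
   = \Big|\int_X f\,d\mu\Big|^2 \le \|f\|_{L^1(X)}^2
   && \text{(mean ergodic theorem)},\\
\|f\|_{U^{l+1}(X)}^{2^{l+1}}
  &= \limsup_{N\to\infty}\frac1N\sum_{n=1}^N \|T^n f\,\bar f\|_{U^l(X)}^{2^l}
   \le \sup_{n}\|T^n f\,\bar f\|_{L^{2^{l-1}}(X)}^{2^l}
   && \text{(inductive hypothesis)},\\
  &\le \sup_n \big(\|T^n f\|_{L^{2^l}(X)}\,\|f\|_{L^{2^l}(X)}\big)^{2^l}
   = \|f\|_{L^{2^l}(X)}^{2^{l+1}}
   && \text{(Cauchy--Schwarz, $T$ preserves $\mu$)}.
\end{align*}
```

Hence $\|f\|_{U^l(X)} \le \|f\|_{L^{2^{l-1}}(X)}$ for every $l \ge 1$ when the system is ergodic. Without ergodicity the same induction can be started at $l = 0$ with $|\int f\,d\mu| \le \|f\|_{L^1}$, which gives the weaker bound $\|f\|_{U^l(X)} \le \|f\|_{L^{2^l}(X)}$.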
These seminorms (that are indeed seminorms for $l \ge 1$) were introduced by Host and Kra in the ergodic case [HK05] and also make sense in the non-ergodic case, as pointed out by Chu, Frantzikinakis and Host [CFH11]. The limit superior in the above definition is in fact a limit, as follows from the characterization of these seminorms via cube spaces [HK05, §3.5] and the mean ergodic theorem. It follows by induction on $l \in \mathbb{N}$ that, for ergodic systems,

(2.5) $\|f\|_{U^l(X)} \le \|f\|_{L^{2^{l-1}}(X)}$,

see [ET12] for subtler analysis. Moreover, if $\mu = \int \mu_x \, d\mu(x)$ is the ergodic decomposition then

$\|f\|_{U^l(X,\mu)}^{2^l} = \int \|f\|_{U^l(X,\mu_x)}^{2^l} \, d\mu(x)$.

If $(X, \mu, T)$ is ergodic then for each $l$ there is a factor $\mathcal{Z}_l(X)$ of $X$, called the Host-Kra factor of order $l$, that is an inverse limit of $l$-step nilsystems and is such that for all $f \in L^\infty(X)$

$\|f\|_{U^{l+1}(X)} = 0 \iff \mathbb{E}(f | \mathcal{Z}_l(X)) = 0$.

Since the uniformity seminorms are bounded by the supremum norm and invariant under $T$ and complex conjugation they can also be calculated using smoothed averages

(2.6) $\|f\|_{U^{l+1}(X)}^{2^{l+1}} = \lim_{K\to\infty} \frac{1}{K^2} \sum_{k=-K}^{K} (K - |k|) \, \|T^k f \, \bar f\|_{U^l(X)}^{2^l}$.

This will allow us to use the following quantitative version of the classical van der Corput estimate (the proof is included for completeness). Here $o_K(1)$ stands for a quantity that goes to zero for each fixed $K$ as $N \to \infty$.

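Lemma 2.7 (Van der Corput). Let $(\Phi_N)_N$ be a Følner sequence in $\mathbb{Z}$ and $(u_n)_{n \in \mathbb{Z}}$ be a sequence in a Hilbert space with norm bounded by $C$. Then for every $K > 0$ we have

$\left\| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} u_n \right\|^2 \le \frac{2}{K^2} \sum_{k=-K}^{K} (K - |k|) \left| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \langle u_n, u_{n+k} \rangle \right| + C^2 o_K(1)$.

Proof. Let $K > 0$ be given. By the definition of a Følner sequence we have

$\frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} u_n = \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \frac{1}{K} \sum_{k=1}^{K} u_{n+k} + C \, o_K(1)$.

By Hölder's inequality

$\left\| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \frac{1}{K} \sum_{k=1}^{K} u_{n+k} \right\|^2 \le \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \left\| \frac{1}{K} \sum_{k=1}^{K} u_{n+k} \right\|^2 \le \frac{1}{K^2} \sum_{k=-K}^{K} (K - |k|) \left| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \langle u_n, u_{n+k} \rangle \right| + C^2 o_K(1)$,

and the claim follows using the estimate $(a + b)^2 \le 2a^2 + 2b^2$.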
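As a sanity check of how such an estimate is used, consider the classical example $u_n = e(\alpha n^2)$ with $e(t) := e^{2\pi i t}$, $\alpha$ irrational, $\Phi_N = \{1, \dots, N\}$ and $C = 1$. The sketch below assumes the van der Corput estimate in the form $\|\frac{1}{|\Phi_N|}\sum_{n\in\Phi_N} u_n\|^2 \le \frac{2}{K^2}\sum_{k=-K}^{K}(K-|k|)\,|\frac{1}{|\Phi_N|}\sum_{n\in\Phi_N}\langle u_n,u_{n+k}\rangle| + C^2 o_K(1)$ with $\langle u, v\rangle = u\bar v$; the example sequence and the notation $e(\cdot)$ are our own choices.

```latex
% Worked example: quadratic phases average to zero (alpha irrational).
\begin{align*}
\langle u_n, u_{n+k}\rangle
  &= e(\alpha n^2)\,\overline{e(\alpha (n+k)^2)} = e(-\alpha k^2)\, e(-2\alpha k n),\\
\Big|\frac1N\sum_{n=1}^N \langle u_n, u_{n+k}\rangle\Big|
  &= \frac1N\Big|\sum_{n=1}^N e(-2\alpha k n)\Big|
   \le \frac{1}{N\,|\sin(2\pi\alpha k)|} \qquad (k \ne 0),\\
\Big|\frac1N\sum_{n=1}^N e(\alpha n^2)\Big|^2
  &\le \frac{2}{K^2}\Big( K + \sum_{0<|k|\le K} \frac{K-|k|}{N\,|\sin(2\pi\alpha k)|}\Big) + o_K(1)
   \xrightarrow[N\to\infty]{} \frac{2}{K}.
\end{align*}
```

Since $K$ is arbitrary, the averages of $e(\alpha n^2)$ tend to zero. The term $k = 0$ contributes exactly $2/K$, which is why one lets $K \to \infty$ only after $N \to \infty$.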
We now recall the notions of a (nilpotent) (pre-)filtration and a polynomial sequence. Since in this article we always work in the category of Lie groups we demand all groups in any prefiltration to be Lie. As mentioned in the introduction, we only consider Lie groups in which the connected component of the identity is simply connected.
The sequence that consists of the trivial group is called the prefiltration of length $-\infty$. A filtration (on a group $G$) is a prefiltration $G_\bullet$ such that $G_0 = G_1$ (and $G_0 = G$).
Although prefiltrations behave well in algebraic constructions, in our analytic arguments we will have to work with filtrations. Note that in a prefiltration $G_\bullet$ of length $l$, the subgroup $G_l$ need not be central in $G_0$.
It is well-known that the lower central series on a nilpotent Lie group $G$ is a filtration on $G$. If $G_\bullet$ is a prefiltration of length $l$ and $t \le l$ then $G_{\bullet+t}$ denotes the prefiltration of length $l - t$ given by $(G_{\bullet+t})_i = G_{i+t}$. We will denote the dimension of a Lie group by $d = \dim G$ and the dimension of the $i$-th group in a prefiltration by $d_i = \dim G_i$. We define $G_\bullet$-polynomial sequences by induction on the length of the prefiltration.
is $G_{\bullet+1}$-polynomial. We write $P(\mathbb{Z}, G_\bullet)$ for the set of $G_\bullet$-polynomial maps.
By a result originally due to Leibman [Lei02] (see [ZK12] for a short proof) the set $P(\mathbb{Z}, G_\bullet)$ is in fact a group under pointwise operations and the sequence $P(\mathbb{Z}, G_\bullet) \ge P(\mathbb{Z}, G_{\bullet+1}) \ge \dots \ge P(\mathbb{Z}, G_{\bullet+l})$ is a prefiltration. We will not need the full strength of this result, but merely that the product of a $G_\bullet$-polynomial sequence with any constant sequence in $G_0$ is again $G_\bullet$-polynomial (this can be easily seen from the definition).
Finally we outline a special case of the cube construction of Green, Tao and Ziegler [GTZ10, Definition B.2] using notation of Green and Tao [GT12, Proposition 7.2]. We will only have to perform it on filtrations, but even in this case the result is in general only a prefiltration.
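Definition 2.12 (Cube construction). Given a prefiltration $G_\bullet$ we define the prefiltration $G^\square_\bullet$ by

$G^\square_i := G_i \times_{G_{i+1}} G_i = \{(g_0, g_1) \in G_i^2 : g_0^{-1} g_1 \in G_{i+1}\} = G_i^\triangle (G_{i+1} \times G_{i+1})$,

where $G^\triangle = \{(g, g) : g \in G\}$ is the diagonal group corresponding to $G$. By an abuse of notation we refer to the filtration obtained from $G^\square_\bullet$ by replacing $G^\square_0$ with $G^\square_1$ as the "filtration $G^\square_\bullet$".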
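A minimal worked instance may help to see why the cube construction lowers the step once the top diagonal group is factored out. The sketch assumes the standard form of the construction, $G^\square_i = \{(g_0, g_1) \in G_i^2 : g_0^{-1} g_1 \in G_{i+1}\}$, and uses the illustrative abelian filtration $\mathbb{R} = \mathbb{R} = \mathbb{R} \ge \{0\}$ of length $2$ written additively; both the example and the identification map $(x, y) \mapsto x - y$ are our own choices.

```latex
% Cube construction for G_0 = G_1 = G_2 = \mathbb{R}, G_3 = \{0\} (additive notation).
\begin{align*}
G^\square_0 &= G^\square_1 = \{(x,y)\in\mathbb{R}^2 : y - x \in \mathbb{R}\} = \mathbb{R}^2,\\
G^\square_2 &= \{(x,y)\in\mathbb{R}^2 : y - x \in G_3 = \{0\}\} = \mathbb{R}^\triangle,\\
G^\square_3 &= \{(0,0)\},\\
G^\square_1 / G^\square_2 &= \mathbb{R}^2/\mathbb{R}^\triangle \cong \mathbb{R}
   \quad\text{via } (x,y) \mapsto x - y,
   \qquad G^\square_2 / G^\square_2 = \{0\}.
\end{align*}
```

So the prefiltration $G^\square_\bullet$ still has length $2$, but its last nontrivial group is the diagonal $\mathbb{R}^\triangle$, and the quotient by this diagonal carries a filtration of length $1$.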
To see that this indeed defines a prefiltration let We show by induction on the length of the prefiltration $G_\bullet$ that for every $k \in \mathbb{Z}$ the map

As remarked earlier, the prefiltration $G^\square_\bullet$ and the filtration $G^\square_\bullet$ are in general distinct concepts. Also the map $g^\square_k$ is in general not polynomial with respect to the filtration $G^\square_\bullet$ since it need not take values in $G^\square_1$. However, this is a very mild obstacle and a slight modification of $g^\square_k$ will work. A natural candidate is $g^\square_k(0)^{-1} g^\square_k$, but later in the proof this choice would lead to shifts of a function on $G/\Gamma$ by $g(k)$ for every $k$, and there is no useful control on Sobolev norms of such shifts in terms of Sobolev norms of the original function. Instead we would like to shift only by elements that belong to a fixed compact set and this requires a more sophisticated modification.

This follows from local homeomorphy of $G$ and $G/\Gamma$, from local compactness of $G$ and from compactness of $G/\Gamma$. For example, for $G = \mathbb{R}$ and $\Gamma = \mathbb{Z}$ the fundamental domain $K$ can be taken to be the interval $[0, 1)$ with the fractional part map $\{\cdot\}$. In case of a general connected Lie group the fundamental domain can be taken to be $[0, 1)^{\dim G}$ in Mal'cev coordinates [GT12, Lemma A.14], but we do not need this information.
For each nilmanifold that we consider we fix some map {·} as above and define (2.14)g k : This is the conjugate of We will use Mal'cev bases adapted to filtrations in the sense of [GT12, Definition 2.1] with the additional twist that we consider not necessarily connected Lie groups. This provides additional generality since, by the remark following [HK08, Theorem 3], not every nilsequence arises from nilmanifolds associated to connected Lie groups.
Definition 2.15 (Mal'cev basis adapted to a filtration). Let $G$ be a nilpotent Lie group with a cocompact lattice $\Gamma$ and a filtration $G_\bullet$ of length $l$ that consists of connected, simply connected Lie groups. An ordered basis $\{X_1, \dots, X_d\}$ for the Lie algebra of $G$ is called a Mal'cev basis for $G/\Gamma$ adapted to $G_\bullet$ if the following conditions are satisfied.
(3) The lattice $\Gamma$ consists precisely of the elements with integer Mal'cev coordinates.

The lower central series on a (not necessarily connected) nilpotent Lie group $G$ is $\Gamma$-rational for every cocompact lattice $\Gamma$ [Mal49]. In this case Mal'cev coordinates on $G^o$ are usually called coordinates of the second kind. Any subfiltration of a rational filtration is clearly rational.
Definition 2.17 (Sobolev space). Let $G/\Gamma$ be a nilmanifold with a $\Gamma$-rational filtration, so in particular we have a Mal'cev basis $\{X_1, \dots, X_d\}$ for the Lie algebra of $G$. We identify the vectors $X_i$ with their extensions to right invariant vector fields on $G/\Gamma$. The Sobolev space $W^{j,p}(G/\Gamma)$, $j \in \mathbb{N}$, $1 \le p < \infty$, is defined by the norm

Finally, since we will use induction over rational filtrations in the proof of our main result and the inductive hypothesis will involve $G^\square_\bullet$, we have to show that this filtration is rational provided that $G_\bullet$ is rational. This follows from the next lemma.
Lemma 2.18 (Rationality of the cube filtration). Let G • be a Γ-rational filtration. Then the filtration Thus it suffices to show that Γ i (G i ) o has finite index in G i . By the assumption Γ i G o i has finite index in G i for each i, so it contains a finite index normal subgroup N i ≤ G i and we can write G i = A i N i with a finite set A i . With this notation we have For every a ∈ G i+1 and n ∈ N i we have [a,

VERTICAL CHARACTERS
Let $G$ be a nilpotent Lie group with a cocompact lattice $\Gamma$ and a $\Gamma$-rational filtration $G_\bullet$ of length $l$, so that $\Gamma_i = \Gamma \cap G_i$ is a cocompact lattice in $G_i$ for each $i = 1, \dots, l$. Then $G/\Gamma$ is a smooth principal bundle with the compact abelian Lie structure group $G_l/\Gamma_l$. The fibers of this bundle are called "vertical" tori (as opposed to the "horizontal" torus $G/\Gamma G_2$) and everything related to Fourier analysis on them is called "vertical".
Definition 3.1 (Vertical character). Let $G/\Gamma$ be a nilmanifold and $G_\bullet$ a $\Gamma$-rational filtration on $G$. A measurable function $F$ on $G/\Gamma$ is called a vertical character if there exists a character $\chi \in \widehat{G_l/\Gamma_l}$ such that for every $g_l \in G_l$ and a.e. $y \in G/\Gamma$ we have $F(g_l y) = \chi(g_l \Gamma_l) F(y)$.
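As an illustration of the definition, averaging any $F \in L^2(G/\Gamma)$ against a character along the vertical fibers produces a vertical character. The sketch below assumes the usual fiberwise Fourier coefficient $F_\chi(y) := \int_{G_l/\Gamma_l} F(g_l y)\,\overline{\chi(g_l)}\,dg_l$ (with $dg_l$ the normalized Haar measure on $G_l/\Gamma_l$ and $\chi$ also regarded as a function on $G_l$); this formula is our assumption and not quoted from the text.

```latex
% Check that F_\chi transforms by \chi under vertical translations h \in G_l.
\begin{align*}
F_\chi(h y)
  &= \int_{G_l/\Gamma_l} F(g_l h y)\,\overline{\chi(g_l)}\,dg_l
   = \int_{G_l/\Gamma_l} F(g' y)\,\overline{\chi(g' h^{-1})}\,dg'
   && (g' = g_l h,\ \text{Haar measure is invariant})\\
  &= \overline{\chi(h^{-1})}\int_{G_l/\Gamma_l} F(g' y)\,\overline{\chi(g')}\,dg'
   = \chi(h)\,F_\chi(y)
   && (|\chi| = 1).
\end{align*}
```

The computation only uses that $G_l/\Gamma_l$ is a compact abelian group acting on $G/\Gamma$, so it applies equally when this group is disconnected.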
The key ingredient of our proof is the following modification of a construction due to Green and Tao, see e.g. [Tao12, Lemma 1.6.13] and [GT12, §7], which shows that discrete derivatives of vertical character nilsequences are nilsequences of lower step.
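Let $F$ be a smooth vertical character, $g \in P(\mathbb{Z}, G_\bullet)$ and $a_n = F(g(n)\Gamma)$ be the corresponding basic nilsequence. A calculation shows that for every $k \in \mathbb{Z}$ we have

$a_{n+k} \overline{a_n} = (F \otimes \overline{F})(g(n+k)\Gamma, g(n)\Gamma) = \tilde F_k(\tilde g_k(n) \Gamma^\square)$,

where $\tilde F_k$ is the restriction of $\{g(k)\}F \otimes \overline{\{g(0)\}F}$ from $G_1^2$ to $G^\square_1$. Recall that the filtration $G^\square_\bullet$ is $\Gamma^\square$-rational by Lemma 2.18. Since $F$ is a vertical character, $\{g(k)\}F \otimes \overline{\{g(0)\}F}$ is $G^\square_l$-invariant (note that $G^\square_l = G^\triangle_l$), so that $\tilde F_k$ is well-defined on $\tilde Y = \tilde G_1/\tilde \Gamma_1$, where $\tilde G = G^\square_1/G^\triangle_l$ is a nilpotent group with the cocompact lattice $\tilde \Gamma = \Gamma^\square G^\triangle_l/G^\triangle_l$ and the $\tilde \Gamma$-rational filtration $\tilde G_i = G^\square_i/G^\triangle_l$, $i = 1, \dots, l$, $\tilde G_0 = \tilde G_1$. Abusing the notation we may consider $\tilde g_k(n)$ also as an element of $P(\mathbb{Z}, \tilde G_\bullet)$, so $a_{n+k} \overline{a_n} = \tilde F_k(\tilde g_k(n) \tilde \Gamma)$ is a basic nilsequence of step $l - 1$.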
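The simplest instance of this step reduction can be checked by hand. Take the illustrative data $G = \mathbb{R}$, $\Gamma = \mathbb{Z}$ with the filtration $\mathbb{R} = \mathbb{R} = \mathbb{R} \ge \{0\}$ of length $l = 2$, the function $F(t + \mathbb{Z}) = e(t) := e^{2\pi i t}$ and $g(n) = \alpha n^2$; these choices are ours and only serve as an example. Then $F$ is a vertical character (translation by $s \in G_2 = \mathbb{R}$ multiplies $F$ by $e(s)$), and the discrete derivative of $a_n = e(\alpha n^2)$ is a $1$-step nilsequence:

```latex
% Discrete derivative of a quadratic phase is a linear phase.
\begin{align*}
F(s + t + \mathbb{Z}) &= e(s)\,F(t + \mathbb{Z}) \qquad (s \in G_2 = \mathbb{R}),\\
(F\otimes\overline{F})(t_0 + s,\, t_1 + s) &= e(s)\overline{e(s)}\,F(t_0)\overline{F(t_1)}
   = (F\otimes\overline{F})(t_0, t_1),\\
a_{n+k}\,\overline{a_n} &= e\big(\alpha (n+k)^2 - \alpha n^2\big) = e(\alpha k^2)\, e(2\alpha k\, n).
\end{align*}
```

The second line is the invariance under the diagonal group that allows passing to the quotient, and the third line exhibits $a_{n+k}\overline{a_n}$ as a constant multiple of the linear phase $e(2\alpha k n)$, that is, as an almost periodic sequence.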
We will write $A \lesssim_D B$ if $A$ and $B$ satisfy the inequality $A \le C B$ with some constant $C$ that depends on some auxiliary constant(s) $D$ and some geometric data.

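Lemma 3.2 (Control on Sobolev norms in the cube construction). With the above notation we have

$\|\tilde F_k\|_{W^{j,p}(\tilde G/\tilde \Gamma)} \lesssim_{j,p} \|F\|_{W^{j,2p}(G/\Gamma)}^2$ for every $j \in \mathbb{N}$ and $1 \le p < \infty$,

where the implied constant does not depend on $k$ and $F$.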
Proof. For the Mal'cev basis onG/Γ that is induced by the Mal'cev basis on G 1 /Γ we have so it suffices to estimate the latter quantity.
To this end observe that the Haar measure on $G^\square_1/\Gamma^\square$ is a self-joining of the Haar measure on $G/\Gamma$ under the canonical projections to the coordinates. Therefore and by the Cauchy-Schwarz inequality we have

$\|F_0 \otimes F_1\|_{L^p(G^\square_1/\Gamma^\square)} \le \|F_0\|_{L^{2p}(G/\Gamma)} \|F_1\|_{L^{2p}(G/\Gamma)}$

for any smooth functions $F_0, F_1$ on $G/\Gamma$. Now recall that $\{g(k)\} \in K$ for some fixed compact set $K \subset G_1$, so that by smoothness of the group operation $\|\{g(k)\}F\|_{L^{2p}(G/\Gamma)} \lesssim \|F\|_{L^{2p}(G/\Gamma)}$, and analogously for $\{g(0)\}F$. Similar calculations for the derivatives lead to the bound claimed in the lemma.

Definition 3.4 (Vertical Fourier series). Let $G/\Gamma$ be a nilmanifold and $G_\bullet$ be a $\Gamma$-rational filtration on $G$. For every $F \in L^2(G/\Gamma)$ and $\chi \in \widehat{G_l/\Gamma_l}$ let

(3.5) $F_\chi(y) := \int_{G_l/\Gamma_l} F(g_l y) \overline{\chi(g_l)} \, dg_l$.

With this definition $F_\chi$ is defined almost everywhere and is a vertical character as witnessed by the character $\chi$. The usual Fourier inversion formula implies that $F = \sum_{\chi \in \widehat{G_l/\Gamma_l}} F_\chi$ in $L^2(G/\Gamma)$. We further need the following variant of Bessel's inequality.
Proof. Since vertical characters have constant absolute value on G l /Γ l -fibers, we have by (3.5) and the Cauchy-Schwarz inequality for every χ. By the Plancherel identity and Hölder's inequality this implies finishing the proof.
It is worth mentioning that there is also a Plancherel-type identity

Proof. The compact abelian Lie group $G_l/\Gamma_l$ is isomorphic to a product of a torus and a finite group. In order to keep notation simple we will consider the case $G_l/\Gamma_l \cong \mathbb{T}^{d_l}$; the conclusion for disconnected $G_l/\Gamma_l$ follows easily from the connected case. We rescale the last $d_l$ elements of the Mal'cev basis in such a way that they correspond to the unit tangential vectors at the origin of the torus $\mathbb{T}^{d_l}$. The characters on $G_l/\Gamma_l$ are then given by By the centrality of $G_l$ the operations of taking derivatives along elements of the Mal'cev basis and taking the $\chi$-th vertical character (3.5) commute, so we have for every $j \in \mathbb{N}$. The same argument works if some of the indices $(m_1, \dots, m_{d_l})$ vanish, in which case a smaller number of derivatives is added to $j$, and thus altogether $\sum_m \|F_{\chi_m}\|_{W^{j,p}} \lesssim \|F\|_{W^{j+d_l,p}}$.
We will need an estimate on the $L^\infty$ norm of a vertical character in terms of a Sobolev norm with minimal smoothness requirements. To this end we would like to use a Sobolev embedding theorem on $G/\Gamma G_l$ since this manifold has lower dimension than $G/\Gamma$. Morally, a vertical character is a function on the base space $G/\Gamma G_l$ that is extended to the principal $G_l/\Gamma_l$-bundle $G/\Gamma$ in a multiplicative fashion. However, in general this bundle lacks a global cross-section, so we are forced to work locally.

Proof. The case $p = \infty$ is clear, so we may assume $p < \infty$.
Since Γ is discrete there exists a neighborhood U ⊂ G of the identity such that the quotient map U → G/Γ is a diffeomorphism onto its image. Let M ⊂ G be a (d−d l )-dimensional submanifold that intersects G l in e G transversely. By joint continuity of multiplication in G we may find neighborhoods of identity V ⊂ G l and W ⊂ M such that V W ⊂ U. By transversality the differential of the map ψ : V × W → G, (v, w) → vw is invertible at (e G , e G ), so by the inverse function theorem and shrinking V, W if necessary we may assume that ψ is a diffeomorphism onto its image. We may also assume that V, W are connected, simply connected and have smooth boundaries. Recalling that the quotient map U → G/Γ is a diffeomorphism, we obtain a chart Ψ : V × W → G/Γ for a neighborhood of e G Γ that has the additional property that Ψ(g l v, w) = g l Ψ(v, w) whenever v, g l v ∈ V . Shrinking V and W further if necessary we may assume that the differential of Ψ and its inverse are uniformly bounded. By homogeneity we obtain similar charts for some neighborhoods of all points of G/Γ. By compactness G/Γ can be covered by finitely many such charts, so it suffices to estimate By definition of Sobolev norms we have Since F is a vertical character and by multiplicativity of Ψ in the first argument, the integrand on the left-hand side is constant, so that the bound being independent of v. Now, W is a d − d l dimensional manifold, so the usual Sobolev embedding theorem [AF03, Theorem 4.12 Part I Case A] applies and we obtain . By the above discussion this implies the desired estimate.

THE MAIN ESTIMATE
In this section we deal with our main problem of estimation of averages in (1.3). The general strategy is to decompose F into a vertical Fourier series, to use the quantitative van der Corput estimate and to control various norms that appear during this procedure using the results of the previous section. In several places in our argument we will need convergence of Birkhoff averages of a function to its integral. In order to ensure this convergence we restrict attention to fully generic points.
The following uniform estimate is our main result.
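Theorem 4.1 (Uniformity seminorms control averages uniformly). Assume that $(X, \mu, T)$ is ergodic. Then for every $f \in L^\infty(X)$ and every point $x$ that is fully generic for $f$ with respect to $(\Phi_N)$ the following holds. For every $l \in \mathbb{N}$ and $\varepsilon > 0$ there exists $N_0$ such that for every nilmanifold $G/\Gamma$ with a $\Gamma$-rational filtration $G_\bullet$ on $G$ of length $l$, every smooth function $F$ on $G/\Gamma$ and every $g \in P(\mathbb{Z}, G_\bullet)$ we have

(4.2) $\left| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(T^n x) F(g(n)\Gamma) \right| \lesssim \|F\|_{W^{k,2^l}(G/\Gamma)} \left( \|f\|_{U^{l+1}(X)} + \varepsilon \right)$ for every $N \ge N_0$,

where $k = \sum_{r=1}^{l} (d_r - d_{r+1}) \binom{l}{r-1}$ and the implied constant depends only on the nilmanifold $G/\Gamma$, the filtration $G_\bullet$ and the Mal'cev basis that is implicit in the definition of $\Gamma$-rationality.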
If in addition (X , T ) is uniquely ergodic and f ∈ C(X ) then the conclusion holds for every x ∈ X and N 0 can be chosen independently of x.
Example 5.1 below shows that there is in general no constant $C$ such that the estimate holds for every 1-step basic nilsequence $F(S^n y)$, even without uniformity. Thus one cannot expect to replace the Sobolev norm by $\|F\|_\infty$ in Theorem 4.1.
Remark 4.4. Quantifying the proof of Host, Kra [HK05, Proposition 5.6] using standard Fourier analysis on d(2 l −1) one obtains for the non-uniform averages the upper bound lim sup for "linear" sequences g(n) = h n h ′ , where the implied constant depends on geometric data like the choice of a decomposition of identity on the pointed cube space (G/Γ) [k] * = (G/Γ) 2 l −1 . Note also that Host and Kra worked with intervals with growing length instead of tempered Følner sequences in .
Proof of Theorem 4.1. We argue by induction on $l$. In the case $l = 0$ the group $G$ is trivial, so $\|F\|_\infty = \|F\|_{W^{0,1}(G/\Gamma)}$ and the claim follows by the definition of generic points. We now assume that the claim holds for $l - 1$ and show that it holds for $l$. Write $a_n := F(g(n)\Gamma)$.
Assume first that F is a vertical character and recall the notation from Section 3. Let δ > 0 be chosen later. For the dimensions (d i ) of the groups in the filtrationG • we have the . . , l − 1. By the induction hypothesis applied toG/Γ with the inducedΓ-rational filtration and Lemma 3.2 we have 1 l r−1 − d l for any integer k provided that N is large enough depending on l, k, δ and x. Let K be chosen later. The van der Corput Lemma 2.7 implies 1 |Φ N | n∈Φ N f (T n x)a n 2 ≤ 2 provided that N is large enough depending on l, K, δ and x. By Lemma 3.8 this is dominated by By the Cauchy-Schwarz inequality this is dominated by By (2.6) for sufficiently large K = K( f , δ) the above average over k approximates f 2 to within δ, so we have Taking δ = δ(ε) sufficiently small and N ≥ N 0 (l, f , ε, x) sufficiently large we obtain 1 |Φ N | n∈Φ N f (T n x)a n F Wk ,2 l ( f U l+1 (X ) + ε).
Note that $N_0$ does not depend on $F$. Let now $(a_n) = (F(g(n)\Gamma))$ be an arbitrary $l$-step basic nilsequence on $G/\Gamma$. Let $F = \sum_\chi F_\chi$ be the vertical Fourier series. By the above investigation of the vertical character case, since the vertical Fourier series of $F$ converges absolutely and by Lemma 3.7 we get for $N \ge N_0$ as required.
Under the additional assumptions that $(X, T)$ is uniquely ergodic and $f \in C(X)$ we obtain the additional conclusion that the estimate is uniform in $x \in X$ for $l = 0$ from uniform convergence of ergodic averages $\frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} T^n f$, see e.g. [Wal82, Theorem 6.19]. For general $l$ it suffices to observe that in the above proof the dependence of $N_0$ on $x$ comes in only through the inductive hypothesis. Also, there is no need for temperedness of $(\Phi_N)$ in this case.
Proof of Theorem 1.2. Let f ∈ L 1 (X ) with ( f | l (X )) = 0 be given. By truncation we can approximate it by a sequence of bounded functions ( f j ) ⊂ L ∞ (X ) such that f j → f in L 1 . Replacing each f j by f j − ( f j | l (X )) we may assume that ( f j | l (X )) = 0 for every j. By Fixing a j, restricting to the set of points that are generic for | f − f j | with respect to {Φ N } and letting N → ∞ we can estimate the limit by f − f j 1 pointwise on a set of full measure. Hence the limit vanishes a.e. Under the additional assumptions that (X , T ) is uniquely ergodic and f is continuous the uniform convergence (1.4) follows directly from Theorem 4.1.

A COUNTEREXAMPLE
The following example shows that there is no constant $C$ such that the estimate (4.3) holds for every 1-step basic nilsequence $F(S^n y)$. Thus one cannot replace the Sobolev norm by $\|F\|_\infty$ in Theorem 4.1 even without uniformity in $F$ and $g$.
Example 5.1 (I. Assani). We begin as in Assani, Presser [AP12, Remarks] and consider an irrational rotation system ( , µ, T ) on the unit circle, f ∈ C( ), x ∈ and define S := T , y := x and F :=f . We have By f 4 U 2 ( ) = ∞ k=−∞ |f (k)| 4 , the inequality (4.3) takes the form . Let now {a n } ∞ n=1 ⊂ and consider random polynomials Therefore for every N ∈ there is ω (or a choice of signs + or −) so that Assume now that inequality (5.2) holds for some constant C and every f ∈ C( ). Then by the above for f = P N (·, ω) we have and hence N n=1 a 2 n ≤ (C D) 2 (a n ) l 4 log N .
Taking $a_n = \sqrt{\log n / n}$ implies $\sum_{n=1}^{N} \log n / n \le \tilde C \log N$ for some $\tilde C$ and all $N$, a contradiction. We also refer to Assani [Ass10] and Assani, Presser [AP12] for related issues.
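The contradiction rests on two elementary facts about the sequence with $a_n^2 = \log n / n$: the partial sums $\sum_{n \le N} \log n / n$ grow like $(\log N)^2$, while $\sum_n a_n^4$ converges. A short verification by comparison with integrals (our own computation, using that $t \mapsto \log t / t$ is decreasing for $t \ge e$) reads:

```latex
% Growth of \sum \log n / n and convergence of \sum (\log n)^2 / n^2.
\begin{align*}
\sum_{n=1}^{N} \frac{\log n}{n}
  &\ge \sum_{n=3}^{N} \int_{n}^{n+1} \frac{\log t}{t}\,dt
   = \int_{3}^{N+1} \frac{\log t}{t}\,dt
   = \tfrac12\big(\log^2 (N+1) - \log^2 3\big),\\
\sum_{n=1}^{\infty} a_n^4
  &= \sum_{n=1}^{\infty} \frac{\log^2 n}{n^2} < \infty .
\end{align*}
```

Dividing the first line by $\log N$ shows that the ratio $\frac{1}{\log N}\sum_{n \le N} \log n / n$ is at least of order $\frac12 \log N$ and hence unbounded, so no bound of the form $\tilde C \log N$ can hold for all $N$.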

WIENER-WINTNER THEOREM FOR GENERALIZED NILSEQUENCES
In view of Theorem 4.1 the Wiener-Wintner theorem for generalized nilsequences (Theorem 1.5) follows by a limiting argument from a structure theorem for non-ergodic measure preserving systems due to Chu, Frantzikinakis and Host.
Proof of Theorem 1.5. Restricting to the separable $T$-invariant $\sigma$-algebra generated by $f$ we may assume that $(X, \mu, T)$ is regular. Let $\mu = \int \mu_x \, d\mu(x)$ be the ergodic decomposition.
Consider first a function $0 \le f \le 1$ and let $\tilde f := \mathbb{E}(f | \mathcal{Z}_l(X))$. By [CFH11, Proposition 3.1] we obtain a sequence of functions $(f_j) \subset L^\infty(X)$ such that the following holds.
(2) For every $j$ and $\mu$-a.e. $x \in X$ the sequence $(f_j(T^n x))_n$ is an $l$-step nilsequence.
Using the first condition we can pass to a subsequence such that f − f j L 2 l−1 (X ,µ x ) → 0 for a.e. x ∈ X . Thus we obtain a full measure subset X ′ ⊂ X such that the following holds for every x ∈ X ′ : (1) for every j the sequence ( f j (T n x)) n is an l-step nilsequence, (2) for every j the point x is fully generic for f − f j with respect to an ergodic measure µ x and (3) f − f j U l (X ,µ x ) → 0 as j → ∞ (this follows from the basic inequality (2.5)). Let x ∈ X ′ and (a n ) be a basic l-step nilsequence of the form a n = F (g(n)Γ) with smooth F . Since the product of two nilsequences is again a nilsequence, by Leibman [Lei05b, Theorem A] the limit exists for every j ∈ . By Theorem 4.1 we have for every j, where the constant does not depend on j, and this implies the existence of the limit (1.6).
Let now x ∈ X ′ and (a n ) be a basic generalized nilsequence of the form a n = F (g(n)Γ) with a real valued Riemann integrable function F . Let ǫ > 0. Since F is Riemann integrable onỸ = {g(n)Γ : n ∈ } (which is a finite union of sub-nilmanifolds with the weighted Haar measure ν) and by the Tietze extension theorem, there exist continuous functions F ǫ and By mollification we may assume that H ǫ and F ǫ are smooth. By the above the limits lim N 1 |Φ N | n∈Φ N f (T n x)H ǫ (g(n)Γ) and lim N 1 |Φ N | n∈Φ N f (T n x)F ǫ (g(n)Γ) exist. By continuity of F ǫ and H ǫ we have for every x ∈ X ′ lim sup and since ǫ > 0 was arbitrary this proves the existence of the limit (1.6).
A limiting argument allows one to replace the basic generalized nilsequence by a generalized nilsequence. By linearity we obtain the conclusion for f ∈ L ∞ (X ). The general case f ∈ L 1 (X ) follows from the maximal inequality (2.2).
Under the additional assumptions of unique ergodicity of $(X, T)$ and continuity of the projection $\pi : X \to \mathcal{Z}_l(X)$ we find that the functions $f_j$ can be chosen to be continuous on $X$ by [HKM10, Theorem A] and every point is fully generic for $f - f_j$, allowing us to replace the set of full measure $X'$ in the above argument by $X$.

WEIGHTED MULTIPLE AVERAGES
The Wiener-Wintner theorem (Theorem 1.5 for linear nilsequences) was used by Host and Kra [HK09, Theorem 2.25] to show that the values of a bounded measurable function along almost every orbit of an ergodic transformation are good weights for $L^2$ convergence of linear multiple ergodic averages. A polynomial extension of this result was proved by Chu [Chu09, Theorem 1.1]. Since our Theorem 1.5 is stated for "polynomial" nilsequences we can slightly shorten the proof of her result, which we formulate for $L^1$ functions and tempered Følner sequences.
Corollary 7.1 (Convergence of weighted multiple ergodic averages). Let $(\Phi_N)$ be as above and let $\varphi \in L^1(X)$. Then there is a set $X' \subset X$ of full measure such that for every $x \in X'$ the sequence $\varphi(T^n x)$ is a good weight for polynomial multiple ergodic averages along $(\Phi_N)$, i.e., for every measure-preserving system $(Y, \nu, S)$, integer polynomials $p_1, \dots, p_k$ and functions $f_1, \dots, f_k \in L^\infty(Y, \nu)$ the averages

(7.2) $\frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \varphi(T^n x) \, S^{p_1(n)} f_1 \cdots S^{p_k(n)} f_k$

converge in $L^2(Y, \nu)$ as $N \to \infty$.
In order to reduce to an appropriate nilfactor we need the following variant of [Chu09, Theorem 2.2]. Recall that two polynomials are called essentially distinct if their difference is not constant.

Lemma 7.3. Let $(\Phi_N)_N$ be an arbitrary Følner sequence in $\mathbb{Z}$. For every $r, d \in \mathbb{N}$ there exists $k \in \mathbb{N}$ such that for every ergodic system $(X, \mu, T)$, any functions $f_1, \dots, f_r \in L^\infty(X)$ with $\|f_1\|_{U^k(X)} = 0$, any non-constant pairwise essentially distinct integer polynomials $p_1, \dots, p_r$ of degree at most $d$ and any bounded sequence of complex numbers $(a_n)_n$ we have

$\limsup_{N \to \infty} \left\| \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} a_n \, T^{p_1(n)} f_1 \cdots T^{p_r(n)} f_r \right\|_{L^2(X)} = 0$.

Thus the hypothesis ensures convergence to zero of the integrand in the previous display for a.e. $s$ provided that $k$ is large enough.
Proof of Corollary 7.1. By ergodic decomposition it suffices to consider ergodic systems $(Y, \nu, S)$. Assume first that $\varphi \in L^\infty(X)$. By Lemma 7.3 we may assume that each $f_i$ is measurable with respect to some Host-Kra factor $\mathcal{Z}_l(Y)$.
By density we may further assume that each $f_i$ is a continuous function on a nilsystem factor of $Y$. In this case the sequence $S^{p_i(n)} f_i(y)$ is a basic nilsequence of step at most $l \deg p_i$ for each $y \in Y$, and the product $\prod_i S^{p_i(n)} f_i(y)$ is also a basic nilsequence of step at most $l \max_i \deg p_i$. Therefore the averages (7.2) converge pointwise on $Y$ for a.e. $x \in X$ by Theorem 1.5, and by the Dominated Convergence Theorem they converge in $L^2(Y)$.
We can finally pass to φ ∈ L 1 (X ) using the maximal inequality (2.2).