The Theorem of Iterates for elliptic and non-elliptic Operators

We introduce a new approach to the Problem of Iterates using the theory of general ultradifferentiable structures developed in recent years. Our framework generalizes many of the previous settings, including the Gevrey case, and enables us, for the first time, to prove non-analytic Theorems of Iterates for non-elliptic differential operators. In particular, by generalizing a theorem of Baouendi and Métivier we obtain the Theorem of Iterates for hypoelliptic analytic operators of principal type with respect to several non-analytic ultradifferentiable structures.

The aim of this paper is to present a new approach to the Problem of Iterates using the ultradifferentiable structures introduced in [55] and [56], which generalizes and unifies many of the previous cases.
In our context an ultradifferentiable structure U is a subalgebra of smooth functions which is defined by estimates on the derivatives of its elements. Well-known ultradifferentiable structures include the Denjoy-Carleman classes, which are given by weight sequences, and the Braun-Meise-Taylor classes, whose defining data are weight functions. The latter were originally introduced in [4] and [5], but the modern formulation of these classes was given in [17]. The classes discussed in [55], which are determined by weight matrices, i.e. families of weight sequences, encompass both Denjoy-Carleman classes and Braun-Meise-Taylor classes. Other examples of ultradifferentiable spaces are the Gelfand-Shilov classes, cf. [30], and the recently introduced $L^p$-ultradifferentiable classes, see [32].
The ultradifferentiable vectors of some operator $P$ associated to the structure U are those functions (or distributions) which satisfy the defining estimates of U for the iterates $P^k$ of $P$. Thus the Problem of Iterates in its general form can be formulated informally as the following question: Given an operator $P$, suppose that a function (or distribution) $u$ satisfies the defining estimates of an ultradifferentiable structure U for the iterates $P^k$ of $P$. Can we conclude that $u$ satisfies these estimates for all derivatives?
Or more concisely, are the ultradifferentiable vectors of P with respect to U already ultradifferentiable functions of class U? If the answer to this question is "yes" then we say that the Theorem of Iterates holds for the operator P and the structure U.
Our main goal is to develop a unified approach to the problem of iterates using the recent development of the theory of general ultradifferentiable classes given in [55], [56] and in particular the microlocal theory in [29]. This approach allows us not only to unify and generalize previously known results but also to treat cases which have not been available in the literature up to now. In particular, in the case of principal type operators we are able to use the technical estimate in [3] to infer the Theorem of Iterates for a wide variety of ultradifferentiable classes, which include quasianalytic and non-quasianalytic classes. We note that, to our knowledge, this is the first time the Theorem of Iterates is proven for a non-elliptic operator and a non-analytic ultradifferentiable structure.
In the case of Braun-Meise-Taylor classes our main Theorem takes a relatively concise form. However, in order to formulate it correctly, we need to recall some notation: We say that a differential operator $P$ defined on some open set $U \subseteq \mathbb{R}^n$ is of principal type (footnote 1) or that $P$ is an operator with simple real characteristics if the principal symbol $p_d$ of $P$ satisfies $|p_d(x,\xi)| + \sum_{j=1}^{n} |\partial_{\xi_j} p_d(x,\xi)| \neq 0$ for all $(x,\xi) \in U \times \mathbb{R}^n \setminus \{0\}$.

2020 Mathematics Subject Classification. Primary 35A18, 26E10, 35H10; Secondary 46F05, 35B65, 35H20. Key words and phrases. ultradifferentiable vectors, wave front sets, ultradifferentiable classes, Theorem of Iterates. The first author was supported by FWF grant I 3472 and J 4439. The second author was supported by FWF projects P 32905 and P 33417.
A weight function in the sense of [17] is a continuous and increasing function $\omega : [0,\infty) \to [0,\infty)$ with $\omega(0) = 0$ which satisfies $\log t = o(\omega(t))$ as $t \to \infty$, (β) We set where $V \Subset U$ is a relatively compact subset of $U$, $f \in \mathcal{E}(U)$ is a smooth function, $h > 0$ and $\varphi^*_\omega(t) := \sup_{s \ge 0} (st - \varphi_\omega(s))$ is the conjugate function of $\varphi_\omega$. The Roumieu class (of ultradifferentiable functions) associated with $\omega$ is given by and the Beurling class associated to $\omega$ is Similarly, for a partial differential operator $P$ of order $d$ with analytic coefficients we set We may note that the condition (Ξ) has appeared in various applications of Braun-Meise-Taylor classes, e.g. in the study of global pseudodifferential operators in [2].

Footnote 1: We follow here the classic definition, see e.g. [68] and the references therein. It sometimes does not agree with the definition of principal type operators given in modern treatises, for example in [38, Chapter 26].

1A. Preliminaries. We denote by $\mathbb{N} = \{1, 2, \dots\}$ the set of positive integers and by $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$ the set of non-negative integers. Furthermore $U \subseteq \mathbb{R}^n$ is always an open set. In this paper we focus on linear differential operators with analytic coefficients, i.e.
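$P(x, D) = \sum_{|\alpha| \le d} a_\alpha(x) D^\alpha$ with $a_\alpha \in \mathcal{A}(U)$. We use here the convention $D_j = -i\partial_{x_j}$. Then the symbol of $P$ is $p(x,\xi) = \sum_{|\alpha| \le d} a_\alpha(x)\xi^\alpha$ and $p_d(x,\xi) = \sum_{|\alpha| = d} a_\alpha(x)\xi^\alpha$ is the principal symbol of $P$. The characteristic set of $P$ is given by $\operatorname{Char}(P) = \{(x,\xi) \in U \times \mathbb{R}^n \setminus \{0\} : p_d(x,\xi) = 0\}$. Hence $\operatorname{Char}(P) = \emptyset$ if and only if $P$ is elliptic.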
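As a concrete illustration of the principal symbol, ellipticity and the principal type condition, the following sketch treats two constant-coefficient operators of order 2 on $\mathbb{R}^2$. The dictionary encoding of the principal symbol and the helper names (`eval_symbol`, `grad_symbol`, `principal_type_quantity`) are assumptions made for this example only and do not come from the paper. Sampling finitely many directions is a sanity check, not a proof.

```python
import math

# Principal symbols encoded as {multi-index alpha: coefficient a_alpha}.
# Constant coefficients are assumed, so the x-dependence is dropped.
LAPLACE = {(2, 0): 1.0, (0, 2): 1.0}   # p_2(xi) = xi_1^2 + xi_2^2
WAVE = {(2, 0): 1.0, (0, 2): -1.0}     # p_2(xi) = xi_1^2 - xi_2^2


def eval_symbol(coeffs, xi):
    """Evaluate sum_alpha a_alpha * xi^alpha."""
    return sum(
        c * math.prod(x ** a for x, a in zip(xi, alpha))
        for alpha, c in coeffs.items()
    )


def grad_symbol(coeffs, xi):
    """Return the partial derivatives of the symbol with respect to xi_j."""
    grad = []
    for j in range(len(xi)):
        total = 0.0
        for alpha, c in coeffs.items():
            if alpha[j] == 0:
                continue  # this monomial does not depend on xi_j
            term = c * alpha[j]
            for i, (x, a) in enumerate(zip(xi, alpha)):
                term *= x ** (a - 1 if i == j else a)
            total += term
        grad.append(total)
    return grad


def principal_type_quantity(coeffs, xi):
    """|p_d(xi)| + sum_j |d p_d / d xi_j (xi)|."""
    return abs(eval_symbol(coeffs, xi)) + sum(abs(g) for g in grad_symbol(coeffs, xi))


# The symbols are homogeneous, so it suffices to sample the unit circle.
directions = [
    (math.cos(2 * math.pi * j / 360), math.sin(2 * math.pi * j / 360))
    for j in range(360)
]
min_laplace = min(abs(eval_symbol(LAPLACE, xi)) for xi in directions)
min_wave_quantity = min(principal_type_quantity(WAVE, xi) for xi in directions)
```

In this sketch the sampled symbol of the Laplacian stays away from zero, consistent with an empty characteristic set. The symbol of the wave operator vanishes at $\xi = (1, 1)$, but its $\xi$-gradient does not, so the operator satisfies the principal type condition without being elliptic. The wave operator is not hypoelliptic, so it only illustrates the definitions and not the hypotheses of the main results.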
We say that a distribution $u \in \mathcal{D}'(U)$ is an analytic vector of the operator $P$ if for any $V \Subset U$ there are constants $C, h > 0$ such that $\|P^k u\|_{L^2(V)} \le C h^k k!$ for all $k \in \mathbb{N}_0$. We write $\mathcal{A}(U; P)$ for the space of analytic vectors of $P$. In [43] and [46] it was shown independently that if $P$ is elliptic then $\mathcal{A}(U; P) = \mathcal{A}(U)$. A similar result was proven in [53] for elliptic systems of vector fields. We can consider this problem in a more general setting if we replace the factor $k!$ in the estimate above by, e.g., $(k!)^s$. Recall that a smooth function $f \in \mathcal{E}(U)$ is an $s$-Gevrey function, $s \ge 1$, if for all $V \Subset U$ there are constants $C, h > 0$ such that $\sup_{x \in V} |D^\alpha f(x)| \le C h^{|\alpha|} (|\alpha|!)^s$ for all $\alpha \in \mathbb{N}_0^n$.
The space of $s$-Gevrey functions on $U$ is denoted by $G^s(U)$. Analogously, an $s$-Gevrey vector $u$ of $P$ is a distribution $u \in \mathcal{D}'(U)$ which satisfies the estimate $\|P^k u\|_{L^2(V)} \le C h^k (k!)^s$.
We denote the space of $s$-Gevrey vectors of $P$ by $G^s(U; P)$ and if $P$ is elliptic then $G^s(U; P) = G^s(U)$ for all $s \ge 1$ according to [11].
In fact, Métivier [52, Theorem 1.2] showed that the ellipticity of an analytic differential operator $P$ can be characterized by the regularity of its non-analytic Gevrey vectors: If $s > 1$ then $P$ is elliptic if and only if $G^s(U; P) = G^s(U)$.
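The Gevrey estimate can be made concrete with a one-dimensional example. The sketch below, which is not taken from the paper, uses the function $f(x) = 1/(1-x)$ on $V = [-1/2, 1/2]$, whose derivatives are $f^{(k)}(x) = k!/(1-x)^{k+1}$, and checks the estimate with $s = 1$, $C = 2$ and $h = 2$. The function names and the chosen constants are assumptions for illustration. Since $D = -i\,d/dx$, the absolute values of $D^k f$ and $f^{(k)}$ agree.

```python
from math import factorial


def kth_derivative_sup(k, x_max=0.5):
    """sup over |x| <= x_max of |d^k/dx^k (1/(1-x))| = k! / (1 - x_max)^(k+1)."""
    return factorial(k) / (1.0 - x_max) ** (k + 1)


def gevrey_bound(k, C, h, s):
    """Right-hand side C * h^k * (k!)^s of the s-Gevrey estimate."""
    return C * h ** k * factorial(k) ** s


def satisfies_estimate(C, h, s, k_max=30):
    """Check the estimate for k = 0, ..., k_max - 1 (small relative tolerance)."""
    return all(
        kth_derivative_sup(k) <= gevrey_bound(k, C, h, s) * (1 + 1e-12)
        for k in range(k_max)
    )


holds_with_h2 = satisfies_estimate(C=2.0, h=2.0, s=1)
holds_with_h15 = satisfies_estimate(C=2.0, h=1.5, s=1)
```

With $h = 2$ the estimate holds (with equality), whereas $h = 1.5$ and the same $C$ already fail at $k = 1$; this reflects that the admissible $h$ depends on the distance of $V$ to the singularity at $x = 1$.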
Clearly the Problem of Iterates is closely related to other regularity questions of the operator $P$, see e.g. [14]. This connection has been extensively studied for operators with constant coefficients, see for example [6], [8], [10], [42] and [54]. However, in the wake of Métivier's Theorem the study of vectors of a differential operator with variable coefficients has mainly split into two directions:
• If $P$ is elliptic then the Theorem of Iterates has been proven for a large class of ultradifferentiable structures: e.g. for Denjoy-Carleman classes in [13], for Braun-Meise-Taylor classes in [9], for Gelfand-Shilov classes in [19] and for $L^q$-ultradifferentiable functions in [33]. In particular, in [13] a microlocal elliptic Theorem of Iterates for Denjoy-Carleman classes is proven: If $u$ is an ultradifferentiable vector of $P$ with respect to a weight sequence $M$ then $\mathrm{WF}_{\{M\}} u \subseteq \operatorname{Char} P$, where $\mathrm{WF}_{\{M\}} u$ denotes the ultradifferentiable wavefront set of $u$ with respect to $M$ introduced by [35].
• If $P$ is non-elliptic then it might still be possible to show that analytic vectors are analytic, cf. the surveys [14] and [23]. For non-analytic Gevrey vectors one tries to determine the loss of regularity in terms of the Gevrey scale $(G^s)_s$. More precisely, we want to find for each $s > 1$ some $s' > s$ such that every $s$-Gevrey vector is an $s'$-Gevrey function. This approach was used for example in [3], [12], [14] and [23].
The simplest class of non-elliptic operators with variable coefficients consists of the operators of principal type. The main result on Gevrey vectors of principal type operators is the following theorem of Baouendi and Métivier [3, Theorem 1.3]: If $P$ is a hypoelliptic operator of principal type with analytic coefficients in $U \subseteq \mathbb{R}^n$ then for each $V \Subset U$ there is some $\delta > 0$ such that for all $s \ge 1$ every $s$-Gevrey vector $u$ of $P$ in $U$ is an $s'$-Gevrey function in $V$, where $s' = (ds - \delta)/(d - \delta)$.
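To get a feeling for the loss of regularity in this result, the formula $s' = (ds - \delta)/(d - \delta)$ can be evaluated exactly. The following sketch is an illustration only: the function name `gevrey_loss` is hypothetical, and the restriction $0 < \delta < d$ is an assumption made here so that the denominator is positive (the statement above only asserts the existence of some $\delta > 0$).

```python
from fractions import Fraction


def gevrey_loss(d, s, delta):
    """Return s' = (d*s - delta) / (d - delta) as an exact fraction.

    d: order of the operator, s >= 1: Gevrey order of the vector,
    delta: the constant from the theorem (assumed here to satisfy 0 < delta < d).
    """
    d, s, delta = Fraction(d), Fraction(s), Fraction(delta)
    if s < 1:
        raise ValueError("the Gevrey order s must be at least 1")
    if not (0 < delta < d):
        raise ValueError("this sketch assumes 0 < delta < d")
    return (d * s - delta) / (d - delta)


# Analytic vectors (s = 1) lose nothing: s' = 1.
s_prime_analytic = gevrey_loss(2, 1, Fraction(1, 2))

# For s > 1 the loss is s' - s = delta * (s - 1) / (d - delta) > 0.
s_prime = gevrey_loss(2, 3, Fraction(1, 2))
loss = s_prime - 3
```

The identity $s' - s = \delta(s-1)/(d-\delta)$ shows that the loss vanishes exactly for $s = 1$ and grows linearly in $s$, which is consistent with analytic vectors being analytic while non-analytic Gevrey vectors may lose regularity.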
In this paper we are going to generalize the result of Baouendi and Métivier using the new theory on ultradifferentiable structures defined by weight matrices introduced in [55], which in turn will yield the Theorem of Iterates for hypoelliptic operators of principal type with respect to ultradifferentiable structures given by certain weight matrices. For example, the following observation was the starting point of this paper: The prototypical example of a nontrivial weight matrix is the Gevrey matrix $G = \{G^s : s > 1\}$. For a discussion of the properties of $G$ we refer to [55, Section 5]. The Roumieu classes of ultradifferentiable functions and vectors associated to $G$ are respectively, whereas the Beurling classes are given by respectively.

Proposition 1.2. Let $P$ be a hypoelliptic partial differential operator of principal type with analytic coefficients on an open set $U \subseteq \mathbb{R}^n$. Then

Proof of Proposition 1.2. Since $G^s(U) \subseteq G^s(U; P)$ for all $s \ge 1$, cf. [14], it is enough to show

The concept of weight matrix was introduced in [55] in order to deal simultaneously with Denjoy-Carleman classes and Braun-Meise-Taylor classes. It is well known that the Gevrey classes can be realized as Denjoy-Carleman classes or as Braun-Meise-Taylor classes, but in general weight sequences and weight functions describe different classes, cf. [16].
The theory of weight matrices allows us to deal with countable intersections and also countable unions (in the sense of germs) of Denjoy-Carleman classes, which will be of some importance in our considerations. For example, E [G] (U ) can neither be described as Denjoy-Carleman classes nor as Braun-Meise-Taylor classes, cf. [55,Theorem 5.22].
Weight matrices have been used to generalize and unify results regarding ultradifferentiable classes in various areas, see e.g. [39], [57], [58] or [59]. In particular, in [29] we defined the ultradifferentiable wavefront set associated with classes given by weight matrices and generalized and unified results on the wavefront set for Denjoy-Carleman classes proved in [28] and [35] and for Braun-Meise-Taylor classes in [1].
As we have seen, we can associate to each weight matrix (or weight sequence or weight function) two different ultradifferentiable classes, the Roumieu class and the Beurling class, respectively. Since the Gevrey classes are Roumieu classes, it is mainly Roumieu classes that have been studied, as for example in [13]. But when both Roumieu and Beurling classes have been considered, there seems to be not much difference regarding the results obtained, see e.g. [7], [10] or also Theorem 1.1 above. Nevertheless, we will notice that in the case of weight matrices there is occasionally a difference between the Beurling and the Roumieu case when we consider vectors of a non-elliptic operator.
1B. Outline of the paper. We want to present in this paper a thorough introduction to the theory of ultradifferentiable vectors associated to weight matrices. In Section 2 we recall, for the convenience of the reader, the definitions and facts from the theory of weight matrices we are going to need, including some statements concerning the ultradifferentiable wavefront set which have not been explicitly stated in [29]. Then we show in Section 3 that the microlocal theory in [13] can be extended to classes given by weight matrices. In particular we prove the elliptic Theorem of Iterates for these classes. We should note that the restriction to analytic operators allows us to work with rather weak conditions on the weight matrix. In fact, we require only that the associated classes are invariant under the action of analytic differential operators and under the composition with analytic diffeomorphisms.
Next we want to generalize Proposition 1.2 to other weight matrices. In order to do so we introduce in Section 4 the notion of ultradifferentiable scales, which can be considered as a special kind of weight matrices. This allows us to extend [3, Theorem 1.3] (i.e. Theorem 4.5) and Proposition 1.2 (cf. Theorem 4.7) to ultradifferentiable scales and their associated weight matrices, respectively. We will see that many families of weight sequences which have been studied previously in the literature constitute ultradifferentiable scales, including the scale $(N^q)_{q>1}$ of $q$-Gevrey sequences, which are given by $N^q_k = q^{k^2}$, and the scale $(B^\lambda)_{\lambda>0}$ given by $B^\lambda_k = k!\,(\log(k + e))^{\lambda k}$. In Section 5 the proof of Theorem 1.1 and especially condition (Ξ) are discussed. Furthermore, we discuss in the second part of this section how the exact definition of ultradifferentiable scales is tied to the rather precise estimates obtained in [3] and how to modify it for the study of vectors of other operators.
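The sequences mentioned above can be explored numerically. The sketch below works with logarithms to avoid overflow and checks, for finitely many indices, that the quotients $\mu_k = M_k/M_{k-1}$ are increasing for the Gevrey sequences $(k!)^s$, the $q$-Gevrey sequences $N^q$ and the sequences $B^\lambda$. It is assumed here that this log-convexity, $M_k^2 \le M_{k-1}M_{k+1}$, is the basic condition required of a weight sequence; the precise definition is given in Section 2. All function names are hypothetical.

```python
import math


def log_gevrey(k, s):
    """log of (k!)^s."""
    return s * math.lgamma(k + 1)


def log_q_gevrey(k, q):
    """log of N^q_k = q^(k^2)."""
    return k * k * math.log(q)


def log_b(k, lam):
    """log of B^lambda_k = k! * (log(k + e))^(lambda * k)."""
    return math.lgamma(k + 1) + lam * k * math.log(math.log(k + math.e))


def is_log_convex(log_seq, k_max=200):
    """Check numerically that log(M_k) - log(M_{k-1}) is increasing for k <= k_max."""
    values = [log_seq(k) for k in range(k_max + 1)]
    log_quotients = [values[k] - values[k - 1] for k in range(1, k_max + 1)]
    return all(b >= a - 1e-9 for a, b in zip(log_quotients, log_quotients[1:]))


gevrey_ok = is_log_convex(lambda k: log_gevrey(k, s=2.0))
q_gevrey_ok = is_log_convex(lambda k: log_q_gevrey(k, q=2.0))
b_ok = is_log_convex(lambda k: log_b(k, lam=1.5))
```

A finite check of this kind cannot replace the proofs, but it is a quick way to rule out a candidate sequence: for instance, $M_k = e^{\sqrt{k}}$ fails the test already at the first quotients.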
In the final section we have included some selected topics. In Subsection 6A we observe that the theory of ultradifferentiable scales developed in Section 4 can also be applied to generalize the results of [21]. Subsection 6B explores for which weight sequences $M$ the associated weight function $\omega_M$ satisfies (Ξ). In the next subsection we take a first look at vectors determined by a family of weight functions. We close the paper with the proof of the following variant of [52, Theorem 1.2], where $\mathcal{E}^{\{N^q\}}(U)$ and $\mathcal{E}^{\{N^q\}}(U; P)$ are the Roumieu class and the space of Roumieu vectors of $P$ associated with the weight sequence $N^q$, respectively.
Theorem 1.4. Let $P$ be a differential operator with analytic coefficients in $U$ and $q > 1$. Then the following statements are equivalent:
(1) $P$ is elliptic.

Ultradifferentiable classes
for all $k \in \mathbb{N}$. Note that for any such weight sequence $M$ we have We are also going to use frequently the sequence $m_k = M_k/k!$. For a weight sequence $M$, a bounded open set $V \subseteq \mathbb{R}^n$ and a constant $h > 0$ we set For later use we note the following result in the spirit of [45, Lemma 6]. Throughout the paper, if not indicated otherwise, we are going to consider the constants appearing in the proofs to be generic, that is, they may change their value from line to line.
Proof. For each $h > 0$ we denote by $C_h$ the smallest constant $C > 0$ such that $L'_k \le C h^k M_k$ holds for all $k \in \mathbb{N}_0$. We define a new sequence $L$ by setting Clearly $L' \le L$. If we put $\mu_k = M_k/M_{k-1}$ and $\lambda_k = L_k/L_{k-1}$ for $k \in \mathbb{N}$ then we recall from [45, Lemma 6] that $\mu_k/\lambda_k$ is increasing and unbounded. Set for $k \in \mathbb{N}$ and define the sequence $N$ by $N_0 = 1$ and The sequence $\nu_k$ is increasing since $\mu_k$ is increasing, hence $N$ satisfies (2.1). It is easy to see that $L \le N$ and therefore $k \le C \sqrt[k]{N_k}$ for some constant $C > 0$ independent of $k \in \mathbb{N}$. It follows that $N$ is a weight sequence since For each $\varepsilon > 0$ there has to exist $k_\varepsilon \in \mathbb{N}$ such that $\lambda_k/\mu_k \le \varepsilon$ for all $k \ge k_\varepsilon$. Hence and thus $\nu_k/\mu_k \le \varepsilon$ for large enough $k$.

Following [29] we say that a weight sequence $M$ is semiregular if Observe that (2.4) implies that for all $\gamma > 0$ there is some constant $C > 0$ such that We may also note that (2.5) is equivalent to If $M$ is semiregular then $\mathcal{E}^{[M]}$ is closed under composition with analytic mappings, that is, if $\Phi : U \to V$ is an analytic mapping between two open sets $U \subseteq \mathbb{R}^{n_1}$ and $V \subseteq \mathbb{R}^{n_2}$ then for all [36] and [29], respectively.
Example 2.4. We present some examples of weight sequences, which will appear throughout the paper.
(1) The Gevrey class of order $s > 1$ is defined by the semiregular non-quasianalytic weight sequence $G^s = ((k!)^s)_k$.
(2) Let $q, r > 1$ be two parameters. The weight sequence $L^{q,r} = (L^{q,r}_k)_k$ defined by $L^{q,r}_k = k!\,q^{k^r}$ is semiregular if and only if $r \le 2$. Observe that for all $q, r, s > 1$ we have $G^s \lhd L^{q,r}$. Furthermore $L^{q_0,r_0} \lhd L^{q_1,r_1}$ if $r_0 < r_1$ and $q_0, q_1 > 1$ are arbitrary, or if $r_0 = r_1$ and $1 < q_0 < q_1$.
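The relation $G^s \lhd L^{q,r}$ can be illustrated numerically. The sketch below assumes the usual meaning of $\lhd$, namely that for every $h > 0$ there is $C > 0$ with $G^s_k \le C h^k L^{q,r}_k$ for all $k$, which is equivalent to $(G^s_k / L^{q,r}_k)^{1/k} \to 0$. The function names and the chosen parameters are hypothetical, and the sequences are handled through logarithms to avoid overflow.

```python
import math


def log_gevrey(k, s):
    """log of G^s_k = (k!)^s."""
    return s * math.lgamma(k + 1)


def log_l(k, q, r):
    """log of L^{q,r}_k = k! * q^(k^r)."""
    return math.lgamma(k + 1) + (k ** r) * math.log(q)


def root_ratio(k, s, q, r):
    """(G^s_k / L^{q,r}_k)^(1/k), computed through logarithms."""
    return math.exp((log_gevrey(k, s) - log_l(k, q, r)) / k)


# Sample values for s = 2, q = 2, r = 1.5 at increasing indices.
ratios = [root_ratio(k, s=2.0, q=2.0, r=1.5) for k in (10, 100, 400)]
```

For these parameters the sampled values decrease towards zero. The decay can set in very late when $r$ is close to 1 and $s$ is large, since $(s-1)\,k \log k$ is only eventually dominated by $k^r \log q$, so small indices alone are not a reliable indicator.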
A weight matrix M is a family of weight sequences such that for each pair M, N ∈ M we have either M ≤ N or N ≤ M. The Roumieu class associated with the weight matrix M is and the corresponding Beurling class is defined by   [61,Sect. 4]. We should note that in the last statement in particular the fact that M is equivalent to a countable weight matrix is important: The intersection of uncountably many non-quasianalytic Denjoy-Carleman classes might be quasianalytic, cf. [15].
Combined with Remark 2.3 we moreover conclude that

A weight matrix M is called R-semiregular if M satisfies (2.7) and and B-semiregular if (2.7) and

(2) If $\ell \in \mathbb{N}$ is fixed and $M$, $N$ are two weight sequences satisfying $M_{k+\ell} \le \gamma^{k+\ell} N_k$ for some constant $\gamma > 0$ independent of $k \in \mathbb{N}_0$ then there are constants $C, h > 0$ such that Indeed, it follows from (2.1) that the sequence $((L_k)^{1/k})_k$ is increasing for any weight sequence $L$. Thus we have Hence if M is a weight matrix satisfying (2.8) then by iterating (2.8) we obtain that for each $M \in$ M and $\ell \in \mathbb{N}$ there are $N \in$ M and constants $C, h > 0$ such that (2.10) holds. Similarly, for a weight matrix M with (2.9) we have that for all $N \in$ M and $\ell \in \mathbb{N}$ there exist $M \in$ M and $C, h > 0$ satisfying (2.10).
(3) An equivalent condition to (2.8) is Similarly, we can without loss of generality replace in (2.9) $M_{k+1}$ and $N_k$ by $m_{k+1}$ and $n_k$, respectively.
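The fact used in part (2), that $((L_k)^{1/k})_k$ is increasing, can be checked on examples. The sketch below assumes that (2.1) is log-convexity together with the normalization $L_0 = 1$; under these assumptions the $k$-th roots are increasing. The helper names are hypothetical and the check covers only finitely many indices.

```python
import math


def kth_roots(log_seq, k_max=100):
    """Return the list (L_k)^(1/k) for k = 1, ..., k_max from log(L_k)."""
    return [math.exp(log_seq(k) / k) for k in range(1, k_max + 1)]


def is_increasing(values, rel_tol=1e-12):
    """True if the values are non-decreasing up to a small relative tolerance."""
    return all(b >= a * (1 - rel_tol) for a, b in zip(values, values[1:]))


# Gevrey sequence (k!)^2 and q-Gevrey sequence 2^(k^2); both have L_0 = 1.
gevrey_roots = kth_roots(lambda k: 2.0 * math.lgamma(k + 1))
q_gevrey_roots = kth_roots(lambda k: k * k * math.log(2.0))

gevrey_roots_increasing = is_increasing(gevrey_roots)
q_gevrey_roots_increasing = is_increasing(q_gevrey_roots)
```

For the $q$-Gevrey sequence the roots are simply $q^k$, so the monotonicity is visible directly; for the Gevrey sequence they behave like $(k/e)^s$ by Stirling's formula.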
Example 2.6. Here are some families of weight matrices which will play a prominent role later on.
(1) The Gevrey matrix $G = \{G^s : s > 1\}$ is semiregular. Both spaces $\mathcal{E}^{[G]}(U)$ are non-quasianalytic and furthermore we have the identity cf. [55].
(2) Using the sequences $L^{q,r}$ we can define two families of semiregular matrices: Since $L^{q_0,r_0} \lhd L^{q_1,r_1}$ for $r_1 > r_0$ and all $q_0, q_1 > 1$ we have that R q [≈]R q ′ for any q, q ′ > 1. Hence if we set R = R e then E [R] (U ) = E [R q ] (U ) for all q > 1. Furthermore Q r [ ]Q r ′ for all r < r ′ and R( )Q r and Q r { }R for all r > 1. We note finally that G{⊳)R. Analogously to above we set Then J σ ( )B j,σ for all j ∈ N and σ > 0. Clearly we have also that J σ1 (≈)J σ2 for all σ 1 , σ 2 > 0 and finally J σ {≈}{B 1,σ } for any σ > 0.

2B. Weight functions. In this section we discuss briefly the relationship between weight functions and weight matrices, as described in [55].
(1) It is well known that the weight function $t^{1/s}$ generates the Gevrey class of order $s$, i.e. $G^s(U) = \mathcal{E}^{\{t^{1/s}\}}(U)$, and in particular, for $s = 1$, $\mathcal{A}(U) = G^1(U) = \mathcal{E}^{\{t\}}(U)$ is the space of analytic functions. It follows that if $\omega$ is a weight function such that $\omega(t) = o(t^\alpha)$ for some $0 < \alpha \le 1$ then
(2) In general, weight sequences and weight functions describe distinct spaces, see [16].
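For the Gevrey weight function $\omega(t) = t^{1/s}$ the conjugate function $\varphi^*_\omega(y) = \sup_{x \ge 0}(xy - \varphi_\omega(x))$ can be computed in closed form and compared with a brute-force maximization. The sketch below assumes the standard convention $\varphi_\omega(x) = \omega(e^x)$, so that $\varphi_\omega(x) = e^{x/s}$, and it ignores the normalization $\omega|_{[0,1]} = 0$. The function names and the grid parameters are hypothetical.

```python
import math


def phi(x, s):
    """phi_omega(x) = omega(e^x) for omega(t) = t^(1/s)."""
    return math.exp(x / s)


def conjugate_numeric(y, s, x_max=20.0, steps=200_000):
    """Approximate sup over 0 <= x <= x_max of x*y - phi(x, s) on a uniform grid."""
    best = -math.inf
    for i in range(steps + 1):
        x = x_max * i / steps
        best = max(best, x * y - phi(x, s))
    return best


def conjugate_closed_form(y, s):
    """s*y*(log(s*y) - 1); valid for y >= 1/s, where the maximizer
    x = s*log(s*y) is non-negative."""
    return s * y * (math.log(s * y) - 1.0)


numeric_value = conjugate_numeric(3.0, s=2.0)
exact_value = conjugate_closed_form(3.0, s=2.0)
```

The closed form follows from setting the derivative $y - e^{x/s}/s$ to zero. The growth $s\,y \log y$ of the conjugate in $y$ matches, via Stirling's formula, the growth of $\log (k!)^s$ in $k$, which is one way to see heuristically why this weight function is associated with the Gevrey class of order $s$.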
We say that a weight function ω is quasianalytic if ω satisfies (2.11) and non-quasianalytic otherwise. If ω is non-quasianalytic then ω(t) = o(t) for t → ∞.
According to [17] we can without loss of generality assume that $\omega$ vanishes on $[0, 1]$. Then the Young conjugate ϕ *

Definition 2.9. Let $\omega$ be a weight function such that $\omega|_{[0,1]} = 0$. The associated weight matrix $W = \{W^\lambda = (W^\lambda_k)_k : \lambda > 0\}$ to $\omega$ is given by for all $j, k \in \mathbb{N}_0$ and all $\lambda > 0$. We note that (2.13) implies (2.8) and (2.9). Furthermore we have (2.14) From (2.14) we obtain that as topological vector spaces. Finally,

(1) Let $\omega$ be a weight function with $\omega(t) = o(t)$ when $t \to \infty$. Then the associated weight matrix is semiregular. If $q = e^\lambda$ then clearly W λ,s L q,r and it is easy to see that $L^{q,r} \lhd W^{\lambda',s}$ when $\lambda < \lambda'$. It follows that W s [≈]Q r . Hence

The ultradifferentiable wavefront set. The ultradifferentiable wavefront set for Roumieu classes given by weight sequences was introduced in [35]. In [1] the wavefront set was defined in the case of weight functions. These definitions have been generalized by [29] to the category of classes given by weight matrices.
For the convenience of the reader we recall from [29] the definition of the wavefront set associated to classes given by weight matrices. We give also a summary of the results we need later on, observing in particular that, in analogy to the results of [35] in the case of a single weight sequence, semiregularity of the weight matrix is sufficient for the ultradifferentiable microlocal elliptic regularity Theorem to hold for operators with analytic coefficients; a fact that was not explicitly stated in [29] because in that paper we worked in a more general setting.
We define the Fourier transform of a distribution $u \in \mathcal{E}'(U)$ to be $\hat{u}(\xi) = \langle u, e^{-ix\xi} \rangle$, where the bracket on the right-hand side denotes the distributional action.
The basic properties of the ultradifferentiable wavefront set are summarized in the following Proposition. (1) WF [M] u is a closed subset of U × R n \{0} which is conic in the second variable.
If we assume that M satisfies additional conditions then we can show more properties of WF [M] u: Proposition 2.14 ([29, Proposition 5.6(1)]). Let M be a weight matrix satisfying (2.7) and u ∈ D ′ (U ).
We have where π 1 : U × R n \{0} → U is the projection to the first variable.

Similar to the smooth category we define the [M]-singular support sing supp
It is possible to choose the distributions u k in Definition 2.12 in a special manner. For our purpose a simplified version of [29, Lemma 5.3] is sufficient: Furthermore assume that χ k ∈ D(U ) is a sequence of functions with common support in K and for all α ∈ N n 0 there are constants C α , h α > 0 such that If µ is the order of u near K then the sequence (χ k u) k is bounded in E ′,µ (K) and (1) in the Roumieu case we have for some M ∈ M and C, h > 0. Theorem 2.17. Let P be a differential operator with analytic coefficients on U and M be a [semiregular] weight matrix. Then we have

Ultradifferentiable vectors
3A. Microlocal theory. The aim of this section is to generalize the microlocal theory presented in [13] to the setting of weight matrices. In order to accomplish this we have to use a more generalized notion of vectors than the one from Section 1. For this we need to recall some notions.
Let σ ∈ R. We denote the Sobolev space of order σ by H σ (R n ), which is equipped with the norm The localized Sobolev space H σ loc (U ) consists of those distributions g ∈ D ′ (U ) which satisfy ϕg ∈ H σ (R n ) for all ϕ ∈ D(U ). It is a locally convex space whose topology is given by the seminorms Definition 3.1. Let M be a weight matrix, P = {P 1 , . . . , P ℓ } a system of differential operators of order d j , j = 1, . . . , ℓ, with analytic coefficients in the open set U ⊆ R n and σ ∈ R. If V ⋐ U , M ∈ M and h > 0 then we set . . , ℓ} k and k ∈ N 0 . In the case k = 0 we use the convention {1, . . . , ℓ} 0 = 0 and d 0 = 0. We set σ (U ; P) is called an ultradifferentiable vector of class [M] (or an [M]-vector) of the system P. We also define E Proposition 3.2. Let M, N be weight matrices and P be a system of analytic differential operators. Then the following holds: Proof. If M{ }N and V ⋐ U are given then for all M ∈ M and all h > 0 there are N ∈ N and h ′ , C > 0 such that u for u ∈ D ′ (U ). Hence (1) holds in the Roumieu case. The proofs of the other statements in (1) and (2) are similar. In order to show (3), recall that (2.7) implies that for all M ∈ M and all γ > 0 there is a constant We may also note that if Q = |β|≤d a β (x)D β is an operator with analytic coefficients in U then for each V ⋐ U we can find constants C, r > 0 such that for some weight sequence M ∈ M and h ≥ 1. Iterating this argument we conclude that there are a weight sequence M ∈ M and constants C, where P α and d α are defined as in Definition 3.1. Therefore for some constants C, h > 0 independent of k ∈ N and α ∈ {1, . . . , ℓ} k and hence f ∈ E {M} (U ; P). 
If f ∈ E (M) (U ) and V ⋐ U then we define a sequence L ′ ⊳ M by setting According to Lemma 2.2 for each M ∈ M there is a weight sequence N such that G 1 ≤ L ′ ≤ N ⊳ M and by construction we have that there are constants γ > 0 and C > 0 such that Thence for each M ∈ M and h > 0 there is a constant C > 0 such that From this estimate it follows in the same manner as in the Roumieu case that f ∈ E (M) (U ; P). Remark 3.3. Traditionally, the L 2 -norm is mainly used in the definition of vectors, but in the literature the norm in the definition of vectors is chosen according to the techniques used in the paper in question, see e.g. the discussion in [14]. We have already mentioned that Definition 3.1 is more general than the definition of vectors used in Section 1, because, as we will see in a moment, Definition 3.1 is microlocalizable, cf. [13] and [12].
However, cf. [14], if the system P = {P 1 , . . . , P ℓ } is subelliptic, that is for each V ⋐ U there is ε > 0 such that for all σ ∈ R the estimate holds for some C > 0, then we obtain that , see e.g. [37] or [66]. Furthermore, E where ε is the subellipticity index of W , see A.
We suppose for a moment that M, N ∈ M are two weight sequences for which there exists a constant and all h > 0. Thence, by the above arguments we can conclude that actually for all σ, τ ∈ R. The Roumieu case follows similarly.
We are now able to begin to extend the microlocal theory developed in [13] for Roumieu vectors given by a semiregular weight sequence of an operator with analytic coefficients to vectors associated to a [semiregular] weight matrix. We follow mainly the presentation given in [12]. We start with a characterization of the property of being a vector by the Fourier transform.
Theorem 3.4. Let $P$ be a differential operator of order $d$ with analytic coefficients in $U$, $u \in \mathcal{D}'(U)$, $x_0 \in U$ and M be a weight matrix. Then for a sequence M ∈ M and some constants $C, h > 0$ and $\nu \in \mathbb{R}$.
(2) $u \in \mathcal{E}^{(M)}(V; P)$ for some neighborhood $V$ of $x_0$ if and only if there are a neighborhood $W$ of $x_0$, a sequence $f_k \in \mathcal{E}'(U)$ and a constant $\nu \in \mathbb{R}$ such that $f_k|_W = P^k u|_W$ and for all M ∈ M and every $h > 0$ there is some $C > 0$ such that (3.4) is satisfied.

Proof. We begin with the Roumieu case. Hence suppose that $u \in \mathcal{E}^{\{M\}}_\sigma(V; P)$ for some neighborhood $V$ of $x_0$ and $\sigma \in \mathbb{R}$. Following [12] let $W_2 \Subset W_1 \Subset V$ be two neighborhoods of $x_0$ and choose $\varphi, \psi \in \mathcal{D}(W_1)$ with $\psi\varphi = \varphi$ and $\varphi = 1$ in for some M ∈ M and some constants $h > 0$ and $\nu = -\sigma$.
On the other hand assume that there is a sequence f k ∈ E ′ (U ) and a neighborhood V of x 0 such that f k | V = P k u| V and (3.4) holds for some M ∈ M and constants C, h, ν > 0. Now let σ ≤ −ν − (n + 1)/2. Then we obtain for every W ⊆ V that for some C ′ > 0 since σ was chosen appropriately.
The Beurling case follows in a similar manner.
In the definition of the wavefront set of iterates the estimate (3.4) will correspond to (2.15). The following statement is going to provide a correspondence of the boundedness of the sequence u k in Definition 2.12.
Proposition 3.5 ([12, Proposition 1.6]). Let $u \in \mathcal{D}'(U)$, $P$ be an analytic partial differential operator of order $d$ and $K \subseteq U$ be a compact set. Furthermore assume that $\chi_k \in \mathcal{D}(U)$ is a sequence of functions with common support in $K$ satisfying If $p \in \mathbb{N}$ and $q \in \mathbb{N}_0$ then the sequence $f_k = \chi_{pdk+q} u$ obeys the estimate for some constants $C', \nu > 0$.
Definition 3.6. Let P be a differential operator with analytic coefficients of order d, M a weight matrix, u ∈ D ′ (U ) and (x 0 , ξ 0 ) ∈ U × R n \{0}. Then we say that and there exists some ν ∈ R such that for all M ∈ M and all h > 0 there is a constant C > 0 for which the estimates (3.5) and (3.6) are satisfied.
It is easy to see that WF [M] (u; P ) satisfies the same basic properties as WF [M] u, cf. Proposition 2.13: Proposition 3.7. Let M and N be two weight matrices and u ∈ D ′ (U ). Then: We have also a variant of Lemma 2.16: , and K ⊆ U be a compact subset, F ⊆ R n a closed cone and χ k ∈ D(U ) a sequence of functions with support in K such that for all α ∈ N n 0 there are constants Proof. First we prove the Roumieu case. Let x 0 ∈ K and ξ 0 ∈ F . Then (x 0 , ξ 0 ) / ∈ WF {M} (u; P ) and we choose V , Γ and f k according to Definition 3.6. If supp χ dk ⊆ V then χ dk P k u = χ dk f k and therefore Note that without loss of generality we can always assume ν ≥ 0. We observe that (3.7) gives For ℓ, j ≥ 0 we have, (cf. [29, p. 26]) If j ≤ k then Since M is R-semiregular it follows from Remark 2.5(2) that for each M ∈ M there are N ∈ M, C, h > 0 such that On the other hand choose a closed cone Γ 1 ⊆ Γ ∪ {0} with ξ 0 ∈ Γ 1 . Then there is a constant c > 0 such that |ξ − η| ≥ c(|ξ| + |η|) for all ξ ∈ Γ 1 and η / ∈ Γ. If we also use (3.9) and setc = min{1, c} then it follows that for each M ∈ M there is some N ∈ M such that was chosen arbitrarily, note that F can be covered by a finite number of cones like Γ ′ and therefore (3.10) holds in F for some constants C, h and ν 0 as long as supp χ k ⊆ U is a small enough neighborhood of x 0 . But K is compact hence we can argue as in the proof of [36,Lemma 8.4.4]. There is a finite number of such open sets U j that cover K and we can choose a partition of unity χ j,k ∈ D(U j ) such that (χ j,k ) k satisfies (3.7) for each j.
Then the same is true for $\chi_{j,k}\chi_k$ and we conclude from above that (3.10) holds for $\chi_{j,dk}\chi_{dk} P^k u$. Since $\sum_j \chi_{j,dk}\chi_{dk} P^k u = \chi_{dk} P^k u$ we have proven (3.10) in the general case. The proof of the estimate in the Beurling category is analogous. Just note that if M is B-semiregular then Remark 2.5(2) implies that for all N ∈ M there are M ∈ M, $C, h > 0$ such that (3.9) holds. Lemma 3.8 allows us to prove an analogue of Proposition 2.15: loc (U 1 ; P ). If $x \in U_1$ then by Theorem 3.4 (and Proposition 3.5) it follows that (

3B. Invariance under analytic mappings. The aim of this section is to prove the invariance of the definition of $\mathrm{WF}_{[M]}(u; P)$. We begin by recalling two results from [35], see also [14].

Let $U_1 \subseteq \mathbb{R}^{n_1}$ and $U_2 \subseteq \mathbb{R}^{n_2}$ be two open sets, $a \in \mathcal{A}(U_1)$ and $f : U_1 \to U_2$ be an analytic mapping. Furthermore assume that $\chi_k \in \mathcal{D}(U_2)$ is a sequence of functions with support in the same fixed compact set and there are constants $C, h > 0$ such that Then the sequence $\chi'_k = a(\chi_k \circ f)$ has the same properties with different constants $C, h$.
for some constants C, h > 0.
Then there exist constants C ′ , h ′ > 0 such that for all t ∈ R and f ∈ F we have Theorem 3.12. Let x 0 ∈ U , u ∈ D ′ (U ), P be a differential operator of order d with analytic coefficients in U and F be a compact family of analytic real-valued functions. Assume also that χ k ∈ D(U ) is a sequence of functions satisfying |D α χ k | ≤ Ch |α| k |α| , |α| ≤ k, with supports inside of the same small enough neighborhood W of x 0 . Then the following holds: (1) If M is an R-semiregular weight matrix and (x 0 , df (x 0 )) / ∈ WF {M} (u; P ) ∪ {0} for all f ∈ F then there are a sequence M ∈ M, constants C, h > 0, ν ′ ∈ R and q ∈ N 0 such that for all f ∈ F then there are ν ′ and q ∈ N 0 such that for all M ∈ M and h > 0 there is some C > 0 satisfying (3.11).
Proof. Note first that the set $F = \{t\,df(x_0) : t > 0, f \in \mathcal{F}\}$ is a closed cone in $\mathbb{R}^n \setminus \{0\}$. Since by Proposition 3.7(1) $\mathrm{WF}_{[M]}(u; P)$ is a closed subset of $U \times \mathbb{R}^n \setminus \{0\}$ which is conic in the second variable, there has to be a neighborhood $V$ of $x_0$ and an open conic neighborhood $\Gamma \subseteq \mathbb{R}^n \setminus \{0\}$ of $F$ such that WF [M] u ∩ V × Γ = ∅. Then Lemma 3.8 implies that we can find a sequence $f_k \in \mathcal{E}'(U)$ and $\nu \in \mathbb{R}$ such that the following holds. First, $f_k|_V = (P^k u)|_V$ and the Fourier transforms of the $f_k$ either satisfy in the Roumieu case, for some constants $C, h$ and M ∈ M or, in the Beurling case, for all M ∈ M and $h > 0$ there is some $C > 0$ such that (3.12) and (3.13) hold. We assume for the moment that (3.12) and (3.13) hold for some fixed M ∈ M and some constants $C, h > 0$. We can further suppose that $\operatorname{supp} \chi_k \subseteq W = V$. Moreover, we set $v_{k,t} = \chi_{dk+q} e^{-itf}$ for some fixed integer $q \ge n + 1 + \nu$. We conclude that The normalized functions |t| + |ξ| with $f \in \mathcal{F}$ and $t > 0$ form a compact family of analytic functions without a critical point in $x_0$ as long as $\xi \notin \Gamma$ or $\xi \in \Gamma$ and $\min(|t|/|\xi|, |\xi|/|t|) < \varepsilon$ for some sufficiently small $\varepsilon > 0$. If the supports of the $\chi_k$ are sufficiently small around $x_0$, Lemma 3.11 allows us to estimate $\hat{v}_{k,t}(-\xi)$. In fact, there exist constants $C', h' > 0$ such that for $f \in \mathcal{F}$, $t > 0$, $\xi \notin \Gamma$ or $\xi \in \Gamma$ and $\min(|t|/|\xi|, |\xi|/|t|) < \varepsilon$. Note that the right-hand side of (3.15) can be bounded by $C'(h')^{dk+q}$. Now recall that (2.7) implies that for all M ∈ M there is some $\gamma > 0$ such that From this we obtain, with the same constant $\gamma$, for all $k \in \mathbb{N}$ and all $\tau > 0$. Hence we obtain from (3.12), (3.13), (3.14) and (3.15) the following estimate Note that if $0 < \gamma \le 1$ then On the other hand, for $\gamma > 1$ we have the trivial estimate $(M_k)^{1/k} \le \gamma (M_k)^{1/k}$. Hence, since $t \ge 1$, the first integrand in the right-hand side above can be bounded by with $h_1$ being a multiple of $h$, $h'$ and possibly $\gamma$.
Following iterated application of (2.8) we can conclude that there are constants C, h > 0 and a weight sequence M ′ ∈ M such that and we have proven the theorem in the Roumieu case.
It is easy to see that the same proof holds also in the Beurling category. Proof. Let F : U → U ′ be an analytic diffeomorphism from U onto an open subset U ′ ⊆ R n which transforms the operator P into the operator P F defined by Then for all k ∈ N 0 . We set y = F (x) and u = v • F . We are going to show that, if (x 0 , ξ 0 ) / ∈ WF [M] (u; P ) then (y 0 , η 0 ) / ∈ WF [M] (v; P F ) where y 0 = F (x 0 ) and ξ 0 = F ′ (x 0 ) T η 0 . Let χ k ∈ D(U ) be a sequence of functions with supports in a small enough neighborhood of x 0 and which are equal to 1 near x 0 and satisfy |D α χ k | ≤ C(hk) |α| when |α| ≤ k. If Γ is the cone associated to ξ 0 in Definition 3.6 then (F ′ (x 0 ) T ) −1 Γ is an open conic neighborhood of η 0 . It follows that the family According to Lemma 3.10 we have that for some constants C, h > 0. In the Roumieu case Theorem 3.12 implies that there are constants C, h > 0, ν ′ ∈ R and q ∈ N such that If we define ϕ k = χ k • F −1 and g k = ϕ dk+q P k F v then we obtain Furthermore, by Lemma 3.10 the functions ϕ k satisfy for some constants C, h > 0. Hence, by Proposition 3.5 the estimate (3.6) holds for the sequence g k too. Since g k | V = P k v in some neighborhood V ⊆ U ′ of y 0 we have therefore shown that (y 0 , η 0 ) / ∈ WF {M} (v; P F ).
Virtually the same proof also gives the result in the Beurling case.

Lemma 3.14. Let K ⊆ U be compact, F ⊆ R n \ {0} be a closed cone, u ∈ D ′ (U ), P be an analytic differential operator and ϕ k (x, ξ) be a sequence of smooth functions on U × F with supp ϕ k ( . , ξ) ⊆ K for all k ∈ N 0 and ξ ∈ F for which there are constants C, h > 0 such that

(3.16)
for all k ∈ N 0 . Furthermore assume that M is a [semiregular] weight matrix and let µ be the order of u near K. Then the following holds: (2) If WF [M] (u; P ) ∩ K × F = ∅ then there are M ∈ M constants ν ≥ 0 and h, C > 0 (resp. there is some ν ≥ 0 such that for all M ∈ M and h > 0 there exists some C > 0) satisfying Proof. We begin with the proof of (1) in the Roumieu category. Due to Lemma 2.16 there is a bounded sequence u k ∈ E ′ (U ) such that u k | W = u| W in some neighborhood W of K and η ∈ R n , ξ ∈ F, |ξ| > k, (3.17) whereφ k (η, ξ) = e −ixη ϕ k (x, ξ) dx is the partial Fourier transform of ϕ k . Furthermore if ξ ∈ F we can choose 0 < c < 1 such that η ∈ Γ when |ξ − η| ≤ c|ξ|. [36, equation (8.1. 3)] states that Hence there are some C, h > 0 such that for ξ ∈ F , |ξ| > k and k > µ + n.
We now turn to the proof of (2). In the Roumieu case Lemma 3.8 and Proposition 3.5 imply that there are a neighborhood W of K, an open conic neighborhood Γ of F and a sequence E ′ (U ) such that f k = P k u in W and for some M ∈ M and constants ν ∈ R and C, h > 0. Similarly to above we have Without loss of generality we may assume that ν ≥ 0. By (3.16) we have that there are constants C, h > 0 such that Moreover, there is a constant κ > 0 such that if ξ ∈ F and η / ∈ Γ then |ξ − η| ≥ κ(|ξ| + |η|). Hence, by (3.17) we have Thus there exists some M ′ ∈ M and constants C, h > 0 such that for some M ∈ M and some constants C, h > 0. Let W ⋐ V be a neighborhood of x 0 and F ⊆ Γ ∪ {0} a closed conic neighborhood of ξ 0 . Choose a sequence χ k ∈ D(V ) with χ| W = 1 and |D α χ k (x)| ≤ Ch |α| k |α| for |α| ≤ k. We set f k = χ 2dk P k u. It follows thatf k (ξ) = χ 2dk P k u, e −ixξ = u, Q k e −ixξ χ 2dk where Q denotes the formal adjoint of P given by Qφ, ψ = φ, P ψ with φ, ψ ∈ D. Hence if P = |α|≤d p α (x)D α then Qg = |α|≤d (−D) α (p α g) = |α|≤d q α D α g. We define a new differential operator R by setting Q e −ixξ χ 2dk = e −ixξ |ξ| dk Rχ 2dk .
It follows that R = R 1 + · · · + R d , where R j = R j (x, ξ, D) is a differential operator of order ≤ j with analytic coefficients which are homogeneous of degree −j with respect to ξ. More precisely, It follows that By [35, Lemma 5.2] we have, for |β| + j ≤ 2dk and j = j 1 + · · · + j k , We conclude that D β R k χ 2dk ≤ Ch k (dk) |β| when |ξ| ≥ dk and |β| ≤ dk. Lemma 3.14(1) gives that there is some M ∈ M such that Proposition 3.5 implies that there is some ν such that for any M ∈ M and hence (g k ) k satisfies (3.5).
On the other hand, if |ξ| < dk then by (3.18) we obtain

Proof. As in [13] for the Denjoy-Carleman case, the proof closely follows the pattern used in [35] to show the elliptic regularity theorem, see also [1] and [29].
Let (x 0 , ξ 0 ) ∈ U × R n \{0} be such that (x 0 , ξ 0 ) ∉ WF [M] (u; P ) and p d (x 0 , ξ 0 ) ≠ 0. Thence there exist a conic neighborhood V × Γ of (x 0 , ξ 0 ) and a sequence f k ∈ E ′ (U ) with f k | V = P k u| V which satisfies (3.5) and (3.6). Furthermore there are a compact neighborhood K of x 0 and a conic neighborhood F of ξ 0 , closed in R n \ {0}, such that p d (x, ξ) ≠ 0 for (x, ξ) ∈ K × F . W.l.o.g. we can assume that K × F ⊆ V × Γ. Suppose that χ k ∈ D(K) is a sequence with for some constants C, h independent of k. We set u k = χ 3d 2 k u and thus have û k (ξ) = ⟨u, χ 3d 2 k e −ixξ ⟩. If Q is the adjoint of P then we want to construct a solution v of the equation We define a differential operator R = R(x, ξ, D) on K × F by is a differential operator of order ≤ j with analytic coefficients in x which are homogeneous of degree −j in ξ. By recurrence we obtain for k ∈ N that If we set in (3.19) v = e −ixξ w/p k d (x, ξ) then w satisfies the equation A formal solution of the above equation would be However, we cannot estimate arbitrarily high derivatives of χ 3d 2 k , hence we consider the following approximate solution of (3.20) Then we obtain Inserting (3.21) in (3.19) gives Hence we obtain the following representation for û k , i.e.
Since p −1 d is real analytic in a neighborhood of K and homogeneous of degree −d in ξ ∈ F we can apply the proof of [35,Lemma 5.2] in order to obtain that there are constants C, h > 0 such that for |β| ≤ dk, |ξ| ≥ dk, ξ ∈ F and x ∈ K.
Recall that a system {P 1 , . . . , P ℓ } of differential operators defined on U is said to be elliptic iff $\bigcap_{j=1}^{\ell} \operatorname{Char} P_j = \emptyset$.
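To illustrate the definition, the following minimal sketch tests ellipticity of a system numerically in the simplest setting of constant-coefficient operators on R 2 . There Char P j is the zero set of the principal symbol on ξ ≠ 0 and, by homogeneity, it suffices to look at unit covectors. The function names, the sampling resolution and the tolerance are illustrative assumptions; sampling can only suggest, not prove, that the characteristic sets have empty intersection.

```python
import math

def min_joint_symbol(symbols, samples=360):
    """Smallest value, over sampled unit covectors xi in R^2, of max_j |p_j(xi)|.

    `symbols` is a list of principal symbols xi -> p_j(xi) of
    constant-coefficient operators (hypothetical helper for illustration).
    """
    best = math.inf
    for i in range(samples):
        theta = 2.0 * math.pi * i / samples
        xi = (math.cos(theta), math.sin(theta))
        # A common characteristic direction makes every |p_j(xi)| vanish.
        joint = max(abs(p(xi)) for p in symbols)
        best = min(best, joint)
    return best

def looks_elliptic(symbols, tol=1e-9, samples=360):
    """Heuristic test: no sampled direction is characteristic for all P_j."""
    return min_joint_symbol(symbols, samples) > tol

# Principal symbols of D_x, D_y and of the Laplacian (up to sign).
d_x = lambda xi: xi[0]
d_y = lambda xi: xi[1]
laplace = lambda xi: xi[0] ** 2 + xi[1] ** 2
```

In this toy setting the system {D x , D y } passes the test although neither operator is elliptic on its own, which is exactly the situation the definition is designed to cover.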
We conclude that u ∈ E [M] (U ), cf. Proposition 2.15. Therefore we have obtained

cf. Proposition 3.2(3).
Remark 3.18. Clearly the correspondence between weight functions and their associated weight matrices as described in Subsection 2B yields instantly the transfer of all results in this section to structures given by weight functions. Thus we have in particular generalized the results of [7] to operators with analytic coefficients. We note here only the version of Corollary 3.17: Corollary 3.19. Let P = {P 1 , . . . , P ℓ } be an elliptic system of analytic differential operators and ω be a weight function such that Here we have to generalize the definition of E [ω] (U ; P j ) from Section 1 in analogy to Definition 3.1. However, note that by Remark 3.3 the two definitions agree for subelliptic systems of operators. The proof of Corollary 3.19 follows then immediately from Corollary 3.17, if we recall that W satisfies (2.14). We leave the details to the reader.

Ultradifferentiable scales
In this section we introduce the notion of ultradifferentiable scales and apply them to the Problem of Iterates of analytic differential operators of principal type.
( ) On the other hand the matrix M ζ is R-semiregular if and only if and B-semiregular if and only if ζ satisfies

For a generating function ζ we call the ordered family of weight sequences (M λ ζ ) λ the ultradifferentiable scale generated by ζ. We also say that M ζ is the weight matrix associated to the scale (M λ ζ ) λ . To each ultradifferentiable scale (M λ ζ ) λ we can associate two scales of ultradifferentiable classes, namely the scale of Roumieu classes and of Beurling classes, respectively. Clearly,

We say that an ultradifferentiable scale (M λ ζ ) λ with generating function ζ is fitting if ζ satisfies ( ) and ∀λ ∈ Λ ∀ α > 1 ∃ λ * ≥ λ ∃ γ > 0 : On the other hand, the scale (M λ ζ ) λ is apposite if the generating function ζ obeys ( ) and Furthermore, a scale (M λ ζ ) λ is R-admissible if (⋆) and (⊲) hold for ζ and B-admissible if (⋄) and (⊳) are satisfied. We use the notation [admissible] if the scale is either R- or B-admissible, depending on the context. Furthermore we say that a scale is admissible if it is R- and B-admissible. We observe that a fitting scale is also R-admissible and an apposite scale is B-admissible but the other implications do not hold in general.
It follows that the scale (L q,r ) q is admissible. It is fitting and apposite if and only if r ≤ 2.
4B. Vectors of operators of principal type. If P is an operator of principal type with analytic coefficients in U ⊆ R n and (x 0 , ξ 0 ) ∈ U × R n \{0} then we say, following [68], that P satisfies Condition C x0,ξ0 if either $p_d(x_0,\xi_0) \neq 0$, or $p_d(x_0,\xi_0) = 0$ and for all z ∈ C with $d_\xi \operatorname{Re}(z p_d(x_0,\xi_0)) \neq 0$ the function $\operatorname{Im}(z p_d)$, restricted to the bicharacteristic strip of $\operatorname{Re}(z p_d)$ through (x 0 , ξ 0 ), has a zero of finite even order.

We recall Theorem 4.2 ([68, Theorem II]). Let P be an analytic differential operator of principal type. The following statements are equivalent: (1) P is hypoelliptic.
Since P is subelliptic we have by Remark 3.3 that u ∈ E {M} (U ; P ) if and only if for every V ⋐ U there are M ∈ M and constants h, C > 0 such that P k u ∈ L 2 (V ) and for all k ∈ N 0 . On the other hand u ∈ E (M) (U ; P ) if and only if P k u ∈ L 2 loc (U ) and for all V ⋐ U , all M ∈ M and all h > 0 there is some C > 0 such that (4.2) is satisfied for all k.
The main technical result of [3] is the following theorem:

Theorem 4.3. Let P be a differential operator of order d with analytic coefficients in U ⊆ R n . Let (x 0 , ξ 0 ) ∈ U × R n \{0} and assume that there is a conic neighborhood W 0 × Γ 0 of (x 0 , ξ 0 ) such that P is of principal type in W 0 × Γ 0 and Condition C x,ξ is satisfied for all (x, ξ) ∈ W 0 × Γ 0 . Then there are neighborhoods W ′ ⋐ W ⋐ W 0 of x 0 , a conical neighborhood Γ ⊆ Γ 0 of ξ 0 , C > 0, 0 ≤ δ < 1 and a sequence of functions (ψ k ) k ⊆ D(W ) satisfying 0 ≤ ψ k ≤ 1 and ψ k ≡ 1 on W ′ such that the following holds: For every k ∈ N and u ∈ L 2 (W ) with P k u ∈ L 2 (W ) we have

Remark 4.4. According to [3, Remark 1.2] the number δ in Theorem 4.3 can be chosen to be 0 if P is elliptic at (x 0 , ξ 0 ). When P is non-elliptic at (x 0 , ξ 0 ) then we can take δ = 2k/(2k + 1) where 2k is the maximum order of vanishing of Im(zp d ) mentioned in Condition C x,ξ , for (x, ξ) in a compact neighborhood of (x 0 , ξ 0 ) and z ∈ C.
Note that δ(V ) is closely related to the subellipticity of P : For V ⋐ U we can choose in (3.2) ε = d − δ(V ), see [67]. Now suppose that P is a hypoelliptic operator of principal type with analytic coefficients in U and that (M λ ζ ) is a fitting ultradifferentiable scale with generating function ζ. Recall that Theorem 4.2 implies that Condition C x,ξ holds for all (x, ξ) ∈ U × R n \{0}. Furthermore let u ∈ D ′ (U ) be an {M λ }-vector of P for some λ ∈ Λ and (x 0 , ξ 0 ) ∈ U × R n \ {0}. Applying Theorem 4.3 we conclude that there are neighborhoods W ′ ⋐ W ⋐ U of x 0 , a conical neighborhood Γ of ξ 0 , 0 ≤ δ < 1 and a bounded sequence is satisfied for all M λ , we have that for each ρ > 0 there exists C ρ > 0 such that 1 ≤ C ρ ρ k m λ k for all k ∈ N 0 . Applying also Stirling's formula we obtain that there are constants h > 0 and C > 0 such that If we denote by ⌈y⌉ the smallest integer ≥ y ∈ R then we choose for ℓ ∈ N an integer k ℓ in the following way and on the other hand 0 < δ < 1 implies that δ k ℓ ≤ δ ℓ/(d−δ) . Thus, if we set v ℓ = u k ℓ then we have that for ξ ∈ Γ with |ξ| ≥ 1 and some λ * according to (⊲). Then ( ) and the Stirling formula imply that for some constants C, h > 0. Hence (x 0 , ξ 0 ) / ∈ WF {M λ * } u. If u ∈ D ′ (U ) is a (M λ )-vector of P for some λ, then we have by essentially the same arguments that for every (x 0 , ξ 0 ) ∈ U × R n \{0} there is some λ * ∈ Λ such that (x 0 , ξ 0 ) / ∈ WF (M λ * ) u. In fact, we have obtained the following theorem.
Theorem 4.5. Let P be a hypoelliptic differential operator of principal type with analytic coefficients in U ⊆ R n and (M λ ) λ be an ultradifferentiable scale. Then the following holds: (1) If (M λ ) λ is fitting then for all V ⋐ U and all λ ∈ Λ there is some λ * ∈ Λ such that every Proof. Note first that by Remark 4.4 for every V ⋐ U there is some δ(V ) ∈ [0, 1) such that (4.3) holds with δ = δ(V ) for all (x 0 , ξ 0 ) ∈ V × R n \{0}. Condition (⊲) implies that for every λ there is some λ * such that for all t ∈ [1, ∞) and some C > 0. Thus the above arguments give Hence u is of class [M λ * ] in V by Proposition 2.15, which proves (1). On the other hand, by (⊳) we obtain that for every λ * there is some λ such that (4.4) holds for t ∈ [1, ∞) and some C > 0. Adapting the arguments above we then conclude that WF [ (1) If u is an [L q,r ]-vector of P for some q > 1 and 1 < r ≤ 2 then u is of class (2) If u is a [B j,λ ]-vector for some j ∈ N and λ > 0, then u is of class Theorem 4.7. Let P be as in Theorem 4.5, (M λ ζ ) λ be an [admissible] ultradifferentiable scale and M ζ the associated weight matrix. Then Proof. We begin with the Roumieu case. If u ∈ E {M ζ } (U ; P ) then for every V ⋐ U there are λ ∈ Λ and C, h > 0 such that

As above we obtain from Theorem 4.3 and (4.3) that there is a bounded sequence
where C > 0, h > 0, Γ is a conic neighborhood of ξ 0 and δ = δ(V ), depending only on the operator P and V , is as in Remark 4.4. If we choose k ℓ , ℓ ∈ N, as before and set v ℓ = u k ℓ then we can conclude in the same manner from (⊲) that for some λ * ∈ Λ. Hence (⋆) gives for some constants C, h > 0 and λ ′ ∈ Λ independent of ℓ. Therefore, since (x 0 , ξ 0 ) ∈ V × R n \{0} was chosen arbitrarily, and by Theorem 2.14 Since this holds for all V ⋐ U it follows that WF {M ζ } u = ∅. Hence u ∈ E {M ζ } (U ) by Proposition 2.15. If u ∈ E (M ζ ) (U ; P ) then for all V ⋐ U , λ ∈ Λ and h > 0 there is a constant C > 0 such that Furthermore there is a conic neighborhood Γ of ξ 0 such that for all λ ∈ Λ and all h > 0 there exists a constant C > 0 such that If k ℓ for ℓ ∈ N is defined as before then it is easy to see that (⊳) implies that for all λ * and h > 0 there is a constant C > 0 such that It follows from (⋄) that for all λ ′ ∈ Λ and h > 0 there is some C > 0 such that for all ℓ ∈ N 0 we have for all λ ′ ∈ Λ and therefore by Proposition 2.14 This means that WF (M ζ ) u = ∅ and consequently u ∈ E (M ζ ) (U ).
Corollary 4.8. Let P be as in Theorem 4.5. Then and Example 4.9. Let P be as in Theorem 4.5.
Then any (M ζ )-vector u ∈ D ′ (U ) is of class (N η ) in V .
Remark 4.11. Another important fact in Example 4.9(2) was that the weight matrices J σ associated to the scales (B j,σ ) j satisfy J ρ (≈)J σ for all ρ, σ. Of course, we can express this property in terms of the generating functions of the scales. Assume, again, that two ultradifferentiable scales (M λ ζ ) λ∈Λ and (N υ η ) υ∈Υ with generating functions ζ : Λ × [0, ∞) → [0, ∞) and η : Υ × [0, ∞) → [0, ∞), respectively, are given. For such a pair of generating functions we define an auxiliary function Φ ζ η : It is clear that M λ N υ if lim sup t→∞ Φ ζ η (λ, υ; t) < ∞. We can distinguish the following cases: (2) On the other hand We might also ask when two ultradifferentiable scales generate the same scales of Denjoy-Carleman classes. In order to give an answer to this question, we say that two generating functions If ζ and η are comparable then M λ ζ ≈ N for all λ ∈ Λ and all systems P of differential operators.

Scales induced by weight functions
5A. Condition (Ξ). In this section we are going to prove Theorem 1.1, but first we need to analyze condition (Ξ). It is useful for our deliberations to set ω| [0,1] ≡ 0 and ω satisfies (β) and (γ) since we have the following Lemma.
Proposition 5.2. Let ω be a weight function which satisfies (Ξ) and denote its associated weight matrix by W = {W λ : λ > 0}. If we define another weight matrix W by then W{≈} W and W(≈) W. 30 The main idea of the proof of Theorem 1.1 is to associate to ω the scale generated by If ω satisfies (Ξ) and W is the weight matrix associated to the scale generated by ζ ω then E [ω] (U ) = E [ W] (U ) by Proposition 5.2. The generating function ζ ω satisfies (⋆) and (⋄): Let ω ∈ W 0 and ϕ * λ,ω (t) := 1 λ ϕ * ω (λt) for λ > 0. Then we have for all λ > 0.
On the other hand, if (1) holds then there are constants C, h > 0 such that Thus for t ≥ 0 we can compute that we have ϕ σ (u) = 0 for u < 0 by normalization. Hence for all λ > 0 and t ≥ 0 we have Thus (2) is verified with the constants A := C and D := max{hα, λ −1 }. Observe that A does not depend on λ.
An immediate consequence of Lemma 5.5 is Corollary 5.6. If ω ∈ W 0 then the following are equivalent: (1) For all α > 1 there exists σ ∈ W 0 and L ≥ 1 such that (2) For all α > 1 there exists σ ∈ W 0 such that Hence, if we combine Corollary 5.6 with Lemma 5.4 we obtain Corollary 5.7. Let ω ∈ W 0 . The following two conditions are equivalent: (1) ω satisfies (Ξ). for any differential operator P with analytic coefficients.  [9,Example 3.1] showed that if P is not elliptic then there is a weight function ω P which is not equivalent to any Gevrey weight function t 1/s such that E {ωP } (U ) E {ωP } (U ; P ). This example does not contradict Theorem 1.1 since ω P does not satisfy (Ξ). In fact, for each ω P there exist 1 < s < s ′ by construction such that G s (U ) E {ωP } (U ) G s ′ (U ), but the class associated with a weight function satisfying (Ξ) is not contained in any Gevrey class as the following result shows. However, by Lemma 5.1 there is some 0 < γ < 1 such that t γ ω, which in particular implies that the space of analytic functions is strictly contained in E [ω] (R).
5B. Some remarks. We can use the "mixed" conditions of Corollary 5.6 to obtain results like Theorem 4.5, cf. also Remark 4.10, for weight functions. In fact, the conditions in Corollary 5.6 seem to be similar to those in Remark 4.10. However, arguing absolutely analogously to Section 4 we would not obtain results for some weight functions ω and σ and their associated weight matrices W and S but for the weight matrices W and S, cf. Proposition 5.2. As we have seen that does not matter if ω = σ satisfies (Ξ).
But for the "mixed" setting note first that we can drop (k!) −δ in (4.3) since (k!) δ ≥ 1 for all k ∈ N 0 and δ > 0. The other estimates before Theorem 4.5 remain also valid if we drop the "factorial" factors of the form k k(d−δ) . We obtain therefore the following Theorem, but we need to discuss subsequently how it fits in the theory presented in Section 4. Proof. We denote by W = {W λ : λ > 0}, W λ k = ϕ * λ,ω (k), and S = {S λ : λ > 0}, S λ k = ϕ * λ,σ (k), the weight matrices associated to ω and σ, respectively. According to Corollary 5.6 there is a constant A > 0 such that for every λ > 0 we have for some constant D > 0. If u ∈ E {σ} (U ; P ) then there exist λ > 0, h > 0 and C > 0 such that for all k ∈ N 0 . Now (4.3) and (5.2) imply similarly to the argument before Theorem 4.5 that Hence u| V ∈ E {W} (V ) = E {ω} (V ) by Proposition 2.14 and Proposition 2.15. The Beurling case follows analogously.
Remark 5.12. If we set in Theorem 5.11 $\omega(t) = t^{1/(\alpha s)}$ and $\sigma(t) = t^{1/s}$ then we obtain that any s-Gevrey vector is an αs-Gevrey function in V . But this is a weaker result than [3,Theorem 1.3]. In particular, by Theorem 5.11 we would only obtain that an analytic vector is an α-Gevrey function in V . This reflects the difference in the definition of the ultradifferentiable scales: In Section 4 we have defined the weight sequences $M^{\lambda}$ of the scale generated by ζ by $m^{\lambda}_k = \exp\circ\zeta_{\lambda}(k)$, i.e. $M^{\lambda}_k = k!\,(\exp\circ\zeta_{\lambda}(k))$, whereas the definition of the scale associated to a weight function in this section corresponds to $M^{\lambda}_k = \exp\circ\zeta_{\lambda}(k)$. By Proposition 5.2 the two definitions are essentially equivalent when the weight function satisfies (Ξ). For the moment we may call a scale $(M^{\lambda})_{\lambda}$ of semiregular weight sequences weak if it is defined via the sequences $M^{\lambda}_k = \exp\circ\zeta_{\lambda}(k)$. On the other hand, we might say that the scales from Section 4, i.e. those given by $k!\,\exp\circ\zeta_{\lambda}(k)$, are strong.
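The distinction between weak and strong scales can be made concrete with a small numerical sketch. It works with logarithms to avoid overflow and uses the generating function ζ λ (t) = λt 2 purely as an illustrative choice; the function names are hypothetical and not part of the paper.

```python
import math

def zeta(lam, t):
    """Illustrative generating function zeta_lambda(t) = lambda * t^2 (an assumption)."""
    return lam * t * t

def log_weak(lam, k, zeta_fn=zeta):
    """log M_k for the weak scale M_k = exp(zeta_lambda(k))."""
    return zeta_fn(lam, k)

def log_strong(lam, k, zeta_fn=zeta):
    """log M_k for the strong scale M_k = k! * exp(zeta_lambda(k))."""
    # lgamma(k + 1) = log(k!) without computing the factorial itself.
    return math.lgamma(k + 1) + zeta_fn(lam, k)

def log_gap(lam, k, zeta_fn=zeta):
    """Difference of the two logarithms; it equals log(k!) for every zeta."""
    return log_strong(lam, k, zeta_fn) - log_weak(lam, k, zeta_fn)
```

The two sequences differ exactly by the factor k!, so the gap is independent of ζ. Whether this factor matters depends on how fast exp(ζ λ (k)) grows compared with k!: for the illustrative choice above, log k! grows like k log k and is eventually dominated by λk 2 .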
We observe that Theorem 5.11 shows that it would make a big difference if we had used weak scales in Section 4: As we pointed out above, for the Gevrey scale it would mean that we could only prove a weaker version of [3,Theorem 1.3], and we would prove the Roumieu version of Proposition 1.2 but not the Beurling version. We note also that in this situation the scales (B j,σ ) σ are not recognized under the framework of weak scales, as the fact from above that analytic vectors might only be Gevrey functions indicates.
On the other hand, if we consider scales that are larger than the Gevrey scales there is not much difference. In the case of the scales (L q,r ) q we have already noted that for the proof of Theorem 4.7 for the matrices Q r (and therefore for the weight matrix R) there is no real difference if we use the scale (L q,r ) q or the scale (N q,r ) q given by N q,r k = q k r . In fact, we have the following variant of Corollary 4.6: Corollary 5.13. Let q > 1, 1 < r ≤ 2 and P be as in Theorem 4.5 and suppose that u is an [N q,r ]-vector.
Remark 5.14. In order to decide which kind of scales should be used for studying the regularity of vectors of a given operator, one can, in the case of operators which have been already studied, look at the regularity of Gevrey vectors. The technical reason why strong scales are advantageous for the study of vectors of operators of principal type is the factor (k!) −δ in the main estimate (4.3), cf. the estimates before Theorem 4.5. For another example using the definition from section 4 see Subsection 6A.
On the other hand, in the case of Hörmander's sum of squares operators, introduced in [34], there is some r > 1 depending on the operator such that s-Gevrey vectors are rs-Gevrey vectors and these results are strict, see [52], [18] and also the survey in [23]. A similar result was obtained for some class of locally integrable structures of corank one in [20]. Hence in these two instances weak scales are more appropriate for the study of ultradifferentiable vectors.
We can try to analyze these examples to find some general conditions which can help to decide which kind of scales to use for the study of vectors of a given operator or system of operators. It seems that two properties play an important role: subellipticity and that analytic vectors are analytic. We have seen that hypoelliptic operators of principal type satisfy both conditions (as the systems of vector fields from Subsection 6A do). In contrast, the sums of squares operator of Hörmander are subelliptic but there are analytic vectors which are not analytic functions, see [52] and also [18]. The analytic vectors of the locally integrable structures considered in [20] are analytic but locally integrable structures are in general not subelliptic, cf. [41]. In this case we refer also to the discussion in [20, Section 10].
The main result of [34] is that if the system X = {X 1 , . . . , X ℓ } is of finite type then X is hypoelliptic. In the case of analytic vector fields [22] showed that the finite type condition is also necessary for smooth hypoellipticity.
In [31] it was proven that if the family X = {X 1 , . . . , X ℓ } is analytic and of finite type then ℓ j=1 A(U ; X j ) = A(U ).
For Gevrey vectors [21] showed that if the collection of analytic X 1 , . . . , X ℓ is of finite type of order ν and generates a stratified nilpotent Lie algebra G of rank ν, i.e.
The theory of ultradifferentiable scales from Section 4 allows us to generalize the result of [21]: Theorem 6.1. Let X = {X 1 , . . . , X ℓ } be a family of analytic, real-valued vector fields on U ⊆ R n that is of finite type of order ν and generates a stratified Lie algebra of rank ν.
(1) If (M λ ζ ) λ is a fitting scale then for every λ ∈ Λ there is some Proof. If (1) holds then we can without loss of generality assume that C ∈ N. Hence (6.1) gives Since M 2 k ≤ M 2k by (2.1) we have proven (2) for q = C, A = e C and γ = H 2C . On the other hand (2) implies that We denote by M = {M (λ) : λ > 0} resp. N = {N (λ) : λ > 0} the weight matrix associated to ω M resp. ω N . It is easy to show that M = M (1) (see for example the proof of [61,Theorem 6.4]) and observe also that for all q ∈ N and k ∈ N 0 . Therefore from (2) we obtain for all t > 0 and k ∈ N 0 . Hence by definition We recall that ω N (λ) ∼ ω N , more precisely we have Hence (1) is proven with H = √ γ 1 and C = max{4q, 2q log(A 1 ) + D q }.
Corollary 6.4. Let M be a weight sequence. Then the following are equivalent: (1) The associated weight function ω M satisfies (Ξ).
(2) There is a positive integer p ∈ N and constants A, B > 0 such that holds for all k ∈ N 0 .
Note that in (6.2) we can assume that p ≥ 2, because p = 1 would yield that sup k M 1/k k < ∞. It is a natural question to ask if there is a weight sequence M such that (6.2) and E [M] (U ) = E [ωM] (U ). However, according to [16], a necessary condition for the last identity is for M to be of moderate growth, i.e. there is a constant γ > 0 such that for all j, k ∈ N 0 . Lemma 6.5. A weight sequence M does not satisfy simultaneously (6.2) and (6.3).
Proof. Assume that both (6.2) and (6.3) hold for M. Then (6.3) implies that where p is the integer from (6.2). Hence we have, if we combine the estimate above with (6.2), that It follows that sup k (M k ) 1/k < ∞ and therefore M is not a weight sequence.
(2) The sequence M given by $M_0 = 1$ and $M_k = e^{e^k}$, $k \in \mathbb{N}$, satisfies (6.2) with p = 8 because we have the estimate $\bigl(e^{e^k}\bigr)^{16} = e^{16e^k} \le e^{e^{8k}}$, since $16 \le e^{4}$ and $4 + k \le 8k$ for all $k \in \mathbb{N}$.
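The estimate can be sanity-checked numerically. Since numbers of the size e^(e^(8k)) overflow immediately, the sketch below compares exponents instead: (e^(e^k))^16 ≤ e^(e^(8k)) is equivalent to 16e^k ≤ e^(8k), that is, to log 16 + k ≤ 8k. The function name and its default arguments are illustrative.

```python
import math

def power_estimate_holds(k, power=16, p=8):
    """Check (e^{e^k})^power <= e^{e^{p*k}} by comparing log-exponents.

    Taking logarithms twice reduces the claim to log(power) + k <= p * k.
    """
    return math.log(power) + k <= p * k
```

Only k ≥ 1 is relevant here, because M 0 = 1 is fixed separately in the definition of the sequence.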
6C. Families of weight functions: An example. Let P be again a hypoelliptic operator of principal type with analytic coefficients in an open set U ⊆ R n and consider Ω = {ω s : s > 0}, where ω s (t) = (max{0, log(t))}) s is the weight function from Example 2.11. Then by Theorem 1.1 we know that E [ωs] (U ; P ) = E [ωs] (U ), but analogously to the case of weight matrices, i.e. families of weight sequences, we can also consider the spaces associated to Ω, i.e. we define and also We have Proposition 6.7. If P is a hypoelliptic analytic operator of principal type then On the other hand, if u ∈ E {Ω} (U ; P ) then for all V ⋐ U there is some s > 1 such that However, it turns out that we have already encountered the spaces E [Ω] : Theorem 6.8. Let P as above and R be as in Example 2.6(2). Then The first equality follows from a more general theorem in [61]. In order to state that theorem we need to recall some notations. If M is a weight matrix we denote by Ω M = {ω M : M ∈ M} the family of weight functions associated to M. Similarly to above we can define the spaces of ultradifferentiable functions associated to Ω M : We consider the following conditions In [61] the following result was shown: Proof of Theorem 6.8. We recall from Example 2.11 that the weight matrix associated to ω s , s > 1, is W s = {N q,r : q > 1}, where r = s/(s − 1) and N q,r = q k r . Then ω N q,r ∼ ω s for all s > 0 and q > 0 by [55,Lemma 5.7]. Note also that by Proposition 5. On the other hand, for any h > 0 and r > 1 we can choose r ′ > 1 and D > 0 large enough such that It follows that R satisfies (6.6) and (6.7).
Hence

6D. A characterization of ellipticity by non-Gevrey vectors. The aim of this section is to prove Theorem 1.4. We begin with two easy observations, which we will need later on:

Lemma 6.10. Let M be a weight sequence and ρ, R ≥ 1. Then for all j, k, ℓ ∈ N 0 .
Before we can begin with the proof of Theorem 1.4 we also need to take a closer look at the scale (N q ) q , given by N q k = q k 2 , specifically. Recall that N = {N q : q > 1} is the weight matrix associated to ω 2 (t) = (max{0; log t}) 2 . More precisely, ϕ 2 (t) = ω 2 • exp(t) = t 2 and ϕ * 2 (t) = t 2 /4. Hence the canonical weight matrix W 2 = {W 2,ρ : ρ > 0} associated to ω 2 is given by W 2,ρ k = exp(ρk 2 /4), cf. [57,Section 5.5]. Thus it is convenient to set λ = log q and to write in a slight abuse of notation N λ = N q . It follows that N λ = W 2,4λ and therefore Lemma 6.11 implies that Now observe that (N λ ) λ is the weak scale associated to the generating function ζ(t, λ) = λt 2 , which clearly can be extended to an entire function ζ(z, λ) in the first variable. Hence θ(z, λ) = exp •ζ(z, λ) is holomorphic in z and when λ is fixed we have that for every strip G = {w = u + iv ∈ C : a < u < b} there is a constant C > 0 such that |θ(z, λ)| ≤ Ce −|Im z| 2 . It follows that θ( . , λ) is the Mellin transform of the function see e.g. [65] or [51]. In particular In order to prove Theorem 1.4 it is enough to show the following statement.
Theorem 6.12. Let P be a differential operator with analytic coefficients on U which is not elliptic at some point x 0 ∈ U . Then we have for all λ > 0.
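For orientation, the identification $N^{\lambda} = W^{2,4\lambda}$ used before the statement can be verified directly. The computation below is a sketch that assumes the usual convention $W^{\rho}_k = \exp\bigl(\rho^{-1}\varphi^{*}(\rho k)\bigr)$ for the weight matrix associated to a weight function, where $\varphi^{*}$ denotes the Young conjugate of $\varphi = \omega \circ \exp$.

```latex
\begin{align*}
\varphi_2^{*}(s) &= \sup_{t \ge 0}\bigl(st - t^{2}\bigr) = \frac{s^{2}}{4}
   \quad (s \ge 0), && \text{attained at } t = s/2,\\
W^{2,\rho}_k &= \exp\Bigl(\tfrac{1}{\rho}\,\varphi_2^{*}(\rho k)\Bigr)
   = \exp\Bigl(\tfrac{\rho k^{2}}{4}\Bigr),\\
N^{\lambda}_k &= q^{k^{2}} = e^{\lambda k^{2}}
   = \exp\Bigl(\tfrac{4\lambda\,k^{2}}{4}\Bigr) = W^{2,4\lambda}_k,
   && \lambda = \log q.
\end{align*}
```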
Proof. It is sufficient for given λ > 0 to construct a function u which is an {N λ }-vector of P but is not in E {N λ } (U ). In order to do so we shall try to follow the pattern of the proof of [52,Theorem 2.3]. From now on let λ > 0 be arbitrary but fixed and choose parameters ε, λ ′ > 0 and 0 < λ 0 < λ depending on λ which will later be specified. Since P is not elliptic at x 0 there exists some ξ 0 ∈ S n−1 such that p d (x 0 , ξ 0 ) = 0. (6.10) Let δ > 0 be such that B 0 = {x ∈ R n : |x − x 0 | < 2δ} ⋐ U and let ψ ∈ E {N λ 0 } (R n ) be such that supp ψ ⊆ {x ∈ R n : |x| < 2δ} and ψ(x) = 1 for |x| ≤ δ. Thus there are constants C 0 , h 0 > 0 such that for all x ∈ R n . This is possible since N τ is non-quasianalytic for any τ > 0.

It follows that
where $D_{\xi_0} = -i\,\partial/\partial\xi_0$ is the directional derivative in direction $\xi_0$. Thence (6.9) implies that Since $\int_0^1 t^{k}\,\Theta(t,\lambda')\,dt \to 0$ when $k \to \infty$ we have shown that u cannot be of class $\{N^{\tau}\}$ in any neighborhood of $x_0$ for any $\tau < \lambda'$.
On the other hand, it is easy to see that where Q k is defined recursively by Since P is analytic in U we have that there is a constant H > 0 such that for all $\nu, \alpha \in \mathbb{N}_0^n$ with $|\alpha| \le d$, all $x \in B_{2\delta}(x_0)$ and all $t \ge 1$:
$$\bigl|D_x^{\nu}\partial_\xi^{\alpha} p(x, t\xi_0)\bigr| \le H^{|\nu|+1}\,|\nu|!\,t^{d-|\alpha|}, \tag{6.12}$$
and due to (6.10) for all $0 < \varepsilon < 1$ there is $C_1 > 0$ such that for all $t > 0$ and all $x \in U$ with $|x - x_0| \le 2\delta t^{-\varepsilon}$:
$$\bigl|p(x, t\xi_0)\bigr| \le C_1 t^{d-\varepsilon}. \tag{6.13}$$
Using the above estimates (6.12) and (6.13) together with Lemma 6.10 it is easy to see that we can adapt the proof of [52, Lemme 2.1] and therefore obtain the following statement.

If for fixed λ > 0 we choose the parameters 0 < λ 0 < λ and ε such that The integral converges as long as The proof of Theorem 6.12 is complete if we put additionally λ ′ > λ.
Appendix A. Subelliptic estimates

The aim of this appendix is to indicate how (3.2) implies (3.2 ′ ). Following [48] we introduce the local Sobolev space H σ (V ), σ ∈ R, over an arbitrary open set V ⊆ R n as the quotient space H σ (V ) = H σ (R n )/F σ (V ), where F σ (V ) is the space of all functions f ∈ H σ (R n ) which vanish on V . Clearly F σ (V ) is a closed subspace of H σ (R n ), hence H σ (V ) is a Hilbert space with the structure inherited from H σ (R n ).
Remark A.1. It is easy to see that $H^{0}(V) = L^{2}(V)$ for all open sets $V \subseteq \mathbb{R}^n$. However, if we consider the classical Sobolev space $W^{k}(V) = \{ f \in L^{2}(V) : \partial^{\alpha} f \in L^{2}(V)\ \forall\, |\alpha| \le k \}$ then we cannot conclude in general that $W^{k}(V) = H^{k}(V)$ for $k \in \mathbb{N}$, unless V is a relatively compact set in $\mathbb{R}^n$ with smooth boundary.
We denote the (quotient) norm of H σ (V ) by . for all ϕ ∈ D(V ) and some constant C > 0 independent of ϕ. If we multiply all of the coefficients of the operator P j with a test function χ ∈ D(U ) satisfying χ| V = 1 we may assume that the operator P j is a continuous mapping from the space H σ (R n ) into H σ−dj (R n ) for all σ. This clearly does not change the value of P j ϕ σ when ϕ ∈ D(V ) or of P j g H σ (V ) when g ∈ H σ loc (U ). Therefore the mapping P j : H ∞ (R n ) → H ∞ (R n ), where H ∞ (R n ) = proj σ H σ (R n ), is also continuous. Moreover, observe that F ∞ (V ) = σ F σ (V ) is closed in H ∞ (R n ). Similarly, H ∞ (V ) = proj σ H σ (V ) is a Fréchet space and P j is a continuous automorphism on H ∞ (V ) since P j is a local operator.
We recall that we want to show that (A.1) implies for all g ∈ E(U ).

Proof. For each $g \in H^{\infty}_{\mathrm{loc}}(U) = E(U)$ we have to find a sequence $\varphi_j \in D(B)$ such that $\dot{\varphi}_j = \varphi_j + F^{\infty}(B)$ converges to $\dot{g} = \iota_B(g)$ in $H^{\infty}(B)$. A representative of $\dot{g}$ is given by $\chi g$ where $\chi \in D(U)$ with $\chi|_B = 1$. We choose two sequences $(K_j)_j$, $(L_j)_j$ of compact subsets of U with the following properties:
• $K_j \subseteq B$ and $\operatorname{dist}(K_j, U \setminus B) \to 0$ when $j \to \infty$.