Anisotropic, Mixed-Norm Lizorkin--Triebel Spaces and Diffeomorphic Maps

This article gives general results on invariance of anisotropic Lizorkin--Triebel spaces with mixed norms under coordinate transformations on Euclidean space, open sets and cylindrical domains.


Introduction
This paper continues a study of anisotropic Lizorkin-Triebel spaces $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ with mixed norms, which was begun in [JS07,JS08] and followed up in our joint work [JHS12].
First Sobolev embeddings and completeness of the scale $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ were established in [JS07], using the Nikol'skiȋ-Plancherel-Polya inequality for sequences of functions in the mixed-norm space $L_{\vec p}(\mathbb{R}^n)$, which was obtained straightforwardly in [JS07]. Then a detailed trace theory for hyperplanes in $\mathbb{R}^n$ was worked out in [JS08], e.g. with the novelty that the well-known borderline $s = 1/p$ has to be shifted upwards in some cases, because of the mixed norms.
Secondly, our joint paper [JHS12] presented some general characterisations of $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$, which may be specialised to kernels of local means, in Triebel's sense [Tri92]. One interest of this is that local means have recently been useful for obtaining wavelet bases of Sobolev spaces and especially of their generalisations to the Besov and Lizorkin-Triebel scales.
In the present paper, we treat the invariance of $F^{s,\vec a}_{\vec p,q}$ under coordinate changes. In the discussions below, the results in [JHS12] are crucial for the entire strategy.
Indeed, we address the main technical challenge, which is to obtain invariance of $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ under the map
\[ f \mapsto f\circ\sigma \]
when $\sigma$ is a bounded diffeomorphism on $\mathbb{R}^n$. (Cf. Theorems 4 and 5 below.) Not surprisingly, this will require the condition on $\sigma$ that it only affects blocks of variables $x_j$ in which the corresponding integral exponents $p_j$ are equal, and similarly for the anisotropic weights $a_j$. Moreover, when estimating the operator norm of $f \mapsto f\circ\sigma$, i.e. obtaining the inequality
\[ \| f\circ\sigma \,|\, F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\| \le c\, \| f \,|\, F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\|, \]
the Fourier analytic definition of the spaces seems difficult to manage directly, so as done by Triebel [Tri92] we have chosen to characterise $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ in terms of local means, as developed in [JHS12].
However, the diffeomorphism invariance relies not just on the local means, but first of all also on techniques underlying them. In particular, we use an inequality for the maximal function $\psi_j^* f(x)$ of Peetre-Fefferman-Stein type, which was established in [JHS12, Th. 2] for mixed norms and with uniformity with respect to a general parameter $\theta$. Hereby the 'cut-off' functions $\psi_j$, $\varphi_j$ should fulfil a set of Tauberian and moment conditions; cf. Theorem 1 below for the full statement. In the isotropic case this inequality originated in a well-known article of Rychkov [Ryc99a], which contains a serious flaw (as pointed out in [Han10]); this and other inaccuracies were corrected in [JHS12].
A second adaptation of Triebel's approach is caused by the anisotropy $\vec a$ we treat here. In fact, our proof only extends to e.g. $s < 0$ by means of the unconventional lift operator $\Lambda_r$ introduced in the discussion of lift operators below. Moreover, to cover all $\vec a = (a_1,\dots,a_n)$, especially to allow irrational ratios $a_j/a_k$, we found it useful to invoke the corresponding pseudo-differential operators $(1-\partial_j^2)^{\mu} = \mathrm{OP}((1+\xi_j^2)^{\mu})$ that for $\mu\in\mathbb{R}$ are shown here to be bounded $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n) \to F^{s-2a_j\mu,\vec a}_{\vec p,q}(\mathbb{R}^n)$ for all $s$.
Local versions of our result, in which $\sigma$ is only defined on subsets of $\mathbb{R}^n$, are also treated below. In short form we have e.g. the following result (cf. Theorem 6 below):
Theorem. Let $U, V\subset\mathbb{R}^n$ be open and let $\sigma\colon U\to V$ be a $C^\infty$-bijection of the form $\sigma(x) = (\sigma'(x_1,\dots,x_{n-1}), x_n)$. When $f\in F^{s,\vec a}_{\vec p,q}(V)$ has compact support and all $p_j$ are equal for $j<n$, and similarly for the $a_j$, then $f\circ\sigma\in F^{s,\vec a}_{\vec p,q}(U)$ and
\[ \| f\circ\sigma \,|\, F^{s,\vec a}_{\vec p,q}(U)\| \le c(\operatorname{supp} f,\sigma)\, \| f \,|\, F^{s,\vec a}_{\vec p,q}(V)\|. \]
This is useful for introduction of Lizorkin-Triebel spaces on cylindrical manifolds. However, this subject is postponed to our forthcoming paper [JHS]. (Already this part of the mixed-norm theory has seemingly not been elucidated before). Moreover, in [JHS] we also carry over trace results from [JS08] to spaces over a smooth cylindrical domain in Euclidean space e.g. by analysing boundedness and ranges for traces on the flat and curved parts of its boundary.
To elucidate the importance of the results here and in [JHS], we recall that the $F^{s,\vec a}_{\vec p,q}$ are relevant for parabolic differential equations with initial and boundary value conditions: when solutions are sought in a mixed-norm Lebesgue space $L_{\vec p}$ (in order to allow different properties in the space and time directions), then $F^{s,\vec a}_{\vec p,q}$-spaces are in general inevitable for a correct description of non-trivial data on the curved boundary.
This conclusion was obtained in works of P. Weidemaier [Wei98,Wei02,Wei05], who treated several special cases; one may also consult the introduction of [JS08] for details.
Contents. Section 2 contains a review of our notation, and the definition of anisotropic Lizorkin-Triebel spaces with mixed norms is recalled, together with some needed properties, a discussion of different lift operators and a pointwise multiplier assertion.
In Section 3 results from [JHS12] on characterisation of $F^{s,\vec a}_{\vec p,q}$-spaces by local means are recalled and used to prove an important lemma for compactly supported elements in $F^{s,\vec a}_{\vec p,q}$. Sufficient conditions for $f\mapsto f\circ\sigma$ to leave the spaces $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ invariant for all $s\in\mathbb{R}$ are deduced in Section 4, when $\sigma$ is a bounded diffeomorphism. Local versions for spaces on domains are derived in Section 5 together with isotropic results.
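For $0 < \vec p \le \infty$ the space $L_{\vec p}(\mathbb{R}^n)$ consists of all Lebesgue measurable functions $u$ such that
\[ \| u \,|\, L_{\vec p}(\mathbb{R}^n)\| := \Big( \int_{\mathbb{R}} \Big( \cdots \Big( \int_{\mathbb{R}} |u(x_1,\dots,x_n)|^{p_1}\,dx_1 \Big)^{p_2/p_1} \cdots \Big)^{p_n/p_{n-1}} dx_n \Big)^{1/p_n} < \infty, \]
with the modification of using the essential supremum over $x_j$ in case $p_j = \infty$. Equipped with this quasi-norm, $L_{\vec p}(\mathbb{R}^n)$ is a quasi-Banach space (normed if $p_j\ge 1$ for all $j$).
Furthermore, for $0 < q\le\infty$ we shall use the notation $L_{\vec p}(\ell_q)(\mathbb{R}^n)$ for the space of all sequences $\{u_k\}_{k=0}^\infty$ of Lebesgue measurable functions $u_k\colon\mathbb{R}^n\to\mathbb{C}$ such that
\[ \| \{u_k\}_{k=0}^\infty \,|\, L_{\vec p}(\ell_q)(\mathbb{R}^n)\| := \Big\| \Big( \sum_{k=0}^\infty |u_k|^q \Big)^{1/q} \,\Big|\, L_{\vec p}(\mathbb{R}^n)\Big\| < \infty, \]
with supremum over $k$ in case $q=\infty$. This quasi-norm is often abbreviated to $\| u_k \,|\, L_{\vec p}(\ell_q)\|$; and when $\vec p = (p,\dots,p)$ we simplify $L_{\vec p}$ to $L_p$. If $\max(p_1,\dots,p_n,q) < \infty$, sequences of $C_0^\infty$-functions are dense in $L_{\vec p}(\ell_q)$.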
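The iterated structure of the mixed norm can be made concrete on a grid. The following is a minimal numerical sketch for $n = 2$ with finite exponents, in which the integrals are replaced by Riemann sums; the function names, the grid and the test function are illustrative assumptions and do not come from the cited papers. For a separable function $u(x_1,x_2) = f(x_1)g(x_2)$ the mixed quasi-norm factorises as $\|f\,|\,L_{p_1}\|\cdot\|g\,|\,L_{p_2}\|$, which the sketch reproduces.

```python
import math

def mixed_norm_2d(values, h1, h2, p1, p2):
    """Riemann-sum approximation of the mixed L_(p1,p2) quasi-norm.

    values[j][i] holds u(x1_i, x2_j); h1, h2 are the grid spacings.
    The inner sum runs over x1 with exponent p1, the outer over x2 with
    exponent p2 (both assumed finite), mirroring the order of integration.
    """
    outer = 0.0
    for row in values:  # x2 is fixed along each row
        inner = sum(abs(v) ** p1 for v in row) * h1
        outer += inner ** (p2 / p1) * h2
    return outer ** (1.0 / p2)

def lp_norm_1d(samples, h, p):
    """Riemann-sum approximation of the usual L_p quasi-norm on the line."""
    return (sum(abs(v) ** p for v in samples) * h) ** (1.0 / p)

# Hypothetical test data: a separable function u(x1, x2) = f(x1) * g(x2).
h = 0.05
grid = [-5.0 + h * i for i in range(201)]
f = [math.exp(-x * x) for x in grid]
g = [1.0 / (1.0 + x * x) for x in grid]
u = [[fi * gj for fi in f] for gj in g]

p1, p2 = 0.5, 3.0
lhs = mixed_norm_2d(u, h, h, p1, p2)
rhs = lp_norm_1d(f, h, p1) * lp_norm_1d(g, h, p2)
print(lhs, rhs)  # the two numbers agree up to rounding
```

Interchanging the roles of $p_1$ and $p_2$ without interchanging the variables gives a different quantity in general, which is why the order of the exponents in $\vec p$ matters.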
Generic constants will primarily be denoted by $c$ or $C$ and when relevant, their dependence on certain parameters will be explicitly stated. $B(0,r)$ stands for the ball in $\mathbb{R}^n$ centered at $0$ with radius $r>0$, and $\overline{U}$ denotes the closure of a set $U\subset\mathbb{R}^n$.
2.2. Anisotropic Lizorkin-Triebel Spaces with Mixed Norms. The scales of mixed-norm Lizorkin-Triebel spaces refine the scales of mixed-norm Sobolev spaces, cf. [JS08, Prop. 2.10], hence the history of these spaces goes far back in time; the reader is referred to [JHS12, Rem. 2.3] and [JS07, Rem. 10] for a brief historical overview, which also lists some of the ways to define Lizorkin-Triebel spaces.
Our exposition uses the Fourier-analytic definition, but first we recall the definition of the anisotropic distance function $|\cdot|_{\vec a}$ on $\mathbb{R}^n$, where $\vec a = (a_1,\dots,a_n)\in[1,\infty[^n$, and some of its properties. Using the quasi-homogeneous dilation $t^{\vec a}x := (t^{a_1}x_1,\dots,t^{a_n}x_n)$ for $t\ge 0$, $|x|_{\vec a}$ is for $x\in\mathbb{R}^n\setminus\{0\}$ defined as the unique $t>0$ such that $t^{-\vec a}x\in S^{n-1}$ (and $|0|_{\vec a} := 0$), i.e.
\[ \frac{x_1^2}{t^{2a_1}} + \dots + \frac{x_n^2}{t^{2a_n}} = 1. \]
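Since $|x|_{\vec a}$ is only defined implicitly, it may help to see how it can be evaluated. The sketch below solves $\sum_j x_j^2 t^{-2a_j} = 1$ for $t$ by bisection, using that the left-hand side is strictly decreasing in $t>0$; the function names are illustrative choices, not notation from the paper. It also checks the quasi-homogeneity $|t^{\vec a}x|_{\vec a} = t\,|x|_{\vec a}$ and, for $\vec a=(2,1)$, the closed form $|x|_{\vec a}^2 = \tfrac12\big(x_2^2+\sqrt{x_2^4+4x_1^2}\big)$ obtained by solving the resulting quadratic equation in $t^2$.

```python
import math

def anisotropic_distance(x, a, iterations=200):
    """Return |x|_a, the unique t > 0 with sum_j (x_j / t**a_j)**2 == 1."""
    if all(c == 0 for c in x):
        return 0.0

    def phi(t):
        # Strictly decreasing in t > 0 for x != 0.
        return sum((c / t ** w) ** 2 for c, w in zip(x, a))

    lo = hi = 1.0
    while phi(lo) < 1.0:   # shrink until phi(lo) >= 1
        lo /= 2.0
    while phi(hi) > 1.0:   # grow until phi(hi) <= 1
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if phi(mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dilate(t, x, a):
    """Quasi-homogeneous dilation t^a x = (t**a_1 * x_1, ..., t**a_n * x_n)."""
    return tuple(t ** w * c for c, w in zip(x, a))

x, a = (3.0, 2.0), (2.0, 1.0)
closed_form = math.sqrt((x[1] ** 2 + math.sqrt(x[1] ** 4 + 4 * x[0] ** 2)) / 2)
print(anisotropic_distance(x, a), closed_form)
print(anisotropic_distance(dilate(1.7, x, a), a), 1.7 * anisotropic_distance(x, a))
```

In the isotropic case $\vec a=(1,\dots,1)$ the routine returns the Euclidean norm.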
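Definition 1. The Lizorkin-Triebel space $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ with $s\in\mathbb{R}$, $0<\vec p<\infty$ and $0<q\le\infty$ consists of all $u\in\mathcal{S}'(\mathbb{R}^n)$ such that
\[ \| u \,|\, F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\| := \Big\| \Big( \sum_{j=0}^\infty 2^{jsq}\, \big|\mathcal{F}^{-1}\big(\Phi_j(\xi)\hat u(\xi)\big)\big|^q \Big)^{1/q} \,\Big|\, L_{\vec p}(\mathbb{R}^n)\Big\| < \infty, \]
where $(\Phi_j)_{j\in\mathbb{N}_0}$ denotes the anisotropic Littlewood-Paley decomposition of unity, cf. (11).
The number $q$ is called the sum exponent and the entries in $\vec p$ are integral exponents, while $s$ is a smoothness index. Usually the statements are valid for the full ranges $0<\vec p<\infty$, $0<q\le\infty$, so we refrain from repeating these. Instead we focus on whether $s\in\mathbb{R}$ is allowed or not. In the isotropic case, i.e. $\vec a=(1,\dots,1)$, the parameter $\vec a$ is omitted.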
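We shall also consider the closely related Besov spaces, recalled using the abbreviation
\[ u_j := \mathcal{F}^{-1}\big(\Phi_j(\xi)\hat u(\xi)\big), \qquad j\in\mathbb{N}_0. \]
Definition 2. The Besov space $B^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ consists of all $u\in\mathcal{S}'(\mathbb{R}^n)$ such that
\[ \| u \,|\, B^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\| := \Big( \sum_{j=0}^\infty 2^{jsq}\, \| u_j \,|\, L_{\vec p}(\mathbb{R}^n)\|^q \Big)^{1/q} < \infty. \]
In [JS07,JS08] many results on these classes are elaborated, hence we just recall a few facts. They are quasi-Banach spaces (Banach spaces if $\min(p_1,\dots,p_n,q)\ge 1$) and the quasi-norm is subadditive when raised to the power $d := \min(1,p_1,\dots,p_n,q)$,
\[ \| u+v \,|\, F^{s,\vec a}_{\vec p,q}\|^d \le \| u \,|\, F^{s,\vec a}_{\vec p,q}\|^d + \| v \,|\, F^{s,\vec a}_{\vec p,q}\|^d. \]
Also the spaces do not depend on the chosen anisotropic decomposition of unity (up to equivalent quasi-norms) and there are continuous embeddings
\[ \mathcal{S}(\mathbb{R}^n) \hookrightarrow F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n) \hookrightarrow \mathcal{S}'(\mathbb{R}^n), \]
where $\mathcal{S}$ is dense in $F^{s,\vec a}_{\vec p,q}$ for $q<\infty$. For $\lambda>0$ the space $F^{s,\vec a}_{\vec p,q}$ coincides with $F^{\lambda s,\lambda\vec a}_{\vec p,q}$, cf. [JS08, Lem. 3.24], so most results obtained for the scales when $\vec a\ge 1$ can be extended to the case $0<\vec a<1$ (for details we refer to [JHS12, Rem. 2.6]).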
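The role of the exponent $d$ can be seen already in a finite-dimensional analogue. The sketch below uses a discrete mixed quasi-norm on small arrays (an illustrative stand-in chosen here, not the $F^{s,\vec a}_{\vec p,q}$ quasi-norm itself) with $p_1 = 1/2$: the plain triangle inequality fails, while the $d$-th power with $d=\min(1,p_1,p_2)$ is subadditive in every random trial.

```python
import random

def mixed_quasi_norm(values, p1, p2):
    """Discrete mixed l_(p1,p2) quasi-norm: inner sum over each row with
    exponent p1, outer sum over the rows with exponent p2."""
    total = 0.0
    for row in values:
        inner = sum(abs(v) ** p1 for v in row)
        total += inner ** (p2 / p1)
    return total ** (1.0 / p2)

def add(u, v):
    """Entrywise sum of two arrays given as lists of rows."""
    return [[s + t for s, t in zip(ru, rv)] for ru, rv in zip(u, v)]

p1, p2 = 0.5, 2.0
d = min(1.0, p1, p2)

# The triangle inequality itself fails for exponents below 1:
e1, e2 = [[1.0, 0.0]], [[0.0, 1.0]]
print(mixed_quasi_norm(add(e1, e2), p1, p2))  # 4.0, whereas the two norms add up to 2.0

# The d-th power is subadditive; checked here on hypothetical random data.
rng = random.Random(0)
worst = float("-inf")
for _ in range(200):
    u = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    v = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
    gap = (mixed_quasi_norm(add(u, v), p1, p2) ** d
           - mixed_quasi_norm(u, p1, p2) ** d
           - mixed_quasi_norm(v, p1, p2) ** d)
    worst = max(worst, gap)
print(worst <= 1e-9)
```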
The subspace $L_{1,\mathrm{loc}}(\mathbb{R}^n)\subset\mathcal{D}'(\mathbb{R}^n)$ of locally integrable functions is equipped with the Fréchet space topology defined from the seminorms $u\mapsto\int_{|x|\le j}|u(x)|\,dx$, $j\in\mathbb{N}$. By $C_b(\mathbb{R}^n)$ we denote the Banach space of bounded, continuous functions, endowed with the sup-norm.
Lemma 1. Let $s\in\mathbb{R}$ and $\alpha\in\mathbb{N}_0^n$ be arbitrary.
(iii) The embedding $F^{s,\vec a}_{\vec p,q}\hookrightarrow C_b(\mathbb{R}^n)$ holds true for $s>\frac{a_1}{p_1}+\dots+\frac{a_n}{p_n}$.
Proof. For part (i) the reader is referred to [JS08, Lem. 3.22], where a proof using standard techniques for $F^{s,\vec a}_{\vec p,q}$ is indicated (though the reference should have been to Proposition 3.13 instead of 3.14 there).
Part (ii) is obtained from the Nikol'skij inequality, cf. [JS07, Cor. 3.8], which allows a reduction to the case in which $p_j\ge 1$ for $j=1,\dots,n$, while $s>0$; then the claim follows from the embedding $F^{0,\vec a}_{\vec p,1}\hookrightarrow L_{1,\mathrm{loc}}$. Part (iii) follows at once from [JS08, (3.20)].
A local maximisation over a ball can be estimated in $L_{\vec p}$, at least for functions in certain subspaces of $C_b(\mathbb{R}^n)$; cf. Lemma 1(iii):
Lemma 2 ([JHS12]). When $C>0$ and $s>\sum_{l=1}^n\frac{a_l}{\min(p_1,\dots,p_l)}$, then
Next we extend a well-known embedding to the mixed-norm setting. Let $C^\rho_*(\mathbb{R}^n)$ denote the Hölder class of order $\rho>0$, which by definition consists of all $u\in C^k(\mathbb{R}^n)$ satisfying whereby $k$ is the integer satisfying $k<\rho\le k+1$.
u |B s, a ∞,∞ = sup The expressions in the Besov norm are for j ≥ 1 estimated using that F −1 Φ has vanishing moments of arbitrary order, Using a Taylor expansion of order k − 1 with k ∈ N chosen such that k < ρ ≤ k + 1 (or directly if k = 0), we get an estimate of the parenthesis by Now we obtain, since a ≥ 1, This bound can also be used for j = 0, if c ρ is large enough, so (17) holds for ρ ≥ s.
As a tool we also need to know the mapping properties of certain Fourier multipliers λ(D)u := F −1 (λ(ξ)û(ξ)). For generality's sake, we give Proposition 1. When λ ∈ C ∞ (R n ) for some r ∈ R has finite seminorms of the form Proof. The quasi-homogeneity of | · | a yields that |D α λ(ξ)| ≤ cC α (λ)(1 + |ξ| a ) r− a·α , hence every derivative is of polynomial growth, cf. (10), so λ(D) is a well-defined continuous map on S ′ . Boundedness follows as in the proof of [JS08, Prop. 3.15], mutatis mutandis. In fact, only the last step there needs an adaptation to the symbol λ(ξ), but this is trivial because finitely many of the constants C α (λ) can enter the estimates.

Lift Operators.
The invariance under coordinate transformations will be established below using a somewhat unconventional lift operator Λ r , r ∈ R, To apply Proposition 1, we derive an estimate uniformly in j ∈ N 0 and over the set 1 4 ≤ |ξ| a ≤ 4: while the mixed derivatives vanish, the explicit higher order chain rule in Appendix A yields k=n 1 +n 2 α l =n 1 +2n 2 (2(2 ja l ξ l )) n 1 2 n 2 < ∞.
(23) Indeed, the precise summation range gives α l = n 1 + 2(k − n 1 ), so the harmless power 2 n 1 +n 2 results. (Note that this means that |D α λ r (2 j a ξ)| ≤ C α 2 j(r− a·α) .) Now λ r (ξ) has no zeros, and for λ r (ξ) −1 it is analogous to obtain such estimates uniformly with respect to j of D α (2 jr λ r (2 j a ξ) −1 ), using Appendix A and the above. So Proposition 1 gives both that Λ r is a homeomorphism on S ′ (although Λ −1 r = Λ −r ) and the proof of In a similar way one also finds the next auxiliary result.
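A standard choice of an anisotropic lift operator is obtained by associating each $\xi\in\mathbb{R}^n$ with $(1,\xi)\in\mathbb{R}^{1+n}$, which is given the weights $(1,\vec a)$, and by setting
\[ \langle\xi\rangle_{\vec a} := |(1,\xi)|_{(1,\vec a)}. \]
This is in $C^\infty$, as $|\cdot|_{(1,\vec a)}$ is so outside the origin. (Note the analogy to $\langle\xi\rangle=\sqrt{1+|\xi|^2}$.) Moreover, $\partial^\alpha\langle\xi\rangle^t_{\vec a}$ is for each $t\in\mathbb{R}$ estimated by powers of $|\xi|$, cf. [Yam86, Lem. 1.4]. Therefore there is a linear homeomorphism $\Xi^t_{\vec a}\colon\mathcal{S}'\to\mathcal{S}'$ given by
\[ \Xi^t_{\vec a}u := \mathcal{F}^{-1}\big(\langle\xi\rangle^t_{\vec a}\,\hat u(\xi)\big). \]
In our mixed-norm set-up it is a small exercise to show that it restricts to a homeomorphism
\[ \Xi^t_{\vec a}\colon F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\to F^{s-t,\vec a}_{\vec p,q}(\mathbb{R}^n) \quad\text{for all } s\in\mathbb{R}. \]
Indeed, invoking Proposition 1, the task is as in (23) to show a uniform bound, and using the elementary properties of $\langle\xi\rangle_{\vec a}$ (cf. [Yam86, Lem. 1.4]) one finds an estimate for $t-\vec a\cdot\alpha\ge 0$; when $t-\vec a\cdot\alpha\le 0$, then $|\xi|^{t-\vec a\cdot\alpha}_{\vec a}$ is the outcome on the right-hand side. But the uniformity results in both cases, since the estimates pertain to $\frac14\le|\xi|_{\vec a}\le 4$.
We digress to recall that the classical fractional Sobolev space $H^{s,\vec a}_{\vec p}(\mathbb{R}^n)$, for $s\in\mathbb{R}$ and $1<\vec p<\infty$, consists of the $u\in\mathcal{S}'$ for which $\Xi^s_{\vec a}u\in L_{\vec p}(\mathbb{R}^n)$; with $\| u \,|\, H^{s,\vec a}_{\vec p}\| := \| \Xi^s_{\vec a}u \,|\, L_{\vec p}\|$. If $m_k := s/a_k\in\mathbb{N}_0$ for all $k$, then $H^{s,\vec a}_{\vec p}$ coincides (as shown by Lizorkin [Liz70]) with the corresponding mixed-norm Sobolev space of orders $m_k$ in the variables $x_k$. This characterisation is valid for $F^{s,\vec a}_{\vec p,2}$ with $1<\vec p<\infty$ in view of the identification
\[ H^{s,\vec a}_{\vec p}(\mathbb{R}^n) = F^{s,\vec a}_{\vec p,2}(\mathbb{R}^n), \]
which by use of $\Xi^s_{\vec a}$ reduces to the case $L_{\vec p}=F^{0,\vec a}_{\vec p,2}$. The latter is a Littlewood-Paley inequality that may be proved with general methods of harmonic analysis; cf. [JS08, Rem. 3.16].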
A general reference on mixed-norm Sobolev spaces is the classical book of Besov, Ilin and Nikolskii [BIN79,BIN96]. Schmeisser and Triebel [ScTr87] treated $F^{s,\vec a}_{\vec p,q}$ for $n=2$.
Remark 1. Traces on hyperplanes were considered for $H^{s,\vec a}_{\vec p}(\mathbb{R}^n)$ by Lizorkin [Liz70] and for $W^{m}_{\vec p}(\mathbb{R}^n)$ by Bugrov [Bug71], who raised the problem of traces at $\{x_j=0\}$ for $j<n$. This was solved by Berkolaiko, who treated traces in the $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$-scales for $1<\vec p<\infty$ in e.g. [Ber85]. The range $0<\vec p<\infty$ was covered on $\mathbb{R}^n$ for $j=1$ and $j=n$ in [JS08], and in our forthcoming paper [JHS] we carry over the trace results to $F^{s,\vec a}_{\vec p,q}$-spaces over a smooth cylindrical domain $\Omega\times\,]0,T[$.
Remark 2. We take the opportunity to correct a minor inaccuracy in [JS08], where a lift operator (also) called $\Lambda_r$ unfortunately was defined to have symbol $(1+|\xi|^2_{\vec a})^{r/2}$. However, it is not in $C^\infty(\mathbb{R}^n)$ for $\vec a\ne(1,\dots,1)$; this can be seen from the example for $n=2$ with $\vec a=(2,1)$ where [Yam86, Ex. 1.1] gives the explicit formula
\[ |\xi|^2_{\vec a} = \tfrac12\Big(\xi_2^2+\sqrt{\xi_2^4+4\xi_1^2}\Big). \]
Here an easy calculation shows that $D_{\xi_1}|\xi|^2_{\vec a}$ is discontinuous along the line $(\xi_1,0)$, which is inherited by the symbol e.g. for $r=2$. The resulting operator is therefore not defined on all of $\mathcal{S}'$. However, this is straightforward to avoid by replacing the lift operator in [JS08] by the better choice $\Xi^r_{\vec a}$ given in (26). This gives the space $H^{s,\vec a}_{\vec p}(\mathbb{R}^n)$ in (28).
2.4. Paramultiplication. This section contains a pointwise multiplier assertion for the $F^{s,\vec a}_{\vec p,q}$-scales. We consider the densely defined product on $\mathcal{S}'\times\mathcal{S}'$, introduced in [Joh95, Def. 3.1] and in an isotropic set-up in [RS96, Ch. 4], which is considered for those pairs $(u,v)$ in $\mathcal{S}'\times\mathcal{S}'$ for which the limit on the right-hand side exists in $\mathcal{D}'$ and is independent of $\psi$. Here $\psi\in C_0^\infty$ is the function used in the construction of the Littlewood-Paley decomposition (in principle the independence should be verified for all $\psi\in C_0^\infty$ equalling 1 near the origin; but this is not a problem here). To illustrate how this product extends the usual one, and to prepare for an application below, the following is recalled:
Lemma 6 ([Joh95]). When $f\in C^\infty(\mathbb{R}^n)$ has derivatives of any order of polynomial growth, and when $g\in\mathcal{S}'(\mathbb{R}^n)$ is arbitrary, then the limit in (30) exists and equals the usual product $f\cdot g$, as defined on $C^\infty\times\mathcal{D}'$.
Using this extended product, we introduce the usual space of multipliers of $F^{s,\vec a}_{\vec p,q}$, equipped with the induced operator quasi-norm. In view of Lemma 3, the next result is in particular valid for $u\in C^\infty_{L_\infty}$:
Lemma 7. Let $s\in\mathbb{R}$ and take $s_1>s$ such that also condition (33) holds.
Proof. The proof will be brief as it is based on standard arguments from paramultiplication, cf. [Joh95] and [RS96, Ch. 4] for details. In particular we shall use the decomposition
\[ u\cdot v = \Pi_1(u,v)+\Pi_2(u,v)+\Pi_3(u,v). \]
The exact form of this can also be recalled from the formulae below. In terms of the Littlewood-Paley partition $1=\sum_{j=0}^\infty\Phi_j(\xi)$ from Definition 1, we set $\Psi_j=\Phi_0+\dots+\Phi_j$ for $j\ge 1$ and $\Psi_0=\Phi_0$. These are used in Fourier multipliers, now written with upper indices as $u^j=\mathcal{F}^{-1}(\Psi_j\hat u)$.
Note first that s 1 > 0, whence B s 1 , a ∞,∞ ֒→ L ∞ , which is useful since the dyadic corona criterion for F s, a p,q , cf. [JS08, Lem. 3.20], implies the well-known simple estimate Furthermore, since using the dyadic ball criterion for F s, a p,q , cf. [JS08, Lem. 3.19], we find that To estimate Π 3 (u, v) we first consider the case s > 0 and pick t ∈ ]s, s 1 [ . The dyadic corona criterion together with the formula v j = v 0 + · · · + v j and a summation lemma, which exploits that t − s 1 < 0 (cf. [Yam86,Lem. 3.8]), give Since t − s 1 < 0 < s implies F s, a p,q ֒→ F t−s 1 , a p,q , and also F t, a p,q ֒→ F s, a p,q holds, the above yields For s ≤ 0 the procedure is analogous, except that (39) is derived for t ∈ ]0, s 1 + s[ , which is non-empty by assumption (33) on s; then standard embeddings again give (40).
In closing, we remark that as required the product $u\cdot v$ is independent of the test function $\psi$ appearing in the definition. Indeed for $q<\infty$ this follows from Lemma 6, which gives the coincidence between this product on $\mathcal{S}'\times\mathcal{S}$ and the usual one, hence by density of $\mathcal{S}$, cf. (14), and the above estimates, the map $v\mapsto u\cdot v$ extends uniquely by continuity to all $v\in F^{s,\vec a}_{\vec p,q}$. For $q=\infty$ the embedding $F^{s,\vec a}_{\vec p,\infty}\hookrightarrow F^{s-\varepsilon,\vec a}_{\vec p,1}$ for $\varepsilon>0$ yields the independence using the previous case.

Characterisation by Local Means
Characterisation of Lizorkin-Triebel spaces $F^s_{p,q}$ by local means is due to Triebel [Tri92, 2.4.6], and it was from the outset an important tool in proving invariance of the scale under diffeomorphisms. An extensive treatment of characterisations of mixed-norm spaces $F^{s,\vec a}_{\vec p,q}$ in terms of quasi-norms based on convolutions, in particular the case of local means, was given in [JHS12], which to a large extent is based on extensions to mixed norms of inequalities in [Ryc99a]. For the reader's convenience we recall the needed results.
Throughout this section we consider a fixed anisotropy $\vec a\ge 1$ with $a:=\min(a_1,\dots,a_n)$ and functions $\psi_0,\psi\in\mathcal{S}(\mathbb{R}^n)$ that fulfil Tauberian conditions in terms of some $\varepsilon>0$ and/or a moment condition of order $M_\psi\ge -1$ ($M_\psi=-1$ means that the condition is void), as stated in (41)-(42) and (43), respectively. Note by (10) that in case (41) is fulfilled for the Euclidean distance, it holds true also in the anisotropic case, perhaps with a different $\varepsilon$.
We henceforth change notation, from (12), to $\psi_j(x):=2^{j|\vec a|}\psi(2^{j\vec a}x)$ for $j\in\mathbb{N}$, which gives rise to the sequence $(\psi_j)_{j\in\mathbb{N}_0}$. The non-linear Peetre-Fefferman-Stein maximal operators induced by $(\psi_j)_{j\in\mathbb{N}_0}$ are defined for an arbitrary vector $\vec r=(r_1,\dots,r_n)>0$ and any $f\in\mathcal{S}'(\mathbb{R}^n)$ (dependence on $\vec a$ and $\vec r$ is omitted). Later we shall also refer to a trivial estimate for these.
Finally for an index set $\Theta$, we consider $\psi_{\theta,0},\psi_\theta\in\mathcal{S}(\mathbb{R}^n)$, $\theta\in\Theta$, where the $\psi_\theta$ satisfy (43) for some $M_{\psi_\theta}$ independent of $\theta\in\Theta$, and also $\varphi_0,\varphi\in\mathcal{S}(\mathbb{R}^n)$ that fulfil (41)-(42) in terms of an $\varepsilon'>0$. Setting $\psi_{\theta,j}(x)=2^{j|\vec a|}\psi_\theta(2^{j\vec a}x)$ for $j\in\mathbb{N}$, we can state the first result relating different quasi-norms.
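Definition 3. Let $U\subset\mathbb{R}^n$ be open. The space $F^{s,\vec a}_{\vec p,q}(U)$ is defined as the set of all $u\in\mathcal{D}'(U)$ such that there exists a distribution $f\in F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ satisfying
\[ f(\varphi)=u(\varphi) \quad\text{for all } \varphi\in C_0^\infty(U). \]
We equip $F^{s,\vec a}_{\vec p,q}(U)$ with the quotient quasi-norm
\[ \| u \,|\, F^{s,\vec a}_{\vec p,q}(U)\| = \inf_{r_Uf=u}\| f \,|\, F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\|; \]
it is normed if $\vec p,q\ge 1$.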
In (51) it is tacitly understood that on the left-hand side $\varphi$ is extended by 0 outside $U$. For this we henceforth use the operator notation $e_U\varphi$. Likewise $r_U$ denotes restriction to $U$, whereby $u=r_Uf$ in (51).
The Besov spaces $B^{s,\vec a}_{\vec p,q}(U)$ on $U$ can be defined analogously. The quotient norms have the well-known advantage that embeddings and completeness can be transferred directly from the spaces on $\mathbb{R}^n$. However, the spaces are probably of little interest, if $\partial U$ does not satisfy some regularity conditions, because we then expect (as in the isotropic case) that they do not coincide with those defined intrinsically.
holds for some $f\in F^{s,\vec a}_{\vec p,q}(U)$ with compact support, then
\[ \| f \,|\, F^{s,\vec a}_{\vec p,q}(U)\| = \| e_Uf \,|\, F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\|. \]
In other words, the infimum is attained at e U f for such f .
Proof. For any other extension $\tilde f\in\mathcal{S}'(\mathbb{R}^n)$ the difference $g=\tilde f-e_Uf$ is non-zero in $\mathcal{S}'(\mathbb{R}^n)$ and $\operatorname{supp}e_Uf\cap\operatorname{supp}g=\emptyset$. So by the properties of $r$, Since $g\ne 0$ there is some $j$ such that $\operatorname{supp}(k_j*g)\ne\emptyset$, hence $k_j*g(x)\ne 0$ on an open set disjoint from $\operatorname{supp}(k_j*e_Uf)$. This term therefore effectively contributes to the $L_{\vec p}$-norm in (50) and thus
\[ \| \tilde f \,|\, F^{s,\vec a}_{\vec p,q}\| = \| e_Uf+g \,|\, F^{s,\vec a}_{\vec p,q}\| > \| e_Uf \,|\, F^{s,\vec a}_{\vec p,q}\|, \]
which shows (53).

Invariance under Diffeomorphisms
The aim of this section is to show that $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ is invariant under suitable diffeomorphisms $\sigma\colon\mathbb{R}^n\to\mathbb{R}^n$ and from this deduce similar results in a variety of set-ups.

Bounded Diffeomorphisms.
A one-to-one mapping $y=\sigma(x)$ of $\mathbb{R}^n$ onto $\mathbb{R}^n$ is here called a diffeomorphism if the components $\sigma_j\colon\mathbb{R}^n\to\mathbb{R}$ have classical derivatives $D^\alpha\sigma_j$ for all $\alpha\in\mathbb{N}^n$. We set $\tau(y)=\sigma^{-1}(y)$.
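Recall that for a bounded diffeomorphism $\sigma$ and a temperate distribution $f$, the composition $f\circ\sigma$ denotes the temperate distribution given by
\[ \langle f\circ\sigma,\psi\rangle = \langle f,\ \psi\circ\tau\,|\det J\tau|\rangle \quad\text{for } \psi\in\mathcal{S}(\mathbb{R}^n). \]
It is continuous $\mathcal{S}'\to\mathcal{S}'$ as the adjoint of the continuous map $\psi\mapsto\psi\circ\tau\,|\det J\tau|$ on $\mathcal{S}$: since $|\det J\tau|$ is in $C^\infty_{L_\infty}$, continuity on $\mathcal{S}$ can be shown using the higher-order chain rule to estimate each seminorm $q_{N,\alpha}(\psi\circ\tau)$, cf. (5), by $\sum_{|\beta|\le|\alpha|}q_{N,\beta}(\psi)$ (changing variables, $\langle\sigma(\cdot)\rangle$ can be estimated using the Mean Value Theorem on each $\sigma_j$).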
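For a locally integrable $f$ the duality definition of $f\circ\sigma$ is just the change of variables $y=\sigma(x)$. The following one-dimensional sketch checks this numerically; the diffeomorphism $\sigma(x)=x+\tfrac12\sin x$, the functions $f$, $\psi$ and the quadrature parameters are illustrative assumptions made here. Since $\tau=\sigma^{-1}$ has no closed form, it is computed by bisection, and $|\det J\tau(y)|$ reduces to $1/\sigma'(\tau(y))$.

```python
import math

def sigma(x):
    """A bounded diffeomorphism of the real line: 0.5 <= sigma'(x) <= 1.5."""
    return x + 0.5 * math.sin(x)

def dsigma(x):
    return 1.0 + 0.5 * math.cos(x)

def tau(y):
    """Inverse of sigma by bisection; |sigma(x) - x| <= 0.5 gives the bracket."""
    lo, hi = y - 1.0, y + 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if sigma(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def midpoint(func, a, b, n):
    """Composite midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def f(y):
    return 1.0 / (1.0 + y * y)

def psi(x):
    return math.exp(-x * x)

# <f o sigma, psi> computed directly ...
lhs = midpoint(lambda x: f(sigma(x)) * psi(x), -10.0, 10.0, 4000)
# ... and as <f, (psi o tau) |det J tau|> with |det J tau| = 1 / sigma'(tau).
rhs = midpoint(lambda y: f(y) * psi(tau(y)) / dsigma(tau(y)), -10.0, 10.0, 4000)
print(lhs, rhs)
```

The truncation to $[-10,10]$ is harmless here because the Gaussian factor is negligible outside this interval on both sides of the identity.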
We need a few further conditions, due to the anisotropic situation: one can neither expect $f\circ\sigma$ to have the same regularity as $f$, e.g. if $\sigma$ is a rotation; nor that $f\circ\sigma\in L_{\vec p}$ when $f\in L_{\vec p}$. On these grounds we first restrict to the situation in which
\[ a_0:=a_1=a_2=\dots=a_{n-1},\qquad p_0:=p_1=\dots=p_{n-1} \tag{59} \]
and
\[ \sigma(x)=(\sigma'(x_1,\dots,x_{n-1}),x_n)\quad\text{for all } x\in\mathbb{R}^n. \tag{60} \]
To prepare for Theorem 4 below, which gives sufficient conditions for the invariance of $F^{s,\vec a}_{\vec p,q}$ under bounded diffeomorphisms of the type (60), we first show that it suffices to have invariance for sufficiently large $s$:
Proposition 2. Let $\sigma$ be a bounded diffeomorphism on $\mathbb{R}^n$ of the form in (60). When (59) holds and there exists $s_1\in\mathbb{R}$ with the property that $f\mapsto f\circ\sigma$ is a linear homeomorphism of $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ onto itself for every $s>s_1$, then this holds true for all $s\in\mathbb{R}$.
Proof. It suffices to prove for $s\le s_1$ that
\[ \| f\circ\sigma \,|\, F^{s,\vec a}_{\vec p,q}\| \le c\,\| f \,|\, F^{s,\vec a}_{\vec p,q}\| \]
with some constant independent of $f$, as the reverse inequality then follows from the fact that the inverse of $\sigma$ is also a bounded diffeomorphism with the structure in (60). First $r>s_1-s+2a_n$ is chosen such that $d_0:=\frac{r}{2a_0}$ is a natural number. Setting $d_n=\frac{r}{2a_n}$ and taking $\mu\in[0,1[$ such that $d_n-\mu\in\mathbb{N}$, we have that $r_\mu:=r-2\mu a_n>s_1-s$. Now Lemma 4 yields the existence of $h\in F^{s+r,\vec a}_{\vec p,q}$ with $f=\Lambda_rh$.
Setting $g_1=((1-\partial^2_{x_n})^\mu h)\circ\sigma$ and $g_0=h\circ\sigma$, we may apply the higher-order chain rule to e.g. $h=g_0\circ\tau$ (using denseness of $\mathcal{S}$ in $\mathcal{S}'$ and the $\mathcal{S}'$-continuity of composition in (58), Appendix A extends to $\mathcal{S}'$). Taking into account that $\tau(x)=(\tau'(x'),x_n)$, and letting prime indicate summation over multi-indices with $\beta_n=0$, one obtains the identity (63), where $\eta_{n,l}:=(-1)^l\binom{d_n-\mu}{l}$ and the $\eta_{k,\beta}$ are functions containing derivatives at least of order 1 of $\tau$, and these can be estimated in terms of the derivatives $\partial^m_{x_k}\tau$ with $1\le m\le 2d_0$. Composing with $\sigma$ and applying Lemma 1(i) gives an estimate for $d:=\min(1,q,p_0,p_n)$, when $\|\cdot\|$ denotes the $F^{s,\vec a}_{\vec p,q}$-norm. According to the remark preceding Lemma 7, the last sum in this estimate is finite because $\eta_{k,\beta}\in C^\infty_{L_\infty}$. Finally, since $s+r_\mu>s_1$ and $s+r>s_1$, the stated assumption means that $h\mapsto g_1$ and $h\mapsto g_0$ are bounded, which in view of $r_\mu+2\mu a_n=r$ and Lemmas 4-5 yields the desired inequality, proving the boundedness of $f\mapsto f\circ\sigma$ in $F^{s,\vec a}_{\vec p,q}$ for all $s\in\mathbb{R}$.
In addition to the reduction in Proposition 2, we adopt in Theorem 4 below the strategy for the isotropic, unmixed case developed by Triebel [Tri92, 4.3.2], who used Taylor expansions for the inner and outer functions for large $s$.
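The choice of $r$, $d_0$, $d_n$ and $\mu$ at the start of the proof is elementary arithmetic, which the following sketch makes explicit. It takes the smallest admissible natural number $d_0$ and uses exact rational arithmetic; the function name and the restriction to rational input are assumptions made for this illustration only (the argument in the text does not need rational weights).

```python
from fractions import Fraction
import math

def choose_lift_parameters(s, s1, a0, an):
    """Pick r, d0, dn, mu, r_mu as in the proof, assuming s <= s1.

    Requirements: r > s1 - s + 2*an, d0 = r/(2*a0) a natural number,
    mu in [0, 1[ with dn - mu a natural number, where dn = r/(2*an).
    """
    s, s1, a0, an = (Fraction(v) for v in (s, s1, a0, an))
    lower = s1 - s + 2 * an
    d0 = math.floor(lower / (2 * a0)) + 1   # smallest integer with 2*a0*d0 > lower
    r = 2 * a0 * d0
    dn = r / (2 * an)
    mu = dn - math.floor(dn)                # fractional part of dn
    r_mu = r - 2 * mu * an
    return {"r": r, "d0": d0, "dn": dn, "mu": mu, "r_mu": r_mu}

# Hypothetical data: s = -1, s1 = 3, a0 = 1, an = 3/2.
params = choose_lift_parameters(-1, 3, 1, Fraction(3, 2))
print(params)
```

Since $r>2a_n$ forces $d_n>1$, the integer part of $d_n$ is at least 1, and $r_\mu=r-2\mu a_n>r-2a_n>s_1-s$ because $\mu<1$.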
While his explanation was rather sketchy, our task is to account for the fact that the strategy extends to anisotropies and to mixed norms. Hence we give full details. This will also allow us to give brief proofs of additional results in Sections 4.2 and 5 below.
To control the Taylor expansions, it will be crucial for us to exploit both the local means recalled in Theorem 3 and the parameter-dependent set-up in Theorem 1. This is prepared for with the following discussion.
The functions $k_0$ and $k$ in Theorem 3 are for the proof of Theorem 4 chosen (as we may) so that $N$ in the definition of $k$ fulfils $s<2Na$ and so that both are even functions satisfying (66). The set $\Theta$ in Theorem 1 is chosen to be the set of $(n-1)\times(n-1)$ matrices $B=(b_{i,k})$ that, in terms of the constants $c_\sigma$, $C_{\alpha,\sigma}$ in (57) and (55), respectively, satisfy (67)-(68). Splitting $z=(z',z_n)$, we set $g(z)=z'^{\gamma'}k(z)$ for some $\gamma'\in\mathbb{N}_0^{n-1}$ (chosen later) and define $\psi_\theta(y):=g(Ay',y_n)$, where $\theta$ is identified with $A^{-1}:=J\sigma'(x')$, which obviously belongs to $\Theta$ (for each $x'$).
Step 3, we obtain a θ-independent estimate of |γ ′ |, hence of M ψ θ . Moreover, the constant A in Theorem 1 is finite: Basic properties of the Fourier transform give the following estimate, where the constant is independent of A −1 ∈ Θ: To estimate B we exploit that F : B n/2 2,1 (R n ) → L 1 (R n ) is bounded according to Szasz's inequality (cf. [ScTr87, Prop. 1.7.5]) and obtain when m ∈ N is chosen so large that m > M + 1 + n/2. In fact, the last inequality is obtained using the embeddings C m 0 ֒→ H m ֒→ B M +1+n/2 2,1 and the estimate This relies on the higher-order chain rule, cf. Appendix A, and the support of k: it suffices to use the supremum over |α| ≤ m and {y ∈ R n | |Ay ′ | 2 + y 2 n ≤ 1}, and for a point in this set |y ′ | ≤ A −1 |Ay ′ | ≤ c(C σ ), so we need only estimate on an A-independent cylinder.
Replacing k by k 0 in the definition of g and setting ψ θ,0 (y) := g(Ay ′ , y n ), the finiteness of C and D follows analogously. The Tauberian properties follow from k 0 = 0 = k 0 .
Hence all assumptions in Theorem 1 are satisfied, and we are thus ready to prove our main result.
Theorem 4. If $\sigma$ is a bounded diffeomorphism on $\mathbb{R}^n$ of the form in (60), then $f\mapsto f\circ\sigma$ is a linear homeomorphism $F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)\to F^{s,\vec a}_{\vec p,q}(\mathbb{R}^n)$ for all $s\in\mathbb{R}$ when (59) holds.
Proof. According to Proposition 2, it suffices to consider $s>s_1$, say for the $s_1$ given in (74), whereby $K_0$ is the smallest integer satisfying (75). We now let $s\in\,]s_1,\infty[$ be given and take some $K\ge K_0$, i.e. $K$ solving (75), such that (76) holds. (The interval thus defined is non-empty by (75), and the left end point is at least $s_1$.) Note that (76) yields that every $f\in F^{s,\vec a}_{\vec p,q}$ is continuous, cf. Lemma 1(iii); so are even the derivatives $D^\beta f$ for $\beta=(\beta_1,\dots,\beta_{n-1},0)$, $|\beta|\le K$, since $s-\beta\cdot\vec a=s-|\beta|a_0>\vec a\cdot 1/\vec p$.
Step 2. Concerning the remainder terms in (80) we exploit (81) to get The exponent in 2 −2jKa 0 is a result of (59) and the chosen Taylor expansion of σ(x+2 −j a z), and since s − 2Ka 0 < 0 the norm of ℓ q is trivial to calculate, whence Now we use that p 1 = . . . = p n−1 to change variables in the resulting integral over R n−1 , with τ ′ denoting (σ ′ ) −1 . Since Lemma 2 in view of (76) applies to ∂ x d f , d = 1, . . . , n − 1, the right-hand side of the last inequality can be estimated, using also Lemma 1(i), by c sup Step 3. To treat the first term in (80), we Taylor expand f (·, x n ), which is in C K (R n−1 ). Setting P (z ′ ) = P 2K−1 (z ′ ) − P 1 (z ′ ), expansion at the vector P 1 (2 −ja ′ z ′ ) gives where y ′ is a vector analogous to that in (80) and satisfies (81), perhaps with another C.
Step 4. Before we estimate (89), it is first observed that all previous steps apply in a similar way to the convolution $k_0*(f\circ\sigma)$, except that in this case there is no dilation, so the $\ell_q$-norm is omitted and the function $\psi_\theta$ is replaced by $\psi_{\theta,0}$.
So, when collecting the terms of the form (89) with finitely many $\beta$, $\gamma$ in both cases (omitting remainders from Steps 2-3), we obtain with two changes of variables and (46) the estimate (90). Here we apply Theorem 1 to the family of functions $\psi_{\theta,0}$, $\psi_\theta$ with the $\varphi_j$ chosen as the Fourier transforms of the system in the Littlewood-Paley decomposition, cf. (11). Estimating $|\gamma|$, the $\psi_\theta$ satisfy the moment condition (43) with $M_{\psi_\theta}:=2N-1-(K-1)(2K-1)$, which fulfils $s<(M_{\psi_\theta}+1)a$, because of the choice of $N$ in Step 1. So, by applying Theorem 2 and Lemma 1(i), using $s-2|\beta|a_0\le s-\beta\cdot\vec a$, the above is estimated by $c\,\| f \,|\, F^{s,\vec a}_{\vec p,q}\|$. This proves the necessary estimate for the given $s>s_1$.

4.2. Groups of bounded diffeomorphisms. It is not difficult to see that the proofs in Section 4.1 did not really use that $x_n$ is a single variable. It could just as well have been replaced by a whole group of variables $x''$, corresponding to a splitting $x=(x',x'')$, provided $\sigma$ acts as the identity on $x''$. Moreover, $x'$ could equally well have been 'embedded' into $x''$, that is $x''$ could contain variables $x_k$ both with $k<j_0$ and with $k>j_1$ when $x'=(x_{j_0},\dots,x_{j_1})$ (but no interlacing); in particular the changes of variables yielding (84) would carry over to this situation when $p_{j_0}=\dots=p_{j_1}$. It is also not difficult to see that Proposition 2 extends to this situation when $a_{j_0}=\dots=a_{j_1}$ (perhaps with several $g_1$-terms, each having a value of $\mu$).

Derived results
5.1. Diffeomorphisms on Domains. The strategies of Proposition 2 and Theorem 4 also give the following local version. E.g., for the paraboloid $U=\{\,x\mid x_n>x_1^2+\dots+x_{n-1}^2\,\}$ we may take $\sigma$ to consist in a rotation around the $x_n$-axis; cf. (60).
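The paraboloid example can be checked directly for $n=3$: a rotation in the $(x_1,x_2)$-plane leaves $x_3$ and $x_1^2+x_2^2$ unchanged, so it maps $U$ onto itself and has the block structure $\sigma(x)=(\sigma'(x'),x_n)$. The helper names and the sample points below are illustrative assumptions.

```python
import math

def in_paraboloid(x):
    """Membership in U = { x : x_n > x_1^2 + ... + x_{n-1}^2 }."""
    return x[-1] > sum(c * c for c in x[:-1])

def rotate_about_last_axis(x, angle):
    """Rotation of R^3 around the x_3-axis: acts on x' = (x_1, x_2) only."""
    x1, x2, x3 = x
    c, s = math.cos(angle), math.sin(angle)
    return (c * x1 - s * x2, s * x1 + c * x2, x3)

inside = (0.5, -0.3, 1.0)    # 0.25 + 0.09 < 1
outside = (1.0, 1.0, 1.0)    # 1 + 1 > 1
angle = 0.7

image_inside = rotate_about_last_axis(inside, angle)
image_outside = rotate_about_last_axis(outside, angle)
print(in_paraboloid(image_inside), in_paraboloid(image_outside))
```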
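Theorem 6. Let $U,V\subset\mathbb{R}^n$ be open and $\sigma\colon U\to V$ a $C^\infty$-bijection as in (60). If (59) is fulfilled and $f\in F^{s,\vec a}_{\vec p,q}(V)$ has compact support, then $f\circ\sigma\in F^{s,\vec a}_{\vec p,q}(U)$ and
\[ \| f\circ\sigma \,|\, F^{s,\vec a}_{\vec p,q}(U)\| \le c\,\| f \,|\, F^{s,\vec a}_{\vec p,q}(V)\| \]
holds for a constant $c$ depending only on $\sigma$ and the set $\operatorname{supp}f$.
Proof.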
Step 1. Let us consider $s>s_1$, cf. (74), and adapt the proof of Theorem 4 to the local set-up. We shall prove the statement for the $f\in F^{s,\vec a}_{\vec p,q}(V)$ satisfying $\operatorname{supp}f\subset K\subset V$ for some arbitrary compact set $K$. First we fix $r\in\,]0,1[$ so small that
\[ 6r<\min\big(\operatorname{dist}(K,\mathbb{R}^n\setminus V),\ \operatorname{dist}(\sigma^{-1}(K),\mathbb{R}^n\setminus U)\big). \tag{97} \]
Then, by Lemma 8, we have $\| f\circ\sigma \,|\, F^{s,\vec a}_{\vec p,q}(U)\| = \| e_U(f\circ\sigma) \,|\, F^{s,\vec a}_{\vec p,q}\|$ when Theorem 3 is utilised for $k_0,k\in\mathcal{S}$, say so that $\operatorname{supp}k_0,\operatorname{supp}k\subset B(0,r)$; cf. also (66). Extension by 0 outside $U$ of $f\circ\sigma$ is redundant, for it suffices to integrate over $x\in W:=\operatorname{supp}(f\circ\sigma)+B(0,r)$. However, to apply the Mean Value Theorem, cf. (80), we extend $f$ by 0 instead, i.e. we consider (77) with integration over $|z|\le r$ and with $f$ replaced by $e_Vf$.
Since $e_Vf$ inherits the regularity of $f$ (cf. Lemma 8) and $\partial^\alpha\sigma$ can be estimated on the compact set $W$, the proof of Theorem 4 carries over straightforwardly. E.g. one obtains a variant of (84) where $|\det J\tau'(x')|^{1/p_0}$ is estimated over $\{\,x'\mid\exists x_n\colon(x',x_n)\in\sigma(W)\,\}$, and the integration is then extended to $\mathbb{R}^n$, which by Lemma 8 yields the required bound of the remainder terms.
To estimate the first term in (80) in this local version, the argumentation there is modified as above and the set $\Theta$ is chosen to be the set of all $(n-1)\times(n-1)$ matrices satisfying (67) with infimum over $x\in W$ and (68) with $C_\sigma:=\max_{1\le j\le n,\ |\alpha|=1}\sup_{x\in W}|D^\alpha\sigma_j(x)|$.
Before applying Theorem 1 to the new estimate (90), the integration is extended to R n (using e V f ). Then application of Theorem 1 and Theorem 2 together with Lemma 8 finishes the proof for s > s 1 .
Step 2. For s ≤ s 1 we use Lemma 4 to write e V f = Λ r h for some h ∈ F s+r, a p,q (R n ); hence the identity (62) holds in D ′ (R n ) for e V f and h. Applying r V to both sides and using that it commutes with differentiation on C ∞ 0 , hence on D ′ , we obtain (63) as an identity in D ′ (V ) for the new g 0 := (r V h) • σ and g 1 : Composing with σ yields an identity in D ′ (U), when η k,β • σ is treated using cut-off functions. E.g. we can take χ, χ 1 ∈ C ∞ 0 (U) with χ ≡ 1 on supp(f • σ) + B(0, r) =: W r and supp χ ⊂ W 2r , while χ 1 ≡ 1 on W 3r and supp χ 1 ⊂ W 4r . This entails Using e U on both sides (and omitting R n in the spaces), Lemma 8 and Lemma 7 imply As e U and differentiation commute on E ′ (U) ∋ χ 1 g j , Lemma 1(i) leads to an estimate from above. But Lemma 8 applies since the supports are in W 4r , so with χ 1 := χ 1 • τ we find that the above is less than or equal to c e U (χ 1 g 1 ) |F This shows the local theorem for s ≤ s 1 .
This concise proof has seemingly not been worked out before, so it should be interesting in its own right. E.g. the Taylor expansions make the presence of the $\beta^j$ obvious, and the condition $\gamma=\sum_{j,\beta^j}n_{\beta^j}\beta^j$ is natural. Also the constants $\gamma!/\prod n_{\beta^j}!$ and $(\beta^j)!^{-n_{\beta^j}}$ lead to easy applications. Clearly $\partial^\alpha g(f(x_0))$ is multiplied by a polynomial in the derivatives of $f_1,\dots,f_m$, which has degree $\sum_{j=1}^m\sum_{\beta^j}n_{\beta^j}=\sum_j\alpha_j=|\alpha|$. The formula (107) itself is well known for $n=1=m$ as the Faà di Bruno formula; cf. [Jsn02] for its history. For higher dimensions, the formulas seem to have been less explicit.
The other contributions we know have been rather less straightforward, because of reductions, say to f, g being polynomials (or to finite Taylor series), and/or by use of lengthy combinatorial arguments with recursively given polynomials, which replace the sum over the β j in (107); such as the Bell polynomials that are used in e.g. [Rod93,Thm. 4.2.4].
Closest to the present approach, we have found the contributions [Spd05] and [Frae78] in case of one and several variables, respectively.
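For orientation, the classical one-variable case $n=1=m$ can be evaluated mechanically. The sketch below implements the standard Faà di Bruno formula $(g\circ f)^{(n)} = \sum \frac{n!}{\prod_j m_j!\,(j!)^{m_j}}\, g^{(m_1+\dots+m_n)}(f)\prod_j (f^{(j)})^{m_j}$, summed over all $(m_1,\dots,m_n)$ with $\sum_j j\,m_j=n$; this is the textbook statement, not a reproduction of formula (107), and the function names are illustrative. It is checked on $g(y)=y^3$, $f(x)=x^2$ at $x_0=1$, where $g\circ f=x^6$.

```python
from math import factorial

def multiplicity_tuples(n):
    """Yield all (m_1, ..., m_n) of non-negative integers with sum_j j*m_j == n."""
    def rec(j, remaining):
        if j > n:
            if remaining == 0:
                yield ()
            return
        for m in range(remaining // j + 1):
            for rest in rec(j + 1, remaining - j * m):
                yield (m,) + rest
    yield from rec(1, n)

def faa_di_bruno(n, g_derivs, f_derivs):
    """n-th derivative of g o f at x0 (one variable).

    g_derivs[k] = k-th derivative of g at f(x0), for k = 0..n;
    f_derivs[j] = j-th derivative of f at x0, for j = 0..n.
    """
    total = 0
    for m in multiplicity_tuples(n):
        order = sum(m)                      # order of the derivative of g
        coeff = factorial(n)
        product = 1
        for j, mj in enumerate(m, start=1):
            coeff //= factorial(mj) * factorial(j) ** mj   # exact division
            product *= f_derivs[j] ** mj
        total += coeff * g_derivs[order] * product
    return total

# g(y) = y^3 at y = f(1) = 1: derivatives 1, 3, 6, 6.
# f(x) = x^2 at x = 1: derivatives 1, 2, 2, 0.
g_derivs = [1, 3, 6, 6]
f_derivs = [1, 2, 2, 0]
print(faa_di_bruno(3, g_derivs, f_derivs))  # 120, the third derivative of x^6 at 1
```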