BiLipschitz decomposition of Lipschitz maps between Carnot groups

Let $f : G \to H$ be a Lipschitz map between two Carnot groups. We show that if $B$ is a ball of $G$, then there exists a subset $Z \subset B$, whose image in $H$ under $f$ has small Hausdorff content, such that $B \backslash Z$ can be decomposed into a controlled number of pieces, the restriction of $f$ to each of which is quantitatively biLipschitz. This extends a result of \cite{meyerson}, which proved the same result, but with the restriction that $G$ has an appropriate discretization. We provide an example of a Carnot group not admitting such a discretization.


Introduction
Let $h^n$ denote the Hausdorff $n$-content (the definition will be reviewed in the next section). We prove the following theorem.
The first results of this type were published independently by P. Jones [7] and G. David [4] (in the same issue of the same journal) and were motivated by problems in the field of singular integrals. Jones proved Theorem 1.1 for Lipschitz maps $f : [0,1]^n \to \mathbb{R}^m$. Jones's result was later generalized by Schul in [13], where he showed the same result except that now $f : [0,1]^n \to (X, d)$ takes values in a general metric space and can be a Lipschitz map up to a controlled additive error. The result was also generalized by G.C. David (not the author of [4]) in [5], where he proved the result for maps between certain topological manifolds of equal (topological and Hausdorff) dimensions.
Carnot groups are a class of metric groups that are natural generalizations of Euclidean spaces. They have many familiar geometric qualities, including properness, geodesicity, a dilation structure, and transitivity of isometries (in fact, these four attributes characterize sub-Finsler Carnot groups [8]), although they may not be abelian. Carnot groups also admit a class of nested dyadic cubes (to be described in the next section) that allows one to import many arguments from harmonic analysis. Thus, it is natural to ask whether certain analytic or geometric statements can be generalized from the Euclidean world to the Carnot world. This has been an active area of research.
BiLipschitz decomposition for Lipschitz maps between Carnot groups was first studied in [10]. There, the author proved Theorem 1.1 when $H$ is a Carnot group and $G$ is another Carnot group that is appropriately discretizable (see Definition 2.11 of the same paper). There are examples of Carnot groups that cannot be discretized, and we give an example in the next section. Thus, our theorem extends the result of [10] to arbitrary pairs of Carnot groups. It should be noted that the result of [5] gives that self-maps of any Carnot group satisfy Theorem 1.1, as Carnot groups satisfy the topological manifold conditions studied there. However, the result of [5] does not apply to maps between Carnot groups of different dimensions.
Note that we require that the map $f$ be defined on a neighborhood of $B(x, R)$, whereas in the Euclidean case the function needs to be defined only on $[0,1]^n$. We could have it defined on a smaller neighborhood, but that would affect the $M(\delta)$ bound. The lack of a known Lipschitz extension theorem for Carnot-valued maps impedes any reduction to a map defined only on $B(x, R)$. Indeed, it may be possible that there are Carnot groups $G$ and $H$ for which the behavior of $f$ on $B(x, R)$ is more limited the larger the domain of $f$ is. See [1,12,14] for partial results and nonresults on Lipschitz extensions in the Carnot setting.
BiLipschitz decomposition theorems all follow a similar strategy. The first step is to decompose the domain into a set of dyadic-like cubes. One then proves a lemma stating that if the image of a cube under $f$ has large Hausdorff content and a certain wavelet coefficient-like quantity of $f$ on the cube is small, then $f$ acts biLipschitzly on points of the cube that are far apart. This is the step that usually relies most heavily on the geometry of the setting. Lemma 4.3 gives this statement for us. One then uses the fact that a weighted sum of the wavelet coefficient-like quantities is bounded to show that the quantity cannot be big for many cubes. After throwing out the cubes that have small image, one can then use a coding scheme to decompose most of the rest of the domain into a controlled number of pieces on which $f$ is biLipschitz. For us, the wavelet coefficient-like quantity will be the deviation of $f$ from an affine function on a cube, a quantity that was studied in [9].
Acknowledgements. I am grateful to Jonas Azzam, Enrico Le Donne, and Pierre Pansu for enlightening discussions. The research presented here is supported by an NSF postdoctoral research fellowship.

Preliminaries and notations
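Given a metric space $(X, d)$, a subset $E \subseteq X$, and two numbers $N \ge 0$ and $\delta \in (0, \infty]$, we define
$$\mathcal{H}^N_\delta(E) := \inf\left\{ \sum_i (\operatorname{diam} E_i)^N : E \subseteq \bigcup_i E_i,\ \operatorname{diam} E_i \le \delta \right\}.$$
Note that if $s \le t$, then $\mathcal{H}^N_s(E) \ge \mathcal{H}^N_t(E)$. We then let $\mathcal{H}^N(E) := \lim_{\delta \to 0} \mathcal{H}^N_\delta(E)$ be the Hausdorff $N$-measure and $h^N(E) := \mathcal{H}^N_\infty(E)$ be the Hausdorff $N$-content. It follows that $h^N(E) \le \mathcal{H}^N(E)$ always.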
We will use the convention that if $\lambda > 1$ and $E \subset X$, then
A Carnot group $G$ is a simply connected Lie group whose Lie algebra $\mathfrak{g}$ is stratified, that is, it can be decomposed into a direct sum of subspaces
$$\mathfrak{g} = V_1 \oplus \cdots \oplus V_r, \qquad [V_1, V_k] = V_{k+1}.$$
Here, it is understood that $V_k = 0$ for all $k > r$. The layer $V_1$ is called the horizontal layer, and if $V_r \neq 0$ then $r$ is the (nilpotency) step of $G$. If we are dealing with multiple graded Lie algebras $\mathfrak{g}$, we will write $V_i(\mathfrak{g})$ to distinguish the layers of the different Lie algebras. For simplicity, we will suppose that all the constants in all Lie bracket structures are 1. All proofs will go through in the general case, and the results will differ only by some factor depending on these constants.
As exponential maps are diffeomorphisms between a Carnot group and its Lie algebra, we can use them to canonically identify elements of the Lie group $G$ with elements of the Lie algebra $\mathfrak{g}$. This shows that a Carnot group is topologically a Euclidean space. Furthermore, we can push forward the coordinate system of $\mathfrak{g}$ to $G$, and so each element $g \in G$ can be written as $g = (g_1, \dots, g_r)$ where each $g_i$ is a vector of $\dim V_i$ coordinates. These are known as exponential coordinates. We will write the identity element as 0. Let $|\cdot|$ denote the standard Euclidean norm induced by the coordinates of $\mathfrak{g}$. Then we can make sense of $|g_i|$ and $|g_i - h_i|$, and so forth.
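A homogeneous norm on a Carnot group $G$ is a function $N : G \to [0, \infty)$ such that $N(g) = 0$ if and only if $g = 0$, $N(g^{-1}) = N(g)$, and $N(\delta_\lambda(g)) = \lambda N(g)$ for all $g \in G$ and $\lambda > 0$, where $\delta_\lambda$ denotes the dilation of $G$ by $\lambda$. Homogeneous norms induce left-invariant homogeneous (semi)metrics by the formula $d(x, y) = N(x^{-1} y)$ and vice versa. Here, $d$ may not satisfy the triangle inequality, but there does exist some $C \ge 1$ so that
$$d(x, z) \le C \left( d(x, y) + d(y, z) \right) \qquad \text{for all } x, y, z \in G.$$
Any two metrics on $G$ induced by two homogeneous norms are biLipschitz equivalent. We will define a special group norm as
$$N_\infty(g) := \max_{1 \le k \le r} \lambda_k |g_k|^{1/k}.$$
It was shown in [2] that for each Carnot group there exists some set of positive scalars $\{\lambda_k\}_{k=1}^r$ so that $d_\infty$, the associated metric, satisfies the actual triangle inequality. We will suppose for simplicity that $\lambda_k = 1$ for all $k$. This will change everything we do by only a constant.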
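To make the objects of this paragraph concrete, the following is a minimal sketch in the simplest nonabelian case, the first Heisenberg group. It assumes exponential coordinates $(x, y, z)$ with the single bracket relation $[X, Y] = Z$ (structure constant 1), the max-type norm $\max(|(x, y)|, |z|^{1/2})$ with all scalars $\lambda_k = 1$, and hypothetical helper names (`mul`, `inv`, `dilate`, `norm_inf`, `dist`) that do not appear in the paper. It only checks homogeneity, symmetry, and left-invariance numerically. It does not check the triangle inequality, which in general requires the scalars $\lambda_k$ of [2].

```python
import math

def mul(g, h):
    """Group law in exponential coordinates (BCH formula for step 2)."""
    x, y, z = g
    x2, y2, z2 = h
    return (x + x2, y + y2, z + z2 + 0.5 * (x * y2 - y * x2))

def inv(g):
    """Inverse element: negate every exponential coordinate."""
    x, y, z = g
    return (-x, -y, -z)

def dilate(lam, g):
    """Dilation: layer k is scaled by lam**k."""
    x, y, z = g
    return (lam * x, lam * y, lam**2 * z)

def norm_inf(g):
    """Assumed max-type norm with lambda_1 = lambda_2 = 1."""
    x, y, z = g
    return max(math.hypot(x, y), math.sqrt(abs(z)))

def dist(g, h):
    """Left-invariant (semi)metric d(g, h) = N(g^{-1} h)."""
    return norm_inf(mul(inv(g), h))

# Arbitrary sample points, chosen only for illustration.
g = (1.0, 2.0, 3.0)
h = (-0.5, 4.0, 1.0)
k = (2.0, -1.0, 0.7)

homogeneous = math.isclose(norm_inf(dilate(3.0, g)), 3.0 * norm_inf(g))
left_invariant = math.isclose(dist(mul(k, g), mul(k, h)), dist(g, h))
symmetric = math.isclose(dist(g, h), dist(h, g))
```

Left-invariance here is just associativity, since $(kg)^{-1}(kh) = g^{-1}h$, and homogeneity holds because each layer is scaled by the matching power of the dilation factor.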
Carnot groups also admit a path metric that we describe now. We begin by constructing a left-invariant subbundle $\mathcal{H}$ of the tangent bundle, which is just $V_1$ pushed to every point by left translation. We can similarly endow $\mathcal{H}$ with a left-invariant field of inner products. We then define the Carnot-Carathéodory metric between $x$ and $y$ in $G$ to be
$$d_{cc}(x, y) := \inf\left\{ \int_a^b |\gamma'(t)|_{\gamma(t)} \, dt : \gamma(a) = x,\ \gamma(b) = y,\ \gamma'(t) \in \mathcal{H}_{\gamma(t)} \right\}.$$
Here, $|\cdot|_{\gamma(t)}$ is the norm coming from the left-invariant inner product. For Carnot groups, such a path between any two points always exists [11], and so the metric is finite. It is clearly left-invariant and scales with dilation by construction. Thus, the Carnot-Carathéodory metric is biLipschitz equivalent to any metric induced by a homogeneous norm.
Let $L : G \to H$ be a Lie group homomorphism between Carnot groups. As $G$ and $H$ are simply connected, we can then lift it to a linear transform of the Lie algebras $T_L : \mathfrak{g} \to \mathfrak{h}$ by the formula $T_L = \exp^{-1} \circ L \circ \exp$. We will always assume that $T_L(V_1(\mathfrak{g})) \subseteq V_1(\mathfrak{h})$ and so will not explicitly say this from now on. This is necessary for $L$ to be Lipschitz. It follows then that $T_L(V_j(\mathfrak{g})) \subseteq V_j(\mathfrak{h})$ for all $j$. In the exponential coordinates, we then have that
The images of these lines are called horizontal lines.
Finally, as we have identified $G$ with $\mathbb{R}^n$, we can speak of the Lebesgue measure $\mathcal{L}^n$. From looking at the Jacobians of the BCH formula and the dilation automorphism, we get that $\mathcal{L}^n$ is left-invariant and satisfies the identity
$$\mathcal{L}^n(\delta_\lambda(E)) = \lambda^N \mathcal{L}^n(E), \qquad \text{where } N = \sum_{k=1}^r k \dim V_k$$
is the homogeneous dimension of $G$. As $\mathcal{H}^N$ is also a left-invariant $N$-homogeneous measure, by uniqueness of the Haar measure, we have that $\mathcal{H}^N$ and $\mathcal{L}^n$ are multiples of each other. The Hausdorff dimension of a Carnot group is exactly its homogeneous dimension.
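As a quick worked example of the gap between topological and homogeneous dimension, the following sketch assumes the standard formula $N = \sum_k k \dim V_k$ for the homogeneous dimension. The function names are hypothetical and serve only as illustration. The inputs are lists of layer dimensions $(\dim V_1, \dots, \dim V_r)$.

```python
def homogeneous_dimension(layer_dims):
    """Sum of k * dim(V_k) over the layers k = 1, ..., r."""
    return sum(k * m for k, m in enumerate(layer_dims, start=1))

def topological_dimension(layer_dims):
    """Sum of dim(V_k): the dimension of the underlying Euclidean space."""
    return sum(layer_dims)

# Euclidean R^3: a single layer, so both dimensions agree.
euclidean = [3]
# First Heisenberg group: dim V_1 = 2, dim V_2 = 1.
heisenberg = [2, 1]
# A step 6 algebra with a two-dimensional horizontal layer and
# one-dimensional higher layers, the shape used in Section 2.1.
step_six = [2, 1, 1, 1, 1, 1]

dims = {
    name: (topological_dimension(d), homogeneous_dimension(d))
    for name, d in [("euclidean", euclidean),
                    ("heisenberg", heisenberg),
                    ("step_six", step_six)]
}
```

Under this formula the Heisenberg group has topological dimension 3 and homogeneous dimension 4, while the step 6 example has topological dimension 7 and homogeneous dimension 22.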
As $|B(x, r)| = c r^N$ for some $c > 0$ depending only on $G$, we have by basic packing arguments that $G$ is metrically doubling, that is, there exists some $M > 0$ depending only on $G$ so that for each $x \in G$ and $r > 0$, there exist points $x_1, \dots, x_M \in G$ such that $B(x, r) \subseteq \bigcup_{i=1}^M B(x_i, r/2)$. The following theorem of Christ says that such a space contains a collection of partitions that behave like dyadic cubes.

2.1. A nondiscretizable Carnot group
We recall Definition 2.11 of [10] of discretizability of a Carnot group. Let $G$ be a Carnot group whose Lie algebra $\mathfrak{g}$ admits the stratification $\mathfrak{g} = V_1 \oplus \cdots \oplus V_r$ and let $m_j = \dim V_j$. We say $G$ is discretizable if for each $j \in \{1, \dots, r\}$ there exists a collection of vectors
In other words, a group $G$ is discretizable if there exists a basis of horizontal elements that generate a discrete subgroup spanning all of $G$. Recall that the biLipschitz decomposition result of [10] required that the domain Carnot group be discretizable. We now prove that not all Carnot groups are discretizable.
There exists a Carnot group that is not discretizable.
Proof. We let $G$ be the Carnot group with the stratified Lie algebra $\mathfrak{g}$ of step 6 that we now describe. The horizontal layer $V_1$ is two-dimensional and spanned by two vectors $X$ and $Y$. The other layers are one-dimensional, and we let $Z_i$ be a vector spanning $V_i$ for $i \in \{2, \dots, 6\}$. We define the relations $[X, Y] = Z_2$, Here, $t_3, \dots, t_6$ are real numbers that we choose later. Suppose $G$ is discretizable, and let $G'$ be the discrete subgroup as in the definition. Then $G'$ is generated by $g = \exp(aX + bY)$ and $h = \exp(cX + dY)$ for $a, b, c, d \in \mathbb{R}$, and we may suppose that $ad - bc = 1$. By assumption, there exists some $s_i \neq 0$ such that $u_i = \exp(s_i Z_i)$ for $i \in \{2, \dots, 5\}$ are elements of $G'$. We then have by the BCH formula that
In order for $G'$ to be discrete, we must have that $\frac{a t_i + b}{c t_i + d} \in \mathbb{Q}$ for all $i \in \{3, \dots, 6\}$. Note that if the map $\varphi : x \mapsto \frac{ax + b}{cx + d}$ takes three distinct rationals to rationals and $ad - bc = 1$, then $(a, b, c, d)$ is a real scalar multiple of a quadruple of rational numbers, and so $\varphi$ takes all rationals to rationals (and possibly infinity) and irrationals to irrationals. Thus, if we specify $t_3, \dots, t_6$ to be three distinct rational numbers and an irrational one, then $G'$ cannot be discrete for any such choice of $a, b, c, d$, and so $G$ is not discretizable.
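The rationality step at the end of the proof can be illustrated with exact arithmetic. The sketch below is not part of the argument. It uses the standard cross-ratio construction to build the fractional linear map through three prescribed rational pairs and shows that its coefficient matrix can be taken rational. The function names are hypothetical, and the matrix is not normalized to determinant 1, since that normalization only rescales all four entries by a common real factor and does not change the map.

```python
from fractions import Fraction

def to_zero_one_inf(p):
    """Matrix of the map sending p[0], p[1], p[2] to 0, 1, infinity."""
    x1, x2, x3 = p
    return ((x2 - x3, -x1 * (x2 - x3)),
            (x2 - x1, -x3 * (x2 - x1)))

def adjugate(m):
    """Inverse of a 2x2 matrix up to a scalar, which is enough projectively."""
    (p, q), (r, s) = m
    return ((s, -q), (-r, p))

def matmul(m, n):
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def mobius_through(xs, ys):
    """Coefficients ((a, b), (c, d)) of the map taking xs[i] to ys[i]."""
    return matmul(adjugate(to_zero_one_inf(ys)), to_zero_one_inf(xs))

def apply(m, x):
    (a, b), (c, d) = m
    return (a * x + b) / (c * x + d)

# Three rational source points and three rational target points.
xs = (Fraction(0), Fraction(1), Fraction(2))
ys = (Fraction(1), Fraction(2), Fraction(4))
phi = mobius_through(xs, ys)

# Every entry is an exact rational, so phi maps rationals to rationals
# wherever the denominator does not vanish.
value = apply(phi, Fraction(5, 3))
```

Since the inverse map also has rational coefficients, an irrational input cannot have a rational image, which is the dichotomy used to choose $t_3, \dots, t_6$.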

Distortion and nets
From here on, we let $G$ and $H$ be two Carnot groups with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$. We will also assume that $n = \dim(\mathfrak{g}) \le \dim(\mathfrak{h}) = m$ and that $r$ and $s$ are the steps of $G$ and $H$, respectively. Let $L : G \to H$ be a homomorphism. As mentioned before, one can lift this homomorphism via the exponential map to a linear transform $T_L : \mathfrak{g} \to \mathfrak{h}$.
Our first lemma says that a homomorphism that collapses points does so on the layers.
Lemma 3.1. Let $G$ and $H$ be as above and $L : G \to H$ be a homomorphism such that there exists $g \in G$ so that $N_\infty(L(g)) < \varepsilon N_\infty(g)$ for some $\varepsilon > 0$. Then there exist $j \in \{1, \dots, r\}$ and some $v \in V_j(\mathfrak{g})$ for which $|v| = 1$ and $N_\infty(L(e^v)) < \varepsilon$.
As g was arbitrary, this contradicts our assumption.
Our next lemma says that right translation does not distort coordinates too much.
Proof. This follows from the BCH formula. Fix some $i \in \{1, \dots, s\}$. Then
As $|h_i| \le \varepsilon^i \le \varepsilon$, it suffices to show that $|P_i| \le C\varepsilon$ for some $C > 0$. As $P_i$ is a polynomial of nested Lie brackets where the number of brackets and the coefficients are bounded by some number depending only on $i$, it further suffices to bound each nested Lie bracket by a constant multiple of $\varepsilon$. Let $[x_1, [x_2, \dots, [x_{\ell-1}, x_\ell] \dots]]$ be one such term. By the BCH formula, one of $x_{\ell-1}$ and $x_\ell$ is a coordinate of $h$. Thus, as $|g_j| \le 1$ and $|h_j| \le \varepsilon^j \le \varepsilon$ for all $j \ge 1$, we have that
This finishes the proof.
We can now prove the main result of this section, which says that if a homomorphism collapses points, then we can cover the homomorphic image of a ball by only a few small balls.
Lemma 3.3. There exists some $C_3 > 0$ depending only on $G$ and $H$ so that if $\varepsilon > 0$ and $L : (G, d_\infty) \to (H, d_\infty)$ is a 1-Lipschitz homomorphism such that there exists $g \in G$ so that $N_\infty(L(g)) < \varepsilon N_\infty(g)$, then for every $x \in G$ and $\ell \ge 0$, there exist points
Here, the balls in both $G$ and $H$ are with respect to the $d_\infty$ metric and $N$ is the Hausdorff/homogeneous dimension of $G$.
Proof. By homogeneity and left-invariance of the metric, we may suppose that $\ell = 1$ and $x = 0$. Note that $L(G)$ is a Lie subgroup of $H$ with Lie algebra $T_L(\mathfrak{g}) \subseteq \mathfrak{h}$. As we are identifying $H$ with its Lie algebra $\mathfrak{h}$, which we can also view as $\mathbb{R}^m$, $L(G)$ can also be identified with a linear subspace $\mathbb{R}^n \subseteq \mathbb{R}^m$. Note then that $L(B(0, 1))$ is a symmetric convex subset of $\mathbb{R}^n$. From Lemma 3.1, we have that the inradius of $L(B(0, 1))$ is less than $\varepsilon$. As $L$ is 1-Lipschitz, we also have that $L(B(0, 1))$ is contained in $[-1, 1]^n$.
One can see from the Jacobian of the dilation that $\mathcal{L}^n(B(0, s) \cap L(G)) = s^N \mathcal{L}^n(B(0, 1) \cap L(G))$. In addition, as $L(G)$ is a Lie subgroup of $H$, we have by the BCH formula that left translation by an element of $L(G)$ preserves the volume form of $\mathbb{R}^n$. Thus, $\mathcal{L}^n$ is a left-invariant measure on $L(G)$ that is $N$-homogeneous with respect to dilations.
Remark 3.4. Note that the result of Lemma 3.3 still holds if we pass from the metrics $d_\infty$ of $G$ and $H$ to biLipschitz equivalent (semi)metrics. The constant $C_3$ will change only by a factor controlled by the biLipschitz equivalence.

Weak biLipschitzness
In this section, we first let $f : [a, b] \to X$ be a 1-Lipschitz map where $(X, d)$ is an arbitrary metric space. We now recall some terminology from [9]. Given some $p \ge 1$ and $x, y \in \mathbb{R}$, we let
We will not actually need that $d$ satisfies the triangle inequality in this section, just that
which follows when $d$ is a metric by Jensen's inequality and the triangle inequality. We now define the quantity
Now let $G$ be a Carnot group of Hausdorff dimension $N$, $k = \dim(V_1(\mathfrak{g}))$, and $f : G \to H$ be a Lipschitz function. We will equip $G$ with the $d_\infty$ metric but will define the metric of $H$ later. We can extend the definition of $\alpha$ to the Christ cubes of $G$. Let $L \ge 1$ and $Q \in \Delta$. Then we can define
Here, $G \ominus v$ is the exponential image of the subspace of $\mathfrak{g}$ orthogonal to $v$. Integration in $x$ is with respect to the $(N-1)$-dimensional Hausdorff measure $\mathcal{H}^{N-1}$, and integration in $v$ is with respect to the probability measure on the unit sphere of $V_1(\mathfrak{g})$.
One may be worried that there is no guarantee that $x \exp(\mathbb{R}v) \cap B(z_Q, 3L\ell(Q))$ is connected. We simply specify that it be the unique connected subset $I$ that contains the subset $x \exp(\mathbb{R}v) \cap B(z_Q, 3L\ell(Q))$. See Lemma 3.3 of [9] and the discussion following it.
The following proposition shows that the α(Q, L) are Carleson summable in a cube S with constant depending on L.
Proposition 4.1. Let $G$ be a Carnot group. For each $L \ge 1$, there exists some $C_1 = C_1(L) > 0$ depending on $L$ so that if $f : G \to X$ is 1-Lipschitz and $S \in \Delta$, then we have
Proof. The proof is essentially that of Proposition 3.5 of [9] with $\varepsilon = 0$ and $m = \infty$. Note that the second parameters of the $\alpha^{(p)}_f$ quantities defined here and in [9] are not the same. One makes the straightforward changes to account for the scaling $L$, which only changes the constants in the proof by an amount controllable by $L$. Notably, the constant $C_2$ of that proof will change by an amount depending on $L$ and $G$.
The following lemma will give the needed metric for $H$.
Lemma 4.2. There exist $p \ge 2$, $C, \alpha, \beta, \gamma > 0$, and a homogeneous norm $N$ on $H$ so that if we define $\alpha^{(p)}_f$ with respect to the semimetric $d_H$ it induces, then the following property holds. Let $\varepsilon \in (0, 1/2)$ and $f : G \to H$ be Lipschitz. If , then there exist a homomorphism $L : G \to H$ and $g \in H$ so that for all $x \in B(z_Q, 100 \operatorname{diam}(Q))$,
Proof. The norm is the one given by Proposition 7.2 of [9]. The result then follows from using Lemma 6.12 together with equation (55) of the same paper. The statement of Lemma 6.12 derives, from a bound on $\alpha^{(p)}_f(Q, 1)$, a statement like (6) on a small subball of $Q$ centered around $x_Q$. It follows easily then that a similar bound on a dilate of $Q$, as in our hypothesis, gives our needed result on $B(z_Q, 100 \operatorname{diam}(Q))$.
From now on, we endow $H$ with the metric $d_H$ induced by the norm of Lemma 4.2. Note that we never stated that $d_H$ satisfies the triangle inequality. Thus, let $C_Q \ge 1$ be such that
$$d_H(x, z) \le C_Q \left( d_H(x, y) + d_H(y, z) \right) \qquad \text{for all } x, y, z \in H.$$
The properties of cubes given by Theorem 2.1 easily imply that there exists some $b \in (0, 1/10)$ depending only on $G$ so that if $x, y \in G$ and $Q$ is the smallest cube containing $x$ such that $y \in 2Q$, then $d(x, y) \ge 10 b \operatorname{diam}(Q)$.
We fix this $b$ for the rest of the paper. The following lemma is the main result of this section. It says that if a cube $Q$ has an image under $f$ with large Hausdorff content but small $\alpha^{(p)}_f$, then $f$ must keep far-apart points of $Q$ far apart. A function that satisfies the conclusion of this lemma is sometimes called weakly biLipschitz.
Lemma 4.3. There exists some $c_1 > 0$ depending only on $G$ and $H$ so that for each $\delta > 0$, we have that
Proof. By our assumption on $\alpha^{(p)}_f$ and Lemma 4.2, there exist a homomorphism $L : G \to H$ and $g \in H$ so that for all $x \in B(z_Q, 100 \operatorname{diam}(Q))$ we have
Note that $x, x' \in 2Q \subseteq B(z_Q, 100 \operatorname{diam}(Q))$. If for all $z \in G$, we have $d_H(L(z), 0) \ge 4 b^{-1} C_Q^2 \delta \, d(z, 0)$, then we are done because
In the last inequality, we used the fact that $x, x' \in 2Q$. Thus, we may suppose that there exists some $z \in G$ so that $d_H(L(z), 0) < 4 b^{-1} C_Q^2 \delta \, d(z, 0)$. By Lemma 3.3 and Remark 3.4, there exist some $C_4 > 0$ depending only on $G$ and $H$ and points
Note then that
Indeed, as $Q \subseteq B(z_Q, \tau \ell(Q))$, we have for any $x \in Q$ that there exists some $i$ so that $d_H(L(x),$
It follows from (10) that there exists some $C_5 > 0$ depending only on $G$ and $H$ so that
Here we used the fact that $\ell(Q)^N$ is comparable to $|Q|$. Choosing $c_1$ small enough, we get a contradiction.

Proof of main theorem
The proof is relatively standard and follows the arguments of [4,6,7]. As proving the theorem for one metric on H immediately implies the same result for all metrics on H with just modified constants, we are free to assign any metric to H. We will equip H with the d H metric from Lemma 4.2. By scale invariance, we may suppose that R = 1.
We can specify a $j$ small enough depending only on $\delta$ and $G$ so that if $Q \in \Delta_j$ is a cube such that $Q \cap B(x, 1) \neq \emptyset$, then all the horizontal line segments $x \exp(\mathbb{R}v) \cap B(z_Q, 3L\ell(Q))$ needed in the calculation of $\alpha^{(p)}_f$ are contained in $B(x, 2)$. The number and collective volume of such cubes are bounded by constants depending only on $\delta$ and $G$. Thus, if we show the result for each of these cubes, we can take the union of all the biLipschitz pieces (of which there is a controlled number) to get our needed biLipschitz decomposition of $f$ on $B(x, 1)$. We now let $S$ be one of these cubes, and we will prove the statement of the theorem for $S$ in place of $B(x, R)$.
Define the following families of cubes
Given a cube $Q$, let $\widetilde{Q}$ be the union of cubes of $\Delta_{j(Q)}$ that intersect $2Q$. Given $L > 0$, we define
We have that
Thus, there exists some $L > 1$ depending only on $G$, $H$, and $\delta$ so that $|R_2| < \delta |S|$. Let $R_1 = \bigcup_{Q \in B_1} Q$. Using the definition of $B_1$, we have that
Likewise, as $f$ is 1-Lipschitz and $|R_2| < \delta |S|$, we have that $h^N(f(R_2)) < \delta |S|$. Thus, we have that $h^N(f(R_1 \cup R_2)) < (1 + c_1) \delta |S|$.
It remains to decompose $S \backslash (R_1 \cup R_2)$ into $M$ biLipschitz pieces. We use the usual encoding scheme, which we now sketch.
Let $l \ge 1$ be large enough so that if $Q \in \Delta_k$ and $S' \in \Delta_{k+l}$, then $\operatorname{diam} Q < b \operatorname{diam}(S')$. Then for each $k$ and $Q \in \Delta_k \cap \Delta(S)$, we let $F(Q)$ denote the set of cubes $Q' \in \Delta_k \cap \Delta(S)$ such that $Q' \neq Q$ and $Q$ and $Q'$ are both contained in $\widetilde{S'}$ for some $S' \in \Delta_{k+l} \cap B_2$. As $G$ is doubling, we get that there exists some $T \ge 1$ so that $\#F(Q) \le T$ for all $Q \in \Delta(S)$.
Let $A$ be a set of $T + 1$ distinct elements. We will associate to each $Q \in \Delta(S)$ a (possibly empty) ordered string of characters from $A$ (a word) that we will denote $a(Q)$. For any $Q \in \Delta$, we let $Q^*$ be the unique parent of $Q$. The words that we assign will satisfy the following properties: $a(S) = \emptyset$; $a(Q) = a(Q^*)$ if $F(Q) = \emptyset$; and if $F(Q) \neq \emptyset$, then $a(Q)$ will be the word $a(Q^*)$ appended with an additional element from $A$ at the end so that if $Q'$ is another cube of $F(Q)$ then
• $a(Q) \neq a(Q')$ when $a(Q)$ and $a(Q')$ are of equal length,
• when $a(Q)$ is shorter than $a(Q')$, then $a(Q')$ does not begin with $a(Q)$,
• when $a(Q')$ is shorter than $a(Q)$, then $a(Q)$ does not begin with $a(Q')$.
Such an association can be done recursively. We omit the details, but the reader can consult [4,7] for a full account.
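The reason an alphabet of $T + 1$ letters suffices can be seen in a simplified setting. The sketch below treats a single generation of cubes whose parent words all have the same length, so that only the first bullet is at stake, and it ignores the prefix conditions that the full recursive scheme of [4,7] must also maintain. The function name, the cube labels, and the toy conflict sets are hypothetical. The only hypothesis used is that each conflict set $F(Q)$ has at most $T$ elements.

```python
def assign_letters(conflicts, alphabet):
    """Greedily pick one new letter per cube so that conflicting cubes differ.

    conflicts maps each cube to the set F(Q) of cubes it must be
    distinguished from; every such set is assumed to have fewer
    elements than the alphabet.
    """
    letter = {}
    for cube in conflicts:
        # Letters already taken by conflicting cubes processed earlier.
        used = {letter[other] for other in conflicts[cube] if other in letter}
        # At most T letters are used, so one of the T + 1 letters is free.
        letter[cube] = next(a for a in alphabet if a not in used)
    return letter

# Hypothetical toy data: five cubes, each conflicting with at most T = 2 others.
T = 2
alphabet = list(range(T + 1))
conflicts = {
    "Q1": {"Q2", "Q3"},
    "Q2": {"Q1", "Q3"},
    "Q3": {"Q1", "Q2"},
    "Q4": {"Q5"},
    "Q5": {"Q4"},
}
letters = assign_letters(conflicts, alphabet)
```

This is the usual greedy coloring argument: a cube with at most $T$ competitors always finds a free letter among $T + 1$, so the choice never fails regardless of the processing order.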
Note that if $x \notin R_2$, then the number of cubes $Q$ containing $x$ for which $F(Q) \neq \emptyset$ is bounded by $L$. Thus, there must be some $Q \in \Delta(S)$ containing $x$ for which if $Q' \subset Q$ is any other cube and $Q'$ also contains $x$, then $a(Q') = a(Q)$. That is, the code stabilizes. We can then associate to $x$ the word $a(x) = a(Q)$ for this $Q$. It follows that $a(x)$ is a word of at most $L$ letters. Thus, we have partitioned $S \backslash (R_1 \cup R_2)$ into at most $(T + 1)^L$ measurable sets $\{F_\omega\}_{\omega \in A^{L+1}}$ based on each point's word assignment. It remains to prove that if $x, y \in F_\omega$, then $d_H(f(x), f(y)) \ge \delta \, d(x, y)$.
Let $x, y \in F_\omega$ be two distinct points and let $Q$ be the smallest cube such that $x \in Q$ and $y \in 2Q$. If $Q \notin B_2$, then Lemma 4.3 gives us our needed result. Thus, we may suppose that $Q \in B_2$. We let $Q_0, Q_1 \in \Delta_{j(Q)+l}$ be such that $x \in Q_0$ and $y \in Q_1$. As $d(x, y) \ge 10 b \operatorname{diam}(Q) > \operatorname{diam}(Q_0)$ by definition of $b$ and $l$, we get that $Q_0 \neq Q_1$. Thus, $Q_1 \in F(Q_0)$, and so by the rules of the assignment of words to cubes, we have that $a(x) \neq a(y)$. This contradicts our assumption that $x, y \in F_\omega$.