The L^p-to-L^q boundedness of commutators with applications to the Jacobian operator

Supplying the missing necessary conditions, we complete the characterisation of the $L^p\to L^q$ boundedness of commutators $[b,T]$ of pointwise multiplication and Calder\'on-Zygmund operators, for arbitrary pairs of $1<p,q<\infty$ and under minimal non-degeneracy hypotheses on $T$. For $p\leq q$ (and especially $p=q$), this extends a long line of results under more restrictive assumptions on $T$. In particular, we answer a recent question of Lerner, Ombrosi, and Rivera-R\'ios by showing that $b\in BMO$ is necessary for the $L^p$-boundedness of $[b,T]$ for any non-zero homogeneous singular integral $T$. We also deal with iterated commutators and weighted spaces. For $p>q$, our results are new even for special classical operators with smooth kernels. As an application, we show that every $f\in L^p(R^d)$ can be represented as a convergent series of normalised Jacobians $Ju=\det\nabla u$ of $u\in \dot W^{1,dp}(R^d)^d$. This extends, from $p=1$ to $p>1$, a result of Coifman, Lions, Meyer and Semmes about $J:\dot W^{1,d}(R^d)^d\to H^1(R^d)$, and supports a conjecture of Iwaniec about the solvability of the equation $Ju=f\in L^p(R^d)$.


Introduction
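The first goal of this paper is to complete the following picture of the $L^p(\mathbb{R}^d)$-to-$L^q(\mathbb{R}^d)$ boundedness properties of commutators of pointwise multiplication and singular integral operators:

Theorem 1.0.1. Let $1 < p, q < \infty$, let $T$ be a "non-degenerate" Calderón-Zygmund operator on $\mathbb{R}^d$, and let $b \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$. Then the commutator
\[ [b,T] : f \mapsto bTf - T(bf) \]
defines a bounded operator $[b,T] : L^p(\mathbb{R}^d) \to L^q(\mathbb{R}^d)$ if and only if:
• $p = q$ and $b$ has bounded mean oscillation, or
• $p < q \le p^*$ and $b$ is $\alpha$-Hölder continuous for $\alpha = \big(\frac1p - \frac1q\big)d$, or
• $q > p^*$ and $b$ is constant, or
• $p > q$ and $b = a + c$, where $a \in L^r(\mathbb{R}^d)$ for $\frac1r = \frac1q - \frac1p$, and $c$ is constant.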
To be explicit, the Sobolev exponent $p^*$ above is defined as $pd/(d-p)$ if $p < d$, and $\infty$ otherwise; thus $p < q \le p^*$ is precisely the condition that the Hölder exponent satisfies $\alpha \in (0, 1]$. We say that a Calderón-Zygmund operator $Tf(x) = \int K(x,y) f(y)\,dy$, with usual (or weaker) assumptions on the kernel $K$ recalled in Section 2.1, is "non-degenerate" provided that, for some $c_0 > 0$,
\[ \text{for every } y \in \mathbb{R}^d \text{ and } r > 0, \text{ there is } x \in B(y,r)^c \text{ with } |K(x,y)| \ge \frac{1}{c_0 r^d}; \tag{1.1} \]
i.e., uniformly over all positions and length-scales, the kernel takes some values that are as big as they are allowed to be by the standard upper bound for $K(x,y)$. When $K(x,y) = \Omega(x-y)/|x-y|^d$ is a (possibly rough) homogeneous kernel, this requirement simply says that $\Omega$ is not identically zero.
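As a quick check of the exponent bookkeeping (a worked computation added for illustration, using only the definitions of $p^*$ and $\alpha$ given above), note that for $p < d$,
\[ \alpha = d\Big(\frac1p - \frac1q\Big) \le 1 \iff \frac1q \ge \frac1p - \frac1d = \frac{d-p}{pd} \iff q \le \frac{pd}{d-p} = p^*, \]
while for $p \ge d$ one has $\alpha < d/p \le 1$ for every finite $q > p$, in agreement with $p^* = \infty$. For instance, with $d = 3$ and $p = 2$ we get $p^* = 6$; the choice $q = 3$ gives $\alpha = 3(\tfrac12 - \tfrac13) = \tfrac12$, and the endpoint $q = 6$ gives $\alpha = 3(\tfrac12 - \tfrac16) = 1$.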
1.1. Sufficient conditions for boundedness. We note that all the "if" parts of Theorem 1.0.1 are either well known or easy. The cases when $b$ is constant are completely trivial, since in this case the commutator vanishes. If $b \in L^r(\mathbb{R}^d)$ with $\frac1r = \frac1q - \frac1p$, the boundedness is also immediate simply from the boundedness of $T$ on both $L^p(\mathbb{R}^d)$ and $L^q(\mathbb{R}^d)$ (taking this as part of the definition of a "Calderón-Zygmund operator"), together with Hölder's inequality:
\[ \|[b,T]f\|_q \le \|b\,Tf\|_q + \|T(bf)\|_q \lesssim \|b\|_r \|Tf\|_p + \|bf\|_q \lesssim \|b\|_r \|f\|_p. \]
In particular, no mutual cancellation between the two terms of the commutator is involved in this estimate. This computation is also valid when $p = q$ and $r = \infty$, showing the trivial sufficiency of $b \in L^\infty(\mathbb{R}^d)$ for the boundedness of $[b,T]$ on $L^p(\mathbb{R}^d)$. The fact that the larger space $BMO(\mathbb{R}^d)$ is still admissible for this boundedness is a celebrated theorem of Coifman, Rochberg and Weiss [7] and the only truly nontrivial result among the "if" statements of Theorem 1.0.1.
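If $b$ is $\alpha$-Hölder continuous, using only the standard pointwise bound for Calderón-Zygmund kernels, we see that
\[ |[b,T]f(x)| = \Big|\int (b(x)-b(y))K(x,y)f(y)\,dy\Big| \lesssim \|b\|_{\dot C^{0,\alpha}} \int \frac{|f(y)|}{|x-y|^{d-\alpha}}\,dy \]
is pointwise dominated by the usual fractional integral operator, whose $L^p(\mathbb{R}^d)$-to-$L^q(\mathbb{R}^d)$ bounds are classical and well known.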
1.2. Necessary conditions for boundedness. Let us then discuss the "only if" parts of Theorem 1.0.1. For p = q, already Coifman, Rochberg and Weiss [7] proved the necessity of b ∈ BMO(R d ) for the L p (R d )-boundedness of [b, T ] for all d Riesz transforms R j , j = 1, . . . , d. (This reduces to just the Hilbert transform when d = 1.) Their argument made explicit use of the special algebraic form of the relevant kernels.
Janson [20] and Uchiyama [35], independently, extended the necessity part of the Coifman-Rochberg-Weiss theorem to more general classes of homogeneous Calderón-Zygmund kernels with "sufficient" smoothness. In particular, their results contain the fact that the boundedness of [b, R j ] for just one (instead of all) j = 1, . . . , d already implies that b ∈ BMO(R d ). Janson's argument may be viewed as an analytic extension of that of Coifman et al., in that he used the smoothness to guarantee absolute convergence of the Fourier expansion of the inverse 1/K of the kernel, where the individual frequency components could then be treated by the algebraic method. Janson also proves the "only if" part of Theorem 1.0.1 for p < q (and in fact for more general Orlicz norms) for the same class of smooth homogeneous kernels. Uchiyama's argument is different, but still dependent on both smoothness and homogeneity of the kernel.
A recent advance was made by Lerner, Ombrosi and Rivera-Ríos [26], who identified sufficient local positivity (lack of sign change in a nonempty open set) as a workable replacement of the previous smoothness assumptions on the (still homogeneous) kernel to deduce the necessity of b ∈ BMO(R d ) for the L p (R d )-boundedness of [b, T ]. Similar results in the case of not necessarily homogeneous Calderón-Zygmund kernels were subsequently obtained by Guo, Lian and Wu [13]; see also Duong, Li, Li and Wick [10] for the concrete case when T is a Riesz transform related to the sub-Laplacian on a stratified nilpotent Lie group.
In the present work, we take the final step in generalising the class of admissible kernels, showing that any non-degenerate Calderón-Zygmund kernel is admissible for the "only if" conclusions of Theorem 1.0.1. In particular, our result applies to both two-variable kernels $K(x,y)$ (with very little smoothness) and rough homogeneous kernels $\Omega(x-y)/|x-y|^d$, under a minimal non-degeneracy assumption. In the case of homogeneous kernels we merely need that $\Omega \in L^1(S^{d-1})$ does not vanish identically. This answers positively a question raised by Lerner et al. [26, Remark 4.1]; as discussed below, we also address the more general two-weight bounds and higher commutators as considered in [26]. Also in the case of two-variable kernels, our non-degeneracy hypothesis seems to be at least as general as anything found in the literature; in contrast to [13] in particular, we allow in (1.1) that the point of non-degeneracy $x$ may lie in any direction from the reference point $y$.
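To illustrate why the homogeneous case reduces to $\Omega \not\equiv 0$, here is a simplified check of (1.1), under the extra assumption (made here only for illustration) that $\Omega$ is continuous at some direction $\theta_0 \in S^{d-1}$ with $\Omega(\theta_0) \neq 0$, so that pointwise values are meaningful. Given $y \in \mathbb{R}^d$ and $r > 0$, take $x := y + r\theta_0$, so that $|x - y| = r$ and $x \in B(y,r)^c$. By homogeneity of degree zero,
\[ |K(x,y)| = \frac{|\Omega(r\theta_0)|}{|r\theta_0|^d} = \frac{|\Omega(\theta_0)|}{r^d} = \frac{1}{c_0 r^d}, \qquad c_0 := \frac{1}{|\Omega(\theta_0)|}. \]
For a merely integrable $\Omega$, individual values are only defined almost everywhere, which is why the rigorous treatment works with Lebesgue points of $\Omega$ and averaged versions of this bound.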
1.3. The case p > q and applications to the Jacobian operator. The case p > q of Theorem 1.0.1 is completely new even for special Calderón-Zygmund operators like the Riesz transforms, for which the complementary range p ≤ q was understood for a long time. The result in this new range is perhaps surprising, in that it says that there is essentially no cancellation between bT and T b in this regime. (An initial working hypothesis before discovering this result was that the role of BMO in the commutator boundedness in this regime of exponents could be taken by another space JN r , which was implicitly introduced by John and Nirenberg [21, §3] and recently studied in [9]. However, the obtained result disproves this hypothesis.) Technically, this is the hardest case of the proof, which is somewhat explained by the fact that membership in L r (R d ) is a "global" condition, in contrast to the "uniform local" conditions defining both BMO(R d ) and α-Hölder continuous functions. Incidentally, a similar dichotomy between "global" conditions characterising L p -to-L q (or similar) boundedness for p > q, and "uniform local" conditions in the case p ≤ q, has also been recently discovered in a couple of other settings as well: (1) In the context of two-weight norm inequalities for certain discrete positive operators, the characterisation for p ≤ q by Lacey, Sawyer and Uriarte-Tuero [24] is in terms of local "testing conditions" uniform over all dyadic cubes, while the characterisation for p > q due to Tanaka [34] involves the L r membership of a "discrete Wolff potential"; see also [14] for a unified approach to both cases. (2) The boundedness of certain Toeplitz type operators between the holomorphic Hardy spaces H p and H q of the unit ball was characterised by Pau and Perälä [31] in both regimes of the exponents, in terms of a uniform local Carleson measure condition for p ≤ q, and in terms of the global L r membership of a certain auxiliary function for p > q. 
These results, analogous to our present ones but in a different context, were found independently at almost the same time: the first arXiv versions of [31] and the present work came out within two weeks of each other.
It might be of interest for general operator theory in $L^p$ spaces to find further examples of, and/or a broader context for, this phenomenon.

A part of the motivation to study this regime of exponents for commutator inequalities came from a recent observation of Lindberg [28] about the connections of such bounds, in the particular case when $T$ is the Ahlfors-Beurling transform, to the Jacobian equation $Ju = \det\nabla u = f$. It has been conjectured by Iwaniec [19] that, for $p \in (1, \infty)$, the (obviously bounded) map $J : \dot W^{1,pd}(\mathbb{R}^d)^d \to L^p(\mathbb{R}^d)$, where $\dot W^{1,pd}$ is the homogeneous Sobolev space, has a continuous right inverse and in particular is surjective. As a variant of our estimates for commutators, we will provide partial positive evidence by showing that the closed linear span of the range of $J$ is all of $L^p(\mathbb{R}^d)$. This is an $L^p$-analogue of a result of Coifman, Lions, Meyer and Semmes [6, p. 258] who obtained a similar conclusion for $J : \dot W^{1,d}(\mathbb{R}^d)^d \to H^1(\mathbb{R}^d)$, which corresponds to the case $p = 1$, with the usual replacement of $L^1$ by the Hardy space $H^1$.
Recently, Lindberg [28, p. 739] proposed an approach to the planar (d = 2) case of the Jacobian operator via the complex-variable framework , and S is the Ahlfors-Beurling operator. This led him to a question about the boundedness of the commutator [b, S] : L 2p → L (2p) ′ , which is solved as a particular case of Theorem 1.0.1; observe that 2p > 2 > (2p) ′ here. Following Lindberg's outline [28, p. 739], conclusions about the planar Jacobian could then be obtained as corollaries to Theorem 1.0.1; but it turns out that a combination of some elements of its proof, together with the techniques of Coifman, Lions, Meyer and Semmes [6], actually allows to prove such results in any dimension; see Section 3.

1.4. A priori assumptions on $b$, $T$ and $[b,T]$. In general it takes some effort to define precisely what is meant by "$Tf$", when $T$ is a singular integral operator, or by saying that such an operator "is bounded" from one space to another. In our approach to the "only if" statements of Theorem 1.0.1, we avoid all this subtlety; in fact, our assumptions may be formulated entirely in terms of the kernel $K$ without ever having to define the operator $T$ or $[b,T]$, although we still use these symbols as convenient abbreviations. All we need is estimates for the bilinear form
\[ \langle [b,T]f, g\rangle = \iint (b(x) - b(y))\,K(x,y)\,f(y)\,g(x)\,dy\,dx, \tag{1.2} \]
where the functions $f, g \in L^\infty(\mathbb{R}^d)$ have bounded supports separated by a positive distance; we refer to such estimates as off-support bounds for $[b,T]$. Under the standard estimates for a Calderón-Zygmund kernel, the above integral exists as an absolutely convergent Lebesgue integral when $b \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, as in Theorem 1.0.1.

For $p \le q$, we only need the bound (1.3), in which $B$ and $\tilde B$ denote arbitrary balls of radius $r_B$ and $r_{\tilde B}$, respectively. This is weaker than a restricted weak type $(p, q)$ estimate in two ways: the bound involves the bigger quantities |B| in place of |B| on the right, and it is only required to hold under the quantitative off-support condition above. (A certain technical strengthening, but still formally weaker than the global boundedness of $[b,T] : L^p \to L^q$, and involving off-support bounds only, is needed when $p > q$.) We note that Liaw and Treil [27] have provided a framework to interpret the boundedness of a singular integral operator (an issue that we have chosen to avoid) via off-support conditions of a similar flavour.
However, the off-support conditions that we impose on $f$ and $g$ are significantly stronger (and hence the resulting estimate on the operator restricted to such pairs of functions much weaker) than those of [27]; in particular, the quantitative separation of supports in (1.3) effectively prevents approximating a form with arbitrary $f, g$ (as done in [27]) by the off-support forms above.
The fact that one only needs off-support estimates in the "only if" directions of Theorem 1.0.1 is already implicit in the argument of Uchiyama [35, proof of Theorem 1], but not in all recent works, and it seems not to have been explicitly stated in the literature. On the other hand, Lerner et al. [26] use a restricted strong type assumption, while Guo et al. [13] state one of their results under a weak type hypothesis. Our condition (1.3) simultaneously relaxes both these assumptions.
Note that the a priori assumption that $b \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ is essentially the weakest possible to make sense of the commutator $[b,T]$, even in the off-support sense as above. While many earlier results related to Theorem 1.0.1 are obtained under this same minimal assumption, some others assume $b \in BMO(\mathbb{R}^d)$ qualitatively to begin with, and then prove the quantitative bound $\|b\|_{BMO} \lesssim \|[b,T]\|_{L^p \to L^p}$; see e.g. [10, Theorem 1.2]. A simplification brought by this stronger a priori assumption is that one can absorb error terms of the form $\varepsilon\|b\|_{BMO}$ in the argument. We will also use absorption, but only to quantities whose finiteness is guaranteed by $b \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$.
1.5. Methods and scope. We will prove versions of Theorem 1.0.1 by two methods of somewhat different scopes. The first method is based on the well-known connection of commutator estimates to weak factorisation, which has been widely used since the pioneering work [7]. (In contrast to proper factorisation, where an object is expressed as a product of other objects, weak factorisation refers to decompositions in terms of sums, or possibly infinite series, of products.) This depends on the basic identity
\[ \langle [b,T]f, g\rangle = \int b\,\big(g\,Tf - f\,T^*g\big), \]
where each term is well-defined as a Lebesgue integral for disjointly supported $f$ and $g$. Hence, if an arbitrary $h$ in (a dense subspace of) a predual of the space hoped to contain $b$ can be expanded as
\[ h = \sum_i \big(g_i\,Tf_i - f_i\,T^*g_i\big), \tag{1.4} \]
then we can hope to estimate
\[ \Big|\int b\,h\Big| \le \sum_i \big|\langle [b,T]f_i, g_i\rangle\big|. \]
An inherent difficulty is that, even with good convergence properties of the expansion (1.4) in the predual space, lacking the a priori knowledge that $b$ should be in the relevant space, it may be difficult to justify the "$\le$" above. We circumvent this problem by replacing (1.4) by an approximate weak factorisation, where the sum over $i$ is finite, but there is an additional error term $\tilde h$ that will be eventually absorbed.

This method is strong enough for proving Theorem 1.0.1 as stated, where both the function $b$ and the kernel $K(x,y)$ of $T$ are allowed to be complex-valued. Besides completeness of the theory, achieving this level of generality was initially motivated by the applications to the Jacobian operator via the Ahlfors-Beurling transform, as discussed above. The kernel of this operator, $K(z,w) = -\pi^{-1}/(z-w)^2$ for $z, w \in \mathbb{C}$, is genuinely complex-valued, and it is only natural to view it as acting on (and forming commutators with) complex-valued functions. While this is hardly exotic, it should be stressed that some of the recent contributions, like our second method, are inherently restricted to real-valued $b$.
Our second approach could be called the median method, and it is a close cousin of the recent work [26]. It makes explicit use of the order structure of the real line as the range of the function $b$. The advantage of this method is that, with little additional effort, it can also handle the higher order commutators. As before, we only need the off-support bilinear form of these operators for $f, g \in L^\infty(\mathbb{R}^d)$ with bounded supports separated by a positive distance, and $b \in L^k_{\mathrm{loc}}(\mathbb{R}^d)$ is a sufficient a priori assumption to make sense of this. We also apply this method to two-weight commutator inequalities in Section 4.3.

(1) Bilinear Calderón-Zygmund operators map $T : L^{p_1} \times L^{p_2} \to L^p$ with $\frac1p = \frac1{p_1} + \frac1{p_2}$, and one can ask about conditions for the boundedness of the associated commutators. The necessity of $b \in BMO$ when $p = q$ was first obtained by Chaffee [4] under Janson-type assumptions and methods involving the Fourier expansion of the inverse kernel $1/K$. Since the circulation of our results, Oikari [29] has extended the present hypotheses and methods to bilinear operators, obtaining a close analogue of Theorem 1.0.1 in this setting.

[23] in the celebrated Ferguson-Lacey theorem [11] and its extensions; nevertheless, various mixed-norm $L^{p_1}(L^{p_2}) \to L^{q_1}(L^{q_2})$ bounds with (p 1 , p 2 ) = (q 1 , q 2 ) have been recently characterised by Airta et al. [2], by extending the methods of the present paper.

1.7. About notation. We will make extensive use of the notation "$\lesssim$" to indicate an inequality up to an unspecified multiplicative constant. Such constants are always allowed to depend on the underlying dimension $d$, any of the Lebesgue space exponents $p, q, r, \ldots$, and also on the Calderón-Zygmund operator $T$ and its kernel $K$, as well as on the order $k$ of an iterated commutator; these are regarded as fixed throughout the argument.
The implied constants may never depend on any of the functions under consideration (neither on the function $b$ appearing in the commutator $[b,T]$ itself nor on any of the functions $f, g, \ldots$ on which the commutator acts), nor on points or subsets (balls, cubes, etc.) of their domain $\mathbb{R}^d$. Many arguments involve an auxiliary (large) parameter $A$, and dependence on it is also indicated explicitly until a suitable value of $A$ (depending only on the admissible quantities) is fixed once and for all for the rest of the argument. The subscript zero of a Lebesgue space indicates vanishing integral, i.e., $L^p_0(Q) = \{f \in L^p(Q) : \int f = 0\}$. The subscript zero of a Sobolev space $W^{1,p}_0(\Omega)$ (which will be only mentioned in passing) indicates vanishing boundary values in the Sobolev sense. Compact support is indicated by the subscript $c$, mainly in the context of the test function space $C^\infty_c(\mathbb{R}^d)$. We denote by $\fint_E f := |E|^{-1}\int_E f$ the average of a function over a set $E$ of finite positive measure.
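For the higher order commutators mentioned in Section 1.5, the following sketch may help to see why $b \in L^k_{\mathrm{loc}}(\mathbb{R}^d)$ suffices. It assumes the common iterative convention $T_b^1 := [b,T]$ and $T_b^k := [b, T_b^{k-1}]$; the symbol $T_b^k$ is a notational assumption made here for illustration. Since commuting with $b$ multiplies an off-support kernel by $b(x) - b(y)$, induction on $k$ gives, for $f, g \in L^\infty(\mathbb{R}^d)$ with bounded and separated supports,
\[ \langle T_b^k f, g\rangle = \iint (b(x)-b(y))^k\,K(x,y)\,f(y)\,g(x)\,dy\,dx, \qquad (b(x)-b(y))^k = \sum_{j=0}^k \binom{k}{j}\, b(x)^j\,(-b(y))^{k-j}. \]
For an $\omega$-Calderón-Zygmund kernel, $K$ is bounded on $\operatorname{spt} g \times \operatorname{spt} f$, so each term of the expansion is controlled by
\[ \Big(\int |b|^j\,|g|\Big)\Big(\int |b|^{k-j}\,|f|\Big) < \infty, \]
because $|b|^j, |b|^{k-j} \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ whenever $0 \le j \le k$ and $b \in L^k_{\mathrm{loc}}(\mathbb{R}^d)$.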

Complex commutators and approximate weak factorisation
In this section we prove the "only if" claims of Theorem 1.0.1.

2.1. Non-degenerate Calderón-Zygmund kernels. We begin by describing the precise class of singular integral kernels that we study. We consider two-variable Calderón-Zygmund kernels under the standard size and regularity conditions, where the modulus of continuity $\omega : [0, 1) \to [0, \infty)$ is increasing. We refer to such a kernel as an $\omega$-Calderón-Zygmund kernel. A common assumption is that $\omega(t) = c_\alpha t^\alpha$ for some $\alpha \in (0, 1]$, or a more general Dini condition $\int_0^1 \omega(t)\,\frac{dt}{t} < \infty$, but we need significantly less, namely that $\omega(t) \to 0$ as $t \to 0$. We also consider rough homogeneous kernels
\[ K(x,y) = K(x-y) = \frac{\Omega(x-y)}{|x-y|^d}, \]
where $\Omega \in L^1(S^{d-1})$ and $\Omega(tx) = \Omega(x)$ for all $t > 0$ and $x \in \mathbb{R}^d$. We note that the off-support bilinear form (1.2) is also well defined (absolutely integrable) for this type of kernels: the integrals of $y \mapsto |K(x-y)f(y)|$ are uniformly bounded over $x \in \operatorname{spt} g$, and $x \mapsto |b(x)g(x)|$ is integrable; the term involving $b(y)$ can be estimated similarly by carrying out the iterated integrals in a different order. In either case, the $L^p(\mathbb{R}^d)$-boundedness of an integral operator $T$ associated with $K$ neither follows from these assumptions, nor is assumed as a separate condition, as this is not needed. The story is different for the "if" directions of Theorem 1.0.1, but our present goal is to prove the "only if" directions with minimal assumptions.

Definition 2.1.1. We say that $K$ is a non-degenerate Calderón-Zygmund kernel, if (at least) one of the following two conditions holds:
(1) $K$ is an $\omega$-Calderón-Zygmund kernel with $\omega(t) \to 0$ as $t \to 0$ and for every $y \in \mathbb{R}^d$ and $r > 0$, there exists $x \in B(y, r)^c$ with $|K(x,y)| \ge \frac{1}{c_0 r^d}$.
(2) $K$ is a homogeneous kernel as above, with $\Omega \in L^1(S^{d-1})$ not identically zero. In particular, there exists a Lebesgue point $\theta_0 \in S^{d-1}$ of $\Omega$ such that $\Omega(\theta_0) \neq 0$.
Remark 2.1.2 (Comparison with non-degenerate kernels in the sense of Stein). Suppose that $K$ is an $\omega$-Calderón-Zygmund kernel of the convolution form $K(x,y) = K(x-y)$. Then the non-degeneracy condition (1) of Definition 2.1.1 simplifies into the following form: for every $r > 0$, we have
\[ \text{there exists } x \in B(0,r)^c \text{ with } |K(x)| \ge \frac{1}{c_0 r^d}. \tag{2.1} \]
For convolution kernels, there is also the following well-known non-degeneracy condition introduced by Stein [33, IV.4.6]: there exists a constant $a > 0$, and a unit vector $u_0$, so that
\[ |K(t u_0)| \ge a|t|^{-d} \quad \text{for all } t \in \mathbb{R}\setminus\{0\}. \tag{2.2} \]
It is immediate that Stein's non-degeneracy implies our version. In fact, assume (2.2) and fix some $c_1 \ge 1$. Given $r > 0$, we find that any $x = t\cdot u_0$, where $|t| \in [r, c_1 r]$, satisfies (2.1) with $c_0 = c_1^d/a$. Thus, while (2.1) requires just the existence of one $x$, Stein's condition provides two symmetric line segments of admissible $x$ that, moreover, have simple explicit dependence on $r$ and are always located on the same fixed line through the origin. It is not surprising that (2.1) is easily satisfied even when (2.2) is not, and we provide some examples below.
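The claim with $c_0 = c_1^d/a$ can be verified in one line. The computation below is added for illustration and assumes, as the surrounding text indicates, that (2.1) asks for some $x$ with $|x| \ge r$ and $|K(x)| \ge 1/(c_0 r^d)$, and that (2.2) reads $|K(tu_0)| \ge a|t|^{-d}$ for $t \neq 0$. For $x = t u_0$ with $r \le |t| \le c_1 r$, we have $|x| = |t| \ge r$ and
\[ |K(x)| = |K(t u_0)| \ge \frac{a}{|t|^d} \ge \frac{a}{(c_1 r)^d} = \frac{1}{(c_1^d/a)\,r^d}. \]
With $c_1 = 1$ this already produces the two admissible points $x = \pm r u_0$ and the constant $c_0 = 1/a$.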
Note that it is assumed in the discussion of non-degeneracy in [33, IV.4.6], but not in Definition 2.1.1, that K should be the kernel of a bounded operator on L 2 (R d ), and this would offer a source of cheap examples in terms of kernels of unbounded operators. To make clear that this is not a decisive difference between the two conditions, we take the slight additional trouble of making our examples correspond to bounded operators on L 2 (R d ).
Several of our examples to follow will exploit a standard resolution of unity where we note in particular that ϕ(1) = 1 under these conditions.
Example 2.1.3 (Stein's non-degeneracy violated at one or two points). When d = 1, Stein's condition (2.2) simply says that |K(x)| ≥ a|x| −1 , so any K that vanishes even at one point of R \ {0} is not admissible. Let us fix some K 0 that does satisfy (2.2), say the Hilbert kernel K 0 (x) = 1/x. We then define It is immediate that this perturbation of K 0 neither destroys the Calderón-Zygmund kernel bounds nor the L 2 (R)-boundedness of the operator. But K(1) = 0, so (2.2) is clearly violated. In contrast, (2.1) trivially holds; we can e.g. take x = −r for any given r > 0. If we also subtract K 0 (−1)ϕ(−x), so as to violate Stein's condition at both x = ±1, we still have (2.1), where we can e.g. take x = ±r when and consider a homogeneous convolution kernel , this is the truncation of a second order Riesz transform to a half space. Since and it also satisfies´S d−1 Ω(u) dσ(u) = 0. Under these conditions, it is classical that K is the convolution kernel of a bounded operator on L 2 (R d ) (see e.g. [12, Proposition II.5.5]). For any unit vector u 0 , it is clear that K(t · u 0 ) must vanish for either all t ∈ (0, ∞) or all t ∈ (−∞, 0), so that Stein's condition (2.2) is impossible. On the other hand, for any r > 0, choosing Example 2.1.5 (Stein's non-degeneracy violated on a half-line, d = 1). In dimension d = 1, it takes a bit more effort to construct an analogue of Example 2.1.4; but this pays off, as it allows us to connect the example to the theory of one-sided singular integrals introduced by Aimar, Forzani and Martín-Reyes [1]. These are simply convolution-type ω-Calderón-Zygmund kernel K supported on (0, ∞). The basic example of a non-trivial one-sided kernel provided in It is immediate that this satisfies the higher order Calderón-Zygmund estimates |x| n |D n k(x)| ≤ c n for all n = 0, 1, 2, . . ., and in particular the ω-Calderón-Zygmund estimates with ω(t) = t. 
Since consecutive bumps in the series of K have equal integral with opposite signs, K also satisfies the usual cancellation condition |´ε <|x|<N K(x) dx| ≤ C for all 0 < ε < N < ∞. However, it fails to satisfy the existence of the limit lim ε→0´ε <|x|<1 K(x) dx, which is needed to define the associated principal value convolution operator in the classical theory. But if we take the limit ε → 0 only along the powers ε = 4 −n , n ∈ Z (so as to proceed in steps of two consecutive bumps of opposite signs), then the relevant limit exists, and a trivial modification of the standard theory (see e.g. [12,Proposition II.5.5]) shows that T f (x) := lim n→∞´| x−y|>4 −n K(x − y)f (y) dy defines a bounded operator on L 2 (R d ) with convolution kernel K. Finally, this K easily satisfies Definition 2.1.1(1): Given r > 0, let r ≤ 2 j < 2r so that Recall that Stein's non-degeneracy condition was introduced for the following result [33, IV.4.6, Proposition 7]: If a convolution operator with non-degenerate kernel in Stein's sense acts boundedly on a weighted space L p (w), then the weight w must belong to Muckenhoupt's class A p . On the other hand, Aimar et al. [1] show that their one-sided operators, and hence in particular the example that we just gave, act boundedly on L p (w) for a strictly larger weight class A − p . As we will show in this paper, non-degeneracy in the sense of Definition 2.1.1(1) (which is satisfied by the said example) is enough to imply various necessary conditions on b for the boundedness of the commutator [b, T ]. In particular, a weaker notion of non-degeneracy of a singular integral T is needed to deduce that b ∈ BMO from the L p -boundedness of [b, T ], than what is needed to deduce that w ∈ A p from the L p (w)-boundedness of T . This is perhaps unexpected in view of the many known connections between the two questions.
Example 2.1.6 (Stein's non-degeneracy violated all over the place). In the two previous examples, a variant of Stein condition would still be satisfied, if we only demanded (2.2) for t ∈ (0, ∞). This final (arguably somewhat artificial) example shows that we can make (2.2) fail for a significantly larger set of t ∈ R, while still retaining non-degeneracy in the sense of Definition 2.1.1(1).
Let $d \ge 2$ and $\varphi$ be as in (2.3). Let $(w_k)_{k\in\mathbb{Z}}$ be a sequence of unit vectors that is dense in the unit sphere $S^{d-1}$ of $\mathbb{R}^d$, and let $(v_j)_{j\in\mathbb{Z}}$ be a sequence that, for each $w_k$, contains arbitrarily long subsequences of constant value $w_k$. Fixing a resolution as in (2.3), let finally It is immediate that $K$ satisfies not only the $\omega$-Calderón-Zygmund estimates with $\omega(t) = t$, but in fact the higher Calderón-Zygmund estimates $|\partial^\alpha K(x)| \le c_\alpha |x|^{-d-|\alpha|}$ of any order, and also that $K$ has vanishing integral over any sphere centred at the origin. It is well-known (see again [12, Proposition II.5.5]) that, under these conditions, $K$ is the convolution kernel of a singular integral operator bounded on To see that $K$ satisfies Definition 2.1.1(1), given $r > 0$, let $r \le 2^k < 2r$, and On the other hand, let us fix some candidate unit vector $u_0$ and $a > 0$ for Stein's condition (2.2), and choose another unit vector $u_1 \perp u_0$. By density, we can find and in particular So not only is (2.2) violated, but it is violated on symmetric line segments that may be arbitrarily long relative to their distance from $0$. On the other hand, the points of non-degeneracy for Definition 2.1.1 have a rather wild distribution in the underlying space $\mathbb{R}^d$.

2.2. Consequences of non-degeneracy. We will use the assumption of non-degeneracy through the following result: and for all $y_1 \in B$ and $x_1 \in \tilde B$, we have The implied constants can depend at most on $c_K$, $\omega$ and $d$, as well as $c_0$ or $|\Omega(\theta_0)|$ from Definition 2.1.1. If $K$ is homogeneous, we can take $x_0 = y_0 + Ar\theta_0$.
Proof of Proposition 2.2.1, case (1). We assume that $K$ is as in Definition 2.1.1(1). Fix a ball $B = B(y_0, r)$ and $A \ge 3$. We apply the assumption with $y_0$ in place of $y$ and $Ar$ in place of $r$. This produces a point $x_0 \in B(y_0, Ar)^c$ such that where $\varepsilon_A \to 0$ as $A \to \infty$ by the condition that $\omega(t) \to 0$ as $t \to 0$. Integrating this over $x \in \tilde B$ or $y \in B$, which both have measure $|\tilde B| = |B| \approx r^d$, we obtain (2.5).
We then consider the integrals in (2.5). Writing x ∈ B(x 0 , r) = B(y 0 + Arθ 0 , r) as x = y 0 + Arθ 0 + ru and y ∈ B(y 0 , r) as y = y 0 + rv, where u, v ∈ B(0, 1), and using the homogeneity of Ω, we have Here it is immediate that |II| A −1 , and hence the integral of (Ar) −d II over either x ∈B or y ∈ B is bounded by We turn to term I. Keeping either x ∈B fixed and varying y ∈ B, or the other way round, the difference u − v varies over a subset of B(0, 2). Hence bothB (Ar) −d |I| dx and´B(Ar) −d |I| dy are dominated by by the assumption that θ 0 is a Lebesgue point of Ω.
2.3. Approximate weak factorisation. For the class of non-degenerate Calderón-Zygmund operators just described, we prove certain "weak factorisation" type results that are pivotal in our proof of Theorem 1.0.1. These results have a technical flavour and may fail to have an "independent interest", but they are precisely what we need below. For a ball B ⊂ R d , we denote provided that A is chosen large enough so that ε A ≪ 1.
Proof. The decomposition is given by where we need to justify that the definition of h := −f /T * g does not involve division by zero. However, if y ∈ B, then where, using Proposition 2.2.1, recalling that A was chosen large enough so that ε A ≪ 1. This justifies the welldefinedness of the decomposition, and we turn to the quantitative bounds. From the previous considerations it directly follows that It is also immediate that Let us then estimate For y ∈ B, Hence for x ∈B, On the other hand, recalling that f ∈ L ∞ 0 (B), and thus It is then immediate that By iterating the previous decomposition (but just once more), we achieve the useful additional property that the error term is supported on the same set as the original function. In the following lemma and below, we will make use of the following notion: Definition 2.3.2 (Major subset). If E ⊂ F ⊂ R d are sets of finite measure, we say that E is a major subset of F if |E| ≥ c|F | for some fixed constant c ∈ (0, 1) that depends only on the admissible parameters, as described in Section 1.7.
In the following lemma, we denote certain major subsets by the suggestive letter $Q$, since the main subsequent application deals with the case where these sets are cubes; however, the lemma itself does not require assuming this. If $f \in L^\infty_0(Q)$, there is a decomposition provided that $A$ is chosen large enough so that $\varepsilon_A \ll 1$.
Proof. We first apply Lemma 2.3.1 to f and g 1 := 1Q ∈ L ∞ + (B), which clearly satisfies the condition We then wish to apply Lemma 2.
(We could write ε 2 A in the ultimate right, but since ε A → 0 at an unspecified rate anyway, this is irrelevant.) It remains to define and we get the required decomposition (2.6).
where the supremum is over all balls B ⊂ R d , and the homogeneous Hölder spaceṡ Note that we do not impose any boundedness condition on b; this would lead to the inhomogeneous Hölder space C 0,α , which does not play any role in our results.
Theorem 2.4.1. Let K be a non-degenerate Calderón-Zygmund kernel, and b ∈ L 1 loc (R d ). Let further and suppose that [b, T ] satisfies the following weak form of L p → L q boundedness:

(2.7)
whenever f ∈ L ∞ (B), g ∈ L ∞ (B) for any two balls of equal radius r and distance dist(B,B) r. Then Proof. Let us consider a fixed ball B ⊂ R d of radius r. Then is finite by the assumption that b ∈ L 1 loc (R d ). Given f ∈ L ∞ 0 (B), we apply Lemma 2.3.3 to write andB is another ball of radius r such that dist(B,B) Ar. Thenˆb where, by assumption (2.7), Taking the supremum over f ∈ L ∞ 0 (B) of norm one, we deduce that and the last term can be absorbed if A is fixed large enough, depending only on the implied constants. Thus If α = 0, this is precisely the condition b ∈ BMO(R d ) with the claimed estimate. For α > 0, this is also a well-known reformulation of b ∈Ċ 0,α (R d ) (which consists only of constants for α > 1). We recall the argument for completeness.
Let x i , i = 1, 2, be two Lebesgue points of b, and let r : If B ⊂ B * are two balls of radius comparable to R, then where another application of (2.8) shows that and this can be extended to all x 1 , x 2 by redefining b in a set of measure zero. This is the required bound for b Ċ0,α if α ∈ (0, 1]. If α > 1, we let y k := x 1 + N −1 k(x 2 − x 1 ) for k = 0, 1, . . . , N to deduce that With N → ∞, this shows that b(x 1 ) = b(x 2 ), and hence b is constant.
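The final step for $\alpha > 1$ is a telescoping estimate, which can be spelled out as follows. This is a sketch, assuming that the preceding bound has the form $|b(x) - b(y)| \le C|x - y|^\alpha$ for all points after redefining $b$ on a null set; the constant $C$ is a placeholder for the bound obtained there. With $y_k = x_1 + N^{-1}k(x_2 - x_1)$, consecutive points satisfy $|y_k - y_{k-1}| = |x_2 - x_1|/N$, so
\[ |b(x_2) - b(x_1)| \le \sum_{k=1}^N |b(y_k) - b(y_{k-1})| \le C\,N\Big(\frac{|x_2 - x_1|}{N}\Big)^\alpha = C\,N^{1-\alpha}\,|x_2 - x_1|^\alpha \xrightarrow[N\to\infty]{} 0, \]
since $1 - \alpha < 0$.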

Necessary condition for $[b,T]:L^p\to L^q$ when $p>q$. We now come to the more exotic case of Theorem 1.0.1, which is precisely restated in the following: and suppose that $[b,T]$ satisfies the following weak form of $L^p\to L^q$ boundedness: Then $b=a+c$ for some $a\in L^r(\mathbb{R}^d)$ and some constant $c\in\mathbb{C}$, where $\|a\|_r\lesssim\Theta$.
Proof. For certain fixed signs σ i , and random signs ε i on some probability space with expectation denoted by E, we have If p ≤ q, using (2.7) followed by Hölder's inequality and several applications of where Θ = Θ(2.7).
For the proof of Theorem 2.5.1, we need the following lemma. Given a cube Q 0 ⊂ R d , we denote by D(Q 0 ) the collection of its dyadic subcubes (obtained by repeatedly bisecting each side of the initial cube any finite number of times).
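where $f_{n,k}\in L^\infty_0(Q_{n,k})$ and $\|f_{n,k}\|_\infty\lesssim\langle|f|\rangle_{Q_{n,k}}$, and $Q_{n,k}\in\mathscr{D}(Q_0)$ are disjoint in $k$ for each $n$. Moreover, for all $n$ and $k$ we have $Q_{n,k}\subset Q_{n-1,j}$ for a unique $j$, and $\langle|f|\rangle_{Q_{n,k}}>2\langle|f|\rangle_{Q_{n-1,j}}$.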
Letting $(f_{n,k},Q_{n,k})_{k=0}^\infty$ be some enumeration of $(f_F,F)_{F\in\mathscr{F}_n}$, the claimed properties are easily checked.
Lemma 2.5.4. Let $Q_k$ be cubes, and $E_k\subset Q_k$ their subsets with $|E_k|\ge\eta|Q_k|$ for some $\eta\in(0,1)$. Let $\lambda_k\ge 0$ and $p\in[1,\infty)$. Then In the second claim, a more delicate argument could be given to improve the bound $A^d$ to $\log A$, but this is unnecessary for the present purposes.
Proof. Dualising the left side with $\phi\in L^{p'}$, we find that and the first claim by the boundedness of the maximal operator on $L^{p'}$. For the second claim, let $Q_k^*$ be a cube that contains both $Q_k$ and $\tilde Q_k$, with $\ell(Q_k^*)\lesssim A\ell(Q_k)$. Then we first use the trivial bound $1_{Q_k}\le 1_{Q_k^*}$, and then the first part of the lemma with $Q_k^*$ in place of $Q_k$, and $\tilde Q_k\subset Q_k^*$ in place of $E_k$, observing that $|\tilde Q_k|\gtrsim A^{-d}|Q_k^*|$.
Proof of Theorem 2.5.1. We fix a cube Q 0 ⊂ R d and consider the quantity This has the trivial a priori upper bound C R ≤ b L 1 (Q0) R < ∞, since b ∈ L 1 loc (R d ), but we wish to deduce a bound independent of R. To this end, we fix an f ∈ L ∞ 0 (Q 0 ) and make the decomposition given by Lemma 2.5.3. Then the last step follows since bf n = ∞ k=0 bf n,k is integrable and the terms bf n,k are disjointly supported. For each (k, n), we apply the decomposition of Lemma 2.3.3 to write wheref n,k ∈ L ∞ 0 (Q n,k ), g i n,k ∈ L ∞ (Q n,k ) and h i n,k ∈ L ∞ (Q n,k ) for some cubesQ n,k of the same size as Q n,k and distance dist(Q n,k ,Q n,k ) A diam(Q n,k ). In particular, the functionsf n,k ∈ L ∞ 0 (Q n,k ) are again disjointly supported with respect to k, for each fixed n. Thuŝ Since both the left side and the second term on the right is summable over k, so is the first term on the right, and we havê We notice that This pointwise maximal function bound proves both On the other hand, by the definition of convergent series, we have Recalling that g i n,k ∈ L ∞ (Q n,k ) and h i n,k ∈ L ∞ (Q n,k ), where dist(Q n,k ,Q n,k ) A diam(Q n,k ) = A diam(Q n,k ), the finite triple sum has exactly the form appearing in (2.9), and we can estimate Note that g i n,k and h i n,k appear in the decomposition of f i n,k in a bilinear way so that we are free to multiply these functions by any α > 0 and α −1 , respectively. In particular, since 1/r = 1/q − 1/p implies that 1/r ′ = 1/q ′ + 1/p, we may arrange the bound At a fixed point x ∈ Q N,kN ⊂ . . . 
⊂ Q 1,k1 ⊂ Q 0 , the averages |f | Q n,kn satisfy |f | Q n+1,k n+1 > 2 |f | Q n,kn ; thus |f | Q n,kn ≤ 2 n−N |f | Q N,k N , and hence N n=0 |f | For the similar term involving the g i n,k , we need in addition Lemma 2.5.4: where, as before, Collecting the bounds, we have proved that Fixing A large enough so that ε A ≪ 1, we can absorb the last term and conclude that As this is a dense subspace of L r ′ (R d ), there exists a unique bounded linear functional Λ ∈ (L r ′ (R d )) * such that By the Riesz representation theorem, such a Λ ∈ (L r ′ (R d )) * is represented by a unique function a ∈ L r (R d ) of the same norm, and hence a r Θ,ˆaf =ˆbf ∀f ∈ L ∞ 0 (R d ). t) ) and letting t → 0, we deduce that ∆(x) = ∆(y) for all Lebesgue points x and y of ∆. Thus ∆(x) ≡ c is a constant, and b = a + c with a r Θ, as claimed.
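Two elementary computations used in the proof above can be written out explicitly. This is a sketch only; the bracket $\langle|f|\rangle_Q$ for the average of $|f|$ over $Q$ is a notational assumption, and the second line uses nothing beyond the doubling property $\langle|f|\rangle_{Q_{n+1,k_{n+1}}}>2\langle|f|\rangle_{Q_{n,k_n}}$ stated in the text.

```latex
% Conjugate exponents: starting from 1/r = 1/q - 1/p,
\[
\frac{1}{r'} = 1-\frac{1}{r}
 = \Big(1-\frac{1}{q}\Big)+\frac{1}{p}
 = \frac{1}{q'}+\frac{1}{p} .
\]
% Geometric summation along the nested cubes containing the point x:
\[
\sum_{n=0}^{N}\langle|f|\rangle_{Q_{n,k_n}}
 \le \sum_{n=0}^{N} 2^{\,n-N}\,\langle|f|\rangle_{Q_{N,k_N}}
 = \big(2-2^{-N}\big)\langle|f|\rangle_{Q_{N,k_N}}
 \le 2\,\langle|f|\rangle_{Q_{N,k_N}} .
\]
```

The second bound is what allows the sum over generations $n$ to be controlled by the average over the smallest cube containing $x$, and hence by a maximal function.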

Applications to the Jacobian operator
We now discuss applications of the previous methods towards the problem of finding an unknown function u with the prescribed Jacobian The Jacobian equation has been quite extensively studied in the form of a Dirichlet boundary value problem in a bounded, sufficiently smooth domain Ω ⊂ R d , There are several works dealing with datum f in Hölder [8] or Sobolev spaces [36]; in a different direction, a recent result [22,Theorem 6.3] addresses f ∈ L p (Ω) with p ∈ ( 1 d , 1). Our interest is in the conjecture of Iwaniec [19] discussed in Section 1.3; besides being set on the full space R d , it deals with datum f in the spaces L p (R d ), p ∈ (1, ∞), which fall in some sense "between" the higher regularity classes considered by [8,36], and the sub-integrability classes in [22]. The closest analogue of our results in the existing literature is the Hardy space H 1 (R d ) results of Coifman, Lions, Meyer and Semmes [6].

3.1. Norming properties of Jacobians. We prove that the norm of a function $b$ in various function spaces can be computed by dualising against functions in the range of the Jacobian operator. The following lemma, a variant of considerations used in [6, p. 263], already gives a flavour of such results: Proof. We can find $g\in L^\infty_0(Q')$, supported in a slightly smaller cube $Q'=(1-\delta)Q$, and with $\|g\|_\infty\le 1$ such that If we now replace $v$ by a standard mollification $\phi_\varepsilon*v$ and note that $\nabla(\phi_\varepsilon*v)=\phi_\varepsilon*\nabla v$, we observe that the above display remains valid for small enough $\varepsilon>0$, We now proceed with this replacement, writing $w=\phi_\varepsilon*v$.
Next, at least one of the integrals $\int_Q b\,\partial_k w_k$, $k=1,\ldots,d$, has to be at least as big as their average $d^{-1}\fint_Q b\operatorname{div}w$, so in fact We now define a vector-valued function u = ( where $c$ is the centre of $Q$, we write $x_i$ (resp. $c_i$) for the $i$th component of $x$ (resp. $c$), and $\varphi_Q\in C_c^\infty(2Q)$ is a usual bump such that $1_Q\le\varphi_Q\le 1_{2Q}$ and $|\nabla\varphi_Q|\lesssim 1/\ell(Q)$. Then where $e_i$ is the $i$th coordinate vector. Thus $\fint_{2Q}|\nabla u_i|^q\lesssim 1$ for $i\ne k$, and we already knew this for $i=k$. Since $u_k=w_k$ is compactly supported inside $Q$, so is $J(u)$, and for $x\in Q$, we simply have $\nabla u_i(x)=e_i$ for $i\ne k$. Hence since only the identity permutation gives a contribution. We have shown that for a certain $u\in C_c^\infty(2Q)^d$ such that $\fint_{2Q}|\nabla u|^q\lesssim 1$, and this proves the lemma.
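The permutation argument at the end of this proof can be made explicit. The following is a sketch under the assumption, suggested by the surrounding text but not displayed there, that $u_i(x)=(x_i-c_i)\varphi_Q(x)$ for $i\ne k$ and $u_k=w_k$; on $Q$ one has $\varphi_Q\equiv 1$, hence $\partial_j u_i=\delta_{ij}$ for $i\ne k$.

```latex
% Leibniz expansion of the Jacobian determinant at a point x in Q.
% Assumed (inferred) definition: u_i(x) = (x_i - c_i)\varphi_Q(x) for i \ne k, u_k = w_k.
\begin{align*}
J(u)(x)
 &= \det\big(\partial_j u_i(x)\big)_{i,j=1}^{d}
  = \sum_{\pi\in S_d}\operatorname{sgn}(\pi)\prod_{i=1}^{d}\partial_{\pi(i)}u_i(x) \\
 &= \sum_{\pi\in S_d}\operatorname{sgn}(\pi)\,\partial_{\pi(k)}w_k(x)
    \prod_{i\ne k}\delta_{\pi(i),i}
  = \partial_k w_k(x), \qquad x\in Q,
\end{align*}
% since \pi(i) = i for all i \ne k forces \pi to be the identity permutation.
```

Outside $Q$ the $k$th row $\nabla u_k=\nabla w_k$ vanishes, so $J(u)=0$ there, which matches the statement that $J(u)$ is supported inside $Q$.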
For the passage from the local estimate of Lemma 3.1.1 to global function space norms, we need two further lemmas that have nothing to do with the Jacobian, and will also be used in the next section.
Then there is a collection $\mathscr{Q}$ of dyadic subcubes of $Q_0$ such that, at almost every $x\in Q_0$, and $\mathscr{Q}$ is sparse in the sense that each $Q\in\mathscr{Q}$ has a major subset $E(Q)$ such that $|E(Q)|\ge\frac12|Q|$ and the subsets $E(Q)$ are pairwise disjoint.
Proof. This is a more elementary variant of Lerner's oscillation formula [25]; we recall the idea of the proof. For any disjoint subcubes $Q_j^1$ of $Q_0$, we have If the $Q_j^1$ are chosen to be the maximal dyadic subcubes $Q\subset Q_0$ such that j qualifies for a major subset. Moreover, the sum of the first two terms on the right of (3.3) is dominated by $1_{Q_0}\fint_{Q_0}|b-\langle b\rangle_{Q_0}|$ and the last term is a sum over disjointly supported terms of the same form as where we started, and we can iterate.

Proof. Let us consider a sequence of cubes
and hence, taking the L r average over x ∈ Q m , n=0 is a Cauchy sequence, and hence converges to some c. We conclude by Fatou's lemma that We are now ready for the main result of this section: ∞) for i = 1, . . . , d, and Moreover, in each case the respective function space norm is comparable to (3.4).
Proof. Let us first consider the "if" directions. The constant cases follow from the fact that´J(u) = 0, and it is immediate from Hölder's inequality that We then deal with r ∈ [ d d+1 , 1]. Let us first check that there is at least one k such that 1/r − 1/r k < 1. Suppose for contradiction that we have 1/r − 1/r k ≥ 1 for all k = 1, . . . , d. Summing over k, this gives d/r − 1/r ≥ d, and thus r ≤ d−1 d . But we are also assuming that d d+1 ≤ r, thus d 2 ≤ (d − 1)(d + 1) = d 2 − 1, a contradiction. Without loss of generality, we may assume that k = 1, thus so that s ∈ (1, ∞). We can then write where σ ∈ C ∞ c (R d ) d satisfies div σ = 0 and σ s d i=2 ∇u i ri ≤ 1, is the vector of the Riesz transforms, and finally f = (−∆) 1/2 u 1 satisfies f r1 The last two norms are bounded by one, and 1/s ′ = 1 − 1/r + 1/r 1 , so that We turn to the "only if" parts of the theorem. Recall the definition of Γ from (3.4). We apply Lemma 3.1.1 with some q > max i=1,...,d r i . If Q ⊂ R d is any cube, then for some u ∈ C ∞ c (2Q) d with (3.5) If r = 1, this is precisely the condition that b BMO Γ. For r < 1, the conclusion follows as in the proof of Theorem 2.4.1.
Let us then consider $r>1$. Let $Q_0\subset\mathbb{R}^d$ be an arbitrary cube. We apply Lemma 3.1.2 and monotone convergence to see that where $\{Q_k\}_{k=1}^\infty$ is an enumeration of the collection $\mathscr{Q}$ given by Lemma 3.1.2. We then dualise with some $\|\phi\|_r\le 1$, and apply just the first step of (3.5) to each $Q_k$ in place of $Q$. Note that this produces a possibly different In order to proceed, we use a randomisation trick. Due to the $d$-linear nature of the Jacobian, we invoke a sequence $(\zeta_k)_{k=1}^N$ of independent random $d$th roots of unity, i.e. the $\zeta_k$'s are independent random variables on some probability space, distributed so that $\mathbb{P}(\zeta_k=e^{i2\pi a/d})=1/d$ for each $a=0,1,\ldots,d-1$. The case $d=2$ thus corresponds to the familiar random signs. The important feature of these random variables is that, denoting by $\mathbb{E}$ the expectation, Indeed, if $k_1=\ldots=k_d=k$, then $\prod_{j=1}^d\zeta_{k_j}=\zeta_k^d\equiv 1$, so also its expectation is equal to 1. Otherwise, we have noting that $e^{i2\pi n_j/d}\ne 1$ since $0<n_j<d$.
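The orthogonality property of the random roots of unity can be verified by a short computation. This is a sketch; the multiplicity $n_k:=\#\{j: k_j=k\}$ (so that $\sum_k n_k=d$) is notation introduced here for illustration.

```latex
% n_k := number of indices j in {1,...,d} with k_j = k, so \sum_k n_k = d.
\begin{align*}
\mathbb{E}\prod_{j=1}^{d}\zeta_{k_j}
 &= \mathbb{E}\prod_{k}\zeta_k^{\,n_k}
  = \prod_{k}\mathbb{E}\,\zeta_k^{\,n_k}
 && \text{(independence)}, \\
\mathbb{E}\,\zeta_k^{\,n}
 &= \frac{1}{d}\sum_{a=0}^{d-1} e^{i2\pi a n/d}
  = \begin{cases}
      1, & d \mid n, \\[4pt]
      \dfrac{1}{d}\cdot\dfrac{1-e^{i2\pi n}}{1-e^{i2\pi n/d}} = 0, & d \nmid n .
    \end{cases}
\end{align*}
% If all k_j coincide, the only multiplicity is n_k = d and the expectation is 1.
% Otherwise some k has 0 < n_k < d, so one factor, and hence the product, vanishes.
```

For $d=2$ this reduces to the familiar fact that $\mathbb{E}\,\varepsilon_k\varepsilon_l=\delta_{kl}$ for independent random signs.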
Using ( * ), we can now continue the computation from above witĥ To estimate each L ri norm above, we dualise with ψ r ′ i ≤ 1. Recalling that u k i ∈ C ∞ c (2Q k ) satisfies the bound for u in Lemma 3.1.1, and using the definition of λ k and the disjoint major subsets E(Q k ) from Lemma 3.1.2, we havê , by the boundedness of the maximal operator and the choice of q > r i so that r ′ i > q ′ . Substituting back, we have checked that for an arbitrary cube Q 0 ; by Lemma 3.1.3, this completes the proof of the theorem in the remaining case that r > 1.
Remark 3.1.5. The "if" parts of the cases $r\in(\frac{d}{d+1},1]$ of Theorem 3.1.4 could also be deduced from a result of [6, cf. Theorem II.3], which says that $J(u)$ belongs to the Hardy space $H^r(\mathbb{R}^d)$ under the same assumptions, together with the $H^1$-BMO duality when $r=1$ or the $H^r$-$\dot C^{0,d(1/r-1)}$ duality for $r\in(\frac{d}{d+1},1)$. However, a separate argument would be required for the end-point $r=\frac{d}{d+1}$ anyway: in fact, , since $J(u)$ fails, in general, to satisfy the required moment conditions $\int x_i a=0$ of an $H^{d/(d+1)}$-atom $a$. This follows e.g. from the proof of Lemma 3.1.1, which contains the observation that any $\partial_k w$, with $w\in C_c^\infty(\mathbb{R}^d)$, can arise as the Jacobian $J(u)$ of a suitable $u\in C_c^\infty(\mathbb{R}^d)^d$. However, we have $\int x_k\partial_k w=-\int w\,\partial_k x_k=-\int w$, which can easily be nonzero. The departure from the Hardy-Hölder duality is also reflected by the fact that the condition for $b$ in Theorem 3.1.4 corresponding to $r=\frac{d}{d+1}$ is the usual Lipschitz-continuity, $|b(x)-b(y)|\lesssim|x-y|$, and not the Zygmund class condition arising from the Hardy space duality.
On the other hand, one can also give a different proof of the "if" part of Theorem 3.1.4 in this special case $r=\frac{d}{d+1}$. Using the notation from the previous proof, , we find that $r_1\in(1,d)$. Writing, as before, $J(u)=\nabla u_1\cdot\sigma$, we have
\[
\int bJ(u)=\int b\,\nabla u_1\cdot\sigma=-\int u_1\operatorname{div}(b\sigma)=-\int u_1(\nabla b)\cdot\sigma,
\]
since $\operatorname{div}\sigma=0$. But then we can estimate where $\|\nabla b\|_\infty$ is bounded by the Lipschitz constant, $\|\sigma\|_s\le 1$, and by Sobolev's inequality, and this completes the alternative proof.
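The exponent bookkeeping behind this alternative proof can be checked directly. The sketch below takes the relation $1/s'=1-1/r+1/r_1$ from the proof of Theorem 3.1.4 as given, and writes $r_1^*=dr_1/(d-r_1)$ for the Sobolev exponent of $r_1<d$; the displayed estimate is a plausible reading of the omitted step, not a quotation.

```latex
% Relation taken from the proof of Theorem 3.1.4: 1/s' = 1 - 1/r + 1/r_1.
\[
r=\frac{d}{d+1}
\ \Longrightarrow\
\frac{1}{s'} = 1-\frac{d+1}{d}+\frac{1}{r_1}
 = \frac{1}{r_1}-\frac{1}{d}
 = \frac{1}{r_1^{*}},
\qquad r_1^{*}=\frac{d\,r_1}{d-r_1} .
\]
% Hoelder's inequality with exponents (\infty, s', s), then Sobolev's inequality:
\[
\Big|\int bJ(u)\Big|
 = \Big|\int u_1\,(\nabla b)\cdot\sigma\Big|
 \le \|\nabla b\|_{\infty}\,\|u_1\|_{s'}\,\|\sigma\|_{s}
 \lesssim \|\nabla b\|_{\infty}\,\|\nabla u_1\|_{r_1}\,\|\sigma\|_{s} .
\]
```

The coincidence $s'=r_1^*$ is specific to the end-point $r=\frac{d}{d+1}$, which is why this argument is available only in that case.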
3.2. The linear span of Jacobians. Here we will obtain the following consequence of Theorem 3.1.4: The power d in the series is related to the d-homogeneity of the Jacobian, so that for p = 1. The case p = 1 is already due to Coifman et al. [6]; they explicitly formulate a similar result [6, Theorem III.2] for the "div-curl example" but point out that "this type of answer applies also to other examples like the Jacobian". Our proof of the full Theorem 3.2.1 depends on the same functional analytic lemma as used in [6] for the case p = 1. The formulation below combines [6, Lemmas III.1, III.2] and is taken from [28]. We recall the short proof for the sake of recording a precise quantitative relation between the equivalent qualitative conditions: Let V ⊂B X (0, 1) be a symmetric subset of the unit-ball of a Banach space X. Then the following conditions are equivalent: (1) There is α > 0 such that sup x∈V | λ, x | ≥ α λ X * for all λ ∈ X * .
Proof of Theorem 3.2.1. We apply Lemma 3.2.
. It is immediate that V is symmetric, and that V ⊂B X (0, 1) if p > 1. For p = 1, this last inclusion is nontrivial but well known from [6,Theorem II.1].
The assertion of Theorem 3.2.1 is clearly the same as (3) of Lemma 3.2.2 for these choices of X and V . By Lemma 3.2.2, it hence suffices to verify (1) of the same lemma, i.e., that But this is precisely the statement of Theorem 3.1.4 for r = p ∈ [1, ∞) and r 1 = . . . = r d = pd. The a priori condition that b ∈ L p ′ (R d ) guarantees that the additive constant present in Theorem 3.1.4 for r > 1 does not appear here. (1) Lindberg [28,Lemma 3.1] shows that another equivalent condition in Lemma 3.2.2 is that ∞ n=1 n · s(V ) has second category in X. Hence, if any of these conditions fails, then ∞ n=1 n · s(V ) has first category in X. Lindberg uses this to show [28,Theorems 1.2,7.4] that the set [28, p. 739] also sketches how to deduce the special case d = 2 of Theorems 3.1.4 and 3.2.1 from the special case of (then unknown) Theorem 1.0.1, where T is the Ahlfors-Beurling operator. Since a more general result is proved above by working directly with the Jacobian, we do not repeat his argument here, but the interested reader may consult the companion paper [18] for this approach. Nevertheless, the strategy proposed by Lindberg was an important motivation for the discovery of our present results.

Higher order real commutators and the median method
In this section we establish the following variant of Theorem 1.0.1. In one direction, it generalises Theorem 1.0.1 by allowing iterated commutators of arbitrary order, but in another direction it imposes a more restrictive assumption by requiring the pointwise multiplier b to be real-valued. This restriction arises from the proof using the so-called median method, which takes explicit advantage of the order structure of the real line. We note, however, that this restriction is imposed on b only; the kernel K of T may still be complex-valued.
Theorem 4.0.1. Let 1 < p, q < ∞, let T be a non-degenerate Calderón-Zygmund operator on R d , and let k ∈ {1, 2, . . .} and b ∈ L k loc (R d ; R). Then the k times iterated commutator if and only if: • p = q and b has bounded mean oscillation, or , and c is constant.
As in the case of Theorem 1.0.1, all the "if" statements are either classical (such as the case p = q that goes back to Coifman, Rochberg and Weiss [7]) or straightforward; this applies to the remaining cases, which may be handled by easy extensions of the arguments sketched for k = 1 in Section 1.1. (There is also a variant of the p < q case of Theorem 4.0.1 due to Paluszyński, Taibleson and Weiss [30], but for k > 1, it deals with operators that are related to, but not exactly the same as, the iterated commutators T k b that we study. This leads to a slightly different result.) As before, our principal task is to prove the "only if" directions.

4.1. Basic estimates of the median method. We will not give a formal definition of the "median method", but the reason for this nomenclature should be fairly apparent from the considerations that follow. The broad philosophy of this method should be attributed to Lerner, Ombrosi and Rivera-Ríos [26], but we fine-tune some of its details in such a way as to be able, in particular, to answer a problem that was raised but left open in [26, Remark 4.1]. The simplest form of the median method is contained in the following lemma. Under a quantitative positivity assumption on the kernel (which may nevertheless be complex-valued!), it needs no additional "Calderón-Zygmund" structure.
Proof of Lemma 4.1.1. The basic observation is that, if α ∈ R and x ∈B ∩{b ≤ α}, thenˆB and hence In a completely analogous way, integrating over x ∈B ∩ {b ≥ α}, we also prove that Choosing α as a median of b onB, we have We present a variant of the result for rough homogeneous kernels. While the conclusion is essentially identical, the proof requires an additional iteration of the basic argument.
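The elementary facts about medians on which the proof of Lemma 4.1.1 rests can be spelled out. This is a sketch only: the positive-part notation $(t)_+=\max(t,0)$ is introduced here for illustration, and the inequalities below are the standard ingredients of the median method rather than a reconstruction of the omitted displays.

```latex
% \alpha is a median of the real-valued function b on the ball \tilde B:
\[
\big|\tilde B\cap\{b\le\alpha\}\big| \ge \tfrac12|\tilde B|,
\qquad
\big|\tilde B\cap\{b\ge\alpha\}\big| \ge \tfrac12|\tilde B| .
\]
% Pointwise comparison, valid for every y and every integer k \ge 1:
\[
b(x)\le\alpha \ \Longrightarrow\ \big(b(y)-\alpha\big)_+^{k}\le\big(b(y)-b(x)\big)_+^{k},
\qquad
b(x)\ge\alpha \ \Longrightarrow\ \big(\alpha-b(y)\big)_+^{k}\le\big(b(x)-b(y)\big)_+^{k} .
\]
```

Both comparisons use the order structure of $\mathbb{R}$, which is why the method is restricted to real-valued $b$; the two halves of $|b-\alpha|^k=(b-\alpha)_+^k+(\alpha-b)_+^k$ are then treated separately on sets of measure at least $\frac12|\tilde B|$.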
Then there is a (large) constant A, depending only on the above data, such that every b ∈ L k loc (R d ) satisfies the following estimate for every ball B: Proof. Given B = B(y 0 , r), let x 0 = y 0 + Arθ 0 , where the large A is yet to be chosen, andB = B(x 0 , r). The basic observation is that, if b(x) ≤ α, then Hence, taking α as the median of b onB, we havê the double integral can be dominated by the sum of where ε A → 0 as A → ∞, by the assumption that θ 0 is a Lebesgue point of K.
Substituting back, and observing in particular the cancellation of the factors A d and A −d in the double integral, we have proved that and hence Replacing (b, α) by (−b, −α), we also have and adding the two estimates, where E i ⊂ B andẼ i ⊂B for i = 1, 2. Recall that α was the median of b onB, but since this choice of α is a quasi-minimiser for the integral on the right, we also deduce the more symmetric version for some subsets E j ⊂ B,Ẽ j ⊂B, whereB is a ball of the same radius r and dist(B,B) r. By assumption (4.3), it follows that From this the rest follows as in the proof of Theorem 2.4.1.
Theorem 4.2.2. Let K be a non-degenerate Calderón-Zygmund kernel, let k ∈ {1, 2, . . .}, and b ∈ L k loc (R d ; R). Let and suppose that T k b satisfies the following weak form of L p → L q boundedness: whenever, for each i = 1, . . . , N , we have f i ∈ L ∞ (Q i ) and g i ∈ L ∞ (Q i ) for cubes Then b = a+c for some a ∈ L rk (R d ) and some constant c ∈ C, where a rk Θ.
Proof. Let us fix some (large) cube Q 0 ⊂ R d . We apply Lemma 3.1.2 to find that where we also introduced an enumeration of the sparse collection Q of dyadic subcubes of Q 0 given by Lemma 3.
It is enough to give a uniform bound for the finite sums Using the sparseness of Q ⊃ {Q j } N j=1 , we can bound the second factor by We dualise the first factor with N j=1 λ r ′ j |Q j | ≤ 1 to end up considering where we used the assumption (4.5) in the last step. By Lemma 2.5.4, we have, using the disjoint major subsets E(Q j ) ⊂ Q j , For the first factor on the right of (4.6), we obtain a similar bound by starting with which also follows from Lemma 2.5.4, and then finishing as before.
We have now proved that b − b Q0 L kr (Q0) Θ 1/k for any cube Q 0 ⊂ R d . This shows in particular that b ∈ L kr loc (R d ), and we conclude by Lemma 3.1.3.

4.3. Two-weight norm inequalities of Bloom type. We finally discuss the boundedness of commutators between weighted $L^p$ spaces with weights from the Muckenhoupt class where the supremum is over all balls $B\subset\mathbb{R}^d$. We consider $p\in(1,\infty)$ fixed throughout this discussion, and denote by $w':=w^{-\frac{1}{p-1}}$ the dual weight. One checks that $w\in A_p$ if and only if $w'\in A_{p'}$. The space $L^{p'}(w')$ is the dual of $L^p(w)$ with respect to the unweighted duality $\langle f,g\rangle=\int fg$. We will identify a weight and its induced measure, using notation like $w(Q):=\int_Q w$.
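The two claims about the dual weight can be verified in a few lines. The sketch below assumes the standard definition of the Muckenhoupt characteristic, $[w]_{A_p}=\sup_B\langle w\rangle_B\langle w'\rangle_B^{p-1}$ with $\langle\cdot\rangle_B$ the average over $B$, which is presumably the display omitted above.

```latex
% Since 1/(p'-1) = p-1, the dual weight of w' is w itself:
\[
(w')^{-\frac{1}{p'-1}} = \big(w^{-\frac{1}{p-1}}\big)^{-(p-1)} = w,
\qquad\text{hence}\qquad
[w']_{A_{p'}}
 = \sup_B \langle w'\rangle_B\,\langle w\rangle_B^{\,p'-1}
 = [w]_{A_p}^{\,p'-1} .
\]
% Unweighted duality between L^p(w) and L^{p'}(w'), using w^{-p'/p} = w':
\[
\Big|\int fg\Big|
 = \Big|\int \big(f\,w^{1/p}\big)\big(g\,w^{-1/p}\big)\Big|
 \le \Big(\int |f|^{p}w\Big)^{1/p}\Big(\int |g|^{p'}w'\Big)^{1/p'} .
\]
```

The first line shows that $w\in A_p$ and $w'\in A_{p'}$ are equivalent with a quantitative relation between the characteristics; the second is Hölder's inequality and gives the stated duality pairing.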
We will be concerned with the boundedness of i.e., we allow two different weights on the domain and the target space, but (in contrast to the rest of the paper) we restrict the Lebesgue exponents to p = q ∈ (1, ∞). This fits with the line of investigation that was started by Bloom [3] and that has been recently revived by Holmes, Lacey, and Wick [15], followed by several others as we shortly recall. Here we complete the following picture: The first version of Theorem 4.3.1, when k = d = 1 and T is the Hilbert transform, is due to Bloom [3]. Still for first order commutators (k = 1) but in arbitrary dimension d ≥ 1, Holmes, Lacey, and Wick [15] proved the "if" part of Theorem 4.3.1 for all standard Calderón-Zygmund operators, and the "only if" part assuming the boundedness of each of the d Riesz transforms R i , i = 1, . . . , d, thus extending the exact scope (in terms of operators) of the classical Coifman-Rochberg-Weiss theorem [7] to the two-weight setting. The first two-weight result for iterated commutators was achieved in the "if" direction by Holmes and Wick [16] (with a simplified proof in [17]): they obtained the boundedness of T k b : L p (µ) → L p (λ) for any k ≥ 1 under the stronger condition that b ∈ BMO(ν) ∩ BMO(R d ) ⊂ BMO(ν 1/k ). (For the inclusion, which in general is strict, see [26,Lemma 4.7].) Finally, Lerner, Ombrosi, and Rivera-Ríos [26] obtained Theorem 4.3.1 almost as stated: they identified the correct BMO space with the weight ν 1/k depending on the order k of the commutator, and they proved the "if" part of Theorem 4.3.1 for all standard Calderón-Zygmund operators and the "only if" part for all homogeneous Calderón-Zygmund operators with the fairly general local positivity assumption discussed in Section 1.2. For us, it remains to prove this "only if" part assuming non-degeneracy only, and more precisely we prove: Theorem 4.3.2. Let K be a non-degenerate Calderón-Zygmund kernel, let k ∈ {1, 2, . . .} and b ∈ L k loc (R d ; R). 
Let p ∈ (1, ∞), let λ, µ ∈ A p (R d ), and suppose that T k b satisfies the following weak form of L p (µ) → L p (λ) boundedness: whenever f ∈ L ∞ (B), g ∈ L ∞ (B) for any two balls of equal radius r and distance dist(B,B) r. Then b ∈ BMO(ν 1/k ), where ν = (µ/λ) 1/p , and more precisely b BMO(ν 1/k ) Θ 1/k .
Let us first observe that (4.7) is indeed a weak form of the boundedness of $T_b^k:L^p(\mu)\to L^p(\lambda)$: if this boundedness holds, then
\[
|\langle T_b^kf,g\rangle|
\le\|T_b^kf\|_{L^p(\lambda)}\|g\|_{L^{p'}(\lambda')}
\le\|T_b^k\|_{L^p(\mu)\to L^p(\lambda)}\|f\|_{L^p(\mu)}\|g\|_{L^{p'}(\lambda')}
\le\|T_b^k\|_{L^p(\mu)\to L^p(\lambda)}\cdot\|f\|_\infty\mu(B)^{1/p}\cdot\|g\|_\infty\lambda'(\tilde B)^{1/p'},
\]
and thus $\Theta\le\|T_b^k\|_{L^p(\mu)\to L^p(\lambda)}$. Turning to the proof of Theorem 4.3.2, we need a simple lemma, which is the only place where the $A_p$ condition is used. Proof. We recall that all $A_p$ weights, and then also $\lambda'\in A_{p'}$, are doubling. Hence $\lambda'(\tilde B)\approx\lambda'(B)$. We then use the $A_p$ property of both $\mu$ and $\nu$ directly via the definition (together with some basic algebra involving $p$ and $p'$) to see that Proof of Theorem 4.3.2. As in the proof of the unweighted version in Theorem 4.2.1, we have (just copying (4.4) from the said proof) for some subsets $E_j\subset B$, $\tilde E_j\subset\tilde B$, where $\tilde B$ is a ball of the same radius $r$ and $\operatorname{dist}(B,\tilde B)\approx r$.
Acknowledgements. I would like to thank Sauli Lindberg for bringing the problem about the Jacobian operator and its connection to commutators to my attention, and for pointing out some oversights in an earlier version of the manuscript. I would also like to thank Riikka Korte for discussions on the theme of the paper, and the anonymous referee for constructive suggestions on the presentation.
Declarations of interest: None.