Weak and Strong Type $A_1$–$A_\infty$ Estimates for Sparsely Dominated Operators

We consider operators $T$ satisfying a sparse domination property
$$|\langle Tf,g\rangle |\le c\sum _{Q\in \mathscr {S}}\langle f\rangle _{p_0,Q}\langle g\rangle _{q_0',Q}|Q|$$
with averaging exponents $1\le p_0<q_0\le \infty$. We prove weighted strong type boundedness for $p_0<p<q_0$ and use new techniques to prove weighted weak type $(p_0,p_0)$ boundedness with quantitative mixed $A_1$–$A_\infty$ estimates, generalizing results of Lerner, Ombrosi, and Pérez and Hytönen and Pérez. Even in the case $p_0=1$ we improve upon their results as we do not make use of a Hörmander condition of the operator $T$. Moreover, we also establish a dual weak type $(q_0',q_0')$ estimate. In a last part, we give a result on the optimality of the weighted strong type bounds including those previously obtained by Bernicot, Frey, and Petermichl.


Introduction
In recent years, after a solution was found to the well-known $A_2$ conjecture [24], the role of sparse operators has become increasingly important in the weighted theory of many operators, see for instance [5,12,14,29,31] and references therein. Sparse domination yields optimal quantitative $A_p$ estimates for $1 < p < \infty$, for example, for the classical Riesz transforms in $\mathbb{R}^n$. As has been shown by Bernicot, Frey, and Petermichl [6], the idea of sparse domination reaches far beyond the theory of Calderón-Zygmund operators. Indeed, one can consider the Riesz transform $\nabla L^{-1/2}$ in, e.g., a convex doubling domain in $\mathbb{R}^n$, where $L$ is the Laplace operator with respect to Neumann boundary conditions. Generally, the Riesz transform in such a setting does not satisfy any pointwise regularity estimates and therefore falls outside of the class of Calderón-Zygmund operators. However, it satisfies a sparse domination property which does in fact yield the quantitative weighted bounds from the $A_2$ conjecture. In $\mathbb{R}^n$, foregoing the full range of $1 < p < \infty$, one can consider the Riesz transform for elliptic operators $L = -\operatorname{div}(A\nabla)$ for $A$ with bounded, complex coefficients. Such operators are only bounded in $L^p$ for a certain range $p_0 < p < q_0$, and it was established in [6] that they satisfy a sparse domination property
$$|\langle Tf, g\rangle| \le c\sum_{Q\in\mathscr{S}}\langle f\rangle_{p_0,Q}\langle g\rangle_{q_0',Q}|Q|,$$
from which general quantitative weighted bounds in the respective weighted $L^p$-spaces are deduced.
For Calderón-Zygmund operators, weighted weak type (1, 1) estimates were established by Lerner et al. [33] and later improved upon by Hytönen and Pérez [26]. In this article, we establish the corresponding ( p 0 , p 0 ) estimate in the more general setting described above. The arguments used in [33] rely on introducing weights in the classical arguments involving Calderón-Zygmund decompositions f = g + b and the vanishing mean value property of the 'bad' part b in combination with the Hörmander condition of the kernel of the operator. In general, the operators we are considering here need not be integral operators at all and for the more general operators such as the Riesz transform associated to an elliptic operator, an argument by Blunck and Kunstmann [8] (see also [23]) gave a weak type ( p 0 , p 0 ) boundedness using an adapted L p 0 Calderón-Zygmund decomposition, where a certain cancellation of the operator with respect to the semigroup generated by the elliptic operator replaces the regularity estimates of the kernel. Weights were then introduced into this argument by Auscher and Martell [2], but it seems like these techniques do not yield optimal bounds in terms of the constants of the weights. Therefore, we give a new argument to establish the corresponding bounds while still recovering the old bounds found in [26].
Here, in order to combine the previous approaches and to tie the theory together, we deduce quantitative weighted bounds directly from sparse domination assumptions. We introduce weights into a weak boundedness argument for sparse operators where there exists a Calderón-Zygmund decomposition with the property that the 'bad' part b cancels completely. We then combine this with generalizations of the main lemmata used in [33]. Moreover, we leave the Euclidean setting and extend the results to more general doubling metric measure spaces including certain bounded domains and Riemannian manifolds as was also studied in [8] and [2,3].
In a last part we show that the strong type weighted estimates are optimal, given a precise control of the asymptotic behaviour of the unweighted $L^p$ operator norm of $T$ at the endpoints $p = p_0$ and $p = q_0$. We give an example of such an operator in the case $p_0 = 1$, $q_0 = n$.

The Setting
We consider the Euclidean space $\mathbb{R}^n$ equipped with a Borel measure $\mu$ that satisfies $0 < \mu(B) < \infty$ for all balls $B$ and which satisfies the doubling property, i.e. there is a $C > 0$ such that
$$\mu(2B) \le C\mu(B) \qquad (1.1)$$
for all balls $B$, where $2B$ denotes the ball with the same centre as $B$ and twice its radius. Taking the smallest such $C$ we define $\nu := \log_2 C$, which we refer to as the doubling dimension. We write $|E| := \mu(E)$, and for each measurable set $E$ of finite non-zero measure and each $0 < p \le \infty$ we will write
$$\langle f\rangle_{p,E} := |E|^{-1/p}\|f\chi_E\|_p,$$
which for $p < \infty$ equals $\big(\frac{1}{|E|}\int_E |f|^p\,d\mu\big)^{1/p}$, and where $\chi_E$ denotes the indicator function of the set $E$. We write $\langle f, g\rangle := \int f g\,d\mu$, and define $p' := p/(p-1) \in [1, \infty]$ for $1 \le p \le \infty$.
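As a quick sanity check of these definitions, the following worked computation (an illustration added here, assuming that $\mu$ is the Lebesgue measure on $\mathbb{R}^n$, which is only one admissible choice) shows that the doubling dimension then coincides with $n$, and lists the conjugate exponents at the endpoints.

```latex
% Assumption: \mu is the Lebesgue measure on \mathbb{R}^n and B = B(x;r).
\begin{align*}
  \mu(2B) &= |B(x;2r)| = 2^n\,|B(x;r)| = 2^n\,\mu(B),\\
  C &= 2^n \quad\text{(the smallest admissible constant in the doubling property)},\\
  \nu &= \log_2 C = n.
\end{align*}
% Conjugate exponents p' = p/(p-1), with the usual conventions at the endpoints:
\[
  1' = \infty,\qquad 2' = 2,\qquad \infty' = 1 .
\]
```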
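For $\alpha \in \{0, \tfrac13, \tfrac23\}^n$ we will consider the translated dyadic systems
$$\mathscr{D}^\alpha := \bigcup_{k\in\mathbb{Z}}\big\{2^{-k}\big([0,1)^n + m + (-1)^k\alpha\big) : m\in\mathbb{Z}^n\big\},\qquad \mathscr{D} := \bigcup_{\alpha}\mathscr{D}^\alpha.$$
The collection $\mathscr{D}$ is used as a replacement for the collection of balls or the collection of all cubes in $\mathbb{R}^n$, which is justified by the fact that for any ball $B(x; r) \subseteq \mathbb{R}^n$ there is a cube $Q \in \mathscr{D}$ so that $B(x; r) \subseteq Q$ and $\operatorname{diam}(Q) \le \rho r$ for a constant $\rho = \rho(n) > 0$, and for any cube $P \subseteq \mathbb{R}^n$ there is a cube $Q \in \mathscr{D}$ such that $P \subseteq Q$ and $\ell(Q) \le 6\ell(P)$, where $\ell(R)$ denotes the side length of a cube $R$.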
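A collection $\mathscr{S} \subseteq \mathscr{D}$ is called $\eta$-sparse for $0 < \eta \le 1$ if for each $\alpha \in \{0, \tfrac13, \tfrac23\}^n$ there is a pairwise disjoint collection $(E_Q)_{Q\in\mathscr{S}\cap\mathscr{D}^\alpha}$ of measurable sets so that $E_Q \subseteq Q$ and $|Q| \le \eta^{-1}|E_Q|$.
Remark 1.1 Since $\mathbb{R}^n$ is connected and unbounded, the doubling property implies that $\mu(\mathbb{R}^n) = \infty$ [21]. We are working in $\mathbb{R}^n$ for notational reasons only; since our applications lie in a more general framework, our arguments are written so that they work with minimal adaptations in general doubling metric measure spaces $X$. Our main results remain true even when $\mu(X) < \infty$, for example when $X$ is a bounded Lipschitz domain in $\mathbb{R}^n$. We will detail how this can be seen in Sect. 4.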
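To make the notion of sparseness concrete, here is a small worked example (added for illustration, assuming Lebesgue measure on $\mathbb{R}$ and the standard grid with $\alpha = 0$, and reading $\eta$-sparse as: there are pairwise disjoint $E_Q \subseteq Q$ with $|Q| \le \eta^{-1}|E_Q|$). It shows that a nested family can be sparse, while the family of all dyadic subintervals of $[0,1)$ is not.

```latex
% Sparse example: nested intervals shrinking to 0.
\[
  Q_k := [0, 2^{-k}),\qquad E_{Q_k} := [2^{-k-1}, 2^{-k}),\qquad k = 0, 1, 2, \dots
\]
% The sets E_{Q_k} are pairwise disjoint, E_{Q_k} \subseteq Q_k, and
\[
  |E_{Q_k}| = 2^{-k-1} = \tfrac12\,|Q_k| ,
\]
% so (Q_k)_{k \ge 0} is \eta-sparse with \eta = 1/2.

% Non-example: all dyadic subintervals of [0,1) of side length at least 2^{-N}.
% If this family were \eta-sparse, disjointness of the E_Q inside [0,1) would give
\[
  N + 1 = \sum_{k=0}^{N}\ \sum_{\substack{Q \subseteq [0,1)\\ \ell(Q) = 2^{-k}}} |Q|
  \le \eta^{-1}\sum_{Q} |E_Q| \le \eta^{-1},
\]
% which fails as soon as N + 1 > 1/\eta.
```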
We let D be a space of test functions on R n with the property that it is dense in L p (w) for all 1 ≤ p < ∞ and all weights w ∈ A ∞ , for example, D = C ∞ c (R n ).

Definition 1.2
Let $T$ be a (sub)linear operator, initially defined on $\mathcal{D}$, with the following property: There are $1 \le p_0 < q_0 \le \infty$ and constants $c > 0$ and $0 < \eta \le 1$ so that for each pair of functions $f, g \in \mathcal{D}$ there is an $\eta$-sparse collection $\mathscr{S} \subseteq \mathscr{D}$ so that
$$|\langle Tf,g\rangle| \le c\sum_{Q\in\mathscr{S}}\langle f\rangle_{p_0,Q}\langle g\rangle_{q_0',Q}|Q|. \qquad (1.2)$$
Then we will write $T \in S(p_0, q_0)$, or $T \in S(p_0, q_0; \mu)$ if we wish to emphasize the underlying measure, and we shall refer to the operators in this class as sparsely dominated operators.
If $T \in S(p_0, q_0)$, then it extends to a bounded operator on $L^p$ for all $p_0 < p < q_0$, see Proposition 2.2. For examples of operators in this class we refer the reader to Sect. 1.3. When writing that a constant $C = C(T) > 0$ depends on $T$, we mean that it depends on the constants $c$, $\eta$ in the domination property (1.2). We remark that the sum on the right-hand side of (1.2) can be split into $3^n$ sums by considering the different dyadic grids, simplifying the proofs by only having to consider a single dyadic grid at a time. Finally, we remark that if $T$ is linear, then $T \in S(p_0, q_0)$ if and only if $T^* \in S(q_0', p_0')$, where $T^*$ denotes the dual operator of $T$.
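The duality remark can be verified in one line. The sketch below is an added illustration; it assumes that $T$ is linear, that $T^*$ is defined on the same test class via the bilinear pairing $\langle f, g\rangle = \int fg\,d\mu$, and that the sparse bound reads as in the abstract.

```latex
% For test functions f, g and the sparse collection \mathscr{S} associated with the pair (f, g):
\begin{align*}
  |\langle T^* g, f\rangle| = |\langle Tf, g\rangle|
  &\le c\sum_{Q\in\mathscr{S}} \langle f\rangle_{p_0,Q}\,\langle g\rangle_{q_0',Q}\,|Q| \\
  &= c\sum_{Q\in\mathscr{S}} \langle g\rangle_{q_0',Q}\,\langle f\rangle_{(p_0')',Q}\,|Q| ,
\end{align*}
% using (p_0')' = p_0. This is the sparse bound for T^* with the pair of exponents (q_0', p_0'),
% and 1 \le p_0 < q_0 \le \infty is equivalent to 1 \le q_0' < p_0' \le \infty.
% Applying the same argument to T^* gives the converse.
```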
We will write $A \lesssim B$ when there is a constant $C > 0$, independent of the important parameters, so that $A \le CB$. Moreover, we write $A \simeq B$ if $A \lesssim B$ and $B \lesssim A$.

Main Results
For 1 ≤ p 0 < q 0 ≤ ∞ we consider an operator T ∈ S( p 0 , q 0 ). Then T will be of strong type ( p, p) for any p 0 < p < q 0 and of weak type ( p 0 , p 0 ), see Proposition 2.2. As a matter of fact, T will satisfy weighted boundedness for various classes of weights. It has been shown in [6] that for p 0 < p < q 0 and any w ∈ A p/ p 0 ∩RH (q 0 / p) we have where φ( p) = (q 0 / p) ( p/ p 0 − 1) + 1, and that the exponent in the last estimate is optimal for sparse operators. This generalizes the positive result of the well-known A 2 -conjecture, stating that for all Calderón-Zygmund operators T one has (1.4) Indeed, the result in (1.3) recovers this result since Calderón-Zygmund operators are in the class S(1, ∞). Historically, the estimate (1.4) was first proven to be true for the Beurling-Ahlfors transform by Petermichl and Volberg [38], solving an optimal regularity problem for solutions to Beltrami equations. In between this period and the time that (1.4) was established in full generality by Hytönen [24], it was shown by Lerner, Ombrosi, and Pérez [32] that for all Calderón-Zygmund operators T one has for all 1 < p < ∞, showing a significantly better exponent of the constant of the weight when considering the smaller class of weights A 1 ⊆ A p . Using mixed A 1 -A ∞ type estimates, this result was improved by Hytönen and Pérez [26] to which appears in [41][42][43]. They also provided an improvement to (1.4) using mixed A p -A ∞ type estimates. Such mixed type estimates have also appeared in the recent work by Li [34], who gives a direct improvement of (1.3).
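To see how the exponent $\varphi(p)$ behaves, the following added computation evaluates it in two cases. It reads the formula above as $\varphi(p) = (q_0/p)'\,(p/p_0 - 1) + 1$, with the prime denoting the conjugate exponent; the placement of the prime is an assumption about the intended notation, since it does not survive in the text.

```latex
% General form for finite q_0, using (q_0/p)' = q_0/(q_0 - p):
\[
  \varphi(p) = \frac{q_0}{q_0 - p}\Big(\frac{p}{p_0} - 1\Big) + 1
             = \frac{q_0\,(p - p_0)}{p_0\,(q_0 - p)} + 1 .
\]
% Calderon-Zygmund case p_0 = 1, q_0 = \infty: here (q_0/p)' = \infty' = 1, so
\[
  \varphi(p) = 1\cdot(p - 1) + 1 = p ,
\]
% i.e. the index reduces to the classical A_p index.
% A restricted-range case p_0 = 1, q_0 = 4, p = 2: here (q_0/p)' = 2' = 2, so
\[
  \varphi(2) = 2\cdot(2 - 1) + 1 = 3 .
\]
```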
To continue on along this line of results, we establish the following: Then there is a constant c = c(T, ν, n) > 0 so that In particular, we have Our result (1.7) recovers (1.6) when setting p 0 = 1, q 0 = ∞. One shows that (1.8) follows from (1.7) by applying (2.1) and Proposition 2.1(ii). This result recovers the exponent in (1.5) when q 0 = ∞.
The constants found in the estimate (1.7) can be used to establish weighted weak type $(p_0, p_0)$ boundedness. In the work of Lerner, Ombrosi, and Pérez [32] it was shown that for all Calderón-Zygmund operators $T$ and all weights $w \in A_1$ one has
$$\|T\|_{L^1(w)\to L^{1,\infty}(w)} \lesssim [w]_{A_1}\big(1 + \log [w]_{A_1}\big). \qquad (1.9)$$
This result is related to the weak Muckenhoupt-Wheeden conjecture, which states that one has linear dependence on $[w]_{A_1}$ on the right-hand side of (1.9), i.e. that the logarithm can be removed, and which is now known to be false [36]. However, the result (1.9) was improved by Hytönen and Pérez [26] to
$$\|T\|_{L^1(w)\to L^{1,\infty}(w)} \lesssim [w]_{A_1}\log\big(e + [w]_{A_\infty}\big). \qquad (1.10)$$
It is expected that this dependence on the constants of the weight is optimal. Both the proofs of (1.9) and (1.10) rely on taking a Calderón-Zygmund decomposition $f = g + b$. Here, the Hörmander condition of the kernel of $T$ is used to deal with the 'bad' part $b$, using an argument that can already be found in [37] (namely, they use [18, Lemma 3.3, p. 413]). Since we are making no such assumptions on our operators, which may not even be integral operators, we rely on new methods to deal with this term, using only sparse domination. We establish the following result: We note that in particular we recover the bound (1.10). It is of interest to point out that we get this bound even for operators outside of the class of Calderón-Zygmund operators that are in $S(1, \infty)$, see Example 1.8.
We also establish a dual result of the type first studied in [33], generalizing the result [26,Theorem 1.23]. Here we denote by T * the dual operator of T for linear T .
Using the ideas of [35], we then establish optimality of the weighted estimates in terms of the asymptotic behaviour of the unweighted L p operator norm of T at the endpoints p = p 0 and p = q 0 . We refer to Definition 5.1 for the definition of the exponents α T ( p 0 ) and γ T (q 0 ). Theorem 1.6 Let 1 ≤ p 0 < q 0 ≤ ∞, let T ∈ S( p 0 , q 0 ), and let w ∈ A p/ p 0 ∩ RH (q 0 / p) . Then the exponent in the estimate from [6] is optimal under the assumption that α T ( p 0 ) = 1/ p 0 and γ T (q 0 ) = 1/q 0 . Moreover, for w ∈ A 1 ∩ RH (q 0 / p) , the exponent in the estimate In the example of the Riesz transform on two copies of R n glued smoothly along their unit circles [9], it is known that q 0 = n and γ T (q 0 ) = (n − 1)/n, and thus the weighted estimate is optimal. See Example 5.5.

Examples
There is a wealth of examples of sparsely dominated operators. Other than the class of Calderón-Zygmund operators, our main examples can be found in [6, Sect. 3]. See also the earlier work [2]. We point out several examples of particular interest here.
Example 1.7 (Riesz transform associated with elliptic second-order divergence form operators) Let $A$ be a complex, bounded, measurable matrix-valued function in $\mathbb{R}^n$ satisfying the ellipticity condition $\operatorname{Re}(A(x)\xi\cdot\overline{\xi}) \ge \lambda|\xi|^2$ for all $\xi \in \mathbb{C}^n$ and a.e. $x \in \mathbb{R}^n$. Then one can define a maximal accretive operator $Lf := -\operatorname{div}(A\nabla f)$ which generates a semigroup $(e^{-tL})_{t>0}$. If both the semigroup and the family $(\sqrt{t}\,\nabla e^{-tL})_{t>0}$ satisfy $L^{p_0}$-$L^{q_0}$ off-diagonal estimates, then the Riesz transform $R := \nabla L^{-1/2}$ is in the class $S(p_0, q_0)$. In particular we point out that if we are using the Lebesgue measure in dimension $\nu = n = 1$, we have $p_0 = 1$ and $q_0 = \infty$ so that $R \in S(1, \infty)$. We refer the reader to [1] for more values of $p_0$ and $q_0$ in other dimensions in the Euclidean setting and to [6] for details on the sparse domination result.
Example 1.8 (Riesz transform associated to Neumann Laplacian) Suppose $\Delta$ is the Laplace operator associated with Neumann boundary conditions in a bounded convex doubling domain in $\mathbb{R}^n$. As studied in [40], the Riesz transform $\nabla\Delta^{-1/2}$ will not in general have a kernel satisfying pointwise regularity estimates and is thus not in the class of Calderón-Zygmund operators. However, this operator does belong to the class $S(1, \infty)$ and will therefore satisfy the bound (1.6). Note that for this example we need to apply our results to a metric measure space other than $\mathbb{R}^n$. We refer the reader to Sect. 4 for an overview of the theory in bounded domains.
Example 1.9 (Fourier multipliers) Let $m$ be the function in $\mathbb{R}^n$ defined by $m(\xi) = 1 - |\xi|^2$ for $|\xi| \le 1$ and $m(\xi) = 0$ elsewhere. For $\delta \ge 0$, the Bochner-Riesz operator $B^\delta$ is defined as the Fourier multiplier $B^\delta f := (m^\delta \hat{f})^\vee$. Then, for any $\delta > 0$ there exists a $1 < p_0 < 2$ so that for any $0 < \varepsilon < 2 - p_0$ we have $B^\delta \in S(p_0 + \varepsilon, 2)$. For details we refer the reader to [5].

Notation
where it will be made clear from the context which dyadic grid D α we are considering, and where we will write M := M 1 . Similarly we define M B p and M B to be the uncentred maximal operators with respect to balls rather than cubes.
We list some of the basic definitions and facts about weights. A measurable function w : R n → (0, ∞) is called a weight. We identify a weight w with a Borel measure by setting for all measurable sets E ⊆ R n . For 1 ≤ p ≤ ∞ we, respectively, denote by L p (w) and L p,∞ (w) the Lebesgue and weak Lebesgue spaces with measure w.
where the supremum is taken over all balls B ⊆ R n . For an overview of this constant we refer the reader to [27] and references therein. In particular we point out that for a dimensional constant c = c(n, ν) > 0 we have where the first inequality here can be found in [26,Proposition 2.2], while the second one follows from Hölder's inequality.
For s = 1 we will use the interpretation We provide some facts about the classes A 1 and A ∞ that we will use.
(iii) There are constants c, κ > 0 depending only on the doubling dimension ν, so that for every w ∈ A 1 we have Proof For (i) we refer the reader to [20,41]. Property (ii) can be found in [28]. Property (iii) is a consequence of [27, Theorem 1.1]. Indeed, this result states that there are constants c, κ > 0 depending only on ν such that for any ball B we have Thus, (iii) follows from Hölder's inequality and the definition of A 1 .

Weighted Boundedness of Sparsely Dominated Operators
We wish to give some heuristic arguments as to why we can expect certain weighted boundedness of sparsely dominated operators. We start with the following observation: The verification of the strong boundedness is by now standard, see also [13]. While the weak type boundedness should be well known, we could not find a precise reference for the cases where p 0 > 1. For the case p 0 = 1 we refer the reader to [12,Theorem E], see also [4,Proposition 6]. For completeness we give a proof of the general case here, which we defer to the end of this section.
We will show that if an operator T lies in S( p 0 , q 0 ; μ), then T must also lie in S(q − , q + ; w) for appropriate weights w, and for certain q − < q + depending on w. Then Proposition 2.2 implies that T satisfies weighted boundedness.
First we note that if we have a sparse collection S ⊆ D with respect to the reference measure μ, then S is also sparse with respect to all weights w ∈ A ∞ . Indeed, suppose w ∈ A p for some 1 ≤ p < ∞ and suppose S is η-sparse with (E Q ) Q∈S ∩D α as one of the associated pairwise disjoint collections. Then, by Hölder's inequality (use the first equation in Next we observe that for any 1 ≤ p ≤ q < ∞ it follows from Hölder's inequality that Thus, if T ∈ S( p 0 , q 0 ; μ) and w ∈ A p 1 / p 0 ∩ RH (q 0 /q 1 ) for some p 0 < p 1 ≤ q 1 < q 0 , then it follows from the self-improvement properties (i) of Proposition 2.1 that we can find p 0 ≤ q − < p 1 , Picking appropriate functions f , g, and by applying the sparse domination property to the pair f , gw, we find a sparse collection S ⊆ D so that by (2.2) we have In other words, we have T ∈ S(q − , q + ; w) and thus we obtain the boundedness For the case where p 1 = q 1 = p 0 and thus when w ∈ A 1 ∩RH (q 0 / p 0 ) , an analogous reasoning shows that for some p 0 < q + we have T ∈ S( p 0 , q + ; w). Hence, it follows from Proposition 2.2 that T is of weak type ( p 0 , p 0 ) with respect to such weights.
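The claim that sparseness is preserved when passing from $\mu$ to a weight can be written out explicitly. The derivation below is an added sketch for $1 < p < \infty$; it assumes the $A_p$ characteristic is normalised as $[w]_{A_p} = \sup_Q \langle w\rangle_{1,Q}\langle w^{1-p'}\rangle_{1,Q}^{p-1}$ with the supremum taken over a family that contains the cubes in question, which may differ from the normalisation used in the text by a dimensional constant.

```latex
% Let Q be a cube and E \subseteq Q measurable. By Hoelder's inequality,
\[
  |E| = \int_E w^{1/p}\,w^{-1/p}\,d\mu
      \le w(E)^{1/p}\Big(\int_Q w^{1-p'}\,d\mu\Big)^{1/p'} ,
\]
% since -p'/p = 1 - p'. Raising to the power p and dividing by |Q|^p gives
\[
  \Big(\frac{|E|}{|Q|}\Big)^{p}
  \le \frac{w(E)}{w(Q)}\;\langle w\rangle_{1,Q}\,\langle w^{1-p'}\rangle_{1,Q}^{\,p-1}
  \le [w]_{A_p}\,\frac{w(E)}{w(Q)} .
\]
% Taking E = E_Q with |E_Q| \ge \eta |Q| yields
\[
  w(Q) \le [w]_{A_p}\,\eta^{-p}\,w(E_Q),
\]
% so an \eta-sparse collection with respect to \mu is
% (\eta^{p}/[w]_{A_p})-sparse with respect to w, with the same sets E_Q.
```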
Our main results deal with the cases p 1 = p 0 where we establish quantitative bounds of T in terms of the characteristic constants of the weight in the situations Proof of Proposition 2.2 By splitting into 3 n terms, we may assume without loss of generality that our sparse domination occurs in a single dyadic grid D α throughout our arguments.
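Let $p_0 < p < q_0$ and let $f \in L^p \cap \mathcal{D}$, $g \in L^{p'} \cap \mathcal{D}$. Then we can find a sparse collection $\mathscr{S} \subseteq \mathscr{D}^\alpha$ so that
$$|\langle Tf,g\rangle| \lesssim \sum_{Q\in\mathscr{S}}\langle f\rangle_{p_0,Q}\langle g\rangle_{q_0',Q}|Q| \lesssim \int M_{p_0}f\,M_{q_0'}g\,d\mu.$$
By using Hölder's inequality and by noting that $p > p_0$, $p' > q_0'$, it remains to observe that $\|M_{p_0}f\|_p \lesssim \|f\|_p$ and $\|M_{q_0'}g\|_{p'} \lesssim \|g\|_{p'}$. Hence, $T$ extends to a bounded operator in $L^p$.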
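For readers who want the intermediate steps of this standard argument, they can be sketched as follows. This is an added illustration; it assumes a single grid $\mathscr{D}^\alpha$, the sets $E_Q$ from the definition of $\eta$-sparseness, and the dyadic maximal operators $M_q f := \sup_{Q\in\mathscr{D}^\alpha}\langle f\rangle_{q,Q}\chi_Q$.

```latex
\begin{align*}
  \sum_{Q\in\mathscr{S}} \langle f\rangle_{p_0,Q}\,\langle g\rangle_{q_0',Q}\,|Q|
  &\le \eta^{-1}\sum_{Q\in\mathscr{S}} \langle f\rangle_{p_0,Q}\,\langle g\rangle_{q_0',Q}\,|E_Q|
     && (|Q| \le \eta^{-1}|E_Q|) \\
  &\le \eta^{-1}\sum_{Q\in\mathscr{S}} \int_{E_Q} M_{p_0}f\; M_{q_0'}g\,d\mu
     && (E_Q \subseteq Q) \\
  &\le \eta^{-1}\int M_{p_0}f\; M_{q_0'}g\,d\mu
     && (\text{the } E_Q \text{ are pairwise disjoint}) \\
  &\le \eta^{-1}\,\|M_{p_0}f\|_{p}\,\|M_{q_0'}g\|_{p'}
     && (\text{H\"older}).
\end{align*}
% Boundedness of M_{p_0} on L^p follows from that of M = M_1 on L^{p/p_0}, since p/p_0 > 1:
\[
  \|M_{p_0}f\|_p = \big\|M(|f|^{p_0})\big\|_{p/p_0}^{1/p_0}
  \lesssim \big\||f|^{p_0}\big\|_{p/p_0}^{1/p_0} = \|f\|_p ,
\]
% and likewise for M_{q_0'} on L^{p'} because p'/q_0' > 1.
```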
For the second assertion we will use the equivalence Given such an f with f p 0 = 1 and E ⊆ R n of finite positive measure we define Then we can find a sparse collection S ⊆ D α such that We proceed by taking a Calderón-Zygmund decomposition of | f | p 0 ∈ L 1 . We can find a disjoint collection P ⊆ D α of cubes so that = ∪ P∈P P and functions g, Noting that for all P ∈ P we have P ∩ E = ∅, the properties of the dyadic system imply that for any Q ∈ S with Q ∩ E = ∅ we have P ⊆ Q whenever P ∩ Q = ∅. But then by (2.5) and arguments similar to the ones in the first part of the proof we have Thus, by combining (2.4) and (2.7), we find using (2.6) that Hence, we may conclude from (2.3) that T L p 0 →L p 0 ,∞ < ∞, finishing the proof.

Remark 2.3
The cancellation of the 'bad' part b in our proofs occurs because we are able to perform our Calderón-Zygmund decomposition in the same dyadic grid as where the sparse domination occurs, see Lemma 4.6. The usual Whitney decomposition argument that is used for Calderón-Zygmund decompositions in general doubling metric measure spaces, as can be found for example in [11,39], is not precise enough for this particular argument and we need to adapt the results so that they work with our dyadic grids.

Proofs of the Main Results
Throughout these proofs we fix $\alpha \in \{0, \tfrac13, \tfrac23\}^n$ and only consider cubes taken from the grid $\mathscr{D}^\alpha$. We also only consider the dyadic maximal operators $M_p$ to be taken with respect to this grid to facilitate some of the arguments and for simpler constants in our estimates. Recall that $\mathcal{D}$ denotes a space of functions in $\mathbb{R}^n$ which has the property that it is dense in $L^p(w)$ for all $1 \le p < \infty$ and all weights $w \in A_\infty$.
As an analogue to [32, Lemma 3.2] and [26, Lemma 6.1], our main lemma is the following: , where c p is as in Theorem 1.3.
We point out that a similar type of result is established in [15,Theorem B].

Remark 3.2
In the unweighted case we note that Thus, it appears that adding the weight accounts for the extra term We break up the proof of the main lemma into a sequence of lemmata.
Proof Note that for any Q ∈ D α we have as desired.

Lemma 3.4
For all 1 ≤ q < ∞, w ∈ L q loc , p 0 < p < q 0 , and f, g ∈ D, we have For the proof of this lemma we require two results on dyadic maximal operators. By the classical result of Fefferman and Stein [17] we have and thus M f L p (w) ≤ p f L p (M w) for 1 < p < ∞ by the Marcinkiewicz Interpolation Theorem. This implies that Moreover, as a consequence of Kolmogorov's Lemma we have Proof We will prove the stronger assertion valid for all 1 < r < p 0 , generalizing a version of the result [30, Theorem 1.7] and its proof in which the case p 0 = 1, q 0 = ∞ is treated. The result of the lemma follows by taking r = ( p − 1)/q + 1 ∈ (1, p ]. We set so that 0 < β ≤ 1. By Lemma 3.3 and by Hölder's Inequality we find that where .
We will consider two cases. First assume that and β = 1. Then by the assumption r < p 0 . Then it follows from (3.2) and (3.3) that as desired.
For the second case we assume that Then, using r < p 0 , we note that
Thus, the result follows from (3.4). By combining the two cases, the assertion follows.
For the proof of Lemma 3.1 we will use a result that can be found in [32, p. 8] which states that Proof of Lemma 3.1 Setting v := w (q 0 / p) , it follows from (3.6) that where we used that By maximizing the function t → t 1/t for t ≥ 1, we note that (q ) 1/q ≤ e 1/e . Hence, by combining Lemma 3.4 and (3.7), the result follows.
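The elementary maximisation used in this proof can be checked directly; the short computation below is added for completeness.

```latex
% Maximise t \mapsto t^{1/t} = \exp(\log t / t) on [1, \infty):
\[
  \frac{d}{dt}\,\frac{\log t}{t} = \frac{1 - \log t}{t^{2}}
  \;\begin{cases} > 0, & 1 \le t < e,\\ = 0, & t = e,\\ < 0, & t > e, \end{cases}
\]
% so the maximum is attained at t = e and
\[
  t^{1/t} \le e^{1/e} \approx 1.4447 \qquad\text{for all } t \ge 1 .
\]
```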
Proof of Theorem 1.3 Set v := w (q 0 / p) ∈ A 1 and let κ be the constant from Proposition 2.1(iii). Setting Hence, from Lemma 3.1 it follows that proving the result.
Proof of Theorem 1.4 The proof uses arguments similar to the ones presented in the proof of Proposition 2.2. We use the equivalence Fixing a function $f \in \mathcal{D}$ with $\|f\|_{L^{p_0}(w)} = 1$ and a measurable set $E$, we set where $M^B$ denotes the uncentred maximal operator with respect to all balls $B \subseteq \mathbb{R}^n$ and where $c = c(n, \nu) > 0$ is the constant appearing in the inequality , which is a consequence of (3.1). We have and thus, setting $E' := E\setminus\Omega$, we have $w(E') \ge w(E) - w(\Omega) \ge w(E)/2$. By applying Lemma 4.6 with $|f|^{p_0} \in L^1$, we obtain a disjoint collection $\mathscr{P} \subseteq \mathscr{D}^\alpha$ of cubes so that $\Omega = \bigcup_{P\in\mathscr{P}} P$ and functions $g$, $b$ as provided by that lemma. Picking a function $h$ satisfying $|h| \le \chi_{E'}$ and $hw \in \mathcal{D}$, we apply the sparse domination property to the pair $f$, $hw$ to find a sparse collection $\mathscr{S} \subseteq \mathscr{D}^\alpha$ so that, by using Lemma 3.1 with the weight $w\chi_{E'}$, for all $p_0 < p < q_0$ and $1 < q < \infty$ we have (3.10) Note here that we have used the fact that the terms involving $b$ cancel in the exact same way as they do in the proof of Proposition 2.2.
Similar to what is done in [26,32,37], we deal with the term involving g as follows: We remark that for a cube P ∈ D α we have M (φχ P c )(x) = ess inf P M (φχ P c ) for all x ∈ P. (3.11) Indeed, let x, y ∈ P and let R ∈ D α so that x ∈ R. Then either R ⊆ P or P ⊆ R.
In the first case we have φχ P c 1,R = 0, while in the second case we have y ∈ R and thus φχ P c 1,R ≤ M (φχ P c )(y). Thus, we may conclude that M (φχ P c )(x) ≤ M (φχ P c )(y), proving (3.11) by symmetry. Using this result, we find, since E ⊆ P c for all P ∈ P, that Since g = | f | p 0 on c , we conclude that We first assume that q 0 < ∞. We set v := w (q 0 / p 0 ) ∈ A 1 and choose Thus, it follows from (3.10), (3.12), and Proposition 2.1(ii) that Next, we note that Moreover, we compute Hence, it follows from (3.8) and (3.13) that The result follows by considering the cases p 0 = 1 and p 0 > 1 separately. Now we assume that q 0 = ∞. Taking Thus, from (3.10) and Proposition 2.1(iii) we obtain w(E) for all p 0 < p < ∞. Choosing p = p 0 + 1/(log(e + [w] A ∞ )), we have Moreover, we compute Hence, by (3.8), (3.14), and (3.15), we conclude that By considering the cases p 0 = 1 and p 0 > 1 separately, the desired result follows.
Proof of Theorem 1. 5 We use the equivalence Let f ∈ D with f q 0 = 1 and let E ⊆ R n with 0 < w(E) < ∞. We denote by M B w the uncentred maximal operator over balls with respect to the measure w dμ. Then we define By applying the Whitney Decomposition Theorem to , see Theorem 4.7, we obtain a disjoint collection P ⊆ D α of cubes so that = ∪ P∈P P with the property that for each P ∈ P there exists a ball B(P) containing P so that B(P) ∩ c = ∅ and |B(P)| |P|, where the implicit constant depends only on n and ν, see also the proof of Lemma 4.6. Moreover, we obtain functions g, b so that | f Next, we pick a function h satisfying |h| ≤ χ E and hw 1/q 0 ∈ D, and fix a p 0 < p < q 0 to be chosen later. We apply the sparse domination property to the pair hw 1/q 0 , f to find a sparse collection S ⊆ D α so that, by applying Lemma 3.1 with the weight w 1/(q 0 / p) , we find that for all 1 < q < ∞ we have (3.17) where the terms involving b cancel in the same way as before. Choosing Furthermore, fixing a P ∈ P and x ∈ B(P) ∩ c , we have Thus, by combining (3.18), (3.19), and (3.20) with (3.17), we conclude that Thus, by (3.16) and (3.21) we have as desired.

Extensions of the Results to Spaces of Homogeneous Type
This section is dedicated to extending our main results to spaces of homogeneous type (X, d, μ). Here X is a set equipped with a quasimetric d, i.e. a mapping satisfying the usual properties of a metric except for the triangle inequality, which is replaced by the estimate d(x, y) ≤ A(d(x, z) + d(z, y)) for a constant A ≥ 1, and μ is a Borel measure on X satisfying the doubling property, i.e. there is a C > 0 such that for all x ∈ X , r > 0. Taking the smallest such C we set ν := log 2 C. Furthermore, we write |E| := μ(E) for all Borel sets E ⊆ X . The doubling property implies that for x ∈ X and R ≥ r > 0 we have In turn, this implies that if y ∈ B(x; R) for x ∈ X , then for 0 < r ≤ 2 AR we have We make the additional assumption that 0 < |B| < ∞ for all balls B ⊆ X . This property ensures that X is separable [7, Proposition 1.6]. Finally, we make the assumption that Lebesgue's Differentiation Theorem holds. This holds, for example, when X is a domain in R n . Indeed, more generally, if A = 1 (that is, (X, d) is a metric space) and μ is an inner regular Borel outer measure, then Lebesgue's Differentiation Theorem holds, see [22,Sect. 14]. This assumption is used for the L ∞ bound on the good part in our Calderón-Zygmund decompositions.
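The growth estimate for balls that follows from the doubling property can be derived by iterating the doubling inequality. The computation below is an added sketch; the precise form of the displayed estimate in the text is not recoverable here, so the constant obtained should be read as one possible version.

```latex
% Let x \in X and R \ge r > 0, and choose the integer k := \lceil \log_2(R/r) \rceil,
% so that R \le 2^{k} r and k < \log_2(R/r) + 1. Iterating |B(x;2s)| \le C\,|B(x;s)| gives
\begin{align*}
  |B(x;R)| \le |B(x;2^{k}r)| \le C^{k}\,|B(x;r)|
  = 2^{\nu k}\,|B(x;r)|
  \le 2^{\nu}\Big(\frac{R}{r}\Big)^{\nu}|B(x;r)|
  = C\Big(\frac{R}{r}\Big)^{\nu}|B(x;r)| ,
\end{align*}
% where \nu = \log_2 C is the doubling dimension.
```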
We will consider the situations where X is unbounded and where X is bounded separately, the latter situation being simpler. To facilitate this, we impose that the underlying quasimetric space (X, d) has exactly one of the following properties: for all x ∈ X , r > 0; (II) diam X < ∞.
We note that property (I) and property (II) are mutually exclusive, since (I) implies that X is unbounded. The extra assumption for the unbounded case is not too restrictive in the sense that the unbounded spaces in our applications usually do satisfy property (I). We point out that when (X, d) is a connected metric space, then it satisfies either (I) or (II):

Proposition 4.1 Suppose $X$ is metric, connected, and unbounded. Then (I) holds with $\gamma = 1$.
Proof Let $r > \varepsilon > 0$. The assumptions on $X$ imply that $X \ne B(x; r-\varepsilon) \cup B(x; r)^c$ and thus we can pick $y \in B(x; r)\setminus B(x; r-\varepsilon)$ so that $\operatorname{diam}(B(x; r)) \ge d(x, y) \ge r - \varepsilon$, proving the result.
A non-connected example where (I) holds with γ = 1/2 is the subset (−∞, 0)∪(1, 2) of the real line. An example where (I) fails is any metric space that has an isolated point.
We will use the following definition of a dyadic system in X .

Definition 4.2
Let $0 < c_0 \le C_0 < \infty$ and $0 < \delta < 1$. If for each $k \in \mathbb{Z}$ we have a pairwise disjoint collection $\mathscr{D}_k = (Q^k_j)_{j\in J_k}$ of measurable subsets of $X$ and a collection of points $(z^k_j)_{j\in J_k}$, then we call $(\mathscr{D}_k)_{k\in\mathbb{Z}}$ a dyadic system in $X$ with parameters $c_0$, $C_0$, $\delta$, if it satisfies the following properties: (i) for each $k \in \mathbb{Z}$ we have $X = \bigcup_{j\in J_k} Q^k_j$; (ii) for $l \ge k$, if $Q \in \mathscr{D}_l$ and $Q' \in \mathscr{D}_k$, we have that either $Q \cap Q' = \emptyset$ or $Q \subseteq Q'$; (iii) for each $k \in \mathbb{Z}$ and $j \in J_k$ we have $B(z^k_j; c_0\delta^k) \subseteq Q^k_j \subseteq B(z^k_j; C_0\delta^k)$. The elements of a dyadic system are called cubes. We call $z^k_j$ the centre of $Q^k_j$. If $Q \in \mathscr{D}_k$, then we call the unique cube $Q' \in \mathscr{D}_{k-1}$ so that $Q \subseteq Q'$ the parent of $Q$. Furthermore, we say that $Q$ is a child of $Q'$. Note that it is possible that for a cube $Q$ there exists more than one $k \in \mathbb{Z}$ so that $Q \in \mathscr{D}_k$. Hence, when speaking of a child or the parent of $Q$, this should be with respect to a specific $k \in \mathbb{Z}$ where $Q \in \mathscr{D}_k$ to avoid ambiguity.
For a detailed discussion on the construction of dyadic systems and for the following theorem we refer the reader to [25] and references therein.
Writing D := ∪ K α=1 D α , one defines the respective notions for weight classes accordingly. Likewise, we say that a collection S ⊆ D is called η-sparse for 0 < η ≤ 1 if for each α ∈ 1, . . . , K there is a pairwise disjoint collection (E Q ) Q∈S ∩D α of measurable sets so that E Q ⊆ Q and |Q| ≤ η −1 |E Q |.
For our main results we require that the Calderón-Zygmund decompositions we take are adapted to the dyadic grids obtained from this theorem. The standard Calderón-Zygmund decomposition as found in [11] is not precise enough for these purposes, see also Remark 2.3.
For $1 \le p_0 < q_0 \le \infty$ we may define the class $S(p_0, q_0)$ as the class of those operators $T$ that satisfy the property that there is a constant $c > 0$ and an $0 < \eta \le 1$ so that for each pair of functions $f$, $g$ in an appropriately large class of functions on $X$ there is an $\eta$-sparse collection $\mathscr{S} \subseteq \mathscr{D}$ so that
$$|\langle Tf,g\rangle| \le c\sum_{Q\in\mathscr{S}}\langle f\rangle_{p_0,Q}\langle g\rangle_{q_0',Q}|Q|.$$
The remainder of this section will be dedicated to proving the following result: The main difficulty arises when one wants to take Calderón-Zygmund decompositions. We remark that in the cases (I) and (II) one can use the standard maximal cube arguments and localization arguments, respectively, to conclude that our dyadic maximal operators satisfy the usual weak and strong boundedness results. The lemmata in Sect. 3 all follow in the more general setting in the same way as they have been presented, where we replace the set of test functions $\mathcal{D}$ by another appropriate class of functions that is dense in $L^p(w)$ for all $1 \le p < \infty$, $w \in A_\infty$, such as the linear span of the indicator functions over the balls in $X$.
From now on we consider a fixed dyadic system $\mathscr{D}^*=\bigcup_{k\in\mathbb{Z}}\mathscr{D}_k$ in $X$ with parameters $c_0$, $C_0$, $\delta$.
We first assume that we are in the easier case (II). We define the maximal operator $M$ with respect to the cubes $Q\in\mathscr{D}^*$ by $Mf:=\sup_{Q\in\mathscr{D}^*}\langle f\rangle_{1,Q}\chi_Q$. Proof. Fix $k_0\in\mathbb{Z}$ small enough so that $c_0\delta^{k_0}>\operatorname{diam}X$. Then for any $x\in X$ we have $B(x;c_0\delta^{k_0})=X$. Hence, it follows from property (iii) of dyadic systems that $\mathscr{D}_{k_0}=\{X\}$.
Note that $\Omega\neq X$ implies that $\langle f\rangle_{1,X}\le\lambda$. Let $x\in\Omega$. Then the set $K_x$ of those $k\in\mathbb{Z}$ for which the cube $Q\in\mathscr{D}_k$ containing $x$ satisfies $\langle f\rangle_{1,Q}>\lambda$ is non-empty. Thus, by well-orderedness there is a minimal $k_x\in K_x$, and thus a cube $P_x\in\mathscr{D}_{k_x}$ that contains $x$ so that $\langle f\rangle_{1,P_x}>\lambda$. By minimality of $k_x$, it follows that $\langle f\rangle_{1,p(P_x)}\le\lambda$, where $p(P_x)\in\mathscr{D}_{k_x-1}$ denotes the parent of $P_x$. By (4.2) and property (iii) of dyadic systems this implies that It remains to show that the hereby obtained collection $\mathcal{P}=(P_x)_{x\in\Omega}$ is pairwise disjoint. Indeed, assume that $P_1,P_2\in\mathcal{P}$ so that $P_1\cap P_2\neq\emptyset$. We have either $P_1\subseteq P_2$ or $P_2\subseteq P_1$ by property (ii) of dyadic systems. Without loss of generality we assume the first. Pick $x\in\Omega$ so that $P_1=P_x$. Since $x\in P_2$ and $\langle f\rangle_{1,P_2}>\lambda$, minimality of $k_x$ implies that $P_2\in\mathscr{D}_l$ for some $l\ge k_x$. Again by property (ii) of dyadic systems, this implies that $P_2\subseteq P_1$, proving that $P_1=P_2$. The assertion follows.
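The step from the parent $p(P_x)$ back to the cube $P_x$ in such maximal cube arguments rests on an elementary comparison of averages, sketched here for orientation (the display is an illustration and not the authors' exact estimate; in particular the precise constant is not reproduced). Since $P_x\subseteq p(P_x)$,
\[
\langle f\rangle_{1,P_x}=\frac{1}{|P_x|}\int_{P_x}|f|\le\frac{|p(P_x)|}{|P_x|}\cdot\frac{1}{|p(P_x)|}\int_{p(P_x)}|f|=\frac{|p(P_x)|}{|P_x|}\,\langle f\rangle_{1,p(P_x)}\le\frac{|p(P_x)|}{|P_x|}\,\lambda.
\]
If property (iii) is read as a two-sided ball inclusion, then $P_x$ contains a ball of radius $c_0\delta^{k_x}$ and $p(P_x)$ is contained in a ball of radius $C_0\delta^{k_x-1}$, so in a doubling space the ratio $|p(P_x)|/|P_x|$ is bounded by a constant depending only on the doubling constant and on $c_0$, $C_0$, $\delta$.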
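Next, we consider the case (I). We define the maximal operator $M^B$ with respect to the balls $B\subseteq X$ by $M^Bf:=\sup_B\langle f\rangle_{1,B}\chi_B$, where the supremum is taken over all balls $B\subseteq X$. For the proof we use a version of the Whitney Decomposition Theorem. Note that the diameter assumption (4.3) together with property (iii) of dyadic systems implies that for any $Q\in\mathscr{D}_k$ we have Proof. We define
\[
\mathcal{E}:=\{Q\in\mathscr{D}^*: Q\subseteq\Omega,\ d(Q,\Omega^c)\ge\operatorname{diam}Q\}.
\]
Moreover we set
\[
\mathcal{P}:=\{Q\in\mathcal{E}: p(Q)\notin\mathcal{E}\},
\]
where $p(Q)\in\mathscr{D}_{k-1}$ denotes the parent of $Q\in\mathscr{D}_k$. We will show that $\bigcup_{P\in\mathcal{P}}P=\Omega$.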
Indeed, any $P\in\mathcal{P}$ is contained in $\Omega$. Conversely, if $x\in\Omega$, let $(Q^k_x)_{k\in\mathbb{Z}}$ be the sequence of cubes in $\mathscr{D}^*$ with $x\in Q^k_x$ and $Q^k_x\in\mathscr{D}_k$ for all $k\in\mathbb{Z}$. Since $\Omega$ is open, there is a ball $B=B(x;r)$ contained in $\Omega$. Picking $k_0$ large enough so that $2AC_0\delta^{k_0}<r$, we find that Hence, for all $k\ge\max(k_0,k_1)$ we have $Q^k_x\in\mathcal{E}$. Thus, the set $K_x:=\{k\in\mathbb{Z}: Q^k_x\in\mathcal{E}\}$ is non-empty. We also claim that $K_x$ is bounded from below. Indeed, if we choose for all $k\le k_2$ by (4.4), and hence $Q^k_x\notin\mathcal{E}$ for $k\le k_2$, proving the claim. We set $k_x:=\min K_x\in\mathbb{Z}$.
proving that $x\in\bigcup_{P\in\mathcal{P}}P$, as desired. Next we will show that $\mathcal{P}$ is pairwise disjoint. Suppose for a contradiction that we have $P_1,P_2\in\mathcal{P}$ so that $P_1\cap P_2\neq\emptyset$ and $P_1\neq P_2$. Let $l_1,l_2\in\mathbb{Z}$ so that $P_1\in\mathscr{D}_{l_1}$, $P_2\in\mathscr{D}_{l_2}$ and $p(P_1),p(P_2)\notin\mathcal{E}$. Without loss of generality we assume that $l_1>l_2$ and thus $P_1\subseteq P_2$ by property (ii) of dyadic systems. Then also $p(P_1)\subseteq P_2$. Since $p(P_1)\notin\mathcal{E}$, we must have that either $p(P_1)\not\subseteq\Omega$ or $d(p(P_1),\Omega^c)<\operatorname{diam}(p(P_1))$. The first case implies that $P_2\not\subseteq\Omega$, contradicting the fact that $P_2\in\mathcal{E}$. The second case implies that $\operatorname{diam}(P_2)\ge\operatorname{diam}(p(P_1))>d(p(P_1),\Omega^c)\ge d(P_2,\Omega^c)$, again contradicting $P_2\in\mathcal{E}$. We conclude that $\mathcal{P}$ is pairwise disjoint, as desired.
It remains to show that $d(P,\Omega^c)<\frac{4A^2C_0}{\gamma c_0\delta}\operatorname{diam}P$ for all $P\in\mathcal{P}$. Let $P\in\mathcal{P}$, $P\in\mathscr{D}_k$ so that $p(P)\notin\mathcal{E}$. Then either $p(P)\not\subseteq\Omega$ or $d(p(P),\Omega^c)<\operatorname{diam}(p(P))$. In the first case we have $d(p(P),\Omega^c)=0$, so in both cases we have by (4.4). Hence, as desired.
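The selection rule of this proof can be followed in a one-dimensional example (an illustration added here; it takes $X=\mathbb{R}$ with the standard dyadic intervals and $\Omega=(0,1)$, and it reads the conditions used above as follows: a cube $Q$ is admissible when $Q\subseteq\Omega$ and $d(Q,\Omega^c)\ge\operatorname{diam}Q$, and it is selected when it is admissible but its parent is not). For $k\ge 2$ set
\[
I_k:=[2^{-k},2^{-k+1}),\qquad J_k:=[1-2^{-k+1},1-2^{-k}).
\]
Then $\operatorname{diam}I_k=2^{-k}=d(I_k,\Omega^c)$, so $I_k$ is admissible, while its parent $[0,2^{-k+1})$ contains $0\notin\Omega$ and is therefore not admissible. By symmetry the same holds for $J_k$, whose parent $[1-2^{-k+1},1)$ satisfies $d\bigl([1-2^{-k+1},1),\Omega^c\bigr)=0$. The selected cubes $I_k$, $J_k$ are pairwise disjoint and
\[
\bigcup_{k\ge 2}I_k=\bigl(0,\tfrac12\bigr),\qquad\bigcup_{k\ge 2}J_k=\bigl[\tfrac12,1\bigr),
\]
so together they cover $\Omega$, and each selected cube $P$ satisfies $d(P,\Omega^c)=\operatorname{diam}P$.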
Proof of Lemma 4.6. We apply the Whitney Decomposition Theorem to write $\Omega=\bigcup_{P\in\mathcal{P}}P$.
Proof of Theorem 4.4. In both cases (I) and (II), the proof of Theorem 1.3 holds mutatis mutandis. Moreover, in the case (I), the same is true for Theorem 1.4, where one uses Lemma 4.6, and for Theorem 1.5, where one uses Theorem 4.7. For Theorem 1.4 in the case (II), one replaces the set $\Omega$ in the proof by the set $\Omega:=\{M(|f|^{p_0})>2[w]_{A_1}w(E)^{-1}\}$. We claim that $\Omega\neq X$. Indeed, since $X$ is bounded, we have $w(X)<\infty$. Thus, by (3.1), we have proving the claim. Thus we may apply Lemma 4.5 to decompose $\Omega$, and the remainder of the proof runs analogously.

Optimality of Weighted Strong Type Estimates
In this section we show that the weighted strong type estimates in (1.3) and (1.8) are optimal, given a certain asymptotic behaviour of the unweighted $L^p$ operator norm of $T$. Such asymptotic behaviour is directly linked to lower bounds on the (generalized) kernel of the operator, see Example 5.5. We improve upon the result in [6], where it was shown that the estimate (1.3) is optimal for sparse forms. Indeed, here we use properties of the operator $T$ itself rather than only its sparse bounds.
Our method is an adaptation of the results of Fefferman and Pipher [16] and Luque et al. [35]. We deduce sharpness of weighted bounds from the asymptotic behaviour of the unweighted $L^p$ norm of $T$ as $p$ tends to $p_0$ and $q_0$, respectively. The proof exploits the known sharp behaviour of the Hardy-Littlewood maximal function via the iteration algorithm of Rubio de Francia.
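For orientation, the Rubio de Francia iteration algorithm is commonly set up as follows (a standard sketch in its basic form; the symbols $R$ and $h$ are introduced here for illustration, and the exponents and maximal operators actually used in the proof may differ). For $1<p<\infty$, a maximal operator $M$ bounded on $L^p$, and a nonnegative $h\in L^p$, put
\[
Rh:=\sum_{k=0}^{\infty}\frac{M^kh}{\bigl(2\|M\|_{L^p\to L^p}\bigr)^k},\qquad M^0h:=h.
\]
Then
\[
h\le Rh,\qquad\|Rh\|_{L^p}\le\sum_{k=0}^{\infty}2^{-k}\|h\|_{L^p}=2\|h\|_{L^p},\qquad M(Rh)\le 2\|M\|_{L^p\to L^p}\,Rh,
\]
where the last inequality uses the sublinearity of $M$. In particular $Rh$ is an $A_1$ weight with $[Rh]_{A_1}\le 2\|M\|_{L^p\to L^p}$, which is how the size of the unweighted norm of the maximal operator enters estimates for weight characteristics.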
We will work in a doubling metric measure space $(X,d,\mu)$ satisfying the assumptions from Sect. 4. As a matter of fact, the only property we need is a precise control of the $L^p$ norm of the maximal operator. More precisely, we let $\mathscr{D}:=\bigcup_{\alpha=1}^K\mathscr{D}^\alpha$ be the union of the dyadic grids in $X$ obtained from Theorem 4.3. Then we define where we set $M:=M_1$. Using the shorthand notation $\|M_q\|_p=\|M_q\|_{L^p\to L^p}$ for $p>q$, we will use which follows as in (3.2) with $w=1$. Let us first define the critical exponents that determine the asymptotic behaviour of the unweighted $L^p$ operator norm of $T$. Then it follows from Proposition 2.1(ii) that for a weight $w$ we have $w\in A_{s/p_0}\cap RH_{(q_0/s)'}$ if and only if $w^{(q_0/s)'}\in A_{\phi(s)}$. We establish the following connection between the weighted strong type estimates for $T$ and the asymptotic behaviour of the unweighted $L^p$ operator norm at the endpoints $p=p_0$ and $p=q_0$.
Theorem 5.2 Let $T$ be a bounded operator on $L^p$ for all $p_0<p<q_0$. Suppose that for some $p_0<s<q_0$ and for all $w\in A_{s/p_0}\cap RH_{(q_0/s)'}$,
\[
\|T\|_{L^s(w)\to L^s(w)}\le c\,\bigl[w^{(q_0/s)'}\bigr]_{A_{\phi(s)}}^{\beta/(q_0/s)'}
\]
We also establish a version involving the $A_1$ characteristics. Its proof follows the same lines as the one for Theorem 5.2 and will therefore be omitted.

Theorem 5.3
Let $T$ be a bounded operator on $L^p$ for all $p_0<p<q_0$. Suppose that for some $p_0<s<q_0$ and for all $w\in A_1\cap RH_{(q_0/s)'}$,
\[
\|T\|_{L^s(w)\to L^s(w)}\le c\,\bigl[w^{(q_0/s)'}\bigr]_{A_1}^{\beta/(q_0/s)'}\tag{5.3}
\]