Comptes Rendus Mathématique

Abstract. In this note we give several counterexamples. The first shows that small energy majorization fails on the bi-tree. The second shows that the energy estimate $\int_T \mathbb{V}^\nu_\varepsilon \, d\nu \le C\varepsilon|\nu|$, which is always valid on a usual tree for a trivial reason (and with constant $C = 1$), cannot be valid in general on the bi-tree with any $C$ whatsoever. On the other hand, the weaker estimate $\int_{T^2} \mathbb{V}^\nu_\varepsilon \, d\nu \le C_\tau \varepsilon^{1-\tau} E[\nu]^\tau |\nu|^{1-\tau}$ is valid on the bi-tree for any $\tau > 0$. It is proved in [14] and is called the improved surrogate maximum principle for potentials on the bi-tree. The estimate $\int_{T^3} \mathbb{V}^\nu_\varepsilon \, d\nu \le C_\tau \varepsilon^{1-\tau} E[\nu]^\tau |\nu|^{1-\tau}$ with $\tau = 2/3$ holds on the tri-tree. We do not know any such estimate with any $\tau < 1$ on the four-tree. The third counterexample disproves the estimate $\int_{T^2} \mathbb{V}^\nu_x \, d\nu \le F(x)$, for any $F$ whatsoever, for some probability measure $\nu$ on the bi-tree $T^2$. On a simple tree, $F(x) = x$ would suffice to make this inequality hold. Potential theories without any maximum principle are harder than the classical ones.


Introduction. Potential theory on multi-trees
Embedding theorems on graphs are interesting in particular because they are related to the structure of spaces of holomorphic functions. For the Dirichlet space on the disc $\mathbb{D} := \{z : |z| < 1\}$ this fact has been explored in [5-7], and for the Dirichlet space on the bi-disc $\mathbb{D}^2$ in [2-4, 12, 13]. The bi-disc case is much harder, as the corresponding graph has cycles. One particularly interesting case is studied in [17], where a small piece of the bi-tree is considered.
The difference between one-parameter theory (the graph is a tree) and two-parameter theory (the graph is a bi-tree) is huge. One explanation is that in a multi-parameter theory all the notions of singular integrals, paraproducts, BMO, Hardy classes, etc. become much more subtle than in the one-parameter setting. There are many examples of this effect; it was demonstrated in the results of S. Y. A. Chang, R. Fefferman and L. Carleson, see [8-10, 18].
The crucial difficulties of multi-parameter theory, as compared with one-parameter theory, can also be seen in the study of paraproducts, whose unweighted multi-parameter theory was constructed in [15, 16]. The studies of embedding theorems mentioned above can in fact also be viewed as studies of multi-parameter weighted paraproducts. The point is that the terms "Carleson embedding theorem" and "weighted paraproduct" are very often interchangeable and even synonymous.
The papers dealing with the poly-disc and multi-trees mentioned above all have a common feature: they are based on potential theory on multi-trees. Let us recall for the reader the main notation and facts of such a theory. We do this for the bi-tree just for the sake of simplicity.
Let $T$ denote the dyadic rooted tree with root $o$; we can associate its vertices with the dyadic subintervals of $I_0 := [0, 1]$, and $o$ with $I_0$ itself. Similarly, let $T^2$ denote the dyadic rooted bi-tree with root $o$; we can associate its vertices with the dyadic sub-rectangles of $Q_0 := [0, 1] \times [0, 1]$, and $o$ with $Q_0$ itself. Both objects carry a partial order, which is the same as inclusion for intervals and rectangles respectively.
Both objects have a natural integration operator: if $f$ is a non-negative function on $T$ or $T^2$, and $\alpha$ is a vertex of $T$ or $T^2$, then
$$\mathbb{I}f(\alpha) := \sum_{\alpha' \ge \alpha} f(\alpha').$$
We can call $\mathbb{I}$ the Hardy operator on the corresponding graph: it sums up the values of $f$ from $\alpha$ to $o$ along all directed paths from $o$ to $\alpha$. For $T$ such a path is unique for any $\alpha$; for $T^2$ there are many such paths. The formally adjoint operator is $\mathbb{I}^*$, and
$$\mathbb{I}^*f(\alpha) := \sum_{\alpha' \le \alpha} f(\alpha').$$
Let us make the convention that our $T$ and/or $T^2$ are always finite graphs, maybe very deep, but finite, and that the leaves are dyadic intervals of size $2^{-N}$ in the case of $T$, or dyadic squares of size $2^{-N} \times 2^{-N}$ in the case of $T^2$. Then $\mathbb{I}^*$ is always defined. The set of leaves is a "boundary" of the graph and is denoted $\partial T$ or $\partial T^2$ respectively.
Now we want to introduce the potential of a measure. Again for simplicity (this is not at all important) let us call a measure any function $\mu$ on $T^2$ that is identically zero on $T^2 \setminus \partial T^2$ and is an arbitrary non-negative function on $\partial T^2$. We define a measure on $T$ in the same way. Of course, what we are really doing is defining granular measures on $Q_0$ and $I_0$ respectively. Here granular means that our measure has constant density on dyadic squares of size $2^{-N} \times 2^{-N}$, or on dyadic intervals of size $2^{-N}$ respectively. We want all estimates in our theory to be independent of $N$. Then, passing to the limit $N \to \infty$, we can eventually consider all measures on $Q_0$ or $I_0$.
Given such a measure $\mu$ we define its potential at a vertex $\alpha$ of $T$ or $T^2$ as
$$\mathbb{V}^\mu(\alpha) := \mathbb{I}(\mathbb{I}^*\mu)(\alpha).$$
Notice that, as $\alpha$ is actually a dyadic rectangle $R = I \times J$ inside $Q_0$ (or a dyadic interval $I$ inside $I_0$), $\mathbb{I}^*(\mu)(\alpha)$ is just $\mu(R)$ ($\mu(I)$ respectively).
But $\mathbb{V}^\mu(\alpha)$ is a more complicated object: it is the sum of $\mu(R')$ over all $R'$ containing $R$, where $R$ is associated with the vertex $\alpha \in T^2$ (respectively the sum of $\mu(I')$ over all $I'$ containing $I$, where $I$ is associated with the vertex $\alpha \in T$).
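The definitions above can be made concrete on a small finite bi-tree. The following is a minimal sketch, not taken from the paper: the encoding of a vertex as a tuple `(a, b, p, q)` and the helper names `ancestors`, `i_star` and `potential` are assumptions made only for illustration. It computes $\mathbb{I}^*\mu$ and the potential by brute force, directly from the sums over rectangles.

```python
def ancestors(v):
    """All vertices alpha' >= v, i.e. dyadic rectangles containing v (v included).

    Hypothetical encoding: (a, b, p, q) stands for the rectangle
    [p 2^-a, (p+1) 2^-a] x [q 2^-b, (q+1) 2^-b].
    """
    a, b, p, q = v
    return [(a2, b2, p >> (a - a2), q >> (b - b2))
            for a2 in range(a + 1) for b2 in range(b + 1)]


def i_star(mu):
    """I* mu: the mu-mass of every dyadic rectangle that carries some mass."""
    mass = {}
    for leaf, m in mu.items():
        for R in ancestors(leaf):
            mass[R] = mass.get(R, 0) + m
    return mass


def potential(mu, v):
    """V^mu(v) = I(I* mu)(v): sum of mu(R') over all rectangles R' containing v."""
    mass = i_star(mu)
    return sum(mass.get(R, 0) for R in ancestors(v))


N = 3
mu = {(N, N, 0, 0): 1}             # unit mass at the corner leaf of T^2
root, corner = (0, 0, 0, 0), (N, N, 0, 0)
v_root = potential(mu, root)       # only Q_0 itself contains the root
v_corner = potential(mu, corner)   # all (N+1)^2 rectangles above the corner leaf
```

In this encoding the one-parameter tree $T$ is the slice $b = 0$, $q = 0$, so the same functions can be used for both graphs.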
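Let us stay on $T$ for a while and let $\mathbb{V}^\mu \le 1$ on $\operatorname{supp}\mu$ (these are the vertices of $\partial T$ where $\mu > 0$). Then we can easily see that $\mathbb{V}^\mu \le 1$ everywhere. Indeed, without loss of generality $\mu \ne 0$; let $\beta \in \partial T$ with $\mu(\beta) = 0$.
We can find the unique smallest predecessor $\gamma > \beta$ such that there is $\alpha \in \partial T$ with $\mu(\alpha) > 0$ having the same predecessor $\gamma$. The key point here is that the smallest such $\gamma > \beta$ is unique because we are on a simple tree $T$. Every vertex $\gamma'$ with $\beta \le \gamma' < \gamma$ has $\mu(\gamma') = 0$, so $\mathbb{V}^\mu(\beta) = \mathbb{V}^\mu(\gamma)$. Now $\mathbb{V}^\mu(\gamma) \le \mathbb{V}^\mu(\alpha) \le 1$, as $\alpha \in \operatorname{supp}\mu$ and the potential $\mathbb{V}$ of any positive measure on $T$ (and on $T^2$) is always a decreasing function.
So we have proved that $\mathbb{V}^\mu \le 1$ on $\operatorname{supp}\mu$ implies $\mathbb{V}^\mu \le 1$ everywhere on $\partial T$. Then, by monotonicity of potentials, it is $\le 1$ everywhere on $T$.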
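The maximum principle on $T$ can be checked numerically on small trees. The sketch below is only an illustration under stated assumptions: leaves of $T$ are encoded as $N$-bit integers, the function names are hypothetical, and the random integer measures are arbitrary test data. It uses the fact that two leaves lie in exactly (number of shared leading binary digits) $+\,1$ common dyadic intervals.

```python
import random


def common_prefix(i, k, N):
    """Number of leading binary digits shared by two N-bit leaf indices."""
    return N - (i ^ k).bit_length()


def tree_potential(mu, N, i):
    """V^mu at leaf i of T: the mass mu[k] is counted once for every dyadic
    interval containing both leaves i and k."""
    return sum(m * (common_prefix(i, k, N) + 1) for k, m in mu.items())


def check_maximum_principle(N, n_atoms, seed):
    """Return (sup of V^mu over supp mu, sup of V^mu over all leaves)."""
    rng = random.Random(seed)
    mu = {rng.randrange(2 ** N): rng.randint(1, 5) for _ in range(n_atoms)}
    sup_on_support = max(tree_potential(mu, N, i) for i in mu)
    sup_everywhere = max(tree_potential(mu, N, i) for i in range(2 ** N))
    return sup_on_support, sup_everywhere
```

Since the potential is largest at the leaves, comparing the two suprema over leaves is enough; on $T$ they always coincide, which is the statement just proved.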
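This claim is blatantly false on $T^2$. The problem is that there can be a huge family $\Gamma$ of $\gamma > \beta$ such that $\mu(\gamma) > 0$ and, for any pair $\gamma_1, \gamma_2 \in \Gamma$, neither is smaller than the other. The reasoning above fails, and moreover there are plenty of simple examples of $\mu$ on $\partial T^2$ such that
$$\mathbb{V}^\mu \le 1 \ \text{on } \operatorname{supp}\mu, \quad \text{but} \quad \max_{T^2} \mathbb{V}^\mu \ge C,$$
where $C$ is as large as one wishes (if $N$ is chosen large enough).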
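A small computation shows the failure on $T^2$ explicitly. The configuration below is an assumption chosen for illustration, not the authors' example: unit masses sit at leaves $\alpha_k$, $k = 0, \dots, N$, where $\alpha_k$ shares exactly $k$ leading binary digits with the corner leaf $\beta = (0, 0)$ in the first coordinate and $N - k$ in the second. Leaves are encoded as pairs of $N$-bit integers, and two leaves lie in $(c_x + 1)(c_y + 1)$ common dyadic rectangles, where $c_x$, $c_y$ are the numbers of shared leading digits.

```python
def common_prefix(i, k, N):
    """Number of leading binary digits shared by two N-bit indices."""
    return N - (i ^ k).bit_length()


def bitree_potential(mu, N, leaf):
    """V^mu at a leaf of T^2, summed over all dyadic rectangles containing it."""
    x, y = leaf
    return sum(m * (common_prefix(x, u, N) + 1) * (common_prefix(y, v, N) + 1)
               for (u, v), m in mu.items())


def staircase_measure(N):
    """Hypothetical test measure: unit masses at the leaves alpha_0, ..., alpha_N."""
    mu = {}
    for k in range(N + 1):
        x = 1 << (N - k - 1) if k < N else 0   # shares k leading digits with 0
        y = 1 << (k - 1) if k > 0 else 0       # shares N - k leading digits with 0
        mu[(x, y)] = 1
    return mu


N = 20
mu = staircase_measure(N)
sup_on_support = max(bitree_potential(mu, N, leaf) for leaf in mu)
at_corner = bitree_potential(mu, N, (0, 0))    # the corner carries no mass
```

For $N = 20$ this gives a supremum of 1651 on the support and the value 1771 at the corner, so after dividing $\mu$ by 1651 the potential is at most 1 on $\operatorname{supp}\mu$ but exceeds 1 at $\beta$. The gain is modest for this simple configuration; the arbitrarily large constants mentioned in the text require more elaborate measures.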
This phenomenon is called the lack of a maximum principle, and it reveals itself prominently in the following effect.
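Let $\mathcal{T}$ denote either $T$ or $T^2$. Let us fix $\delta > 0$ (not necessarily small, but it can be small) and consider
$$\mathbb{V}^\mu_\delta := \mathbb{I}\big(\mathbf{1}_{\{\mathbb{V}^\mu \le \delta\}}\, \mathbb{I}^*\mu\big).$$
The expression
$$\int_{\mathcal{T}} \mathbb{V}^\mu_\delta \, d\mu = \int_{\mathcal{T}} (\mathbb{I}^*\mu)^2 \, \mathbf{1}_{\{\mathbb{V}^\mu \le \delta\}}$$
(the integration in the second equality is with respect to the counting measure on $\mathcal{T}$) is called the partial energy of $\mu$.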
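Given a compact set $E \subset \mathcal{T}$ (and all sets are compact on finite graphs) we define its capacity by
$$\operatorname{cap}(E) := \inf\Big\{ \sum_{\alpha \in \mathcal{T}} f(\alpha)^2 : f \ge 0,\ \mathbb{I}f \ge 1 \text{ on } E \Big\}.$$
This infimum is actually realized by $f_E = \mathbb{I}^*\mu_E$ with a unique measure $\mu_E$, the capacitary measure of $E$.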
If $\mathcal{T} = T$, then
$$\mathbb{V}^\nu_\delta \le \delta \qquad (1)$$
uniformly. The reasoning is exactly the same as above for the maximum principle. The consequence is the following partial energy estimate:
$$\int_T \mathbb{V}^\nu_\delta \, d\nu \le \delta|\nu|. \qquad (2)$$
But (1) can easily be false if $\mathcal{T} = T^2$. We will show below by example that even (2) can be false. All estimates in the papers [2-4] are based on a weaker version of (2), see [14], which we call the surrogate maximum principle (Theorem 1).

Majorization with small energy in bi-parameter case
The key estimate on the way to proving the surrogate maximum principle (1) is the following "majorization theorem with small energy", which holds true on the dyadic tree $T$:

The proof in [4, 14] uses a sort of "redistribution of masses" or "mass transport" approach. For a while we tried to prove a similar statement for $T^2$ in order to obtain a "proper" proof of the surrogate maximum principle (SMP) for the tri-tree. Namely, we conjectured

For some very special cases, e.g. for $f = g$, this has been proved, and it turned out to be a key result in proving a surrogate maximum principle on the tri-tree and in describing the embedding measures $\rho$ for the Dirichlet space on the tri-disc into $L^2(\mathbb{D}^3, d\rho)$. See [2, 12, 13]. But to extend our key results to the four-tree we would want this conjecture to hold true as stated.
It turns out that this is not true in general, and the counterexample is provided in Section 3. Actually we show more: that even a weaker estimate

Maximum principle must be surrogate on $T^2$
From Theorem 1 it is easy to deduce a more transparent estimate:
$$\int_{\mathcal{T}} \mathbb{V}^\nu_\varepsilon \, d\nu \le C_\tau\, \varepsilon^{1-\tau}\, E[\nu]^\tau\, |\nu|^{1-\tau}. \qquad (3)$$
For $\mathcal{T} = T^3$ we can prove this with $\tau = 2/3$; for $\mathcal{T} = T^2$ we could originally prove it for $\tau = 1/2$, and lately for any $\tau > 0$. For $\mathcal{T} = T^4$ we cannot prove (3) at all, even for a very small $1 - \tau$.
It turns out that estimate (2) generally fails on $T^2$, even if we multiply its right-hand side by any finite constant $C$; in other words, it is not possible to get rid of $\tau$ in (3) altogether. In Section 5 we construct a sequence of pairs $\nu$, $\varepsilon$ which prohibits putting $\tau = 0$. This argument is itself a special case of a counterexample that answers a question posed by Fedor Nazarov.

Counterexample to small energy majorization on bi-tree
Below $f$, $g$ have a special form, namely $f = \mathbb{I}^*\mu$, $g = \mathbb{I}^*\nu$ with certain positive measures on $T^2$, where the measure $\mu$ is trivial: it is just a unit mass at the root $o$ of $T^2$. In particular, $\mathbb{I}f \equiv 1$ on $T^2$. The choice of $\nu$ is more sophisticated. First we choose a large number $M$. Consider now another number n = 2 2 s M for some natural $s$; its value is specified a few lines below. In the unit square $Q_0$ consider the dyadic sub-squares $Q_1, \dots, Q_{2^M}$, which are the South-West to North-East diagonal squares of sidelength $2^{-M}$.
In each $Q_j$ choose $\omega_j$, the South-West corner dyadic square of sidelength $2^{-n-M}$. Now let $\nu$ be a sum of identical masses at the $\omega_j$, and let $n$ and these masses satisfy the following relation ν(ω) := 1 We have immediately there are $\frac{\log n}{\log 2}$ of them, and they are called $q_{10}, q_{11}, \dots, q_{1k}$, $k \asymp \log n$. We do the same for each pair $\omega_j$, $Q_j$ and get $q_{j0}, q_{j1}, \dots, q_{jk}$.
It is proved in [11]. Let

So we choose $\lambda = \frac{c}{n}$ with an appropriate $c$. Then $F \subset \{2\lambda \le \mathbb{I}g \le 4\lambda\}$. Since $\mathbb{I}f \ge 1$, if $\varphi$ as in Conjecture 3 existed, we would have $\mathbb{I}\varphi \ge 1$ on $F$ and (by the second claim of Conjecture 3)

By the definition of capacity this would mean that $\operatorname{cap}(F) \le \frac{C}{\log n}$.
In the next Subsection 3.1 we show that $\operatorname{cap}(F) \asymp 1$. Hence the conjecture is false.

Capacity of F is equivalent to 1
Let $\rho$ on $F$ be a capacitary measure of $F$, and let $\mu$ be a measure charging $\frac{1}{n}$ on each $q_{jk}$ with $\frac{\log n}{4} \le k \le$ µ(q j k) 1. We claim that

Assuming for a moment that this estimate holds, we write for $\varepsilon > 0$

Since $\rho$ is capacitary for $F \supset \operatorname{supp}\mu$ and $T^2$ is finite (i.e. every singleton has positive capacity), we have $\mathbb{V}^\rho \ge 1$ on $\operatorname{supp}\mu$, and $\int_{T^2} \mathbb{V}^\rho \, d\mu \ge |\mu|$. By (6) there is some absolute $\varepsilon$ such that $\varepsilon \int_{T^2} \mathbb{V}^\mu \, d\mu \le |\mu|$, so that the second term in (7) must be negative. But then the first term is positive, which means

It remains to prove (6). By symmetry it is enough to estimate the potential at $q_{1k}$. For that we split $\mathbb{V}^\mu$ into $V_1$, the contribution of rectangles containing $Q_1$; $V_2$, the contribution of rectangles containing $q_{1k}$ and contained in $Q_1$; and $V_3$, the contribution of rectangles containing $q_{1k}$ that strictly intersect $Q_1$ and that are "vertical", meaning that their vertical side contains the vertical side of $Q_1$ (there is also $V_4$, totally symmetric to $V_3$). Two of these are easy: $V_1$ "almost" consists of "diagonal squares containing $Q_1$". Not quite, but the other rectangles are also easy to take care of. Denote $r = |\mu|$, M = log n log n .
Then we write the diagonal part first and then the rest:

To estimate $V_2$, notice that there are at most $cn$ rectangles containing $q_{1k}$ and contained in $Q_1$ that do not contain any other $q$; there are $\frac{cn}{2}$ rectangles that contain $q_{1k}$ and one of its siblings (and lie in $Q_1$); there are $\frac{cn}{4}$ rectangles that contain $q_{1k}$ and two of its siblings (and lie in $Q_1$); et cetera. Hence,

Now consider $V_3$. The horizontal size of $q_{1k}$ is $2^{-M} \cdot 2^{-n2^{-k}}$. Its vertical size is $2^{-M} \cdot 2^{-2^k}$. So the number of rectangles of the third type that do not contain the siblings is at most (we are using that $k \ge \frac{1}{4}\log n$) $n2^{-k}(2^k + M) \le n + n^{3/4}\log n$. For those that contain $q_{1k}$ and one sibling, the number is at most

We continue, and get that

We deal with $V_4$ in exactly the same way, only now we use that $k \le \frac{3}{4}\log n$. Finally after adding all $V_i$ we get 1 4 .
Since the inverse estimate is already given by $V_1$, we obtain (6).

The shape of the graph of the function $x \mapsto \operatorname{cap}(\mathbb{V}^\nu \ge x)$
Let $E$ be a subset of $T$ or $T^2$ and let $\nu$ be a capacitary measure for $E$. First consider the case of $T$. Let $x \in [|\nu|, 1]$; we study the set
$$D_x := \{\alpha \in T : \mathbb{V}^\nu(\alpha) \ge x\}.$$
We want to understand a bit the shape of the graph of
$$C(x) := \operatorname{cap}(D_x).$$
We start with $x = |\nu| = \operatorname{cap}(E)$. Notice that $o$, the root of $T$, obviously satisfies $\mathbb{V}^\nu(o) = |\nu|$, so $o \in D_{|\nu|}$. But $\operatorname{cap}(o) = \operatorname{cap}(T) = 1$. Thus $C(|\nu|) = 1$.
Now consider $x = 1$. On $E$ we have $\mathbb{V}^\nu = 1$, and the maximum principle (we are on $T$, so it holds) says that $E = \{\alpha : \mathbb{V}^\nu \ge 1\}$. Therefore $C(1) = \operatorname{cap}(E) = |\nu|$.
Now let $|\nu| < x < 1$. We know (again this is the maximum principle) that

Notice that if $\mathbb{I}g(\alpha) \le x$ and $\mathbb{I}g(\operatorname{son}\alpha) > x$, then $\mathbb{I}g(\alpha) \ge x/2$, just because $g = \mathbb{I}^*\nu$ is monotonically increasing on $T$: indeed, $\mathbb{I}g(\operatorname{son}\alpha) = \mathbb{I}g(\alpha) + g(\operatorname{son}\alpha) \le 2\,\mathbb{I}g(\alpha)$. But this means that

The definition of capacity and relationships (8), (9) show the following:
Theorem 5. On a simple tree $T$ the capacity of the level set $D_x = \{\alpha \in T : \mathbb{V}^\nu(\alpha) \ge x\}$ for any capacitary measure $\nu$ of a set $E$ satisfies the following inequality

This is absolutely not the case for $T^2$. The capacities of level sets of capacitary potentials on $T^2$ behave in a much stranger and wilder way. We saw this in Section 3.1. In fact, our measure $\nu$ in the previous section is (after multiplying by a constant) a capacitary measure, $|\nu| = \frac{1}{n\log n}$.
We put

But we saw above that if the absolute constant $c$ is chosen correctly, then

This means that Theorem 5 is false for $T^2$, because if it were true, then we would have $\operatorname{cap}\{(\alpha, \beta) \in T^2 : \mathbb{V}^\nu(\alpha, \beta) \ge \frac{c}{n}\}$

The reason for the effect (10)
On $T^2$ we do not have (2), which is (8) above. Instead we have (3), which makes the estimate of capacity blow up much faster than in Theorem 5. In fact, (3) claims

and we saw that $\tau$ is indispensable. Of course the capacity of any subset of $T^2$ is bounded by 1, so we have

This explains the flat piece of the graph, $C(x) \asymp 1$, when $x$ is between $\frac{1}{n\log n}$ and $\frac{1}{n}$.

Lack of the estimate $\int_{T^2} \mathbb{V}^\nu_\varepsilon \, d\nu \le C\varepsilon|\nu|$ and more
Here is the question asked by Fedor Nazarov. He also suggested to us a possible construction of a counterexample.
Question. Consider normalized measures on the unit square, $|\mu| = 1$. Let $x \gg 1$. Is it always possible to have the estimate
$$\int_{T^2} \mathbb{V}^\mu_x \, d\mu \le F(x)\,? \qquad (11)$$
The meaning of this question is that we always (see Theorem 1 and (3)) have some trace of the total energy on the right-hand side of our estimates of the partial energy. What if the total energy is huge or "infinite"? Maybe one does not need this total energy contribution on the right-hand side, or even the partial energy is always bounded by a function of its "cut-off" parameter $x$ for all normalized measures?
We will show that no estimate as above exists (but on $T$ it does exist with the simplest $F(x) = x$).
Observe now that the lack of the "universal" estimate (2) for $T^2$ follows immediately. Indeed, notice that the change of variables $\delta \to t\delta$, $\nu \to t\nu$ multiplies both the left-hand side and the right-hand side of (2) by the same factor $t^2$. Thus we can normalize the measure and always assume that $|\nu| = 1$. The inequality above becomes $\int_{T^2} \mathbb{V}^\nu_\delta \, d\nu \le C\delta$ for probability measures $\nu$, which must be false since (11) is false regardless of the function $F$. Notice that on $T$ the function $F(x) = x$ makes the above inequality valid.
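The scaling step can be checked on a small bi-tree. The sketch below rests on two assumptions that are not taken verbatim from the text: the partial energy is computed as the sum of $\nu(R)^2$ over rectangles $R$ with $\mathbb{V}^\nu(R) \le \delta$ (the form of the sums used later in this section), and vertices are encoded as hypothetical tuples `(a, b, p, q)`. The measure `nu` is arbitrary test data.

```python
def ancestors(v):
    """All dyadic rectangles containing the vertex v = (a, b, p, q), v included."""
    a, b, p, q = v
    return [(a2, b2, p >> (a - a2), q >> (b - b2))
            for a2 in range(a + 1) for b2 in range(b + 1)]


def partial_energy(nu, delta):
    """Sum of nu(R)^2 over rectangles R with V^nu(R) <= delta."""
    mass = {}
    for leaf, m in nu.items():
        for R in ancestors(leaf):
            mass[R] = mass.get(R, 0) + m
    # Rectangles of zero mass contribute nothing, so only charged ones are visited.
    return sum(m * m for R, m in mass.items()
               if sum(mass.get(S, 0) for S in ancestors(R)) <= delta)


N, t, delta = 3, 3, 6
nu = {(N, N, 0, 0): 1, (N, N, 5, 2): 2, (N, N, 7, 7): 1}
scaled = {leaf: t * m for leaf, m in nu.items()}
lhs = partial_energy(scaled, t * delta)   # energy after delta -> t delta, nu -> t nu
rhs = t * t * partial_energy(nu, delta)   # t^2 times the original energy
```

Here `lhs` and `rhs` agree because the cut-off set $\{\mathbb{V}^{t\nu} \le t\delta\}$ coincides with $\{\mathbb{V}^\nu \le \delta\}$, while each squared mass picks up the factor $t^2$.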
We repeat the construction from Section 3 with different values of $M$, $n$. Namely, we now fix any dyadic $x \gg 1$ and put $n2^{-M} = x$ and $\mu(\omega_j) := 2^{-M}$.
We claim that
$$\mathbb{V}^\mu(q_{ji}) \le Cx. \qquad (12)$$
We have already seen that, given $j$, $i$, there are approximately $n$ dyadic rectangles containing $q_{ji}$ and contained in $Q_j$. Each gives a contribution $2^{-M}$ to $\mathbb{V}^\mu(q_{ji})$. So if we counted only them in $\mathbb{V}^\mu(q_{ji})$, we would get a total of $\approx n2^{-M}$, and (12) would follow. Let us call this contribution the main contribution and try to justify its title.
Clearly there are many more dyadic rectangles containing $q_{ji}$ and contained in $Q_0$. Let us keep track of their contributions to $\mathbb{V}^\mu(q_{ji})$. We hope that those are not too big, in order to prove (12). Notice that if (12) is proved, we have many rectangles $R$ with $\mathbb{V}(R) \le Cx$; so many that we can hope to prove that

So we fix, say, $q_{0i} = [0, 2^{-n/2^i} 2^{-M}] \times [0, 2^{-2^i} 2^{-M}]$, and we can see that apart from the $\approx n$ rectangles between $q_{0i}$ and $Q_0$, there are also

The contribution of tall rectangles to $\mathbb{V}(q_{0i})$ is bounded by $M 2^{-M} \log n \ll x$, and the same holds for the contribution of the long rectangles; hence the contribution from the rectangles listed in (a) and (b) above can be absorbed into the main contribution.
The contribution of $M$-large rectangles is 1. There is only one such rectangle, namely our initial unit square $Q_0$. The contribution of $(M-1)$-large rectangles is $\frac{1}{2} \cdot (1 + 1 + 1)$. In fact we would have 3 rectangles in the family of $(M-1)$-large rectangles: the square $Q_0^{M-1}$ itself and its two predecessors, one long, one tall. The contribution of $(M-2)$-large rectangles is $\frac{1}{4} \cdot (1 + 2 + 2)$, et cetera (see the similar computation in Section 3.1 above). Thus the total contribution of all $m$-large rectangles containing $q_{0i}$ is at most $\sum_{m=1}^{M} \frac{1}{2^m}(2m + 1) \le C_1$. This is definitely smaller than the main contribution and can just be absorbed into the main contribution $\approx n2^{-M} = x \gg 1$.
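The bound by an absolute constant $C_1$ can be verified exactly. The short sketch below (the function name is hypothetical) evaluates the partial sums $\sum_{m=1}^{M} (2m+1)/2^m$ with exact rational arithmetic; they equal $5 - (2M+5)/2^M$, so $C_1 = 5$ works for every $M$.

```python
from fractions import Fraction


def large_rectangle_sum(M):
    """Exact value of sum_{m=1}^{M} (2m + 1) / 2^m."""
    return sum(Fraction(2 * m + 1, 2 ** m) for m in range(1, M + 1))


partial_sums = [large_rectangle_sum(M) for M in range(1, 31)]
```

The closed form follows from $\sum_{m \ge 1} m 2^{-m} = 2$ and $\sum_{m \ge 1} 2^{-m} = 1$, which give the limit $2 \cdot 2 + 1 = 5$.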
We have finally proved (12). Now let us estimate $\sum_{R :\, \mathbb{V}^\mu(R) \le Cx} \mu(R)^2$ from below. From [11, 14] we know that for each $q_{ji}$ there is a family of dyadic rectangles $\mathcal{F}_{ji}$ such that
1) every $R \in \mathcal{F}_{ji}$ contains $q_{ji}$ and is contained in $Q_j$, $j = 1, \dots, 2^M$,
2) the cardinality of $\mathcal{F}_{ji}$ is at least $cn$, $c > 0$,
3) the families $\mathcal{F}_{ji}$ are disjoint, $j = 1, \dots, 2^M$, $i \le C\log n$.
Each rectangle $R$ of $\bigcup_j \bigcup_i \mathcal{F}_{ji}$ has the property that $\mathbb{V}^\mu(R) \le Cx$.
We proved this in (12). So each of those $R$ gives a contribution to the sum $\sum_{R :\, \mathbb{V}^\mu(R) \le Cx} \mu(R)^2$, and this contribution is $2^{-2M}$. Therefore,
$$\sum_{R :\, \mathbb{V}^\mu(R) \le Cx} \mu(R)^2 \ge 2^{-2M} \sum_j \sum_i \#(\mathcal{F}_{ji}) \ge c\, 2^{-2M} \cdot 2^M \log n \cdot n = c\, 2^{-M} n \log n = c\, x\, (\log x + M).$$
Now, given $x \gg 1$, we can freely choose $M$, e.g. $M = x, x^2, 2^x, F(x), \dots$, and then choose $n$ from $n2^{-M} = x$ and carry out the construction above. So (13) is proved.