Comptes Rendus Mathématique

We prove that pointwise and global Hölder regularity can be characterized using the coefficients on the Haar tight frame obtained by using a finite union of shifted Haar bases, despite the fact that the elements composing the frame are discontinuous.


Introduction
If $H$ is a separable Hilbert space, a frame is a sequence $(e_n)$ of elements of $H$ for which there exist $C, C' > 0$ such that
$$\forall f \in H, \qquad C \| f \|^2 \le \sum_n | \langle f | e_n \rangle |^2 \le C' \| f \|^2. \qquad (1)$$
This condition implies that the vectors $(e_n)$ span $H$, and any element $f \in H$ can be reconstructed from the $e_n$ in a stable way: there exists a dual frame $(g_n)$ such that the partial sums $\sum_{n \le N} \langle f | g_n \rangle e_n$ converge to $f$ in $H$. Wavelet frames were introduced by I. Daubechies, A. Grossmann and Y. Meyer in the seminal article [7], as a compromise between the continuous wavelet transform, which is a very flexible analysis tool (the restrictions on the shape of the wavelet are easy to meet), and orthonormal wavelet bases (whose use is much less demanding in terms of computational cost).
In signal and image processing, frame decompositions are often a preferred tool for zooming in on and unveiling features that are critical not only for uncovering algorithmic solutions to many data-driven problems, but also for simplifying the ensuing computational challenges. Properties of frame expansions have been extensively investigated; see e.g. [5,11] and references therein for an account of the developments of frame theory and its relevance in mathematical analysis, statistics, and signal and image processing. Convolutional Neural Networks (CNN), see e.g. [17], broadly referred to as Deep Learning (DL), are a more refined and better performing class of neural networks, consisting of layers of banks of linear filters whose respective outputs are non-linearly transformed. An alternative interpretation of CNN as a Haar wavelet multi-scale frame optimization of data (referred to as Scattering Networks) has been provided in [4]. This systematic and optimal representation, selected from a wavelet frame representation, was shown to be computationally efficient and to provide near-translation invariance and a viable inference framework. The notion of CNN was extended to data-graph structures in [3]. A graph convolution may be defined on such a space, which yields a formalism for graph-based DL. Similarly, a computationally more efficient graph-based DL using Haar convolution was constructed in [20], where a Haar decomposition is performed at a much lower computational cost. However, the main drawback of Haar decompositions is the irregularity of the basis elements, and thus their inability to account for the smoothness of the data. The new regularity results for Haar systems obtained in this paper therefore offer a new potential for further refining graph DL, and a new perspective on computational issues such as translation invariance, which is critical for transient changes inherent in the data, see [16].

The Haar basis
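Let $\varphi$ be the characteristic function of $[0, 1]$, which, more precisely, we define as
$$\varphi(x) = 1 \ \text{if } x \in (0,1), \qquad \varphi(x) = \tfrac{1}{2} \ \text{if } x \in \{0, 1\}, \qquad \varphi(x) = 0 \ \text{otherwise}, \qquad (2)$$
and let
$$\psi(x) = \varphi(2x) - \varphi(2x - 1). \qquad (3)$$
This (slightly unusual) definition for $\varphi$ (and hence $\psi$) at the end points of its support is motivated by the fact that we will consider pointwise values of partial sums of Haar series, and it is important that every point be a Lebesgue point of these partial sums. The Haar basis on $\mathbb{R}$ is the orthonormal basis of $L^2(\mathbb{R})$ composed of the functions
$$\varphi(x - k), \ k \in \mathbb{Z}, \qquad \text{and} \qquad 2^{j/2} \psi(2^j x - k), \ j \ge 0, \ k \in \mathbb{Z}. \qquad (4)$$
Definition 1. Let $E$ be a Banach space. A sequence $(e_n)_{n \in \mathbb{N}}$ of elements of $E$ is an unconditional basis of $E$ if the following two conditions hold.
• For every $f \in E$, there exists a unique sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ such that the partial sums $\sum_{n \le N} a_n e_n$ converge to $f$, i.e.
$$\Big\| \sum_{n=1}^{N} a_n e_n - f \Big\|_E \longrightarrow 0 \quad \text{when } N \to +\infty. \qquad (5)$$
• There exists $C > 0$ such that, for any sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ and any sequence $(\varepsilon_n)$ such that $|\varepsilon_n| \le 1$,
$$\Big\| \sum_n \varepsilon_n a_n e_n \Big\|_E \le C \Big\| \sum_n a_n e_n \Big\|_E. \qquad (6)$$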
The second requirement ensures the stability of the reconstruction of $f$. In statistics, this key property is referred to as the multiplier property. Note that it is only one of the two ingredients for an unconditional basis, and it may be verified in spaces that are not separable (and thus cannot have unconditional bases). This is, for example, the case for the Hölder spaces $C^\alpha$ if a smooth orthonormal wavelet basis is used, i.e. a basis of the form (4) where $\varphi$ and $\psi$ are sufficiently smooth and well localized, see (10) below.
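We now turn to the Haar basis. The problem of determining the function spaces for which the Haar system is an unconditional basis has been settled by G. Bourdaud in [2] for Besov spaces. Let us recall their definition. We use a smooth wavelet basis, i.e. an orthonormal basis of $L^2(\mathbb{R})$ which has the same algorithmic structure as the Haar system (4), but where $\varphi$ and $\psi$ belong to the Schwartz class, see [19]. Thus
$$f(x) = \sum_{k \in \mathbb{Z}} c_k \, \varphi(x - k) + \sum_{j \ge 0} \sum_{k \in \mathbb{Z}} c_{j,k} \, \psi(2^j x - k), \qquad (7)$$
where the wavelet coefficients of $f$ are
$$c_k = \int f(x) \varphi(x - k) \, dx, \qquad c_{j,k} = 2^j \int f(x) \psi(2^j x - k) \, dx. \qquad (8)$$
We will use the fact that convergence also holds pointwise: if $f \in L^1(\mathbb{R})$, then the partial sums of the wavelet series of $f$ converge almost everywhere, and in particular at Lebesgue points of $f$, see [24,25].
Definition 2. Let $s \in \mathbb{R}$ and $p, q \in (0, +\infty]$. A function $f$ belongs to the Besov space $B^{s,q}_p(\mathbb{R})$ if $(c_k) \in \ell^p$ and
$$\sum_{j \ge 0} \Big( \sum_{k \in \mathbb{Z}} \big| 2^{(s - 1/p) j} c_{j,k} \big|^p \Big)^{q/p} < \infty, \qquad (9)$$
using the usual convention for $\ell^\infty$ when $p$ or $q$ is infinite.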
In particular, the global Hölder spaces $C^\alpha(\mathbb{R}) = B^{\alpha,\infty}_\infty(\mathbb{R})$ ($\alpha \in \mathbb{R}$), sometimes referred to as Lipschitz spaces, are characterized by the condition
$$\exists C > 0: \quad \forall k, \ |c_k| \le C \quad \text{and} \quad \forall j, k, \ |c_{j,k}| \le C \, 2^{-\alpha j}. \qquad (10)$$
The characterization supplied by (9) can roughly be interpreted as follows: the fractional derivatives of $f$ of order $s$ belong to $L^p$, see [21]. The theorem of Bourdaud states (in particular) that the Haar system is an unconditional basis of $B^{s,q}_p(\mathbb{R})$ (and (9) holds) if and only if $1/p - 1 < s < \min(1/p, 1)$. This result is sharp; indeed, for larger values of $s$, the Haar function $\psi$ no longer belongs to the corresponding space. Note that this trivial obstruction also prevents (9) from yielding a wavelet characterization of Besov spaces in that case; indeed the Haar function (3) only has one nonvanishing coefficient on the Haar basis, and therefore its coefficients obviously satisfy (9). Thus, if this characterization held for the Haar basis, it would follow that the Haar function belongs to the corresponding Besov space, which is not the case. Despite the limitation due to its irregularity, the Haar system has been of constant use in signal and image processing, and more recently became a key tool in data-graph structures. A purpose of this paper is to show that this limitation can be mitigated by using a frame instead of an orthonormal basis, thus taking advantage of redundancy. In order to address this question more precisely, we note that the information contained in the definition of an unconditional basis may be split into two different points:
• The analysis problem: Is it possible to characterize the fact that a function belongs to a function space by a condition on the moduli of its coefficients on the analyzing system, as given, e.g., by (9) in the case of Besov spaces?
• The synthesis problem: When does the partial sum reconstruction formula (5) (or (7) in the case of wavelet bases) hold in the corresponding function space?
It is clear that the second point cannot be improved upon by using a redundant system; indeed, as soon as the "building blocks" do not belong to the function space, the norms in (5) are not even defined. However, there may be some room for improvement concerning the first point.
More information would be unveiled if (9) held for the coefficients on a redundant system, and one might expect that it can be converted into some regularity information on the function, making it possible to go "beyond" the limitation of Bourdaud's theorem. Following this motivation, we now introduce Haar frames.

Haar frame
The Haar frame we will consider is the union of the orthonormal Haar basis (4) together with the two orthonormal bases obtained by shifting the elements of the Haar basis by 1/3 and 2/3. This analysing system is composed of the functions
$$I_k(x) = \varphi\Big(x - \frac{k}{3}\Big), \ k \in \mathbb{Z}, \qquad \text{and} \qquad H_{j,k}(x) = \psi\Big(2^j x - \frac{k}{3}\Big), \ j \ge 0, \ k \in \mathbb{Z}. \qquad (11)$$
The $I_k$ together with the $2^{j/2} H_{j,k}$ form a union of three orthonormal bases; therefore, they constitute a tight frame, i.e. a frame for which the inequalities in (1) are equalities; more precisely,
$$\forall f \in L^2(\mathbb{R}), \qquad \sum_{k} | \langle f | I_k \rangle |^2 + \sum_{j \ge 0} \sum_{k} 2^j | \langle f | H_{j,k} \rangle |^2 = 3 \, \| f \|_2^2.$$
One well documented drawback of using an orthonormal wavelet basis in signal and image processing is that it does not supply a translation invariant representation, but depends on the particular discrete dyadic grid which is chosen. This drawback can be mitigated by oversampling this dyadic grid, thus replacing the initial orthonormal basis by a finite union of orthonormal bases, as proposed in [22,23]. The choice supplied by (11) corresponds to an oversampling by a factor 3 (we will see in Section 3 how the results we will obtain in this setting extend to other choices of oversampling). Applying a wavelet representation in many practical scenarios, such as signal detection in radar, reconstruction of a function/signal in noise (i.e. denoising), or parameter estimation in communication, is highly dependent on the translation invariance of the transformation. As an example, a continuous wavelet transform or a wavelet frame representation guarantees that a time delay estimation or a detection of a very short transient will be successfully achieved, whereas an orthogonal wavelet representation may turn out to be unable to detect a short transient or to estimate translation-sensitive parameters. This exploitation of a redundant wavelet representation was later illustrated in [6] as what may be viewed as an "averaging procedure" for improved denoising, referred to as cycle-spinning. We will show that this oversampling procedure also makes it possible to circumvent the inherent limitations of the Haar basis for regularity analysis.
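To make the benefit of the two shifted bases concrete, here is a minimal numerical sketch (not taken from the paper). It assumes the indexing $H_{j,k}(x) = \psi(2^j x - k/3)$ with support $[k/(3 \cdot 2^j), \, k/(3 \cdot 2^j) + 2^{-j}]$, the normalization $c_{j,k} = 2^j \int f H_{j,k}$, and the sign convention $\psi = +1$ on the left half of its support; the test function (a unit step at $1/2$) and all function names are illustrative choices. The indices $k$ that are multiples of 3 correspond to the unshifted Haar basis.

```python
from fractions import Fraction


def step_primitive(x):
    """Antiderivative of the unit step 1_{[1/2, +inf)} (illustrative test function)."""
    return max(x - Fraction(1, 2), Fraction(0))


def frame_coeff(j, k, primitive):
    """c_{j,k} = 2^j * (integral of f over left half - integral over right half)
    of the interval [k/(3*2^j), k/(3*2^j) + 2^-j] (assumed indexing)."""
    h = Fraction(1, 2**j)
    left = Fraction(k, 3 * 2**j)
    mid = left + h / 2
    right = left + h
    left_integral = primitive(mid) - primitive(left)
    right_integral = primitive(right) - primitive(mid)
    return 2**j * (left_integral - right_integral)


def max_coeff(j, primitive, only_basis):
    """Largest |c_{j,k}| over intervals with left endpoint in [-1, 2].
    only_basis=True keeps k = 0 mod 3, i.e. the unshifted Haar basis."""
    stride = 3 if only_basis else 1
    ks = range(-3 * 2**j, 6 * 2**j + 1, stride)
    return max(abs(frame_coeff(j, k, primitive)) for k in ks)


if __name__ == "__main__":
    for j in range(1, 7):
        print(j, max_coeff(j, step_primitive, True), max_coeff(j, step_primitive, False))
```

Under these assumptions, the Haar basis coefficients of the step vanish at every generation $j \ge 1$ (the jump sits on the dyadic grid), whereas the largest frame coefficient stays equal to $1/3$ at every such generation, so the absence of decay reveals the discontinuity.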

Uniform regularity
The first problem we consider is the characterization of the uniform Hölder spaces $C^\alpha(\mathbb{R})$; recall that they coincide with the spaces $B^{\alpha,\infty}_\infty(\mathbb{R})$, see [21]; if $0 < \alpha < 1$, an equivalent definition is: $f$ is bounded and
$$\exists C > 0, \ \forall x, y \in \mathbb{R}, \qquad |f(x) - f(y)| \le C \, |x - y|^\alpha.$$
Though $C^\alpha(\mathbb{R})$ cannot be characterized by a condition bearing on the Haar basis coefficients, Theorem 4 below shows that such a characterization is possible using the Haar frame (11). We will make the following minimal regularity assumptions on the functions of interest.

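Definition 3. Let $f$ be a locally bounded function; $f$ is Lebesgue-regular if every point is a Lebesgue point of $f$, i.e. if
$$\forall x_0 \in \mathbb{R}, \qquad f(x_0) = \lim_{r \to 0} \frac{1}{2r} \int_{x_0 - r}^{x_0 + r} f(x) \, dx.$$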
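Note that this definition implicitly makes the assumption that functions are defined "point to point" and not "except for a set of vanishing measure". Continuous functions are of course Lebesgue regular; but this class also allows for discontinuities. For instance assume that, at every point $x$, $f$ has a right and a left limit at $x$, and that, at every discontinuity point $x_0$, $f$ satisfies
$$f(x_0) = \frac{1}{2} \big( f(x_0^+) + f(x_0^-) \big);$$
then $f$ is clearly Lebesgue regular. A key property that we will use is that the wavelet series of a Lebesgue-regular function $f$ converges everywhere to $f$. The following result shows that the Haar frame characterization of the global $C^\alpha$ regularity is similar to that resulting from a smooth wavelet basis.
Theorem 4. Let $\alpha \in (0, 1)$ and let $f$ be a Lebesgue-regular function, with Haar frame coefficients $c_k = \int f(x) I_k(x) \, dx$ and $c_{j,k} = 2^j \int f(x) H_{j,k}(x) \, dx$. Then $f \in C^\alpha(\mathbb{R})$ if and only if
$$\exists C > 0: \quad \forall k, \ |c_k| \le C \quad \text{and} \quad \forall j \ge 0, \ \forall k, \ |c_{j,k}| \le C \, 2^{-\alpha j}. \qquad (13)$$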
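Proof. Assume that $f \in C^\alpha(\mathbb{R})$. Then
$$|c_k| = \Big| \int f(x) I_k(x) \, dx \Big| \le \| f \|_\infty,$$
hence the first statement in (13) holds. Let
$$I_{j,k} = \Big[ \frac{k}{3 \cdot 2^j}, \ \frac{k}{3 \cdot 2^j} + \frac{1}{2^j} \Big] \qquad (14)$$
be the support of $H_{j,k}$. We denote by $I^-_{j,k}$ the left half of $I_{j,k}$ and by $I^+_{j,k}$ the right half of $I_{j,k}$. Then
$$c_{j,k} = 2^j \Big( \int_{I^-_{j,k}} f(x) \, dx - \int_{I^+_{j,k}} f(x) \, dx \Big) = 2^j \int_{I^-_{j,k}} \big( f(x) - f(x + 2^{-j-1}) \big) \, dx,$$
so that
$$|c_{j,k}| \le 2^j \cdot 2^{-j-1} \cdot C \, 2^{-\alpha (j+1)} \le C \, 2^{-\alpha j};$$
hence the second statement in (13) holds.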
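The decay estimate of the direct part can be checked numerically. The following sketch is only an illustration under stated assumptions: it uses the locally Hölder test function $f(x) = |x - a|^\alpha$ (which has Hölder constant 1 for $0 < \alpha \le 1$, but is not bounded, so only a window of intervals is inspected), the coefficient normalization $c_{j,k} = 2^j \int f H_{j,k}$, and supports $[k/(3 \cdot 2^j), k/(3 \cdot 2^j) + 2^{-j}]$; function names are hypothetical. With Hölder constant 1, the computation above gives $2^{\alpha j} |c_{j,k}| \le 2^{-1-\alpha}$.

```python
import math


def holder_primitive(alpha, a):
    """Return an antiderivative F of f(x) = |x - a|**alpha (illustrative test function)."""
    def F(x):
        d = x - a
        return math.copysign(abs(d) ** (alpha + 1) / (alpha + 1), d)
    return F


def frame_coeff(j, k, primitive):
    """c_{j,k} = 2^j * (integral over left half - integral over right half)
    of [k/(3*2^j), k/(3*2^j) + 2^-j] (assumed indexing of the Haar frame)."""
    h = 2.0 ** (-j)
    left = k * h / 3.0
    mid = left + h / 2.0
    right = left + h
    return 2.0 ** j * ((primitive(mid) - primitive(left)) - (primitive(right) - primitive(mid)))


def scaled_sup(j, alpha, a):
    """sup over a window of k of 2^(alpha*j) * |c_{j,k}| at generation j."""
    F = holder_primitive(alpha, a)
    ks = range(-3 * 2**j, 6 * 2**j + 1)  # left endpoints in [-1, 2]
    return 2.0 ** (alpha * j) * max(abs(frame_coeff(j, k, F)) for k in ks)


if __name__ == "__main__":
    alpha, a = 0.5, 0.3
    for j in range(9):
        print(j, scaled_sup(j, alpha, a), "bound:", 2.0 ** (-1 - alpha))
```

The printed quantity should stay below the bound at every generation, which is the uniform decay stated in the second part of (13).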
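Conversely, assume that (13) holds. Since $f$ is Lebesgue regular, we can use the reconstruction formula for the Haar orthonormal wavelet basis only (which converges everywhere towards the pointwise value of $f$); thus
$$f(x) = \sum_{k} c_k \, \varphi(x - k) + \sum_{j \ge 0} \sum_{k} c_{j,k} \, \psi(2^j x - k),$$
where the coefficients are those of $f$ on the Haar basis. At each point $x$ and each generation, at most two terms do not vanish, so that (13) implies that $f$ is bounded. Let us now estimate increments of $f$. We have three possible reconstruction formulas for $f$ using any of the three orthonormal bases composing the tight frame; the idea of the proof is to use this extra flexibility. Let $x \ne y$ be given; since $f$ is bounded, we can assume that $|x - y| < 2/3$. Define $J \ge 0$ by
$$\frac{1}{3} \, 2^{-J} \le |x - y| < \frac{2}{3} \, 2^{-J}.$$
Consider now the intervals $I_{J,k}$. Since these intervals are of length $2^{-J}$ and are deduced from each other by a shift of $\frac{1}{3} \cdot 2^{-J}$, at least one of them contains both points $x$ and $y$; we denote it by $I_{J,k_J}$. We now use either the Haar basis or one of its two "sisters" shifted by 1/3 or 2/3, the choice being driven by the fact that the interval $I_{J,k_J}$ that we picked is the support of an element $H_{J,k}$ of the chosen basis. This implies that, for all generations $j < J$, either the support of an $H_{j,k}$ of this basis does not contain $x$ and $y$, or $x$ and $y$ are in the same "half" of the support of $H_{j,k}$. Let us use the reconstruction formula using this orthonormal basis (and let us denote its elements by $\varphi_k$ and $\psi_{j,k}$); since it converges everywhere towards the pointwise value of $f$, we get
$$f(x) - f(y) = \sum_{k} c_k \big( \varphi_k(x) - \varphi_k(y) \big) + \sum_{j \ge 0} \sum_{k} c_{j,k} \big( \psi_{j,k}(x) - \psi_{j,k}(y) \big).$$
Because of our choice of the basis, it follows that $\forall k$, $\varphi_k(x) = \varphi_k(y)$ and $\forall j < J$, $\forall k$, $\psi_{j,k}(x) = \psi_{j,k}(y)$; therefore
$$f(x) - f(y) = \sum_{j \ge J} \sum_{k} c_{j,k} \big( \psi_{j,k}(x) - \psi_{j,k}(y) \big).$$
At each generation $j$, at most two terms bring a contribution; using (13), we get
$$|f(x) - f(y)| \le 2C \sum_{j \ge J} 2^{-\alpha j} \le C' \, 2^{-\alpha J} \le C'' \, |x - y|^\alpha.$$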
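The covering step of this argument can be sketched in exact arithmetic. This is a hedged illustration, not the authors' construction: it assumes the intervals $I_{J,k} = [k/(3 \cdot 2^J), k/(3 \cdot 2^J) + 2^{-J}]$, and a generation $J$ chosen as the largest one with $|x - y| \le \frac{2}{3} 2^{-J}$ (an assumption made here so that the covering always succeeds). The helper names, and the rule recovering which of the three bases (shift $s/3$, $s \in \{0, 1, 2\}$) owns a given interval, are illustrative.

```python
from fractions import Fraction
import math


def pick_scale(distance):
    """Largest J >= 0 with distance <= (2/3) * 2^-J; requires 0 < distance <= 2/3."""
    if not (0 < distance <= Fraction(2, 3)):
        raise ValueError("distance must lie in (0, 2/3]")
    J = 0
    while distance <= Fraction(2, 3) / 2 ** (J + 1):
        J += 1
    return J


def covering_interval(x, y, J):
    """Return k such that [k/(3*2^J), k/(3*2^J) + 2^-J] contains x and y, else None."""
    lo, hi = min(x, y), max(x, y)
    step = Fraction(1, 3 * 2**J)
    k = math.floor(lo / step)          # last grid point at or to the left of lo
    if hi <= k * step + Fraction(1, 2**J):
        return k
    return None


def basis_shift(J, k):
    """Index s in {0, 1, 2} of the basis (shifted by s/3) whose generation-J
    element is supported on I_{J,k}: k = s * 2^J modulo 3."""
    return (k * 2**J) % 3


if __name__ == "__main__":
    x, y = Fraction(9, 20), Fraction(11, 20)
    J = pick_scale(abs(x - y))
    k = covering_interval(x, y, J)
    print(J, k, basis_shift(J, k))
```

Since consecutive intervals are shifted by one third of their length, the interval whose left endpoint is the last grid point before $\min(x, y)$ always contains both points when $|x - y| \le \frac{2}{3} 2^{-J}$; with a single dyadic grid this can fail for arbitrarily close points.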

Remark 5.
As pointed out to us by Albert Cohen, the proof of Theorem 4 relies on an argument which is similar to the mixing lemma ([9, Lemma 2.3 in Chap. 12], originally in [10]) used to determine the rate of approximation of smooth functions by sequences of splines on the interval $[0, 1]$ having knots at $T_n = \{ k/n \}_{k = 0, \dots, n}$. The fact that this family is not nested allows for a better rate than in the nested case (e.g. for dyadic subdivisions). Note however that the (piecewise constant) splines used in the proof of Theorem 4 yield a much more economical system for a similar conclusion.

Pointwise regularity: The Haar basis
We now consider the problem of characterizing pointwise regularity by estimates on the Haar frame coefficients.
Definition 6. Let $x_0 \in \mathbb{R}$ and $\alpha \ge 0$. A locally bounded function $f: \mathbb{R} \to \mathbb{R}$ belongs to $C^\alpha(x_0)$ if there exist $C > 0$ and a polynomial $P_{x_0}$ with $\deg(P_{x_0}) < \alpha$ such that, in a neighborhood of $x_0$,
$$|f(x) - P_{x_0}(x - x_0)| \le C \, |x - x_0|^\alpha. \qquad (15)$$
The pointwise Hölder exponent of $f$ at $x_0$ is $h_f(x_0) = \sup \{ \alpha : f \in C^\alpha(x_0) \}$.
The polynomial $P_{x_0}$ is unique; it is called the Taylor polynomial of $f$ at $x_0$. When using smooth wavelets, criteria based on the wavelet coefficients on an orthonormal wavelet basis make it possible to recover pointwise regularity; let us recall some notations. A dyadic interval is of the form
$$\lambda = \lambda_{j,k} = \Big[ \frac{k}{2^j}, \ \frac{k+1}{2^j} \Big).$$
Let $\lambda$ be a dyadic interval; $3\lambda$ denotes the interval of same center as $\lambda$ and three times wider. Wavelet coefficients can therefore be indexed by dyadic intervals: we will write $c_\lambda := c_{j,k}$.
Definition 7. Let $f$ be a locally bounded function, and let the $(\varphi_k)$ and $(\psi_{j,k})$ generate a smooth wavelet basis. The wavelet leaders of $f$ are the quantities
$$d_\lambda = \sup_{\lambda' \subset 3\lambda} |c_{\lambda'}|. \qquad (16)$$
Wavelet leaders make it possible to estimate pointwise Hölder exponents, see [12] for the initial 2-microlocal wavelet criterion and [13] for its reformulation in terms of wavelet leaders, which we now recall. We denote by $\lambda_j(x_0)$ the dyadic interval of width $2^{-j}$ which contains $x_0$.
Theorem 8. Let $f \in C^\varepsilon(\mathbb{R})$ for an $\varepsilon > 0$. If the generating wavelets $\varphi$ and $\psi$ belong to $C^N(\mathbb{R})$ for an $N > h_f(x_0)$, then
$$h_f(x_0) = \liminf_{j \to +\infty} \frac{\log \big( d_{\lambda_j(x_0)} \big)}{\log (2^{-j})}.$$
The same question is more difficult to answer when using an irregular wavelet basis; to our knowledge, it was only tackled in [14]. Let us first recall the pointwise regularity results obtained there for the Haar basis. Recall that a dyadic rational is a point of the form $k/2^j$, for $j, k \in \mathbb{Z}$.
Definition 9. Let $x \in \mathbb{R}$. The rate of approximation of $x$ by dyadic rationals is
$$r(x) = \limsup_{j \to +\infty} \frac{\log \big( \mathrm{dist}(x, 2^{-j} \mathbb{Z}) \big)}{\log (2^{-j})}.$$
For any $x$, $r(x) \ge 1$, and almost every $x$ satisfies $r(x) = 1$. The following result is proved in [14].
Proposition 10. Let $f$ be a locally bounded function, and $x_0 \in \mathbb{R}$. If $f \in C^\alpha(x_0)$ for $\alpha < 1$, then its wavelet leaders on the Haar basis satisfy
$$\exists C > 0, \ \forall j \ge 0, \qquad d_{\lambda_j(x_0)} \le C \, 2^{-\alpha j}. \qquad (17)$$
Conversely, if (17) holds and if the Haar basis coefficients $c_{j,k}$ satisfy the uniform decay assumption
$$\exists \varepsilon > 0, \ \exists C > 0, \ \forall j, k, \qquad |c_{j,k}| \le C \, 2^{-\varepsilon j}, \qquad (18)$$
then $h_f(x_0) \ge \alpha / r(x_0)$.
This result yields the best possible pointwise regularity that can be inferred from the knowledge of the size of the Haar basis coefficients. Almost every $x_0$ satisfies $r(x_0) = 1$, so that, if (18) holds, then Proposition 10 yields the exact pointwise regularity at these points. Besides dyadic points (where it is clear that decay estimates on the Haar coefficients do not make it possible to estimate pointwise regularity), the estimate of $h_f(x_0)$ yielded by Proposition 10 deteriorates if $r(x_0)$ is large, which means that there exists a sequence of scales such that $x_0$ is "very close" to dyadic points at these scales.
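To give a feeling for the rate of approximation by dyadic rationals, the sketch below evaluates a finite-scale proxy. It assumes the definition $r(x) = \limsup_j \log(\mathrm{dist}(x, 2^{-j}\mathbb{Z})) / \log(2^{-j})$ and simply computes the quotient at a given generation $j \ge 1$; the function names are illustrative, and a finite-scale value is of course only an indication of the limsup.

```python
from fractions import Fraction
import math


def dyadic_distance(x, j):
    """Exact distance from the rational x to the grid 2^-j * Z."""
    t = x * 2**j
    frac = t - math.floor(t)
    return min(frac, 1 - frac) / 2**j


def finite_scale_rate(x, j):
    """log(dist(x, 2^-j Z)) / log(2^-j) for j >= 1; infinite if x lies on the grid."""
    d = dyadic_distance(x, j)
    if d == 0:
        return math.inf
    return math.log2(d) / (-j)


if __name__ == "__main__":
    generic = Fraction(1, 3)                         # stays at relative distance 1/3 from every grid
    near_dyadic = Fraction(1, 2) + Fraction(1, 2**30)  # very close to 1/2 at scales j < 30
    for j in (5, 10, 20):
        print(j, finite_scale_rate(generic, j), finite_scale_rate(near_dyadic, j))
```

For $x = 1/3$ the quotient equals $(j + \log_2 3)/j$ and tends to 1, the generic value, while a point lying within $2^{-30}$ of $1/2$ produces the value 3 at generation $j = 10$: this is the kind of scale at which estimates based on Haar basis coefficients alone lose accuracy.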

Pointwise regularity: The Haar frame
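We will now show that, in contradistinction with Proposition 10, the Haar frame coefficients make it possible to recover the pointwise Hölder exponent everywhere. Let us first introduce some notations. We index the elements of the Haar frame $H_{j,k}$ by their support $\lambda = I_{j,k}$, see (14) (note that $\lambda$ is not necessarily a dyadic interval). The corresponding Haar coefficient is
$$c_\lambda = c_{j,k} = 2^j \int f(x) H_{j,k}(x) \, dx.$$
Haar leaders are indexed by dyadic intervals and defined by
$$d_\lambda = \sup_{\lambda' \subset 3\lambda} |c_{\lambda'}|,$$
where the supremum is taken over the supports $\lambda'$ of the elements of the Haar frame.
Theorem 11. Let $f$ be a locally bounded function, and let $\alpha > 0$. If $f \in C^\alpha(x_0)$, and if the Taylor polynomial of $f$ at $x_0$ is constant, then its wavelet leaders on the Haar frame satisfy
$$\exists C > 0, \ \forall j \ge 0, \qquad d_{\lambda_j(x_0)} \le C \, 2^{-\alpha j}. \qquad (19)$$
Conversely, if (19) holds and if the coefficients $c_{j,k}$ satisfy the uniform decay assumption
$$\exists \varepsilon > 0, \ \exists C > 0, \ \forall j, k, \qquad |c_{j,k}| \le C \, 2^{-\varepsilon j}, \qquad (20)$$
then there exist $C, \delta > 0$ such that
$$|x - x_0| \le \delta \ \Longrightarrow \ |f(x) - f(x_0)| \le C \, |x - x_0|^\alpha \, \big| \log (|x - x_0|) \big|. \qquad (21)$$
Remark 12. The restriction on the Taylor polynomial is automatically satisfied if $0 < \alpha < 1$. It is easy to check that it is also satisfied by a class of functions which plays an important role in multifractal analysis: the distribution functions of singular measures (with no restriction on $\alpha$).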
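Let $x$ be close to $x_0$, and define $J$ by
$$\frac{1}{3} \, 2^{-J} \le |x - x_0| < \frac{2}{3} \, 2^{-J}. \qquad (22)$$
At least one of the three Haar bases is such that, at the generation $J$, $x$ and $x_0$ belong to the same interval $I_{J,k_J}$. As in the proof of uniform regularity, in order to estimate increments of $f$ we use this Haar basis for the reconstruction formula:
$$f(x) - f(x_0) = \sum_{k} c_k \big( \varphi_k(x) - \varphi_k(x_0) \big) + \sum_{j \ge 0} \sum_{k} c_{j,k} \big( \psi_{j,k}(x) - \psi_{j,k}(x_0) \big).$$
The terms for $j < J$ vanish because $x$ and $x_0$ belong to the same (right or left) half of the support of $\psi_{j,k}$ (or of $\varphi_k$). As regards the terms for $j \ge J$, we first assume that $j \le [AJ]$, where $A$ is a (large) constant that will be fixed later. Each sum over $k$ contains at most two nonvanishing terms: the ones such that $x$ or $x_0$ belong to $I_{j,k}$; but, in that case, (19) implies that $|c_{j,k}| \le C \, 2^{-\alpha J}$. Therefore
$$\Big| \sum_{j = J}^{[AJ]} \sum_{k} c_{j,k} \big( \psi_{j,k}(x) - \psi_{j,k}(x_0) \big) \Big| \le C \, A \, J \, 2^{-\alpha J}. \qquad (23)$$
Assume now that $j > [AJ]$. Because of the localization of the Haar basis, (20) implies that
$$\Big| \sum_{j > [AJ]} \sum_{k} c_{j,k} \big( \psi_{j,k}(x) - \psi_{j,k}(x_0) \big) \Big| \le C \sum_{j > [AJ]} 2^{-\varepsilon j} \le C \, 2^{-\varepsilon A J}. \qquad (24)$$
We pick $A$ such that $\varepsilon A = \alpha$; (22) implies that $J \le C \, | \log (|x - x_0|) |$, so that (21) follows from (23) and (24).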

Concluding remarks
Our choice of the regular oversampling supplied by the union of three bases shifted by 1/3 was motivated by two reasons: on the one hand, those already mentioned and developed in [6,22,23], i.e. using a wavelet system closer to translation invariance, but which keeps the numerical efficiency of orthonormal wavelet bases; and, on the other hand, the fact that, if one wants to keep a regular sampling, then an oversampling by 3 is the smallest one for which the results stated in Theorems 4 and 11 hold. However, variants are possible. Indeed the key argument in the proofs of Theorems 4 and 11 is that, if $x$ and $y$ are such that $|x - y| \sim 2^{-j}$, then one can find a translated dyadic interval of length close to $2^{-j}$ which contains both $x$ and $y$. As pointed out by one of the referees, this is clearly possible using only the Haar basis and only one translate by a rational $r = p/(2k + 1)$. Indeed, let $I$ be an interval of length $l$ satisfying $2^{-j} \le l < 2 \cdot 2^{-j}$, and let $m$ be defined by
$$\frac{p}{8(2k + 1)} \le 2^{-m} < \frac{p}{4(2k + 1)}.$$
Then $I$ clearly is included either in a dyadic interval of length $2^{-j+m}$ or in an interval of the same length obtained as a shift by $r$ of a dyadic interval. Therefore, the proofs of Theorems 4 and 11 work in the same way for these two systems. And of course, the same conclusion also holds if we include additional translates by $r = q/(2k + 1)$ for several values of $q$ (and in particular all of them, if one is concerned by the requirement of a regular oversampling). Similarly, Theorems 4 and 8 easily extend to the several variable version of the Haar system, which is obtained by the usual tensor product construction: for $x = (x_1, \dots, x_d)$, we define $\Phi(x) = \varphi(x_1) \cdots \varphi(x_d)$, and
$$\Psi^{(i)}(x) = \psi^1(x_1) \cdots \psi^d(x_d),$$
where the $\psi^l$ are either the one-variable functions $\varphi$ or $\psi$, the choice $\varphi(x_1) \cdots \varphi(x_d)$ being excluded (so that there are $2^d - 1$ functions $\Psi^{(i)}$). Then the $d$-dimensional Haar basis is composed of the $\Phi(x - k)$ for $k \in \mathbb{Z}^d$ and the $2^{dj/2} \Psi^{(i)}(2^j x - k)$ for $j \ge 0$ and $k \in \mathbb{Z}^d$. The corresponding Haar frame is obtained by shifting this basis by the vectors $\frac{1}{3} \sum_i \varepsilon_i e_i$, where $\varepsilon_i \in \{0, 1, 2\}$ and the $e_i$ are the elements of the canonical basis of $\mathbb{R}^d$. This yields $3^d$ bases, and the $d$-dimensional Haar frame is composed of the union of these bases. Theorems 4 and 11 also extend without difficulty to the $d$-dimensional setting. The key point is to notice that, here again, if $|x - y| \sim 2^{-j}$, then one can find a translated dyadic cube of width close to $2^{-j}$ which contains both $x$ and $y$, which is clear; indeed it suffices to use, in each direction of the canonical basis, the translation supplied by the one-dimensional case for the corresponding coordinate of the segment $[x, y]$.
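The coordinatewise argument can be sketched as follows. This is an illustration under assumptions, not the authors' code: cubes are taken as products of intervals $[k_i/(3 \cdot 2^J), k_i/(3 \cdot 2^J) + 2^{-J}]$, the shift index of each coordinate is recovered from $k_i$ by the rule $s_i = k_i 2^J \bmod 3$ (which follows from $\psi(2^J (t - s/3) - l) = \psi(2^J t - (3l + s 2^J)/3)$), and all names are hypothetical.

```python
from fractions import Fraction
from itertools import product
import math


def frame_shifts(d):
    """The 3^d shift vectors (eps_1/3, ..., eps_d/3), eps_i in {0, 1, 2}."""
    return [tuple(Fraction(e, 3) for e in eps) for eps in product(range(3), repeat=d)]


def covering_cube(x, y, J):
    """Apply the one-dimensional covering rule to each coordinate.

    Returns (ks, shifts): ks[i] indexes the interval
    [ks[i]/(3*2^J), ks[i]/(3*2^J) + 2^-J] containing x[i] and y[i], and
    shifts[i] in {0, 1, 2} says which shifted basis it belongs to.
    Returns None if some coordinates are more than (2/3)*2^-J apart
    and the rule fails."""
    step = Fraction(1, 3 * 2**J)
    side = Fraction(1, 2**J)
    ks = []
    for xi, yi in zip(x, y):
        lo, hi = min(xi, yi), max(xi, yi)
        k = math.floor(lo / step)
        if hi > k * step + side:
            return None
        ks.append(k)
    shifts = tuple((k * 2**J) % 3 for k in ks)
    return tuple(ks), shifts


if __name__ == "__main__":
    x = (Fraction(1, 10), Fraction(7, 10))
    y = (Fraction(3, 10), Fraction(6, 10))
    print(len(frame_shifts(2)), covering_cube(x, y, 1))
```

Each coordinate selects its own shift independently, which is why all $3^d$ combinations of shifts are needed to guarantee a common cube for every pair of nearby points.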
Finally, note that Theorems 4 and 8 also extend to piecewise smooth wavelets, such as the "spline wavelets" constructed by G. Battle and P.-G. Lemarié, see [1,18]; in that case, the same proofs make it possible to characterize global and pointwise regularity up to an order $\alpha$ given by the number of vanishing moments of the wavelet, which is larger than its uniform regularity (and was the natural bound for previous regularity results). This is possible because these wavelets are piecewise polynomials between integers. In contrast, an interesting open problem would be to determine whether these results can be extended to, e.g., Daubechies wavelets, the singularities of which are not located at integers, but on fractal sets, see [8].