Smooth hyperbolic wavelet deconvolution with anisotropic structure

: This paper considers a deconvolution regression problem in a multivariate setting with anisotropic structure and constructs an estimator of the function of interest using the hyperbolic wavelet basis. The deconvo- lution structure assumed is an anisotropic version of the smooth type (either regular-smooth or super-smooth). The function of interest is assumed to be- long to a Besov space with anisotropic smoothness. Global performances of the presented hyperbolic wavelet estimators is measured by obtaining upper bounds on convergence rates in the L p -risk with 1 ≤ p ≤ 2 and 1 ≤ p < ∞ in the regular-smooth and super-smooth cases respectively. The results are compared and contrasted with existing convergence results in the literature.


Introduction
This paper considers the deconvolution problem of estimation of f that is observed indirectly in an m-dimensional noisy signal, where f : T m → R is defined on the torus, T m = [0, 1] m . The function f has anisotropic structure and observed through the convolution operator, Additive noise is present with B(x) denoting a m-dimensional Brownian sheet. Namely, B(x) is a mean zero Gaussian process with covariance, EB(x)B(y) = m i=1 (|x i | + |y i | − |x i − y i |) /2 and the level of noise in model (1) is governed by the normalisation factor, ε ∈ (0, e −1 ). Asymptotically the normalisation factor ε → 0 as n → ∞ where n represents the sample size.
Deconvolution models such as (1) have examples in econometrics, physics, astronomy and medical image processing. In particular, an important and interesting application is the recovery of features in images that have been distorted. This paper presents new convergence results using the L p -risk in the setting of (1) where the dimension of the problem is arbitrary but fixed and there is anisotropic structure in f and g.
A similar recent body of work includes the functional deconvolution problem with anisotropy in [7,6] and [8] with applications in geophysical data to invert seismic signals and medical imaging. In that body of work the convolution structure was only used in one single dimension of the variables and the anisotropy was apparent in the other dimensions of the functions. The work presented here allows deconvolution to occur on any subset of the possible dimensions with a judicious choice of parameters. Another recent related work is that of [38]. Their paper considers a multivariate regression deconvolution problem similar to our context. The focus of their paper is on uniform convergence of a kernel smoother estimator to construct confidence bands of f . There are also studies that focus on the estimation of f in the direct setting ((1) without the convolution operator) in [24,25] and [4].
This paper examines the convergence rates in the estimation of f in (1) using the L p -risk when both f and g have anisotropic smoothness. In particular, a hyperbolic wavelet deconvolution estimator is constructed and convergence rates are established. As expected, the results generalise known results in the literature and the hyperbolic wavelet basis is able to adapt to the anisotropic structure in both f and g. In particular, it is shown that in the case when the signal of interest f is in a Besov space with anisotropic smoothness, it can be recovered with respect to an upper bound on the L p -risk with accuracy up to extra logarithmic terms, ε 2pγs/(2γs+2γ ν+1/2 ) , and log ε −(1−ζ)/m −mps * γ β −1 ; for the case of regular-smooth and super-smooth convolutions respectively. The degree of anisotropic smoothness in f is represented with parameter s ∈ R m + . In the regular-smooth case, the smoothness is evident in the rate with the harmonic sum γ s = ( m =1 s −1 ) −1 . The degree of ill-posedness is represented with parameter ν ∈ R m + and has a detrimental affect on the rate with the harmonic mean term, γ ν+1/2 = mγ ν+1/2 . In the super-smooth case, a logarithmic rate of convergence is attained and controlled by the degree of ill-posedness parameter β ∈ R m + , the dimension of the problem m, and a modified smoothness index s * = γ s + 1/p − 1/(p ∧ π) where π is a Besov space parameter for f (and p ∧ π = min(p, π)).
Even in the univariate case with m = 1, wavelet methods have been a popular choice in deconvolution problems in recent history due to their ability to spatially adapt to functions and capture transient sharp bumps and discontinuities (see [17,43,19] and [23] and references therein). Various extensions in the univariate case have arisen from this earlier work with multichannel methods, generalised inverse problem operators and functional deconvolution given in [15,1,36,33,34,35] and [9]. Relaxing the noise structure to have strong correlations was studied in [44], and both a multichannel deconvolution with correlated noise in [29].
In the case of direct observations without the convolution structure, the multivariate case has received attention with various multivariate wavelets being proposed. In the case where f has an isotropic structure with similar behaviour in all dimension directions, the wavelet-tensor methods of [30] are favoured since they preserve the multi-resolution analysis behaviour of standard wavelets. However, in the multivariate case of anisotropic functions, with behaviour being different in different dimension directions, the wavelet-tensor basis is no longer optimal. The anisotropic or hyperbolic wavelet basis (see [13]) is better suited in this scenario since it achieves a superior rate of convergence adapting to the varying smoothness in different directions while the wavelet-tensor isotropic basis is restricted by the dimensional direction that has the coarsest smoothness (see [32] and [31]). Recently, [3] and [4] studied the hyperbolic wavelet basis in depth, with the latter comparing and contrasting the isotropic and anisotropic wavelet bases using the maxiset performance criterion. The anisotropic wavelet basis is better suited being able to capture a large class of Besov style spaces with anisotropy while the isotropic basis is unable to capture the same anisotropic Besov spaces. Applications of the hyperbolic wavelets have received recent attention in ultrasound imaging and texture analysis in [41,39,2,12] and [20]. A convolution model that uses a hyperbolic wavelet estimation technique is considered in [40] but not with the same structure as considered in this paper.

Outline of the article
The paper is organised as follows. Section 2 contains more detailed information and discussion about the context of the paper and related situations already present in the literature. It also contains some preliminary information on the types of smooth convolution, a wavelet basis of interest and its extension to the anisotropic hyperbolic basis, and definitions of related Besov spaces on the torus T m . The construction of the wavelet thresholding estimators is given in Section 3. The upper bound results are shown in Section 4 over those Besov spaces and for the L p -risk with 1 ≤ p ≤ 2 when the convolution type is regularsmooth and 1 ≤ p < ∞ when the convolution type is super-smooth. Conclusions and discussion are given in Section 5 with the proofs of the theoretical results and some required auxiliary results given in Section 6.

Notation
For convenience to describe the multi-resolution levels in the wavelet expansion and the decay in Fourier coefficients, the multi-index notation is useful. A multiindex is an ordered m-tuple of non-negative integers. If α = (α 1 , α 2 , . . . , α m ) ∈ Z m + is a multi-index, define |α| = m i=1 α i . If A is a set, |A| denotes the cardinality or number of elements in the set. For generic real valued vectors , the absolute value of x (where 1 A denotes the indicator function). The inner product operator ·, · is defined for both vectors and functions. For dx. Note throughout the paper, inequalities applied to vectors apply component-wise. That is, for x ∈ R m , y ∈ R m and c ∈ R, the statements x ≥ y and x ≥ c imply that x i ≥ y i and x i ≥ c for all i = 1, 2, . . . , m; and similar definitions apply for the other types of inequality statements. The statement A B means there exists constants C 1 ,

Convolution types
The types of convolution in this work are restricted to the smooth type, defined in the Fourier domain. Define the Fourier transform and operator on the space T m for a function f : T m → R, In the univariate case, the smooth type convolution functions are of the form, The super-smooth case requires θ > 0 and the exponential decay dominates g(ω). In the regular-smooth or polynomial smooth case, θ = 0 and the Fourier decay of the convolution function is at a polynomial rate in ν. These are the typical cases considered [19], [23] and references therein. The smoothness assumption in multivariate context of convolution is much less studied. An extension from m = 1 to m = 2 is given by [16] with, The most general assumption of the regular-smooth type appears in [37] which broadly defines the decay in terms of where C α are constants with C α = 0 for at least one multi-index α.
In this work, the focus is on the case of anisotropy where the behaviour is different in the co-ordinate directions only. Thus motivates the following assumption of smooth decay, The definition in (2) generalises the assumptions made in earlier univariate cases such as [29,36] and [18]. It is still not as general as the multi-index case for the regular-smooth decay presented in [37] but does include the super-smooth case when components of θ > 0. Note that the regular smooth decay in (2) takes the form similar to [19] to avoid singularity at zero.

Anisotropic multivariate wavelet bases
For the context of univariate deconvolution, a pragmatic choice of basis functions is the standard periodised Meyer wavelet basis (φ, ψ) defined on T. The Meyer basis is band-limited allowing a simple mathematical formulation in the Fourier domain to simplify the convolution structure (cf. [23]). This approach will be adopted here for the multivariate framework.
Start with the standard periodised Meyer wavelet basis (c.f. [23]). A periodic function f : T → R has wavelet expansions, known as the homogeneous and inhomogeneous expansion respectively. The coefficients, b j,k are given by the inner product, with a j0,k defined similarly in terms of φ j,k . The translated and dilated wavelet functions are defined, An anisotropic wavelet basis allows different scale levels in each dimension and are computed as a tensor product of the univariate inhomogeneous wavelet ba- To generalise, each dimension can start its expansion at a possibly different coarse j 0 scale. To this end, define a vector of coarse scales j 0 = (j 0,1 , j 0,2 , . . . , j 0,m ) ∈ Z m + . Then the inhomogeneous hyperbolic wavelet basis is defined over a set of scales j ∈ J i and locations k ∈ K j with, Then a hyperbolic wavelet expansion of a function f : T m → R takes the form, where now the wavelet coefficients are defined by an equivalent inner product, In the forthcoming presentation, to ease the notation in places, λ = (i, j, k) will denote a triplet index and ψ λ = ψ i j,k .

Besov spaces
The classical anisotropic Besov spaces are defined in [10] and [42]. It is defined via the iterated difference operator, Then the resulting classical anisotropic Besov space can be defined, where · is the ceiling function. As is standard in the literature, the above Besov space can be extended to include the case of π = ∞ and/or q = ∞ with an appropriate modification to use the sup norm. The interested reader is referred to [10], [42] and references therein for more treatment and links between Besov spaces and other spaces such as the anisotropic Hölder and Sobolev spaces.
As is common in wavelet methodology, wavelets characterise functions in Besov spaces by their wavelet coefficients and their decay. For our purposes, define the equivalent anisotropic Besov spaces introduced by [21], The equivalence of the two spaces and the link to the hyperbolic wavelet basis is established in [21].

Unconditional bases and Temlyakov's property
Important properties of univariate wavelets and their multivariate extensions were investigated and analysed in the 90s and the 2000s, see for example the results in [14,17,22,26] and [27]. Two such properties that were instrumental in deriving the convergence rate results were the unconditional basis property and the Temlyakov property. The unconditional basis property and its consequences below will be used later in the proofs.
A basis {e i , i ∈ N} of a space X is said to be an unconditional basis of X if for every x ∈ X there exists a unique set of scalars {a n } n∈N such that x = n∈N a n e n . The periodised Meyer wavelets are known to be unconditional bases for L p (T) (see Theorem 9.1.4 of [11]) and therefore from Lemma 2.1 and Lemma 2.2 of [13] it follows that the hyperbolic Meyer wavelet basis {ψ λ } in Section 2.3 are unconditional bases for L p (T m ) and satisfy the Littlewood-Paley condition below, An unconditional basis also has the shrinkage property that if a set of scalars is dominated by another set (|θ i | ≤ |θ i |) then there exists an absolute constant C such that, The other important property is the Temlyakov property that is not used in our scenario. The Temlyakov property assumes that a basis {e i , i ∈ N} satisfies, The unconditional basis and Temlyakov's property enabled the convergence rate results to be easily extended from the L 2 -loss to the general L p -loss for 1 ≤ p < ∞. In particular, the Temlyakov property was the instrumental tool to ensure the results extend to 2 < p < ∞, see for example [26] and [27]. In those papers, the focus is on univariate wavelet bases and the associated wavelet-tensor basis. The wavelet-tensor basis extends the univariate basis to a multivariate basis via a tensor product albeit with the same single parameter j ∈ Z used in the tensor. One of the major benefits of the wavelet-tensor basis is that it still retains the Temlyakov property. However, the wavelet-tensor basis is unable to optimally reconstruct functions with anisotropic smoothness while the hyperbolic wavelet basis can (see [4]). On the other hand, the hyperbolic wavelet basis does not satisfy the Temlyakov property in general. Consequently, the results presented here are constrained to 1 ≤ p ≤ 2 for the regular-smooth scenario as it uses the inhomogeneous hyperbolic wavelet expansion. The supersmooth scenario is able to cover the full spectrum 1 ≤ p < ∞ in the rate results since it uses a raw hyperbolic projection expansion instead of the inhomogeneous wavelet expansion and has no need for the Temlyakov property.

Estimation
The estimation of f is tempered with a hard-threshold inhomogeneous expansion or raw projection expansion depending on the severity of the decay. In the case of super-smooth convolution, only a linear projection expansion is considered since the decay in the Fourier domain is exponentially fast. In the case of regular smooth convolution, an inhomogeneous non-linear expansion is used.
To simplify problem, the convolution operator is reduced to a simple product of coefficients when viewed in the Fourier domain. Define the Fourier basis functions, e ω (x) := e 2πiω T x , with ω ∈ Z m and x ∈ R m . Then the noisy model can be computed in the Fourier domain, where From the noisy Fourier coefficients y ω , the wavelet coefficients can be estimated with, where the sums are finite over the sets defined below,

Non-linear and projection estimators
In the case of regular-smooth convolution, the non-linear estimator f R is a hardthresholded inhomogeneous wavelet expansion with, where the index set depends on the rate of decay in the regular-smooth convolution function with, and the coefficients are hard-thresholded with b i j, The boundary for truncation, T λ , is defined in the forthcoming section. Note that the index set, J iε , has the calibration of j 0 = 0. Namely, the coarse resolution multi-index starts at the origin 0. This is for mathematical convenience since, asymptotically, the coarse index, j 0 , does not affect the convergence rate.
In the case of super-smooth convolution, there is no hard thresholding and the raw estimates are used in the projection. The resolution level to conduct the projection needs to occur at, for some 0 < ζ < 1. The projection estimator, f S is then given by,

Threshold boundaries for the regular-smooth case
The resolution level thresholds are of the form T i,j = δτ i,j c n decomposed into three terms, • δ: a thresholding or smoothing parameter: δ > 0.
• τ 2 i,j : a variance factor for the wavelet coefficients in direction i and at resolution level j, • c n = ε log ε −1 : an asymptotic scaling parameter.

Convergence results of the hyperbolic wavelet estimators
Consider the two scenarios of regular-smooth and super-smooth separately. In particular, consider the cases when either the entire convolution function is regular-smooth or super-smooth in all dimensions. That is, θ = 0 or θ > 0, so that there isn't a situation where there is regular-smooth behaviour in one dimension but super-smooth behaviour in another direction.

Theorem 1.
Consider a target function f ∈ B s π,r (T d ) where π ≥ 1 and s ≥ 1/π is observed indirectly in model (1). In the case of regular-smooth convolution when θ = 0 set, Estimate f with the hard-threshold estimator, f R in (6) with the resolution truncation choice imposed by (7) and threshold choices specified in Section 3.2. For the L p -risk with 1 ≤ p ≤ 2, there exists a constant C > 0 such that for all where ρ takes the following form, In the super-smooth case when θ > 0, f ∈ B s π,r (T d ) where π ≥ 1, r > 0, and s ≥ 1/π is observed indirectly in model (1). Estimate f with the raw linear projection estimator f S in (9) with coarse resolution multi-index in (8). For the L p -risk with 1 ≤ p < ∞, there exists a constant C > 0 such that for all n ≥ 1, with s * = γ s + 1/p − 1/(p ∧ π). -risk (1 ≤ p ≤ 2). These are typically known as the dense and sparse phase in (11) and (12) respectively. This is a result of the decreasing amount of smoothness present in the underlying function f moving from (11) to (12) and appears in similar structures throughout the literature (see [33,34,28,7,44,29] along with [24] and [25] where the dense and sparse case are treated in separate papers).

Remark 2. The (log 2 ε −1 ) m term in the bound in (10) is conjectured to be not optimal. It is the result of a crude bound used in the proof of Theorem 1 to control the weighted composition of integers for the number of feasible resolution vectors j ∈ J iε that depend implicitly on the degree of ill-posedness vector ν ≥ 0.
To remove this extra logarithmic term, a better approximation for the number of feasible resolution vectors is required. (11) supplement the already known results for multivariate function estimation in the direct setting. For example in [24], the rates in the dense case agree up to a logarithmic term discussed in Remark 1. Similarly, in [4], the maxiset of functions that achieves the convergence rate (ε log ε −1 ) 2γp/(1+2γ) is the union of all Besov balls ∪ s,γs=γ B s p,∞ , with harmonic mean equal to γ. These rate results could be captured as a special case of our results up to the logarithmic factor in (11) with the choice ν = 0. Namely, 2γ ν+1/2 = 1 when ν = 0. In fact, the proof of Theorem 1 can be modified to the case of ν = 0, where the number of feasible vectors j ∈ J iε 0 is much simpler. In that situation with the modified proof, the revised rate is in agreement.

Remark 4.
The rates generalise and extend results already seen in the literature. Indeed, the extension from univariate results to anisotropic results in the direct setting has seen the rates subordinate to the harmonic sum of the smoothness parameter s (see [31] and [4]). This behaviour is exhibited here and the harmonic sum applies in an equivalent type manner for the decay in the convergence rate due to the regular smooth convolution in combination with the well known 'curse of dimensionality'. For example, in univariate regular smooth deconvolution work in [23], the convergence rate in the dense case is given by 2sp/(2s + 2ν + 1) where ν is the univariate degree of ill-posedness parameter. In classical multivariate methods, the curse of dimensionality is exhibited where the rate given by 2γ s p/(2γ s + 1) = 2γ s p/ (2γ s + m). Namely, the dimension appearing in the denominator.

Remark 5.
Consistent with the literature (see [19]), the super-smooth case has a logarithmic convergence rate. The indices in the rate depend on the underlying smoothness s * , an adjusted harmonic sum in s; the dimension m and the harmonic sum in the super-smooth parameter β. As expected, the rate does not depend on the regular-smooth index ν but the rate has an extra log log term. This extra log log term is also a consequence of counting the number of feasible multi-index resolution levels above the coarse resolution j 0 when computing the bias of the estimator.

Conclusions
This paper examined the deconvolution Gaussian white noise model (1) in a multivariate setting. In particular, the focus was on the behaviour of anisotropies in both the target function f after it was distorted by the anisotropic convolution function g. The hyperbolic wavelet basis was chosen since it is already known to be the superior basis for anisotropic function estimation in the direct setting. This paper has focused on the L p -risk and extended findings in [23,4]. In particular, rates are established when the convolution operator is smooth and has anisotropies in the form of a multiplier type behaviour in the Fourier domain for each dimension, (2). Both the regular-smooth and super-smooth results are new and not been seen before to the best of the authors knowledge.
As expected, when the degree of ill-posedness is regular-smooth, the smoothness of the function f appears in the rate of convergence in the form of the harmonic sum in both the numerator and denominator of the rate term. Also as expected, the rate will decay as the degree of ill-posedness parameter increases in each dimension. However, the decay of the rate is not in a linear way but by the harmonic mean of the degree of ill-posedness vector parameter, ν. When the degree of ill-posedness is super-smooth, a logarithmic rate of convergence is attained and this is consistent with the previous minimax results seen in the literature for univariate cases.
It is desirable to obtain results for the case when the multivariate function of interest is dense in some domain variables but sparse in others. This was considered during the manuscript preparation. However, obtaining these results would require a more delicate embedding of the anisotropic Besov spaces across the different variable dimensions that are either dense or sparse. There is some results on anisotropic Besov embeddings across different dimensions in the literature such as Theorem 3 of [5]. However, such embeddings cannot be readily applied in the methodology used in the proofs given in Section 6. A different or more complex proof is required and is left as a possible future research direction.

Proof of Theorem 1
The proof of the result is based on the method used in [26] and modified to incorporate the convolution structure and hyperbolic wavelet basis instead of the wavelet-tensor basis. The noise terms, z ω in (5) are complex Gaussian with Ez ω = 0 and a bound for its variance is sufficient to determine a bound on all its moments and bound on their probabilities. The covariance of the z ω can be computed, where δ x,y is the Kronecker delta function. Therefore z ω are independent and identically distributed complex Gaussian random variables with mean zero and unit variance. Thus, in the case of smooth decay, when j 0 = 0 the variance of the estimated anisotropic wavelet coefficients is of order, where is the Hadamard product. Namely, x y = (x 1 y 1 , x 2 y 2 , . . . , x m y m ) for x, y ∈ R m and 2 x = (2 x1 , 2 x2 , . . . , 2 xm ) ∈ R m . Consider now the regular-smooth case when θ = 0. In this case, from (14) it follows that τ i,j = O 2 j,ν . Denoting Z ∼ N (0, 1) a deviation type bound is constructed, if δ > 4 √ p. Furthermore, consider the number of feasible vectors j ∈ { j, 2ν + 1 = J} where J ∈ Z + . Due to the complexity of the problem with the weights 2ν + 1, consider the feasible j for a less constrained problem. That is, {j : j, 2ν+1 = J} ⊂ {j : j i (2ν i + 1) ≤ J, i = 1, 2, . . . , m}. If the weights adhere to 2ν ≥ 0, then the crude bound can be computed quite easily with, For ease of notation in the expansion of the risk, define the summation sets, λ∈Λn := i∈I j∈J i ε k∈K j and λ∈Λ := i∈I j∈J i k∈K j . Define the weighted measure, The spaces are embedded l q (μ) ⊂ l q,∞ (μ). Consider the truncated measure, where the second last inequality is a result of (16).
Therefore, consider the deterministic bias for the truncated estimator where f ∈ B ξ p,∞ where ξ = (ν + 1/2)(1 − q/p). By the property of Besov spaces, The performance of the regular-smooth hard thresholded estimator takes the decomposition, ≤ Cc 2p n f p p . Exploit again, the unconditional basis property (4) and p inequality for p ≤ 2, where the last line follows from Lemma 2.2 of [26].