Errors of regularisation under range inclusions using variable Hilbert scales

Based on the variable Hilbert scale interpolation inequality, bounds for the error of regularisation methods are derived under range inclusions. In this context, new formulae for the modulus of continuity of the inverse of bounded operators with non-closed range are given. Even though this approach can be shown to be equivalent to the version used previously in the literature, the new formulae and the corresponding conditions are simpler than the former ones. Several examples from image processing and spectral enhancement illustrate how the new error bounds can be applied.


Introduction
Let X and Y be infinite dimensional separable Hilbert spaces with norms ‖·‖ and scalar products (·, ·). We study linear inverse problems in the form of ill-posed operator equations

(1) Af = g, f ∈ X, g ∈ Y,

characterised by an injective bounded linear forward operator A : X → Y whose range range(A) is a non-closed subset of Y. Then equation (1) is unstable in the sense that the inverse operator A^{-1} : range(A) ⊆ Y → X is unbounded, and hence the use of perturbed data g^δ instead of the exact right-hand side g, with the noise model

(2) ‖g^δ − g‖ ≤ δ

and noise level δ > 0, may lead to arbitrarily large errors in the solution of (1) even if the noise level is extremely small. As a consequence of this ill-posedness phenomenon, regularisation methods are required for the stable approximate solution of the inverse problem. Their basic idea consists in finding approximations to the exact solution f in the form of solutions f_α = f_α(g^δ) to stable auxiliary problems neighbouring (1). Those solutions are obtained by using the noisy data g^δ. The degree of neighbourhood of the exploited auxiliary problems is controlled by a regularisation parameter α > 0. In this context, small values of α express closeness to (1) in combination with a low level of stability, whereas larger values of α ensure better stability, however combined with a low level of approximation. For the success of any regularisation method an appropriate trade-off between stability and approximation has to be achieved when choosing the regularisation parameter.
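To illustrate this trade-off numerically, the following self-contained Python sketch (not part of the formal development; the discretised integration operator and all parameter values are merely illustrative assumptions) perturbs the data of a smoothing forward operator and compares naive inversion with Tikhonov regularisation f_α = (A*A + αI)^{-1} A* g^δ:

```python
import numpy as np

# Hypothetical discretisation of a smoothing (ill-posed) forward operator:
# the integration operator (Af)(s) = int_0^s f(t) dt on [0, 1].
n = 100
h = 1.0 / n
A = h * np.tril(np.ones((n, n)))             # lower-triangular quadrature matrix

rng = np.random.default_rng(0)
t = np.linspace(h, 1.0, n)
f_true = np.sin(np.pi * t)                   # exact solution
g = A @ f_true
g_delta = g + 1e-3 * rng.standard_normal(n)  # noisy data

# Naive inversion amplifies the data noise ...
f_naive = np.linalg.solve(A, g_delta)

# ... while Tikhonov regularisation f_alpha = (A^T A + alpha I)^{-1} A^T g_delta
# trades approximation against stability via the parameter alpha > 0.
def tikhonov(A, g, alpha):
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ g)

f_reg = tikhonov(A, g_delta, alpha=1e-4)

err_naive = np.linalg.norm(f_naive - f_true) / np.linalg.norm(f_true)
err_reg = np.linalg.norm(f_reg - f_true) / np.linalg.norm(f_true)
assert err_reg < err_naive   # regularisation reduces the reconstruction error
```

The naive reconstruction is visibly corrupted by amplified noise, while a moderately chosen α keeps the error small; choosing α too large would instead over-smooth the solution.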
As already outlined and summarised in the monograph [9] by Engl, Hanke and Neubauer, a successful way of doing regularisation for linear ill-posed problems in Hilbert spaces, including convergence and convergence rates of constructed methods, requires some knowledge on the impact of smoothness on the regularised solutions. Smoothness should be understood there in a very generalised sense as both solution smoothness and smoothing properties of the forward operator. In [9] such smoothness fitting is focused on Hölder type source conditions yielding Hölder type convergence rates when the regularisation method has a sufficiently high level of qualification. This theory is closely connected with associated (classical) Hilbert scales, where we refer to the seminal paper by Natterer [27]. An extension of that theory to generalised source conditions, implying also more general convergence rates, was performed rather independently by two different approaches. The first approach, initiated and established by Hegland (see [11,12]), introduced variable Hilbert scales with positive index functions the behaviour of which is in particular of interest for large arguments covering the spectrum of an injective unbounded linear operator with bounded inverse. Results in this approach are based on interpolation inequalities. An alternative second approach was developed and published by Mathé and Pereverzev (see [24,25,26]) and complemented by ideas of Hofmann and other co-workers (see, e.g., [15,16,17]). This approach, in principle, also exploits variable Hilbert scales, but the index functions occurring there are more specific and their behaviour is of interest just for small positive arguments covering the spectrum of A*A. The index functions of the second approach are monotonically increasing and tend to zero as the positive arguments tend to zero.
Here we call them rate functions, denoted by over-lined small Greek letters, because they also express the convergence rate of approximate solutions. One of the main goals of this paper is to compare both approaches, their results, the required conditions and their natural interplay. Moreover, further consequences and new convergence rate results of Hegland's approach shall be formulated and proven in the sequel.
In our study we use variable Hilbert scales and corresponding interpolation inequalities in order to obtain upper bounds for the error ‖f − f_α‖ of regularised solutions under conditions imposed on f. As is well known, the convergence of regularised solutions f_α → f in X, even if δ → +0 and α = α(δ, g^δ) → +0 is chosen in an appropriate manner, can be arbitrarily slow for solutions f which are in some sense non-smooth with respect to the forward operator A. It is of essential interest in regularisation theory to obtain convergence rates with appropriate rate functions η̄. For a non-closed range of A, convergence rates require that general source conditions are satisfied, which attain in the standard case the form

(4) f = ψ̄(A*A) v

with source element v ∈ X and with some rate function ψ̄ defined at least on the interval (0, ‖A‖²] covering the spectrum of A*A. Here, ψ̄(A*A) is well-defined as an injective bounded positive self-adjoint linear operator by spectral calculus (see, e.g., [9, Section 2.3]). If the regularisation method has a sufficiently high qualification (see, e.g., [15, Definition 2.6]), then an asymptotically fast decay of ψ̄(t) → 0 as t → +0 corresponds with high order convergence rates (3) of regularised solutions and vice versa. Note that the tool of general source conditions with rate functions ψ̄ different from monomials was early applied to regularisation theory by Tautenhahn in [33]. Later Hohage (see [19,20]) studied in detail the case of logarithmic functions ψ̄ in (4) and corresponding convergence rates. Using a non-standard source condition

(5) f = Gw

with source element w ∈ X, the a priori information about the smoothness of the solution f can be expressed by an injective bounded self-adjoint positive definite linear operator G : X → X, where range(G) is a non-closed subset of X. In most cases the character of the operator G is rather independent of the forward operator A and hence G need not be a function of A*A.
However, in order to draw conclusions concerning convergence rates of regularised solutions, the assumption f ∈ range(G) of (5) and the assumption f ∈ range(ψ̄(A*A)) of (4) have to be connected anyway. In the framework of Böttcher et al. [6], conditions for such connections and their interplay are discussed in a comprehensive manner. In [15] they are called link conditions. Besides the simplest link type G = φ̄(A*A), which is typical for commuting G and A*A, one of the most convincing classes of link conditions is represented by range inclusions, introduced with Yamamoto in [18] to regularisation theory. The favourite form of such an inclusion is

(6) range(G) ⊆ range(ψ̄(A*A))

with some rate function ψ̄. Evidently, under (5) a range inclusion (6) immediately implies a source condition (4). The higher the rate expressed by ψ̄ is, i.e. the faster the decay ψ̄(t) → 0 as t → +0 goes on, the smaller the set range(ψ̄(A*A)) becomes. Hence the condition (6) is a strong one for higher rates ψ̄ and vice versa. An alternative link condition is

(7) ‖̺(G) x‖ ≤ C ‖A x‖ for all x ∈ X

for some constant C > 0, where the rate function ̺ acts as a benchmark for the degree of ill-posedness of equation (1) with respect to the a priori information (5). From Proposition 2.1 in [6] we know that a range inclusion range(G₁) ⊆ range(G₂) and a condition of the form ‖G₁ x‖ ≤ C ‖G₂ x‖ for all x ∈ X and some C > 0 are equivalent. Consequently, with Proposition 2.18 in [9] the condition (7) is equivalent to the range inclusion

(8) range(̺(G)) ⊆ range((A*A)^{1/2}).

As exploited in [6, § 4], one can reduce (8) to the form (6) whenever ψ̄(t) := ̺^{-1}(√t) is an operator monotone function (cf. [5]). An important special case of that implication, namely for ̺(t) = t^{1/(2µ)} with 0 < µ ≤ 1/2, is well-known as the Heinz-Kato inequality (see, e.g., [6, Proposition 8.21] or the corollary of Theorem 2.3.3 in [32, p. 45]). In that special case, (8) yields (6) with ψ̄(t) = t^µ for exponents 0 < µ ≤ 1/2.
In the next section we review the definition and some properties of index functions and variable Hilbert scales. The fundamental interpolation inequality is given with a short proof together with an application to a general regularisation method. We then show how the variable Hilbert scales provide natural source conditions. In the third section bounds for the modulus of continuity are given in a variable Hilbert scale setting. An important part of this section compares the new bounds on the modulus of continuity with some obtained earlier and shows how the new results have a substantially simpler structure. The fourth section analyses linear regularisation methods and parameter choices using the variable Hilbert scale approach. In section 5 we consider several examples from image processing and spectral enhancement and the paper finishes with some final remarks.

Interpolation inequalities and consequences
The main tool used here to derive error bounds for regularised solutions is an extension of interpolation inequalities to variable Hilbert scales. For classical Hilbert scales {X_r}_{r∈R} - with real numbers as scale index r - interpolation inequalities are well-established. These interpolation inequalities were initially applied to the treatment of linear ill-posed problems (1) by Natterer in [27] (see also the monograph by Engl, Hanke and Neubauer [9]). For variable Hilbert scales, new interpolation inequalities have to be formulated. Here the scale index r is replaced by a wide class of index functions defined as follows.

Definition 2.1. We call a positive continuous function θ : (0, ∞) → (0, ∞) an index function, and we denote the set of all index functions by I. We call an index function θ̄ ∈ I a rate function if it is monotonically increasing and if it satisfies the limit condition lim_{t→+0} θ̄(t) = 0.
Note that any monotonically increasing continuous function θ(t) defined on a finite interval (0, t₀] satisfying lim_{t→+0} θ(t) = 0 can be extended to a rate function θ̄ such that θ̄(t) = θ(t) for t ∈ (0, t₀]. Furthermore, the index functions corresponding to the classical Hilbert scales X_r can be seen to be the power functions θ(λ) = λ^r for real r. Rate functions are obtained for this case if r > 0.
The set of index functions I includes the positive constant functions and all power functions but not the zero function. We denote the pointwise operations by φ + ψ, φψ and φ/ψ, respectively. As usual, multiplication by a constant γ is denoted by γφ. The composition is denoted by φ ∘ ψ where (φ ∘ ψ)(λ) = φ(ψ(λ)). The pointwise maximum of two index functions is φ ∨ ψ defined by (φ ∨ ψ)(λ) = max(φ(λ), ψ(λ)) and the pointwise minimum is φ ∧ ψ. One verifies that the index set I from Definition 2.1 is closed under
• pointwise addition, multiplication and division,
• multiplication with positive constants,
• pointwise maximum and minimum and
• composition.
If an index function is injective and surjective, the inverse denoted by φ −1 is also an index function. Not every index function is invertible, however. The reciprocal function of φ (with values 1/φ(λ)) is denoted by 1/φ.
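The closure and inversion properties above can be mirrored in a small Python sketch (purely illustrative; the concrete functions chosen are hypothetical examples of index functions):

```python
# Index functions represented as positive continuous callables on (0, infinity);
# the operations below correspond to the closure properties listed in the text.
phi = lambda lam: lam ** 0.5            # power function, a rate function
psi = lambda lam: 1.0 + lam             # index function, but not a rate function

add = lambda lam: phi(lam) + psi(lam)   # pointwise addition
mul = lambda lam: phi(lam) * psi(lam)   # pointwise multiplication
div = lambda lam: phi(lam) / psi(lam)   # pointwise division
comp = lambda lam: phi(psi(lam))        # composition (phi o psi)(lambda)
vee = lambda lam: max(phi(lam), psi(lam))    # pointwise maximum phi v psi
wedge = lambda lam: min(phi(lam), psi(lam))  # pointwise minimum phi ^ psi
recip = lambda lam: 1.0 / phi(lam)      # reciprocal 1/phi, again an index function

# Positivity (membership in I) is preserved across several scales of lambda:
for lam in (1e-6, 1.0, 1e6):
    for f in (add, mul, div, comp, vee, wedge, recip):
        assert f(lam) > 0.0
```

Note that `phi` above is strictly increasing and bijective on (0, ∞), so its inverse (here λ ↦ λ²) is again an index function, whereas `psi` is not surjective onto (0, ∞) and hence not invertible within I.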
The variable Hilbert scales are then families of Hilbert spaces indexed by I.
Definition 2.2. For a given injective self-adjoint positive definite linear operator T densely defined on a Hilbert space X we define the variable Hilbert scale {X_θ}_{θ∈I} as a family of Hilbert spaces X_θ indexed by functions θ from the set I of index functions in the sense of Definition 2.1. Every Hilbert space X_θ with θ ∈ I is then the closure of the domain of the quadratic form

f ↦ ∫₀^∞ θ(λ) d(E(λ)f, f)

with respect to the norm ‖f‖_θ := (∫₀^∞ θ(λ) d‖E(λ)f‖²)^{1/2}, where E(λ) denotes the spectral family generated by T.

In such a way, variable Hilbert scales were introduced by Hegland in [11] for the special case of T being the inverse of a compact operator and in [12] for more general T. The choice of the operator T determines the Hilbert scale. In the simplest case, where both T and its inverse T^{-1} are bounded, all the Hilbert spaces X_θ are isomorphic to X because of the continuity of the index functions θ. In this paper, in the context of ill-posed problems (1), we often assume that T is unbounded but has a bounded inverse, i.e. the spectrum of T is contained in the interval [‖T^{-1}‖^{-1}, ∞) and has +∞ as an accumulation point. As the function 1/λ is an index function and the set of index functions is closed under composition, the inverse T^{-1} generates the same Hilbert scale as T. It is thus not necessary to consider variable Hilbert scales generated by T and by its inverse T^{-1} separately. The more general case where both T and the inverse T^{-1} are unbounded is only considered for the negative Laplacian T = −∆ and in particular T = −d²/dt². For the more general case where also A is unbounded we refer to the recent paper [17]. To get a link with (1), a particular T is suggested either by the forward operator A, by the operator G of condition (6), or based on a combination of both. A common choice is T = (A*A)^{-1} for injective operators A with a non-closed range. It follows that A*A = θ(T) if θ(λ) = 1/λ. For classes of problems connected with deconvolution, however, T = −d²/dx² on L²(R) is the canonical choice as T is the generator of symmetric convolutions.
An index function θ such that A * A = θ(T ) is then found using Fourier transforms. More generally, for problems where the source conditions relate to smoothness, T = −∆ can be chosen. In such a case ∆ denotes the Laplacian on L 2 (Ω) for some domain Ω ⊆ R d .
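In a finite-dimensional setting the spectral calculus reduces to the singular value decomposition, so the scale norms can be evaluated directly. The following Python sketch (an illustrative discretisation with an assumed random matrix, not the paper's construction) computes ‖f‖_θ for the common choice T = (A*A)^{-1} and checks that θ(λ) = 1/λ reproduces ‖Af‖:

```python
import numpy as np

def scale_norm(A, f, theta):
    """Variable-Hilbert-scale norm ||f||_theta for T = (A^T A)^{-1},
    evaluated via the SVD A = U diag(sigma) V^T (finite-dimensional sketch)."""
    _, sigma, Vt = np.linalg.svd(A, full_matrices=False)
    lam = 1.0 / sigma**2                  # eigenvalues of T
    coeff = Vt @ f                        # coefficients of f in the eigenbasis of T
    return np.sqrt(np.sum(theta(lam) * coeff**2))

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 20)) / 30.0  # hypothetical full-rank forward operator
f = rng.standard_normal(20)

# theta(lam) = 1/lam reproduces A^T A, hence ||f||_theta = ||A f||:
norm_theta = scale_norm(A, f, lambda lam: 1.0 / lam)
assert np.isclose(norm_theta, np.linalg.norm(A @ f))

# theta = 1 reproduces the norm of X itself:
assert np.isclose(scale_norm(A, f, lambda lam: np.ones_like(lam)),
                  np.linalg.norm(f))
```

Larger index functions θ weight the small singular values (large λ) more heavily, so ‖·‖_θ then measures a stronger, smoothness-type norm.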
It was shown in [12] that there exists a continuous embedding X_φ ↪ X_ψ if and only if ψ ≤ γφ for some constant γ > 0. If two different index functions θ₁ and θ₂ are identical on the spectrum of T, they define the same norms and hence the same space X_{θ₁} = X_{θ₂}. If they differ on the spectrum they do define different norms; however, these norms may be equivalent, and then the Hilbert spaces X_{θ₁} and X_{θ₂} as elements of the variable Hilbert scale {X_θ}_{θ∈I} are indistinguishable.
The most important connection between the norms of different spaces X θ is the interpolation inequality for variable Hilbert scales.

Lemma 2.3 (Interpolation inequality).
Let T be an unbounded injective self-adjoint positive definite linear operator densely defined on the Hilbert space X with bounded inverse T^{-1} : X → X. Moreover let φ, ψ, θ and Ψ be index functions such that Ψ is concave and

(10) φ(λ) ≤ Ψ(ψ(λ)) for all λ in the spectrum of T.

Then for any element 0 ≠ f ∈ X_θ ∩ X_{ψθ} one gets f ∈ X_{φθ} and

(11) ‖f‖²_{φθ} ≤ ‖f‖²_θ Ψ(‖f‖²_{ψθ} / ‖f‖²_θ).

Proof. Let in the following the measure ν be defined by

dν(λ) := θ(λ) d‖E(λ)f‖² / ‖f‖²_θ,

where E(λ) is the spectral family or resolution of the identity defined by T. By definition, because θ is positive and the integral of dν equals 1, ν is a probability measure. Taking into account that f ∈ X_{ψθ} and that Ψ is concave, we obtain from Jensen's inequality in its form for concave functions that

∫₀^∞ Ψ(ψ(λ)) dν(λ) ≤ Ψ(∫₀^∞ ψ(λ) dν(λ)) = Ψ(‖f‖²_{ψθ} / ‖f‖²_θ) < ∞.

Because integration is monotone and the inequality (10) holds, one gets ∫₀^∞ φ(λ) dν(λ) ≤ ∫₀^∞ Ψ(ψ(λ)) dν(λ) < ∞ and hence f ∈ X_{φθ}. Summarising the results we arrive at the inequality

∫₀^∞ φ(λ) dν(λ) ≤ Ψ(∫₀^∞ ψ(λ) dν(λ)),

which provides us with the required inequality (11) by replacing dν(λ) by its definition.
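The inequality (11) can be checked numerically for a diagonal model of T. In the following Python sketch (illustrative only) we take θ ≡ 1, ψ(λ) = λ², φ(λ) = λ and the concave function Ψ(s) = √s, so that condition (10) holds with equality:

```python
import numpy as np

rng = np.random.default_rng(2)
lam = rng.uniform(1.0, 100.0, size=50)    # spectrum of a diagonal model for T
f = rng.standard_normal(50)               # coefficients of f in the eigenbasis

def qnorm2(weight):                       # squared scale norm ||f||_weight^2
    return np.sum(weight(lam) * f**2)

# Choose Psi concave with phi = Psi o psi, as required by condition (10):
Psi = np.sqrt                             # Psi(s) = s^(1/2), concave
psi = lambda l: l**2
phi = lambda l: l                         # phi(l) = Psi(psi(l))
theta = lambda l: np.ones_like(l)

lhs = qnorm2(lambda l: phi(l) * theta(l))
rhs = qnorm2(theta) * Psi(qnorm2(lambda l: psi(l) * theta(l)) / qnorm2(theta))
assert lhs <= rhs + 1e-12                 # interpolation inequality (11)
```

For this particular choice the inequality reduces to the Cauchy-Schwarz inequality; other concave Ψ with φ ≤ Ψ ∘ ψ can be substituted in the same way.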
The concavity of Ψ is the key property which enables us to use Jensen's inequality. Lemma 2.5 below shows that this property has only to be established for large arguments. We can focus on large arguments if the spectrum of the operator T under consideration contains only sufficiently large values and has +∞ as an accumulation point. We need the following auxiliary result.

Lemma 2.4. If a function θ : [t₀, ∞) → (0, ∞) is concave for some t₀ > 0, then θ is monotonically increasing. If moreover lim_{t→∞} θ(t) = ∞, then θ is even strictly increasing.
Proof. We show the contraposition. Assume that θ : [t₀, ∞) → (0, ∞) is not monotonically increasing. Then there exist t₀ ≤ t₁ < t₂ such that θ(t₁) > θ(t₂). If θ were concave, then for every t > t₂ the graph of θ would lie on or below the line through (t₁, θ(t₁)) and (t₂, θ(t₂)). By rearranging this inequality one gets

θ(t) ≤ θ(t₁) + (θ(t₂) − θ(t₁)) (t − t₁)/(t₂ − t₁) → −∞ as t → ∞,

which contradicts the positivity of θ, and so θ is not concave. The strict monotonicity for lim_{t→∞} θ(t) = ∞ follows immediately from the fact that the hypograph of a concave function is a convex set.

Now we can replace index functions which are concave for large arguments by such which are globally concave in the following way:

Lemma 2.5. Let θ(λ) be an index function which is concave and hence by Lemma 2.4 increasing on the interval 0 < λ₀ ≤ λ < ∞. Then there exists an index function Ψ(λ) which is concave for all 0 < λ < ∞ such that with some λ₁ > λ₀ one has Ψ(λ) = θ(λ) for all λ ≥ λ₁.

Proof. To obtain the assertion of this lemma we consider the set of real numbers {α | αλ ≥ θ(λ), λ₀ ≤ λ < ∞}. As θ(λ) is concave for λ₀ ≤ λ < ∞, this set is not empty and it is bounded below by zero. Thus it does have a greatest lower bound α₀ ≥ 0, and there is a λ₁ such that α₀λ₁ = θ(λ₁); if not, α₀ would not be the greatest lower bound.
Hence, knowing from Lemma 2.4 that an index function Ψ which is concave for all 0 < λ < ∞ is always increasing, the function Ψ(λ) can be composed of a linear function growing from zero to θ(λ₁) on the interval (0, λ₁] and coinciding with θ for larger arguments.

The interpolation inequality is the main tool to obtain error bounds for solvers of linear ill-posed problems. However, taking Lemma 2.5 into account, it becomes clear by inspection that rate results derived from Lemma 2.3 are only based on the behaviour of Ψ(λ) for large λ ≥ λ₁. Without loss of generality Ψ can be amended for small arguments by the linear function Ψ(λ) = Ψ(λ₁)λ/λ₁ for 0 < λ < λ₁.
Typical choices for Ψ(λ), concave at least for sufficiently large λ, are the power functions Ψ(λ) = λ^κ with 0 < κ < 1 and the logarithmic function Ψ(λ) = log(λ). For these choices we have the limit condition

(12) lim_{λ→∞} Ψ(λ)/λ = 0,

and one gets corresponding versions of interpolation inequalities from Lemma 2.3. Asymptotically, i.e. for ‖f‖_θ → 0, the interpolation inequality allows us to find error bounds in the application to the error estimation for the solution of equation (1). One aims to get bounds for the norm ‖f‖ in X using values of the image norm ‖Af‖ in Y and values of the norm ‖f‖_{ψθ}, which expresses the specific additional smoothness of f. The terms in the interpolation inequality (11) are then identified via ‖f‖_{φθ} = ‖f‖ and ‖f‖_θ = ‖Af‖. The first condition leads to φ(λ)θ(λ) = 1 for all λ, and the second condition gives θ(T) = A*A and, with θ(λ) := 1/λ, the relations T = (A*A)^{-1} and φ(λ) = λ. We are still free to choose the index function ψ and do it in the form ψ(λ) := χ(λ)λ with an appropriate index function χ.
For later use we add here some observations about concave functions which are stated as a lemma.

Lemma 2.6. Let Ψ : (0, ∞) → (0, ∞) be a concave function. Then we have the following properties: (a) The function Ξ : (0, ∞) → (0, ∞) defined by Ξ(s) := s Ψ(1/s) is concave and hence monotonically increasing.

Remark 1.
We note here that the transformation S : Ψ ∈ I → Φ ∈ I according to formula (14), applicable to every index function and preserving concavity, is an involution, that means S^{-1} = S and hence S is bijective. If the concave index function Ψ satisfies lim_{λ→∞} Ψ(λ) = ∞, then by Lemma 2.4 the function Ψ is even strictly increasing, and if, in addition, Ψ is a rate function, i.e., it satisfies the additional limit condition lim_{λ→+0} Ψ(λ) = 0 (which is also motivated by Lemma 2.5), the inverse function Ψ^{-1} is a well-defined and convex index function. If, on the other hand, the limit condition (12) holds, then we have lim_{t→+0} Φ(t) = 0, and taking into account Lemma 2.6 (a) and (b) one sees that Φ = S(Ψ) is a concave rate function. Vice versa we have that Ψ = S(Φ) satisfies (12) whenever Φ is a rate function. By inspection of the proof of Lemma 2.6 one can also see the following facts: the involution S also preserves the convexity of an index function, and if the concavity or convexity is strict, then the strictness carries over to the transformed function.

Now we are ready to draw conclusions from Lemma 2.3. A first, abstract version of bounds for errors of regularised solutions is given in the following corollary. We will denote by f_α an approximation of a solution f to equation (1) which is computed from an approximate right-hand side using a regularisation method and a regularisation parameter α > 0.
Corollary 2.7. Let A : X → Y be an injective bounded linear operator with non-closed range mapping between the two Hilbert spaces X and Y. Furthermore let the variable Hilbert scale {X_ν}_{ν∈I} be generated by T = (A*A)^{-1} such that any scale element X_ν has a norm denoted by ‖·‖_ν. Moreover let χ and Ψ be index functions and Ψ be concave such that

(15) λ ≤ Ψ(λ χ(λ)) for all 0 < λ < ∞.

If the solution f to (1) in addition satisfies the condition f ∈ X_χ and if f_α ∈ X_χ is such that

(16) ‖f_α − f‖_χ ≤ ζ and (17) ‖A(f_α − f)‖ ≤ ε,

then

(18) ‖f_α − f‖ ≤ ε √(Ψ(ζ²/ε²)).

Proof. We apply Lemma 2.3 with φ(λ) := λ, θ(λ) := 1/λ and ψ(λ) := λχ(λ), for which φθ = 1, ‖h‖_θ = ‖Ah‖ and ‖h‖_{ψθ} = ‖h‖_χ hold, and where (15) yields the condition (10). Setting h := f_α − f this yields the estimate

‖f_α − f‖² ≤ ‖A(f_α − f)‖² Ψ(‖f_α − f‖²_χ / ‖A(f_α − f)‖²) ≤ ε² Ψ(ζ²/ε²),

where the last step uses the monotonicity of the bound in both arguments (cf. Lemma 2.4 and Lemma 2.6 (a)), and proves the assertion of the corollary.
Results similar to those of Corollary 2.7 can be found for other choices of T; see for example Corollary 5.3, where T = −d²/dt². Corollary 2.7 can be interpreted as an instance of an abstract Lax theorem [1], where the condition (16) is a stability condition and the bound (17) relates to consistency.
Note that the error estimate (18) of Corollary 2.7 requires the essential conditions f ∈ X_χ and f_α ∈ X_χ, i.e. the approximate solutions f_α are constructed such that they attain the same smoothness level with respect to T as the exact solution f. A next step for drawing conclusions from Lemma 2.3 will be formulated in Corollary 2.8 by assuming that f belongs to a ball in X_χ with positive radius R = R₁ and that the approximate solutions f_α for all α > 0 under consideration belong to another such ball with radius R = R₂. Moreover, we consider for data g^δ satisfying (2) the limit process δ → +0 in correspondence with associated regularised solutions f_α, where the regularisation parameter α > 0 is chosen either a priori as α = α(δ) or a posteriori as α = α(δ, g^δ).
Proof. Since Ψ is concave, by Lemma 2.4 the error bound ε √(Ψ(ζ²/ε²)) obtained from (18) is increasing in ζ > 0 for fixed ε > 0, and as a consequence of Lemma 2.6 (a) this upper bound is increasing in ε > 0 for fixed ζ > 0. Moreover, due to (12), in the limit process ε → 0 for fixed ζ > 0, implying ζ/ε → ∞, this bound and hence the error norm in X even tends to zero. By the mentioned kinds of monotonicity we obtain formula (22) from ε ≤ C̄ ξ̄(δ), and the right-hand side of (22) declines to zero as δ → 0 because ξ̄ is a rate function.
Remark 2. As a special case of the situation of Corollary 2.8 we can consider an a posteriori choice α_dis = α_dis(δ, g^δ) for the regularisation parameter realised by the discrepancy principle

(23) ‖A f_{α_dis} − g^δ‖ = C_dis δ

with some prescribed C_dis > 0. Then by using the triangle inequality we obtain with (2) as noise model

‖A(f_{α_dis} − f)‖ ≤ ‖A f_{α_dis} − g^δ‖ + ‖g^δ − g‖ ≤ (C_dis + 1) δ.

Then for such α = α_dis under (12) the regularisation method converges strongly in X with the convergence rate (24) for some constant K̄ > 0. Note that besides the assumption f ∈ B_χ(R₁) on the solution smoothness, for that result the strong condition f_{α(δ,g^δ)} ∈ B_χ(R₂) for all δ ∈ (0, δ_max] and all associated g^δ satisfying (2) is required.
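A minimal computational sketch of such an a posteriori choice follows (the model problem, the constants and the geometric search over α are illustrative assumptions; the search realises the discrepancy condition as an inequality rather than an exact equation):

```python
import numpy as np

def tikhonov(A, g, alpha):
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ g)

def discrepancy_choice(A, g_delta, delta, C_dis=1.5, alpha0=1.0, q=0.5):
    """A posteriori choice: shrink alpha geometrically until the residual
    ||A f_alpha - g_delta|| falls below C_dis * delta (sketch)."""
    alpha = alpha0
    while np.linalg.norm(A @ tikhonov(A, g_delta, alpha) - g_delta) > C_dis * delta:
        alpha *= q
    return alpha

# Model problem: discretised integration operator with noisy data.
n = 100
A = (1.0 / n) * np.tril(np.ones((n, n)))
t = np.linspace(1.0 / n, 1.0, n)
f_true = np.sin(np.pi * t)
rng = np.random.default_rng(3)
noise = rng.standard_normal(n)
delta = 1e-3
g_delta = A @ f_true + delta * noise / np.linalg.norm(noise)  # ||noise|| = delta

alpha = discrepancy_choice(A, g_delta, delta)
f_alpha = tikhonov(A, g_delta, alpha)
assert np.linalg.norm(A @ f_alpha - g_delta) <= 1.5 * delta   # discrepancy met
```

The residual of Tikhonov regularisation decreases monotonically as α decreases, so the geometric search terminates; stopping at the first α meeting the bound avoids fitting the noise.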
The convergence rate in (24) depends only on the asymptotic behaviour of Ψ(λ) as λ → ∞. Thus the alteration of Ψ(λ) for small λ in the sense of Lemma 2.5 has no influence on that rate. For the class of functions Ψ(λ) = λ^κ with 0 < κ < 1, rate functions proportional to δ^{1−κ} occur in (24). All those error rates are lower than the rate δ which is typical for well-posed problems. It should be mentioned that Ψ(λ) = λ fails to satisfy the condition (12), and when used in Corollary 2.7 the inequality (19) does not yield a convergence rate.
When setting φ(λ) := λ, θ(λ) := 1/λ and ψ(λ) := Ψ^{-1}(λ) in the interpolation inequality (11), the corresponding regularity condition f ∈ X_{ψθ} is equivalent to a source condition (4), which expresses the specific smoothness of the solution f with respect to the forward operator A of equation (1).

Proposition 2.9. Let Ψ(λ), for 0 < λ < ∞, be a concave and strictly increasing index function satisfying the limit conditions lim_{λ→+0} Ψ(λ) = 0 and lim_{λ→∞} Ψ(λ) = ∞ as well as (12). Then the condition f ∈ X_{ψθ} with (ψθ)(λ) = Ψ^{-1}(λ)/λ is equivalent to the source condition (4) with

(25) ψ̄(t) := 1/√(t Ψ^{-1}(1/t)), 0 < t < ∞,

which is then a rate function.
Proof. Under the stated assumptions the function ψ̄ is well-defined and a rate function. Namely, we can write

ψ̄(t) = 1/√(t Ψ^{-1}(1/t)) = √(Ψ(u)/u)

when using the substitution u := Ψ^{-1}(1/t). The variable u > 0 is strictly decreasing with respect to t > 0 such that u → ∞ corresponds with t → +0 and vice versa t → ∞ corresponds with u → +0, because Ψ^{-1} is also strictly increasing, and by (12) we have ψ̄(t) = √(Ψ(u)/u) → 0 as t → +0, so that ψ̄ is indeed a rate function. Moreover, for the index function χ(λ) := Ψ^{-1}(λ)/λ we have the identity χ(λ) = 1/ψ̄(1/λ)², and hence ‖f‖²_χ = ∫₀^∞ ψ̄(1/λ)^{-2} d‖E(λ)f‖². One has f ∈ X_χ if and only if this integral is finite, which holds if and only if f ∈ range(ψ̄(A*A)), and the claim follows. This proves the proposition.
After the millennium, Mathé and Pereverzev with coauthors seized, reused and extended Hegland's ideas and concepts of variable Hilbert scales and corresponding interpolation inequalities from [11,12] for linear ill-posed problems and their regularisation (cf. [24,25]), and combined them (cf. [15,26]) with the concept of approximate source conditions (cf. [8,14]). The comprehensive theory developed therein considers only rate functions as index functions. Such an approach leads in general to different formulae compared with the results based on the concept of Lemma 2.3, but, as we will outline in the sequel, clear cross-connections and sometimes even equivalences characterise the assertions obtained along the two different ways.

Modulus of continuity of A −1
The modulus of continuity of A^{-1} restricted to the set AM with M ⊆ X is

(27) ω(M, δ) := sup{‖f‖ : f ∈ M, ‖Af‖ ≤ δ}.

The impact of the modulus of continuity on error bounds in regularisation has recently been discussed in the paper [16, §4]. It is well-known that the worst case error of identifying f from noisy data under the a priori information f ∈ M can be expressed by the modulus of continuity. This result can be derived from Corollary 3.7 and Theorem 2.1(c) in [16] (see also Theorem 1 in the earlier paper [25]). A similar assertion was already mentioned in a rudimentary form in a paper by Ivanov and Korolyuk in 1969 [21]. The following proposition also yields an upper bound for the modulus of continuity based on a variable Hilbert scale interpolation inequality using Lemma 2.3 or Corollary 2.7. For the proof we use Lemma 2.6 (b).
We note that for centrally symmetric and convex sets M, f ∈ M, and regularised solutions f_{α_dis} ∈ M obtained from the discrepancy principle of form (23) mentioned in Remark 2, we easily derive along the lines of [16, Lemma 2.2] that

(31) ‖f − f_{α_dis}‖ ≤ ω(2M, (C_dis + 1)δ)

with 2M := {u ∈ X : u = 2v, v ∈ M}. In the case M = B_χ(R) with 2M = B_χ(2R), the estimate (31) yields with (30) a convergence rate of the form (24) with constant K = 4R²/(C_dis + 1)². With more generality, such rates were verified above directly from Corollary 2.7. Under weak additional assumptions (see [16, Corollary 3.7]) there is also a constant C > 0 such that the modulus of continuity is bounded from below by a corresponding expression; then a convergence rate of the form (24) is order optimal for all C > 0. On the other hand, Corollary 2.8 yields an error estimate of best order just for ξ̄(δ) ∼ δ; hence the discrepancy principle is order optimal in that sense.
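For a diagonal model one can explore the modulus of continuity (27) numerically. The sketch below (illustrative assumptions throughout: exponentially decaying singular values and a hypothetical logarithmic index function χ) evaluates a crude lower bound obtained from single coordinate directions:

```python
import numpy as np

# Crude lower bound for the modulus of continuity (27) with M = B_chi(R):
# omega(B_chi(R), delta) >= max over coordinate directions e_i of the largest
# c > 0 with ||A(c e_i)|| <= delta and ||c e_i||_chi <= R (sketch, not sharp).
sigma = np.exp(-np.arange(1, 31))          # singular values of a diagonal A
lam = 1.0 / sigma**2                       # eigenvalues of T = (A^T A)^{-1}
chi = lambda l: np.log(1.0 + l)            # hypothetical smoothness index function
R = 1.0                                    # radius of the ball B_chi(R)

def omega_lower(delta):
    # For f = c e_i both constraints are explicit: c sigma_i <= delta
    # and c^2 chi(lam_i) <= R^2, so c = min(delta/sigma_i, R/sqrt(chi(lam_i))).
    c = np.minimum(delta / sigma, R / np.sqrt(chi(lam)))
    return np.max(c)

# The bound decreases monotonically with the noise level delta:
deltas = [1e-2, 1e-4, 1e-6]
vals = [omega_lower(d) for d in deltas]
assert vals[0] >= vals[1] >= vals[2]
```

The slow (here logarithmic) decay of the bound as δ → 0 reflects the severe ill-posedness caused by the exponentially decaying singular values.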
Evidently, under the assumptions of Proposition 2.9 and with the ball B_χ(R) of radius R in X_χ from (20), a solution satisfying the source condition (4) with ‖v‖ ≤ R is an element of the Hilbert scale generated by T = (A*A)^{-1} expressed through the index function χ. We emphasise that the upper bound in (30) for the modulus of continuity from Proposition 3.1, needing only one function Ψ, has a much simpler structure than the nested upper bound in (28) composing the functions ψ̄ and Θ^{-1}. Also the required concavity of Ψ for obtaining (30) looks much simpler than the needed concavity of the composite function ψ̄²((Θ²)^{-1}(t)) ≡ ψ̄²(Θ^{-1}(√t)), 0 < t < ∞, for obtaining (28).
In a second step we note that, by using the monotonically increasing bijection s = Θ²(u) between s ∈ (0, ∞) and u ∈ (0, ∞) and once more by exploiting the right equation in (32), we can write the function ψ̄²((Θ²)^{-1}(s)) for all s > 0 in the form required there. Hence, by Lemma 2.6 (b) we immediately see that, as required, ψ̄²((Θ²)^{-1}(s)), s > 0, is concave if Ψ(λ), λ > 0, is concave. Since the involution S (cf. Remark 1) preserves concavity, the reverse assertion formulated in Proposition 3.2 becomes immediately clear, since (25) represents a one-to-one correspondence between index functions ψ̄ and strictly increasing functions Ψ with the limit conditions under consideration.
We now investigate the concavity condition for the functionψ 2 ((Θ 2 ) −1 (s)) in more detail. For this a characterisation of the concavity of index functions is given in terms of the monotonicity of certain divided differences.
and by simple algebraic manipulations and the right choice of t₀ < t₁ < t₂ one gets the second and third statement from the first. Conversely, if (ψ(s₀ + s) − ψ(s₀))/s is a decreasing index function for all s₀ > 0, one obtains the defining concavity inequality for all t₀ < t₁ < t₂, and thus ψ is concave. A similar argument shows that ψ is concave if the third statement holds.
A direct consequence of this lemma is the corresponding monotonicity of the divided differences for concave rate functions ψ̄.

Proof. By Lemma 3.3 we have to show that for all t₀ > 0 the function (ψ(√(t + t₀))² − ψ(√t₀)²)/t is a decreasing index function. As the mapping s → (s + s₀)² is monotone, it is sufficient to show that the induced quotient ω is monotonically decreasing. As ψ is assumed to be concave, Lemma 3.3 implies that the difference quotient σ(s) := (ψ(s₀ + s) − ψ(s₀))/s is monotonically decreasing. Now let s₁ < s₂. As σ(s) is monotonically decreasing, one has σ(s₂) ≤ σ(s₁). The right-hand side is a decreasing function of s₁ if 2s₀ ≤ 2ψ(s₀)/σ(s₂), i.e., σ(s₂) ≤ ψ(s₀)/s₀. This is a consequence of Lemma 3.3 as stated in the remark after the lemma. Replacing s₁ by s₂ thus gives a lower bound for ω(s₁), and hence ω(s₁) ≥ ω(s₂). It follows that ω is monotonically decreasing.
A consequence of this lemma is that, for the concavity of the function ψ̄²((Θ²)^{-1}(s)) = ψ̄²(Θ^{-1}(√s)), it is thus sufficient to show that ψ̄ ∘ Θ^{-1} is concave. Finally we conjecture that a result similar to the proposition above also holds more generally, i.e., that a sufficient condition for the concavity of g ∘ ψ ∘ g^{-1} is the concavity of ψ, where g belongs to a class of suitably chosen functions.

Linear regularisation approaches
Our goal in this section is to draw conclusions from Corollary 2.7 for linear regularisation methods. Taking into account the setting of Corollary 2.7, we assume throughout this section that the index function Ψ(λ) is concave and strictly increasing for all 0 < λ < ∞, satisfying the limit conditions lim_{λ→+0} Ψ(λ) = 0, lim_{λ→∞} Ψ(λ) = ∞ (cf. Lemma 2.5), and (12). Moreover, we set

(34) χ(λ) := Ψ^{-1}(λ)/λ, 0 < λ < ∞, and ψ̄(t) := 1/√(t Ψ^{-1}(1/t)), 0 < t ≤ ‖A‖².

Then χ is an increasing index function with lim_{λ→∞} χ(λ) = ∞, and ψ̄ is an increasing index function with lim_{t→+0} ψ̄(t) = 0, hence a rate function. As outlined in section 3, under these assumptions we have ψ̄(A*A)[B(R)] = B_χ(R), and the best case for regularised solutions f_α approximating the exact solution f ∈ X_χ based on data g^δ satisfying (2), by using an a priori choice α = α(δ) or an a posteriori choice α = α(δ, g^δ), is to achieve the order optimal convergence rate (24). It is a specific consequence of interpolation theory, and can be seen easily by inspection of Corollary 2.7, that a successful use requires the focus on regularisation methods which yield regularised solutions of appropriate smoothness. Precisely, there must be a ball B_χ(R) to which the elements f_α belong for all α > 0 attributed to sufficiently small δ > 0 and g^δ satisfying (2).

4.1.
General linear regularisation schemata. In a first approach we consider linear regularisation schemes as described in many textbooks on linear regularisation theory (see, e.g., [9, Chap. 4], [10, Chap. 2] and [2,3,22,23,29]). We consider approximate solutions

(35) f_α := h_α(A*A) A* g^δ

based on a family of piecewise continuous real functions h_α(t), 0 < t ≤ ‖A‖², to which we assign the bias functions r_α(t) := 1 − t h_α(t). These functions depend on a regularisation parameter α ∈ (0, α_max], where α_max may be a finite real number or ∞. Small α > 0 characterise good approximation of the original problem (1), whereas larger values of α are connected with more stability. Hence, an appropriate trade-off between the two conflicting goals, approximation and stability, can be controlled by the choice of α. We say that such a function h_α describes a linear regularisation method if the properties (36) and (37) with a constant C₁ > 0 hold. Because of (37) we have another constant C₂ > 0 bounding the corresponding supremum, and hence for all 0 < α ≤ α_max an estimate for the residual. This implies the limit condition lim_{α→+0} ‖A f_α − g^δ‖ = 0 for all data g^δ ∈ Y. As a consequence we have that there is always a parameter choice α = α(δ, g^δ), 0 < δ ≤ δ_max, realising the discrepancy inequality for some prescribed constant C_dis > 0. If the mapping α → ‖A f_α − g^δ‖ is even continuous, then the discrepancy principle can be realised by a parameter choice α_dis = α_dis(δ, g^δ) satisfying the equation (23).
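The scheme (35) can be sketched in Python via the singular value decomposition; the generator functions below are the classical Tikhonov and truncated-SVD choices (the model problem and all parameter values are illustrative assumptions):

```python
import numpy as np

# Sketch of the general linear regularisation scheme (35): regularised
# solutions f_alpha = h_alpha(A^T A) A^T g_delta built from a generator
# function h_alpha, with bias function r_alpha(t) = 1 - t * h_alpha(t).
def filter_solution(A, g, h_alpha):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    t = s**2                                  # spectrum of A^T A
    return Vt.T @ (h_alpha(t) * s * (U.T @ g))

# Two classical generator functions:
tikhonov_h = lambda alpha: (lambda t: 1.0 / (t + alpha))               # Tikhonov
tsvd_h = lambda alpha: (lambda t: np.where(t >= alpha, 1.0 / t, 0.0))  # truncated SVD

n = 80
A = (1.0 / n) * np.tril(np.ones((n, n)))      # discretised integration operator
x = np.linspace(1.0 / n, 1.0, n)
f_true = x * (1.0 - x)
rng = np.random.default_rng(4)
g_delta = A @ f_true + 1e-4 * rng.standard_normal(n)

for h in (tikhonov_h(1e-4), tsvd_h(1e-4)):
    f_alpha = filter_solution(A, g_delta, h)
    # both filters keep the error moderate despite the unstable inversion
    assert np.linalg.norm(f_alpha - f_true) < np.linalg.norm(f_true)
```

Both generator functions damp the contributions of small spectral values t, for which the bias r_α(t) is close to one; they differ in how sharply the damping sets in.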
Here we call a rate function φ a qualification of the regularisation method generated by h_α if there is a constant C_quali > 0 such that

(38) sup_{0<t≤‖A‖²} φ(t) |r_α(t)| ≤ C_quali φ(α) for all 0 < α ≤ α_max.

Now we are going to study under what conditions the inequality (21) in Corollary 2.8 can be fulfilled here with ξ(δ) = δ. First we obtain the estimate (39). In order to apply that corollary for obtaining a convergence rate (24) we assume f ∈ B_χ(R₁) = ψ(A*A)[B(R₁)], taking into account the cross-connection (34).
If Θ(t) := √t ψ(t) is a qualification of the method with constant C_quali > 0, this gives with (39) the corresponding bound, and hence an estimate of type (21) is fulfilled with ξ(δ) = δ when the a priori parameter choice α = Θ⁻¹(δ) is used.
Next we check whether f_α ∈ B_χ(R₂) for some 0 < R₂ < ∞. We have a corresponding representation of f_α and, after some reformulation exploiting ‖g − g^δ‖ ≤ δ and the fact that the different functions of A*A commute, an associated bound. Now let the interplay of the regularisation method expressed by h_α(t) and the parameter choice α = α(δ, g^δ) be such that there is a constant C_para > 0 with (41). The upper bound C_para in (41) must hold for all data g^δ ∈ Y associated with the noise level δ > 0 and satisfying (2), where the case of an a priori parameter choice α = α(δ) is included as a special case. Under (41) we obtain with (37) the required estimate. If there is a function Γ(α) satisfying for sufficiently small α > 0 the inequality (42), this represents a sufficient condition for (41). In particular, if moreover the a priori parameter choice α(δ, g^δ) := Θ⁻¹(δ) satisfies (43), we have an estimate of type (21) with ξ(δ) = δ for that a priori parameter choice whenever Θ is a qualification of the regularisation method under consideration. Hence the considerations above sketch the proof of the following proposition as a consequence of Corollary 2.8:

Proposition 4.1. Under the standing assumptions of this section, including (34), let f ∈ X_χ = range(ψ(A*A)) and consider regularised solutions (35) with a generator function h_α that determines the regularisation method and satisfies (36)–(37) as well as (42) with some function Γ such that Θ(t) := √t ψ(t) satisfies (43) with some constant C_para > 0 and is a qualification of the method (cf. (38)). Then for the a priori regularisation parameter choice α = α(δ) := Θ⁻¹(δ) → +0 as δ → +0 we have the convergence rate (44) with some constant K̃ > 0.
Note that in Proposition 4.1 the rate (44) also holds for any other parameter choice α = α(δ, g^δ) that fulfils the inequality (43) as well as a further inequality with some constant Ĉ > 0.
The best possible rate obtained in that way is ‖f − f_α‖ = O(√δ) for μ = 1/2. For μ > 1/2 the function Ψ remains strictly concave, but a finite function Γ(α) in (42) fails to exist, since we have sup_{0<t≤‖A‖²} √t/(ψ(t)(t+α)) = +∞. The limitation of Proposition 4.1 to Hölder rates lower than the saturation that Tikhonov's method admits seems to be a consequence of the fact that our approach based on Corollary 2.8 and the construction (35) do not interact well enough in the case of higher smoothness of f. In order to overcome that effect, we will consider another approach in the following subsection.
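The divergence responsible for this saturation effect can be seen numerically for the Hölder case ψ(t) = t^μ (a hypothetical concrete choice of the rate function): the quantity √t/(ψ(t)(t+α)) stays bounded by 1/α as t → +0 when μ = 1/2, but grows like t^{1/2−μ}/α, i.e. without bound, when μ > 1/2.

```python
import math

def ratio(t, mu, alpha):
    """sqrt(t) / (psi(t) * (t + alpha)) for the Hoelder rate psi(t) = t**mu."""
    return math.sqrt(t) / (t**mu * (t + alpha))

alpha = 1e-2
ts = [10.0**(-k) for k in range(1, 13)]

# mu = 1/2: the ratio equals 1/(t + alpha) and tends to the finite limit 1/alpha
vals_half = [ratio(t, 0.5, alpha) for t in ts]
# mu = 3/4: the ratio equals t**(-1/4)/(t + alpha) and blows up as t -> +0
vals_three_quarters = [ratio(t, 0.75, alpha) for t in ts]
```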

4.2.
Regularisation with unbounded operators and range inclusions. In a second approach, under a non-standard source condition (5) characterising the available a priori knowledge on the solution smoothness, we exploit a variant of Tikhonov regularisation with regularised solutions f_α, where G : X → X is an injective bounded self-adjoint positive definite linear operator with non-closed range, i.e., zero is an accumulation point of the spectrum spec(G) of the operator G. Since the unbounded linear operator B = G⁻¹ : range(G) ⊆ X → X is frequently a differential operator, this approach is sometimes called regularisation with differential operators. Precisely, by construction the element f_α ∈ range(G) is well-defined for all α > 0 as the minimiser of the extremal problem

T_α(f̃) := ‖Af̃ − g^δ‖² + α ‖Bf̃‖² → min, subject to f̃ ∈ range(G),

and then the penalty term in T_α contains derivatives of the function f̃. To apply Corollary 2.8 under our setting (34) we assume f ∈ G[B(R₁)], with G[B(R)] from (26), and a link condition which is equivalent, with some constant C > 0, to a corresponding inequality. Then from [15, Lemma 6.2] we obtain the required estimate. Along the lines of the paper [7] by Cheng and Yamamoto we consider a corresponding a priori parameter choice α = α(δ).
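For a finite-dimensional impression of this variant (all operators, grids, and parameters below are illustrative assumptions), one can take a discretised integration operator in the role of A, a discrete derivative matrix in the role of B = G⁻¹, and minimise T_α by solving the normal equations (AᵀA + αBᵀB)f = Aᵀg^δ:

```python
import numpy as np

n = 40
h = 1.0 / n
# hypothetical forward operator: discretised integration (smoothing)
A = h * np.tril(np.ones((n, n)))
# discrete first-derivative operator playing the role of B = G^{-1};
# it is lower bidiagonal with 1/h on the diagonal, hence invertible
B = (np.eye(n) - np.eye(n, k=-1)) / h

f_true = np.exp(-10.0 * (np.linspace(0.0, 1.0, n) - 0.5)**2)
g = A @ f_true
rng = np.random.default_rng(1)
g_delta = g + 1e-3 * rng.standard_normal(n)

alpha = 1e-6
# minimiser of T_alpha(f) = ||A f - g_delta||^2 + alpha ||B f||^2,
# obtained from the normal equations (A^T A + alpha B^T B) f = A^T g_delta
f_alpha = np.linalg.solve(A.T @ A + alpha * B.T @ B, A.T @ g_delta)
```

The derivative penalty damps exactly the high-frequency components that the inversion of the smoothing operator would otherwise amplify.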
With the above considerations we have shown the convergence rate result of the following proposition, again as a consequence of Corollary 2.8: Due to [16, Corollary 4.5], for all concave Ψ fulfilling the standing assumptions of this section the rate (44) is even order optimal. As already discussed in the introduction, the requirement (48) gets stronger for higher rates in (44). In many applications (see as an illustration the examples in [18]) one can only verify range inclusions of the form (8). In order to verify in general for which index functions χ a range inclusion (48) with range(G) ⊆ X_χ is fulfilled, one can use the spectral theorem for unbounded self-adjoint operators T (see [34, Chapter VII.3] and also [28, Chapter VIII]). In the Hilbert space X, the injective, densely defined, self-adjoint, positive definite, and unbounded linear operator T is unitarily equivalent to a multiplication operator M expressed by a real multiplier function m. This means that there are a measure space (Σ, A, µ) with finite measure µ, a unitary operator U : X → L²(Σ, A, µ) and a real measurable function m(t), t ∈ Σ, such that (U T f)(t) = m(t)(U f)(t) for all f in the domain of T. Then by using the notations f̂ := Uf ∈ L²(Σ, A, µ) and (Gw)ˆ := U G w ∈ L²(Σ, A, µ) we immediately find that range(G) ⊆ X_χ is equivalent to a condition, formulated in terms of m and (Gw)ˆ, that has to hold for all w ∈ X. In Example 2, with background in imaging (cf. [30]), we will consider the special case that U denotes the two-dimensional Fourier transform and that the corresponding measure space is (R², B(R²), µ) with the associated Borel σ-algebra and measure. In that example, T and G are commuting operators, both non-compact with non-closed range.
On the other hand, in Example 3 we will exploit the one-dimensional Fourier transform to formulate sufficient conditions such that classical source conditions are satisfied for linear compact integral operators.

Examples
In the remaining examples we illustrate the theory. All the occurring operators A are linear integral operators. First, Example 2 refers to convolution operators A which occur, for example, when the deblurring of noisy images is under consideration. Then Example 3 illustrates the low rate case, where an integral equation with a smooth kernel is solved and it is known that the solution is in a Sobolev space. The situation here is similar to the case of elliptic partial differential equations, which has been discussed in [6]. In contrast to the PDE situation, the convergence rates here are low, typically of the form O(|log(δ)|^{−k}). The final Example 4 illustrates the high rate case, where a derivative of data in the range of an integral operator with smooth kernel is considered. The high convergence rates are here of the form O(δ|log(δ)|^k).
Example 2. In this example with X = Y = L²(R²) we are interested in deblurring, that means in finding a true picture which is characterised by a function f = f(t) ∈ L²(R²), t = (t₁, t₂)ᵀ, that satisfies a linear operator equation (1) of convolution type, where g ∈ L²(R²) is a blurred image of f which is additionally contaminated with noise such that only the noisy blurred image g^δ ∈ L²(R²) satisfying (2) is available as data. Following [4, Chapter 3] the kernel function k(τ), τ = (τ₁, τ₂)ᵀ ∈ R², is called the point spread function of the space invariant imaging system under consideration. We assume that the kernel is such that its Fourier transform k̂ = k̂(ω), ω = (ω₁, ω₂)ᵀ, called the transfer function, is bounded. Different variants of such deblurring problems are presented and analysed in [4]. As a reference situation we exploit for illustration a variant of an out-of-focus blur for which

k̂(ω) = 2 J₁(D|ω|)/(D|ω|),

where J₁ is the Bessel function of order one and D is the radius of the circle of confusion (cf. [4, formula (3.25) on p. 60]). The linear convolution operator A : L²(R²) → L²(R²) in this example has a non-closed range, but it is non-compact and the kernel is not square integrable. In order to apply our theory to this example one needs to find an index function θ and a symmetric positive definite operator T such that A*A = θ(T). A natural choice in this context is T = −∆, and in this case θ needs to satisfy |k̂(ω)|² = θ(|ω|²). This, however, is not possible, as k̂(ω) is zero for some finite ω, but an index function has to satisfy θ(λ) > 0 for all λ > 0 and can only be zero asymptotically at zero or infinity. It is thus not possible to get error bounds for the deblurring problem using the variable Hilbert scale theory and T = −∆.
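The obstruction just described can be checked numerically. Using the power series of J₁ (a self-contained stand-in for a library Bessel routine), the transfer function 2J₁(D|ω|)/(D|ω|) changes sign near D|ω| ≈ 3.83, the first positive zero of J₁, so it vanishes at a finite frequency and |k̂(ω)|² cannot equal θ(|ω|²) for any index function θ; the radius D = 1 below is an arbitrary illustrative choice.

```python
import math

def bessel_j1(x, terms=30):
    """J_1(x) via its power series sum_m (-1)^m / (m! (m+1)!) (x/2)^(2m+1),
    adequate for the moderate arguments used here."""
    s = 0.0
    for m in range(terms):
        s += (-1)**m / (math.factorial(m) * math.factorial(m + 1)) * (x / 2.0)**(2 * m + 1)
    return s

def transfer(r, D=1.0):
    """Out-of-focus transfer function 2 J1(D r) / (D r) for r = |omega| > 0."""
    x = D * r
    return 2.0 * bessel_j1(x) / x

# the transfer function tends to 1 at the origin but changes sign across
# the first positive zero of J1, which lies between 3.8 and 3.9
near_zero = transfer(1e-6)
left, right = transfer(3.8), transfer(3.9)
```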
Then the range inclusion range(G) ⊆ X_χ takes a corresponding form. This range condition can only be satisfied if χ is bounded, i.e. χ(λ) ≤ C < ∞ for all λ > 0. This is again a consequence of the existence of zeros of k̂(ω) for finite ω. A consequence of the boundedness of χ is L²(R²) ⊂ H_χ, and it follows that the "source condition" reduces to f ∈ L²(R²), which does not lead to an error bound. The failure of the above attempts to get error bounds clearly illustrates the need to extend the variable Hilbert scale theory to be able to cope with the deblurring problem. One can, however, deal with a partial deblurring problem. Observe that one has the asymptotics 2 J₁(D|ω|)/(D|ω|) ≍ |ω|^{−3/2} for large |ω| (cf. [4, formula (3.29) on p. 60]). It follows that k̂(ω) = |ω|^{−3/2} κ(ω) for some bounded κ(ω). The factor |ω|^{−3/2} relates to a "smoothing component" of the out-of-focus blur situation. We now consider inversion of this smoothing component only. For this we introduce an integral operator Ã with kernel k̃ which satisfies

(54) k̃̂(ω) = |ω|^{−3/2}.
Example 3. In this example we consider compact forward operators A in equation (1) with X = Y = L²(R) in the form of linear operators A : L²(R) → L²(R) for which the range of the operator K := ψ(A*A) is a subset of H_φ with some index function φ and some rate function ψ. That means we have range(K) ⊆ H_φ, and a classical source condition (4) is valid for f ∈ H_φ, implying the corresponding convergence rates in regularisation. In this context, let K be a linear Fredholm integral operator of Hilbert–Schmidt type. For such operators one can provide conditions on the kernel which guarantee this range condition.
Proof. The adjoint operator K* of K is an integral operator with kernel k*(s, t) = k(t, s), as a consequence of the theorem of Fubini. By Plancherel's theorem one has a corresponding identity, and an application of Parseval's identity several times gives, for u, v ∈ L²(R), the required estimates. It follows that for y = Kx with x ∈ L²(R) one has ‖y‖_φ < ∞ and consequently y ∈ H_φ.
Example 4. As a concrete application example we consider a problem from derivative spectroscopy [31]. Here numerical derivatives are used to enhance the resolution of measured spectra in order to separate close peaks. An instance is the Eddington correction formula. The approach determines f = Lg := g − g^{(2)}/2 from observed g^δ, where g^{(2)} is the second derivative of g. We now apply the theory developed so far to determine how well f = Lg can be determined from spectral data g^δ. For g ∈ H²(R) and f = Lg the Fourier transforms f̂ and ĝ satisfy f̂(ω) = (1 + ω²/2) ĝ(ω) a.e.
Using Plancherel's theorem, one obtains from this the bounds showing in particular that ‖Lf‖ is an equivalent norm for H²(R). Using standard arguments, one can then show that L : H²(R) → L²(R) is a Hilbert space isomorphism. Using the convolution theorem one sees that A = E₂ L⁻¹ : L²(R) → L²(R) is an integral operator, where E₂ denotes the embedding H²(R) → L²(R). As L⁻¹ maps L²(R) onto H²(R), the range of A can be identified with H²(R).
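A numerical sketch of the operator L (grid sizes and the Gaussian test function are illustrative choices) applies L as the Fourier multiplier 1 + ω²/2 and compares with the closed form Lg = ((3 − t²)/2)e^{−t²/2} for g(t) = e^{−t²/2}, which follows from g″ = (t² − 1)e^{−t²/2}:

```python
import numpy as np

# grid covering the effective support of the Gaussian test spectrum
n, Tmax = 1024, 20.0
t = np.linspace(-Tmax, Tmax, n, endpoint=False)
dt = t[1] - t[0]
omega = 2.0 * np.pi * np.fft.fftfreq(n, d=dt)

g = np.exp(-t**2 / 2.0)                       # model spectrum
# apply L g = g - g''/2 as the Fourier multiplier (1 + omega^2 / 2)
f = np.fft.ifft((1.0 + omega**2 / 2.0) * np.fft.fft(g)).real

# analytic result for the Gaussian: ((3 - t^2)/2) * exp(-t^2/2)
f_exact = (3.0 - t**2) / 2.0 * np.exp(-t**2 / 2.0)
err = np.max(np.abs(f - f_exact))
```

Because the Gaussian and its spectrum decay far below machine precision inside the chosen window, the spectral evaluation of L is accurate up to roundoff.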
In addition to the Sobolev spaces, which form a classical Hilbert scale, we will use a variable Hilbert scale H_φ with norms ‖·‖_φ. In contrast to the usual case, where the source condition is stated as a condition on f, here the source condition is stated as a condition on (the original spectrum) g. This source condition results from physical models for the spectrum and, in particular, for the so-called spectral broadening. A variety of models are used; the most common ones are the Gaussian, Lorentz and Voigt spectra, where a Voigt spectrum is a combination of a Lorentz and a Gaussian spectrum. Here we consider Gaussian spectra defined via a Gaussian broadening of some v ∈ L²(R). For a different discussion and more background on the problem, the reader may consult the paper by Hegland [13]. It follows that g ∈ H_ψ with ψ(λ) = exp(λ). The concave function Ψ can then be chosen as

Ψ(λ) = λ for λ ≤ 1, Ψ(λ) = (1 + log(λ)/2)² for λ ≥ 1.
It follows that Ψ is concave and that φ(λ) ≤ Ψ(ψ(λ)). As a consequence one gets the error bounds ‖f − f_α‖ ≤ δ(1 + log(η/δ)) for δ < η and ‖f − f_α‖ ≤ η if δ ≥ η. The stabilisation guarantees that even if the data errors are very large, the error of the approximation does not grow to infinity. In fact, the solution f_α = 0 would probably be a good choice in the large data error case.
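Both properties of Ψ used here, continuity at λ = 1 and concavity, as well as the monotonicity of the bound δ(1 + log(η/δ)) in δ, can be checked with a few lines (the sample points are purely illustrative):

```python
import math

def Psi(lam):
    """Psi(lam) = lam for lam <= 1, (1 + log(lam)/2)**2 for lam >= 1;
    both branches meet with value 1 and slope 1 at lam = 1."""
    return lam if lam <= 1.0 else (1.0 + math.log(lam) / 2.0) ** 2

def error_bound(delta, eta):
    """delta * (1 + log(eta/delta)) for delta < eta, and eta otherwise."""
    return delta * (1.0 + math.log(eta / delta)) if delta < eta else eta

# continuity at lambda = 1 and a midpoint concavity check on sample pairs
gap = abs(Psi(1.0 - 1e-9) - Psi(1.0 + 1e-9))
pairs = [(0.5, 2.0), (1.0, 100.0), (10.0, 1000.0)]
concave = all(Psi(0.5 * (a + b)) >= 0.5 * (Psi(a) + Psi(b)) for a, b in pairs)
```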