Solution paths of variational regularization methods for inverse problems

We consider a family of variational regularization functionals for a generic inverse problem, where the data fidelity and regularization term are given by powers of a Hilbert norm and an absolutely one-homogeneous functional, respectively, and the regularization parameter is interpreted as artificial time. We investigate the small and large time behavior of the associated solution paths and, in particular, prove finite extinction time for a large class of functionals. Depending on the powers, we also show that the solution paths are of bounded variation or even Lipschitz continuous. In addition, it will turn out that the models are almost mutually equivalent in terms of the minimizers they admit. Finally, we apply our results to define and compare two different nonlinear spectral representations of data and show that only one of them is able to decompose a linear combination of nonlinear eigenvectors into the individual eigenvectors. We also briefly address piecewise affine solution paths.


Introduction
A standard approach for approximating solutions of an ill-posed inverse problem

(IP) Au = f

is the variational problem

(P) min_u D(Au, f) + t R(u),

where the data fidelity term D enforces Au to be close to f and the regularization functional R incorporates prior knowledge about the solution (sparsity, smoothness, etc) into the model. The real number t > 0 is typically referred to as the regularization parameter and balances data fidelity and regularization. One of the most famous examples of (P) within the field of mathematical imaging is the Rudin-Osher-Fatemi (ROF) denoising model [1]

min_{u ∈ BV(Ω)} (1/2)‖u − f‖²_{L²(Ω)} + t TV(u).

Here, t should be chosen depending on the noise level of f to obtain a satisfactorily denoised image. In contrast, the parameter t can also be interpreted as an artificial time that steers the solution of (P) from being under-regularized to over-regularized as time increases or, speaking in the ROF context, that successively and edge-preservingly smooths f until a constant state is reached. In this manuscript we will refer to the maps t ↦ {u_t : u_t solves (P)} and t ↦ {Au_t : u_t solves (P)} as the solution path and the forward solution path, respectively. Recently, this and similar evolutions, which can be viewed as a scale space representation of the input f, have been used to define nonlinear spectral multiscale decompositions, e.g. [2-7]. Hence, in this context the solution of ROF becomes interesting even if the data f is not noisy at all. Typically, these decompositions involve computing derivatives of the (forward) solution path with respect to the parameter t, which makes it interesting to study its regularity. Furthermore, not only in the ROF model but also in general, a very popular choice for the data fidelity in (P) is the squared norm of some Hilbert space, whereas the regularization functional is often assumed to be absolutely one-homogeneous. However, there is often no substantial justification for preferring such models over others.
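As a concrete warm-up, here is a minimal numerical sketch of a solution path in the simplest conceivable instance of such a model (an illustrative choice, not the ROF model itself): A = id on R^n, squared Hilbert norm fidelity, and J = ‖·‖₁. In this toy case the minimizer is given in closed form by soft-thresholding, so the path t ↦ u_t can be tabulated directly.

```python
import numpy as np

def soft_threshold(f, t):
    # closed-form minimizer of (1/2)*||u - f||^2 + t*||u||_1 (A = id)
    return np.sign(f) * np.maximum(np.abs(f) - t, 0.0)

f = np.array([3.0, -1.0, 0.5])          # toy data, illustrative values
times = [0.0, 0.5, 1.0, 4.0]            # artificial time / regularization parameter
path = [soft_threshold(f, t) for t in times]  # solution path t -> u_t
```

At t = 0 the path starts at the data f; once t exceeds max_i |f_i| it has reached the zero state, a first glimpse of the finite extinction time studied below.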
In particular, one could consider arbitrary powers of a Hilbert space norm ‖·‖ and of an absolutely one-homogeneous functional J instead, which leads to the weighted problem

(wP) min_u (1/α)‖Au − f‖^α + (t/β) J(u)^β

with weights α, β ≥ 1. Note that the multiplicative scalings 1/α and 1/β do not restrict generality since they can be absorbed into t. Indeed, there are only a few contributions in the literature that consider general powers of norms (see [8, 9] for a Hilbert norm with α = 1 and [10] for error analysis for a Banach norm with fixed α ≥ 1) or a different scaling of an absolutely one-homogeneous regularization functional [11]. While such modifications seem only minor at first glance and the resulting models are equivalent for parameters t in a certain interval, we will see that outside this interval the qualitative behavior of the models differs significantly. In a nutshell, the models disintegrate into four classes, depending on whether α and β are equal to or larger than 1. If both parameters equal 1, due to the homogeneity of J, the corresponding problem (wP) becomes contrast invariant, meaning that if u solves (wP) with some f, then cu solves the problem where f is replaced by cf, for any c > 0. Our precise setting in this paper is as follows: let (X, ‖·‖_X) be the dual space of a separable predual Banach space Y and let (H, ⟨·, ·⟩) be a Hilbert space with norm ‖·‖_H := √⟨·, ·⟩. We consider a bounded linear forward operator A : X → H mapping between these spaces and denote by N(A) and ran(A) its null-space and range. Furthermore, let J : X → R₊ ∪ {+∞} be an absolutely one-homogeneous, weak* lower semi-continuous, and proper convex functional, whose null-space and effective domain we denote by N(J) := {u ∈ X : J(u) = 0} and dom(J) := {u ∈ X : J(u) < ∞}, respectively. For parameters α, β ≥ 1, t ≥ 0, and given data f ∈ H we define the functionals

(1.1) E^{α,β}_t(u; f) := (1/α)‖Au − f‖^α_H + (t/β) J(u)^β,

which we aim to minimize. If f ∈ ran(A), meaning that there exists u† ∈ X with Au† = f, we assume that u† ∉ N(J).
This is the only interesting scenario since otherwise u† is a minimizer of E^{α,β}_t(·; f) for any t ≥ 0. The remainder of this work is organized as follows. We perform a thorough analysis of the variational problem at hand in an infinite dimensional setting in section 2. A special emphasis will lie on the small and large time behavior of the so-called solution path and on uniqueness of the forward solution path. Furthermore, we briefly demonstrate the equivalence of some classes of the models under consideration. Using these results, section 3 deals with regularity of the forward solution path depending on the weights α and β. In section 4 we indicate how our results can be used to define nonlinear spectral representations. We undertake numerical experiments that illustrate our theoretical findings in section 5 and conclude with some open questions. Basic notation and relevant notions from convex analysis, as well as fundamental properties of generalized orthogonal complements and projections with respect to the forward operator A, are collected in the appendix.

Analysis of the variational problem
In this section we will provide a basic analysis of the variational problem of minimizing (1.1). We start with fixed t and then proceed towards the behavior of the solution path for small and large t, which can allow for exact penalization and finite time extinction, respectively.

Basic properties of the variational problem
In the following, we make three assumptions related to the forward operator A and its interplay with the regularization functional J, which we make use of throughout this manuscript:

Assumption 1. ‖u‖_A := ‖Au‖_H is a norm on N(J) which is equivalent to the restriction of ‖·‖_X to N(J).
Note that for assumption 1 to hold it is sufficient to have N(J) ∩ N(A) = {0} and dim N(J) < ∞, together with an appropriate definition of X, which is satisfied in most cases. The second assumption is a generalized Poincaré inequality which assures a weaker form of coercivity of J. To this end we define the map P_A in (2.1), the A-orthogonal projection onto N(J); assumption 2 then demands that ‖u − P_A(Au)‖_X ≤ C J(u) holds for all u ∈ dom(J) with some constant C > 0. The third assumption demands that A be weak*-to-weak continuous. This assumption is guaranteed if A = B* with some bounded linear operator B : H → Y. However, in some cases it is not obvious how to ensure this condition. In the following remark we demonstrate how an appropriate choice of the space X can accomplish this.

Remark 2.1. In most cases the space X is solely determined by the regularization functional, but in some very mildly ill-posed cases the data fidelity needs to be taken into account as well in order to satisfy the assumptions. The canonical case is indeed TV in multiple dimensions. We define X := BV ∩ L² with norm ‖·‖_X := ‖·‖_BV + ‖·‖_{L²}, choose H = L², and let A be the continuous embedding operator. A predual of X is given by Y := Z + L², where Z* = BV. Since weak* convergence in X implies in particular weak L²-convergence, the embedding X ↪ H is weak*-to-weak continuous. More generally, it can be checked that the dual of a sum of Banach spaces equals the intersection of the duals.

Now we provide some basic results concerning the minimization problem for the energy functional E^{α,β}_t(·; f). We start with an existence result which follows by standard arguments using assumptions 1-3. Then we turn to optimality conditions for minimizers. In some of the following statements we will utilize the range condition

Theorem 2.2 (Existence of minimizers). For every t ≥ 0, α, β ≥ 1, and f ∈ H the functional E^{α,β}_t(·; f) possesses a minimizer. If, in addition, α > 1 and A is injective, the minimizer is unique.

The range condition mentioned above reads

(RC) there exists u† ∈ dom(J) with Au† = f,

which applies if the inverse problem possesses a (possibly not unique) solution. For convenience we also define B^H_1 := {q ∈ H : ‖q‖_H ≤ 1}.

Theorem 2.3 (Optimality conditions). Let t > 0, α, β ≥ 1, and let u_t be a minimizer of E^{α,β}_t(·; f). We distinguish between two cases: if u_t = u† for some u† which satisfies (RC), then α = 1 holds necessarily and there is q ∈ B^H_1 such that

(2.2) A*q + tJ(u_t)^{β−1} p = 0 for some p ∈ ∂J(u_t).

If u_t is such that Au_t ≠ f, it holds

(2.3) ‖Au_t − f‖^{α−2}_H A*(Au_t − f) + tJ(u_t)^{β−1} p_t = 0 for some p_t ∈ ∂J(u_t),

where we use the convention 0⁰ = 1 if β = 1 and J(u_t) = 0.
Proof. Standard results of subgradient calculus [10] allow us to calculate the subdifferential of the energy functional (1.1). Note in particular that u ↦ (1/α)‖Au − f‖^α_H is continuous; thus the subgradients of E^{α,β}_t(·; f) are given by the sum of the subgradients of (1/α)‖A· − f‖^α_H and (t/β)J(·)^β. By the chain rule for subdifferentials, see [12] for instance, the subdifferential of E^{α,β}_t(·; f) is given by (2.4), from which the asserted conditions follow. In particular, the optimality condition for u† and α > 1 contradicts t > 0, since J(u†) ≠ 0 by assumption. Therefore, u† cannot be a minimizer for α > 1. Similarly, any minimizer u_t for β > 1 satisfies u_t ∉ N(J), since otherwise f = Au_t would hold due to (2.4). This would contradict our non-triviality assumption on the data. □

Remark 2.4. By convexity of the energy, conditions (2.2) and (2.3) are also sufficient for optimality.
As we have seen in theorem 2.2, minimizers are unique under stronger assumptions on the forward operator A. However, in the general case one can still prove that the norm of the residual and the value of the regularizer of minimizers are uniquely determined for α > 1 or β > 1 (theorem 2.5). The statement follows from standard arguments and is implicitly used in several proofs in the literature; however, it is usually not stated explicitly, despite being a result of interest.

Remark 2.6. With a slight abuse of notation we introduce the maps

(2.5) R(t) := ‖Au_t − f‖_H,  (2.6) J(t) := J(u_t),

where u_t is a minimizer of E^{α,β}_t(·; f). Note that we suppress the dependency of R on α and β for concise notation. By theorem 2.5 the maps R and J are well-defined for α > 1 or β > 1. If α = β = 1, we will use the same expressions for minimizers of E^{1,1}_t(·; f), although their values will then depend on the individual minimizer, in general.
A fairly well-known property is that the residual map t ↦ R(t) is monotonically increasing, whereas the regularizer map t ↦ J(t) is monotonically decreasing. The proof works precisely as in [13], which deals with the case J = TV.

Lemma 2.7. Let 0 < s < t and let u_s, u_t denote minimizers of E^{α,β}_s(·; f) and E^{α,β}_t(·; f), respectively. Then it holds R(s) ≤ R(t) and J(s) ≥ J(t), where the inequalities are strict if minimizers are unique.
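The monotonicity of lemma 2.7 can be checked numerically in the toy setting A = id, α = 2, β = 1, J = ‖·‖₁ (an illustrative assumption, under which minimizers are soft-thresholdings): along the path, R is non-decreasing and J is non-increasing.

```python
import numpy as np

def soft_threshold(f, t):
    # minimizer of (1/2)*||u - f||^2 + t*||u||_1 in the toy model A = id
    return np.sign(f) * np.maximum(np.abs(f) - t, 0.0)

f = np.array([2.0, -1.5, 0.7])
ts = np.linspace(0.0, 3.0, 61)
R = [float(np.linalg.norm(soft_threshold(f, t) - f)) for t in ts]  # residual map t -> R(t)
Jmap = [float(np.abs(soft_threshold(f, t)).sum()) for t in ts]     # regularizer map t -> J(t)
```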

Behavior for small time
Obviously, for t = 0 any u† fulfilling (RC) is a minimizer of E^{α,β}_0(·; f). In this section we consider the special case α = 1, where such u† can be a solution for small t > 0 as well. This phenomenon is called exact penalization and has been introduced in [14]. Due to the regularizing effect of the minimization of (1.1), this exotic behavior can certainly only occur if the datum f is noise-free. Although this situation might be of limited relevance in practical applications, it is important to understand and characterize exact penalization from a theoretical perspective, e.g. in order to obtain convergence rates (see [15, 16]). We shall assume that (RC) holds and that there is some u† which also fulfills the following source condition:

(SC) there exists q ∈ H such that A*q ∈ ∂J(u†).

Needless to say, since J(u†) ≠ 0, any such q fulfilling (SC) is also different from zero. Furthermore, according to [14], such u† fulfills the range and source condition if and only if it is a J-minimizing solution of the forward problem (IP), i.e. J(u†) ≤ J(u) for all u ∈ X with Au = f. In particular, the (positive) value J(u†) does not depend on the choice of u† and will be denoted by J_min in the sequel. It is obvious from the optimality condition (2.2) that (SC) is necessary for u† being a minimizer for t > 0. Indeed, the source condition is also sufficient.
To show this, we start with the following lemmas.

Lemma 2.8. Let conditions (RC) and (SC) hold true. Then s*, given by

(2.7) s* := inf{‖q‖_H : q ∈ H, A*q ∈ ∂J(u†), u† fulfills (RC) and (SC)},

satisfies 0 < s* < ∞.
Proof. Assume that there is a sequence (u†_k) ⊂ X fulfilling conditions (RC) and (SC) and a corresponding sequence of source elements (q_k) ⊂ H with A*q_k ∈ ∂J(u†_k) for all k, such that lim_{k→∞} ‖q_k‖_H = 0. In this case a short calculation yields a contradiction; hence s* > 0. Finally, assumptions (RC) and (SC) imply that the admissible sets in (2.7) are non-empty and hence s* < ∞. □

Lemma 2.9. Under the conditions of lemma 2.8 the infimum in (2.7) is attained, i.e. there are û ∈ dom(J) fulfilling Aû = f and q ∈ H with A*q ∈ ∂J(û) such that ‖q‖_H = s*.
Proof. Let (u†_k) ⊂ X fulfilling (RC) and (q_k) ⊂ H with A*q_k ∈ ∂J(u†_k) for every k ∈ N be a minimizing sequence for (2.7), meaning that lim_{k→∞} ‖q_k‖_H = s*. By assumption 2 we infer that u†_k − P_A(Au†_k) is bounded in X and hence admits a subsequence (denoted with the same index) which weakly* converges to some h ∈ X. As P_A(Au†_k) = P_A(f) holds for all k ∈ N, we obtain that (u†_k) weakly* converges to û := h + P_A(f). Using again that Au†_k = f, this implies that f = Aû. Furthermore, by the lower semi-continuity of J, we infer that û ∈ dom(J). Hence, we have shown that the limit of (u†_k) fulfills (RC). Similarly, being a minimizing sequence, (q_k) is bounded in H and a subsequence weakly converges to some q ∈ H. Using the lower semi-continuity of J (after another round of subsequence refinement), together with J(u†_k) = J_min ≤ J(û) for all k ∈ N (since û satisfies (RC)), this shows ⟨A*q, û⟩ = J(û). Furthermore, from the weak convergence of (q_k) to q we infer that (A*q_k) weakly* converges to A*q in X*. Since the sequence (A*q_k) lies in ∂J(0), which is weakly* closed (see [17]), also A*q ∈ ∂J(0) holds. Using (A.4), we have shown that A*q ∈ ∂J(û), as desired. It remains to show ‖q‖_H = s*. The definition of s* and the weak lower semi-continuity of the Hilbert norm imply s* ≤ ‖q‖_H ≤ lim_{k→∞} ‖q_k‖_H = s*, by the assumption that (q_k) is a minimizing sequence. This concludes the proof. □

Theorem 2.10 (Exact penalization). Let α = 1 and let (RC) and (SC) hold. Then every u† fulfilling (RC) and (SC) is a minimizer of E^{1,β}_t(·; f) if and only if 0 ≤ t ≤ t* := 1/(s* J_min^{β−1}).

Proof. Let t ≤ t* and choose û ∈ X and q ∈ H as in the proof of lemma 2.9. Defining p̂ := A*q and q̄ := −tJ_min^{β−1} q, we find that A*q̄ + tJ_min^{β−1} p̂ = 0 and ‖q̄‖_H = tJ_min^{β−1} ‖q‖_H ≤ t* J_min^{β−1} s* = 1. Consequently, by the optimality conditions (see theorem 2.3 and remark 2.4), it follows that û is a minimizer of E^{1,β}_t(·; f). On the other hand, let u†, fulfilling (RC) and (SC), be a minimizer of E^{1,β}_t(·; f).
By (2.2) from theorem 2.3 there are q ∈ B^H_1 and 0 ≠ p ∈ ∂J(u†) such that A*q + tJ_min^{β−1} p = 0, or equivalently p = −A*q/(tJ_min^{β−1}). Hence, by the definition of s* it holds ‖q‖_H/(tJ_min^{β−1}) ≥ s*, which together with ‖q‖_H ≤ 1 implies t ≤ t*. Note that in the second part of the proof the source condition follows directly from the optimality condition and does not have to be imposed. Next we show that for t < t* the forward solution path (and hence the residual) is uniquely determined:

Theorem 2.11. Let (RC) and (SC) hold. Every minimizer u_t of E^{1,β}_t(·; f) for 0 < t < t* fulfills Au_t = f.

Proof.
Suppose u_t is a minimizer for 0 < t < t* with Au_t ≠ f. Comparing the energies of u_t and of some u† fulfilling the range and source condition, and multiplying with t*/t > 1, yields a contradiction to the minimality of u_t. □

In order to maintain a concise notation, for the rest of this manuscript we define t* := 0 if α > 1 or if conditions (RC) and (SC) do not hold.
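A hedged scalar illustration of exact penalization (X = H = R, A = id, α = β = 1, J = |·|, noise-free data; all choices are illustrative, not the general theorem): here the exact solution is u† = f, the critical time is t* = 1, and a grid search over the energy E^{1,1}_t(u; f) = |u − f| + t|u| confirms that u† stays a minimizer for t < t* while the minimizer jumps to 0 once t > t*.

```python
import numpy as np

f = 2.0                                  # noise-free scalar datum (toy choice)
us = np.linspace(-1.0, 3.0, 4001)        # search grid containing 0 and f exactly

def minimizer(t):
    # grid minimizer of E_t^{1,1}(u; f) = |u - f| + t*|u| with A = id
    E = np.abs(us - f) + t * np.abs(us)
    return us[np.argmin(E)]
```

For t slightly below 1 the minimizer equals f exactly (exact penalization); for t slightly above 1 it equals 0.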

Behavior for large time
It is well-known that for increasing parameter t in the ROF model the solution approaches the mean value of the data. Similarly, if the regularization functional is given by a norm, the solution approaches zero. However, if a non-trivial forward operator mapping between two distinct spaces X and H and a general regularization functional are involved, the situation becomes unclear. Hence, we investigate the behavior of minimizers u_t of our general functional E^{α,β}_t(·; f) for t sufficiently large, and we expect that u_t behaves like a solution of

(2.8) min_{u ∈ N(J)} ‖Au − f‖²_H,

which is given by the A-orthogonal projection of f onto N(J), introduced in (2.1). We refer to appendix B for further details. Note that the projection is not always as trivial as in the introductory examples of this section. In particular, if the null-space of the functional becomes bigger, as is the case for higher order regularizations like total generalized variation [18], it does not even admit a closed form. Furthermore, it is not obvious whether or not minimizers u_t converge to the solution of (2.8) for finite parameters t. Note that the study of extinction times is also closely related to nonlinear spectral theory (see [5] and section 4) since it relates to the eigenvalues contained in the data f. In a nutshell, the parts of the data which correspond to large eigenvalues become extinct quickly, whereas the small eigenvalue components persist for a longer time t.
Remark 2.12. Note that for X = H and A = id it holds that P_A = P, i.e. the minimizer of (2.8) coincides with the orthogonal projection onto N(J), which fulfills ⟨f − P(f), P(f)⟩ = 0.
But even in our more general setting one can obtain properties of P_A which resemble the classical ones of orthogonal projections in Hilbert spaces. These are subsumed in proposition B.4 and will be needed to obtain the finite extinction time of minimizers of E^{α,β}_t(·; f) with β = 1, meaning that there is T > 0 such that all minimizers for t > T coincide with P_A(f). First, however, we prove a weaker statement, namely that minimizers of E^{α,β}_t(·; f) converge to P_A(f) as t tends to infinity.

Theorem 2.13. Let (t_k) ⊂ (0, ∞) be a sequence tending to infinity and let u_{t_k} be a minimizer of E^{α,β}_{t_k}(·; f) for every k. Then (u_{t_k}) converges weakly* to the unique solution u_∞ of (2.8).

Proof. Comparing the energy of u_{t_k} with that of P_A(f) shows that (u_{t_k}) is bounded in X and, in particular,

(2.9) J(u_{t_k}) → 0 as k → ∞.

This implies the existence of a weakly* convergent subsequence (denoted with the same indices) with limit u. Again, by assumption 3, this implies that Au_{t_k} weakly converges to Au in H. Due to (2.9), u is an element of N(J). Consequently, we can calculate, using weak lower semi-continuity of the norm in H and (2.9),

‖Au − f‖_H ≤ lim inf_{k→∞} ‖Au_{t_k} − f‖_H ≤ ‖Au_∞ − f‖_H.

Since u_∞ is the unique minimizer of (2.8), this implies that u = u_∞. The same argument holds true for all cluster points of (u_{t_k}), which shows convergence of the whole sequence. □

In order to obtain a finite extinction time, one has to demand the Poincaré-type inequality of assumption 2 and β = 1. We define E^α_t(·; f) := E^{α,1}_t(·; f).

Theorem 2.14. Let β = 1. Under assumption 2, every minimizer of E^α_t(·; f) coincides with P_A(f) for all t ≥ t**, given by (2.10).

Proof. Now let t ≥ t** and let p_t be defined accordingly. Then for any u ∈ N(J) we have ⟨p_t, u⟩ = 0 = J(u), which holds in particular for u = P_A(f). For arbitrary u ∈ X with J(u) ≠ 0 we obtain ⟨p_t, u⟩ ≤ J(u), using t ≥ t** and (A.8c) as well as the self-adjointness of P_A (see proposition B.4). Thus, p_t ∈ ∂J(P_A(f)) and the optimality condition is fulfilled, i.e. P_A(f) is a minimizer. Assume that there exists another minimizer u for t > t**; then a direct computation shows that it coincides with P_A(f). Let us now assume, conversely, that P_A(f) is a minimizer. In this case, the optimality condition together with the estimate ⟨p_t, u⟩ ≤ J(u) for all u ∈ X yields t ≥ t**, which is the assertion. □

Example 2.15. If X = H = R^n equipped with the Euclidean inner product, A = id, and J is an arbitrary norm on H, one obtains P_A = P = 0 and, thus, assumption 2 always holds true due to the equivalence of norms on finite-dimensional vector spaces.
For J = TV and Ω ⊂ R^n bounded, assumption 2 is just the Poincaré inequality for BV functions, and P(u) = (1/|Ω|) ∫_Ω u dx is the mean value of u over Ω. Summing up the results of the last two sections, the critical time t* > 0 can exist only if α = 1, whereas t** < ∞ requires β = 1. In more generality, one can easily extend these results to models of the type Φ(‖Au − f‖_H) + tΨ(J(u)) with convex and differentiable functions Φ and Ψ. In this case, the critical times can appear only if Φ′(0) or Ψ′(0), respectively, are positive.
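The finite extinction time can be observed numerically in the same toy model as before (A = id, α = 2, β = 1, J = ‖·‖₁, so that P_A(f) = 0; an illustrative choice, not the general theorem): the path u_t = sign(f)·max(|f| − t, 0) reaches the projection exactly at t = max_i |f_i| and stays there.

```python
import numpy as np

def soft_threshold(f, t):
    # minimizer of (1/2)*||u - f||^2 + t*||u||_1 in the toy model A = id
    return np.sign(f) * np.maximum(np.abs(f) - t, 0.0)

f = np.array([2.0, -0.5])
t_ext = float(np.abs(f).max())            # extinction time of this toy path
u_after = soft_threshold(f, t_ext + 1e-3)  # already extinct: equals P_A(f) = 0
u_before = soft_threshold(f, t_ext - 1e-1) # not yet extinct
```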

Uniqueness of the forward solution path for α > 1 or β > 1
Let us now prove that for each time t > 0 the forward solution path t → Au t is uniquely determined if α > 1 or β > 1. This is a necessary property for studying finer regularity. Not surprisingly, this follows from the uniqueness of the residuals.

Theorem 2.16 (Uniqueness of the forward solution path I). Let α > 1 or β > 1. Then for every t > 0 all minimizers of E^{α,β}_t(·; f) share the same image under A, i.e. the forward solution path t ↦ Au_t is single-valued.
Proof. Let us first consider the case t* > 0. Then necessarily α = 1 and β > 1 hold, and by theorem 2.5 we infer that every minimizer of E^{1,β}_t(·; f) for 0 < t ≤ t* has the same residual. Since there is a minimizer with zero residual, this has to hold for all minimizers as well, which implies that the forward solution path for 0 < t ≤ t* coincides with the set {f}.
Let us now turn to the case t > t*. We use the optimality condition (2.3) for two minimizers u_0, u_1 with Au_0, Au_1 ≠ f, where p_i ∈ ∂J(u_i), i = 0, 1. By theorem 2.5 we know that both the residuals and the values of the regularizer are unique and, hence, we can use the maps R and J from (2.5) and (2.6). Subtracting the two optimality conditions, multiplying with R(t)^{2−α}, taking a duality product with u_1 − u_0, and using the non-negativity of the symmetric Bregman distance, we infer Au_0 = Au_1. □

It remains to study what happens for α = β = 1. Since in this case both the data fidelity and the regularizing term of the energy functional (1.1) fail to be strictly convex, one cannot expect uniqueness of the forward solution path for parameters t ∈ [t*, t**]. However, for values of t where non-uniqueness occurs, we are able to confine the set of possible forward solutions to a one-parameter family.

Theorem 2.17 (Uniqueness of the forward solution path II). Let α = β = 1 and t ≥ t*. Then every minimizer u of E^{1,1}_t(·; f) satisfies

(2.11) Au = f + c(Aû − f) for some c ≥ 0,
where û is an arbitrary minimizer of E^{1,1}_t(·; f).

Proof. The only non-trivial case is Au ≠ f, since otherwise c = 0 can be chosen in (2.11).
As before, we obtain by subtracting the optimality conditions (2.3) of u and û, where p and p̂ denote the corresponding subgradients, and abbreviating w := Au − f and ŵ := Aû − f, multiplying with u − û, and using the non-negativity of the symmetric Bregman distance, that ⟨w, ŵ⟩ ≥ ‖w‖_H ‖ŵ‖_H, where the reverse inequality follows from Cauchy-Schwarz. This immediately implies w = cŵ with c := ‖w‖_H/‖ŵ‖_H, which is equivalent to Au = f + c(Aû − f). This closes the proof. □

Remark 2.18. Note that in the case c = 1, which corresponds to non-uniqueness of the forward solution, (2.14) can be rewritten as (2.15), which means that, in the case of non-uniqueness, one can construct an element from the two minimizers which fulfills the range condition (RC). This is a counter-intuitive behavior since one would not expect the two regularized solutions to carry sufficient information to allow for the exact reconstruction of the datum f. Indeed, if f ∉ ran A, which can be interpreted as noisy data, equation (2.15) yields a contradiction and, hence, the forward solution path is unique in this case.
Despite the considerations of the previous remark, one cannot expect uniqueness of the forward solution path, in general. This will be illustrated in the following example.
Then the forward solution path is not unique at t* = 1/√13 and at t = 1. This can be seen as follows: it is well-known that the subdifferential of the 1-norm is given by the multi-valued signum function, i.e. for u ∈ R^n it holds component-wise ∂‖u‖₁ = sgn(u), where sgn(·) denotes the multi-valued sign function. In addition, since A is invertible, solutions and forward solutions determine each other uniquely. It can be easily checked using the optimality condition (2.3) that all members of a certain one-parameter family are minimizers for t = t* and, similarly, that a second family of minimizers exists for t = 1. Hence, due to the invertibility of A, the corresponding forward solution paths are also not unique. The strategy for finding such non-unique solutions is to exploit the multi-valuedness of the subdifferential of the 1-norm. Hence, we can have non-uniqueness even if the forward operator is trivial, i.e. equals the identity.
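The same mechanism can be reproduced in an even smaller instance with trivial forward operator (A = id, scalar data f > 0, t = 1; an illustrative example, not the one above): the energy E^{1,1}_1(u; f) = |u − f| + |u| is constant on the whole segment [0, f], so every point of the segment is a minimizer and the forward solution forms a one-parameter family, as predicted by theorem 2.17.

```python
import numpy as np

f = 2.0
us = np.linspace(0.0, f, 201)          # the segment [0, f]
E = np.abs(us - f) + 1.0 * np.abs(us)  # E_1^{1,1}(u; f) at the critical time t = 1
# E is constant (= f) on [0, f]: a whole segment of minimizers
```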
An important consequence of the uniqueness of the forward solution is the continuity of the residual map t → R(t).

Corollary 2.20 (Continuity of the residuals). Let α > 1 or β > 1. Then the residual map t ↦ R(t) is continuous on (0, ∞).
Proof. The continuity follows from a straightforward generalization of the proof of claim 3 in [19], using that A* is weak*-to-weak continuous and the Hilbert norm is weakly lower semi-continuous. □

From the uniqueness of the forward solution path and the residuals we immediately obtain that the regularizer map t ↦ J(t) is continuous as well.

Relation of the problems
In this section, we deal with the mutual relation of minimizers of E^{α,β}_t(·; f) for different values of α and β. The structure of the subgradient (2.3) suggests that, as long as ‖Au_t − f‖_H, J(u_t) ≠ 0, one can switch back and forth between minimizers corresponding to different choices of the exponents α, β by adapting the regularization parameter t.
Foreshadowing, one has one-to-one correspondences of all minimizers within the critical parameter range (t*, t**), where t* and t** can attain the values 0 and ∞, respectively. For instance, minimizers of E^{1,2}_t(·; f) for t ∈ (t*, ∞) correspond exactly to those of E^{2,1}_τ(·; f) for τ ∈ (0, τ**). As an example, we will prove this equivalence for minimizers of E^{α,1}_t(·; f) with α ≥ 1 and of E^{2,1}_τ(·; f), the latter being the 'standard' variational problem with squared norm and one-homogeneous regularization. Since both models possess a finite extinction time due to β = 1, we obtain full equivalence for t ∈ (t*, ∞) and τ ∈ (0, ∞). Note that in the following the expression u_t will always denote minimizers of E^{α,1}_t(·; f), whereas v_τ will only be used for minimizers of E^{2,1}_τ(·; f). In particular, R(t) = ‖Au_t − f‖_H and R(τ) = ‖Av_τ − f‖_H denote the respective residuals and are not to be confused. Furthermore, we remind the reader that the residual ‖Au_t − f‖_H is not uniquely determined if α = 1. By the optimality condition (2.3) we obtain the following two lemmas.

Lemma 2.22. Let t > t* and let u_t be a minimizer of E^{α,1}_t(·; f). Then u_t is a minimizer of E^{2,1}_τ(·; f) with τ := t‖Au_t − f‖^{2−α}_H.

Lemma 2.23. Conversely, let τ > 0 and let v_τ be a minimizer of E^{2,1}_τ(·; f). Then v_τ is a minimizer of E^{α,1}_t(·; f) with t := τ‖Av_τ − f‖^{α−2}_H.

Theorem 2.24. The map T(τ) := τR(τ)^{α−2} is continuous, non-decreasing, and surjective. If α > 1, it is even a bijection with continuous inverse S(t) := tR(t)^{2−α}.

Proof. Since by theorem 2.5 the residuals of minimizers with strictly convex data term are unique, the map T is well-defined. By corollary 2.20 it follows that T is continuous. Let us first consider the case α > 1. Then, similarly, S is well-defined and continuous. Furthermore, it follows from the uniqueness of the residuals that S and T are mutual inverses. Finally, T is non-decreasing, which can be seen as follows: for α ≥ 2 this is obvious, since T is the product of non-decreasing functions (see lemma 2.7); for α ∈ (1, 2) the same holds true for S. Since they are inverses, both T and S are non-decreasing for arbitrary α > 1.
Let us now address the case α = 1. As we have seen, the residuals R(t) are not unique in general and, therefore, the map S(t) := tR(t) is not well-defined. However, by lemmas 2.22 and 2.23 we infer that T is still surjective. Furthermore, being the pointwise limit of the non-decreasing functions τ ↦ τR(τ)^{α−2} as α ↓ 1, also T(τ) = τ/R(τ) is non-decreasing. □
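For the scalar toy model (A = id, J = |·|; illustrative assumptions) the reparametrization map T can be evaluated explicitly: the minimizer of E^{2,1}_τ is soft-thresholding with residual R(τ) = min(τ, |f|), so for α = 1 the map T(τ) = τ/R(τ) is constant on the whole branch τ ∈ (0, |f|), which therefore collapses onto a single time t, consistent with the non-uniqueness of E^{1,1}_t at that parameter.

```python
import numpy as np

f = 2.0

def v_tau(tau):
    # minimizer of E^{2,1}_tau(.; f) = (1/2)*(u - f)^2 + tau*|u|: soft-thresholding
    return np.sign(f) * max(abs(f) - tau, 0.0)

taus = np.linspace(0.1, 1.9, 19)                           # branch tau in (0, |f|)
T = np.array([tau / abs(v_tau(tau) - f) for tau in taus])  # T(tau) = tau / R(tau), alpha = 1
# the whole branch is mapped to the single time t = 1
```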

Remark 2.25 (Bayesian interpretation).
The relation of the problems for different values of α and β can also be interpreted in terms of Bayesian models for inverse problems (see [20]). Under appropriate conditions, E^{α,β}_t(·; f) can be interpreted as the Onsager-Machlup functional of a posterior distribution, and its minimizer is the maximum a posteriori probability (MAP) estimate (see [21, 22]). In the finite-dimensional case the posterior density is often simply modeled as p(u|f) ∝ exp(−cE^{α,β}_t(u; f)). In practice, α is determined from the noise model, while one usually chooses β = 1 based on the standard formulation of the variational problem. Essentially, the posterior distribution is thus extrapolated from the collection of MAP estimates. However, the equivalence of the minimization problems for different β shows that there is a variety of posterior distributions leading to the same MAP estimates for any f. The behavior of the posterior, however, can differ strongly, in particular in degenerate cases such as BV (see [23-25]).

Uniqueness of the forward solution path for α = β = 1
The results of the previous section allow us to characterize (non-)uniqueness of the forward solution path also in the degenerate case α = β = 1.
Theorem 2.26. Let α = β = 1. Then the forward solution path of E^{1,1}_t(·; f) is unique for almost every t > 0. More precisely, the forward solution at time t is non-unique if and only if there are q ∈ H and an interval [τ_0, τ_1] with τ_0 < τ_1 such that f − τq is the forward solution of E^{2,1}_τ(·; f) for all τ ∈ [τ_0, τ_1] and t = 1/‖q‖_H.

Proof. Let us first assume that f − τq is the forward solution path of E^{2,1}_τ(·; f) for τ ∈ [τ_0, τ_1] with 0 ≤ τ_0 < τ_1. Then q ≠ 0 holds, since f cannot be a minimizer for any positive value of τ. Hence the time reparametrization T reduces to T(τ) = τ/R(τ) = 1/‖q‖_H, which means by lemma 2.23 that f − τq is also the forward solution of E^{1,1}_t(·; f) for t := 1/‖q‖_H. Since τ runs through a proper interval, this implies the non-uniqueness of the forward solution at t.
Conversely, let us assume that the forward solution of E^{1,1}_t(·; f) is not unique. Then there exist u_0, u_1 ∈ argmin E^{1,1}_t(·; f) such that Au_0 ≠ Au_1. Due to convexity, the convex combination u_λ := (1 − λ)u_0 + λu_1 is a minimizer as well for every λ ∈ [0, 1] (2.16). We distinguish two cases, (2.17) and (2.18): either Au_0 = f and Au_1 ≠ f (this corresponds to t = t*), or both residuals are non-zero. In any case, we can define the numbers τ_0 := t‖Au_0 − f‖_H and τ_1 := t‖Au_1 − f‖_H. In the first case we have 0 = τ_0 < τ_1, and in the second case, after possibly exchanging the roles of u_0 and u_1, we can assume c > 1 such that 0 < τ_0 < τ_1 holds. Next, we observe that, due to lemma 2.22, u_λ is a minimizer of E^{2,1}_τ(·; f) with τ := t‖Au_λ − f‖_H. By using (2.17) or (2.18), respectively, we infer that in both cases τ runs through the whole interval [τ_0, τ_1] as λ runs through [0, 1]. Plugging the corresponding expression for λ into (2.17) or (2.18), respectively, shows that the forward solution of E^{2,1}_τ(·; f) is affine in τ on [τ_0, τ_1]. Since the set {t ∈ R : |{τ : T(τ) = t}| > 0} has Lebesgue measure zero, we can conclude. □

Regularity of the forward solution path
In this section, we investigate regularity of the forward solution path t → {Au t : u t ∈ argmin E α,β t (·; f )} which we have shown to be a single-valued map for α > 1 or β > 1 in the previous section. As already mentioned, when using the minimization of (1.1) for obtaining nonlinear spectral decompositions of the data f , one typically computes derivatives of the (forward) solution path with respect to t. While these solution paths can be shown to be sufficiently regular under some finite dimensional assumptions (see the discussion in section 4.3), a general study of their regularity in a Banach or Hilbert space setting is still pending. Our results are a first contribution in this direction and the topic will remain subject to future research.

Residual bounds under range or source condition
In this section we prove growth rates of the residual R(t) which will be used to improve the subsequent Lipschitz estimates of the forward solution path close to zero under the range or source condition. This can also be interpreted as Hölder continuity of the forward solution path close to zero in the case α > 1. First, we state a preparatory lemma.

Lemma 3.1. If (RC) and (SC) hold, then q_t, defined in (3.1), satisfies ‖q_t‖_H ≤ s*.

L Bungert and M Burger Inverse Problems 35 (2019) 105012
Proof. By the optimality conditions (2.3) we infer that p_t := A*q_t ∈ ∂J(u_t). Furthermore, letting q ∈ H and û be such that p_0 := A*q ∈ ∂J(û) and ‖q‖_H = s*, a short calculation together with the Cauchy-Schwarz inequality implies ‖q_t‖_H ≤ ‖q‖_H = s*. The other upper bound is trivial. □

Now we are ready to prove the growth bounds on the residuals. Note that the growth at zero can be estimated more sharply when demanding the source condition (SC).

Proof. From the optimality condition (2.3) for u t we obtain
where p_t ∈ ∂J(u_t). Reordering yields A*A(u_t − u†) = −tR(t)^{2−α}J(t)^{β−1}p_t, and by taking the duality product with u_t − u† we obtain the first inequality, where we use that J is decreasing (see lemma 2.7) and J(u_t) ≥ 0. Given (SC), we define q_t as in (3.1) and use lemma 3.1 to sharpen the estimate, which yields the second inequality for α > 1. □

Lipschitz continuity of the forward solution path for α > 1 or β > 1
In this section we address the Lipschitz continuity of the forward solution path in the case that it is uniquely determined. It will turn out to be Lipschitz continuous in the range of positive parameters t. As t approaches zero the general estimates break down, which is natural since the solution changes instantaneously from the noisy data to a regularized solution as t becomes positive. However, if the range or source condition holds, the rate of change can be slightly tamed by employing the results of section 3.1.
The following lemma is the basis for our regularity estimates.
Proof. Defining t̃ := t J(t)^{β−1} R(t)^{2−α} and s̃ analogously, we obtain from the optimality conditions for p_t and p_s given by (2.3): Taking a duality product with u_t − u_s and using the non-negativity of the symmetric Bregman distance yields Applying the Cauchy–Schwarz inequality to the right-hand side and reordering leads to

Plugging in the definitions of t̃ and s̃ concludes the proof. □
This result also includes the case α = β = 1. Since, however, in this case the forward solution path is not even uniquely defined, one cannot expect continuity properties. Thus, the following statements will always require that one of the weights α and β is larger than 1. In particular, the maps R(t) and J(t) are well-defined in that case.

Corollary 3.4 (Continuity of the forward solution path). If α > 1 or β > 1, estimates (3.4) together with the continuity of the residuals and the regularizers (see corollary 2.20) show that the forward solution path t → Au_t is continuous for all t > t*. Continuity at t = t* follows from the continuity of the residuals.
For α ≥ 2 one can directly obtain Lipschitz estimates of the forward solution path. As already mentioned, the estimates close to zero can be improved by assuming the source condition.

holds. This estimate can be improved to
If we now use that α ≥ 2 and that for s < t it holds J(t) ≤ J(s) and R(s) ≤ R(t), we obtain Here we employed the a priori estimate

Under (RC) or (SC) one uses lemma 3.2 to further estimate R(t). □
We obtain the following regularity result for the forward solution path.

Theorem 3.6 (Lipschitz continuity of the forward solution path I). Let α ≥ 2, β ≥ 1. The forward solution path t → Au_t is Lipschitz continuous on
Hence, (Au_t)′ exists almost everywhere in (δ, ∞) and it holds This estimate can be improved to

Proof. Lipschitz continuity of t → Au_t is a direct consequence of estimate (3.5). Since H, being a Hilbert space, has the Radon–Nikodym property (see [26], for instance), we can deduce from a generalization of Rademacher's theorem [27, 28] that (Au_t)′ exists almost everywhere on (0, ∞). Estimates

Proof. The first assertion is an immediate consequence of the reverse triangle inequality: Since by estimate (3.5) the forward solution path is Lipschitz, the same holds for R. For the second claim, let 0 < s < t and let u_s and u_t denote corresponding minimizers. Thus, it holds Since R(·) is Lipschitz on (δ, ∞) for all δ > 0, the same holds for R(·)^α and for t → J(t)^β. Applying the βth root preserves local Lipschitz continuity away from zero and hence we can conclude. □

In order to proceed to the case 1 ≤ α < 2, where α = 1 requires β > 1, we use the relation between the different formulations established in section 2.5. For simplicity, we will only consider the case 1 < α < 2 and β = 1. Defining τ → T(τ) = τR(τ)^{α−2} as in theorem 2.24, one observes that, due to corollary 3.7, the function T is Lipschitz continuous on (δ, ∞) for all δ > 0. Hence, its derivative T′ exists almost everywhere in (δ, ∞) and it holds (3.11). Here, we used that also R′(τ) exists almost everywhere according to corollary 3.7 and can be computed with the chain rule; this inequality is true due to Cauchy–Schwarz and estimate (3.8), which can be used to bound (Av_τ)′. Hence, in that case also S, the inverse of T, is a Lipschitz function. Consequently, we obtain Lipschitz continuity for minimizers u_t of E^{α,1}_t(·; f) with 1 < α < 2 since Au_t = Av_{S(t)} is a composition of Lipschitz functions. By setting T(τ) = τR(τ)^{α−2}J(τ)^{1−β} this argument can easily be repeated for β > 1, which makes the calculations more cumbersome but leads to the same results.
In this case, also α = 1 and β > 1 yields the desired Lipschitz continuity. Hence, the assumption α ≥ 2 in corollary 3.7 and theorem 3.6 can be relaxed to α > 1 or β > 1 without losing Lipschitz continuity or differentiability of the forward solution path. However, estimates (3.8) and (3.10) need to be adapted. To keep the presentation short, we only formulate the estimates for β = 1.
Proof. For simplicity, we only consider the case β = 1. It remains to prove the bound (3.12).
To this end, we let u_t denote a minimizer of E^{α,1}_t(·; f). Then it holds according to the previous results that u_t = v_τ with τ = S(t), and with the chain rule together with (3.8) we obtain Using the definition of S and τ = S(t) we infer Reordering yields the first inequality in (3.12), from where on we proceed as before. □

Bounded variation of the forward solution path for α = β = 1
Using the equivalence of the problems together with the Lipschitz regularity of minimizers of the quadratic problem, one can at least show that the forward solution path t → Au_t for α = β = 1, which is well-defined almost everywhere according to corollary 2.27, has bounded variation.

Proof. First, we note that Au_t is well-defined for almost every t > 0 according to remark 2.27.
Let us first assume that t* > 0, i.e. conditions (RC) and (SC) hold. We already know that Au_t has zero variation on (0, t*) and (t**, ∞). Hence, it is enough to assert finite variation on the interval (t*, t**). To this end, let t* = t_1 < t_2 < · · · < t_{n−1} < t_n = t** be a finite partition of the interval (t*, t**). By theorem 2.24, we can choose numbers 0 ≤ τ_1 < · · · < τ_n such that Au_{t_k} = Av_{τ_k} for all k = 1, …, n. Here, τ_n = τ** is given by the finite extinction time of minimizers of E^{2,1}_τ(·; f). Furthermore, using (3.7) with α = 2 we compute Forming the supremum over all partitions of (t*, t**) shows that Au_t has bounded variation. If t* = 0 one uses (3.5) to deduce the weaker result. Consequently, the finite Radon measure (Au_t)′ can be decomposed into an absolutely continuous part, a jump part, and a Cantor part (see [29] for precise definitions), where the jump part is supported in [t*, t**] since Au_t is constant outside this interval. □ Once more, we obtain statements concerning the subgradient and the solution path.
where p_t is given by the optimality conditions (2.2) and (2.3), has the same regularity as the forward solution path. (ii) If A is bounded from below, meaning that there is c > 0 such that c‖u‖_X ≤ ‖Au‖_H for all u ∈ X, then the solution path t → u_t has the same regularity as the forward solution path.

Nonlinear spectral representations
In order to define a nonlinear spectral representation φ_t of some data f with respect to the functional E^{α,β}_t(·; f), we draw our motivation from classical linear Fourier analysis and follow an axiomatic approach. Formally, the Fourier transform of a sine or cosine, these being eigenfunctions of the negative Laplacian, is given by a delta distribution concentrated at the corresponding eigenvalue (or the frequency, after a change of variables). Hence, also in the nonlinear setting eigenfunctions should give rise to atoms in the spectral representation. In addition, in analogy to the inverse Fourier transform, there should be an inverse transform, mapping a nonlinear spectral representation back to the data and allowing for spectral filtering.

Solution path of generalized singular vectors
To find a nonlinear spectral representation with the above-noted properties we follow the approach of Gilboa, first brought up in [4], and examine the solution path that corresponds to singular vectors (see [30, 31]) of J, i.e. f = Au† where λA*Au† ∈ ∂J(u†) for some λ > 0. For such data, one would like to have a delta peak in the spectral representation indicating that only one singular vector is contained in the data, that is, φ_t = f δ_{1/λ}(t), where 1/λ can be interpreted as a generalized frequency.
Proof. In the case α = 1 one can easily check that t* = t** = 1/(λ‖f‖_H) if f is a singular vector. The other minimizers can be obtained by inserting the ansatz u_t = c(t)u† into the optimality condition (2.3). □ Figure 1 shows the corresponding solution paths for a singular vector u† with singular value λ such that f = Au† has unit norm and β = 1. In this case, all paths are extinct at 1/λ. Hence, in order to obtain φ_t = f δ_{1/λ}(t), suitable spectral representations are given by φ_t = −(Au_t)′ and φ_t = t(Au_t)″, respectively. If A is bounded from below such that the solution path t → u_t has the same regularity as the forward solution path, one can even choose φ_t = −u_t′ or φ_t = tu_t″, respectively. For other α's an integer derivative does not typically produce a delta peak and one could consider fractional derivatives as done in [32]. Note that by these definitions and due to the finite extinction time the reconstruction formula holds, which can be used for spectral filtering by defining where F is a sufficiently well-behaved filter function (see [5], for instance).
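The scalar behavior described here can be checked numerically in one dimension. The following sketch is our own illustration (not taken from the paper): we take J(u) = λ|u| on R, so that λu ∈ ∂J(u) for u > 0, and compare the solution paths c(t) for the quadratic model (α = 2, β = 1) and the contrast-invariant model (α = 1, β = 1).

```python
import numpy as np

# 1D eigenvector f = 1 of J(u) = lam * |u|, so lam * f is a subgradient of J at f
lam = 2.0
ts = np.linspace(0.0, 1.0, 501)

# alpha = 2, beta = 1: the ansatz u_t = c(t) * f gives linear shrinkage
c_alpha2 = np.maximum(1.0 - lam * ts, 0.0)

# alpha = 1, beta = 1: exact penalization, u_t = f for t < t* = 1/lam, then 0
c_alpha1 = np.where(ts < 1.0 / lam, 1.0, 0.0)

# brute-force check of the alpha = 1 path: minimize |c - 1| + t*lam*|c| over a grid
grid = np.linspace(-0.5, 1.5, 2001)
for t in [0.1, 0.3, 0.49, 0.51, 0.9]:
    energy = np.abs(grid - 1.0) + t * lam * np.abs(grid)
    c_star = grid[np.argmin(energy)]
    assert abs(c_star - (1.0 if t < 1.0 / lam else 0.0)) < 1e-2

# both paths are extinct at t = 1/lam, consistent with proposition 4.1
```

Note how the α = 2 path shrinks linearly to zero while the α = 1 path jumps from f to 0 at t* = 1/λ, which is exactly the difference exploited by the two spectral representations below.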

Remark 4.2.
Note that φ_t = −(Au_t)′ is a well-defined finite Radon measure according to proposition 3.9, whereas this is a priori unclear for φ_t = t(Au_t)″. However, due to the finite extinction time, this spectral representation can be defined in a distributional sense, via where ψ : R → H is a Fréchet-differentiable test function with ψ(t) = 0 for all t in a neighborhood of 0. Owing to theorem 3.6, the second condition is not even necessary if one of the conditions (RC) or (SC) holds since in that case ‖(Au_t)′‖_H is integrable at zero.
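The distributional definition referred to above was lost in extraction; a plausible reconstruction, obtained by a single integration by parts and consistent with the stated test-function conditions (but not a verbatim quotation of the paper), reads:

```latex
\langle \varphi, \psi \rangle
  \;:=\; -\int_0^\infty \big\langle (Au_t)',\, \psi(t) + t\,\psi'(t) \big\rangle_H \,\mathrm{d}t ,
\qquad \varphi_t = t\,(Au_t)'' .
```

Formally, ∫ t(Au_t)″ψ dt = −∫ (Au_t)′(tψ(t))′ dt; the boundary terms vanish since tψ(t) → 0 as t → 0 and (Au_t)′ = 0 beyond the extinction time, and the integral is finite near zero precisely when ψ vanishes there or ‖(Au_t)′‖_H is integrable, matching the remark above.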
Proposition 4.1 also shows that, although all problems for α > 1 are equivalent, they significantly differ in terms of the spectral representations which can be obtained from their solution paths. Furthermore, since the minimizer for β = 2 smoothly depends on t, no singular spectral representation can be achieved by computing time derivatives, which is why we will restrict ourselves to the case β = 1 for the rest of the manuscript.
Another interesting consequence of proposition 4.1 is that some of the models E^{α,1}_t(·; f) are scale invariant on eigenfunctions. To see this, we choose J = TV as the total variation of functions on R^n, X = BV(R^n) ∩ L²(R^n), H = L²(R^n), and A the continuous embedding operator. It is well known that eigenfunctions of TV are given by indicator functions of so-called calibrable sets Ω ⊂ R^n with eigenvalue P(Ω)/|Ω|, where P denotes the perimeter and |·| is the n-dimensional Lebesgue measure (see [33, 34]). If f = 1_Ω for calibrable Ω, we find that the extinction time of minimizers of E^{α,1}_t(·; f) is given by t_ext(Ω) = |Ω|^{α/2}/P(Ω) for α ≥ 1. If one rescales Ω_r = rΩ with some r > 0, then Ω_r is still calibrable and the extinction time changes to t_ext(Ω_r) = r^{(n(α−2)+2)/2} t_ext(Ω). Hence, we observe that for any dimension n ≥ 2 there is α := 2 − 2/n ∈ [1, 2) such that t_ext(Ω_r) = t_ext(Ω), which makes the model scale invariant. Note that in dimension n = 2, which is most relevant for imaging applications, the model E^{1,1}_t(·; f) becomes both contrast and scale invariant.
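As a quick numerical sanity check of the scaling t_ext(Ω_r) = r^{(n(α−2)+2)/2} t_ext(Ω) (our own illustration), one can evaluate the extinction time |Ω|^{α/2}/P(Ω) for disks Ω_r ⊂ R² of radius r:

```python
import math

def t_ext(volume, perimeter, alpha):
    # extinction time |Omega|^(alpha/2) / P(Omega) of a calibrable set
    return volume ** (alpha / 2.0) / perimeter

def disk(r):
    # area and perimeter of a disk of radius r in R^2
    return math.pi * r ** 2, 2.0 * math.pi * r

# alpha = 2 - 2/n = 1 in dimension n = 2: the extinction time is scale invariant
times_alpha1 = [t_ext(*disk(r), alpha=1.0) for r in (0.5, 1.0, 2.0, 4.0)]
assert all(abs(t - 1.0 / (2.0 * math.sqrt(math.pi))) < 1e-12 for t in times_alpha1)

# alpha = 2 (the quadratic model): t_ext(disk of radius r) = r/2 grows with r
assert abs(t_ext(*disk(3.0), alpha=2.0) - 1.5) < 1e-12
```

For α = 1 every disk is extinguished at the same time 1/(2√π), regardless of its radius, whereas for α = 2 the extinction time scales linearly with the radius.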

Spectral representations for α = 1 and α = 2
From now on our setting will be a Gelfand triple X ↪ H ↪ X* such that the operator A becomes a continuous embedding operator and will thus be omitted in our notation. In the absence of a forward operator one usually refers to singular vectors as eigenvectors. Due to the observations in the previous section, we will only study the functionals F_τ(·; f) := E^{2,1}_τ(·; f) and E_t(·; f) := E^{1,1}_t(·; f) and fix our notation in such a way that the corresponding minimizers are denoted by v_τ and u_t, respectively. We consider the spectral representations given by ϕ_τ := τv_τ″, which is to be understood in the distributional sense, and φ_t := −u_t′, the latter being a finite Radon measure according to proposition 3.9.
Next we formulate a theorem which generalizes proposition 4.1 and deals with an important question concerning nonlinear spectral decompositions, namely the decomposition of a linear combination of generalized eigenvectors. Two conditions that suffice for a perfect decomposition into eigenvectors are the (SUB0) condition and orthogonality of the eigenvectors, introduced in [35], where the authors showed that the inverse scale space flow is able to decompose the data perfectly into the eigenvectors. A similar statement holds true for the variational problem F_τ(·; f): in particular, the solution path v_τ shrinks each eigenvector linearly until it vanishes and is, thus, piecewise affine in τ. For a more compact notation we will from now on abbreviate K := ∂J(0), which can be viewed as a characteristic set of J since it contains all subdifferentials and defines J via duality (see (A.4) and (A.5)).

Theorem 4.3 (Linear combination of eigenvectors I). Let f = Σ_{i=1}^n γ_i u_i be a linear combination of orthogonal eigenvectors, i.e. λ_i u_i ∈ ∂J(u_i) and ⟨u_i, u_j⟩ = 0 for all i, j = 1, …, n, i ≠ j. Furthermore, we define p_k := Σ_{i=k}^n sgn(γ_i) λ_i u_i and assume that Additionally, we assume an ordering such that 0 = τ_0 < τ_1 < · · · < τ_n, where τ_k := γ_k/λ_k and k = 1, …, n.
Proof. The proof works along the same lines as the proof of [35, theorem 3.14]. □

Remark 4.4. Note that it is straightforward to extend this result to data which are composed of generalized singular vectors, i.e. f = Σ_{i=1}^n γ_i Au_i. To this end, one has to demand A-orthogonality, ⟨Au_i, Au_j⟩ = 0 for i ≠ j, and define p_k := Σ_{i=k}^n sgn(γ_i) λ_i A*Au_i instead.

Remark 4.5.
It is no significant restriction in theorem 4.3 to assume that all |γ i |/λ i are different for i = 1, . . . , n. If this were not the case, the corresponding eigenvectors would simply shrink away simultaneously. However, in order to avoid unnecessarily complicated formulae, we refrained from considering this case.

Corollary 4.8 (Linear combination of eigenvectors II). Under the conditions of theorem 4.3 the minimizer u_t of E_t(·; f) is u_t = v_{S(t)}, where S is given by
Here, t_k := T(τ_k) = τ_k/R(τ_k) for k = 0, …, n, and S(t) := 0 on the interval corresponding to k = 1.
Proof. From the definition of v_τ in (4.4) we easily see, using the orthogonality of the u_i's, that Inverting this on the intervals (τ_{k−1}, τ_k) for k ≥ 2 yields the expression for S. Furthermore, it holds that t_k = T(τ_k) < 1/‖p_k‖_H for k ≥ 2, which makes S well-defined and continuous. Noting that S is the inverse of T(τ) for τ > τ_1 and applying lemmas 2.22 and 2.23 shows that u_t = v_{S(t)}. □ Now we investigate the spectral representations φ_t and ϕ_τ under the conditions of theorem 4.3. By means of corollary 4.8, we find From (4.5) it is obvious that S is continuously differentiable on the intervals (t_{k−1}, t_k) and discontinuous only at t_1. Hence, the measure φ_t is singular only at t* := t_1 = T(τ_1) = 1/‖p_1‖_H and, since S is continuously differentiable on (t_{k−1}, t_k), represented by a bounded function elsewhere. The jump of u_t at t* is given by f − v_{τ_1}, and hence the singular part of φ_t reduces to This can be considered bad news since, on the one hand, the spectral representation φ of the contrast-invariant problem E_t(·; f) is not able to isolate an individual eigenvector although it has a delta peak at t*. On the other hand, the time point t* where the peak occurs is independent of the specific eigenvector that vanishes. Thus, it cannot be brought into correspondence with the eigenvalue λ_1 or the factor γ_1. In contrast, the spectral representation ϕ is given by which is a sum of singular Dirac measures and hence a perfect decomposition of the data f into its components.
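For J = ‖·‖₁ on R^n the unit coordinate vectors e_i are eigenvectors with λ_i = 1, and the minimizer v_τ of F_τ(·; f) is componentwise soft-thresholding, so the piecewise affine path of theorem 4.3 and the perfect decomposition behavior of ϕ_τ can be illustrated numerically. This is our own sketch, not code from the paper:

```python
import numpy as np

# J = l1-norm: v_tau = soft-threshold(f, tau); coordinate vectors e_i are
# eigenvectors with lambda_i = 1, so the critical times are tau_i = |gamma_i|
def v(f, tau):
    return np.sign(f) * np.maximum(np.abs(f) - tau, 0.0)

f = np.array([1.0, 2.0, 3.0])             # gamma = (1, 2, 3) on orthogonal e_i
taus = np.linspace(0.0, 3.5, 701)
h = taus[1] - taus[0]
path = np.array([v(f, t) for t in taus])  # piecewise affine in tau

# phi_tau = tau * v_tau'' via second differences: Dirac peaks at tau = 1, 2, 3
vpp = (path[2:] - 2.0 * path[1:-1] + path[:-2]) / h ** 2
phi = taus[1:-1, None] * vpp

# integrating phi over tau reconstructs the data: each kink carries mass gamma_i
assert np.allclose(phi.sum(axis=0) * h, f, atol=0.05)
```

Each coordinate of the path shrinks linearly and vanishes at τ_i = γ_i/λ_i = γ_i, and the discrete spectrum ‖ϕ_τ‖₁ consists of numerical delta peaks at exactly these critical times.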

Affine solution paths of the quadratic problem
Theorem 4.3 in particular states that if the data f are a linear combination of eigenvectors satisfying additional, fairly strong conditions, the corresponding solution path v_τ is piecewise affine in the time variable. In [3] this has been proven in finite dimensions under the condition that J is a polyhedral semi-norm. In infinite dimensions and for general data f this behavior cannot be expected. However, we would like to find a condition which ensures that the solution path v_τ is affine in τ at least on a small interval [0, τ̄]. Due to theorem 2.26 this is in one-to-one correspondence with an exact penalization effect of the corresponding contrast-invariant problem E^{1,1}_t(·; f) and, hence, with the validity of conditions (RC) and (SC). We start with equivalent reformulations of this behavior and give several illustrative examples in finite and infinite dimensions.
By Moreau's identity (see [36] for a finite dimensional version), we find that the minimizer v_τ of F_τ(·; f) is given by v_τ = f − τ proj_K(f/τ). (4.8) Here we used that J = χ*_K (see (A.5), (A.6)) and let proj_K(·) denote the projection onto the closed and convex set K with respect to the Hilbert norm ‖·‖_H, which is well-defined since 0 ∈ K ∩ H. Remark 4.9. While Moreau's identity is often formulated in Hilbert spaces or finite dimensions, the identity p ∈ ∂J(u) ⟺ u ∈ ∂J*(p), which holds for lower semi-continuous and convex J defined on a Banach space X (see [37, ch 5]), makes it easy to show that it is applicable also in our slightly more general setting.
The beauty of the representation (4.8) lies in the fact that it allows us to study the solution path v τ by investigating the geometric properties of the set K and the projection onto it.
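Representation (4.8) is easy to verify numerically in the prototypical case J = ‖·‖₁, where K = ∂J(0) is the ℓ∞ unit ball and the projection is a componentwise clip. The following sketch (our own illustration) checks that the projection formula reproduces the direct soft-thresholding minimizer:

```python
import numpy as np

# Moreau-type identity (4.8) for J = l1-norm: K = dJ(0) is the l_inf unit ball,
# and v_tau = f - tau * proj_K(f / tau) recovers componentwise soft-thresholding
def proj_linf_ball(x):
    return np.clip(x, -1.0, 1.0)

def v_moreau(f, tau):
    return f - tau * proj_linf_ball(f / tau)

def v_prox(f, tau):
    # direct minimizer of 0.5*||v - f||^2 + tau*||v||_1
    return np.sign(f) * np.maximum(np.abs(f) - tau, 0.0)

f = np.array([2.0, -0.3, 1.1, 0.0])
for tau in [0.1, 0.5, 1.0, 2.5]:
    assert np.allclose(v_moreau(f, tau), v_prox(f, tau))
```

For large τ the whole vector f/τ lies inside K, the projection is the identity, and v_τ = 0, which is the finite extinction time read off from the geometry of K.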
Using (4.8), the residual is given by R(τ) = τ‖proj_K(f/τ)‖_H. Taking theorem 2.26 into account, the following statements are equivalent: Note that (4.9) is always fulfilled if K ⊂ R^n is polyhedral¹ since in this case the solution v_τ is piecewise affine with v_τ = f − τp for τ ∈ [0, τ̄] and p ∈ ∂J(f), as was shown in [3] or, less generally, for LASSO/ℓ¹ problems in [38–41]. However, the condition of a polyhedral K is neither necessary nor can it be completely waived, as the following examples show.
Example 4.10. Let a_1, a_2 > 0 with a_1 ≠ a_2, let M = diag(a_1, a_2), and J(u) = ⟨u, Mu⟩^{1/2}. Then K is an ellipse with semi-axes √a_1 and √a_2 and, therefore, not polyhedral. If f is not an eigenvector, i.e. f is not parallel to a semi-axis, the projection of f/τ onto K does not equal the subgradient ∂J(f) for any τ > 0, as can easily be seen from the corresponding Karush–Kuhn–Tucker conditions. Hence, conditions (4.9) are violated and there is no affine behavior.
Example 4.11. Now let J : R² → R be given by Then K coincides with the unit square in the first and the third quadrant, and with the unit circle in the remaining quadrants of R² (figure 2). It is easy to see that all vectors in the second and fourth quadrant are eigenvectors and hence (4.9) trivially holds. The solution path of vectors in the first and third quadrant is also piecewise affine since the problem coincides with standard ℓ¹ shrinkage (see the references above) there. Note that K is not polyhedral either.

¹ Polyhedral in this context means being the convex hull of a finite set of vectors.

If f is piecewise constant, then according to [42] the solution v_τ is piecewise affine with v_τ = f − τp for τ ∈ [0, τ̄] and p ∈ ∂J(f). In [43] the authors prove similar results in two dimensions, using anisotropic total variation as regularization and assuming the data to be piecewise constant on rectangles.
The following theorem characterizes an affine solution path for small times.
are piecewise linear in τ, as expected. Also the spectra, which are defined as the 1-norm of the spectral representations φ_t and ϕ_τ and are depicted in the bottom row of figure 4, match our analytic results (4.6) and (4.7) since they possess a numerical delta peak at t* or at the four critical times, respectively. In particular, we see that φ_t does not have any atoms for t ≠ t*.
Note that the height of the spectral peaks is not informative since the measure at these points is given by a multiple of a Dirac measure which has 'infinite height'.

Total variation scale space
Next, we turn to the ROF model and the variant with non-squared L²-norm, respectively. The data f are given by the 'Barbara' image shown in the top right corner of figure 6. The top row of figure 5 shows the residuals and regularizers of u_t and v_τ, respectively. We can observe that there is a positive t* and that there are no kinks, meaning there is no visible piecewise behavior of the solution paths. The magnitudes of the spectral representations are given in the bottom row of figure 5. Note that both spectra, again defined as the 1-norm of the spectral representations, behave very regularly and do not show any numerical delta peaks. However, the spectrum of ϕ_τ contains much more information, encoded in two elevations that are marked in red (dotted line) and blue (dashed line). The top row in figure 6 shows the corresponding spectral components ϕ_τ integrated with respect to τ over the red and blue area, respectively (see (4.2)). This procedure can be viewed as band-pass filtering with respect to the nonlinear frequency decomposition ϕ_τ and allows us to extract and manipulate patterns and textures from the original image. In our example, these images correspond to differently oriented stripe patterns on the table cloth and Barbara's clothing. The spectrum of φ_t, however, cannot be used for this task since the only two significant parts of the spectrum, marked in the same fashion, correspond to very fine and fine structures (see second row of figure 6) but do not separate different textures. We suspect that this behavior is explained by the closing remarks of section 4.1, according to which the TV model with α = 1 is scale invariant on eigenfunctions in two dimensions. Indeed, further numerical experiments indicate that the one-dimensional ROF model with non-squared data term is capable of capturing different scales.
Another popular filtering procedure is high-pass and low-pass filtering, which corresponds to keeping only the frequency components beyond or until a threshold frequency. The last two rows of figure 6 show the corresponding filtered images using the spectra of ϕ τ and φ t , respectively. Here, both methods succeed equally well in separating texture and objects. Regarding high-and low-pass filtering, it can be considered a slight advantage of the spectral representation generated by the scale and contrast-invariant model that the magnitude of the spectrum decreases more rapidly and that textures seem to be concentrated more compactly in the spectrum. This can make automatic filtering easier and more robust.

Conclusion
We have analyzed a family of variational regularization functionals with different powers of data fidelity and regularization terms, among which the model with quadratic fidelity and absolutely one-homogeneous regularization stands out as the 'standard choice'. Apart from trivial solutions, which are obtained for very small or very large values of the regularization parameter, all models generate the same set of minimizers. Therefore, if one simply aims at finding a regular approximate solution to the inverse problem, no specific weighting can be preferred over the others. However, if one is interested in the whole solution path and derivatives thereof with respect to the regularization parameter, the choice of the specific weighting becomes relevant. In particular, we have argued why it is necessary to choose the standard weighting in order to obtain nonlinear spectral decompositions. Furthermore, the failure of the contrast-invariant methods to decompose a linear combination of eigenvectors shows that enforcing consistency on a single eigenvector is not enough to define a meaningful spectral representation of arbitrary data.

Some open questions
We conclude this work by pointing out some interesting open questions that will be the subject of future research.
(i) It is an interesting question whether and how our results connect with generalized Cheeger sets (see [46]). It is well known that a convex set is calibrable if and only if it is a Cheeger set in itself. Furthermore, we have seen that the extinction time of a calibrable set Ω under TV with data term (1/α)‖u − f‖^α_{L²} is given by |Ω|^{α/2}/P(Ω), which is precisely the inverse Cheeger constant if Ω is a generalized Cheeger set, i.e. a minimizer of P(E)/|E|^m among all sets E ⊂ Ω with m := α/2, where usually 1 − 1/n < m < 1 is assumed, which corresponds to 2 − 2/n < α < 2.
(ii) Furthermore, a relevant open point is to find sufficient conditions for τ̄ > 0, meaning that v_τ is affine on an interval (0, τ̄). We suspect that the necessary condition from proposition 4.15 could also be sufficient but a proof is still pending. (iii) Related to the former point is the well-definedness of ϕ_τ as a Radon measure for general data. Certainly, a piecewise affine behavior of the solution path guarantees this, but this does not occur in general. However, we hope that formula (4.8) can be used to deduce the regularity of v_τ from the regularity of the boundary of the convex set K.