An Alternative Variational Framework for Image Denoising

We propose an alternative framework for total variation based image denoising models. The model is based on the minimization of the total variation with a functional coefficient, where, in this case, the functional coefficient is a function of the magnitude of image gradient. We determine the considerations to bear on the choice of the functional coefficient. With the use of an example functional, we demonstrate the effectiveness of a model chosen based on the proposed consideration. In addition, for the illustrative model, we prove the existence and uniqueness of the minimizer of the variational problem. The existence and uniqueness of the solution associated evolution equation are also established. Experimental results are included to demonstrate the effectiveness of the selected model in image restoration over the traditional methods of Perona-Malik (PM), total variation (TV), and the D- 𝛼 -PM method.


Introduction
The objective of any image restoration process should not focus only on the removal of noise, but it should also observe, as Perona and Malik [1], Koenderink [2], and Witkin [3] determined, that no new spurious details are created in the restored image; at each scale-space representation, the boundaries/edges are sharp or preserved, and, at all scales, intraregion smoothing is preferred to interregion smoothing.
In the light of the above considerations, researchers have observed that it is then logical to obtain or develop edge indicators that would be adapted to the local image structure [4].Consequently, a number of edge indicators have been proposed and logically grafted into the partial differential equation (PDE) based evolution equations [1,5].In addition, energy minimization problems are continually being formulated which focus on producing adaptive partial differential equations [6,7].
Total variation (TV) method, widely considered a powerful technique for smoothing, edge preservation, and general image restoration, was first proposed by Rudin et al. [8].The method is based on the strength of the argument that TV norms principally are  1 -norms of derivatives and that  1norms provide the proper basis for image restoration [8,9].
TV functionals are defined in the space of functions of bounded variation (BV) and therefore do not necessarily require image functions to be continuous and smooth.This fact makes them allow for "jumps" or discontinuities and thus be able to protect edges.
The  2 -norm of the gradient of image allows removing noise; however, it has the adverse effect of penalizing too much the gradients corresponding to edges [10].A functional based on the  2 -norm does not permit discontinuities in the solution, and thus edges cannot be recovered properly [11].
The total variation (TV) norm in [8] is a regularization functional of the form Although TV regularization above allows for edge recovery, it has some demerits.Firstly, the formulation favours solutions which are piecewise constant.This has the effect of causing staircase effects and may even generate false edges on the image [12].Secondly, the method has the effect of reducing contrast even in regions of the same pixel intensity or in noise-free observed images [13,14].Various modifications have, therefore, been proposed, in an attempt to address the drawbacks of the TV model, to 2 Abstract and Applied Analysis make it as adaptive as possible to the local image structure.For instance, Strong and Chan in [15] proposed the spatially adaptive regularization functional of the form where the control factor () is designed to slow diffusion in the neighbourhood of edges.Then, we have the efforts, according to Blomgren et al. in [13], which produced a denoising functional of the form where  is a nondecreasing function of the magnitude of gradient, () → 2 as  → 0 and () → 1 as  → ∞.This model is designed to automatically tap into the benefits of both isotropic diffusion and TV regularization.For other modifications, we refer the reader to, among others, the works in [12,16,17].
In this paper, however, we propose an alternative framework of variational model for image denoising, where the regularization potential is a product of a gradient based functional coefficient and the norm of gradient of image (potential function for TV); namely, () = ∫ Ω Ψ(|∇|) ⋅ |∇|.That is, the coefficient function is a function of the magnitude of the gradient of the image.We propose the criteria for the choice of the coefficient.Finally, we have selected a model based on the proposed criteria, as an example for further analysis and demonstration of experimental results.
The structure of this paper is as follows.In Section 2, we present the proposed model (4) and discuss the general criteria for choosing the functional coefficient to the traditional TV potential.Section 2 is concluded by considering, based on the proposed criteria, an example functional for further analysis.In Section 3, we give certain preliminary definitions and lemmas we rely on, variously, in this paper.In addition, we prove the existence and uniqueness of the solution to the minimization problem (11) and the associated evolution equation ( 27)- (29).In Section 4, we define the weak solution to the evolution problem ( 27)- (11).Furthermore, we present the formulation of the approximate evolution equation ( 32)- (36), prove the existence of solutions to the approximate evolution problem, and conclude Section 4 with the existence and uniqueness of the solution to the evolution problem ( 27)- (29).In Section 5, we give the numerical schemes and experimental results to demonstrate the strength and effectiveness of our method.Additionally, we have presented a brief discussional comparison of our results with those of other methods like PM, original TV, and D--PM method by Guo et al. [5].A brief summary concludes the paper in Section 6.

Proposed Model
In this section, inspired by the works of Chan and Shen [18], Vese [19], and Chen et al. [12], among others, we propose an alternative framework for total variation based denoising model.The model is based on minimization of a functional, where the regularization potential is a product of the total variation potential and a gradient based functional coefficient.The coefficient, which acts to penalize the norm of gradient and detect any edges, is also taken as function of the norm of the gradient of image.So, we present the general form of the model and determine certain properties of the functional coefficient, especially in terms of linearity, sublinearity, and superlinearity growth at infinity.Ultimately, we have selected a specific example of model, based on the proposed guidelines, for further analysis.

The New
Framework for Energy Functional.The proposed energy functional is given in the following general form: where Ψ() is a function that detects edges and penalizes the norm of gradient and  is the noise image.Observe that if we set Ψ() = , then the kernel of () becomes a variational problem of the  2 -norm that is known not to allow for discontinuities, since it leads to the traditional edge obliterating isotropic diffusion [20], while if Ψ() = 1 the problem becomes the usual TV functional, which allows a diffusion mechanism that is strictly normal to the image gradient [12].The functional () in ( 4) should verify the Euler-Lagrange equation Next, we establish certain properties of Ψ() from ( 5) that should influence its choice.For the purposes of denoising, to be able to smooth the image in homogeneous regions, that is, regions where On the other hand, to preserve discontinuities (edges) in the image, that is, in areas where  = |∇| → +∞, Ψ() takes a form such that lim  → +∞ (Ψ  (|∇|) + (Ψ(|∇|)/|∇|)) = 0. Now, for a little more precision on the expected behaviour of Ψ(), we decompose divergence part in terms of the tangent and normal directions to the isophote lines.Equation (5) then becomes Then, to be able to achieve isotropic diffusion within homogeneous regions, from (6), we may impose the condition that lim where  is a constant.The above condition yields from (6) a diffusion model that diffuses isotropically.
In the regions neighboring the edges, the model should dissipate diffusion effects across the edges and favor diffusion along the edges.This implies that as  → ∞, [Ψ  () + (Ψ()/)] →  1 , where  1 is an arbitrary positive constant, while [2Ψ  () + Ψ  () ⋅ ] → 0. These conditions might be difficult to meet simultaneously for Ψ().Therefore, motivated by the works in [10,19], a weaker compromise is imposed by demanding that both terms approach zero, but at different rates, with diffusion along the direction normal to the isophote lines approaching zero faster than the diffusion along the tangent to isophote lines.This leads to the condition that lim Hence, we observe that, for Ψ() to become an effective coefficient in the functional in (4), it should be such that conditions (7) and ( 8) are satisfied.
For simplicity of notation in subsequent stages, we will denote, from (5), Furthermore, we are interested in the properties of Ψ() with regard to its growth at infinity and its overall impact on the functional (): then  () is superlinear, 0, then  () is either superlinear or linear, +∞, then  () is superlinear.(10) Note that  > 0 is a constant and growth.

Summary on the Characterization of Growth of Ψ( ⋅ ).
The aim of this section is to discuss the property of the function coefficient Ψ() that will lead to a functional () of linear growth at infinity, but, of course, subject to conditions (7)- (8).The characterization is based upon the results of process (10).
Observe that setting the coefficient Ψ() in the regularization kernel, as a function of the norm of gradient of the image to be linear growth, leads to a functional () in (4), which is of superlinear growth.Such a functional yields associated diffusion equation which either diffuses images uniformly, without being sensitive to discontinuities or edges, or generally does not yield good denoising results [10,21].
Choosing Ψ() that is of superlinear growth leads to () which is invariably of superlinear growth.For instance, if we set Ψ() =   , ( > 1) which is of superlinear growth at infinity, then we obtain the function Ψ() ⋅  =  +1 .This functional is of superlinear growth and does not give good results in image denoising.This is because, with such a choice of Ψ(), the derivative of Ψ() ⋅  would yield a nondegenerate elliptic differential operator of the second order, which would have an oversmoothing effect under the optimality condition [22].
Bildhauer and Fuchs [21] reckon that a regularization function need not necessarily be of power growth.Therefore, another example for Ψ() of superlinear growth is Ψ() ⋅  =  2 log(1+), where Ψ() =  log(+1) is viewed as some kind of compromise between the cases  > 1 and  = 1 of coefficient of power growth [21].This formulation too does not promise any better results.
Consequently, we observe that to achieve maximal results, which entails reconstructing a noise image in such a way that the edges are protected while diffusion within homogeneous regions proceeds rather uniformly, it is required that Ψ() be of sublinear growth at infinity.Also, the chosen Ψ() should be such that the functional () generated is of linear growth.Moreover, Ψ() should be such that the conditions in ( 7)-(8) are satisfied.Functionals which arise from such potential functions tend to give better image denoising results.Some examples of the functions Ψ() of sublinear growth are Ψ() := /(1 + ), Ψ() := log(1 + ), and Ψ() := 1/(1 + ).The function Ψ() := /(1 + ) is a good candidate for further consideration.
In this paper, therefore, we choose the control function Ψ() := /(1 + ), which is function of sublinear growth at infinity.This leads to a nondecreasing potential function for  ∈ R + , which is strictly convex and of linear growth at infinity given in the form Ψ() ⋅  :=  2 /(1 + ).
With this choice, we observe the following further properties of Ψ(); (iii) Ψ() is nondecreasing for  ≥ 0. This, as observed from (6), ensures only forward diffusion along the tangent to the isophote lines.

The Minimization Problem
Our model based on the presentation above is then given by min where Hence, in this section, we study the existence and uniqueness of the minimization problem (11).The following preliminary information prefaces our reasoning here and in the subsequent sections of this paper.

Preliminaries Definition 1.
Let Ω be an open subset of R  .A function  ∈  1 (Ω) has a bounded variation in Ω, denoted by BV(Ω), if where BV(Ω) denotes the space of such functions.Then, BVnorm is given by Definition 2 (see [23]).If  ∈ BV(Ω), then And writing  and its total variation on Ω, ||(Ω), one has (see [7]) is a Radon measure, where ∇ is the density of the absolutely continuous part of  with respect to the Lebesgue measure, L  , and    is the singular part.

Existence and Uniqueness of Solution to the Minimization
Problem.Since the objective here is to preserve edges, which are viewed as discontinuities, the natural function space to seek the solution is the BV(Ω) space.In image analysis, the BV(Ω) space allows for discontinuities across the edges of images.Furthermore, due to nonreflexivity of the  1 (Ω) space, which would have been a more natural space within which to seek the solution, it is noted that the solution to the minimization problem might not exist there.Therefore, the BV- * (Ω), which denotes BV- * topology, provides the most reasonable alternative space for the existence of the solution.In what follows, we will, for convenience of notation, refer to BV- * (Ω) simply as BV(Ω).This space allows obtaining compactness because of the separability of  1 (Ω) space even though it is not reflexive [10].Hence, we state and prove the following existence and uniqueness theorem for our problem.
Proof.Since (∇) as specifically defined above for ( 11) is of linear growth and given the fact thatlim  →∞ () = +∞, it implies that it is also coercive.Consequently, there exists a sequence {  } ∈ BV(Ω) ∩  2 (Ω) such that (  ) ≤ .Hence, we have ∫ Ω (∇  ) < , and, from the inequality of the second component of This indicates that {  } is bounded in  2 (Ω) and  1 (Ω).And from (  ) ≤  it is also clear that   is abounded in BV(Ω).Thus, there exists a subsequence And on the strength of the lemma on convex functions of measures (see Lemma 4) and the weak lower semicontinuity of the  2 -norm for the second component of (), we deduce that () is lower semicontinuous.In fact, the recession function Φ ∞ () in the definition of convex function of measures is finite for our functional.That is, Φ ∞ () = 1.Hence, we obtain that or We, thus, have Thus,  ∈ BV(Ω) ∩  2 (Ω) is a minimizer of the problem (11).Uniqueness of  is drawn from the strict convexity of (∇) and convexity of the second component of (), which implies the overall strict convexity of the functional ().
Additionally, a strictly convex functional admits at most one minimum.This, then, implies that  is a unique minimizer of (11).

The Associated Evolution Equation.
From the energy minimization problem as assigned above, namely, min Abstract and Applied Analysis 5 we have the associated Euler-Lagrange equation given by (25) with the Neumann boundary condition We compute for  in the Euler-Lagrange equation ( 25) by putting it into a dynamical system, where the time  is used as an evolution parameter.Hence, the evolution equation associated with the minimization problem ( 4) is given by where () is the noise image,   := [0, ] × Ω, and In order to see whether the potential () := Ψ() ⋅ , as defined above, respects the general principle of image reconstruction, where it is required that reconstructed image be formed by homogeneous regions separated by sharp edges, we decompose the divergence term of (25) using the local image structures like tangent and normal directions to the isophote lines.Writing it in its nonconservative form, analogous to (6), yields Notice from (30) that as |∇| → 0, signaling homogeneous regions, the potential () behaves like a linear isotropic diffusion encouraging uniform smoothing in both the   and   directions.However, as |∇| → ∞, corresponding to the neighbourhood of the edges, diffusion rate along the normal (or   ) direction is diminished, while the diffusion along the tangent (or   ) direction is preferred, thereby preserving the edges.The model therefore is well-behaved, since it reasonably satisfies the principle cited above.
We then consider the difference approximating equation of (32) as follows: Denoting V  :=  ,  , then the above equation becomes div The idea here is to prove that if the value of  ,−1  is known and  ,0  =   , then (41) admits a weak solution V  :=  ,  .From (41), we may back-project to the general functional  defined in  1 (Ω), given by It can be shown that (⋅) above is convex and lower semicontinuous in  1, (Ω).Hence, there exists a minimizing sequence {   } for (V  ) such that (   ) = inf V  ∈ 1, (V  ).For simplicity and without loss of generality, we will let  = 1.Then, for any test function () ∈  ∞ 0 (Ω) with (40) and integration by parts, we have To obtain the approximate solution to the whole domain   , we denote where  for  ∈  ∞ 0 (  ).In the steps that follow, we obtain some estimates for    (, ) and V   (, ).For this purpose, let us choose the test function () in (43) such that () =  ,  −  ,−1

𝑝
. We then obtain Applying convexity leads to Summing (48) from  to  for 1 ≤  ≤  ≤  yields which implies that sup In addition, let us consider that Now, summing (48) from  to  leads to Equation ( 57) implies that Moreover, by definition of V   (, ), we see that for some   , V  , and .

Numerical Experiments
In this section, we show the performance of the proposed formulation in denoising images involving a Gaussian white noise.The results are then compared with those obtained by the classical methods of Perona and Malik (PM) [1], Rudin et al. (TV) [8], and Guo et al. [5].
The following numerical scheme has been proposed for the implementation of the model.

Additive Operator Splitting (AOS) Scheme.
Here, we have implemented the evolution problem ( 27)-( 29) using AOS scheme proposed by Weickert et al. in [33].Thus, the equations are discretized as follows: where  0 denotes the noise-free image,  is the denoised image,  ×  is the dimension of image, and | max  0 − min  0 | yields the gray scale range of the original image.SSIM, designed by Wang et al. [35], is a quality metric used to measure the similarity between any two images.It is widely considered to work in a manner analogous to the human visual system.As opposed to the PSNR, SNR, MAE, and MSE, which are error based measures, SSIM models image distortion as a combination of loss of correlation, luminance degradation, and contrast distortion.Given any two images  and  0 , SSIM measure is given by the formula SSIM (,  0 ) =  (,  0 ) ⋅  (,  0 ) ⋅  (,  0 ) .
(,  0 ) = (  0 +  3 )/(    0 +  3 ), where   0 is covariance between  and  0 , is a structure comparison measure which determines the correlation between the images  and  0 .It attains maximal value of 1 if structurally the two images coincide, but its value is equal to zero when there is absolutely no structural coincidence.The quantities  1 ,  2 , and  3 are small positive constants that avert the possibility of having zero denominators.
Taking the edge map EM() from (27) with  the tuning parameter, we have Hence, the corresponding edge similarity measure is given by All these measures have been deployed not only to judge the extent of signal recovery, but also to see the extent of structural coincidence between the original image and the reconstruction image.It should be noted, however, that there is no hard and fast rule for selecting between the error based measures and the structural similarity index measure [36].Therefore, any or both can be used at the same time to measure the quality of image recovery.
We have compared results of the construction by our method to those of Perona-Malik method [1], total variation method of Rudin et al. [8], and the D--PM algorithm proposed by [5].Adjustable parameters in the experiments were time step , thresholding\tuning parameter , and others depending on the respective model.However, fidelity parameter  was dynamically obtained in the course and process of iteration.
In Tables 1 and 2, we have, respectively, presented the results of the reconstruction for the synthetic image, Figure 1, and those of the textured image, Lena, Figure 2. The comparisons are based upon PSNR, MAE, SSIM, and PSNR E values and visual effects.
For the nontextured synthetic image, Figure 1, it is observed that, in spite of higher CPU time, our method enjoys superior signal and structural recovery, at PSNR = 38.52,MAE = 3.16, and SSIM = 0.9905 (see Table 1).In addition, as evidenced by Figure 1(e), reconstructions due the proposed model do not show the relative blurriness speckles evident in Figure 1(c) (by PM method); it does not manifest the relative blockiness visible on Figure 1(d) (by TV method) neither does it show heavily jagged edges apparent in Figure 1(f) due to the D--PM algorithm.Comparing Figure 1(e) produced by our method and Figure 1(a) of the original image, it can be observed that the differences are extremely marginal.The edges also seem better preserved and sharper.
Furthermore, although in terms of numerical metrics our method compares fairly well with those obtained by the D--PM, it should be noted that the increased number of parameters in the D- that have to be manually tweaked or manipulated to obtain optimal results aggregately hinders the effectiveness of the method.
For the textured image of Lena in Figure 2, the metrics evidence that our method still gives better signal recovery and structural coincidence compared to its comparisons (see Table 2).By visual observation, Figure 2(c) produced by PM method shows speckles and blur, while Figure 2(d) shows heavy presence of staircase effect, and Figure 2(f) shows heavier ramp-like features.However, Figure 2(e) produced by our method not only shows no speckles and blur but also manifests significant reduction of staircase effect.
In respect of the edge similarity measure PSNR E , for the nontexture image (see Figure 1), we observe a phenomenon where the PM method only marginally beats the proposed method.However, the method far outperforms the TV method.For the texture image (see Figure 2), experiments show that the proposed method generally outperforms the PM, TV, and D--PM methods in reconstruction of image from noise (see Tables 1 and 2).
In addition, experiments demonstrate that the tuning parameter  introduced in the implementation of the proposed model is texture sensitive.In the nontextured images,  tends to be very small, while as texture increases the value of  also tends to increase.Generally observed is that the AOS implementation algorithm shows gray value invariance, stability based on extremum principle, Lyapunov functionals, and convergence even for larger values of the scale parameter  [33].This does not apply in the case of TV and PM, since they become unstable for  > 0.25.The D--PM survives this eventuality only when implemented by the AOS algorithm.
Generally, therefore, the proposed formulation performs better than TV method and PM method and indeed fairly better than the D--PM algorithm not only in the quantitative metric measures but also in terms of visual effect.The D--PM method competes fairly well with the proposed method.However, the increased number of parameters in the D--PM may disadvantage its practical ability to produce optimal results.Moreover, our formulation seems to handle texture images better than its comparison.Finally, given the nonmonotone nature of the flux function of the D--PM model proposed by Guo et al. [5], it is an extremely ill-posed problem, generating both backward and forward diffusion at various times in the evolution process.Therefore, analytically speaking, its ability to arrive at steady state and produce data with minimum energy is highly in mathematical doubt.Notwithstanding the foregoing, the discrete or numerical implementation of the D--PM method still does give an approximated solution.

Conclusion
In this paper, we have proposed an alternative framework for total variation based image denoising models.The model is based on minimization of total variation with a functional coefficient.We set out to determine factors to consider when choosing the functional coefficient for the minimization potential, and we have established that, for a well-behaved minimization functional, it is desirable that the functional coefficient be of sublinear growth at infinity.The chosen functional coefficient, apart from its sublinear growth at infinity, must be one that results in a potential which is nondecreasing, of linear growth at infinity, and convex.Convexity can then be used to obtain lower semicontinuity, as it is also instrumental in determination of existence of a minimizer to the functional.We picked on an example to illustrate the effectiveness of the consideration.We have shown for the example that existence and uniqueness of minimizer can be established and that the evolution equation has a unique entropy solution.In addition, numerical implementation of the evolution equation on images demonstrates better denoising results than those offered by traditional competition due to PM method, TV method, and even the D--PM in [5].
The successful application or otherwise of any regularization formulation in practical denoising always depends on the type of image under consideration, the type of noise involved, the application intended for the restored image, and indeed the platform upon which the formulation is implemented.Consequently, some of the constraints that might affect the practical performance of a regularization formulation modeled along the lines proposed here, including the sample formulation used, include the nature of the noise and even the hardware in which the implementation has been carried out.For more efficient performance, we suggest that platform with higher processing capabilities be deployed.The extent of degradation and the level of preservation required are also other factors that determine the performance of the formulation.For instance, in case of heavy degradation, it is widely recommended that a preconvolution of the noise image with Gaussian be done before the implementation of the suggested formulation to recover edges and other semantically important features.In addition, there may also be cases where noise removal may be counterproductive.This may occur when the oscillation due to noise is of the same scale as that of the features or patterns for which preservation is intended [37].To surmount such a challenge, one may require a combination of formulations.However, generally, we have observed that models derived along the lines of the approach proposed here give formulations which not only generate better results but also submit rather progressively to mathematical analysis.