A Total Fractional-Order Variation Model for Image Restoration with Non-homogeneous Boundary Conditions and its Numerical Solution

To overcome the weakness of a total variation based model for image restoration, various high order (typically second order) regularization models have been proposed and studied recently. In this paper we analyze and test a fractional-order derivative based total $\alpha$-order variation model, which can outperform the currently popular high order regularization models. There exist several previous works using total $\alpha$-order variations for image restoration; however first no analysis is done yet and second all tested formulations, differing from each other, utilize the zero Dirichlet boundary conditions which are not realistic (while non-zero boundary conditions violate definitions of fractional-order derivatives). This paper first reviews some results of fractional-order derivatives and then analyzes the theoretical properties of the proposed total $\alpha$-order variational model rigorously. It then develops four algorithms for solving the variational problem, one based on the variational Split-Bregman idea and three based on direct solution of the discretise-optimization problem. Numerical experiments show that, in terms of restoration quality and solution efficiency, the proposed model can produce highly competitive results, for smooth images, to two established high order models: the mean curvature and the total generalized variation.


Introduction
This paper presents a fractional-order derivative based regularizer for variational image restoration.It may be used for other imaging models such as image registration.Denote an observed image by z = z(x), x ∈ Ω ⊂ R d where Ω is the bounded domain of the image with d space dimension and has a Lipschitz boundary.Here we consider d = 2 and mainly the image denoising problem with an additive noise i.e. assume z = u + η 0 with η 0 representing some unknown Gaussian noise of mean zero and deviation σ, but most results are applicable to d > 2 and other noise models.

Image inverse problem
Restoring the unknown u (without any restrictions) from z is an inverse problem.According to the maximum likelihood principle [39], most image processing problems involve solving the least-square problem measuring the fidelity to z.For example, P (u) = u for image denoising, P (u) takes the template image T (x + u(x)) (and z = R(x) for a reference image) for image registration, and P (u) = P Ω1 (u(x)) for image inpainting with Ω 1 ⊂ Ω the subdomain with missing data.
The problem (1) is in general ill-posed due to non-uniqueness, therefore how to effectively solve it becomes a fundamental task in image sciences.The most popular idea is to regularize it so that the resulting wellposed problem admits an unique solution.The classical regularization technique by Tikhonov et.al [76] is to add a smoothing regularization term into the energy functional to derive the following minimization problem where λ is a positive constant.This model cannot preserve image edges, though it is simple to use.The total variation (TV) model by Rudin-Osher-Fatemi [67] or the ROF model is widely used, where σ is an estimate of the error η 0 between the noisy image z and the true data u.The ROF model preserves the image edges by seeking solutions of piecewise constant functions in the space of bounded variation functions (BV).A variety of methods based on the TV regularization have been developed to deal with the imaging problems such as image restoration [1,2,10,82], image registration [48,37,62], image decomposition [61,38,32], image inpainting [46,40,41,24] and image segmentation [16,77].Restoring smooth images in some applications where edges are not the main features presents difficulties for the ROF model as it can yield the so-called blocky (staircase) effects.Another disadvantage of the model is to the loss of image contrasts [52].It should be remarked that the recently popular method by the iterative regularization technique [60] can reduce the staircasing effect and improve on the image contrast to some extent; besides it provides a fast implementation.

High-order regularization
To remedy the above mentioned two drawbacks (stairicasing and contrast), two types of alternative regularizer to the TV have been proposed in the literature.The first type introduces higher order regularization into image variational models [22,72,7,52,74,31,15,84].The mean curvature-based variation denoising model was studied in [52,53,17,84] where the regularized solution u is obtained by solving the fourth-order Euler-Lagrangian equation.Bredies et al. [15] proposed the total generalized variation regularizer involving a linear combination of higher-order derivatives and the TV of u to model the image denoising while Chang et al. [25] considered a nonlinear combination of regularizer based on first and second order derivatives.For image inpainting, a high order regularization based on Euler's elastica of u is used in [72].Similarly the Euler's elastica energy [56] and mean curvature [36,29] are also proposed to transform the template image T (x + u) to map the reference image R(x) in image registration; see also [35,51].The above mentioned high order regularization methods are effective but due to high nonlinearity efficient numerical solution is a major issue.
The second type introduces fractional-order derivatives, which are widely studied in other research subjects beyond image processing [3,5,6,8,85], into regularization of images.For example, Bai and Feng [11] introduced first fractional-order derivative into anisotropic diffusion equations for noise removal where c(•) denotes the divergence parameter and D α * x denotes the adjoint operator of D α x , which may be viewed as a generalization of the Perona-Malik model.Although the above equation can be related to the Euler-Lagrange equations of an energy functional with the fractional derivative of the image intensity, generalizing commonly used PDE models, the energy minimization models are not studied as such.The discrete Fourier transform is used to implement the numerical algorithm assuming a periodic input image at its borders [11].See also [45,44,49,66] for more motivations and studies based on the above diffusion equation.Chen et al. [28,27,26] considered the fractional-order TV-L 2 image denoising model and numerically obtained improved denoising results over the Perona-Malik and ROF models; however no analysis was given.There, they converted this primal formulation into a dual problem for the new dual variable p = (p 1 , p 2 ) by u = f − div α p/λ and used a dual algorithm using the gradient descent idea similar to the Chambolle method [18] for the ROF.In [81], the authors proposed a discrete optimization framework for image denoising problem where the fractional order derivative is used to model the regularization term, which is solved by an alternating projection algorithm.See also [21].These works have reflected good performance of the fractional order derivative in achieving a satisfactory compromise such as no stair-casing and in preserving important fine-scale features such as edges and textures.These encouraging results motivated us to investigate this new model more closely.
There have been several other works involving discrete forms of an α-order derivative proposed to tackle image registration problem [78,54] and image inpainting problem [83].Comparing with the first type of high order models [15,36,29], a fractional order model (type two) is less nonlinear and hence is more amenable to developing fast iterative solvers.Clearly there is strong evidence to suggest that fractional order derivatives may be effective regularizer for imaging applications.There is an urgent need to establish a rigorous theory for the total α-order variation based variational model so that further applications to image inverse problems can be considered in a systematic way.

Our contributions
This work is substantially different from previous studies.We mainly focus on the continuous total α variation-based model, instead of discrete formulation, and its analysis and associated numerical algorithms.
Our contributions are four-fold: • We analyze properties of the total α-order variation laying foundations for applications to image inverse problems as a regulariser; • We give a new method for treating non-zero Dirichlet boundary conditions which represents a generalization of similar results that existed only in 1D to 2D; • We establish the convexity, the solvability and a solution theory for the total α-order variation model to make it more advantageous to work with than high order and non-convex counterparts (such as a mean curvature based model) which are not gradient based and do not have much known theory on their solutions; • We propose and test four solution algorithms (respectively Split-Bregman based, forward-backward algorithm, Nesterov accelerated method and fast iterative shrinkage-thresholding algorithm -FISTA) to solve the underlying total α-order variation model.We also compare with related models.
Our work is hoped to motivate further studies and facilitate future applications of α-order variation based regularizer to other imaging problems in the community.The rest of the paper is organized as follows.Section 2 reviews the definitions and basic properties of the fractional order derivative.Section 3 first defines the total α-order variation and the space of functions of fractional-order bounded variations.In this space, it then analyzes the the existence and the uniqueness of the solution of the total α-order variation based model for denoising.In Section 4, a boundary condition regularization method for treating nonzero Dirichlet boundary conditions is proposed to effectively employ and compute the fractional order derivatives of an image.Section 5 first discusses the discretization of the fractional order derivatives by a finite difference method and presents a Split-Bregman scheme for effective solution.Section 6 takes the alternative discretise-optimize solution approach and develops three optimization-based algorithms (Forward-backward algorithm, Nesterov accelerated method and FISTA) to solve the image denoising model.Experimental results are shown in Section 7, and the paper is concluded with a summary in Section 8.

Review of fractional-order derivatives
This section reviews definitions and simple properties of a fractional order derivative which has a long history and may be considered as a generalization of the integer order derivatives.Three popular definitions to be reviewed are the Riemann-Liouville (R-L), the Grünwald-Letnikov (G-L) and the Caputo definitions [55,59,63].
In this paper, a fraction α ∈ R + is assumed to lie in between two integers n−1, n i.e. 0 ≤ = n−1 < α < n and a fractional α-order differentiation at point x ∈ R is denoted by the differential operator D α [a,x] , where a and x are the bounds of the integral over a 1D computational domain.Undoubtedly, the gamma function is very important for the study of fractional derivative, which is defined by the integral [63] One of the basic properties is that Γ(z + 1) = zΓ(z) and hence Γ(n) = n!.Before introducing formal definitions, we review the following informative but classical example: has the solution given by the well-known formula This example helps to understand the formal definitions of fractional derivatives.In fact for 0 < α < 1, equation (7) taking on the form , and equation ( 8) taking on the form D α [0,x] f (x) = ψ(x) is defined as the fractional α-order left R-L derivative of f (x).As operators, under suitable conditions [68], we have D −α [0,x] D α [0,x] = I where I denotes the identity operator.
The first definition of a general order α derivative is the left sided R-L derivative Subsequently the right-sided R-L and the Riesz-R-L (central) fractional derivative are respectively given by The second definition is the G-L left-sided derivative denoted by which resembles the definition for an integer order derivative, where [ϑ] is the integer such that ϑ−1 < [ϑ] ≤ ϑ.
The third definition is the Caputo order α derivative defined by where f (n) denotes the n th -order derivative of function f (x).The right sided derivative and the Riesz-Caputo fractional derivative are similarly defined by When α = n − 1 is an integer, the above left-sided R-L definition reduces to the usual definition for a derivative.One notes that when a function is n − 1 times continuously differentiable and its nth derivative is integrable, the fractional derivatives by the above definitions are equivalent subject to homogeneous boundary conditions [63].However we do not require such equivalence for our image function u; refer to Remark 2 later.
Fractional derivatives have many interesting properties -below we review a few that are useful to this work.
Linearity.For a fractional derivative D α [a,x] by any of the above three definitions, then one has for any fractional differentiable functions f (x), g(x) and p, q ∈ R.This property will be shortly used to prove convexity and to derive the first order optimal conditions.Zero fractional derivatives.An integer derivative of an image u at pixels of flat regions may be close to zero but the left R-L derivative of a constant intensity function is not zero.One advantage of minimizing a R-L derivative instead of the total variation (image gradients) could be a non-constant solution.It would be interesting to know the kind of functions that have zero α-order derivatives.
Lemma 1 (Singularity) Assume that D α [a,x] is one of the above three fractional-order derivative operators.For any non-integer α > 0 and x > a, there exists a non-constant value function Proof.We give explicit constructions.Here we only consider the R-L and Caputo derivatives; for G-L derivative, we can derive a similar conclusion through their equivalency.
Remark 1 For our later applications §7, we take Boundary conditions.For the left R-L derivative D α [a,x] f (x) of f (x), one assumes that f (a) = 0 or f (b) = 0 for the right R-L derivative; otherwise there is a singularity at the end point.So the Riesz R-L derivative would require f (a) = f (b) = 0.One solution for nonzero Dirichlet boundary conditions for f would be to extract off a linear approximation g(x) (that coincides with f at x = a, b) and to consider D α [a,x] (f (x) − g(x)); however there was no such a method for the 2D case.In Jumarie's work [50], a simple alternative is to modify the R-L derivative to the following also ensuring that the new fractional derivative of a constant is equal to zero and removing the singularity at x = a [8].In Section 4, we present one method for treating nonzero Dirichlet boundary conditions in 2D.
3 The total α-order variation and its related model This section first studies the properties of the total α-order variation, second analyzes a total α-order variation based denoising model and finally presents a numerical algorithm.For the classical total variation based model, its solution lies in a suitable space called the function space BV(Ω) of bounded variation [23,69,15].From tests, the total fractional-order variation model can preserve both edges and smoothness of an image; we anticipate from the former that its solution should lie in a space similar to the BV space and from the latter that the smoothness is due to the non-local nature of the new regulariser.It turns out that for total α-order variation using α-order derivatives, a suitable space is the space BV α (Ω) of functions of α-bounded variation on Ω which will be defined and studied next.The work of this section is motivated by analysis of the total variation (TV) [1,9,18] and of the total generalized variation (TGV) [15].
In variational regularization methods, integration by parts involves the space of test functions in addition to the main solution space.Before discussing the total α-order variation, we give the following definition: ∂n i | ∂Ω = 0 for all i = 0, 1, . . ., , v is compactly supported continuous-integrable function in Ω. Therefore the -compactly supported continuous-integrable function space is denoted by C 0 (Ω, R d ).
Definition 2 (Total α-order variation) Let K denote the space of special test functions Then the total α-order variation of u is defined by We note that TV α (u) is the same for any definition of ∂ α φi ∂x α i because φ satisfies the equivalence conditions.
However for our applications in the paper, ∂ α u ∂x α i is generally not the same for different fractional derivatives (not even in the distributional sense).
Based on the α-BV semi-norm, the α-BV norm is defined by and further the space of functions of α-bounded variation on Ω can be defined by Taking sup φ(x) in the above inequality, we have lower semi-continuity from TV α (u) ≤ lim inf k→+∞ TV α (u k ) (see [1,33] for TV case).
Lemma 3 The space BV α (Ω) is a Banach space.
Proof.First we can see that BV α (Ω) is a normed space following immediately from the definitions of u L 1 (Ω) and total α-order variation TV α (u), so it only remains to prove completeness.Suppose {u k } is a Cauchy sequence in BV α (Ω); then, by the definition of the norm, it must also be a Cauchy sequence in L 1 (Ω).
According to the completeness of L 1 (Ω), there exists a function We shall show that u k → u in BV α (Ω).We know that for any > 0 there exists a positive integer N such that Remark 2 In the literature [63], the equivalence of different fractional derivatives requires stringent continuity conditions e.g. one has However for imaging applications (the objective function u), we do not require such equivalence.
To distinguish the two definitions, we shall continue using the superscript C for C derivatives based quantities such as C div α and C ∇ α while no superscript means that a quantity is based on the R-L derivative.
For any positive integer < +∞ be a function space embedding with the norm , where gives the α-order integration by parts formula (see [4]).Furthermore, applying ( 14) twice, we have shown the relationship where For any α > 0, using the dual relationship (15), one can obtain that By multiplying φ 0 by a suitable characteristic -compactly supported continuous function η in Ω (e.g., η ∈ C 0 (Ω, R d )) and then mollifying (see [1] for TV and [9,42]), the new in fact, it is easy to show that the lower semi-continuity Ω |∇ α u|dx ≤ lim inf k→∞ Ω |∇ α u k |dx holds in the space W α 1 (Ω) similar to the TV case [33]).

Lemma 4
The space W α p (Ω) is a Banach space.Proof.The p = 1 case is clear.Now for p = 1, let q satisfy 1/p + 1/q = 1.To obtain the lower semicontinuity, taking u ∈ W α p (Ω) and ψ(x) ∈ φ ∈ C 0 (Ω, R d ) φ(x) L q (Ω) ≤ 1 for all x ∈ Ω , the following inequality Then we can deduce the result, following the similar lines to proving Lemma 3.

Lemma 5
The following embedding results hold: Proof.The proof follows the linearity of fractional order derivatives, and the positively homogeneous and sub-additive properties of TV α (u).
Theory for a total α-order variation model.We are now ready to analyze model (5) or the total α-order variation model in a more precise form min To focus on the total α-order variation model in Ω = (0, 1) × (0, 1) ⊂ R 2 , we assume 1 < α < 2; the following theorem establishes convexity of the minimization problem (16).
Proof.Since F (u) is a strictly convex functional, the proof follows from Lemma 6.
If a Banach space X is reflexive (separable), then every bounded sequence in X (in X * ) has a weakly (weak * ) convergent subsequence [80,Prop. 38.2].Although BV α (Ω) is not reflexive, however, it is the dual of a separable space.Therefore we can give the following definition: From the above definition 3, we may derive the weak compactness of BV α (Ω) on the weak * topology.This, combined with the weak lower semi-continuity of E(u) and boundedness of Banach space BV α (Ω) (i.e, u is bounded in Banach space BV α (Ω)), yields the following result: Proof.Follow the similar lines of [80, Prop.38.12(d)]).
Proof.The convexity result of Theorem 1 leads to uniqueness of solutions.Refer to [80,Theorem 47C].
We remark that similar existence and uniqueness theories of the total variation problem can be found in [1,19,9].

Nonzero Dirichlet boundary conditions and regularization
The standard definitions for fractional derivatives require a function to have zero Dirichlet boundary conditions due to end singularity, but for imaging applications such conditions are unrealistic and too restrictive.To obtain the system for finding the unknown intensities u at inner nodes of a discretization grids, we have to use boundary conditions, but the difficulties caused by them in fractional derivative computations would be hard to overemphasize; inaccurate boundary conditions can easily lead to the oscillations near boundaries, so proper treatment of the boundary conditions for problems involving fractional derivatives is crucial.
In this section, we shall reduce nonzero Dirichlet boundary conditions to zero ones so that standard definitions and our introduced algorithms become applicable.The basic idea of boundary regularization is to introduce an auxiliary unknown function which satisfies the zero boundary conditions.In this way, the non-zero boundary conditions move to the right-hand side of the equation as a new known quantity.
We recall that, in the 1D case, if the boundary conditions are nonzero Below we generalize the above 1D idea to the 2D case, assuming that the four corners of the solution are given or accurately estimated: We can achieve zero conditions at the edges using the auxiliary function e 2 (x, y) Remark 4 It remains to address the question of how to obtain estimates of u(x, y) at corners and edges: 1.The true intensities a := u(0, 0), b := u(1, 0); c := u(1, 0), d := u(1, 1) in four corner points are unknown a priori, to build the auxiliary function e 1 (x, y), the solutions approximating to them should be solved from the observed image z(x, y) by the local smoothing or other simple techniques.
2. Similarly, the true edge intensities a 1 (y), a 2 (y), b 1 (x) and b 2 (x) are also not given, a reconstruction step on boundary ∂Ω must be proceeded in order to capture a robust solution.To do this, we can apply a 1D model.
According to Remark 4, we can propose a complete procedure for regularizing boundary conditions for 2D variational image inverse problems in edges and corners.
• Firstly, we restore image intensities in 4 corner points from an observed image z.A natural technique would be local smoothing operator for the region of corner points, the oscillations could also be reduced by many variational methods to local regions.
• Secondly, in order to reconstructed 4 edges from the restored intensities z(0, y), z(1, y), z(x, 0) and z(x, 1), the total α-order variation regularization would be used to solve four 1D inverse problems i.e. solve an equation like (16): Thus through e 1 (x, y), e 2 (x, y), we see that equation (17) reduces to finding the new image ū(x, y) with zero Dirichlet conditions and hence the standard definitions of fractional derivatives for ū(x, y) apply.

Discretization and Split-Bregman algorithm
Since solution uniqueness of our variational model ( 16) is resolved, we now consider how to seek a numerical solution of the total α-order variation model.We first reformulate it in preparation for employment of an efficient solver and then discuss some discretization details (by finite-differences) before presenting our Algorithm 1.

A Split-Bregman formulation
Inspired by Goldstein and Osher's Split-Bregman work [43], we introduce a special and new variable d(x) = (d 1 (x), d 2 (x)) T to the total α-order variation based model ( 16) to derive the following constrained optimization problem: To enforce the constraint condition, we transfer it into the Bregman formulation The above iterative scheme can be simplified to the two-step algorithm [43,70,71]: with the multiplier updated by iteration Here the two main subproblems of (21) are Subproblem d : min Further note that the subproblem d has a closed-form solution [43], while the subproblem u is determined by the associated Euler-Lagrange equation as shown below.
Theorem 4 Let u(x) be a minimizer of functional J(u) from (22).Then u(x) satisfies the following first order optimal condition with one of these sets of boundary conditions i) fixed : u(x) ∂Ω = b 1 (x), and ∂u(x) ∂n ∂Ω = b 2 (x); ii) homogeneous : where n denotes the unit outward normal and C div α denotes the divergence operator based on the C derivative.
Proof.Refer to Appendix.

Discretization of the fractional derivative
Before introducing the finite difference discretization of the fractional derivative, we define a spatial partition (x k , y l ) ( for all k = 0, 1, . . ., N + 1; l = 0, 1, . . ., M + 1) of image domain Ω.Assume u has a zero Dirichlet boundary condition (practically we apply the regularization method §4 first before discretization).Here we mainly consider the discretization of the α-order fractional derivative at the inner point (x k , y l ) (for all k =, 1, . . ., N ; l = 0, 1, . . ., M ) on Ω along x-direction by using the approach which is applicable to both the R-L and C derivatives [65,79], where = (−1) j α j , j = 0, 1, . . ., N + 1 and Alterative discretization for fractional derivatives in the Fourier space can be found in [11,44].
Observe from (24) that the first order estimate of the α-order fractional D α [a,b] f (x k , y l ) along x-direction at the point (x k , y l ) with a fixed y l is a linear combination of N + 2 values {f l 0 , f l 1 , . . ., f l N , f l N +1 }.After incorporating zero boundary condition in the matrix approximation of fractional derivative, all N equations of fractional derivatives along x direction in (24) can be written simultaneously in the matrix form (denote w = ω α 0 + ω α 2 ): From the definition of fractional order derivative (24), for any 1 < α < 2, the coefficients ω (α) k has the following properties [63,79]: Hence by the Gerschgorin circle theorem, one can derive that matrix B α N in ( 25) is a symmetric and negative definite Toeplitz matrix (i.e.−B α N is a positive definite Toeplitz matrix).We recall that the Kronecker product A ⊗ B of the p × q matrix A = [a ij ] and the n × m matrix B = [b rt ] is the np × mq matrix having the block structure A ⊗ B := [a ij B].Further vector (A ⊗ B)x can be computed by matrix scheme BXA T (i.e., [(A ⊗ B)x] s = [BXA T ] j,i with s = (i − 1)m + j), where the m × q matrix X is the reshape of the vector x along its column.
Let U ∈ R N ×M denote the solution matrix at all nodes (kh x ; lh y ), k = 1, . . ., N ; l = 1, . . ., M , corresponding to x-direction and y-direction spatial discretization nodes.Denote by u ∈ R NM ×1 the ordered solution vector of U .The direct and discrete analogue of differentiation of arbitrary α order derivative is where u (α) x = u

The Split-Bregman algorithm
In discrete form, we are ready to state the discretized equations in structured matrix form.The discrete scheme of ( 23) is given by T of vectors d and p (i = 1, 2).A matrix approximation equation is given as where D i and P i are N × M -size reshape matrices of vectors d i and p i for i = 1, 2, λ = (−1) n λ/µ.The following justifies the use of a conjugate gradient method for W U = F .Proof.For any matrix U = 0, it is easy to show that which completes the proof.
An implementation of this method may be summarized below: by solving the Moreau-Yosida problem with the l 1 regularization; step 4. Solve subproblem u: Find the solution U k+1 of ( 26) with an effective parameter λ µ by CG method; step 5. Update  with γ ∈ (0, 1]; step 6.Check the stopping condition; stop and return U * := U k+1 ; • else k := k + 1, go back to Step 3; • end step 7. Accept the correct solution U from boundary regularization.

Optimization based numerical methods
As many variational models are increasingly solved by the discretise-optimise approach, we now present three related algorithms for model ( 4) after applying a finite difference discretization.In this section, we assume that we have the zero Dirichlet boundary conditions for u mainly to simplify the notation.As in §5, the α-th order derivative u (α) x of u(x; y) along all x-direction nodes in Ω can be given by matrix B α N U , and similarly U (B α M ) T for y-direction (as U is the solution matrix).
Then using the discrete setting introduced above, the discretised problem of model ( 16) is min where We also have the adjoint relationship U, D * Φ = DU, Φ with DU = (B α N U, U (B α M ) T ) and Φ = (Φ 1 , Φ 2 ).In line with the literature, this model can be denoted by the convex optimization problem in a generic notation by min where one views We also need the notation where f 1 can be any other convex function and λ > 0.
To solve (28) by the methods to be presented, computation of the proximal point prox λ f1 (x) is a major and nontrivial step.We consider how to compute it when D = ∇ α , borrowing ideas from solving a similar problem of TV regularization.In a dual setting, Chambolle [18,20] firstly proposed a discrete dual method by optimizing a cost function consisting of two variants [18,28].Recently, one variant of this scheme is employed in [28] to effectively solve a fractional image model by a dual transform.The other variant is used in [26].
Define two projections as Noting ∂G(x,D * Φ) ∂x = D * Φ and that the optimal solution is where x = x k − γD * Φ and Φ is unknown.Based on methods of Chambolle [18] and Beck-Teboulle [13], we see that (29) can be used to reduce the min-max problem min to the dual problem max Φ∈V2 Proj V1 (x), D * Φ + 1 2γ Proj V1 (x) − x k 2 and further to i.e. max Φ∈V2 Proj V1 (x), Below we consider the operator The minimization problem min Φ∈V2 h(Φ) can be solved to obtain the Φ-update as follows using the gradient projection scheme of h(Φ) [13].Here L(h) ≤ 16γ 2 is the Lipschitz constant.Finally the proximal point prox γ f1 (x k ) is given by ( 29) once Φ is obtained; see also [13].

Forward-backward algorithm
Various applications in sparse optimizations stimulated the search for simple and efficient first-order methods.
The forward backward scheme for ( 28) is based (as the name suggests) on recursive application of an explicit forward step with respect to f 2 , i.e, and followed by an implicit backward step with respect to f 1 , i.e., min The scheme decouples the contributions of the functions f 1 and f 2 in a gradient descent step [12].The scheme is also known under the name of proximal gradient methods [73,30,70,12], since the implicit step relies on the computation of the so-called proximity operator.
The forward backward algorithm is summarised as follows.

Nesterov's method
As a gradient based method, though simple, the above method can exhibit a slow speed of convergence.For this reason, Nesterov [57] proposed an improved gradient method aiming to accelerate and modify the classical forward-backward splitting algorithm, while achieving an almost optimal convergence rate.As a consequence of this breakthrough, a few recent works have followed up the idea and improved techniques for some specific problems in signal or image processing [13,10].
Recently Nesterov [58] presented an accelerated multistep version, which converges as O( 1 r 2 ) (r is the iteration number).For a problem of type (28), this new method introduced a composite gradient mapping.We now show the algorithm as follows.

FISTA method
Beck and Teboulle [13,14] proposed a fast iterative shrinkage thresholding algorithm (FISTA) to solve the image denoising and deblurring model, The method applies the idea of Nesterov to the forward-backward splitting framework, resulting in the same optimal convergence rate as Nesterovs method but wider applicability.It can be applied to a variety of practical problems arising from sparse signal recovery, image processing and machine learning and hence has become a standard algorithm.
Applying it to (28), we obtain Algorithm 4 below.

Numerical results
Finally, we present some numerical results from using the four presented algorithms denoted by

Opti-FB:
Optimization based Forward-backward (Algorithm 2); Opti-Nesterov: Optimization based Nesterov Accelerated method (Algorithm 3); Opti-FISTA: Optimization based FISTA (Algorithm 4), and their comparisons with related methods.In all tests, an initial solution is the noisy image z(x, y), the algorithms solving the diffusion equation or optimization problem are stopped after achieving a relative residual of 10 −4 or a relative error of 10 −8 within 1000 outer and 15 inner iterations.Here we mainly compare the solution's visual quality, the snr (the signal-to-noise ratio) and psnr (the peak signal-to-noise ratio) values which are given where mean(u * ) is an average value of the true image u * , n x and n y denote the size of the test image z.It should be noted however that these valuations do not always correlate with human perception.In real life situations, the two measures are also not possible because the true image is not known.In general, an optimization problem may be solved many times to select a suitable regularization parameter λ or to optimize the solution for the underlying inverse problem; a solution is accepted when some stopping criterion is satisfied.It remains to carry out a systematic study on our new model as in [82] for the TV model.However we shall use the best (numerical) λ for all models in the following tests.
For denoising, F (u) = (u − z) 2 is the L 2 measure between the solution u and the observed image z.To intuitively describe the denoising ability, four sets of data will be used in this part (also see Fig.  Though our framework is readily applicable to image deblurring and image registration, here we only present denoising results.

Performance comparisons of boundary regularization
We first test the idea from §4.One the hand, the variational framework seeks the boundary conditions of a nonzero Dirichlet or a Neumann type on ∂Ω and also real images do have nonzero boundary conditions.On the other hand, fractional order derivatives require homogeneous boundary conditions (as used in works of many authors) due to end singularity.In order to aid accurate computation of the discretized fractional order derivative, in our work, a boundary processing technique §4 has been proposed to transform nonzero boundary conditions of observed data z into zero boundary conditions; hence a consequent matrix approximation to the fractional derivative operator D α [0, 1] can use a zero Dirichlet boundary condition.Here we test the performance and effectiveness of our boundary regularization against no regularization.The experiment is carried out on P1 -Parabolic surfaces as shown in Fig. 2, i.e., a synthetic image of size 256 × 256 and range [0, 1], in Fig. 7.1(a), which is added zero mean value Gaussian random noise with a mean variance δ = 15  256 to get the noisy image displayed in Fig. 7.1(d).For the boundary regularization case, the approximation u| ∂Ω from the observed data z| ∂Ω is from applying one dimensional fractional order variation model as described in §4.The treated case is named as 'Treated' whose results are depicted on

Comparisons of Algorithms 1-4
In Table 1, we compare the restoration quality (via psnr and snr ) of 4 Algorithms.There, all four test datasets are used.In the cases of synthetic images P1 and P2 with noise variation δ = 10 255 , λ is taken as 12000 and 3800 respectively and α = 1.6.In the cases of natural images P3 and P4 with noise variation δ = 5 255 , λ is taken as 18000 and 20000 respectively and α = 1.4.One can see that, from Table 1, Opti-Nesterov and PDE-SB perform similarly in terms of the best restoration quality (via psnr and snr ).However in efficiency (computation times cpu(s)), Opti-FISTA and PDE-SB are the best while Opti-Nesterov takes more computational times than other three algorithms.Evidently, overall, PDE-SB (Algorithm 1) shows the most consistence in good performance in tested cases considered.Since our model ( 16) contains two main parameters: α for the order of differentiation and λ as the coupling parameter for a regularized inverse problem, it is of interest to test their sensitivity on the restoration quality.
Here we shall test all Algorithms's sensitivity using the image P2 -Saddle surface of size 256 × 256, after adding zero mean value Gaussian random noise image of range [0, 1] and δ = 10 256 .Varying λ in a large range from 400 to 60000, all four algorithms are tested on this synthetic image with the results shown in Figs.3(a) and 3(c) for different stopping criterions (GSC: the general stopping criterions with the relative residual 10 −4 , relative error 10 −8 , inner iterations 10,SSC: the strong stopping criterions with the relative residual 10 −7 , relative error 10 −10 , inner iterations 25 ).Different from the TV denoising case where the regularization parameter λ is crucial for restoration quality [82], however, Figs.3(a) and 3(c) show that our total α-order variation regularization model still obtains a satisfactory solution for a large range of λ; this is a pleasing observation.Of course there exists an issue of an optimal choice.
Next varying α ∈ (1, 2) from 1.1 to 1.9, Figs.3(b) and 3(d) show four algorithms's restored results responding to two stopping conditions GSC and SSC.As represented, the smaller α leads to the blocky (staircase) effects in u and the larger α will make solution u too smooth along x 1 -and x 2 -directions respectively.For denoising, our test suggests that α = 1.6 is suitable for smooth problems because the diffusion coefficients are almost isotropic in all regions, leading to smooth deformation fields, and α = 1.4 is appropriate for nonsmooth problems because the diffusion coefficients are close to zero in regions representing large gradients of the fields, allowing discontinuities at those regions.
We should emphasize that the stopping criterions have impacted on the actual numerical implementation.In other words, if we drop the limit on the maximal number of inner iterations and relative residuals (and relative errors), some methods take too long but obtain the more satisfactory results.

Comparisons with other non-fractional variational models
In this test, we compare our total α-variation model(PDE-SB) with three popular methods for variational image denoising.The first compared approach is naturally the TV model proposed by Rudin et al. [67] because the total α-order variation model in this work is inspired by it.The second compared work is the mean curvature model [75] which also addresses the problem of restoring a good result for a smooth image; their approach is different from ours since it is focused on higher order regularization and a multigrid method.
See also [53,17,84].The third compared approach is the TGV model [15] involving a combination of first order and higher-order derivatives to reduce the staircasing effect of the bounded variation functional.
In Table 2, we first compare the restoration quality (via psnr, snr ) and efficiency (computation times cpu(s)) of four approaches by testing the artificial images (P1 -Parabolic surface, P2 -saddle surface) and the natural images (P3 -Pepper, P4 -Penguin); in each approach relevant parameters are shown in Table 2.We see that, with the emperically optimal parameters λ * , the differences of four models are very small, though our new and convex model is slightly better.In other tests where such optimal parameters are not used, our new model performs more robustly and better.
In order to present more visual differences, some stronger regularization parameters (λ * /2) and higher noise variations (with the noise level δ = 30 255 ) are tested, the solution's visual representations restoring the natural image P3 -Pepper in Fig. 4(b) are shown in Fig. 4.While ROF denoising leads to blocky results, the mean curvature model performs better in the smooth regions but exhibits more smooth near discontinuities, the total generalized variation model leads to further improvements over the aforementioned models.The total fractional-order variation model leads to significantly better results.The reason is that the new model tries to approximate the image based on affine functions or non-local high order smooth functions, which is clearly better in this case, in other words, our approach is more effective in eliminating the noise for smooth images and is competitive to high order methods; in efficiency the new approach (PDE-SB) is much faster than the TGV and the mean curvature.We also plot four error results between the restored and true images along a diagonal (magenta) line in Fig. 4(a) for comparison in Fig. 5; we see that PDE-SB produces the best restored surface, which show a major advantage (or better performance) of using our total α-order variation model ( 16) when the test image is smooth, and even when the contrast between meaningful objects and the background is low.
Mean curvature [75] TV [67] TGV [15] Total α-order model ( 16  The total α-order variation regularization with fractional order derivative is potentially useful in modeling all imaging problems.In this paper we analyzed rigorously a simple variational model using total α-order varia- tion for image denoising.One Split-Bregman based algorithm and three optimization-based algorithms were developed to solve the resulting image inverse problem.Instead of using the usual fixed and zero boundary conditions, we proposed a boundary regularization method to treat the fractional order derivatives.Numerical results show that the PDE-based Split-Bregman algorithm (PDE-SB) performs similarly to (though more stably than) optimization-based approaches while our boundary regularization method is essential for getting good results for imaging denoising.Moreover, PDE-SB outperforms currently competitive variational models in terms of restoration quality.There are still outstanding issues with our proposed model and algorithms; among others optimal selection of λ is to be addressed.Future work will also consider generalization of this work to other image inverse problems.

Theorem 5
The weighted matrices inner product W U, U = ij ( k W ik U kj )U ij is positive for any matrix U = 0, where W is a known positive definite operator.

Figure 2 :
Figure 2: Test for P1-Comparisons between treated and Non-treated cases for non-zero boundary conditions (δ = 15 256 ) using PDE-SB.The treated case has psnr =47.53 and snr =35.43, while the non-treated case has psnr =23.69 and snr =10.38.Clearly our boundary regularization §4 is effective while direct application of a fractional model leads to incorrect boundary restoration.Here the error r = u − u * F / u * F .

Figure 3 :
Figure 3: Sensitivity test of Algorithms 1-4 to parameters λ (with the fixed α = 1.6) and α (with the fixed λ = 3800) in the cases of the GSC and SSC stopping conditions.

Figure 4 :
Figure 4: Comparison I -Comparisons of our PDE-SB with TV, mean curvature and TGV models.