A reweighted ` 2 method for image restoration with Poisson and mixed Poisson-Gaussian noise ∗

We study weighted ` 2 ﬁdelity in variational models for Poisson noise related image restoration problems. Gaussian approximation to Poisson noise statistic is adopted to de-duce weighted ` 2 ﬁdelity. Diﬀerent from usual weighted ` 2 approximation, we propose a reweighted ` 2 ﬁdelity with sparse regularization by framelets. Based on Split Bregman algorithm introduced in [20], the proposed numerical scheme is composed of three easy subproblems that involve quadratic minimization, soft shrinkage and matrix vector multiplications. Unlike usual least square approximation of Poisson noise, we dynamically update the underlying noise variance from previous estimate. The solution of the proposed algorithm is shown to be the same as the one obtained by minimizing Kullback-Leibler divergence ﬁdelity with the same regularization. This reweighted ` 2 formulation can be easily extended to mixed Poisson-Gaussian noise case. Finally, the eﬃciency and quality of the proposed algorithm compared to other Poisson noise removal methods are demonstrated through denoising and deblurring examples. Moreover, mixed Poisson-Gaussian noise tests are performed on both simulated and real digital images for further illustration of the performance of the proposed method.


Introduction
We consider an imaging system whose output data is a vector f ∈ R M and true underlying image is u ∈ R N . The observation model is often described by where A denotes a linear operator from R N to R M . In this paper, we focus on variational image restoration from Poisson and mixed Poisson-Gaussian noisy observation f . Typically, variational models for image restoration are composed of two terms, one is data fidelity term and the other is regularization term for modeling a priori knowledge on unknown images. In general, data fidelity term keeps true image u close enough to input data f , otherwise useful information may be discarded in the solution. According to different noise statistics, fidelity term takes different forms. It is well known that the least square fidelity is used for additive white Gaussian noise (AWGN), and it is mostly considered in literature for its good characterization of system noise. Whereas, non-Gaussian types of noise are also encountered in real images. An important variant is Poisson noise, which is generally observed in images produced by low number of photons, such as fluorescence microscopy, emission tomography, etc. A vast amount of literature in image restoration is devoted to problems encountering Poisson noisy images (see [2] and the references therein). The poisson noise data, i.e. the probability of receiving y particles is given by P(y) = e −τ τ y y! , y = 0, 1, 2, · · · (2) where τ is the expected value and the variance of random counts. Moreover, in the case of image acquired with CCD camera, the mixture of Poisson and read-out noise (modeled as AWGN) is a more appropriate model, although it is seldom considered in literature. In [23], a SURE estimator based on Poisson-Gaussian statistics is constructed in wavelet transform for mixed noise denoising, however the complicated construction of estimators can not be easily extended for more general image restoration problem such as deblurring. Recently, Gong et al. proposed in [21] a universal 1 + 2 fidelity term for mixed or unknown noise and achieve encouraging numerical results, however it is hard to derive a meaningful statistical understanding.
Besides fidelity term, we need a regularization term to control noise and artifacts in reconstructed images. A simple but effective idea is to urge a sparse representation in some transform domain for the recovered image. Wavelet type of transforms are often chosen to sparsify natural images. For a large class of applications, penalizing the 1 norm of its transform coefficients leads to sparse representation, as largely illustrated in compressive sensing communality in recent few years. On the other hand, choosing an appropriate transform basis for natural images are also important. One popular choice is total variation proposed by Rudin-Osher-Fatemi [27]. Framelets becomes very useful for different restoration tasks, see [12] and the reference therein. Recently, the relation of sparse framelets representation and total variation has been revealed in [9], and theoretically total variation can be interpreted as a special form of 1 framelets regularization. For this reason, we will consider framelets regularization in this paper.
The generalized Kullback-Leibler (KL)-divergence [10] fidelity is usually used in Poisson image restoration, and it is directly derived based on exact statistics of Poisson noise (see section 2.3.1 below). However, difficulties on computation stability and efficiency arise due to the logarithm function appeared in the KL-divergence function. We are interested in considering a least square based fidelity by approximating Poisson noise with appropriate Gaussian noise. In fact, the approximation of KL-divergence by a fixed weighted least-square method has been used for a long time in medical imaging reconstruction and image deblurring [2]. Recently, such a technique is restudied by Stagliano et al. [32], where the fidelity term is defined as a weighted least square involving the unknown image. Combining with regularization, a scaled gradient method with line search technique has been applied to solve the resulted problem. Here, we consider utilizing the efficiency of 2 fidelity based regularization algorithm and dynamically estimate Poisson noise variance. More precisely, we propose an iterative algorithm based on split Bregman iteration [20], originally proposed for 2 fidelity based 1 regularization, to solve the model with reweighted 2 fidelity. With this formulation, soft-shrinkage can be naturally incorporated as an efficient 1 minimization algorithm. Furthermore, the model gives a reasonable extension to the mixed Poisson-Gaussian noise case immediately with a small modification on the weight. This paper is organized in three consecutive parts. In the first part, we review the background of variational restoration models, split Bregman method and existing Poisson fidelities. Then we present the weighted 2 fidelity for Poisson noise through Gaussian approximation and maximizing likelihood (ML) and then build the corresponding image restoration models and algorithms. We also provide an analysis of the proposed models and algorithms which reveals the connection to the classical KL-divergence model. The models are further extended to the mixed noise cases. The last part is dedicated to numerical simulation where other fidelity terms and models are compared numerically to ours for Poisson image denoising/deblurring and the mixed Poisson-Gaussian noise case.

Variational image restoration model
The observation model in consideration takes the following generic form where A denotes a linear operator from R N → R M , c is fixed background and possibly 0, and is noise perturbation. Generally, for additive white Gaussian noise setting, it is assumed that c = 0 and follows independent normal distribution of mean 0 and variance σ 2 for each pixel. The operator A can either come from the imaging system such as the point spread function (PSF) or the disturbance related to the object like its own motion. For image denoising, A is an identity operator; for image deblurring, A is a convolution operator.
Using classical maximum a posterior probability (MAP) P(u|f ) estimation, image restoration by variational model is usually composed of the minimization of the sum of two terms, namely fidelity and regularization: where λ is a positive number for balancing the fidelity term F (u) and the regularization term G(u). The fidelity term is related to the noise characteristic and derived by the likehood function, while the regularization G(u) is designed based on a priori assumption on u. As introduced previously, it is nowadays standard technique to penalize the 1 norm of representation coefficients in transform domain. Therefore, the following variational model is largely studied for image restoration from AWGN: where · 1 denotes the usual 1 vector norm and D is a linear transform, such as discrete gradient used in total variation [27], Fourier transform, local cosine transforms, wavelet or framelets [8].

Split Bregman algorithm
Split Bregman method introduced in [20] is designed to solve the variational model taking the form as (5). More generally, we consider the minimization problem where F (u) is a convex functional representing fidelity, such as F (u) = 1 2 Au − f 2 2 as in the case of AWGN. By introducing an auxiliary variable d = Du, the alternating split bregman for solving (6) is given by the following scheme: for µ > 0, The efficiency of this algorithm relies on the close form solution of the second subproblem. In fact, d k+1 is given by the so-called soft-shrinkage operator where each operation is performed componentwisely. The nonsmooth optimization problem (6) is thus decomposed into three simple subproblems when F (u) is in quadratic form. This method has been demonstrated to be very efficient for 1 type of minimization in a large variant of applications. In a more general setting, it is shown to be equivalent to classical Douglas-Racheford and alternating direction multiplier method (ADMM) [14,28], thus the convergence was given in this framework. In [8], Cai et al. studied the application of such algorithm for framelets based image restoration and provide a convergence proof. Further approximation of the quadratic subproblem was considered in [34] in order to maximally decouple the subproblems. In this paper, we intend to take advantage of the efficiency of split Bregman and Poisson noise formulation to develop an efficient algorithm for Poisson noise related image restoration problem. For this reason, we will consider a reweighted 2 fidelity term which is closely related to the problem (6).

Poisson noise related fidelities
We assume that the observation vector f is corrupted by Poisson noise (see (2)), i.e.
f ∼ P(Au + c) As in [30,3,7], the following assumptions are made for the linear operator A and c:
• The linear operator A satisfies the following conditions: where A ij is the (i, j) element of the imaging matrix A.
• The fixed background image c > 0.
Since both f and u denote photo counts number, thus their elements are nonnegative. For most of image restoration model, A is the convolution or line integral operator, and these conditions can be easily fulfilled without loss of generality. The first assumption is used to avoid model deficiency as in KL-divergence and weighted 2 . The second assumption implies that for u ≥ 0, Au = 0 if and only if u = 0, see Lemma 3.1. In practice, if the background can be ignored, we can put c = 1 without degrading image quality, where 1 is the vector with each entry being 1. Given Au and c, we have the likelihood of observing f where (Au + c) i denotes the i th element of Au + c. By Poisson noise's property, we have the expectation (mean) and the variance f are Before presenting the proposed reweighted 2 fidelity, we review two existing fidelities for Poisson statistics for comparison.

KL-divergence
The most popular fidelity for Poisson noise is the generalized Kullback-Leibler (KL)-divergence fidelity [10], which can be derived directly by MAP method. By taking a negative log likehood (10), we obtain where 1 is the vector with each entry being 1.
If we neglect the constant term log (f !) which is unrelated to the unknown u, we obtain the following fidelity term Combining with sparse regularization and nonnegativity constraint on photon counts u, we get the restoration model as Generally, (14) is a difficult optimization problem because of the nonsmooth regularization term and KL-divergence term. Optimization of KL-divergence fidelity with nonnegative constraint are typically solved by a popular iterative algorithm called Expectation-Maximization (EM) algorithm [22]. It is known that the convergence of the EM algorithm is slow and it may introduce so-called "checkboard effect" [3,32,7]. In [7], the authors proposed a two-step iteration method called EM-TV for solving (14) when D is a discrete differential operator. We rename it as EM+ 1 algorithm for later adoption of framelets regularization. The algorithms is described as This algorithm and its variants have been proved to be efficient for Poisson noise removal in PET and nasoscopy image deconvolution [7]. The convergence with a damped parameter under adequate condition is provided in [6]. However, the conditions are rather hard to verify since it involves the sequential iterates u k and u k+1 . The other kind of method considered in [29,16] is to apply directly the split Bregman method (7) by introducing a variable d = Du on the model (14). Comparing with this kind of method, our proposed algorithm takes a much simpler form and is more convenient for implementation.

Anscombe transform
Another well-known technique of Poisson denoise is the Anscombe transform [1] proposed for image denoising, when the linear operator A in (9) is the identity transform. Anscombe transform is defined as followed When x is a random variable that obeys Poisson distribution with mean and variance τ , the transformed random variable Ax follows an approximated standard normal distribution N (τ, 1). We now apply the Anscombe transform on the observed data f in the form Letf := Af,ũ := A(u + c). Since f follows Poisson distribution with mean and variance u + c, thenf follows approximately a normal distribution N (ũ, 1). Hence, the usual least square fidelity term can be applied forf and we obtain the following denoising model (17) and the final image is given by u * = A −1ũ * − c. The regularization used in the above model (17) urges sparsity onũ in the transformed domain. This is roughly equivalent to require sparsity on u in Framelet domain since the Anscombe transform (16) is monotone increasing and keeps the order of magnitude of elements in u. We note that this model is generally applied for denoising model, and the adaption to image deblurring is not easy since it involves inverse of nonlinear transform.
3 Weighted least square for Poisson noise image restoration

Framelets regularization
Here, we choose tight frame basis because of its multi-resolution property and redundancy that is helpful both in algorithm implementation and sparse representation of images [11], [12]. The multi-resolution analysis (MRA) based wavelet can be generated by the unitary extension principle (UEP) of [26]. By collecting all the refinement masks of wavelet tight frame system, we can generate the fast tight frame transform or decomposition operator W. The matrix W is consisted of J + 1 sub-filtering operators W 0 , W 1 , ...W J . Among them, W 0 is the low-pass filtering operator and the rest are high-pass filtering operators. Correspondingly, by unitary extension principle [26], the operator W T is the fast tight frame reconstruction operator and we have W T W = I, i.e., W T Wu = u for any image u. More details on discrete algorithms of framelet transforms can be found in [12].

Model and algorithm
We consider Poisson noise for the observation model (9). Let Then the random variable can be interpreted as an additive perturbation noise. Given Au and c we have the mean of If we approximate the random difference by additive Gaussian noise, then follows normal distribution N (0, Au + c), i.e.
where Σ is the covariance matrix. By the independence of noise at each pixel, we have where diag(Au + c) is the diagonal matrix defined by M -dimentional vector Au + c. Using maximum likelihood, we take the negative log of the normal distribution (19) to get the following fidelity term as We denote weighted 2 norm of a vector x ∈ R N as x 2 Q = x T Qx, for Q being a symmetric positive definite matrix. Then (21) is reformulated as Notice that the division and square root operator in the 2 norm on the right hand side of (22) are both element wise. Also by the assumption that c > 0 and Au ≥ 0 for u ≥ 0, we have the positive definitiveness of Σ −1 . Note that (22) was used as a discrepancy principle for choosing the regularization parameter λ in [3]. However, this fidelity term (22) has a weight involving unknown u, and in practice, we need to either approximate or directly solve the nonlinear square problem. For example, the unknown weight Au + c can be approximated by the observed data f as Under this simplification, efficient least square based method can be applied together with regularization.
Combining with framelets sparse regularization and nonnegativity constraint, we have the following restoration model This formulation is considered in [32] and a scale gradient projection method is applied to directly solve the nonlinear least square term with sparse regularization, for which a smoothed version of the 1 regularization term was adopted for calculating the gradient. We are interested in taking advantage of the weighted least square structure and utilizing recently proposed efficient sparse regularization scheme, such as split Bregman presented in Section 2.2. In fact, suppose that there exists an iteration solving (24) and u k is the sequence of solutions derived from it. If u k converges to u * , then we are more or less solving the following minimization problem with a fixed u * min u≥0 1 2 Au for k is big enough. As u k becomes stable, we may approximate the weight 1 Au * +c by 1 Au k +c . With such an idea in mind, we combine with the popular Split Bregman iteration and derives a new algorithm called the reweighted 2 algorithm with split Bregman for Poisson noise: With an idea of being close to previous iteration for stableness of solutions, we can add a proximal term on the update of u and d. For given γ 1 , γ ≥ 0, the numerical scheme is described as followed: This proximal iteration was largely studied by Rockefeller [25] in the context of augmented lagrangian. It was used for least square minimization with 0 regularization in [13] to stabilize the iterations. With γ 1 = γ 2 = 0, it reduces to the previous algorithm. We describe the general method (27) in more detail in Algorithm 1. Note that the first step can be solved by a gradient method with a projection onto the nonnegative orthant. In practice, often a few iterations are sufficient to get a reasonable result.

Analysis
In this section, we justify the proposed algorithm by presenting some mathematical analysis. Our main theorem shows that the sequence of solutions u k generated by (27), if converge, it converges to a minimizer of KL-divergence fidelity model (14). We first present some lemmas to reveal the relation between the solutions of different models. In the following, we will refer to the KL-divergence fidelity model (14) replacing D by W. Proof. By the assumption, j A ij > 0, ∀i, we have (Au) i > 0 for any u ≥ 0 but u = 0 . Hence Au = 0 has trivial non-negative solution.  Proof. We can see that the minimizers of the problem (25) exists due to the usual coercivity of the objective function. Furthermore, the minimizer is unique due to the assumption Au = 0 if and only if u = 0 in nonnegative orthant. In fact, since A has trivial null space in nonnegative orthant, and Au + c > 0 for any u ≥ 0, thus the objective functiona is strict convex. Therefore we obtain the uniqueness of the solution.
In the following, we will denote the unique minimizer of (14) asû and show that the sequence generated by (27), if converge, converges toû. Theorem 3.4 Let (u k , d k , b k ) be the sequence generated by the algorithm (27). If (u k , d k , b k ) converges to (ũ,d,b), then u =û.

Acceleration
As recently proposed in [19], we show that Algorithm 1 can be accelerated using optimal first order methods, namely Nesterov method [24]. We refer to Algorithm 2 as the accelerated version of Algorithm 1. Note that the convergence of the accelerated algorithm for alternating direction multiplier method (ADMM) is demonstrated under some assumptions. In our reweighed case, it is not hard to see that Algorithm 2, if converges, converges to the same solution as Algorithm 1, based on the similar arguments in Theorem 3.4.

Extension to Poisson-Gaussian mixed noise
Previously, we mainly discuss models with Poisson noise only. In real imaging systems, besides Poisson noise which characterizes the fluctuation in counting number of photons, there is other system-born noise that can be approximated by AWGN, as considered in [31]. The literature on Poisson-Gaussian mixed noise are limited despite of its importance. We now explore the extended application of the reweighted 2 fidelity to mixed Gaussian-Poisson noise case. With a small modification of the Poisson noise version, weighted 2 fidelity can be adapted to the mixed noise case.
Let σ 2 be the variance of AWGN and the observed image f has distribution Approximating the PDF of f again by normal distribution (19) with covariance matrix Σ = diag(Au + c) + σ 2 I, we obtain the new fidelity term as followed for mixed noise: Combining with tight frame regularization, we have the following restoration model The algorithm solving this model is the same as Algorithm 1 except adding σ 2 in estimation and updating of covariance matrix Σ.

Numerical results
In this section, numerical experiments on Poisson-Gaussian mixed noise models (31) will be shown for synthesized and real digital photos for demonstrating the performance of the proposed Algorithm (1) and (2). In all implementations, we choose the piecewise linear B-spline wavelets with 2 to 3 levels of decomposition. The corresponding filter masks in discrete version are ]. For the initial guess of the variance matrix Σ 0 , although the observed data f is an acceptable estimation, we use a preprocessed data as a closer approximation to Au + c. In particular, we apply a bilateral filter [33] to preprocess the noised data f . This processing can be used for the first few steps of iteration till the sequence of solutions is stabilized. The bilateral filter B of an 2D digital image x is defined as where [i, j], [p, q] are indices of pixels, and G σs , G σr are Gaussian functions with variance σ s and σ r respectively. Given a M × N reference image u, the peak signal-to-noise ratios (PSNR) value of imagẽ u to u is defined by where u max and u min are the maximum and minimum respectively of the reference image u, and M , N are the size of 2D image. It is a quantitative measure of image quality asũ being the denoised result and u the ground truth.

Poisson denoise
For image denoising, the linear operator A reduces to the identity operator I. We compare first the three fidelities through their corresponding denoising method: the reweighted 2 method (27) with Algorithm 1, the KL-divergence model (14) with the algorithm (15) and the Anscombe transform model (17). The noisy images in our test are simulated as follows. The clean images are first rescaled to an intensity range from 0 to 120; then the Poisson noise is added in Matlab using the function 'poissrnd'. The parameters in each algorithm are tuned to get the best visual outcome for one simulated noisy image; they are then fixed when applied on the rest images.
For the denoised results, we also add a post procedure of bilateral filter to remove some obvious artifacts.
The comparisons are performed in Figure 1 on several test images. The denoising results of the reweighted 2 model and that of the KL-divergence model are more or less the same both qualitatively and in terms of PSNR. The Anscombe transform model gives a slightly poorer result than the other two models. There are some artifacts in the high intensity part of the denoised image which may be caused by the post inverse Anscombe transform.

Deblurring from Poisson data
The deblurring case is in general more difficult than denoising. We compare our reweighted 2 method and the KL-divergence model (14) since Ancombre transform can not be easily applied in this case. For this application, we compare both Algorithm 1 and Algorithm 2 as the direct version and the accelerated version respectively. The EM+ 1 algorithm is applied for the KLdivergence model (14) as the denoising case. Blurred and noisy images are simulated in the following way. The clean images are first rescaled to an intensity range from 0 to 1200; then they are corrupted by a disk blurring kernel of radius 3 with symmetric boundary condition; finally, the Poisson noise is simulated in the same way as in the denoising case.
The main computation cost of deblurring Algorithm 1 is that u k+1 is solved by conjugate gradient method instead of a direct inversion in the first step of the iteration. As shown in Figure  2, the result of the KL-divergence fidelity algorithm is more blurry in contrast to the result of weighted 2 fidelity, although they have similar PSNR, see Table 1  In Figure 3, we compare the energy evolution of different algorithm. As we shown previously, Algorithm 1 and 2, if converge, converge to the solution of the model (14). We thus compare the KL-divergence model E(u) = 1 T (Au + c) − f T log(Au + c) + λ W u 1 of the three algorithms (Algorithm 1, the accelerated Algorithm 2, and EM-1 ). We can see that the energy for both Algorithm 1 and 2 decrease with the iterations, and Algorithm 2 has faster speed of convergence numerically. The evolution of weighted 2 -norm energy E(u) = 1 2 Au+c−f √ Au * +c 2 2 +λ Wu 1 is also shown in the right figure of Fig. 3, where u * is set as the ground truth image. Both reweighted based algorithms have decreasing energy, while EM-1 has a divergence energy for this fidelity. To reach the same PSNR level of the result given by the direct algorithm after 100 iterations, the accelerated Algorithm 2 needs less iterations (around 30 steps), See Fig. 4.
In terms of computation time, the three denoising algorithms run generally for 20 to 30s. In the deblurring case, the reweighted algorithms (1) needs around 100s while the EM-1 around 240s for the same relative error stopping criteria.   Deblurring results of simulated noisy-blurred images with peak intensity 1200. from left to right: noisy-blurred image, direct reweighted 2 algorithm 1 and KL-divergence model (14). From top to bottom: full images and zoomed out regions.  Finally, the weighted 2 and the KL-divergence denoising model have superior results than the Anscombe transform model; while in deblurring case, the reweighted 2 outperforms the KL-divergence one. From these observations, our reweighted 2 fidelity term is a competitive potential choice when working with complex noisy models.

Mixed Poisson-Gaussian noise restoration
In this section, we test the extension of reweighted 2 model to mixed Poisson-Gaussian noise for some synthesized data and real photos. The synthesized mixed noised image is simulated by adding first Poisson noise to a rescaled image with peak intensity 120 and the Gaussian noise of σ = 12. Fig. 5 shows the denoising result of model (31) with A being identity operator. In addition, we perform the mixed denoising model on a digital photo taken in a low light environment with high ISO setting. Our denoising result is compared with the embedded noise reduction algorithm in Sony DSLR-A700 camera in Figure 6, and we can see that our result has less noise and better quality.
For deblurring case, we generate blurred image with peak intensity 1200 and then corrupt it with Poisson noise, Gaussian noise of σ = 12 consecutively. See Fig. 7 for the deblurring result of model (31).

Conclusion
In this paper, we studied weighted 2 fidelity term derived from Gaussian approximation of Poisson statistics. In solving denoising and deblurring models, we proposed a reweighted algorithm based on Split Bregman iteration and its Nestrov accelerated one. The algorithms have the same limit solution as the minimizer of KL-divergence fidelity model (14) if they converge. The proposed algorithms give competitive result with respect to classical fidelity terms in denoising and deblurring simulations. In addition, the reweighted 2 model can be easily extended to the mixed Poisson-Gaussian noise case. As shown by numerical results, the reweighted 2 with split Bregman framework is promising in a wide perspective.