A Multiscale Denoising Framework Using Detection Theory with Application to Images from CMOS/CCD Sensors

Output from imaging sensors based on CMOS and CCD devices is prone to noise due to inherent electronic fluctuations and low photon count. The resulting noise in the acquired image can be effectively modelled as signal-dependent Poisson noise or as a mixture of Poisson and Gaussian noise. To that end, we propose a generalized framework based on detection theory and hypothesis testing coupled with the variance stability transformation (VST) for Poisson or Poisson-Gaussian denoising. VST transforms signal-dependent Poisson noise into signal-independent Gaussian noise with stable variance. Subsequently, multiscale transforms are employed on the noisy image to segregate signal and noise into separate coefficients. This facilitates the application of local binary hypothesis testing on multiple scales using the empirical distribution function (EDF) for the detection and removal of noise. We demonstrate the effectiveness of the proposed framework with different multiscale transforms and on a wide variety of input datasets.


Introduction
Digital images acquired using complementary metal oxide semiconductor (CMOS) or charge-coupled device (CCD) image sensors are subject to noise from two notable sources, i.e., electronic instruments and the photo-sensing devices [1,2]. This noise is typically modelled using a mixture of Poisson and Gaussian distributions, namely, the Poisson-Gaussian distribution [3,4]. In cases where the Poisson component of the noise is dominant, the Gaussian component is ignored and the noise is modelled using the Poisson distribution [5,6].
For the purpose of denoising, estimation of noise parameters of the signal-dependent noise from CMOS/CCD sensors is a problem of interest [7,8]. A mixed Poisson-Gaussian distribution was used to model the practical sensor noise which was subsequently used for denoising [9]. In [10], Poisson statistics in combination with maximum likelihood estimation are used to restore images from optic acquisition systems. A Bayesian framework is developed for denoising and deconvolution of Poisson-Gaussian noise [11]. In addition, a post processing technique for Poisson denoising using best linear prediction on the local image patches is introduced in [12].
Similar to Stein's unbiased risk estimator (SURE), which is an estimate of the mean squared error (MSE) for Gaussian noise [13], a Poisson unbiased risk estimator (PURE) is estimated and used with the linear expansion of thresholds (LET) to formulate a state-of-the-art Poisson denoising method known as PureLet [14]. This technique was also extended for Poisson-Gaussian denoising, whereby [...] on the photo-detector. This is followed by a color filter array which generates only one of the red, green or blue signals at each pixel. Next, an array of CMOS imaging sensors captures the analog signals and converts them into electrical signals, which are subsequently digitized using analog to digital converters to generate pixel values. Finally, a post processing operation is performed to adjust white balance and perform color correction [38].

Figure 1. Pipeline of CMOS imaging acquisition system (adapted from [38]). Labelled stages: array of CMOS image sensors, conversion to voltage, electrical signals, image post processing.
While CMOS image sensing technology is revolutionizing digital imaging by shrinking the pixel pitch [39], one of the major challenges is reducing image noise at the time of acquisition [40]. In the case of CMOS imaging sensors, the main sources of noise are sensor electronics and photon starvation. For precise modeling of noise in CMOS sensors, it is important to look into the pixel sensing architecture shown in Figure 2. The pixel circuit is composed of a photodiode and a switching transistor. The imaging principle works as follows: during exposure, photons fall on the reverse-biased photodiode, leading to a decrease of the reverse voltage across the diode. Subsequently, the voltage across the photodiode is measured (read) at the end of the exposure and the photodiode is reset for another exposure [41]. This is known as the passive pixel model, while the active CMOS pixel model includes amplification of the read signal, which not only increases the sensitivity of the CMOS sensor but also helps reduce the noise [41]. Figure 3 gives an account of the various types of noise corrupting the CMOS sensed image. The quantum nature of light dictates that the number of photons incident on the photodiode is never a certainty. This fluctuation in photon count results in shot noise in the acquired image [41]. Similarly, dark current non-uniformity is an exposure-dependent fixed pattern noise which also has an associated temporal noise, known as dark current shot noise. Noise due to photon fluctuations is signal-dependent and is modelled using the Poisson distribution [42], where the mean and the variance of the Poisson process both equal the signal strength.
Moreover, charge to voltage conversion and subsequent amplification of the electrical signal are also noisy owing to electronic fluctuations. These processes notably introduce flicker or 1/f noise and thermal noise, which are modelled using additive white Gaussian noise (AWGN) [42]. In addition, residual error due to quantization is also modelled using independent additive white Gaussian noise [15].
Based on the above discussion, noise due to CMOS image sensors is typically modelled using the mixture of Poisson and Gaussian distributions also termed as Poisson-Gaussian distribution [15,16,42]. However, in situations of poor illumination or low light conditions noise may be dominantly Poisson distributed [14,20] since the effect of AWGN can be neglected due to low photon count in such cases.

Statement of Problem
Let $z_i$ denote the pixels of the noisy image $Z$ acquired using a CMOS sensor, which may be mathematically modelled as

$$z_i = s_i + \eta^{p}_i(s_i) + \eta^{g}_i, \tag{1}$$

where $s_i$ denotes the expected pixels of the true image $S$, $\eta^{g}_i \sim N(0, \sigma^2)$ denotes the additive white Gaussian noise (AWGN) with zero mean and arbitrary variance $\sigma^2$, and $\eta^{p}_i(s_i)$ denotes the signal-dependent Poisson noise, such that $s_i + \eta^{p}_i(s_i) \sim P(z_i|s_i)$. Here, the Poisson distribution $P(z_i|s_i)$ is given as follows

$$P(z_i|s_i) = \frac{s_i^{z_i} e^{-s_i}}{z_i!}, \tag{2}$$

where $!$ denotes the factorial operation. Note that the vector $i$ denotes the pixel location, i.e., row and column indexes. Under various physical limitations, e.g., low light or short exposure time, the effect of Gaussian noise may be neglected due to the relative strength of the signal-dependent noise. In this case, the acquired pixel model (1) reduces to

$$z_i = s_i + \eta^{p}_i(s_i). \tag{3}$$

Here, $z_i$ is also distributed by $P(z_i|s_i)$ with a non-zero mean $E[z_i|s_i] = s_i$. Hence, it can be concluded that the mean and variance of $\eta^{p}_i(s_i)$ are given as follows

$$E[\eta^{p}_i(s_i)] = 0, \qquad \mathrm{var}[\eta^{p}_i(s_i)] = s_i. \tag{4}$$

This means that the variance of each noise coefficient $\eta^{p}_i(s_i)$ is dependent on the corresponding true signal value $s_i$ and is parametrized by the peak value of the signal.
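As an illustration of the noise model described above, the following minimal sketch draws samples of a single pixel under the Poisson and Poisson-Gaussian models and checks the signal-dependent variance numerically. The helper names (`sample_poisson`, `noisy_pixel`), the intensity value and the use of Knuth's sampling method are illustrative assumptions, not part of the original work.

```python
import math
import random
from statistics import fmean, pvariance


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def noisy_pixel(s, sigma, rng):
    """One draw of z = (s + eta_p(s)) + eta_g, where s + eta_p(s) ~ Poisson(s)."""
    return sample_poisson(s, rng) + rng.gauss(0.0, sigma)


rng = random.Random(0)
s, sigma = 10.0, 1.0  # illustrative true intensity and AWGN standard deviation
poisson_only = [noisy_pixel(s, 0.0, rng) for _ in range(20000)]
mixed = [noisy_pixel(s, sigma, rng) for _ in range(20000)]

# Poisson-only case: mean and variance of z are both close to s.
print(fmean(poisson_only), pvariance(poisson_only))
# Poisson-Gaussian case: the variance grows by roughly sigma**2.
print(fmean(mixed), pvariance(mixed))
```

The sample variance tracks the intensity `s`, which is the signal dependence that the variance stabilization step is meant to remove.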

Preliminaries: Introduction to Hypothesis Testing in Detection Theory
Classical detection theory [43] based on hypothesis testing assumes prior distribution models for signal and noise, where the aim is to detect the signal (with or without an additive noise part) while avoiding noise, e.g., in communication channels, radar signal processing, etc. A detection problem using hypothesis testing comprises (i) the null hypothesis $H_0$ of the noise only case and (ii) the alternate hypothesis $H_1$ concerning signal plus noise detection. A classical example is the detection of a signal in a transmission medium where the noise is assumed to follow a zero mean Gaussian distribution $N(0, \sigma^2)$ and signal plus noise is modeled by a non-zero mean Gaussian distribution $N(\mu, \sigma^2)$, since signal values, when added to zero mean noise, contribute a mean $\mu$ to the distribution. Mathematically, this detection problem may be modeled as

$$H_0: x \sim N(0, \sigma^2), \qquad H_1: x \sim N(\mu, \sigma^2), \tag{5}$$

where $H_0$ and $H_1$ denote the null and the alternate hypothesis, respectively, while $x$ denotes an arbitrary value from the noisy signal $\mathbf{x}$. Figure 4a plots the probability density functions of each hypothesis, i.e., $p(x|H_0) = N(0, \sigma^2)$ and $p(x|H_1) = N(\mu, \sigma^2)$. The problem here is to differentiate between the following hypotheses

$$H_0: x = \eta, \qquad H_1: x = s + \eta, \tag{6}$$

where $s$ denotes an arbitrary value from the true signal $\mathbf{s}$ and $\eta$ denotes an arbitrary value from $\boldsymbol{\eta}$ distributed by $N(0, \sigma^2)$.
Here, the decision in Equation (6) can be taken by comparing the observations $x$ against a threshold $\lambda$. This is elaborated graphically in Figure 4a, where the threshold is plotted as a dotted line. Note that the values greater than the threshold $\lambda$ are more likely to be distributed according to $p(x|H_1)$, while the values less than the threshold $\lambda$ are more likely to be distributed in accordance with $p(x|H_0)$. Therefore, the above hypothesis testing problem can be given as

$$H_0: x \le \lambda, \qquad H_1: x > \lambda. \tag{7}$$

A trivial choice of threshold may be the point of intersection of $p(x|H_0)$ and $p(x|H_1)$, as shown in Figure 4a, while a different choice of threshold is depicted in Figure 4b.
However, this kind of detector makes two types of errors: (type I) detecting $H_0$ when $H_1$ is given, $p(H_0|H_1)$; and (type II) detecting $H_1$ when $H_0$ is given, $p(H_1|H_0)$. Figure 4 shows the probability regions $p(H_0|H_1)$ and $p(H_1|H_0)$ corresponding to type I and II errors, respectively, in the context of a simple detection problem. The type II error is also known as a false alarm, and the probability $p(H_1|H_0)$ is termed the probability of false alarm ($P_{fa}$). Minimizing both errors simultaneously is not possible, as decreasing one increases the other; however, these errors can be traded off against each other by adjusting the value of the threshold $\lambda$ (as depicted by the different choices of threshold in Figure 4). Typically, it is required to keep $P_{fa}$ very low in order to avoid the severe consequences of noise being detected as signal. Hence, $P_{fa}$ is fixed to a very small value $\alpha$ to estimate a suitable threshold using the following relation

$$P_{fa} = \mathrm{Prob}(\{x \mid x > \lambda\}; H_0) = \int_{\lambda}^{\infty} p(x|H_0)\,dx = \alpha, \tag{8}$$

where the range $\{x \mid x > \lambda\}$ denotes the values $x$ detected as signal (i.e., $H_1$), $\mathrm{Prob}(\cdot)$ denotes the probability of the given event and $\alpha$ is of the order of $10^{-3}$ to $10^{-6}$. On the other hand, the probability of the first type of error $p(H_0|H_1)$ is minimized, which in turn maximizes the probability of signal detection when signal is present, i.e., $p(H_1|H_1) = 1 - p(H_0|H_1)$, see Figure 4c. The probability of true signal detections $p(H_1|H_1)$, also termed the probability of detection ($P_d$), is mathematically given as

$$P_d = \mathrm{Prob}(\{x \mid x > \lambda\}; H_1) = \int_{\lambda}^{\infty} p(x|H_1)\,dx. \tag{9}$$

In detection theory, $P_d$ is required to be maximized while $P_{fa}$ is minimized. The binary hypothesis defined in Equation (7) directly compares the data $x$ against a threshold.
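For the Gaussian example above, fixing the probability of false alarm determines the threshold in closed form through the inverse normal CDF. The sketch below illustrates this with Python's standard library; the function names and the values of `sigma` and `mu` are illustrative assumptions.

```python
from statistics import NormalDist


def threshold_for_false_alarm(alpha, sigma):
    """Threshold lambda such that Prob(x > lambda; H0) = alpha for H0: N(0, sigma^2)."""
    return NormalDist(0.0, sigma).inv_cdf(1.0 - alpha)


def detection_probability(lam, mu, sigma):
    """P_d = Prob(x > lambda; H1) for H1: N(mu, sigma^2)."""
    return 1.0 - NormalDist(mu, sigma).cdf(lam)


sigma, mu = 1.0, 4.0  # illustrative noise level and signal mean
for alpha in (1e-3, 1e-4, 1e-6):
    lam = threshold_for_false_alarm(alpha, sigma)
    p_d = detection_probability(lam, mu, sigma)
    print(f"alpha={alpha:g}  lambda={lam:.3f}  P_d={p_d:.3f}")
```

Lowering `alpha` pushes the threshold up and reduces the detection probability, which is the trade-off between the two error types discussed above.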
More generally, some metric $S(x)$ on $x$ is used for this purpose, e.g., the Neyman-Pearson optimal detector [44] uses the statistic $S(x) = \frac{p(x|H_1)}{p(x|H_0)}$ to compare against the threshold $\lambda$ for hypothesis testing as follows

$$H_0: S(x) \le \lambda, \ \text{i.e., } x \in \boldsymbol{\eta}; \qquad H_1: S(x) > \lambda, \ \text{i.e., } x \in \mathbf{s} + \boldsymbol{\eta}, \tag{10}$$

where the distributions of the null hypothesis $p(x|H_0)$ and the alternate hypothesis $p(x|H_1)$ must be known a priori. Consequently, the definition of $P_{fa}$ in the presence of the test statistic $S(x)$ changes to the following

$$P_{fa} = \int_{\{x \mid S(x) > \lambda\}} p(x|H_0)\,dx, \tag{11}$$

where $\{x \mid S(x) > \lambda\}$ is composed of the values $x$ for which $S(x) > \lambda$; as a consequence, the threshold $\lambda$ may be estimated by fixing $P_{fa} = \alpha$. Similarly, $P_d$ changes to the following

$$P_d = \int_{\{x \mid S(x) > \lambda\}} p(x|H_1)\,dx. \tag{12}$$

A popular approach in detection theory is based on the goodness-of-fit (GoF) test, in which the test statistic $S(x)$ is based on the empirical distribution model of the data at hand. This approach also avoids the need to assume a prior distribution model for the alternate hypothesis, since prior knowledge of the null distribution is adequate for binary hypothesis testing in Equation (7). Under such conditions, the test statistic $S(x)$ estimates the distance between the empirical distribution function (EDF) $F(t) = \frac{1}{L}\sum_{j=1}^{L} \mathbb{1}(x_j \le t)$ of the $L$ noisy observations $\mathbf{x}$ and the null cumulative distribution function (CDF) $F_0(t) = \int_{-\infty}^{t} p(x|H_0)\,dx$, where $t$ is the support vector. There are a number of test statistics used as detectors within the framework of the GoF test, but the Anderson-Darling (AD) statistic [45] and the Cramér-von Mises (CVM) statistic [46] are frequently used in detection problems [47,48]. In their standard forms they are given, respectively, as follows

$$S_{\mathrm{AD}}(\mathbf{x}) = L \int_{-\infty}^{\infty} \frac{\left(F(t) - F_0(t)\right)^2}{F_0(t)\left(1 - F_0(t)\right)}\, dF_0(t), \tag{13}$$

$$S_{\mathrm{CVM}}(\mathbf{x}) = L \int_{-\infty}^{\infty} \left(F(t) - F_0(t)\right)^2 dF_0(t). \tag{14}$$
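The EDF-based statistics can be evaluated without numerical integration. The sketch below uses the standard computational formulas of the CVM and AD statistics for a fully specified null CDF, applied to a 1D vector of observations. The function names, the clipping constant `eps` and the standard normal null are illustrative assumptions, and the normalisation may differ from the one used by the authors.

```python
import math
import random
from statistics import NormalDist


def edf(sample, t):
    """Empirical distribution function: fraction of observations <= t."""
    return sum(1 for x in sample if x <= t) / len(sample)


def _null_probabilities(sample, null_cdf, eps=1e-12):
    """Sorted values of F0(x), clipped away from 0 and 1 so the logs stay finite."""
    return sorted(min(max(null_cdf(x), eps), 1.0 - eps) for x in sample)


def cvm_statistic(sample, null_cdf):
    """Cramer-von Mises statistic (standard computational form)."""
    z = _null_probabilities(sample, null_cdf)
    n = len(z)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - zi) ** 2 for i, zi in enumerate(z, start=1)
    )


def ad_statistic(sample, null_cdf):
    """Anderson-Darling statistic (standard computational form)."""
    z = _null_probabilities(sample, null_cdf)
    n = len(z)
    total = sum(
        (2 * i - 1) * (math.log(z[i - 1]) + math.log(1.0 - z[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n


rng = random.Random(1)
null_cdf = NormalDist(0.0, 1.0).cdf          # assumed null: unit-variance Gaussian
noise = [rng.gauss(0.0, 1.0) for _ in range(25)]
signal = [x + 3.0 for x in noise]            # same window with a non-zero mean
print(cvm_statistic(noise, null_cdf), cvm_statistic(signal, null_cdf))
print(ad_statistic(noise, null_cdf), ad_statistic(signal, null_cdf))
```

Both statistics stay small for the noise-only window and become large once the window departs from the null distribution, which is the behaviour the threshold test relies on. The sort makes each evaluation cost on the order of n log n for a window of n values.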

Proposed Denoising Framework Using Detection Theory
In this section, we propose a denoising framework to remove Poisson and Poisson-Gaussian distributed noise arising from CMOS/CCD image sensors. For this purpose, we first employ the variance stability transformation (VST) to 'Gaussianize' the noise present in the CMOS/CCD images. Following that, image denoising is formulated as a detection problem whereby local hypothesis testing based on the empirical distribution function (EDF) is employed.
Since detection theory has been employed on time series data [47,48], formulating the detection problem based on local EDF statistics for spatio-temporal data (images) in our case requires the following notable adjustments.

1. To ensure the preservation of spatio-temporal characteristics of the multiscale coefficients of the noisy image, two dimensional (2D) windows of size $l \times l$ are considered around each coefficient for local hypothesis testing.

2. Two dimensional EDFs are not unique and are computationally expensive [49]; therefore, their use for GoF testing on 2D data is not suitable. Consequently, in our work, we list the coefficients in the windows as 1D vectors and then compute their unique (1D) EDF. Note that listing 2D segments as 1D vectors is a common practice in image denoising methods whereby multivariate statistical distributions are used to model multiscale dependencies [50].

The block diagram of the proposed method is shown in Figure 5. The method involves VST followed by multiscale hypothesis testing of data at the local level using EDF statistics. The following subsections illustrate the main steps of the proposed framework.

Variance Stability Transform (VST)
We propose to use VST as a preprocessing step when denoising images corrupted by Poisson and Poisson-Gaussian noise. Following the preprocessing step, the noise is effectively transformed into independent Gaussian noise with constant variance, which can be handled through a Gaussian denoising framework. For Poisson image pixels $z_i \in Z$, the Anscombe transformation (AT) [20] can be used for variance stabilization of signal-dependent noise as follows

$$x_i = 2\sqrt{z_i + \tfrac{3}{8}}. \tag{15}$$

For variance stabilization of a mixed Poisson-Gaussian noisy image $Z$, the generalized Anscombe transformation (GAT) [25] is used

$$x_i = \frac{2}{\alpha}\sqrt{\alpha z_i + \tfrac{3}{8}\alpha^2 + \sigma^2}, \tag{16}$$

where $\alpha$ is the scaling factor of the Poisson component and $\sigma$ is the standard deviation of the Gaussian component. The problem (1) is now transformed into a Gaussian denoising problem, since $x_i \in X$ in Equations (15) and (16) are pixels of the variance stabilized image corrupted with approximately Gaussian noise. Note that GAT is a generalization of AT: for scaling factor $\alpha = 1$ and $\sigma = 0$ (i.e., absence of Gaussian noise), Equation (16) reduces to Equation (15).
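The effect of variance stabilization can be checked numerically. The sketch below implements the commonly used forms of the Anscombe transformation, 2·sqrt(z + 3/8), and of the generalized Anscombe transformation for a zero-mean Gaussian component, and compares the variance of Poisson samples before and after the transform. The function names, the `gain` parameter name, the clipping of negative arguments and the Knuth sampler are illustrative assumptions.

```python
import math
import random
from statistics import pvariance


def anscombe(z):
    """Anscombe transformation for Poisson data."""
    return 2.0 * math.sqrt(z + 3.0 / 8.0)


def generalized_anscombe(z, gain=1.0, sigma=0.0):
    """GAT for z = gain * Poisson + N(0, sigma^2); the argument is clipped at zero."""
    argument = gain * z + 0.375 * gain ** 2 + sigma ** 2
    return (2.0 / gain) * math.sqrt(max(argument, 0.0))


def sample_poisson(mean, rng):
    """Knuth's multiplication method (adequate for the small means used here)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


rng = random.Random(2)
for mean in (5.0, 20.0, 50.0):
    raw = [sample_poisson(mean, rng) for _ in range(10000)]
    stabilized = [anscombe(z) for z in raw]
    # Raw variance follows the mean; stabilized variance stays close to one.
    print(mean, round(pvariance(raw), 2), round(pvariance(stabilized), 3))
```

With `gain=1.0` and `sigma=0.0`, `generalized_anscombe` returns the same value as `anscombe`, mirroring the reduction noted in the text.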

Multiscale Local Hypothesis Testing Based on EDF
Typically, in detection problems, we are mainly interested in the detection of signal at a particular time (with or without additive noise part). In signal denoising, on the other hand, we are interested in separating signal from noise so that the effect of noise could be cancelled from the output data. That requires a modification in the classical binary hypothesis testing framework to be applied for denoising applications. In the denoising problem, the alternate hypothesis must correspond to the signal only case whereas the null hypothesis corresponds to the noise only case as before. To achieve that, we propose to apply the modified hypothesis testing within the framework of goodness-of-fit (GoF) test at multiple scales obtained via a multiscale transform.
Let $T(\cdot)$ denote a multiscale transform which decomposes a noisy signal $\mathbf{x}$ into multiscale coefficients $u_k^{(i)}$ at scale $k$ and location $i$, as given below

$$u_k^{(i)} = T(\mathbf{x}). \tag{17}$$

For the multiscale coefficients $u_k^{(i)}$ to correspond either to the true signal (only) or to the noise (only), $T(\cdot)$ must fulfill the following conditions:

2. Across each scale, signal and noise must be distributed among separate coefficients/values.
The set of transform domain methods fulfilling the above conditions may include the DWT and its redundant variants such as the DDDWT [51], DT-CWT [52] and UWT [53].
Given that $T(\cdot)$ fulfills the aforementioned conditions, we propose to formulate the denoising problem as a transformed hypothesis testing problem as follows

$$\tilde{H}_0: u_k^{(i)} \in T(\boldsymbol{\eta}), \qquad \tilde{H}_1: u_k^{(i)} \in T(\mathbf{s}), \tag{18}$$

where $\tilde{H}_0$ and $\tilde{H}_1$, respectively, denote the transformed null and alternate hypotheses, while $T(\mathbf{s})$ denotes the multiscale true signal coefficients, i.e., the multiscale version of the signal only case, and $T(\boldsymbol{\eta})$ denotes the multiscale noise (only) coefficients.
Based on the proposed hypothesis testing problem for multiscale denoising in Equation (18), the foundations of multiscale detection theory can be built. To this end, a scale adaptive threshold $\lambda_k$ may be obtained by fixing the probability of false alarm at the $k$th scale, i.e., $P_{fa}^{(k)} = \alpha^{(k)}$. A test statistic $S(u_k^{(i)})$ may then be employed to compute the statistical distance of the multiscale coefficients $u_k^{(i)}$ from the distribution of noise at multiple scales, i.e., the distribution of $T(\boldsymbol{\eta})$. Hence, the transformed hypothesis testing problem in Equation (18) can be re-written as follows

$$\tilde{H}_0: S(u_k^{(i)}) \le \lambda_k, \ \text{i.e., } u_k^{(i)} \in T(\boldsymbol{\eta}); \qquad \tilde{H}_1: S(u_k^{(i)}) > \lambda_k, \ \text{i.e., } u_k^{(i)} \in T(\mathbf{s}). \tag{19}$$

Remark 1. The null and alternate hypotheses in the proposed approach correspond to noise only and signal only detections at multiple scales (i.e., $\tilde{H}_0: T(\boldsymbol{\eta})$ and $\tilde{H}_1: T(\mathbf{s})$, respectively), whereas the null and alternate hypotheses in the traditional detection problem correspond to noise only and signal plus noise detections at the original signal (image) scale (i.e., detection of $H_0: \boldsymbol{\eta}$ and $H_1: \mathbf{s} + \boldsymbol{\eta}$, respectively).

Estimation of Threshold λ k
As a consequence of the modified hypothesis testing problem in Equation (19), the definition of $P_{fa}^{(k)}$ is also modified accordingly, which directly follows from Equation (11) as

$$P_{fa}^{(k)} = \int_{\{u_k^{(i)} \mid S(u_k^{(i)}) > \lambda_k\}} p(u_k^{(i)}|\tilde{H}_0)\, du_k^{(i)}, \tag{20}$$

where $\{u_k^{(i)} \mid S(u_k^{(i)}) > \lambda_k\}$ is the set of multiscale noise coefficients which are falsely detected as signal, i.e., the set of coefficients yielding false alarms.
In the proposed framework, the threshold $\lambda_k$ for each scale $k$ is estimated using Equation (20) for a given probability of false alarm $P_{fa}^{(k)}$ at scale $k$. For that purpose, the probability distribution function of the noise coefficients at multiple scales, $p(u_k^{(i)}|\tilde{H}_0)$, must be known. The challenge is that $T(\cdot)$ might change the input noise distribution, e.g., UWT does not retain the Gaussianity at multiple scales. As a result, the probability density function $p(u_k^{(i)}|\tilde{H}_0)$ is obtained from the null EDF $F_0^{(k)}(t)$ as

$$p(u_k^{(i)}|\tilde{H}_0) = \frac{d}{dt} F_0^{(k)}(t), \tag{21}$$

where $\frac{d}{dt}$ denotes the first order difference in the discrete case. The empirical estimation of the null EDF $F_0^{(k)}(t)$ for a non-linear transform is discussed in the next section. Similarly, the definition of $P_d$ changes to the following

$$P_d^{(k)} = \int_{\{u_k^{(i)} \mid S(u_k^{(i)}) > \lambda_k\}} p(u_k^{(i)}|\tilde{H}_1)\, du_k^{(i)}, \tag{22}$$

where $p(u_k^{(i)}|\tilde{H}_0)$ denotes the probability distribution function of the multiscale noise $T(\boldsymbol{\eta})$ and $p(u_k^{(i)}|\tilde{H}_1)$ denotes the probability distribution function of the multiscale true signal $T(\mathbf{s})$.
In order to estimate $P_d^{(k)}$ from Equation (22), the distribution model $p(u_k^{(i)}|\tilde{H}_1)$ must be known a priori. One limitation of the extension of detection theory to multiscale denoising lies in the non-availability of a prior distribution model for multiscale signal coefficients. One exception to this could be the multiscale coefficients obtained from the DWT, which have been shown to follow heavy tailed exponential distributions [26]. However, a prior assumption for $p(u_k^{(i)}|\tilde{H}_1)$ may not be possible for other transforms, which means that $P_d$ could not be easily computed for a general multiscale transform $T(\cdot)$. Consequently, the tradeoff between $P_{fa}$ and $P_d$ is fixed experimentally.

Remark 2. For a given noise distribution, the threshold estimation is performed only once.
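One simple way to obtain such a threshold offline is by Monte Carlo simulation: compute the test statistic on many noise-only windows and take the empirical (1 - alpha) quantile. The sketch below does this for the CVM statistic with unit-variance Gaussian windows standing in for the noise coefficients at one scale. The function names, the window length of 25 and the values of alpha are illustrative assumptions; this is not necessarily how the authors estimate their thresholds.

```python
import random
from statistics import NormalDist


def cvm_statistic(sample, null_cdf):
    """Cramer-von Mises distance between the sample EDF and the null CDF."""
    z = sorted(null_cdf(x) for x in sample)
    n = len(z)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - zi) ** 2 for i, zi in enumerate(z, start=1)
    )


def noise_statistics(window_len, num_windows, rng, null_cdf):
    """Sorted statistics of noise-only windows (stand-in for T(eta) at one scale)."""
    return sorted(
        cvm_statistic([rng.gauss(0.0, 1.0) for _ in range(window_len)], null_cdf)
        for _ in range(num_windows)
    )


def threshold_from_statistics(sorted_stats, alpha):
    """Empirical (1 - alpha) quantile of the noise-only statistics."""
    index = min(int((1.0 - alpha) * len(sorted_stats)), len(sorted_stats) - 1)
    return sorted_stats[index]


rng = random.Random(7)
null_cdf = NormalDist(0.0, 1.0).cdf
stats = noise_statistics(window_len=25, num_windows=4000, rng=rng, null_cdf=null_cdf)
# Offline table of thresholds versus P_fa, computed once for this noise model.
table = {alpha: threshold_from_statistics(stats, alpha) for alpha in (0.1, 0.05, 0.01)}

# Sanity check on fresh noise: the false-alarm rate should be close to alpha.
fresh = noise_statistics(25, 4000, rng, null_cdf)
rate = sum(s > table[0.05] for s in fresh) / len(fresh)
print(table, rate)
```

Very small values of alpha, such as those of the order mentioned earlier, would require many more simulated windows for the quantile to be reliable.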

Multiscale GoF Statistics Estimation
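In order to perform local hypothesis testing based on EDF, the GoF test statistic $S(u_k^{(i)})$ must be estimated for each window of local coefficients around $u_k^{(i)}$ as follows

$$S_{\mathrm{AD}}(u_k^{(i)}) = L \int_{-\infty}^{\infty} \frac{\left(F^{i}_{u}(t) - F_0^{(k)}(t)\right)^2}{F_0^{(k)}(t)\left(1 - F_0^{(k)}(t)\right)}\, dF_0^{(k)}(t), \tag{23}$$

$$S_{\mathrm{CVM}}(u_k^{(i)}) = L \int_{-\infty}^{\infty} \left(F^{i}_{u}(t) - F_0^{(k)}(t)\right)^2 dF_0^{(k)}(t), \tag{24}$$

where $F^{i}_{u}(t)$ denotes the EDF of the window of $L$ local coefficients centered at $u_k^{(i)}$. For generality, we denote both the AD and CVM statistics by $S(u_k^{(i)})$. As discussed above, non-linear transforms change the distribution of noise at multiple scales; hence, to compute $S(u_k^{(i)})$ using Equations (23) and (24), $F_0^{(k)}(t)$ must be known a priori for each scale $k$ for a given non-linear transform $T(\cdot)$. To this end, we estimate $F_0^{(k)}(t)$ empirically by generating a large sized AWGN $\boldsymbol{\eta}$, which is subsequently decomposed using the multiscale transform as $u_k^{(i)} = T(\boldsymbol{\eta})$. Next, the multiscale noise coefficients $u_k^{(i)}$ at each scale $k$ are divided into $M$ windows of local coefficients centered at the spatial location $i$. Subsequently, the EDF $F^{i}_{\eta}(t)$ of each window is computed, followed by the ensemble average of these EDFs, resulting in the reference or null distribution $F_0^{(k)}(t) = \frac{1}{M}\sum_{i=1}^{M} F^{i}_{\eta}(t)$. To give an insight into the GoF based hypothesis testing, Figure 6 plots the null distribution $F_0^{(k)}(t)$ [...]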
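The ensemble averaging of window EDFs described above can be sketched as follows. Because no wavelet library is assumed here, `detail_transform` is a hypothetical stand-in for one scale of the multiscale transform (normalised horizontal differences), the windows are taken as non-overlapping blocks for brevity, and the support grid is an arbitrary choice; all function names are illustrative.

```python
import math
import random
from bisect import bisect_right


def detail_transform(image):
    """Hypothetical stand-in for one scale of T(.): normalised horizontal differences."""
    return [
        [(row[c] - row[c + 1]) / math.sqrt(2.0) for c in range(len(row) - 1)]
        for row in image
    ]


def flattened_windows(coeffs, l):
    """Yield non-overlapping l x l windows, each listed as a 1D vector."""
    rows, cols = len(coeffs), len(coeffs[0])
    for r in range(0, rows - l + 1, l):
        for c in range(0, cols - l + 1, l):
            yield [coeffs[r + dr][c + dc] for dr in range(l) for dc in range(l)]


def window_edf(window, grid):
    """EDF of one window evaluated on the support grid t."""
    ordered = sorted(window)
    return [bisect_right(ordered, t) / len(ordered) for t in grid]


def null_edf(noise_coeffs, l, grid):
    """Ensemble average of the window EDFs: the empirical null distribution."""
    total, count = [0.0] * len(grid), 0
    for window in flattened_windows(noise_coeffs, l):
        for j, value in enumerate(window_edf(window, grid)):
            total[j] += value
        count += 1
    return [value / count for value in total]


rng = random.Random(3)
noise = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(64)]
grid = [-3.0 + 0.25 * j for j in range(25)]  # support vector t
f0 = null_edf(detail_transform(noise), l=5, grid=grid)
print([round(v, 3) for v in f0])
```

The resulting list is a tabulated null CDF for that scale; in practice it would be computed once per scale and transform and reused for every image.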

Multiscale Thresholding Based on Hypothesis Testing
For a threshold $\lambda_k$ obtained for a given $P_{fa} = \alpha$ in Equation (20), the following hard thresholding function is employed, based on the proposed hypothesis testing in Equation (19), for all the windows of multiscale coefficients corresponding to the noisy signal

$$\hat{u}_k^{(i)} = \begin{cases} u_k^{(i)}, & S(u_k^{(i)}) > \lambda_k, \\ 0, & S(u_k^{(i)}) \le \lambda_k, \end{cases} \tag{25}$$

where the central coefficient $u_k^{(i)}$ is replaced by zero if the null hypothesis is fulfilled, i.e., $S(u_k^{(i)}) \le \lambda_k$, and is retained otherwise. In order to obtain the denoised image, the thresholded multiscale coefficients from Equation (25) are reconstructed by employing the inverse transform $T^{-1}(\cdot)$ as follows

$$\hat{s} = T^{-1}(\hat{u}_k^{(i)}), \tag{26}$$

where $\hat{s}$ is an estimate of the true signal (or image) $s$, or simply stated, the denoised signal or image. Implementation of various forward and inverse wavelet transforms is reported in [54][55][56][57].
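The local hard thresholding rule can be sketched on a single array of coefficients as follows. The sketch uses the CVM statistic with a standard normal null, clamps window indices at the borders, and uses an illustrative threshold value; the function names, the border handling, the threshold and the synthetic test array are all assumptions rather than details taken from the paper.

```python
import random
from statistics import NormalDist


def cvm_statistic(sample, null_cdf):
    """Cramer-von Mises distance between the sample EDF and the null CDF."""
    z = sorted(null_cdf(x) for x in sample)
    n = len(z)
    return 1.0 / (12 * n) + sum(
        ((2 * i - 1) / (2 * n) - zi) ** 2 for i, zi in enumerate(z, start=1)
    )


def local_window(coeffs, r, c, l):
    """Flattened l x l neighbourhood around (r, c); indices are clamped at the borders."""
    half = l // 2
    rows, cols = len(coeffs), len(coeffs[0])
    return [
        coeffs[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
        for dr in range(-half, half + 1)
        for dc in range(-half, half + 1)
    ]


def gof_hard_threshold(coeffs, lam, l, null_cdf):
    """Keep a coefficient if S(window) > lam (signal), otherwise set it to zero (noise)."""
    rows, cols = len(coeffs), len(coeffs[0])
    return [
        [
            coeffs[r][c]
            if cvm_statistic(local_window(coeffs, r, c, l), null_cdf) > lam
            else 0.0
            for c in range(cols)
        ]
        for r in range(rows)
    ]


rng = random.Random(11)
coeffs = [[rng.gauss(0.0, 1.0) for _ in range(24)] for _ in range(24)]
for r in range(12, 20):
    for c in range(12, 20):
        coeffs[r][c] += 6.0  # synthetic "signal" block on top of the noise

denoised = gof_hard_threshold(coeffs, lam=0.743, l=5, null_cdf=NormalDist(0.0, 1.0).cdf)
kept = sum(v != 0.0 for row in denoised for v in row)
print(f"{kept} of {24 * 24} coefficients kept")
```

In a full pipeline this step would be applied at every scale with its own threshold and null distribution, followed by the inverse transform.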

Inverse VST
In the case of Poisson denoising, the exact unbiased inverse of the Anscombe transform (Inv-AT) [20] is applied to the Gaussian denoised image $\hat{s}$ to obtain the denoised image $\hat{S}$. Similarly, for Poisson-Gaussian denoising, the inverse generalized Anscombe transformation (Inv-GAT) is performed on the Gaussian denoised image $\hat{s}$ to obtain the Poisson-Gaussian denoised image $\hat{S}$.
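The exact unbiased inverse referred to above relies on pre-computed values and is not reproduced here. For orientation only, the sketch below shows the two simpler closed-form inverses of the Anscombe transformation (algebraic and asymptotically unbiased) and the algebraic inverse of the GAT for a zero-mean Gaussian component. These are stand-ins that are known to be biased at low counts; the function and parameter names are illustrative assumptions.

```python
import math


def anscombe(z):
    """Forward Anscombe transformation, 2 * sqrt(z + 3/8)."""
    return 2.0 * math.sqrt(z + 3.0 / 8.0)


def generalized_anscombe(z, gain=1.0, sigma=0.0):
    """Forward GAT for z = gain * Poisson + N(0, sigma^2)."""
    return (2.0 / gain) * math.sqrt(max(gain * z + 0.375 * gain ** 2 + sigma ** 2, 0.0))


def inverse_anscombe_algebraic(d):
    """Direct algebraic inverse of the forward transformation."""
    return (d / 2.0) ** 2 - 3.0 / 8.0


def inverse_anscombe_asymptotic(d):
    """Asymptotically unbiased inverse (differs from the algebraic one by 1/4)."""
    return (d / 2.0) ** 2 - 1.0 / 8.0


def inverse_gat_algebraic(d, gain=1.0, sigma=0.0):
    """Algebraic inverse of the GAT defined above."""
    return ((gain * d / 2.0) ** 2 - 0.375 * gain ** 2 - sigma ** 2) / gain


z = 7.0
print(inverse_anscombe_algebraic(anscombe(z)))   # round trip of the AT
print(inverse_anscombe_asymptotic(anscombe(z)))  # offset by 0.25
print(inverse_gat_algebraic(generalized_anscombe(4.0, 2.0, 0.5), 2.0, 0.5))
```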

Poisson Denoising
In this section, we discuss the performance of the proposed Poisson denoising method against the state-of-the-art. For Poisson denoising using the proposed framework, UWT and DTCWT were employed as transform domain methods, while the AD statistic was used as the test statistic within the GoF framework. We compare the proposed Poisson denoising methods against MSVST [28], NLPCA [18], PureLet [14] and Poiss-NLM [17]. The set of input test images is composed of the standard 'Lena', 'Plane', 'Peppers' and 'Boat' images along with two images capturing aerial views of 'Padma River' and 'Ogden Valley'. To simulate CMOS/CCD sensor noise, these images were corrupted by signal-dependent Poisson noise of varying intensity. Since the noise here is signal-dependent, increasing the peak amplitude of the signal increases the peak signal to noise ratio (PSNR) of the noisy image.
Each input image was corrupted by Poisson noise at varying signal peaks, i.e., 1-100, and the resulting input PSNR values are listed in Table 1, along with the output PSNR values obtained by denoising these images using the state-of-the-art and the proposed Poisson denoising methods. Results in Table 1 show that the proposed AT-AD-DTCWT method yielded the highest output PSNR values in most instances when compared to the other methods. The proposed AT-AD-UWT method also demonstrated comparable performance by consistently yielding the second or third highest output PSNR values, while at times it also outperformed all of the comparative methods. PureLet, which is considered the gold standard method in Poisson denoising, remained competitive against the proposed methods and beat them at a few input noise levels. Poiss-NLM also showed comparable denoising results, but it mostly remained behind the PureLet and AT-AD-DTCWT methods in terms of output PSNR values. MSVST showed good denoising performance but failed to match the performance of the best methods. NLPCA showed competitive performance at higher noise levels, but its performance failed to improve as the noise level was reduced, for all images.

We also display denoised images of 'Lena' and 'Padma River' in Figures 8 and 9, respectively. In Figure 8, the noisy 'Lena' image at signal peak = 20 is displayed in Figure 8a [...]. Figure 8c shows the image denoised by Poiss-NLM, which seems to be devoid of artifacts but at the cost of a loss of image details due to over-smoothing. The images denoised by the proposed methods AT-AD-UWT and AT-AD-DTCWT, shown in Figure 8e,f respectively, showed fewer artifacts than PureLet and MSVST while also extracting more signal details. When compared to Poiss-NLM, the proposed AT-AD-DTCWT method extracted more details but at the expense of slight artifacts. AT-AD-UWT also extracted more signal details than Poiss-NLM but with some visible artifacts. PureLet and MSVST also changed the brightness of the denoised images, whereas the proposed methods did not alter signal brightness even at such a high noise level.

Figure 9 compares the denoising performance of the proposed AT-AD-DTCWT method against PureLet and NLPCA at a very high noise level, i.e., signal peak = 5, on the 'Padma River' image. The original and noisy versions of the 'Padma River' image are shown in Figure 9a,b, while the denoised images from NLPCA, PureLet and the proposed AT-AD-DTCWT are shown in Figure 9c-e, respectively. NLPCA not only over-smoothed the recovered image details but also blurred them, see Figure 9c. PureLet and the proposed AT-AD-DTCWT managed to recover significant image details even at such a high level of signal-dependent additive noise, see Figure 9d,e. Note that PureLet showed significant blurring artifacts in Figure 9d, while the image denoised by the proposed AT-AD-DTCWT recovered image details effectively with very few artifacts even at such a high input noise level, see Figure 9e.
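The relation between signal peak and input PSNR used in these experiments can be illustrated with a small simulation. The sketch below scales a synthetic ramp (standing in for a test image) to a given peak, adds Poisson noise and reports the PSNR. The PSNR definition with the peak of the clean image as reference, the helper names and the synthetic data are assumptions; the paper's exact evaluation protocol is not reproduced here.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's multiplication method (adequate for means up to about 100)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def scale_to_peak(image, peak):
    """Rescale intensities so that the maximum equals the requested peak."""
    top = max(image)
    return [pixel * peak / top for pixel in image]


def psnr(reference, estimate, peak):
    """PSNR in dB with the given peak value as reference amplitude."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)
    return 10.0 * math.log10(peak ** 2 / mse)


rng = random.Random(5)
image = [0.1 + 0.9 * j / 399 for j in range(400)]  # synthetic ramp, not a real test image
input_psnr = {}
for peak in (1, 10, 100):
    clean = scale_to_peak(image, peak)
    noisy = [sample_poisson(s, rng) for s in clean]
    input_psnr[peak] = psnr(clean, noisy, peak)
print(input_psnr)
```

Because the Poisson noise variance grows linearly with intensity while the squared peak grows quadratically, the input PSNR rises with the signal peak.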

Poisson-Gaussian Denoising
We now provide comparative results of the proposed framework for Poisson-Gaussian denoising against existing denoising methods. We use DTCWT as the transform domain method in the proposed methodology, while AD is used as the test statistic for the GoF based hypothesis testing. We name the proposed method GAT-AD-DTCWT and compare it against PGureLet [15], GAT-BLSGSM [24] and MSVST-MPG [27]. We report the denoising results on all input images used in previous two sections, namely, 'Lena', 'Plane', 'Peppers', 'Boat', 'Padma River' and 'Ogden Valley'. To simulate CMOS/CCD sensor noise, these images were corrupted with Poisson-Gaussian noise of varying levels, where the strength of the Poisson noise was defined by signal peak = 1, 2, 3, 4, 5 and 10, and the standard deviation $\sigma$ of the Gaussian noise was selected as $\sigma$ = peak/10.
Input PSNR values corresponding to these parameters of Poisson-Gaussian noise are reported in Table 2, along with the output PSNR values of the images denoised by the comparative methods. Note that the images denoised by the proposed method had the highest output PSNR values at most input noise levels. PGureLet showed competitive results at higher noise levels, while GAT-BLSGSM yielded competitive results at lower noise levels. However, the proposed GAT-AD-DTCWT showed consistently improved performance at all noise levels. MSVST-MPG yielded the lowest output PSNR values among all the methods. Figure 10 shows the original, noisy and denoised 'Ogden Valley' images obtained from the comparative methods. The noisy images displayed in Figure 10b,f were, respectively, corrupted by Poisson-Gaussian noise at signal peak = 5 and 10 and AWGN standard deviation $\sigma$ = 0.5 and 1. These noisy images were denoised using PGureLet, GAT-BLSGSM and the proposed GAT-AD-DTCWT method, and the results are, respectively, displayed in the second, third and fourth columns of Figure 10. Observe that PGureLet blurred the recovered images and distorted the information, see Figure 10c,g. GAT-BLSGSM recovered the image details well (see Figure 10d,h) compared to PGureLet. However, the images denoised by GAT-BLSGSM exhibited spike-like artifacts, which are more evident at the higher input noise level, as shown in Figure 10h. The images denoised by the proposed GAT-AD-DTCWT method are displayed in Figure 10e,i, where not only have image details been preserved well but the contrast and brightness information is also intact.

A Denoising Example of an Image Obtained from CMOS Sensor
In this section, we employ the proposed GAT-AD-DTCWT to suppress real CMOS sensor noise from an image obtained through a CMOS camera installed in a Xiaomi Mi3 mobile phone. This image is freely available as part of the RENOIR dataset [58], which contains CMOS images corrupted by sensor noise. The study in [58] not only offers a noisy dataset but also compares the performance of state-of-the-art Poisson and Poisson-Gaussian denoising methods for removing the sensor noise. Figure 11 displays the noisy CMOS image along with the image denoised using the GAT-AD-DTCWT method. The noisy image (top row) and the zoomed-in view of the highlighted part (lower row) are shown in Figure 11a, while the image denoised using GAT-AD-DTCWT (top row) and the zoomed-in view of the highlighted region (lower row) are shown in Figure 11b. As can be observed from Figure 11 (top row), the proposed method successfully suppresses the majority of the real CMOS sensor noise. The zoomed-in views of the highly detailed region in the noisy and the denoised images are shown in Figure 11 (lower row), where a granular noise pattern or shot noise spikes are visible in the noisy image patch. However, in the denoised image patch, these granular patterns have been successfully suppressed by the proposed method.

Figure 11. Performance analysis of the proposed GAT-AD-DTCWT on a noisy image obtained from the RENOIR dataset [58], which contains noisy images from CMOS sensors corrupted by real sensor noise.

Discussion and Conclusions
This article proposes a generalized denoising framework based on detection theory and applies it to remove Poisson and Poisson-Gaussian noise from CMOS/CCD image sensors. To this end, the variance stability transformation (VST) has been combined with the proposed binary hypothesis testing framework to enable the detection and removal of Poisson and Poisson-Gaussian noise at multiple scales. For local hypothesis testing, statistical measures of the goodness-of-fit test based on the empirical distribution function (EDF), namely the Anderson-Darling (AD) and Cramér-von Mises (CVM) statistics, have been employed in our work. Furthermore, different 2D transform domain methods have been tested within the proposed framework.
The proposed methodology has been shown to outperform the comparative state-of-the-art methods in Poisson and Poisson-Gaussian denoising in most of the tested cases. This could be attributed to the ability of the proposed framework to handle non-standard noise distributions owing to its data driven nature. To further stress this point, an example of denoising a CMOS image corrupted with real sensor noise using the proposed GAT-AD-DTCWT is also presented, which demonstrates the efficacy of the proposed framework for suppressing the noise due to CMOS/CCD sensors.
The computational complexity of the proposed framework can be minimized by offline estimation of the table of thresholds versus $P_{fa}$, which needs to be computed only once. Similarly, the estimation of the reference CDF (in the case of a nonlinear transformation) may also be performed offline. The other computationally intensive step in the proposed method is the local estimation of the EDF, which requires computations of the order of $O(L \log L)$, while the rest of the steps in the proposed methodology require computations equivalent to standard multiscale denoising methods.
The scope of this work is limited to the algorithm design for denoising CMOS/CCD images. However, in the following, we discuss a few important aspects related to the hardware implementation of the proposed method. The proposed algorithm has the potential to be implemented in a six-stage pipelined architecture, which enables parallel computations and increases the throughput of the system in real time. The first pipelined stage applies VST as a pre-processing step on the input noisy signal. The computationally expensive part of this stage is the evaluation of the square root; CORDIC algorithms are used to compute the square root on hardware platforms (e.g., microcontrollers, processors and FPGAs). The second pipelined stage computes the transform (UWT/DTCWT) of the pre-processed signal from the first stage. Real-time implementations of different transforms are reported in the literature [54][55][56][57]. Addition and averaging operations need to be executed in the third pipelined stage of the proposed algorithm. In addition, these operations are executed in a windowed fashion; therefore, different blocks can be executed in parallel. In the fourth pipelined stage, hard thresholding is performed using the GoF thresholding technique, which can be implemented with comparators. The fifth pipelined stage computes the inverse transform, which has the same computational complexity as the forward transform. The sixth pipelined stage computes the exact inverse VST, which requires pre-computed tables to empirically remove the bias at low photon counts, as stated in [20,25]. These tables can be stored and used as look-up tables to decrease the computational complexity of this stage.

Funding: This work is supported by the UK EPSRC through grants EP/P017487/1 and EP/R02572X/1.