1 Introduction

Image denoising has become vital because noise may be introduced at any stage between input and output. Several denoising techniques exist that are heuristic in nature, and their computational complexity should be considered before deployment. The development of a noise model is crucial for any denoising algorithm, but in many cases the exact model of the noise is not available. The prime sources of noise in images are the sensors’ electron sensitivity to thermal excitation [1], related hardware components of the sensor [2], and improper assignment of intensity values due to an operator effect.

Sensor noise is characterized as Gaussian and the remaining sources as impulse noise, both of which degrade the quality of the image. In the literature, many schemes exist that focus on either Gaussian [3,4,5,6,7] or impulse noise [8,9,10,11,12]. Impulse noise corrupts the image in such a way that only a few pixels are modified. A corrupted pixel takes either an extreme value of 0 or 255, resulting in salt-and-pepper noise, or a random value. Nonlinear filtering methods, viz. median filtering [13], are used predominantly to handle impulse noise. However, the median filter also modifies pixels that are not affected by noise, which degrades uncorrupted image detail. To overcome this drawback, adaptive median filtering, multistate median filtering and weighted median filtering were proposed; these achieve better noise removal but fail for a mixture of noise [11, 14, 15].

Gaussian noise, in contrast to impulsive noise, corrupts every pixel in the image by adding a value drawn randomly from a Gaussian distribution. Hence, the entire image must be processed. Linear filtering performs better on Gaussian noise, although it results in smoothed edges. Bilateral filtering is one of the fundamental solutions to Gaussian noise [3]. Such a filter estimates the value of a pixel by calculating the weighted average of neighboring pixels. However, similar pixels may be distributed anywhere in the image, not only at neighboring locations. Many modifications to the bilateral filter exist in the literature, and a notable one is non-local means filtering [11]. The weighted average of many pixels provides an estimate of the original pixel value, which was lost during the corruption process. In non-local means filtering, the weight assigned to each pixel depends on the similarity between the region whose pixel value is to be estimated and the region around that pixel elsewhere in the image.

Collecting similar parts of the image, or more precisely patches, along with their respective weights leads to a benchmark method called block matching and 3D filtering (BM3D) [16]. The 2D blocks are 2D image fragments, and a collection of these blocks forms a 3D group. These groups are used to restore images in the transform domain [17, 18] and are introduced in the regularization terms of sparse coding to exploit structural properties with fewer reference frames. Sparse representation models can be used to code a patch of an image, but such a representation tends to be less accurate. Hence, the sparse coding noise is exploited [17]. The training process involves denoising based on the approximation of noisy patches using a sparse linear combination of elements taken from a dictionary [18]. However, the complexity of such dictionaries is very high with mixed noise. Yuanjie Shao et al. [19] proposed a joint deblurring and matching scheme with a sparse representation prior. This scheme is computationally efficient owing to the use of the pseudo-Zernike moment, which has a much lower-dimensional representation than the original image feature. Rabha W. Ibrahim [20] proposed a scheme for denoising multiplicative noise using fractional calculus, in which convolving the input image pixels with a conformable fractional calculus mask window resulted in better denoising. Yehu Lv [21] presented a total variation method to tackle Poisson noise, which preserves edges while smoothing and eliminates the staircase effect more effectively. In [22], a switching bilateral filter was proposed to remove mixed noise. The filter parameters are computed based on the identification of noisy pixels. Noise estimation was done using the domain weight pattern, by which the filter parameters are adaptively chosen to remove mixed noise. A fuzzy-based hybrid filter was proposed to remove mixed noise [23]. A fuzzy metric-based filter is used to remove impulsive noise, while a fuzzy peer group method is used in the second stage to remove Gaussian noise.

In practice, the source of noise cannot be attributed to a single phenomenon. Multiple sources of noise exist and, correspondingly, this paper considers mixed noise consisting of Gaussian and impulse noise. The mix of impulse and Gaussian noise makes denoising more difficult, as the two possess entirely different properties. Few methods exist in the literature to remove this kind of mixed noise, and such schemes involve two sequential steps, i.e., detection of the pixels corrupted by impulsive noise followed by removal of the noise. Such schemes seem to be less effective when the noise is predominant. Hence, in this paper, a simple scheme is proposed to handle such mixed noise. A weight matrix is defined, which identifies the locations of the image that are affected by a specific kind of noise, and the noise is then removed using sparse coding. Experimental results show the scheme to be effective.

2 Background

As pointed out in the previous section, similar pixels and patches of the noisy regions are distributed over the entire image. The collection of similar pixels or regions is referred to as grouping. In general, a collection of k-dimensional fragments forms a (k + 1)-dimensional group. Hence, a group is a (k + 1)-dimensional object formed by grouping similar components of the image. Grouping enables the use of higher-dimensional filters that exploit the similarity among the segments of a group and provide an estimate of the true signal from the collection. This process is called collaborative filtering. K-means clustering is one of the fundamental clustering techniques [24]; it forms groups by identifying ‘k’ centroids and assigning each data component of the signal to one of the centroids based on its similarity to them. A distance parameter is defined to measure the similarity. Neural network concepts such as self-organizing feature maps (SOFM) [25] have also been in use for many years, mainly to reduce the dimensionality of the signal. Fine-tuning such networks to the number of levels results in clusters [26, 27]. These schemes perform clustering in such a way that a test segment belongs to one and only one cluster.

In contrast, matching-based grouping maps a test segment to multiple clusters based on its similarity to these clusters. A series of clusters may then be defined based on the level of similarity. A higher threshold for similarity yields fewer segments in the cluster; as the threshold decreases, the size of the cluster increases. This kind of scheme is flexible in defining clusters for the problem at hand.
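Matching-based grouping can be illustrated with a minimal sketch. It assumes squared Euclidean distance as the dissimilarity measure, so a higher similarity threshold corresponds to a smaller allowed distance. The function names `extract_patches` and `match_group` and all parameter values are illustrative and are not taken from the paper.

```python
import numpy as np

def extract_patches(img, size, stride):
    """Return all size x size patches (flattened), sampled every `stride` pixels."""
    h, w = img.shape
    patches = [
        img[r:r + size, c:c + size].ravel()
        for r in range(0, h - size + 1, stride)
        for c in range(0, w - size + 1, stride)
    ]
    return np.asarray(patches, dtype=float)

def match_group(patches, ref_index, max_dist):
    """Indices of patches whose squared distance to the reference is <= max_dist."""
    dist = np.sum((patches - patches[ref_index]) ** 2, axis=1)
    return np.flatnonzero(dist <= max_dist)

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32)).astype(float)   # synthetic stand-in image
patches = extract_patches(img, size=4, stride=4)          # 64 patches of 16 pixels

# Stricter matching (smaller allowed distance) can only shrink the group.
strict = match_group(patches, ref_index=0, max_dist=5e4)
loose = match_group(patches, ref_index=0, max_dist=2e5)
```

Because a group is built around each reference patch independently, the same patch may appear in the groups of several references, unlike the hard assignment produced by k-means.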

3 Mixed noise

Let p be the image and \(p_{xy}\) denote the pixel value of image p at the location (x, y). Let the noisy version of the image p be q. In the presence of additive white Gaussian noise, \(q_{xy}\), the pixel value of the noisy observation at (x, y), can be modeled as

$$q_{xy} = p_{xy} + r_{xy}$$

where \(r_{xy}\) is the noise component at (x, y). Gaussian noise can arise from different sources. Indeed, when several noise sources are present, the aggregate noise can be modeled as Gaussian noise, even when the sources are not identically distributed. Gaussian noise sources acting side by side, or in any other relation, produce noise that is itself characterized by a Gaussian distribution. Figure 1 shows the effect of Gaussian noise. The first row shows the original images, and the remaining rows show the images corrupted by Gaussian noise with standard deviations of 25, 50, 75 and 100.

Fig. 1 Effect of additive white Gaussian noise. First row: true images
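The additive model \(q_{xy} = p_{xy} + r_{xy}\) can be simulated with a short sketch. It assumes zero-mean noise and an 8-bit intensity scale; the flat test image and the helper name `add_gaussian_noise` are illustrative stand-ins, not the images or code used in the paper.

```python
import numpy as np

def add_gaussian_noise(p, sigma, rng):
    """Return q = p + r, where r is drawn i.i.d. from N(0, sigma^2)."""
    r = rng.normal(loc=0.0, scale=sigma, size=p.shape)
    return p.astype(float) + r

rng = np.random.default_rng(0)
p = np.full((64, 64), 128.0)   # flat gray stand-in for a true image

# One noisy observation per standard deviation used in Fig. 1.
noisy = {sigma: add_gaussian_noise(p, sigma, rng) for sigma in (25, 50, 75, 100)}

# Empirical standard deviation of the noise component r = q - p.
measured = {sigma: float(np.std(q - p)) for sigma, q in noisy.items()}
```

The sketch does not clip the result to [0, 255]; whether and when to clip is a separate choice that affects the measured noise level at large standard deviations.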

Impulsive noise, as mentioned in the earlier sections, imposes a random value on the image at some locations determined by the parameters of the impulsive noise. The impulsive noise is characterized by the range \([c_{\min}, c_{\max}]\); the difference between these extremes defines its dynamic range. A special case of impulsive noise is salt-and-pepper noise, in which completely black or completely white pixels are introduced with a density defined by its parameter. Figure 2 shows noisy images with different densities of impulsive noise. The density is directly related to the amount of noise imposed and to the number of pixels affected by noise.

Fig. 2 Effect of impulsive noise (salt-and-pepper noise)
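Both kinds of impulsive noise described above can be sketched as follows. The sketch assumes that salt and pepper are equally likely and that random-valued impulses are uniform on \([c_{\min}, c_{\max}]\); neither assumption is stated in the text, and the function names are illustrative.

```python
import numpy as np

def add_salt_and_pepper(p, density, rng, c_min=0.0, c_max=255.0):
    """Set a fraction `density` of pixels to c_min or c_max (assumed equally likely)."""
    q = p.astype(float).copy()
    u = rng.random(p.shape)
    q[u < density / 2] = c_min                       # pepper
    q[(u >= density / 2) & (u < density)] = c_max    # salt
    return q

def add_random_valued_impulse(p, density, rng, c_min=0.0, c_max=255.0):
    """Replace a fraction `density` of pixels by uniform values in [c_min, c_max]."""
    q = p.astype(float).copy()
    mask = rng.random(p.shape) < density
    q[mask] = rng.uniform(c_min, c_max, size=int(mask.sum()))
    return q

rng = np.random.default_rng(0)
p = np.full((128, 128), 128.0)   # flat gray stand-in for a true image
q_sp = add_salt_and_pepper(p, 0.1, rng)
q_rv = add_random_valued_impulse(p, 0.1, rng)

# Fraction of pixels actually changed; it should be close to the density.
fraction_sp = float(np.mean(q_sp != p))
```

Unlike Gaussian noise, the pixels outside the corrupted set are left exactly at their original values.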

In practice, irrespective of the noise sources, the noise can be attributed to the channel. Channel noise is not purely Gaussian, purely impulsive or of any other single kind. In this paper, a mixture of Gaussian and impulsive noise is simulated and applied to an image. Figure 3 shows the noisy images.

Fig. 3 Effect of mixed noise: Gaussian and impulsive noise

As can be verified from Figs. 1, 2 and 3, mixed noise degrades the image more severely than either kind of noise alone. Table 1 shows the PSNR values of the noisy image for different values of the Gaussian and impulse noise parameters.

Table 1 PSNR values of mixed noise
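The PSNR of a mixed-noise observation can be computed as in the sketch below. It assumes an 8-bit peak value of 255 and applies Gaussian noise first, followed by salt-and-pepper noise; the order of application and the clipping step are assumptions, since the text does not specify them. The flat test image is a stand-in, so the resulting values are not comparable with those in Table 1.

```python
import numpy as np

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((reference.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(peak ** 2 / mse))

def add_mixed_noise(p, sigma, density, rng):
    """Gaussian noise followed by salt-and-pepper noise (assumed order)."""
    q = p.astype(float) + rng.normal(0.0, sigma, size=p.shape)
    q = np.clip(q, 0.0, 255.0)                       # keep intensities in range
    u = rng.random(p.shape)
    q[u < density / 2] = 0.0                         # pepper
    q[(u >= density / 2) & (u < density)] = 255.0    # salt
    return q

rng = np.random.default_rng(0)
p = np.full((128, 128), 128.0)   # flat gray stand-in for a true image
gaussian_only = add_mixed_noise(p, sigma=25, density=0.0, rng=rng)
mixed = add_mixed_noise(p, sigma=25, density=0.2, rng=rng)

psnr_gaussian = psnr(p, gaussian_only)
psnr_mixed = psnr(p, mixed)      # lower: impulses add large squared errors
```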

4 Proposed method

Let x be the image. The image x can be represented using sparse representation in terms of another image. For this, an overcomplete dictionary is to be built. The dictionary can be either a set of predefined functions or the one designed adaptively using a set of high-quality images. The former is a practical scheme only if the signal at hand is represented using the features of a predefined set of functions. In the latter case, the use of a set of high-quality images gives a better solution by collecting all the needed fragments and reducing the number by a selection criterion.

Let \({\mathbf{x}}_{i}\) be the fragment extracted from x at the location indexed by i. Using the sparse representation [17], an overcomplete dictionary \(\varPhi = \left[ {\phi_{1} ,\phi_{2} , \ldots ,\phi_{n} } \right]\) is built, by which \({\mathbf{x}}_{i}\) can be coded. The fragment \({\mathbf{x}}_{i}\) can be generated using the dictionary as

$${\mathbf{x}}_{i} = \varPhi \alpha_{i}$$
(1)

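where \(\alpha_{i}\) is the coding vector. With \(R_{i}\) denoting the matrix that extracts the fragment \({\mathbf{x}}_{i}\) from x, so that \({\mathbf{x}}_{i} = R_{i} {\mathbf{x}}\), the least-squares solution for x is given by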

$${\mathbf{x}} = \left( {\sum\limits_{i} {R_{i}^{T} R_{i} } } \right)^{ - 1} \left( {\sum\limits_{i} {R_{i}^{T} \varPhi \alpha_{i} } } \right)$$
(2)
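The aggregation in Eq. (2) can be sketched numerically. Here each \(R_{i}\) is implemented implicitly as slicing, \(R_{i}^{T}\) as writing a fragment back into its place, and \(\sum_{i} R_{i}^{T} R_{i}\) as a per-pixel count of overlapping fragments, so the inverse reduces to an element-wise division. The fragments stand in for \(\varPhi \alpha_{i}\); the helper names, patch size and stride are illustrative assumptions.

```python
import numpy as np

def patch_positions(shape, size, stride):
    """Top-left corners of all fragments on a regular grid."""
    h, w = shape
    return [(r, c)
            for r in range(0, h - size + 1, stride)
            for c in range(0, w - size + 1, stride)]

def extract(x, positions, size):
    """Apply every R_i: return the fragments x_i = R_i x."""
    return [x[r:r + size, c:c + size].copy() for r, c in positions]

def aggregate(fragments, positions, shape, size):
    """Evaluate (sum_i R_i^T R_i)^(-1) sum_i R_i^T x_i by overlap averaging."""
    acc = np.zeros(shape)     # sum_i R_i^T x_i
    count = np.zeros(shape)   # diagonal of sum_i R_i^T R_i
    for frag, (r, c) in zip(fragments, positions):
        acc[r:r + size, c:c + size] += frag
        count[r:r + size, c:c + size] += 1.0
    return acc / count        # valid when every pixel is covered at least once

rng = np.random.default_rng(0)
x = rng.random((16, 16))
positions = patch_positions(x.shape, size=4, stride=2)   # overlapping fragments
fragments = extract(x, positions, size=4)
x_rec = aggregate(fragments, positions, x.shape, size=4)
```

When the fragments are exact, the aggregation returns the original image; when each fragment is a separate estimate, overlapping estimates of the same pixel are averaged.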

Equation (2) can be rewritten as

$${\mathbf{x}} = \varPhi \alpha$$
(3)

where \(\alpha\) is the collection of all vectors \(\alpha_{i}\). In the presence of noise, let y be the degraded version of image x. In the case of additive white Gaussian noise, the denoising model is characterized by

$$\hat{\alpha } = \arg \,\min_{\alpha } \left\| {{\mathbf{y}} - \varPhi \alpha } \right\|_{2}^{2} + \lambda {\mathbf{R}}(\alpha )$$
(4)

Here, λ is a regularization factor that scales the regularization term R(α). Although this model is designed for AWGN, it can formally be applied to any kind of additive noise. In the mixed noise case, however, the noise is no longer Gaussian, so the \(\ell_{2}\) fidelity term in Eq. (4) does not yield the maximum a posteriori solution.
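Once R(α) is fixed, Eq. (4) can be minimized numerically. The sketch below assumes \(R(\alpha) = \left\| \alpha \right\|_{1}\), which the text does not specify, and uses the iterative shrinkage-thresholding algorithm (ISTA) as one possible solver. The random dictionary, the value of λ and the iteration count are illustrative only.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(y, Phi, lam, n_iter=500):
    """Minimize ||y - Phi a||_2^2 + lam * ||a||_1 by proximal gradient steps."""
    L = 2.0 * np.linalg.norm(Phi, 2) ** 2   # Lipschitz constant of the gradient
    a = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * Phi.T @ (Phi @ a - y)  # gradient of the fidelity term
        a = soft_threshold(a - grad / L, lam / L)
    return a

def objective(y, Phi, a, lam):
    """Value of the cost in Eq. (4) under the assumed l1 regularizer."""
    return float(np.sum((y - Phi @ a) ** 2) + lam * np.sum(np.abs(a)))

rng = np.random.default_rng(0)
Phi = rng.normal(size=(32, 64))             # random overcomplete dictionary
Phi /= np.linalg.norm(Phi, axis=0)          # unit-norm atoms
a_true = np.zeros(64)
a_true[[3, 17, 40]] = [5.0, -4.0, 3.0]      # sparse ground-truth code
y = Phi @ a_true + rng.normal(0.0, 0.05, size=32)

a_hat = ista(y, Phi, lam=0.5)
```

With the step size 1/L used here, each ISTA iteration does not increase the objective, which makes the solver a convenient baseline even though faster variants exist.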

In the case of mixed noise, particularly the mix of Gaussian and impulse noise, many pixels are affected only by Gaussian noise, while the remaining pixels are affected by both Gaussian and impulsive noise. The number of pixels affected only by Gaussian noise is determined by the density of the impulse noise: the higher the density, the fewer such pixels remain.

The pixels affected only by Gaussian noise need to be distinguished from those affected by both Gaussian and impulse noise. A parameter is therefore defined to discriminate between the two and is included in the standard sparse coding model to remove the noise resulting from both Gaussian and impulse noise. This is accomplished by using a weight matrix. Equation (5) shows the modified model.

$$\hat{\alpha } = \arg \,\min_{\alpha } \left\| {W^{1/2} ({\mathbf{y}} - \varPhi \alpha )} \right\|_{2}^{2} + \lambda {\mathbf{R}}(\alpha )$$
(5)

Here, W is the weight matrix.
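The effect of the weighted fidelity term in Eq. (5) can be illustrated on a toy problem. The text does not give a formula for W, so the sketch uses a hypothetical rule in which a diagonal W assigns weights near zero to entries with large coding residuals. For a closed-form solve it also assumes a ridge regularizer \(R(\alpha) = \left\| \alpha \right\|_{2}^{2}\) in place of a sparsity prior. The names `weighted_ridge`, `residual_weights` and `scale` are illustrative.

```python
import numpy as np

def weighted_ridge(y, Phi, w, lam):
    """Minimize ||W^(1/2)(y - Phi a)||_2^2 + lam * ||a||_2^2 with W = diag(w)."""
    A = Phi.T @ (w[:, None] * Phi) + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ (w * y))

def residual_weights(y, Phi, a, scale):
    """Hypothetical weight rule: large residuals receive weights near zero."""
    r = y - Phi @ a
    return np.exp(-(r / scale) ** 2)

rng = np.random.default_rng(0)
Phi = rng.normal(size=(100, 5))
a_true = np.array([1.0, -2.0, 0.5, 3.0, -1.0])
y = Phi @ a_true + rng.normal(0.0, 0.1, size=100)   # Gaussian noise everywhere
hit = rng.choice(100, size=20, replace=False)       # entries struck by impulses
y[hit] = rng.choice([-50.0, 50.0], size=20)

# Unweighted fit (W = I) versus a few rounds of reweighting.
a_plain = weighted_ridge(y, Phi, np.ones(100), lam=0.1)
a_w = a_plain
for _ in range(3):
    w = residual_weights(y, Phi, a_w, scale=10.0)
    a_w = weighted_ridge(y, Phi, w, lam=0.1)
```

In this toy setting the impulse-corrupted entries end up with negligible weight, so the fit is driven by the entries that carry only Gaussian noise, which is the role the text assigns to W.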

The quality of the reconstructed image improves to a great extent when non-local similarity is considered, since many repetitive patches are present over the image. For each patch \(x_{i}\) in the image, similar patches \(x_{i}^{'}\) are identified using the following relation.

$$e_{i} = \left\| {x_{i} - x_{i}^{'} } \right\|_{2}^{2} \le t$$
(6)

where \(t\) is a threshold. Now, \(x_{i}\) is estimated as the weighted average of the L similar patches \(x_{i}^{'}\), with appropriately chosen weights b. The estimate is not always accurate; the estimation error is given by

$$\left\| {x_{i} - \sum\limits_{i = 1}^{L} {b_{i} x_{i}^{\prime} } } \right\|_{2}^{2}$$
(7)
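Equations (6) and (7) can be combined in a short sketch: candidates are kept if their squared distance to \(x_{i}\) is at most t, and the estimate is their weighted average. The weights are assumed to decay exponentially with distance and to sum to one, a common choice in non-local means that the text does not prescribe; the parameter `h` and the function name are illustrative.

```python
import numpy as np

def nonlocal_estimate(x_i, candidates, t, h):
    """Weighted average of candidates with ||x_i - x'||^2 <= t, plus the Eq. (7) error."""
    e = np.sum((candidates - x_i) ** 2, axis=1)   # Eq. (6) distances
    keep = e <= t
    if not np.any(keep):
        return x_i.copy(), 0.0                    # no similar patch: keep x_i
    b = np.exp(-e[keep] / h)                      # assumed weight rule
    b /= b.sum()                                  # weights sum to one
    estimate = b @ candidates[keep]
    error = float(np.sum((x_i - estimate) ** 2))  # Eq. (7)
    return estimate, error

rng = np.random.default_rng(0)
base = rng.uniform(50.0, 200.0, size=16)               # clean 4x4 patch, flattened
x_i = base + rng.normal(0.0, 5.0, size=16)             # noisy patch to estimate
similar = base + rng.normal(0.0, 5.0, size=(30, 16))   # noisy repeats of the patch
unrelated = rng.uniform(0.0, 255.0, size=(5, 16))      # patches from elsewhere
candidates = np.vstack([similar, unrelated])

estimate, err7 = nonlocal_estimate(x_i, candidates, t=3000.0, h=1000.0)
```

Averaging many independently corrupted copies of the same structure reduces the Gaussian noise in the estimate, which is why the error term of Eq. (7) is useful as a regularizer.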

Adding the error term of Eq. (7) to the weighted model of Eq. (5), the denoising model in the presence of mixed noise becomes

$$\hat{\alpha } = \arg \,\min_{\alpha } \left\| {W^{1/2} ({\mathbf{y}} - \varPhi \alpha )} \right\|_{2}^{2} + \lambda {\mathbf{R}}(\alpha ) + \left\| {x_{i} - \sum\limits_{i = 1}^{L} {b_{i} x_{i}^{\prime} } } \right\|_{2}^{2}$$
(8)

The proposed model exploits the non-local similarity present in the image. The inclusion of the weight matrix in sparse coding is the crucial contribution, as it enables the application of standard sparse-based schemes to mixed noise. The standard regression models used with sparse coding in image restoration can also be applied to this scheme.

5 Simulation results

This section presents the experimental results of the proposed scheme. The earlier sections discussed the effect of Gaussian noise, impulse noise and mixed noise with the respective noise parameters and PSNR values. The degradation caused by mixed noise is more severe than that caused by either noise alone. Figure 4 shows noisy images with different amounts of Gaussian and impulse noise, along with images corrupted by each noise individually. Figure 5 shows the denoised images with the respective PSNR values.

Fig. 4 Noisy images with Gaussian and impulse noise

Fig. 5 Denoised images

In addition to the above combinations of Gaussian and impulse noise, several other combinations were simulated, and the proposed method was found to provide quality images even when the noise is intense. Table 2 shows as many as 66 noise combinations. Table 3 shows the PSNR values of the denoised images.

Table 2 PSNR values of noisy images with different combinations of Gaussian and impulse noises
Table 3 PSNR values of denoised images with different combinations given in Table 2

From Table 2, it is evident that impulse noise dominates the overall noise level. The PSNR values in the absence of Gaussian noise are 18.3 dB, 15.4 dB, 13.7 dB, 12.4 dB, 11.5 dB and 10.8 dB when the density of salt-and-pepper noise is 0.05, 0.1, 0.15, 0.2, 0.25 and 0.3, respectively. Adding Gaussian noise to these noisy images degrades the image quality noticeably when the density of impulse noise is low, but as the density of impulse noise increases, the effect of Gaussian noise becomes nominal. The complete set of PSNR values in Table 3 suggests that the operations involved in the proposed scheme result in restored images whose quality depends directly on the noise content of the degraded image.

Table 4 compares the performance of the proposed technique with that of state-of-the-art techniques in terms of PSNR for different values of the Gaussian standard deviation and the impulse noise density (ID). As the percentage of salt-and-pepper noise increases, the number of pixels affected by noise also increases. From Table 4, it is evident that the proposed scheme outperforms the state-of-the-art techniques even in high noise conditions, and the improvement in performance is most pronounced when the noise is strongest. From Fig. 6, it is evident that there is an improvement of at least 2 dB when the noise parameters sigma and density are 25 and 0.3.

Table 4 Performance analysis of proposed technique (PSNR values)
Fig. 6 Performance analysis of the proposed scheme

6 Conclusions

Denoising of an image contaminated by both Gaussian and impulsive noise is performed using a sparse representation-based model with dictionary learning. The standard sparse-based scheme reduces the effect of Gaussian noise, while the weight matrix extends the denoising model to mixed noise. The proposed method is analogous to hybrid filtering schemes, where the filter parameters are chosen adaptively for specific noise types. In hybrid filtering schemes, estimation models identify the noise type and the effect of the noise; in this paper, the weight matrix identifies the type of noise. Results show that the proposed method performs well, even in higher noise cases. The proposed method yields a 2% improvement when the noise level is low and up to 8% when the noise parameters are high, and thus appears to outperform the state-of-the-art schemes.