Abstract

Good contrast is important for the analysis of medical images, and images with poor contrast can benefit considerably from contrast enhancement. In this paper, a convolution neural network-based transfer learning approach is utilized for contrast enhancement of mammographic images. The experiments are conducted on the ISP and MIAS datasets, where the ISP dataset is used for training and the MIAS dataset is used for testing (contrast enhancement). The proposed technique is compared experimentally with the most popular direct and indirect contrast enhancement techniques such as CLAHE, BBHE, RMSHE, and contrast stretching. A quantitative comparison is done using mean square error (MSE), signal to noise ratio (SNR), and peak signal to noise ratio (PSNR). It is observed that the proposed technique outperforms HE, RMSHE, CLAHE, BBHE, and contrast stretching.

1. Introduction

In women, breast cancer is the most common disease after lung cancer [1, 2]. Its detection and treatment in the early phases enhance the chance of successful recovery from the disease. Detection is done using mammography, which checks for abnormalities present in the breasts [3]. The contrast indicates the regions with increased blood flow, as cancerous tissues have more blood vessels. In a mammogram, the difference in contrast between malignant tissue and normal tissue is very low, and human eyes may not be able to observe it [4]. Hence, enhancing the contrast between the normal and cancerous cells makes it much easier to detect cancer in mammograms.

Contrast enhancement techniques can be broadly classified as indirect and direct (see Figure 1). Indirect techniques increase the contrast by modifying the histogram, whereas direct techniques modify the image contrast directly. Histogram equalization [5] is the earliest indirect technique for contrast enhancement. Contrast limited adaptive histogram equalization (CLAHE) [6] is an improved version of histogram equalization. Brightness preserving bihistogram equalization (BBHE) [7] bifurcates the image using the mean and then performs histogram equalization. Recursive mean-separate histogram equalization (RMSHE) [8] is an enhanced version of BBHE. Contrast stretching [9] is a direct technique which stretches the range of intensities.

In this paper, a convolution neural network (CNN)-based transfer learning approach is used for enhancing the contrast of mammographic images. Transfer learning refers to the machine learning approach in which the training and test datasets are different. So, a CNN trained on a dataset of contrast-enhanced images is applied to mammographic images to regenerate the same image with higher contrast. The contributions of this paper are as follows:
(i) A novel convolutional neural network-based solution is proposed for contrast enhancement of images
(ii) For evaluation of the results, the SSIM metric is utilized, which gives a better idea of the quality of the image
(iii) A transfer learning approach is utilized to enhance the contrast of medical images by training the network on a nonmedical dataset

The remainder of the paper is organized as follows: in Section 2, a literature survey of contrast enhancement techniques is provided where a discussion of all contrast enhancement techniques is done. In Section 3, the CNN architecture utilized for training the dataset is discussed. Experimental results and their analysis are shown in Section 4. Section 5 concludes the research paper and presents future directions.

2. Literature Survey

In this section, all the popular contrast enhancement techniques present in the literature such as histogram equalization, CLAHE, and BBHE are discussed.

Histogram equalization (HE) remaps the input grey levels so that the grey levels of the output image are approximately uniformly distributed [10], i.e., each grey level has roughly an equal number of pixels [11, 12]. This technique has the disadvantage that it considers the global intensity of the image instead of the local intensity for contrast enhancement, and it does not take the visual details of the input image into account at the time of enhancement. This results in excessive contrast enhancement: the resulting image looks unnatural and contains visual artifacts.
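The global mapping described above can be illustrated with a minimal sketch. It assumes an 8-bit greyscale image stored as a list of rows of integers and uses the common cumulative distribution function (CDF) formulation; the function name `equalize_histogram` is hypothetical and is not taken from the paper.

```python
def equalize_histogram(image, levels=256):
    """Globally equalize a greyscale image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    n = len(pixels)

    # Histogram: number of pixels at each grey level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution function (CDF) over the grey levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, so the darkest occupied level maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    if cdf_min == n:
        return [row[:] for row in image]  # constant image: nothing to equalize

    # Look-up table: each level is mapped through the normalized CDF.
    lut = [round(max(c - cdf_min, 0) * (levels - 1) / (n - cdf_min)) for c in cdf]
    return [[lut[p] for p in row] for row in image]
```

Because a single look-up table is built from the whole image, every pixel with the same grey level receives the same output value regardless of its neighbourhood, which is the global behaviour criticized above.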

Contrast limited adaptive histogram equalization (CLAHE) is an upgraded form of histogram equalization [8, 10, 13] which is generally used for low contrast images. CLAHE first divides the input image into multiple disjoint subimages that do not overlap each other [14]. It then performs histogram equalization on each subimage separately. The slope of the transformation function depends on the height of the histogram, so the histograms of the subimages are clipped to a limit [15]. This clipping limit bounds the enhancement applied to every pixel [6]. In the resultant image, all the details are clearly visible against the background [16]. The biggest advantage of the CLAHE technique is that it enhances both the foreground and the background, which results in a high contrast output image.
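The two steps that distinguish CLAHE from plain histogram equalization, tiling and histogram clipping, can be sketched as follows. This is only an illustration under stated assumptions: the helper names `split_into_tiles` and `clip_histogram` are hypothetical, the clipped excess is assumed to be redistributed evenly over all bins, and the interpolation between neighbouring tiles that full CLAHE implementations typically apply is omitted.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Split an image (list of rows) into non-overlapping tiles, row-major."""
    tiles = []
    for top in range(0, len(image), tile_h):
        for left in range(0, len(image[0]), tile_w):
            tiles.append([row[left:left + tile_w]
                          for row in image[top:top + tile_h]])
    return tiles


def clip_histogram(hist, clip_limit):
    """Clip each bin at clip_limit and spread the excess evenly over all bins."""
    excess = sum(max(count - clip_limit, 0) for count in hist)
    clipped = [min(count, clip_limit) for count in hist]
    share, remainder = divmod(excess, len(hist))
    # Every bin receives an equal share; the first `remainder` bins get one
    # extra count so that the total number of pixels is preserved.
    return [c + share + (1 if i < remainder else 0)
            for i, c in enumerate(clipped)]
```

Each tile's clipped histogram would then be equalized separately. Note that after redistribution a bin can sit slightly above the limit again, which simple implementations usually accept.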

The brightness preserving bihistogram equalization (BBHE) technique bifurcates the image using its mean brightness [4, 7, 17]. The first part contains the pixels with intensity values from zero to the mean intensity, and the other part contains the pixels with intensity values from the mean intensity to the maximum intensity of the image. BBHE independently performs histogram equalization on both parts. It then takes the union of the two equalized subimages to give a brightness-preserved, contrast-enhanced image [18, 19]. The disadvantage of this method is that it does not give good results for distorted contrast images.

The recursive mean-separate histogram equalization (RMSHE) technique separates the image at its mean and then performs histogram equalization [20]. It offers better brightness preservation, i.e., the original brightness values are not destroyed [10, 18]. RMSHE first bifurcates the original image using the mean intensity and then performs histogram equalization on both subimages. The mean separation is applied recursively, and each additional level of separation generates a better image. However, one disadvantage is that more levels of mean separation make the technique complex and time-consuming.
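A minimal sketch of recursive mean separation is given below. It assumes an 8-bit greyscale image stored as a list of rows of integers; the function name `rmshe` and the parameter `recursion` are hypothetical. With `recursion=1` the sketch reduces to the single mean split described for BBHE, and each further level doubles the number of segments that are equalized independently.

```python
def rmshe(image, recursion=1, levels=256):
    """Recursive mean-separate histogram equalization on a list of rows of ints."""
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1

    # Split the grey-level range at the mean of each segment, `recursion` times.
    segments = [(0, levels - 1)]
    for _ in range(recursion):
        split = []
        for lo, hi in segments:
            count = sum(hist[lo:hi + 1])
            if count == 0 or lo == hi:
                split.append((lo, hi))  # nothing to separate
                continue
            total = sum(level * hist[level] for level in range(lo, hi + 1))
            mean = min(total // count, hi - 1)
            split += [(lo, mean), (mean + 1, hi)]
        segments = split

    # Equalize every segment independently; outputs stay inside [lo, hi],
    # so pixels never cross a mean boundary.
    lut = list(range(levels))
    for lo, hi in segments:
        count = sum(hist[lo:hi + 1])
        if count == 0:
            continue
        running = 0
        for level in range(lo, hi + 1):
            running += hist[level]
            lut[level] = lo + round((hi - lo) * running / count)
    return [[lut[p] for p in row] for row in image]
```

Since every segment is equalized only within its own range, the ordering of pixels around each mean is kept, which is what preserves brightness; the cost of the extra passes grows with the recursion level.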

Contrast stretching [9] is a type of normalization which stretches the range of intensities. To perform stretching, the technique specifies the upper and lower limits of the pixel values over which normalization is performed. The advantage of contrast stretching is that it enhances contrast in the image without distorting grey levels.
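As an illustration, linear contrast stretching between a lower and an upper limit can be sketched as below. The function name `contrast_stretch` and its parameters are hypothetical; when no limits are given, the sketch assumes the minimum and maximum pixel values of the image are used.

```python
def contrast_stretch(image, lower=None, upper=None, out_min=0, out_max=255):
    """Linearly map [lower, upper] onto [out_min, out_max]."""
    pixels = [p for row in image for p in row]
    lower = min(pixels) if lower is None else lower
    upper = max(pixels) if upper is None else upper
    if upper == lower:
        return [row[:] for row in image]  # flat image: nothing to stretch

    def stretch(p):
        # Values outside the limits are clamped before the linear mapping.
        clamped = min(max(p, lower), upper)
        return round(out_min + (clamped - lower) * (out_max - out_min) / (upper - lower))

    return [[stretch(p) for p in row] for row in image]
```

The mapping is monotonic and linear, so the relative order and spacing of grey levels inside the limits is kept.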

Retinex theory is a popular basis for the enhancement of low-light images. In this method, the image is decomposed into two components: reflectance and illumination. SSR (single-scale retinex) [21], MSR (multiscale retinex) [22], and MSRCR (multiscale retinex with color restoration) [23] are some of the retinex-based algorithms for enhancement of low-light images. LIME (low-light image enhancement) [24] treats image enhancement as an optimization problem in which the illumination of the image is optimized.

The DWT-SVD (discrete wavelet transform-singular value decomposition) [25, 26] method has been actively used for the enhancement of low-light images. Recently, researchers have worked on image enhancement using deep learning. LightenNet [27] is a CNN-based solution for enhancement of weakly illuminated images. Low-light net (LLNet) [28] is an auto-encoder-based solution for denoising and enhancing images. Low-light CNN [29] improves on LLNet by adopting the structural similarity index (SSIM) as the loss function for enhanced texture preservation. A dual transformation network has recently been proposed in [30] for image contrast enhancement. In [31], the authors proposed a two-stage neural network to enhance the contrast of CT scan images. Recently, a neural network-based progressive-recursive image enhancement network [32] has been proposed to enhance low-light images.

3. Proposed Technique

The deep learning methods discussed in the previous section target general image enhancement and are not dedicated to contrast enhancement. In this section, a convolution neural network architecture dedicated to contrast enhancement in images is proposed.

3.1. Convolution Neural Networks

Convolution neural networks (CNNs) are advanced neural networks which are able to extract features from images by themselves. A convolution neural network consists of multiple convolution and pooling layers with a fully connected layer at the end [33]. The convolution layer is responsible for extracting the features from the images, whereas the pooling layer reduces the size of its input while preserving the important information. VGG16, LeNet, AlexNet, and ResNet are some of the popular CNN architectures [34].

In mathematics, “convolution” refers to an operation on two functions which generates a third function [35]. In neural networks, convolution refers to a multiplication operation between the input and a weight vector. In a CNN, the input is a 2D array and the weight vector is also a 2D array but of smaller dimensions. This 2D array of weights is called a “kernel” or filter. The kernel is slid over the input array in an overlapping or nonoverlapping fashion to generate a 2D output array. The weights of the kernel are adjusted and optimized during the training process using the back-propagation algorithm. These weights help in extracting the features from the images.
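The sliding of a kernel over a 2D input can be sketched as follows. This is a minimal illustration without padding; the function name `conv2d` and the `stride` parameter are hypothetical. As in most CNN libraries, the kernel is not flipped, so strictly speaking the operation is a cross-correlation.

```python
def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (both lists of rows) without padding."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = (len(image) - k_h) // stride + 1
    out_w = (len(image[0]) - k_w) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Multiply the current window by the kernel element-wise and sum.
            row.append(sum(image[top + a][left + b] * kernel[a][b]
                           for a in range(k_h) for b in range(k_w)))
        output.append(row)
    return output
```

A stride of 1 gives overlapping windows, whereas a stride equal to the kernel size gives nonoverlapping windows. For example, `conv2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], [[1, 0], [0, -1]])` applies a simple diagonal-difference kernel at four overlapping positions.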

3.2. CNN Architecture for Contrast Enhancement

The architecture takes a 3 × 256 × 256 (i.e., color/RGB) image and passes it through a convolution layer with a 5 × 5 kernel. The output of this layer is low-level features such as edges and curves. This output is fed into the second convolution layer with a 3 × 3 kernel to get higher-level features such as quadrilaterals, semicircles, and other combinations of edges and curves. The output of both layers will be feature maps of size 64 × 256 × 256.

After passing the image through two convolution layers and obtaining features, upsampling operations are done to enhance the resolution of the images. A bilinear (bicubic) interpolation method is used to magnify the image to nearly twice the size. This is done so that the image can be passed to another convolution layer to get even higher-level features. The output of the upsampling layer will be a feature map of size 64 × 512 × 512.

The output of the upsampling layer is passed to the third convolution layer with a 3 × 3 kernel to get higher-level features (compared to those generated by the second layer) of size 64 × 512 × 512. These features are then passed to a max-pooling layer of kernel size 2 to reduce the feature size to 64 × 256 × 256.

The feature map generated by the pooling layer is passed to the fourth convolution layer with kernel size 3 × 3 to generate even higher-level features of size 64 × 256 × 256. These features are again upsampled to get a feature map of size 64 × 512 × 512. These upsampled features are passed to three consecutive convolution layers with kernel size 3 × 3. The output layer will generate an image of size 3 × 256 × 256. Table 1 shows the layers with output size.
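The feature-map sizes quoted above can be checked with a small shape-tracing sketch. Several details are assumptions that the text does not state: convolutions are assumed to use padding of half the kernel size, upsampling is assumed to scale by exactly 2, the two intermediate layers of the final stage are assumed to keep 64 channels, and the last convolution is assumed to use stride 2 so that the 512 × 512 map returns to 256 × 256. All function and layer names are hypothetical.

```python
def conv_shape(shape, out_channels, kernel, stride=1):
    """Output shape of a convolution with padding of kernel // 2 (assumed)."""
    _, h, w = shape
    pad = kernel // 2
    return (out_channels,
            (h + 2 * pad - kernel) // stride + 1,
            (w + 2 * pad - kernel) // stride + 1)


def upsample_shape(shape, factor=2):
    c, h, w = shape
    return (c, h * factor, w * factor)


def pool_shape(shape, kernel=2):
    c, h, w = shape
    return (c, h // kernel, w // kernel)


def trace_architecture(input_shape=(3, 256, 256)):
    """Return (layer name, output shape) pairs for the described pipeline."""
    steps = [
        ("conv1 5x5", lambda s: conv_shape(s, 64, 5)),
        ("conv2 3x3", lambda s: conv_shape(s, 64, 3)),
        ("upsample1", upsample_shape),
        ("conv3 3x3", lambda s: conv_shape(s, 64, 3)),
        ("maxpool", pool_shape),
        ("conv4 3x3", lambda s: conv_shape(s, 64, 3)),
        ("upsample2", upsample_shape),
        # ASSUMPTION: channel counts and the stride-2 output layer below are
        # guesses made only so that the final size matches 3 x 256 x 256.
        ("conv5 3x3", lambda s: conv_shape(s, 64, 3)),
        ("conv6 3x3", lambda s: conv_shape(s, 64, 3)),
        ("conv7 3x3 (output)", lambda s: conv_shape(s, 3, 3, stride=2)),
    ]
    trace, shape = [], input_shape
    for name, layer in steps:
        shape = layer(shape)
        trace.append((name, shape))
    return trace
```

Printing the result of `trace_architecture()` lists one shape per layer in order, which can be compared against the sizes stated in the text.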

The loss function to minimize is a combination of L1 loss and SSIM (Structural SIMilarity index). L1 loss helps in preserving the pixel-wise relations between the images but results in a lack of textural details. To preserve the textural details, SSIM is utilized:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)},$$

where “x” represents the original image, “y” represents the enhanced image, “µ” represents the pixel value average of the image, “σ” represents the variance/covariance, and c1, c2 are the constants which prevent the denominator from being zero.

We need to maximize SSIM, so we minimize 1 − SSIM. The loss function is therefore a weighted combination of the L1 loss and 1 − SSIM, where the weighting coefficient is set to 0.1.
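One possible form of such a combined loss is sketched below. The exact weighting is an assumption: the sketch uses loss = L1 + weight × (1 − SSIM) with weight = 0.1, which may differ from the authors' formulation. For brevity, SSIM is computed once over the whole image rather than over local windows, and the constants c1 and c2 are set to commonly used defaults for 8-bit images. All function names are hypothetical, and images are given as flat lists of pixel values.

```python
def mean(values):
    return sum(values) / len(values)


def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """SSIM computed over the whole image (single window, an assumption)."""
    mu_x, mu_y = mean(x), mean(y)
    var_x = mean([(a - mu_x) ** 2 for a in x])
    var_y = mean([(b - mu_y) ** 2 for b in y])
    cov_xy = mean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def l1_loss(x, y):
    """Mean absolute pixel difference."""
    return mean([abs(a - b) for a, b in zip(x, y)])


def combined_loss(x, y, weight=0.1):
    # ASSUMPTION: the 0.1 coefficient scales the (1 - SSIM) term.
    return l1_loss(x, y) + weight * (1.0 - ssim_global(x, y))
```

For two identical images the L1 term is zero and SSIM equals 1, so the combined loss vanishes; any pixel-wise or structural difference increases it.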

4. Results and Discussion

The dataset used for training the network is the DeepISP dataset, which consists of 110 pairs (normal and low-light) of images, as shown in Figure 2. After training the CNN on these images, the mammography images from the MIAS dataset obtained from Kaggle are enhanced (i.e., transfer learning approach).

4.1. Evaluation Metrics

Contrast enhancement gives a processed image that has better contrast than the unprocessed image. This type of enhancement can be identified by visual inspection of the image, but visual inspection does not give a complete and specific characterization. Moreover, no single parameter or method can give both a subjective and an objective characterization. So, the quality parameters MSE, PSNR, and SNR are used for the performance evaluation.

4.1.1. Mean Square Error (MSE)

MSE is the average of the squares of the differences between the pixel values of the two images. MSE is a risk function, also known as the mean square deviation. A smaller value of MSE denotes a better-quality image and vice versa. We define the error between two images with the mathematical formula:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (A_i - B_i)^2$$

Here, n denotes the total number of pixels in the given image, and $A_i$ and $B_i$ denote the ith pixel of images A and B.

4.1.2. Peak Signal to Noise Ratio (PSNR)

PSNR quantifies peak error and is used to compare image compression quality. However, perceptual quality is not reflected by it. A small PSNR value indicates poor image quality, and a high PSNR value indicates good image quality. The mathematical formula for the calculation of PSNR is

$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{R^2}{M}\right),$$

where M is the mean square error in the image and R denotes the highest fluctuation present in the image, i.e., the highest possible pixel value. For images that represent pixels with 8 bits per sample, R is 255. In general, R can be calculated using the formula:

$$R = 2^B - 1$$

Here, B denotes the number of bits per sample used to represent each pixel of the image.
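The two metrics defined so far can be computed with a short sketch. Images are assumed to be flat lists of pixel values of equal length, and the function names `mse` and `psnr` are hypothetical. Returning infinity for identical images is a convention chosen here, since the ratio is undefined when the mean square error is zero.

```python
import math


def mse(a, b):
    """Mean of the squared pixel differences between two images."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


def psnr(a, b, bits=8):
    """Peak signal to noise ratio in decibels for `bits` bits per sample."""
    r = 2 ** bits - 1  # highest possible pixel value, 255 for 8 bits
    m = mse(a, b)
    if m == 0:
        return math.inf  # identical images: no error to measure
    return 10 * math.log10(r ** 2 / m)
```

For example, `mse([1, 2], [1, 4])` averages the squared differences 0 and 4, and the PSNR decreases as this error grows.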

4.1.3. Signal to Noise Ratio (SNR)

SNR is defined as the ratio of signal to noise in the image. A higher SNR value indicates high image quality and vice versa. SNR is calculated using the following equation:

$$\mathrm{SNR} = \frac{M_{\text{signal}}}{M_{\text{noise}}},$$

where $M_{\text{signal}}$ is the power of the signal and $M_{\text{noise}}$ is the power of the noise.
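A minimal sketch of the SNR computation is given below. How the two powers are measured is an assumption: the signal power is taken as the mean squared pixel value of the reference image and the noise power as the mean squared difference between the reference and processed images. The function name `snr` and the `in_decibels` option are hypothetical; the decibel form is included only because SNR is often reported on that scale.

```python
import math


def snr(reference, processed, in_decibels=False):
    """Ratio of signal power to noise power between two flat pixel lists."""
    n = len(reference)
    # ASSUMPTION: signal power is the mean squared value of the reference.
    signal_power = sum(p ** 2 for p in reference) / n
    # ASSUMPTION: noise power is the mean squared difference of the images.
    noise_power = sum((p - q) ** 2 for p, q in zip(reference, processed)) / n
    if noise_power == 0:
        return math.inf  # identical images: no noise
    ratio = signal_power / noise_power
    return 10 * math.log10(ratio) if in_decibels else ratio
```

A larger value means the difference between the two images is small relative to the signal itself.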

4.1.4. Structural Similarity Index

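SSIM is utilized to assess how well the textural details are preserved:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)},$$

where “x” represents the original image, “y” represents the enhanced image, “µ” represents the pixel value average of the image, “σ” represents the variance/covariance, and c1, c2 are the constants which prevent the denominator from being zero.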

4.2. Comparative Analysis

The results are presented in Table 2 and Figures 3 to 6.

On analysing Table 2, it is noted (Figure 4 and Figure 5) that the proposed CNN-CE technique gives the least MSE value among all the contrast enhancement techniques. Similarly, on analysing Table 2, it is observed that the CNN-CE technique gives the highest PSNR value for all images among all the contrast enhancement techniques. Further, it is observed that the CNN-CE technique gives the highest SNR value for all images among all the contrast enhancement techniques followed by contrast stretching and CLAHE technique.

Thus, based on performance analysis on MSE, PSNR, and SNR, it can be concluded that the proposed technique gives the best-enhanced image in comparison to histogram equalization, CLAHE, BBHE, RMSHE, and contrast stretching techniques.

Figures 3(b) to 3(d) show the results of histogram equalization, CLAHE, and BBHE techniques on the original image (mdb021) which is shown in Figure 3(a), and Figures 4(a) to 4(c) show the results of RMSHE, contrast stretching, and CNN-CE techniques, respectively.

It is noted from Figures 3 and 4 that histogram equalization enhances all the pixels of the image to a uniform level, and thus it just shows a brighter image. The CLAHE technique gives a better result for mammogram images: it shows details in the image relative to the background and performs better than the other techniques, except contrast stretching and the proposed technique. The BBHE technique gives a better result for the image. The RMSHE technique gives better results than the BBHE technique but worse than the CLAHE technique. The contrast stretching technique gives the best result for mammogram images after the proposed technique. The CNN-CE technique outperforms all the other techniques.

Figures 5(b) to 5(d) show the results of histogram equalization, CLAHE, and BBHE techniques on the original image (mdb004) which is shown in Figure 5(a). Figures 6(a) to 6(c) show the results of RMSHE, contrast stretching, and CNN-CE technique, respectively.

From Figures 5 and 6, it is observed that the histogram equalization technique excessively enhances the brightness and gives an unnatural look to the image. The CLAHE technique improves both the foreground and the background of the image and gives a much better result. The BBHE technique gives an average result. The RMSHE technique gives a better result than the BBHE and HE techniques. Contrast stretching enhances the contrast of the image up to a limit. The CNN-CE technique gives the best result.

5. Conclusions

In this paper, a new convolution neural network-based technique named CNN-CE is proposed for better enhancement of mammogram images. The proposed technique is compared with state-of-the-art techniques such as histogram equalization, CLAHE, contrast stretching, BBHE, and RMSHE by applying them to several different mammogram images taken from the standard MIAS dataset. Based on performance analysis using the evaluation metrics MSE, PSNR, and SNR, it is evident that CNN-CE achieves the best contrast enhancement for low contrast medical images such as mammogram images. The proposed technique also gives better brightness preservation for the mammographic image, with an SSIM score of 0.99 compared to scores in the range 0.95 to 0.98 for the other algorithms. In future work, more advanced deep neural networks or more recently proposed CNN architectures such as XRCE and NEC-UIUC can be used for transfer learning.

Data Availability

The data will be available from the author upon request ([email protected]).

Conflicts of Interest

The authors declare that they have no conflicts of interest.