No-reference image quality assessment using modified extreme learning machine classifier

doi:10.1016/j.asoc.2008.07.005

Applied Soft Computing

Volume 9, Issue 2, March 2009, Pages 541-552

https://doi.org/10.1016/j.asoc.2008.07.005

Abstract

In this paper, we present a machine learning approach to measure the visual quality of JPEG-coded images. The features for predicting the perceived image quality are extracted by considering key human visual sensitivity (HVS) factors such as edge amplitude, edge length, background activity and background luminance. Image quality assessment involves estimating the functional relationship between HVS features and subjective test scores. The quality of the compressed images is obtained without referring to their original images (‘No Reference’ metric). Here, the problem of quality estimation is transformed into a classification problem and solved using the extreme learning machine (ELM) algorithm. In ELM, the input weights and the bias values are randomly chosen and the output weights are analytically calculated. The generalization performance of the ELM algorithm for classification problems with imbalance in the number of samples per quality class depends critically on the input weights and the bias values. Hence, we propose two schemes, namely the k-fold selection scheme (KS-ELM) and the real-coded genetic algorithm (RCGA-ELM), to select the input weights and the bias values such that the generalization performance of the classifier is maximized. Results indicate that the proposed schemes significantly improve the performance of the ELM classifier under imbalanced conditions for image quality assessment. The experimental results show that the visual quality estimated by the proposed RCGA-ELM emulates the mean opinion score very well. The experimental results are compared with the existing JPEG no-reference image quality metric and the full-reference structural similarity image quality metric.

Introduction

The main objective of image/video quality assessment metrics is to evaluate the visual quality of a compressed image/video with or without referring to its original form. It is imperative that these measures exhibit good correlation with perception by the human visual system (HVS). The most widely used objective image quality metrics, namely the mean square error (MSE) and the peak signal-to-noise ratio (PSNR), are widely observed not to correlate well with human perception [1], and they also require the original reference image to compute distortion. Most images on the Internet and in multimedia databases are only available in compressed form, and hence the inaccessibility of the original reference image makes it difficult to measure the image quality. Therefore, there is a clear need to develop metrics that closely correlate with human perception without needing the reference image.
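For concreteness, the two full-reference measures named above can be sketched as follows. This is a minimal illustration using the standard textbook definitions of MSE and PSNR; the function names and the 8-bit peak value of 255 are assumptions made for the example, not details taken from this paper.

```python
import numpy as np

def mse(reference: np.ndarray, distorted: np.ndarray) -> float:
    """Mean square error between two images of identical shape."""
    ref = reference.astype(np.float64)
    dis = distorted.astype(np.float64)
    return float(np.mean((ref - dis) ** 2))

def psnr(reference: np.ndarray, distorted: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit pixel values."""
    err = mse(reference, distorted)
    if err == 0.0:
        return float("inf")  # identical images: no distortion to measure
    return float(10.0 * np.log10(peak ** 2 / err))
```

Both functions take the original image as an argument, which is exactly the dependency that a no-reference metric has to avoid.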

A considerable volume of research has gone into developing objective image/video quality metrics that incorporate perceived quality measurement with due consideration for HVS characteristics. However, most of the proposed metrics based on HVS characteristics require the original image as a reference [2], [3], [4], [5]. Though it is easy to assess image quality without any reference by manual observation, developing a no-reference (NR) quality metric is a difficult task. To develop NR metrics, it is essential to have a priori knowledge about the nature of the artifacts. Currently, NR quality metrics are the subject of considerable attention in the research community, as is visible from the emergence of the video quality experts group (VQEG) [6], which is in the process of standardizing NR and reduced-reference (RR) video quality assessment methods.

The most popular and widely used image format on the Internet as well as in digital cameras is JPEG [7]. Since JPEG uses the block-based DCT for coding to achieve compression, the major artifact that JPEG-compressed images suffer from is blockiness. The compression rate (bit-rate) and image quality are mainly determined by the degree of quantization of these DCT coefficients. The undesirable consequences of quantization manifest as blockiness, ringing and blurring artifacts in the JPEG-coded image. It turns out that the subjective data for all these artifacts are highly correlated [8]. Hence, measuring the blockiness with reference to HVS criteria in turn indicates the image quality. Since image quality is a subjective phenomenon, manual inspection plays an important role. The subjective test is concerned with how an image is perceived by a viewer and records his/her opinion of a particular image (opinion score). The mean opinion score (MOS) is the average opinion score over all subjects. Here, the objective is to find the functional relationship between the extracted HVS features and the MOS for quantifying the quality of the image.

Existing algorithms have used a variety of methods to measure blockiness. Wang and Bovik proposed an algorithm based on computing the FFT along the rows and columns to estimate the strength of the block edges of the image [9]. Further, they proposed a nonlinear model for NR quality assessment of JPEG images, where the parameters of the model were determined with subjective test data [10]. Vlachos used cross-correlation of subsampled images to compute a blockiness metric [11]. Wu and Yuen proposed a metric based on computing gradients along block boundaries while tempering the result with a weighting function based on the HVS [3]. Here, the block edge strength for each frame was computed. Similar ideas about the HVS were utilized by Suthaharan [4] and Gao et al. [5]. The general idea behind these metrics was to temper the block edge gradient with the masking activity measured around it. These approaches utilize the fact that the gradient at a block edge can be masked by more spatially active areas around it, or in regions of extreme illumination (very dark or bright regions) [12]. Jung et al. [13] proposed an NR metric for emulating the full-reference metric proposed by Karunasekera and Kingsbury [2] using a neural network. They used general image features for training the neural network, and the results were not compared against subjective test scores. On the other hand, Gastaldo et al. [14], [15] recently proposed a circular back propagation (CBP)-based image quality evaluation method using general pixel-based image features such as higher order moments, without considering the HVS. In all of the above-mentioned approaches, extracting a large number of general image features is computationally quite complex for real-time implementation. Also, the functional relation between the HVS features and the MOS is nonlinear and difficult to model mathematically. Under these circumstances, neural networks are best suited for solving such problems.
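To make the notion of a block-edge gradient concrete, the sketch below compares the average pixel difference across 8 × 8 block boundaries with the average difference inside blocks. It is an illustrative toy measure only: it does not reproduce any of the cited metrics, it ignores HVS masking entirely, it looks at horizontal differences only, and the function name `boundary_gradient_ratio` is hypothetical.

```python
import numpy as np

def boundary_gradient_ratio(image: np.ndarray, block: int = 8) -> float:
    """Ratio of mean horizontal gradient at block boundaries to that inside blocks.

    Values well above 1 suggest visible vertical block edges (toy measure).
    """
    img = image.astype(np.float64)
    # Column j of `diff` holds |I[:, j + 1] - I[:, j]|.
    diff = np.abs(np.diff(img, axis=1))
    cols = np.arange(diff.shape[1])
    # The step from column j to j + 1 crosses a block boundary
    # when j + 1 is a multiple of the block size.
    at_boundary = (cols + 1) % block == 0
    boundary = diff[:, at_boundary].mean()
    interior = diff[:, ~at_boundary].mean()
    # Small constant avoids division by zero on perfectly flat blocks.
    return float(boundary / (interior + 1e-12))
```

On a smooth ramp the ratio stays near 1, while an image made of constant 8 × 8 tiles yields a very large ratio. An HVS-based metric would additionally weight each boundary by the local activity and luminance around it.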

In the last few decades, extensive research has been carried out in developing the theory and the application of artificial neural networks (ANNs). ANNs possess an inherent structure suitable for mapping complex characteristics, learning and optimization, and have emerged as a powerful mathematical tool for solving various practical problems like pattern classification and recognition, medical imaging, speech recognition and control [16], [17], [18], [19]. Furthermore, from a practical perspective, the massive parallelism and fast adaptability of neural network implementations provide more incentives for further investigation in problems involving complex mapping with uncertainties. Of the many neural network architectures proposed, single hidden layer feedforward networks (SLFNs) with sigmoidal or radial basis functions are found to be effective for solving a number of real-world problems. The free parameters of the network are learned from given training samples using gradient descent algorithms. The gradient descent algorithms are relatively slow and have many issues in error convergence.

Recently, it has been shown that an SLFN with randomly chosen input weights and hidden bias values can approximate any continuous function to any desired accuracy [20]. Here, the output weights are analytically calculated using the Moore-Penrose generalized pseudo-inverse [21]. Since the learning algorithm is fast and has good generalization ability, it is called ‘extreme learning machine’ (ELM). The ELM algorithm overcomes many issues in traditional gradient algorithms such as stopping criterion, learning rate, number of epochs and local minima. In fact, the performance of the ELM algorithm on many real-world problems has been compared with other neural network approaches [22] and has been found to be better.
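The training procedure described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming a sigmoid hidden layer, input weights and biases drawn uniformly from [-1, 1], and one-hot class targets; the function names and these choices are assumptions for the example rather than the exact configuration used in the paper.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng):
    """Fit an ELM: random hidden layer, least-squares output weights."""
    n_features = X.shape[1]
    # Input weights and hidden biases are drawn once and never updated.
    W = rng.uniform(-1.0, 1.0, size=(n_features, n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))  # hidden-layer output matrix
    # Output weights via the Moore-Penrose pseudo-inverse of H.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for new samples; take argmax over columns for a class label."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Because `W` and `b` are never adjusted, the quality of a single random draw directly affects the classifier, which is the sensitivity the selection schemes in this paper aim to address.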

In this paper, we present image quality estimation using the ELM algorithm. In general, the quality estimation problem is the process of finding the functional relationship between the MOS values and the feature inputs. However, the MOS values depend on the number of opinion scores per image. If the number of opinions available for a given image is low and the opinions are statistically different, the performance of the image quality estimator will be affected. Hence, in this study, the problem is circumvented by converting the MOS values to quality classes. The functional relationship between the HVS features and the quality class label is approximated using the ELM classifier. The image quality metric is calculated using the predicted class label and the posterior probability.
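The conversion from opinion scores to a class label can be illustrated as follows. This is only a sketch: the class boundaries `edges` and the helper names are hypothetical, since the excerpt does not state how many quality classes are used or where their boundaries lie.

```python
import numpy as np

def mean_opinion_score(scores):
    """Average of the individual opinion scores collected for one image."""
    return float(np.mean(scores))

def mos_to_class(mos, edges):
    """Map MOS values to integer class labels using ascending class boundaries.

    A value x gets label i when edges[i - 1] <= x < edges[i].
    """
    return np.digitize(mos, edges)

# Hypothetical boundaries splitting a 1-10 scale into five classes.
example_edges = [2.0, 4.0, 6.0, 8.0]
example_labels = mos_to_class(np.array([1.5, 3.0, 4.0, 9.2]), example_edges)
```

Binning makes the target less sensitive to small fluctuations in the MOS of images that were rated by few subjects, at the cost of a coarser quality scale.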

Here, the quality classification problem has few training samples per class and a high imbalance in the number of samples per class. In such cases, the generalization performance of the ELM algorithm depends on the proper selection of the input weights and hidden bias values (fixed parameters). Also, the number of hidden neurons affects the generalization performance. Hence, in this paper, we present k-fold cross-validation (KS-ELM) and real-coded genetic algorithm (RCGA-ELM) approaches to select appropriate values for the free parameters in the extreme learning machine classifier. In the RCGA-ELM approach, the minimal number of hidden neurons and the corresponding input weights and bias values are selected automatically, whereas the KS-ELM approach requires an exhaustive search to determine the number of hidden neurons. The proposed RCGA-ELM is different from the existing ‘evolutionary ELM’ (E-ELM) algorithm [23]. In E-ELM, the genetic algorithm searches only for the best input weights and bias values for a given number of hidden neurons such that the network has better generalization performance, and the optimal number of hidden neurons is obtained using exhaustive search. In RCGA-ELM, by contrast, new genetic operators are defined to find the minimal number of hidden neurons and their corresponding input weights and bias values. First, we evaluate the performance of the KS-ELM, RCGA-ELM and ELM algorithms using classification problems from the UCI machine learning repository [24] to validate the proposed schemes. The results clearly indicate that the proposed KS-ELM and RCGA-ELM provide better generalization performance than the conventional ELM algorithm for the classification problems.
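One plausible reading of a cross-validation-based selection of the hidden-layer parameters is sketched below: several random draws of input weights and biases are scored by k-fold validation accuracy and the best draw is kept. This is an assumption-laden illustration, not the paper's KS-ELM procedure, whose details are not given in this excerpt; the function names, the candidate count and the use of plain accuracy as the score are all hypothetical.

```python
import numpy as np

def hidden_output(X, W, b):
    """Sigmoid hidden-layer outputs for fixed input weights W and biases b."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def cv_accuracy(X, y, W, b, n_classes, folds):
    """k-fold validation accuracy of an ELM whose hidden parameters are fixed."""
    T = np.eye(n_classes)[y]  # one-hot targets
    correct = 0
    for i, val in enumerate(folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        beta = np.linalg.pinv(hidden_output(X[train], W, b)) @ T[train]
        pred = np.argmax(hidden_output(X[val], W, b) @ beta, axis=1)
        correct += int(np.sum(pred == y[val]))
    return correct / len(X)

def select_hidden_parameters(X, y, n_hidden, n_classes,
                             n_candidates=20, k=5, seed=0):
    """Return (score, W, b) for the random draw with the best k-fold accuracy."""
    rng = np.random.default_rng(seed)
    # The same folds are reused for every candidate so scores are comparable.
    folds = np.array_split(rng.permutation(len(X)), k)
    best = None
    for _ in range(n_candidates):
        W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
        b = rng.uniform(-1.0, 1.0, size=n_hidden)
        score = cv_accuracy(X, y, W, b, n_classes, folds)
        if best is None or score > best[0]:
            best = (score, W, b)
    return best
```

With strongly imbalanced classes, overall accuracy can hide poor performance on small classes, so a per-class average could be substituted as the score. Searching over `n_hidden` as well would require an outer loop, which corresponds to the exhaustive search mentioned above.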

For our image quality estimation, experiments are carried out using two disjoint sets of original images with their compressed versions from the JPEG LIVE image quality database [25]. Out of 29 original images, 20 original images and their compressed versions are used for image quality model development. The remaining nine original images and their compressed versions are used for evaluating the performance. Finally, the performance of the proposed image quality estimators is compared with the available NR image quality metric [10] and the full-reference (FR) structural similarity image quality metric (SSIM) [26].

The organization of this paper is as follows: Section 2 describes the HVS-based feature extraction technique. In Section 3, we briefly present the recently developed ELM algorithm and issues related to the ELM algorithm for classification problems with high imbalance in the samples. Section 4 presents the k-fold ELM and the RCGA-ELM classifiers to handle high imbalance in the number of samples per class. Performance evaluation of the proposed classifiers for three benchmark multi-category problems and image quality estimation is presented in Section 5. Section 6 summarizes the main conclusions from this study.

Section snippets

HVS-based Feature Extraction

It is easily deducible that most of the distortion in image/video is due to block DCT-based compression. The most popular and widely used image format on the Internet and in digital cameras is JPEG [7]. Since JPEG uses the block-based DCT for coding to achieve compression, the major artifact that JPEG-compressed images suffer from is blockiness. In JPEG coding, non-overlapping $8 \times 8$ pixel blocks are coded independently using the DCT. The compression ratio and the image

Extreme learning machine

In this section, we present a brief overview of the extreme learning machine (ELM) algorithm [22]. ELM is a single hidden layer feedforward network, where the input weights are chosen randomly and the output weights are calculated analytically. For hidden neurons, many activation functions such as sigmoidal, sine, Gaussian and hard-limiting functions can be used, and the output neurons have a linear activation function. ELM uses the non-differentiable or even discontinuous functions as an

Real-coded genetic algorithm approach

The real-coded genetic algorithm (RCGA) is perhaps the most well-known of all evolution-based search techniques [28]. Genetic algorithms were developed in an attempt to explain the adaptive processes of natural systems and to design artificial systems based upon these natural systems. Genetic algorithms are widely used to solve complex optimization problems where the number of parameters and constraints is large and analytical solutions are difficult to obtain. In recent years, many

Experiments and discussions

In this section, we present the performance comparison of proposed KS-ELM, RCGA-ELM and ELM classifiers on benchmark classification data sets first. Next, we present the results for the image quality estimation problem.

Conclusions

In this paper, we have presented a system for predicting image quality using the extreme learning machine algorithm, considering various human visual characteristics. The functional relationship between the extracted HVS features and the MOS is modeled by the ELM algorithm. The random selection of input weights and bias values considerably affects the generalization performance of the ELM algorithm for classification problems with high imbalance in the training data set. For this purpose, we

Acknowledgments

This work was in part supported by the ITRC, Korea University, Korea, under the auspices of the Ministry of Information and Communication. The authors would also like to thank Prof. Bovik and his lab members for providing the JPEG image quality assessment database to test our metric.

References (33)

L. Meesters et al., A single-ended blockiness measure for JPEG-coded images, Signal Processing (2002)
P. Gastaldo et al., Objective quality assessment of displayed images by using neural networks, Signal Processing: Image Communication (2005)
G.-B. Huang et al., Extreme learning machine: Theory and applications, Neurocomputing (2006)
Q.Y. Zhu et al., Evolutionary extreme learning machine, Pattern Recognition (2005)
Z. Wang et al., No-reference perceptual quality assessment of JPEG compressed images
S.A. Karunasekera et al., A distortion measure for blocking artifacts in images based on human visual sensitivity, IEEE Transactions on Image Processing (1995)
H.R. Wu et al., A generalized block-edge impairment metric for video coding, IEEE Signal Processing Letters (1998)
S. Suthaharan, A perceptually significant block-edge impairment metric for digital video coding
W. Gao et al., A de-blocking algorithm and a blockiness metric for highly compressed images, IEEE Transactions on Circuits and Systems for Video Technology (2002)
Video Quality Experts Group (VQEG), website:...
JPEG official site,...
Z. Wang et al., Blind measurement of blocking artifacts in images
Z. Wang et al., Why is image quality assessment so difficult?
T. Vlachos, Detection of blocking artifacts in compressed video (2000)
M. Yuen et al., A survey of hybrid MC/DPCM/DCT video coding distortions, Signal Processing (1997)
M. Jung et al., Univariant assessment of the quality of images, Journal of Electronic Imaging (2002)

