Elsevier

Neurocomputing

Volume 173, Part 3, 15 January 2016, Pages 1630-1639

Sample-screening MKL method via boosting strategy for hyperspectral image classification

https://doi.org/10.1016/j.neucom.2015.09.035

Abstract

Limited training samples are a persistent concern in hyperspectral remote sensing image classification. In this paper, a sample-screening multiple kernel learning (S2MKL) method is proposed for hyperspectral image classification with limited training samples. The core idea is to employ a boosting strategy to screen the limited training samples under the MKL framework. Unlike existing methods, the proposed MKL method uses boosting to try different combinations of the limited training samples and to adaptively determine the optimal weights of the base kernels in the linear combination. Morphological profiles are first extracted as joint spatial and spectral features for classification, in place of the original spectra. With the morphological profiles, the AdaBoost strategy is then introduced to guide the construction of the multiple kernel learning machine. By means of the boosting strategy, the limited samples are effectively screened and used for classification, and the weights of the base kernels in the linear combination are determined automatically during sample screening. Three real hyperspectral data sets are used to evaluate the proposed method. The experimental results show that the proposed boosting-based multiple kernel learning method outperforms state-of-the-art methods in classification performance when only limited samples are used.

Introduction

Satellite images captured by hyperspectral sensors often have more than one hundred spectral bands for each pixel. Hyperspectral images can therefore provide abundant spectral information about the physical nature of different materials, which can be used to distinguish objects in the image scene. The analysis of hyperspectral images, and hyperspectral image classification in particular, is an active research area in the remote sensing community [1]. It is also important to make use of the spatial information available in a hyperspectral image [2], [3], especially for urban data classification. Morphological methods use both the spectral and the spatial information for classification [4], [5], [6]. Related morphological methods have been applied successfully to hyperspectral images, such as extended morphological profiles (EMPs) [7] and morphological attribute profiles (APs) [8]. Joint spectral-spatial texture features can also be obtained with feature transforms (such as wavelet and Gabor transforms) [9], [10], [11]. Another family of spectral-spatial classification methods is based on image segmentation [12], [13], [14]. These methods segment the hyperspectral image into regions based on homogeneity, so that all pixels within the same region can be treated as a spatial neighborhood.

The small number of training samples is a challenging problem in hyperspectral image classification because of the Hughes phenomenon. To deal with this problem, the support vector machine (SVM) [15], [16] was introduced, as it can be trained with only a small amount of data. An SVM maps the training data into a much higher-dimensional space and searches in that space for an optimal linear hyperplane, which decides the category of the samples. Apart from their generally high discriminative power, SVMs are particularly well suited to hyperspectral image classification because they work with small training sets and make no assumptions about the underlying statistical distribution.
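
The idea of "mapping to a higher dimension" can be illustrated with a small sketch. It is not taken from the paper: the degree-2 polynomial kernel and the names `phi`, `dot` and `poly2_kernel` are assumptions chosen only to show that a kernel evaluated in the input space can equal an inner product in a higher-dimensional feature space.

```python
import math

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def poly2_kernel(x, z):
    """Homogeneous polynomial kernel K(x, z) = (x . z)^2."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
k_implicit = poly2_kernel(x, z)    # evaluated in the 2-D input space
k_explicit = dot(phi(x), phi(z))   # evaluated in the 3-D feature space
```

Both quantities agree up to rounding, so a classifier that only needs inner products never has to build the higher-dimensional vectors explicitly.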

Hyperspectral features are strongly correlated; hence, valuable and independent information can often be summarized in a well-chosen subset. For morphological profile (MP) features, which can be generated at different scales, it is crucial to select good scales and types of structuring element (SE). Existing methods generally stack multiscale MPs into a single vector, and an SVM is then trained on the stacked feature vector.

Recently, multiple kernel learning (MKL) has attracted increasing attention [17], [18], [19], [20], [21]. MKL combines multiple base kernels built from features into a new kernel machine, which outperforms any individual base kernel. By taking different feature kernels into account, MKL achieves better capability and higher flexibility. Several effective MKL methods have been proposed for hyperspectral image classification, such as SimpleMKL [22], representative multiple kernel learning (RMKL) [23], and rule-based multiple kernel learning (RBMKL) [24]. The SimpleMKL algorithm uses gradient descent to iteratively optimize the SVM objective function until it converges to the optimal combination of base kernels. RMKL aims to capture most of the variation of the original base kernels by finding a low-dimensional representation. RBMKL (mean) trains an SVM with the mean of the base kernels, and RBMKL (product) trains an SVM with their product. Conventional MKL learns a model by seeking the optimal combination of multiple predefined kernels for the classification task. Conventional MKL methods (e.g., SimpleMKL) are often formulated as a complicated optimization task, typically a convex one, which is then solved with existing optimization techniques.
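
As a rough sketch of what "combining base kernels" means in practice, the following plain-Python example builds three Gaussian base kernels and combines them by a weighted sum and by an element-wise product, in the spirit of the mean and product rules described above. The function names, the bandwidths and the sample points are hypothetical, and the constraint that the weights are non-negative and sum to one is a common MKL convention assumed here rather than a detail stated in the text.

```python
import math

def rbf_kernel_matrix(X, sigma):
    """Gaussian RBF Gram matrix for a list of feature vectors."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return K

def combine_linear(kernels, weights):
    """K = sum_m d_m * K_m, with d_m >= 0 and sum_m d_m = 1 (assumed convention)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

def combine_product(kernels):
    """Element-wise product of the base kernels."""
    n = len(kernels[0])
    return [[math.prod(K[i][j] for K in kernels)
             for j in range(n)] for i in range(n)]

X = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]                      # hypothetical samples
base = [rbf_kernel_matrix(X, s) for s in (0.5, 1.0, 2.0)]     # one kernel per scale
K_mean = combine_linear(base, [1.0 / 3] * 3)                  # mean rule
K_weighted = combine_linear(base, [0.2, 0.3, 0.5])            # learned weights would go here
K_prod = combine_product(base)                                # product rule
```

Any of the combined matrices could then be passed to a kernel classifier that accepts a precomputed Gram matrix; the MKL methods listed above differ mainly in how the weights are obtained.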

On the other hand, boosting is a general method for improving the accuracy of any given learning algorithm [25]. Boosting was proposed by Schapire (1990) [26] and improved by Freund and Schapire (1997) [27]. Boosting sequentially trains and combines a collection of classifiers in such a way that training samples misclassified by earlier classifiers play a more important role in the training of later ones. In [28], a multiple kernel boosting algorithm was proposed and showed good classification performance. The multiple kernel boosting algorithm learns an ensemble of base kernel classifiers, each of which is learned from a single kernel.
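
The reweighting mechanism can be sketched with a minimal discrete AdaBoost on one-dimensional data. This is a generic textbook-style illustration with threshold stumps as weak learners, not the multiple kernel boosting algorithm of [28]; the data and all function names are hypothetical.

```python
import math

def train_stump(xs, ys, w):
    """Return (weighted error, threshold, polarity) of the best 1-D stump."""
    best = None
    for thr in sorted(set(xs)):
        for pol in (1, -1):
            preds = [pol if x >= thr else -pol for x in xs]
            err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(xs, ys, rounds):
    """Discrete AdaBoost; labels are +1 / -1."""
    n = len(xs)
    w = [1.0 / n] * n                       # start from a uniform distribution
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = train_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)   # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Misclassified samples are scaled up, correct ones scaled down.
        w = [wi * math.exp(-alpha * y * (pol if x >= thr else -pol))
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble, w

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(a * (pol if x >= thr else -pol) for a, thr, pol in ensemble)
    return 1 if score >= 0 else -1

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1, 1, -1, -1, 1, 1]                   # not separable by a single stump
ensemble, final_w = adaboost(xs, ys, rounds=3)
_, w_after_one = adaboost(xs, ys, rounds=1)
```

After the first round, the samples that the first stump gets wrong carry more weight than the others, which steers the next stump towards them.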

Motivated by boosting techniques, a sample-screening MKL (S2MKL for short) method is proposed for hyperspectral image classification with limited samples. The proposed algorithm uses AdaBoost as the framework of the MKL classifier. To integrate spatial and spectral information, MPs are chosen as the feature extraction method for classification. During the algorithm, samples are selected in such a way that easily misclassified samples are more likely to be chosen to train the classifier. The proposed method thus adjusts the parameters of MKL through the sample-screening process. In addition, by means of the boosting idea, the proposed method effectively chooses the kernels and computes the multiple kernel combination coefficients without solving a complicated optimization task.
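
The two ingredients described here, drawing training samples according to a distribution that favours hard samples and deriving kernel weights from base-classifier performance, can be sketched as follows. The exact update rules of S2MKL are not given in this text, so the proportional normalization below, the numbers and the function names are assumptions made purely for illustration.

```python
import random

def screen_samples(indices, distribution, k, rng):
    """Draw k training indices with replacement; likelier (harder) samples recur more."""
    return rng.choices(indices, weights=distribution, k=k)

def kernel_weights_from_accuracy(accuracies):
    """Normalize per-kernel accuracies into combination coefficients (assumed rule)."""
    total = sum(accuracies)
    return [a / total for a in accuracies]

rng = random.Random(0)                          # fixed seed for repeatability
indices = list(range(5))
distribution = [0.05, 0.05, 0.10, 0.30, 0.50]   # hypothetical boosting distribution
drawn = screen_samples(indices, distribution, k=1000, rng=rng)
counts = [drawn.count(i) for i in indices]      # how often each sample was screened in

accuracies = [0.60, 0.90, 0.75]                 # hypothetical base-classifier accuracies
d = kernel_weights_from_accuracy(accuracies)    # coefficients sum to one
```

Under this assumed rule, the base kernel whose classifier is most accurate receives the largest coefficient, which matches the intuition that the coefficients indicate which base kernels matter.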

It should be noted that the way limited samples are screened in the proposed method is similar to an active learning strategy. Recently, active learning has been introduced to hyperspectral image classification [29], [30]. The original goal of active learning is to train a classifier on a small set of well-chosen samples. A classifier trained on such limited samples can perform as well as, or better than, a classifier trained on a larger number of randomly chosen samples. Unlike the active learning strategies developed for hyperspectral image classification in [29], [30], the proposed method does not intend to select an optimal subset of training samples. In substance, under the boosting framework, the proposed method adopts a sample-screening strategy to automatically determine the weights of the base kernels in the linear combination. In this way, the limited training samples can be well exploited in the MKL framework to serve the final classification.

The contributions of this paper are as follows: (1) a method is proposed to deal with limited training samples in hyperspectral image classification by screening important training samples via boosting under the MKL framework; (2) the linear combination coefficients of MKL are adaptively determined by the performance of the base classifiers during the boosting trials, which gives an intuitive understanding of which base kernels are important.

The remainder of this paper is organized as follows. In Section 2, we introduce kernel methods, including the standard kernel method (SVM) and MKL. In Section 3, we discuss the proposed method for hyperspectral image classification. In Section 4, we describe the experimental hyperspectral data sets employed in this work. In Section 5, we classify the hyperspectral images and verify the effectiveness of the proposed algorithm through a series of comparative experiments. Section 6 summarizes the experimental results and draws conclusions.


Kernels and support vector machine

Over the last decades, kernel methods such as support vector machines (SVMs) have become classical tools for classification. In classification, the performance of the learning algorithm strongly depends on the data representation. In kernel methods, the data representation is chosen implicitly through the so-called kernel K(x,x′). This kernel plays a dual role: it defines the similarity between two samples x and x′, and it determines an appropriate regularization term for the

The morphological profile

Mathematical morphology is a theory for analyzing the spatial relationships between pixels. Its two fundamental operators are erosion and dilation. These operators are applied to an image with a set of known shapes, called structuring elements (SEs). Opening and closing are combinations of erosion and dilation. Erosion of an image followed by dilation of the eroded result is called opening, and dilation of an image followed by erosion of the dilated result is
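
A minimal plain-Python sketch of these operators on a tiny grayscale image is given below, following the standard definitions (opening is erosion followed by dilation; closing is dilation followed by erosion). The cross-shaped SE, the image values, the border handling (out-of-image neighbours are ignored) and the function names are illustrative assumptions, and the SE is symmetric so that no reflection step is needed.

```python
def _apply(img, offsets, reduce_fn):
    """Apply min or max over the SE neighbourhood, ignoring out-of-image pixels."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            vals = [img[r + dr][c + dc] for dr, dc in offsets
                    if 0 <= r + dr < h and 0 <= c + dc < w]
            out[r][c] = reduce_fn(vals)
    return out

def erode(img, offsets):
    return _apply(img, offsets, min)

def dilate(img, offsets):
    return _apply(img, offsets, max)

def opening(img, offsets):
    """Erosion followed by dilation: removes bright details smaller than the SE."""
    return dilate(erode(img, offsets), offsets)

def closing(img, offsets):
    """Dilation followed by erosion: fills dark details smaller than the SE."""
    return erode(dilate(img, offsets), offsets)

CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]   # symmetric 5-pixel SE

img = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 0, 0],   # isolated bright pixel
    [0, 0, 5, 5, 5],
    [0, 0, 5, 5, 5],
    [0, 0, 5, 5, 5],
]
opened = opening(img, CROSS)
closed = closing(img, CROSS)
```

In this example the isolated bright pixel disappears after opening while the interior of the larger bright block is kept, which is the behaviour that makes openings and closings at several SE sizes useful as multiscale spatial features.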

Case study

Three hyperspectral data sets were used to test the effectiveness of the proposed S2MKL. Detailed information is given in Table 1.

1) Hyperspectral data of the University of Pavia: the first hyperspectral data set was acquired over Pavia University (abbreviated as Pavia) by the ROSIS-3 sensor during a flight campaign over Pavia, northern Italy. The data set has 115 spectral bands and covers the 0.43–0.86 μm range of the electromagnetic spectrum. After removing some bands due to

Experimental settings

The three hyperspectral data sets were first processed by principal component analysis (PCA) and then by mathematical morphological feature extraction. The eigenvalues were arranged in descending order, and the first k PCs whose eigenvalues summed to 99.9% of the total over all PCs were retained. A diamond SE was used for the MPs. Ten SE sizes were used, ranging from 1 to 10 with a step size of one. The MPs were generated by applying respectively either 10 morphological openings or 10 closings after PCA-based
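
The two settings described here, the 99.9% criterion for the number of retained PCs and the ten diamond SE sizes, can be sketched in plain Python. The eigenvalues are made up, the reading of SE "size" as the diamond radius is an assumption, and the function names are hypothetical.

```python
def num_components(eigenvalues, ratio=0.999):
    """Smallest k whose leading eigenvalues reach `ratio` of the eigenvalue sum."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, ev in enumerate(ordered, start=1):
        running += ev
        if running / total >= ratio:
            return k
    return len(ordered)

def diamond_offsets(radius):
    """Offsets (dr, dc) of a diamond SE: all pixels with |dr| + |dc| <= radius."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if abs(dr) + abs(dc) <= radius]

eigenvalues = [80.0, 15.0, 4.0, 0.95, 0.04, 0.01]   # hypothetical PCA spectrum
k = num_components(eigenvalues)                      # number of PCs to keep

# One diamond SE per size 1..10 (size interpreted as radius, an assumption).
ses = {size: diamond_offsets(size) for size in range(1, 11)}
```

With one opening and one closing per SE size and per retained PC, the stacked profile grows linearly in both the number of PCs and the number of sizes, which is why the choice of k and of the SE range matters for the feature dimension.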

Conclusion

This paper presents an effective S2MKL method for the classification of hyperspectral data. S2MKL applies boosting techniques to screen the limited samples and to solve the optimization task. The effective training samples are adaptively screened according to a probability distribution. S2MKL learns the MKL model by combining base classifiers, and the combination coefficients are obtained from the classification accuracy of the base classifiers.

The S2MKL algorithm has been compared with the

Acknowledgment

This work was supported by the National Science Fund for Excellent Young Scholars under Grant 61522107 and the National Natural Science Foundation of China under Grant 61371180.


References (31)

  • Y. Freund et al.

    A decision-theoretic generalization of on-line learning and an application to boosting

    J. Comput. Syst. Sci.

    (1997)
  • Y. Zhong et al.

    An adaptive artificial immune network for supervised classification of multi-/hyperspectral remote sensing imagery

    IEEE Trans. Geosci. Remote Sens.

    (2012)
  • R. Ji et al.

    Spectral-spatial constraint hyperspectral image classification

    IEEE Trans. Geosci. Remote Sens.

    (2014)
  • Y. Gao et al.

    Hyperspectral image classification through bilayer graph-based learning

    IEEE Trans. Image Process.

    (2014)
  • M. Pesaresi et al.

    Approach for the morphological segmentation of high-resolution satellite imagery

    IEEE Trans. Geosci. Remote Sens.

    (2001)
  • J.A. Benediktsson et al.

    Classification and feature extraction of remote sensing images from urban areas based on morphological approaches

    IEEE Trans. Geosci. Remote Sens.

    (2003)
  • Z.Y. Lv et al.

    Morphological profiles based on differently shaped structuring elements for classification of images with very high spatial resolution

    IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.

    (2014)
  • J.A. Benediktsson et al.

    Classification of hyperspectral data from urban areas based on extended morphological profiles

    IEEE Trans. Geosci. Remote Sens.

    (2005)
  • M.D. Mura et al.

    Morphological attribute profiles for the analysis of very high resolution images

    IEEE Trans. Geosci. Remote Sens.

    (2010)
  • M. Unser

    Texture classification and segmentation using wavelet frames

    IEEE Trans. Image Process.

    (1995)
  • T.C. Bau et al.

    Hyperspectral region classification using a three-dimensional Gabor filterbank

    IEEE Trans. Geosci. Remote Sens.

    (2010)
  • S. Jia et al.

    Gabor feature-based collaborative representation for hyperspectral imagery classification

    IEEE Trans. Geosci. Remote Sens.

    (2015)
  • M. Fauvel et al.

    Advances in spectral-spatial classification of hyperspectral images

    Proc. IEEE

    (2013)
  • J. Zhao et al.

    Detail-preserving smoothing classifier based on conditional random fields for high spatial resolution remote sensing imagery

    IEEE Trans. Geosci. Remote Sens.

    (2015)
  • Y. Zhong et al.

    A hybrid object-oriented conditional random field classification framework for high spatial resolution remote sensing imagery

    IEEE Trans. Geosci. Remote Sens.

    (2014)

Yanfeng Gu received the Ph.D. degree in information and communication engineering from Harbin Institute of Technology (HIT), Harbin, China, in 2005. He joined the School of Electronics and Information Engineering, HIT, as a Lecturer. He was appointed Associate Professor at the same institute in 2006; meanwhile, he was enrolled in the first Outstanding Young Teacher Training Program of HIT. From 2011 to 2012, he was a Visiting Scholar with the Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA, USA. He is currently a Professor with the Department of Information Engineering, HIT, Harbin, China. He has published more than 60 peer-reviewed papers and four book chapters, and he is the inventor or co-inventor of 7 patents. His research interests include image processing in remote sensing, machine learning and pattern analysis, and multiscale geometric analysis. Dr. Gu is an associate editor for Neurocomputing and a peer reviewer for several international journals, such as IEEE Transactions on Geoscience and Remote Sensing, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, IEEE Transactions on Image Processing, and IEEE Geoscience and Remote Sensing Letters.

Huan Liu received the B.Eng. degree from Harbin Institute of Technology in 2015. She is currently pursuing the M.Eng. degree at the Institute of Image and Information Technology, Harbin Institute of Technology. Her research interests are multi-source remote sensing image classification and machine learning.
