Fast Endmember Extraction for Massive Hyperspectral Sensor Data on GPUs

Hyperspectral imaging sensors are becoming increasingly important in multisensor collaborative observation. The spectral mixture problem seriously limits the efficiency of hyperspectral data exploitation, and endmember extraction is one of the key issues in addressing it. Because of the high computational cost of the algorithms and the massive quantity of hyperspectral sensor data, high-performance computing is in strong demand for scenarios that require real-time response. In this paper, a parallel optimization of the well-known N-FINDR algorithm on graphics processing units (NFINDR-GPU) is proposed to realize fast endmember extraction for massive hyperspectral sensor data. The implementations of the proposed method are described and evaluated using the compute unified device architecture (CUDA) on the NVIDIA Quadro 600 and Tesla C2050. Experimental results show the effectiveness of NFINDR-GPU. The parallel algorithm is stable for different image sizes, and the average speedup is over thirty times on the Tesla C2050, which satisfies the real-time processing requirements.


Introduction
Multisensor image data fusion in remote sensing is a collaborative image processing technology for sensor networks that exploits the consistency and complementarity of image data from different sensors to produce accurate assessments [1]. Hyperspectral imaging sensors are becoming increasingly important in multisensor collaborative observation. A hyperspectral image contains tens or hundreds of contiguous bands of high spectral resolution, covering the visible, near-infrared, and shortwave infrared spectral regions [2]. It captures spectral signatures that enable identification of the materials that make up a scanned target, which greatly improves target recognition and detection in sensor networks.
The spatial resolution of hyperspectral sensor data is often relatively low, on the order of meters or tens of meters, so several different materials can jointly occupy a single pixel. Therefore, most pixels of hyperspectral data, called mixed pixels, contain more than one material (called an endmember) [3]. This mixture problem seriously limits the efficiency of hyperspectral data exploitation, and many researchers focus on hyperspectral unmixing. Endmember extraction is one of the key issues. Many algorithms have been developed to solve it, and N-FINDR [4] is one of the most widely used methods for automatically determining endmembers in hyperspectral data without a priori information; it has been applied successfully for over ten years.
Two factors limit the application of N-FINDR in multisensor collaborative observation. First, it is computationally expensive because of its high algorithmic complexity. Second, the quantity of hyperspectral sensor data is massive because of the extremely high dimensionality of the hyperspectral data cube. For example, the airborne visible-infrared imaging spectrometer (AVIRIS) records the visible and near-infrared spectrum (wavelength region 400-2500 nm) of reflected light over an area 2-12 km wide and several kilometers long, using 224 spectral bands. The resulting multidimensional data cube typically comprises several GBs per flight [5]. High-performance computing is therefore in strong demand for scenarios that require real-time response, such as military detection, monitoring of chemical contamination, and wildfire tracking.
International Journal of Distributed Sensor Networks
Several parallel computing technologies, such as supercomputers, clusters, distributed computing, multicore CPUs, field-programmable gate arrays (FPGAs), and graphics processing units (GPUs), are used to accelerate hyperspectral data processing algorithms [6, 7]. GPUs are quickly evolving into a standard architecture for hyperspectral processing because of their compactness, low cost, portability, low weight, and high computational power [8]. Wu et al. [9] presented an improved GPU implementation of the PPI algorithm that provides real-time performance. Plaza et al. [10] developed three new GPU-based implementations of endmember extraction algorithms: the pixel purity index (PPI), a kernel version of the PPI (KPPI), and the automatic morphological endmember extraction (AMEE) algorithm; they also provided a GPU-based implementation of the fully constrained linear spectral unmixing algorithm. Barberis et al. [6] proposed a new parallel implementation of the vertex component analysis (VCA) algorithm for spectral unmixing of remotely sensed hyperspectral data on commodity GPUs. Although several parallel implementations of the N-FINDR algorithm exist in the literature [5, 11-13], their speedup is less than 30 times, which does not adequately meet the requirements of real-time applications.
In this paper, we propose a parallel optimization of N-FINDR on graphics processing units (NFINDR-GPU) to realize real-time endmember extraction for massive hyperspectral sensor data. The implementations of the proposed method using the compute unified device architecture (CUDA) are described and evaluated. The computation time of the parallel implementation on GPUs is compared with that of the serial implementation on central processing units (CPUs).

Endmember Extraction for Hyperspectral Sensor Data
There are two models for unmixing hyperspectral sensor data: linear and nonlinear. The linear mixture model identifies a collection of spectrally pure constituent spectra (endmembers) and expresses the measured spectrum of each mixed pixel as a linear combination of endmembers weighted by fractional abundances that indicate the proportion of each endmember present in the pixel [3]. It assumes minimal secondary reflections and multiple scattering effects in the data collection procedure, and hence the measured spectra can be expressed as a linear combination of the spectral signatures of the materials present in the mixed pixel. The linear mixture model is formulated as follows:
$$x = Ma + n,$$
where $x$ denotes an $L$-by-1 spectrum vector of one pixel of the observed hyperspectral data; $L$ denotes the number of bands; $M = [m_1, m_2, \ldots, m_p]$ denotes an $L$-by-$p$ mixing matrix with endmembers as columns and is usually of full column rank; $p$ denotes the number of endmembers; $a = [a_1, a_2, \ldots, a_p]^T$ denotes a $p$-by-1 vector containing the respective fractional abundances of the endmembers; $a_i$ is the abundance fraction of the $i$th endmember, with $i = 1, 2, \ldots, p$; the notation $(\cdot)^T$ stands for the vector transpose; and $n$ denotes an additive $L$-by-1 noise vector collecting the errors affecting the measurements of the pixel at each spectral band. Endmember extraction of hyperspectral data aims at obtaining a good estimate of the mixing matrix $M$.
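As a small illustration of the linear mixture model, the following sketch synthesizes one mixed pixel as the abundance-weighted sum of endmember spectra plus optional Gaussian noise. The function name `mix_pixel`, the toy spectra, and the abundance values are hypothetical and chosen only for illustration; the abundances are picked to sum to one, which is a common convention and not something the text above states.

```python
import random


def mix_pixel(endmembers, abundances, noise_std=0.0, seed=0):
    """Synthesize one mixed pixel under the linear mixture model.

    endmembers: list of p spectra, each a list of L band values
                (the columns of the mixing matrix).
    abundances: list of p fractional abundances.
    noise_std:  standard deviation of additive Gaussian noise (assumed model).
    """
    rng = random.Random(seed)
    num_bands = len(endmembers[0])
    pixel = []
    for band in range(num_bands):
        # Linear combination of the endmember values in this band.
        value = sum(a * m[band] for a, m in zip(abundances, endmembers))
        if noise_std > 0:
            value += rng.gauss(0.0, noise_std)
        pixel.append(value)
    return pixel


# Toy data (made up): 3 endmembers observed in 4 bands.
M = [
    [0.1, 0.2, 0.3, 0.4],
    [0.9, 0.8, 0.7, 0.6],
    [0.5, 0.5, 0.5, 0.5],
]
a = [0.2, 0.3, 0.5]
x = mix_pixel(M, a)  # noise-free mixed pixel
```

Endmember extraction works in the opposite direction: given many such pixels, it tries to recover the spectra stored in `M`.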
Several methods have been used to perform endmember extraction, including geometrical, statistical, and sparse regression-based approaches [3]. A successful and widely used algorithm in the first category has been the N-FINDR.
The N-FINDR algorithm is an iterative optimization procedure that maximizes the volume of a simplex containing all hyperspectral image pixels in feature space and automatically extracts the endmembers in the hyperspectral scene [4]. It relies on the assumption that, when the noise vector $n$ is negligible, all the spectrum vectors of hyperspectral pixels are contained in a convex set (a simplex) in high-dimensional space, and the endmembers are the vertices of that simplex. Thus, the problem of endmember extraction is transformed into finding the vertices of the simplex.
N-FINDR looks for the set of pixels with the largest possible volume by inflating a simplex inside the data. The volume of a simplex is defined as
$$V(E) = \frac{1}{(p-1)!}\,\mathrm{abs}(|E|),$$
where $E$ is the matrix of endmembers augmented with a row of ones,
$$E = \begin{bmatrix} 1 & 1 & \cdots & 1 \\ e_1 & e_2 & \cdots & e_p \end{bmatrix};$$
$e_i$ is a column vector containing the spectrum of endmember $i$; $\mathrm{abs}(\cdot)$ is the absolute value; $|\cdot|$ denotes the determinant of a matrix; and $(p - 1)$ is the number of dimensions occupied by the data.
The determinant is only defined when the number of features is $p - 1$, where $p$ is the number of desired endmembers. Since typically $L \gg p$ in hyperspectral data, we adopt principal component analysis (PCA) to reduce the dimensionality of the input hyperspectral data [5] and use the virtual dimensionality (VD) method [14] to estimate the number of endmembers, as preprocessing for N-FINDR. After the preprocessing, the standard N-FINDR algorithm can be summarized as in Algorithm 1.
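The volume definition can be checked on a small case. The sketch below builds the augmented matrix (a row of ones on top of the endmember columns), computes its determinant by Gaussian elimination, and divides the absolute value by the factorial of the number of dimensions. The helper names `determinant` and `simplex_volume` are hypothetical, and the input is assumed to be already reduced to one fewer feature than there are endmembers.

```python
from math import factorial


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Choose the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # treated as singular: the simplex is degenerate
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def simplex_volume(endmembers):
    """Volume of the simplex whose vertices are the given endmembers.

    endmembers: list of p vectors, each with p - 1 reduced features.
    """
    p = len(endmembers)
    augmented = [[1.0] * p]  # row of ones
    for feature in range(p - 1):
        augmented.append([e[feature] for e in endmembers])
    return abs(determinant(augmented)) / factorial(p - 1)


# Three endmembers in two dimensions form a triangle; its "volume" is its area.
area = simplex_volume([(0, 0), (1, 0), (0, 1)])
```

For the unit right triangle the result is 0.5, the familiar triangle area, which is a quick sanity check of the factorial normalization.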

Fast Endmember Extraction Based on GPUs
Three factors lead to the high computational overhead of the N-FINDR algorithm. First, the algorithm is an iterative procedure, and every pixel in the data set must be evaluated to refine the estimate of the endmembers, looking for the set of pixels that maximizes the volume of the simplex defined by the selected endmembers [5]. Second, in step (4), the computation is done for every single element in the input data set, and the replacement step is repeated for all the pixel vectors in the data set. Third, in step (3), the calculation of the determinants is particularly time consuming.
To address these problems, the endmember extraction based on N-FINDR is optimized using the compute unified device architecture (CUDA) on GPUs, as shown in Algorithm 2.
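To give a rough sense of scale for the second and third factors, the sketch below counts how many candidate volumes (one determinant each) a single pass over the image requires when every pixel is tried in every endmember position. It uses the five image sizes and the endmember count of 16 reported in the Experiment section; the function name `determinants_per_sweep` is hypothetical, and the count ignores any shortcuts an optimized implementation might take.

```python
def determinants_per_sweep(rows, cols, num_endmembers):
    """Candidate volumes evaluated in one pass over the image:
    every pixel is tested in every endmember position."""
    return rows * cols * num_endmembers


# Image subsets and endmember count taken from the Experiment section.
subsets = [(250, 191), (300, 300), (350, 350), (512, 614), (2206, 614)]
counts = {
    size: determinants_per_sweep(size[0], size[1], 16) for size in subsets
}

for size, count in counts.items():
    print(f"{size[0]} x {size[1]} pixels: {count:,} determinants per sweep")
```

Because each of these determinants can be computed independently of the others, the workload is a natural candidate for data-parallel execution.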
The NFINDR-GPU is graphically illustrated by a flowchart in Figure 1.
Algorithm 1: The standard N-FINDR algorithm (steps (2)-(6)).
(2) repeat
(3) Calculate the volume of the current endmember set $E^{(k)}$;
(4) For every pixel spectrum vector $x_j$ ($1 \le j \le N$, where $N$ is the number of pixels) in the dimensionality-reduced data set, test this pixel in all the endmember positions in the $E^{(k)}$ set and recalculate the volume $V(E^{(k)})$;
(5) If the new volume is larger than the previous one, then update the $E^{(k)}$ set to $E^{(k+1)}$, set $k = k + 1$, and go to step (3). If the new volume is not larger than the previous one, then no replacement is required;
(6) until the endmember set converges and no new replacement takes place.
Algorithm 2: NFINDR-GPU (fragment).
(2) Calculate the volume of the endmember set $E^{(0)}$
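The serial loop of the standard algorithm can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the initial endmember set is passed in as pixel indices, a replacement is accepted as soon as it enlarges the volume, and the sweep then continues with the next pixel rather than restarting. The names `det`, `volume`, and `nfindr`, and the toy pixels, are hypothetical. The determinant uses cofactor expansion, which is only practical for the tiny sizes used here.

```python
from math import factorial


def det(m):
    """Determinant by cofactor expansion (fine for toy sizes only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def volume(endmembers):
    """Simplex volume: |det of augmented matrix| / (p - 1)!."""
    p = len(endmembers)
    rows = [[1.0] * p] + [[e[f] for e in endmembers] for f in range(p - 1)]
    return abs(det(rows)) / factorial(p - 1)


def nfindr(pixels, initial_indices, max_sweeps=100):
    """Greedy volume maximization over dimensionality-reduced pixels."""
    current = list(initial_indices)
    best = volume([pixels[i] for i in current])
    for _ in range(max_sweeps):
        replaced = False
        for j, pixel in enumerate(pixels):
            for pos in range(len(current)):
                # Try pixel j in endmember position pos.
                trial = [pixels[i] for i in current]
                trial[pos] = pixel
                v = volume(trial)
                if v > best:
                    best, replaced = v, True
                    current[pos] = j
        if not replaced:
            break  # converged: a full sweep produced no replacement
    return current, best


# Toy 2-D data (made up): three extreme points plus interior mixtures.
pixels = [(0, 0), (4, 0), (0, 4), (1, 1), (2, 1), (1, 2), (0.5, 0.5)]
endmember_indices, max_volume = nfindr(pixels, initial_indices=[3, 4, 5])
```

On this toy set the loop ends with the three corner points as endmembers. The two inner loops are where nearly all of the determinant evaluations occur, which is why they are the natural target for GPU parallelization.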

Experiment
A well-known real hyperspectral scene labeled f970619t01p02_r02, collected by the airborne visible-infrared imaging spectrometer (AVIRIS) over the Cuprite mining district in Nevada [15], is used to evaluate the performance of NFINDR-GPU. To test the performance of the algorithm on data of different sizes, the experiments use 5 subsets of the scene: a 250 × 191-pixel subset, a 300 × 300-pixel subset, a 350 × 350-pixel subset (see Figure 2), a 512 × 614-pixel subset, and a 2206 × 614-pixel subset. The scene comprises 224 spectral bands between 0.4 and 2.5 μm, with a nominal spectral resolution of 10 nm. Prior to the analysis, bands 1-3, 105-115, 150-170, and 221-224 were removed because of water absorption and low SNR in those bands. The site is well understood mineralogically and has several exposed minerals of interest, including alunite, buddingtonite, calcite, kaolinite, and muscovite [5]. Reference ground signatures of these minerals are available in the U.S. Geological Survey (USGS) spectral library [16]. The number of endmembers was estimated to be 16 after calculating the VD of the AVIRIS Cuprite data.
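The band removal described above is easy to express directly. The sketch below lists the 1-based band numbers that remain after dropping the stated ranges; the helper name `retained_bands` is hypothetical, and the ranges are taken exactly as written in the text.

```python
def retained_bands(total_bands, removed_ranges):
    """Return the 1-based band numbers kept after removing inclusive ranges."""
    dropped = set()
    for first, last in removed_ranges:
        dropped.update(range(first, last + 1))
    return [band for band in range(1, total_bands + 1) if band not in dropped]


# Ranges as stated in the text (water absorption and low-SNR bands).
removed = [(1, 3), (105, 115), (150, 170), (221, 224)]
bands = retained_bands(224, removed)
print(len(bands))  # number of bands kept for the analysis
```

With these ranges, 39 of the 224 bands are dropped, so 185 bands enter the PCA step.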
The GPU platforms used to test our parallel algorithm are the NVIDIA Quadro 600 and the Tesla C2050. The former features 96 processor cores, 1 GB of total dedicated memory, and a memory bandwidth of 25.6 GB/s. The latter features
In order to achieve a fair benchmark in terms of execution time with respect to the GPU version, we first realize a serial implementation of the N-FINDR algorithm as a basis for the subsequent parallel implementation. The serial algorithm is executed on one of the available CPU cores, and the parallel time is measured on the considered GPU platform. For each test, 60 runs are performed and the mean values are reported.
The experimental results indicate that the endmembers extracted by both the parallel and serial versions of N-FINDR correspond very well to the published ground truth. When started from the same initial endmember set $E^{(0)}$, the serial and parallel algorithms reach the same maximum volume in the same number of iterations. The main difference between them is the time they need to complete their calculations. The average run time per iteration and the speedup of the serial and parallel implementations are summarized in Table 1 and
We can conclude that the parallel implementations are stable for different image sizes, even for the large size of 2206 × 614 pixels, and that NFINDR-GPU on the Tesla C2050 achieves a significant speedup of more than 30 times over the CPU-based serial version of the N-FINDR algorithm. The proposed method shows better performance than the methods proposed in the existing literature. As [9] reports, while the AVIRIS scanning rate is 12 Hz, more recent satellite hyperspectral sensors such as Hyperion feature 220 Hz crossline scanning rates, which means that hyperspectral sensor data like the AVIRIS Cuprite scene (a typical AVIRIS data cube with 614 × 512 pixels and 224 spectral bands) could be collected in about 5 s. The achieved processing time on the considered GPU architecture satisfies the real-time processing requirements.

Conclusions
Hyperspectral sensors capture spectral signatures that enable identification of the materials that make up a scanned target, which greatly improves target detection and recognition in sensor networks. Endmember extraction is one of the key issues for hyperspectral applications in multisensor collaborative observation.
N-FINDR is one of the most widely used and successfully applied methods for endmember extraction, but it suffers from long execution times because of its high algorithmic complexity and the massive quantity of hyperspectral sensor data. High-performance computing is in strong demand for scenarios that require real-time response, such as military detection, monitoring of chemical contamination, and wildfire tracking. An improved parallel optimization of N-FINDR using the compute unified device architecture (CUDA) on GPUs (NFINDR-GPU) is proposed for fast endmember extraction from hyperspectral sensor data. The algorithm is implemented on both the NVIDIA Quadro 600 and the Tesla C2050, achieving significant improvements compared with previous GPU-based implementations of N-FINDR. Experimental results, based on hyperspectral data collected by the AVIRIS hyperspectral imaging sensor, show the effectiveness of NFINDR-GPU. The parallel implementation is stable for different image sizes, and NFINDR-GPU on the NVIDIA Tesla C2050 achieves a significant speedup of more than 30 times over the CPU-based serial version of N-FINDR, which satisfies the real-time processing requirements. Future work will focus on comparison with other parallel methods.