Sparse Constrained Low Tensor Rank Representation Framework for Hyperspectral Unmixing

Abstract: Recently, non-negative tensor factorization (NTF) has attracted the attention of researchers as a very powerful tool. It is used in the unmixing of hyperspectral images (HSI) due to its excellent ability to describe data without any information loss. However, most existing NTF-based unmixing methods fail to fully explore the unique properties of the data, for example, the low rank that exists in both the spectral and spatial domains. To exploit this low-rank structure, in this paper we learn different low-rank representations of the HSI in the spectral, spatial and non-local similarity modes. Firstly, the HSI is divided into many patches, and these patches are clustered into multiple groups according to their similarity. Each similarity group constitutes a 4-D tensor with strong low-rank properties, comprising two spatial modes, a spectral mode and a non-local similarity mode. Secondly, a low-rank regularization based on a logarithmic function is designed and embedded in the NTF framework, which models the spatial, spectral and non-local similarity modes of these 4-D tensors. In addition, the sparsity of the abundance tensor is integrated into the unmixing framework through the $L_{2,1}$ norm to further improve the unmixing performance. Experiments on three real data sets illustrate the stability and effectiveness of our algorithm compared with five state-of-the-art methods.


Introduction
Hyperspectral remote sensing images are widely used in classification [1,2], target recognition [3], detection [4] and segmentation [5] tasks since they contain hundreds of spectral bands. Unfortunately, hyperspectral images (HSI) generally have a low spatial resolution, and a large number of mixed pixels exist in the image, which hinders the application of remote sensing technology. To solve this problem, unmixing is used to extract all the pure spectra (i.e., endmembers) in the image and the proportion (i.e., abundance) of each endmember in each pixel [6,7]. In current research, there are two main kinds of unmixing models. One is the linear mixing model (LMM), which assumes that each pixel is linearly expressed by all endmembers in the image, and the other is the nonlinear mixing model, which assumes that there are multiple complex interactions among endmembers [8,9]. Since the LMM is simple and easy to model, this paper only discusses unmixing algorithms under the linear mixing assumption.
In recent years, the non-negative matrix factorization (NMF) framework has received extensive attention among unmixing algorithms under the LMM assumption due to its clear physical interpretability and strong data-representation ability. NMF aims at finding a set of bases that can express the data as accurately as possible, together with the expression coefficients of the data under those bases [10]. However, the NMF framework, which regards unmixing as a blind source separation problem, is non-convex with respect to the two matrices to be obtained; it is therefore greatly affected by the initial values, and its solution is not stable enough [11]. Therefore, to improve the unmixing performance, many regularizations based on prior knowledge about hyperspectral data have been added to the NMF framework, such as sparse regularization [12][13][14], smoothing regularization [15,16], non-local similarity regularization [17], manifold regularization [18], and low-rank regularization [19,20]. Hong et al. [21] embed spectral variability dictionary learning into the linear NMF framework to make the endmembers more accurate. Unfortunately, NMF-based unmixing algorithms share a flaw: reshaping the 3-D HSI into a 2-D matrix destroys the integrity of the input data. Recently, inspired by the success of non-negative tensor factorization (NTF), which regards the HSI as a 3-D tensor without any information loss, NTF-based frameworks have been developed and have shown superior performance in hyperspectral unmixing [22][23][24][25][26]. Regrettably, the existing NTF-based unmixing methods mostly focus on the similarity of abundances in the spatial domain, which is not conducive to fully exploring the internal similarity structure of the data, such as non-local spatial similarity and information redundancy in the spectral direction.
To overcome the above issues, a new sparsity-constrained low-rank tensor factorization (SCLT) framework is proposed for hyperspectral unmixing in this paper, which integrates a multi-mode low-rank learning regularization and a sparsity constraint via the $L_{2,1}$ norm. Firstly, the hyperspectral data are divided into many patches of the same size. Then, all similar patches are clustered into the same group. Each similarity group can be regarded as a 4-D tensor with four modes, including two spatial modes in the vertical and horizontal directions, a spectral mode and a non-local similarity mode. Since the 4-D tensors are composed of multiple similar patches, all four modes exhibit strong similarity relationships. Therefore, a low-rank regularization is designed here to learn these strong similarities, which can express the multi-directional low-rank structure of the HSI. In addition, prior knowledge tells us that the abundances are sparse. To explore the sparse structure of the abundance tensor, the $L_{2,1}$ norm is introduced into the unmixing model to constrain the abundance tensor. Specifically, during each iteration, the sparse regularization forces the abundance tensor to remain sparse, while the low-rank regularization plays an important role in exploring the similarity relationships in the various modes of the HSI. Since the proposed SCLT is a multi-variable problem, the alternating direction method of multipliers (ADMM) is utilized to optimize the objective function. For convenience, the contributions of this paper can be summarized as follows: • A new sparsity-constrained low-rank tensor factorization model (SCLT) is proposed for unmixing, which aims at maintaining the low-rank spatial structure of the raw data and the sparsity of the abundance tensor during the iterative process.
• A regularization is designed to explore the low-rank structure of the data, which can learn the low-rank representation of the 4-D tensors containing similar patches in the spectral, spatial and non-local similarity modes. • A sparsity constraint is embedded in the proposed model to explore the sparse structure of the abundance tensor through the $L_{2,1}$ norm, which can improve the unmixing performance.
The rest of this paper is organized as follows. Section 2 introduces some works related to NTF. The problem under the NTF unmixing framework is formulated in Section 3. Section 4 describes the method proposed in this paper in detail. Sections 5 and 6 are the experimental analysis and conclusions, respectively.

Related Work
For hyperspectral mixed pixels, NMF has proven to be a very effective unmixing method. Under the LMM, the goal of NMF is to decompose the 3-D HSI, reshaped into a 2-D matrix, into two matrices: the endmember matrix and the abundance matrix. Specifically, each pixel in the hyperspectral data can be synthesized from the endmember matrix and its corresponding coefficients [27]. This model can accurately simulate a large number of mixed pixels, and the calculation is simple, convenient and easy to understand. However, with respect to both the endmember matrix and the abundance matrix, the objective function of NMF is non-convex, and its solution is not unique [28]. Therefore, researchers have sought constraints to assist the NMF framework and improve the unmixing performance. Among them, the sparsity constraint is one of the most important. Imposing an $L_0$ constraint on the abundance matrix is the ideal sparse regularization [16]. Regrettably, since the $L_0$ problem is NP-hard, many relaxed $L_p$ regularizers have been proposed. The $L_1$ regularization is used in [12] to promote the sparsity of the abundance matrix. The $L_{1/2}$ regularization is illustrated in [14,29], which gives the matrix better sparsity. Meanwhile, the $L_2$ regularization is applied in [30], which promotes the smoothness of the abundance matrix rather than its sparsity. Therefore, to explore the sparse structure within the matrix rather than just the number of non-zeros, some methods also use $L_{2,1}$ regularization to constrain the abundance matrix in an attempt to improve the unmixing performance [31,32].
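As a small illustration of how these penalties behave, the scalar shrinkage (proximal) operators induced by the $L_1$ and squared $L_2$ penalties can be compared; the function names below are illustrative and not taken from the cited works.

```python
def soft_threshold(x, lam):
    # Proximal operator of the L1 penalty: shrinks x toward zero and
    # sets it exactly to zero when |x| <= lam (promotes sparsity).
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def l2_shrink(x, lam):
    # Proximal operator of the squared L2 penalty: scales x toward
    # zero but never reaches it (promotes smoothness, not sparsity).
    return x / (1.0 + 2.0 * lam)
```

Note that `soft_threshold` produces exact zeros for small inputs, while `l2_shrink` only attenuates them; this is the essential difference between sparsity- and smoothness-promoting regularizers.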
Unfortunately, NMF-based methods need to reshape the 3-D hyperspectral data into a 2-D matrix, which seriously damages the original spatial structure among pixels. In recent years, many works have applied the NTF framework in place of NMF for unmixing, treating the hyperspectral data as a 3-D tensor and decomposing it directly instead of reshaping it into a 2-D matrix [33]. Qian et al. [22] pioneered the matrix-vector unmixing method (MV-NTF), which combines CPD and Tucker decomposition and innovatively opened up new ideas for unmixing methods. Subsequently, many improvements to the MV-NTF method have appeared. For instance, the S-MV-NTF algorithm proposed in [25] uses superpixel segmentation and low-rank tensor representation to consider both the global and local information of the data. Xiong et al. [26] apply TV regularization to the abundance tensor. A sparsity constraint and a minimum-volume constraint are added to MV-NTF in [34] to improve the algorithm's performance. Imbiriba et al. [35] propose a low-rank tensor regularization to deal with spectral variability. Non-local low-rank tensor regularizations with weight constraints have also shown their benefit for unmixing in [36,37]. Unfortunately, in these existing methods the non-local low-rank regularization is modeled by the nuclear norm or the weighted nuclear norm. These are defined through the singular values of a matrix, which brings higher computational complexity and unstable solutions. Different from the above methods, the SCLT algorithm proposed in this paper uses NTF as the basic framework and simultaneously explores the relationships among pixels in the image and the sparsity of the abundances. It not only keeps the algorithm stable, but also improves the unmixing performance. By combining the sparsity constraint and the low-rank constraint, SCLT is quite different from the above-mentioned existing unmixing methods.
Specifically, (1) SCLT learns the non-local similarity in the HSI and explores the low-rank structure of each similarity group's 4-D tensor in four modes, including two spatial modes, one spectral mode and one non-local similarity mode. (2) Embedding the $L_{2,1}$ sparse regularization into the NTF framework not only constrains the number of zeros in the abundance tensor, but also explores its sparse structure, that is, it preserves the zero regions in the abundances.

Problem Formulation
The definitions of some concepts used in this paper and the NTF unmixing framework are presented in this section.

Concepts
In this paper, we use calligraphic letters to represent tensors and boldface uppercase letters to represent matrices. Boldface lowercase letters and italic letters denote vectors and scalars, respectively. For instance, $\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ represents an HSI cube; $\mathbf{M} \in \mathbb{R}^{I \times J}$ is the endmember matrix; $\mathbf{y} \in \mathbb{R}^{I \times 1}$ is a pixel's spectral vector; $I$ denotes the spectral dimension of the HSI.
Mode: The dimensions of a tensor are called modes. For example, the tensor $\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ has three modes.
Frobenius Norm: The Frobenius norm of $\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is:

$$\|\mathcal{Y}\|_F = \sqrt{\sum_{i_1=1}^{I_1} \sum_{i_2=1}^{I_2} \sum_{i_3=1}^{I_3} y_{i_1 i_2 i_3}^2} \qquad (1)$$

Mode-n Unfolding: The tensor is expanded along mode $n$ to form a matrix. Given a tensor $\mathcal{Y} \in \mathbb{R}^{I \times J \times N}$, its mode-1, mode-2 and mode-3 unfoldings are the matrices

$$\mathbf{Y}_{\langle 1 \rangle} \in \mathbb{R}^{I \times JN}, \quad \mathbf{Y}_{\langle 2 \rangle} \in \mathbb{R}^{J \times IN}, \quad \mathbf{Y}_{\langle 3 \rangle} \in \mathbb{R}^{N \times IJ} \qquad (2)$$
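As a minimal sketch of the mode-n unfolding just defined (pure Python on nested lists; the column ordering chosen here is one common convention and may differ from the paper's):

```python
def mode_n_unfold(T, mode):
    """Unfold a 3-D tensor T (nested lists of shape I x J x N) along
    `mode` (0, 1 or 2): rows index the chosen mode, columns enumerate
    the remaining two modes in a fixed scan order."""
    I, J, N = len(T), len(T[0]), len(T[0][0])
    rows = (I, J, N)[mode]
    out = [[] for _ in range(rows)]
    for i in range(I):
        for j in range(J):
            for n in range(N):
                # The entry lands in the row given by its index
                # along the unfolding mode.
                out[(i, j, n)[mode]].append(T[i][j][n])
    return out
```

For a 2 x 2 x 2 tensor, the mode-0 unfolding is a 2 x 4 matrix, matching the shapes in (2).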

NTF Model
According to the LMM, each pixel is formed by a linear combination of all endmembers. Therefore, given hyperspectral data $\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, each pixel $\mathbf{y}_n$ in $\mathcal{Y}$ can be expressed as:

$$\mathbf{y}_n = \mathbf{M}\mathbf{a}_n + \mathbf{e}_n \qquad (3)$$

where $\mathbf{M} \in \mathbb{R}^{I_3 \times P}$ is the endmember matrix; $\mathbf{a}_n \in \mathbb{R}^{P \times 1}$ is the coefficient vector of the endmembers $\mathbf{M}$ in the pixel $\mathbf{y}_n$; $\mathbf{e}_n$ denotes the noise; and $I_1$, $I_2$, $I_3$, $P$ are the length, width, number of bands and number of endmembers of the hyperspectral data, respectively. The entire HSI is regarded as a 3-D tensor $\mathcal{Y}$, and the endmember spectra form a 2-D matrix $\mathbf{M}$. Therefore, the abundances $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times P}$ and noise $\mathcal{E}$ should also be in tensor form:

$$\mathbf{Y}_{\langle 3 \rangle} = \mathbf{M}\mathbf{A}_{\langle 3 \rangle} + \mathbf{E}_{\langle 3 \rangle} \qquad (4)$$

in which $\mathbf{A}_{\langle 3 \rangle}$ is the mode-3 unfolding of the tensor $\mathcal{A}$. Similar to the NMF framework, NTF also has two constraints: (1) the abundance non-negativity constraint (ANC), which ensures that the unmixing model has real physical meaning; and (2) the abundance sum-to-one constraint (ASC), which keeps the sum of the endmember proportions within each pixel equal to 1. Therefore, the objective function of the unmixing model under the NTF framework can be written as:

$$\min_{\mathbf{M}, \mathcal{A}} \frac{1}{2} \left\| \mathbf{Y}_{\langle 3 \rangle} - \mathbf{M}\mathbf{A}_{\langle 3 \rangle} \right\|_F^2 \quad \text{s.t.} \quad \mathcal{A} \geq 0, \;\; \mathcal{A} \times_3 \mathbf{1}_P^{T} = \mathbf{1}_{I_1 \times I_2} \qquad (5)$$

where $\mathbf{1}_P$ and $\mathbf{1}_{I_1 \times I_2}$ denote a $P$-dimensional all-ones vector and an $I_1 \times I_2$ all-ones matrix, respectively.
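The pixel-wise LMM above can be sketched numerically; the helper below is an illustrative implementation (not from the paper) that also checks the ANC and ASC on the abundance vector:

```python
def lmm_pixel(M, a, e=None):
    """Synthesize one pixel under the LMM: y = M a + e, where M is an
    I x P endmember matrix (list of rows) and a is a length-P abundance
    vector satisfying ANC (a >= 0) and ASC (sum(a) == 1)."""
    assert all(x >= 0 for x in a) and abs(sum(a) - 1.0) < 1e-9
    I, P = len(M), len(a)
    # Each band of the pixel is a convex combination of the
    # endmember spectra, weighted by the abundances.
    y = [sum(M[i][p] * a[p] for p in range(P)) for i in range(I)]
    if e is not None:
        y = [y[i] + e[i] for i in range(I)]
    return y
```

For example, mixing two endmembers with abundances 0.3 and 0.7 yields a pixel lying between the two spectra, band by band.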

Proposed Method
To fully explore the low-rank structure within the hyperspectral data and the sparse regions in the abundance tensor, rather than merely the number of non-zeros, a sparsity-constrained low-rank tensor factorization algorithm (named SCLT) is proposed. This section explains the proposed algorithm and its optimization under the ADMM framework in detail.

SCLT Model
Unlike [22,[33][34][35], in SCLT a novel kind of low-rank tensor regularization is proposed to learn the low-rank structure in the HSI. According to prior knowledge, non-local similarities exist in hyperspectral data. First, the HSI is divided into many patches with the full $I_3$ spectral bands by a sliding window of size $r \times r$. Then, the K-means++ algorithm [38] is used to gather all similar patches together into several similarity groups. Each similarity group can be expressed as a 4-D tensor with four modes, i.e., two spatial modes, one spectral mode and one non-local similarity mode. Suppose $K$ clusters are obtained after clustering; then the $k$-th 4-D tensor can be written as $\mathcal{Y}^{(k)} \in \mathbb{R}^{r \times r \times I_3 \times N_k}$, $k = 1, 2, \ldots, K$, where $N_k$ is the number of similar cubes. Since the pixels in each 4-D tensor are highly similar, they have strong internal correlation in the spatial, spectral and non-local similarity modes. Therefore, each 4-D tensor is unfolded along its four modes to explore its low-rank properties. Here a new regularization $\mathcal{Y}_{LR}$ that learns the multi-mode low-rank structure of the tensor is proposed. Specifically, for the $k$-th tensor it can be defined as:

$$\mathcal{Y}_{LR}^{(k)} = \sum_{t=1}^{4} a_t L\left(\mathbf{Y}^{(k)}_{\langle t \rangle}\right) \qquad (6)$$

in which $\mathbf{Y}^{(k)}_{\langle t \rangle}$ denotes the unfolding matrix of the tensor $\mathcal{Y}^{(k)}$ along the $t$-th mode.
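The patch-extraction and grouping step described above can be sketched as follows; note that a simple mean-intensity bucketing stands in for K-means++ here, so this is only a structural illustration, not the paper's clustering:

```python
def extract_patches(hsi, r):
    """Split an I1 x I2 x I3 HSI (nested lists) into non-overlapping
    r x r full-band patches. Border handling is ignored here; the
    paper's sliding-window details may differ."""
    I1, I2 = len(hsi), len(hsi[0])
    patches = []
    for i in range(0, I1 - r + 1, r):
        for j in range(0, I2 - r + 1, r):
            patches.append([[hsi[i + u][j + v] for v in range(r)]
                            for u in range(r)])
    return patches

def group_by_mean(patches, K):
    """Toy stand-in for K-means++: bucket patches into K groups by
    quantizing their mean intensity. Each group of N_k patches then
    stacks into a 4-D tensor of shape r x r x I3 x N_k."""
    def mean(p):
        vals = [x for row in p for pix in row for x in pix]
        return sum(vals) / len(vals)
    means = [mean(p) for p in patches]
    lo, hi = min(means), max(means)
    span = (hi - lo) or 1.0
    groups = [[] for _ in range(K)]
    for p, m in zip(patches, means):
        groups[min(K - 1, int((m - lo) / span * K))].append(p)
    return groups
```

Every patch lands in exactly one group, so the groups partition the image and no pixel is lost.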
The definition of $a_t$ is similar to that in [39], where $D_j$ denotes the length of the $j$-th dimension of the tensor $\mathcal{Y}^{(k)}$.
In function (6), $L(\cdot)$ imposes a low-rank constraint on a matrix. Generally, the nuclear norm $\|\cdot\|_*$ is used to solve the low-rank approximation problem of a matrix $\mathbf{Y}$, defined as:

$$\|\mathbf{Y}\|_* = \sum_{i} \sigma_i(\mathbf{Y}) \qquad (8)$$

where $\sigma_i(\mathbf{Y})$ represents the $i$-th singular value of the matrix $\mathbf{Y}$. It can be seen from (8) that the nuclear norm treats all singular values of the matrix as equally important. However, the large singular values carry the main information of the image and are more important, while the small singular values mainly represent noise. Therefore, large singular values should be suppressed less, and small singular values suppressed more, in low-rank approximation. Motivated by this fact, ref. [39] argues that applying the logarithm function to the singular values solves the low-rank problem of the matrix more accurately. The logarithmic low-rank norm is defined as:

$$L(\mathbf{Y}) = \sum_{i} \log\left(\sigma_i(\mathbf{Y}) + \varepsilon\right) \qquad (9)$$

where $\varepsilon$ is a very small constant used to keep the log function well defined. Hence, (9) is used as the definition of matrix low-rank approximation in this paper. From the basic unmixing objective (5), it follows that all pixels in the HSI share one endmember matrix, so all relations among pixels can be transferred to the abundance tensor. Therefore, the low-rank regularization is embedded in model (5) and imposed on the abundances. According to [40], properly relaxing the ASC can improve the unmixing performance, so only the ANC is imposed on the abundance tensor $\mathcal{A}$ here. The objective function of unmixing designed in this paper can then be written as:

$$\min_{\mathbf{M}, \mathcal{A}} \frac{1}{2} \left\| \mathbf{Y}_{\langle 3 \rangle} - \mathbf{M}\mathbf{A}_{\langle 3 \rangle} \right\|_F^2 + \lambda_{LR} \sum_{k=1}^{K} \sum_{t=1}^{4} a_t L\left(\mathbf{A}^{(k)}_{\langle t \rangle}\right) \quad \text{s.t.} \quad \mathcal{A} \geq 0 \qquad (10)$$

in which $\lambda_{LR}$ denotes the weight parameter of the multi-mode low-rank regularization and $\mathbf{A}^{(k)}_{\langle t \rangle}$ is the mode-$t$ unfolding of the $k$-th similarity group of the abundance tensor.
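The difference between the nuclear norm (8) and the logarithmic surrogate (9) can be illustrated on a given list of singular values (the SVD itself is omitted); `eps` plays the role of $\varepsilon$:

```python
import math

def nuclear_norm(svals):
    # Sum of singular values: penalizes large and small values alike.
    return sum(svals)

def log_low_rank(svals, eps=1e-3):
    # Logarithmic surrogate: grows slowly for large singular values
    # (main image structure) and steeply penalizes small ones (noise).
    return sum(math.log(s + eps) for s in svals)
```

Doubling a dominant singular value roughly adds log(2) to the log term but doubles its contribution to the nuclear norm, which is why the log surrogate suppresses the large (informative) values far less.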
In addition, according to prior knowledge the abundance tensor has connected zero regions on each band, so it contains a sparse structure. To take advantage of this joint sparsity, we use the $L_{2,1}$ norm of the abundance tensor, which preserves the sparse areas of the abundance tensor during unmixing instead of merely controlling the number of zeros. Specifically, the $L_{2,1}$ norm of the tensor can be defined on its mode-3 unfolding as:

$$\|\mathcal{A}\|_{2,1} = \sum_{p=1}^{P} \left\| \mathbf{A}_{\langle 3 \rangle}(p, :) \right\|_2 \qquad (11)$$

Finally, the final objective function of the SCLT model can be written as:

$$\min_{\mathbf{M}, \mathcal{A}} \frac{1}{2} \left\| \mathbf{Y}_{\langle 3 \rangle} - \mathbf{M}\mathbf{A}_{\langle 3 \rangle} \right\|_F^2 + \lambda_{LR} \sum_{k=1}^{K} \sum_{t=1}^{4} a_t L\left(\mathbf{A}^{(k)}_{\langle t \rangle}\right) + \lambda_S \|\mathcal{A}\|_{2,1} \quad \text{s.t.} \quad \mathcal{A} \geq 0 \qquad (12)$$

where $\lambda_S$ is the weight parameter of the sparse regularization.
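The $L_{2,1}$ norm can be sketched on a small matrix; the row-wise convention used below is an assumption about which unfolding the norm is applied to:

```python
import math

def l21_norm(X):
    """L2,1 norm of a matrix X (list of rows): sum of the Euclidean
    norms of the rows. A whole row must vanish for its term to be
    zero, which is what promotes structured (joint) sparsity."""
    return sum(math.sqrt(sum(x * x for x in row)) for row in X)
```

For instance, a matrix with one row [3, 4] and one all-zero row has $L_{2,1}$ norm 5: the zero row contributes nothing, so the penalty rewards entire zero regions rather than scattered individual zeros.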

Optimization
For (12), the endmember matrix $\mathbf{M}$ is updated first. At this point the objective function (12) reduces to:

$$\min_{\mathbf{M}} \frac{1}{2} \left\| \mathbf{Y}_{\langle 3 \rangle} - \mathbf{M}\mathbf{A}_{\langle 3 \rangle} \right\|_F^2 \qquad (13)$$

which is a convex problem in $\mathbf{M}$. Therefore, the multiplicative update rule of $\mathbf{M}$ can be obtained directly:

$$\mathbf{M} \leftarrow \mathbf{M} \mathbin{.\!*} \left( \mathbf{Y}_{\langle 3 \rangle} \mathbf{A}_{\langle 3 \rangle}^{T} \right) ./ \left( \mathbf{M} \mathbf{A}_{\langle 3 \rangle} \mathbf{A}_{\langle 3 \rangle}^{T} \right) \qquad (14)$$

where $\mathbf{Y}_{\langle 3 \rangle}$ is the mode-3 unfolding of the tensor $\mathcal{Y}$, and $.*$ and $./$ represent element-wise multiplication and division, respectively.
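This multiplicative update can be sketched in pure Python on small matrices; the `eps` guard in the division is an implementation detail added here, not from the paper:

```python
def matmul(A, B):
    # Plain matrix product of nested-list matrices.
    return [[sum(a * b for a, b in zip(row, col))
             for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def update_M(M, Y3, A3, eps=1e-12):
    """One multiplicative step M <- M .* (Y<3> A<3>^T) ./ (M A<3> A<3>^T).
    Y3 is the mode-3 unfolding (bands x pixels), A3 the P x pixels
    abundance unfolding. The update keeps M non-negative if it
    starts non-negative, mirroring the standard NMF rule."""
    num = matmul(Y3, transpose(A3))
    den = matmul(M, matmul(A3, transpose(A3)))
    return [[M[i][j] * num[i][j] / (den[i][j] + eps)
             for j in range(len(M[0]))] for i in range(len(M))]
```

At an exact fixed point (Y3 = M A3), the numerator and denominator coincide and the update leaves M unchanged, which is the expected behavior of a multiplicative rule.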
Since directly solving for the abundance tensor $\mathcal{A}$ is difficult, this paper uses the ADMM framework [39]. In detail, auxiliary variables for the abundance $\mathcal{A}$ are introduced into (12), and the augmented Lagrangian function (15) is obtained. The resulting subproblem (16) is a convex function of the tensor $\mathcal{A}$. We can therefore take its partial derivative directly and set it to zero; the update rule is then given by (17), where $\mathbf{I}$ is an identity matrix.
(b) Update $\mathcal{Q}_1$, $\mathcal{Q}_2$, $\mathcal{Q}_3$, $\mathcal{U}$, $\mathcal{V}$, $\mathcal{W}$. When updating $\mathcal{Q}_1$, the objective function (15) reduces to (18). Similar to (16), function (18) is a convex problem in the variable $\mathcal{Q}_1$ and can be solved easily, yielding the update rule (19). When updating $\mathcal{Q}_2$, the objective function (15) reduces to (20). The objective function in [32] is similar to (20), so the update rule of $\mathcal{Q}_2$ follows as (21), in which $\text{vect-soft}(\mathbf{a}, b)$ is a thresholding function defined as

$$\text{vect-soft}(\mathbf{a}, b) = \mathbf{a} \cdot \frac{\max\{\|\mathbf{a}\|_2 - b, 0\}}{\max\{\|\mathbf{a}\|_2 - b, 0\} + b}$$

and $\gamma = \mathcal{Q}_2 - \mathcal{H}_2$. When updating $\mathcal{Q}_3$, the objective function (15) reduces to (22). The purpose of $\mathcal{Q}_3$ is to ensure that the abundance tensor $\mathcal{A}$ always retains physical meaning during unmixing, i.e., to enforce the non-negativity of $\mathcal{A}$; its update rule (23) is straightforward. When updating $\mathcal{U}$, the objective function (15) reduces to (24), in which $\mathbf{U}^k_{\langle 1 \rangle}$, $\mathbf{A}^k_{\langle 1 \rangle}$ and $\mathbf{H}^k_{4 \langle 1 \rangle}$ are the mode-1 unfoldings of the tensors $\mathcal{U}^k$, $\mathcal{A}^k$ and $\mathcal{H}_4^k$, respectively. Here $\mathcal{U}^k$, $\mathcal{A}^k$ and $\mathcal{H}_4^k$ are the $k$-th similarity groups of the tensors $\mathcal{U}$, $\mathcal{A}$ and $\mathcal{H}_4$, which are all 4-D tensors. According to [39], the solution of problem (24) can be written as (25), where $\mathbf{U}^k_1$, $\mathbf{\Sigma}^k_1$ and $\mathbf{V}^k_1$ are obtained by SVD, and the diagonal matrix in (25) is computed from $\mathbf{\Sigma}^k_1$ entrywise via $c_1 = |x| - \varepsilon$ and $c_2 = c_1^2 - 4(\alpha - \varepsilon|x|)$. After obtaining $\mathbf{U}^k_{\langle 1 \rangle}$ through (25), this mode-1 unfolding matrix is folded back into the tensor $\mathcal{U}^k$. Since $\mathcal{U}^k$ is the $k$-th clustering group of $\mathcal{U}$, the tensor $\mathcal{U}$ can be assembled through the corresponding relationships.
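The vect-soft thresholding function used in the $\mathcal{Q}_2$ update can be sketched directly from its definition:

```python
import math

def vect_soft(a, b):
    """Vector soft-threshold: scales the vector a toward zero and
    returns the zero vector when ||a||_2 <= b. This is the proximal
    operator of b * ||.||_2, applied group-wise for the L2,1 term."""
    norm = math.sqrt(sum(x * x for x in a))
    m = max(norm - b, 0.0)
    scale = m / (m + b) if (m + b) > 0 else 0.0
    return [x * scale for x in a]
```

A vector whose norm exceeds the threshold is uniformly shrunk; one whose norm falls below it is zeroed entirely, which is exactly the group-sparsity behavior the $L_{2,1}$ term requires.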
It can be seen from (15) that the update rules of $\mathcal{U}$, $\mathcal{V}$ and $\mathcal{W}$ are similar; the updates of $\mathcal{V}$ and $\mathcal{W}$ therefore follow the same form as (25). The SCLT algorithm proposed for hyperspectral unmixing is summarized in Algorithm 1. It is worth noting that, similar to [41], a stopping threshold is set here: once the error has fallen within the threshold more than 10 times, the iteration is stopped.
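The stopping rule just mentioned can be sketched as a simple counter; the exact counting convention (cumulative rather than consecutive) is an assumption:

```python
def should_stop(errors, threshold, patience=10):
    """Return True once at least `patience` of the recorded iteration
    errors fall below the threshold, mirroring the 'stop after 10
    errors within the threshold' rule described above."""
    return sum(1 for e in errors if e < threshold) >= patience
```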

Input:
$\mathcal{Y} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, the observed hyperspectral image; $P$, the number of endmembers; $r$, the window size; $K$, the number of similarity groups; parameters $\lambda_S$ and $\lambda_{LR}$.
Output:
$\mathcal{A}$, the abundance tensor; $\mathbf{M}$, the endmember matrix.

Experiments
In this section, a series of experiments is presented to test the performance of the proposed SCLT algorithm on three real hyperspectral data sets. To demonstrate the effectiveness and superiority of SCLT, five existing methods were used as comparison algorithms, namely SULRSR-TV [42], SGSNMF [43], MV-NTF [22], NL-TSUn [33] and ULTRA-V [35]. Among them, SULRSR-TV and SGSNMF are unmixing algorithms based on the NMF framework, while MV-NTF, NL-TSUn and ULTRA-V are based on the NTF framework. It is worth noting that, to ensure the fairness of all comparative experiments, the VCA-FCLS algorithm [44] was used on all data to determine the initial values of the endmembers and abundances. On this basis, each algorithm was run 20 times, and the average values were taken as the final results.
Next, three subsections are presented in detail. Section 5.1 introduces the evaluation indices used for all algorithms in this paper. The results on the three real hyperspectral data sets are described in Section 5.2. Section 5.3 analyzes the impact of each parameter of the SCLT algorithm on the unmixing performance and the effectiveness of each module in SCLT.

Evaluation Indexes
This paper uses two common unmixing evaluation indices [8]: the spectral angle distance (SAD) and the root-mean-square error (RMSE). Given the ground-truth endmember matrix $\hat{\mathbf{M}}$ and abundance tensor $\hat{\mathcal{A}}$, SAD and RMSE are calculated as:

$$\text{SAD} = \frac{1}{P} \sum_{p=1}^{P} \arccos\left( \frac{\mathbf{m}_p^{T} \hat{\mathbf{m}}_p}{\|\mathbf{m}_p\|_2 \, \|\hat{\mathbf{m}}_p\|_2} \right), \qquad \text{RMSE} = \sqrt{ \frac{1}{I_1 I_2 P} \left\| \mathcal{A} - \hat{\mathcal{A}} \right\|_F^2 }$$

where $\mathbf{M}$ (with columns $\mathbf{m}_p$) and $\mathcal{A}$ are the endmember matrix and abundance tensor extracted by the proposed SCLT algorithm.
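The two indices can be sketched as follows; the per-endmember averaging in SAD and the normalization constant in RMSE are common conventions and assumed here:

```python
import math

def sad(m, m_hat):
    """Spectral angle distance (radians) between an estimated
    endmember m and the reference m_hat: arccos of their normalized
    inner product. 0 means identical spectral shape."""
    dot = sum(a * b for a, b in zip(m, m_hat))
    na = math.sqrt(sum(a * a for a in m))
    nb = math.sqrt(sum(b * b for b in m_hat))
    # Clamp guards against tiny floating-point overshoot of [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (na * nb))))

def rmse(A, A_hat):
    """Root-mean-square error between abundance tensors, given here
    as flat lists of equal length."""
    n = len(A)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(A, A_hat)) / n)
```

Note that SAD is invariant to scaling of either spectrum, so it measures spectral shape rather than magnitude.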

Experiments on the Real Data
This subsection introduces the unmixing performance of the proposed algorithm SCLT and the five comparison methods on three real hyperspectral data sets.
Jasper Ridge Dataset. Figure 1a is a pseudo-color image of the Jasper Ridge data. These data were captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) sensor. The original data have 224 bands, but bands 1-3, 108-112, 154-166 and 220-224 are severely contaminated by water vapor and noise and are removed here. The final image size is 100 × 100 × 198. According to [45], the Jasper Ridge data contain four endmembers, namely Soil, Water, Road and Tree. Table 1 shows the SAD results and their deviations for the five comparison methods and the proposed SCLT algorithm on the Jasper Ridge data. It can be concluded that the unmixing performance of all algorithms on the Jasper Ridge data is generally not high. The proposed SCLT algorithm has a significant advantage in extracting the endmember spectrum of Soil. Although its SAD on the other three endmembers is not the best, it is stable when extracting each spectrum, and its deviation is almost zero. Figures 2 and 3 are visual comparisons of the unmixing results of the six algorithms. In Figure 2, the abundance maps estimated by the proposed SCLT are the closest to the real reference abundance maps, which shows the superiority of the proposed SCLT in abundance estimation. In Figure 3, all algorithms generally perform poorly on the fourth spectral curve, which corresponds to the Road endmember and places higher demands on the unmixing algorithm. It can be seen from this experiment that the proposed SCLT algorithm has a modest advantage over the other algorithms on the Jasper Ridge data. Samson Dataset. According to existing research results [46], the Samson data mainly contain three types of ground objects, namely Soil, Tree and Water. These data have 156 bands, with spectral coverage of 0.401-0.889 µm. Since the original image is very large, a sub-image of size 95 × 95 is used in the real experiments to save space.
Specifically, the pseudo-color image of the Samson data is shown in Figure 1b. Table 2 presents the quantitative comparison of the five comparison methods and the proposed SCLT on the Samson data. It can be seen that SULRSR-TV, SGSNMF and SCLT all show good unmixing results on individual spectra, but the five comparison algorithms perform poorly on Soil. SCLT not only achieves the best average result over all spectra, but also shows extremely high unmixing stability on each endmember, with the smallest deviation. The endmember spectral curves and estimated abundance maps extracted by the six unmixing methods in this experiment are presented in Figures 4 and 5, respectively. These visualizations also confirm that the proposed SCLT method has the best unmixing performance. Indian Pines Dataset. The Indian Pines dataset is widely used for unmixing. It contains 224 bands, some of which are affected by water vapor and have a low signal-to-noise ratio (SNR). Therefore, the selected Indian Pines data have 169 bands and a spatial size of 145 × 145. The pseudo-color image is shown in Figure 1c. Yuan et al. [41] point out that the Indian Pines dataset mainly includes six kinds of materials, i.e., Wheat, Man-made land, Soybean, Corn, Vegetation and Haystack. The SAD results of the six algorithms are shown in Table 3. It can be concluded that the SCLT algorithm has extremely good stability, which is consistent with the unmixing results on the Jasper Ridge and Samson data. Although the proposed SCLT algorithm does not achieve the lowest error on every endmember spectrum, its average value is the smallest. Figure 6 shows the endmember spectral curves extracted by the six algorithms.
Since the Indian Pines dataset does not have ground-truth abundances, here we only show the comparison of the abundance maps in Figure 7. The results are similar to [8], which supports the superiority of the proposed SCLT algorithm.

Parameter Analysis
The parameters of the SCLT algorithm are discussed in this subsection, mainly including the weight parameters of the low-rank regularization, $\lambda_{LR}$, and the sparsity constraint, $\lambda_S$, the patch size $r$, and the number of similarity groups $K$. Due to space limitations, the Jasper Ridge dataset is taken as an example to illustrate how the parameters are evaluated. Following [8], the control-variable method is used to determine the parameters. Firstly, $\lambda_{LR}$, $\lambda_S$ and $r$ are set to 0.01, 0.01 and 5, respectively, and $K$ is varied over {1, 5, 10, 15, 20, 25, 30, 35, 40, 45}. It can be seen from Figure 8a that the unmixing achieves the best performance when $K$ is 20. Subsequently, $K$ is fixed to 20 and $r$ is varied over {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}. Figure 8b shows the SAD results under these parameter settings. It can be concluded that the SCLT algorithm achieves the best performance on the Jasper Ridge dataset when $r = 3$.
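The control-variable search described above can be sketched as a loop over one parameter at a time; `toy_evaluate` is a hypothetical stand-in for running SCLT and measuring the SAD:

```python
def control_variable_search(grid, evaluate, fixed):
    """Tune parameters one at a time: sweep each parameter over its
    candidate values while the others stay at their current settings,
    keeping the value with the lowest score."""
    best = dict(fixed)
    for name, values in grid.items():
        scores = []
        for v in values:
            params = dict(best)
            params[name] = v
            scores.append((evaluate(params), v))
        best[name] = min(scores)[1]
    return best

# Hypothetical score function standing in for a full SCLT run:
# pretend the best settings are K = 20 and r = 3.
def toy_evaluate(p):
    return abs(p["K"] - 20) + abs(p["r"] - 3)
```

This one-at-a-time sweep is cheaper than a full grid search, at the cost of possibly missing interactions between parameters.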

Conclusions
In this paper, we propose a novel non-negative tensor factorization framework for hyperspectral unmixing that combines a low tensor-rank constraint and an $L_{2,1}$ sparsity constraint. A regularization that describes the low-rank properties of the HSI is designed to learn the low-rank structure of the similarity groups along the spatial, spectral and non-local similarity modes of the image. In addition, the $L_{2,1}$ norm is applied to explore the sparseness of the abundances. The proposed unmixing framework, SCLT, can thus learn the low-rank structure of the data in its different modes while keeping the abundances sparse. SCLT is optimized by the ADMM method. The experimental results on three real data sets demonstrate the superiority and effectiveness of the proposed algorithm.