Hyperspectral Image Classification With CapsNet and Markov Random Fields

Hyperspectral image (HSI) classification is one of the most challenging problems in understanding HSI. The convolutional neural network (CNN), with its strong ability to extract features through the hidden layers of the network, has been introduced to solve this problem. However, several fully connected layers are usually appended at the end of a CNN, which dramatically reduces parameter efficiency and makes the classification algorithm hard to converge. Recently, a new network architecture called the capsule network (CapsNet) was presented to improve the CNN. It uses groups of neurons as capsules to replace the neurons in traditional neural networks. Since capsules can better preserve the extracted spectral features and spatial information, CapsNet outperforms state-of-the-art CNNs in some fields. Motivated by this idea, a new remote sensing hyperspectral image classification algorithm called Conv-Caps is proposed to make full use of the advantages of both. We integrate spectral and spatial information into the proposed framework and combine Conv-Caps with a Markov Random Field (MRF), which uses the α-expansion graph cut method to solve the classification task; the resulting method is called Caps-MRF. First, an initial feature extractor is selected, which is a CNN without fully connected layers. Then, the initial feature maps are fed into the newly designed CapsNet to obtain a probability map. Finally, the MRF model is used to compute the final labels. The presented method is trained on three real HSI datasets and compared with the latest methods. We find the framework can produce competitive classification performance.


I. INTRODUCTION
With technology advancing, various types of high-resolution Earth surface images have become readily available [1]. Hyperspectral images contain abundant spectral and spatial information [2]. They often consist of hundreds of spectral bands, and are used in mineral exploration, soil testing, and environmental monitoring. Hyperspectral image classification aims at automatically assigning a specific semantic label to each pixel according to its spatial-spectral information [3]. In recent years, several classification methods have been developed for HSI classification tasks. Conventional classification methods are mostly based on low-order or handcrafted features, such as the Support Vector Machine (SVM) [4] and Sparse Representation (SR) [5]. However, the challenge these methods face is that the labeled training samples available are minimal, leading to the Hughes phenomenon and the neglect of essential features.
Recently, Deep Learning (DL) [6], [7] has become a very active topic in computer vision, including image classification [8], data dimensionality reduction [9], and semantic segmentation [10]. Deep learning models can acquire more powerful, abstract, and discriminative features through deep neural network structures. In particular, the CNN [11], with its strong ability to extract features through the hidden layers of the network, is well suited for hyperspectral image classification and has achieved state-of-the-art results [12]. Unfortunately, the fully connected layers in the CNN model compress the two-dimensional feature map into a one-dimensional feature vector, ignoring information related to the spatial domain, which makes the classification algorithm difficult to converge and seriously reduces the classification accuracy [13].
To solve the problems above, we propose Conv-Caps, which combines the CNN and the CapsNet with spectral-spatial features to process hyperspectral image classification tasks. With the CNN's powerful feature-learning capabilities and the equivariance property of the CapsNet, Conv-Caps achieves advanced performance. Moreover, CapsNet has been successfully applied in many fields, such as tumor classification [14], sound event detection [15], and remote sensing image classification [16]. The main idea of the CapsNet is that vector capsules are utilized to represent internal attributes: each neuron in a traditional neural network is replaced by a group of neurons forming a capsule, which effectively solves the problem of spatial hierarchies between features. Through the dynamic routing process and vectorized outputs, CapsNet is also useful for identifying small-sample data, overcoming a disadvantage of the CNN. Sabour et al. [17] state that CapsNet exhibits superior performance when dealing with spatial-level issues. However, Luo et al. [18] observe that applying CapsNet to HSI data directly is not ideal. Deng et al. [19] compared the performance of CapsNet and CNN models under similar network structures and found that the capsule network is better than the CNN model in many aspects.
The problem encountered by these methods is that they rarely consider spectral and spatial features jointly. None of the above models formulates HSI classification within a Bayesian framework that simultaneously considers deep learning models and MRF. To further improve the classification performance, we introduce a Markov Random Field into the model as a post-processing step for the Conv-Caps classification map. In image segmentation tasks, MRF encourages adjacent pixels to share the same class label, and it can adequately characterize spatial information by modeling adjacent labels in hyperspectral images. Research shows that MRF can significantly improve the classification accuracy in HSI classification tasks. Based on this, we propose the Caps-MRF model. Specifically, the input HSI is first divided into patches, and a patch is used as the feature input for each pixel of the hyperspectral image. Then, a CNN without fully connected layers is used as the initial feature extractor, and a deep network classifier is constructed using CapsNet. Finally, given the classifier's results, the MRF performs smoothing post-processing in the spatial domain to obtain the final classification output. Experimental results show that our method is well suited for HSI classification. In summary, the contributions of this paper are as follows:
1) To further improve the classification precision, we propose a novel architecture called Conv-Caps. The proposed method improves CapsNet and enhances the internal connections of the capsules, alleviating overfitting in classification, and can effectively handle the classification of hyperspectral scenes.
2) Based on Conv-Caps, we introduce a new classification method called Caps-MRF. The proposed method takes full advantage of spectral-spatial properties, combines the CapsNet and MRF, and interprets the hyperspectral classification problem from the perspective of the Bayesian framework. The result is superior to the current state-of-the-art methods on three challenging data sets.
3) We study and analyze the impact of different factors in the proposed architecture on the classification results, such as the patch size, capsule dimension, and network depth. Further analysis is carried out on the robustness and generalization of the proposed method by adding noise and changing the training scale. These results can provide valuable guidance for future applications of capsule networks in hyperspectral image classification.
The rest of this article is organized as follows. In Section II, we describe the background formulations of the capsule network and MRF. In Section III, the design procedure is discussed and the proposed structure is described. In Section IV, the experiments on three data sets are illustrated. Finally, Section V summarizes our work.

II. BACKGROUND FORMULATION

A. CAPSNETS
CapsNet is a recently proposed neural network that may have a profound impact on deep learning, especially in computer vision. In a traditional CNN, a neuron's input and output are scalars. In contrast, CapsNet's neurons process vectors. A capsule is therefore also called a Vector Neuron (VN), and all critical information about the state of the detected features is encapsulated in the form of a vector.
In this article, CapsNet splits the hyperspectral image into vector capsules I_i through a convolutional step. Each I_i is multiplied by a corresponding weight matrix W_i. Finally, a digit capsule v_j is created by a dynamic routing algorithm. The vector capsule v_j represents a category of the hyperspectral image, and the modulus of the vector represents the probability of that category. Compared with CNNs, capsule networks have some disadvantages, such as being unsuitable for large databases and slower execution. To overcome these shortcomings, Paoletti et al. [13] propose a new CNN architecture based on spectral-spatial capsule networks while significantly reducing the network design complexity. Wang et al. [20] extend classification frameworks for HSI based on CapsNet by introducing an affine transformation matrix. Yin et al. [21] tune a new CapsNet architecture with three convolutional layers and achieve superior performance in HSI classification over CNN-based methods. Wang et al. [22] address the problem that high resolution may increase intraclass difference and interclass similarity with Caps-TripleGAN.

B. MARKOV RANDOM FIELDS
Due to the spatial structure of hyperspectral images, MRF is widely used in hyperspectral applications to model spatial relations [23]. The idea of the MRF clustering algorithm is that a random walk tends to remain within a dense cluster: before leaving the cluster, a random traversal visits many of its nodes. Each value in the probability matrix is then raised to a power, which strengthens strongly connected points and weakens loosely connected ones.
Let the HSI dataset be X = {x_1, x_2, ..., x_n} ∈ R^{h×w×d}, where h is the height, w is the width, and d is the number of spectral bands. n = hw represents the total number of extracted patches. The set of class labels is defined as L = {1, 2, ..., K}, where K is the number of classes for the given HSI dataset. Consider Y = {y_1, y_2, ..., y_n} as the image of class labels and Ỹ = {ỹ_1, ỹ_2, ..., ỹ_n} as the initial probabilistic label prediction for each data point. In the Bayesian framework, the estimate of the labels Y given the observations Ỹ can be inferred by maximizing the posterior distribution P(Y|Ỹ):

Ŷ = arg max_Y P(Y|Ỹ) = arg max_Y P(Ỹ|Y) P(Y),   (1)

where P(Ỹ|Y) and P(Y) represent the likelihood function and the prior probability of the category labeling, respectively. The MAP segmentation Ŷ is computed via the α-expansion algorithm. The details of the calculation will be given in the next section [24].

III. PROPOSED METHOD
In this section, we will introduce the design principles of network architecture and CapsNet. Then we describe how to compute the segmentation results with an efficient min-cut optimization technique.

A. IMPROVED CAPSULE-BASED CLASSIFIER
Inspired by CNN, we add a CNN architecture in front of the capsule network, replacing the original fully connected layers with the proposed Conv-Caps. For the problem that a CNN cannot consider the spatial hierarchy between elements, CapsNet's solution is to encode spatial information and compute the existence probability of the object. In particular, a capsule is a vector containing multiple neurons, each representing various properties of a specific entity that appears in the image. These attributes include many different types of instantiation parameters and help to extract the features of the image, thereby compensating for the large amount of information that a CNN loses at its output and effectively preserving the spatial-spectral details of the elements in the HSI data cube. The proposed architecture is shown in Fig. 1; it consists of two convolutional layers, two pooling layers, and a capsule network. In addition, the dynamic routing process of the capsule network has been improved.
We use the HSI patch as the input of CapsNet. The first convolutional layer filters the sample block with 100 kernels of size 3 × 3 and a stride of 1, generating 100 feature maps. The combined set of features of size 9 × 9 is then fed into a max-pooling layer with 2 × 2 kernels; the size of the feature map after pooling is 3 × 3. The maps then pass through a convolutional layer with 300 kernels of size 3 × 3, which filters them into a new feature map of size 3 × 3, and enter the next max-pooling layer with a 2 × 2 kernel. At this point, a data block of size 9 is generated. The next layer is the primary capsule layer, with a kernel size of 3 × 3 and a stride of 1; the number of capsules in the output layer equals the number of classes, and the capsule dimension is 16. Dynamic routing then determines to which higher-level capsule each capsule's output is routed.
The output capsule is calculated by a nonlinear squashing function:

v_j = (||s_j||^2 / (1 + ||s_j||^2)) (s_j / ||s_j||),   (2)

where v_j represents the output of capsule j after the squashing activation function and s_j denotes the total input of capsule j. In this article, we replace Squash(s_j) with s_j / ||s_j|| during the routing iterations and apply the full squashing only to the final output; that is, formula (2) is simplified to

v_j = s_j / ||s_j||   (3)

during the iterations.
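As a concrete illustration, the squashing nonlinearity of Eq. (2) and the simplified per-iteration scaling of Eq. (3) can be sketched in NumPy (a minimal sketch; the function names are ours, not the paper's):

```python
import numpy as np

def squash(s, eps=1e-9):
    """Full squashing of Eq. (2): short vectors shrink toward 0,
    long vectors approach unit length, direction is preserved."""
    norm = np.linalg.norm(s)
    return (norm**2 / (1.0 + norm**2)) * s / (norm + eps)

def unit_scale(s, eps=1e-9):
    """Simplified scaling s / ||s|| of Eq. (3), used during the
    routing iterations in the proposed variant."""
    return s / (np.linalg.norm(s) + eps)
```

Either function maps a capsule's total input to a vector whose norm can be read as an existence probability, which is why the final labels come from the capsule norms.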
To be specific, the total input s_j of capsule j is obtained as the weighted sum of the outputs of all the capsules in the previous layer:

s_j = Σ_i c_ij U_ij,  with U_ij = W_ij I_i,   (4)

where I_i is the i-th input vector; to align I_i with the upper-layer capsule s_j, we use the transformation matrix W_ij (the weight of the i-th vector). c_ij is the routing weight; it represents the contribution of the previous capsules to the upper-layer capsules and is determined by a routing softmax:

c_ij = exp(b_ij) / Σ_k exp(b_ik),   (5)

where b_ij is the log prior probability between the i-th capsule and the j-th capsule. b_ij reflects the agreement between U_ij and v_j: the closer the values of U_ij and v_j are, the higher the correlation between the two capsules. So that capsule i is initially coupled to every capsule j with equal probability, b_ij is initialized to zero. During the iterations, b_ij is repeatedly updated as b_ij ← b_ij + U_ij · v_j to make it more accurate. Finally, the output layer replaces each capsule with its modulus and applies the argument-max (argmax) function to obtain the pixel's label Y. In the HSI classification task, the probability that a local spectral feature exists is represented by the norm of the vector. The separate margin loss L_C for each category capsule C is given as:

L_C = T_C max(0, m+ − ||v_C||)^2 + λ (1 − T_C) max(0, ||v_C|| − m−)^2,   (6)

where T_C = 1 if and only if class C is present. The total loss function of CapsNet is a weighted sum of the margin losses. For an output capsule with T_C = 1, the margin loss is 0 when the norm of the output vector is higher than m+; otherwise it is nonzero. Correspondingly, for an output capsule with T_C = 0, the margin loss is 0 when the norm of the output vector is smaller than m−; otherwise it is nonzero.
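The routing procedure of Eqs. (4)-(5) with the agreement update can be sketched as follows, assuming the prediction vectors U_ij are precomputed. This is an illustrative NumPy version under our own naming, not the paper's TensorFlow implementation; it follows the proposed variant in which intermediate outputs use plain unit scaling and the full squash is applied only at the end:

```python
import numpy as np

def squash(s, eps=1e-9):
    """Full squashing of Eq. (2), applied row-wise."""
    norm = np.linalg.norm(s, axis=-1, keepdims=True)
    return (norm**2 / (1.0 + norm**2)) * s / (norm + eps)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, n_iters=3):
    """Routing by agreement. u_hat[i, j] = U_ij = W_ij @ I_i,
    shape (n_in, n_out, d). Returns output capsules v of shape (n_out, d)."""
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                      # log priors, start at zero
    for _ in range(n_iters):
        c = softmax(b, axis=1)                       # coupling coefficients, Eq. (5)
        s = (c[:, :, None] * u_hat).sum(axis=0)      # total input s_j, Eq. (4)
        v = s / (np.linalg.norm(s, axis=-1, keepdims=True) + 1e-9)  # Eq. (3)
        b = b + (u_hat * v[None, :, :]).sum(axis=-1) # agreement update
    return squash(s)                                 # final output via Eq. (2)
```

Because b_ij starts at zero, every lower capsule initially distributes its output evenly; the dot-product update then concentrates routing onto the upper capsules that agree with its predictions.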

B. PRIORITY OF SMOOTHNESS
After obtaining the probabilistic label prediction Ỹ from Conv-Caps, we solve an optimization problem; on this basis, we propose Caps-MRF. We use a smoothness prior to promote piecewise-smooth segmentations. Assuming that the classes are conditionally independent across pixels, the likelihood factorizes as P(Ỹ|Y) = Π_i P(ỹ_i|y_i), and the optimization problem is simplified accordingly. A prior belonging to the MRF class smooths the segmentation result by encouraging neighboring pixels to take the same category. According to the Hammersley-Clifford theorem, the density associated with an MRF is a Gibbs distribution. Therefore, the segmentation prior has the following structure:

P(Y) = (1/Z) exp( μ Σ_i Σ_{j∈N(i)} δ(y_i − y_j) ),   (7)

where Z is the normalization constant of the density, the tunable parameter μ controls the degree of smoothness, N(i) is the set of pixels adjacent to pixel i, and δ(y) is a unit pulse function defined as δ(0) = 1 and δ(y) = −1 for y ≠ 0. The pairwise terms δ(y_i − y_j) thus assign higher probability to configurations in which adjacent labels are equal.
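Under Eq. (7), a candidate labeling can be scored by its energy (the negative log-posterior up to constants): a unary term from the classifier's log-probabilities plus the μ-weighted pairwise term. A minimal NumPy sketch, assuming a 4-connected neighborhood; the function and argument names are ours:

```python
import numpy as np

def mrf_energy(labels, log_probs, mu=5.0):
    """Energy of a label map under the smoothness prior of Eq. (7).
    labels: (h, w) integer label map; log_probs: (h, w, K) per-pixel
    class log-probabilities from Conv-Caps; mu: smoothness weight."""
    h, w = labels.shape
    # unary term: negative log-likelihood of the assigned label at each pixel
    unary = -log_probs[np.arange(h)[:, None], np.arange(w)[None, :], labels].sum()
    # pairwise term: delta = +1 for equal neighbors, -1 otherwise,
    # over right and down neighbors so each pair is counted once
    delta = np.where(labels[:, :-1] == labels[:, 1:], 1.0, -1.0).sum()
    delta += np.where(labels[:-1, :] == labels[1:, :], 1.0, -1.0).sum()
    return unary - mu * delta
```

Lower energy means a more probable segmentation; a fully agreeing neighborhood maximally decreases the energy through the −μδ contribution.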

C. CAPS-MRF FRAMEWORK FOR HSI CLASSIFICATION
To obtain a good approximation, we use a series of minimum-cut computations, mapping the problem onto suitable graphs.
The final classification model is:

Ŷ = arg max_Y [ Σ_i log P(ỹ_i|y_i) + μ Σ_i Σ_{j∈N(i)} δ(y_i − y_j) ],   (8)

which can be regarded as an MRF model. The final step of our proposed method thus incorporates spatial context information. The labels Y form a Markov Random Field, and the objective function is its energy function: the first term is the cost of assigning pixels to the different categories, where the probability that a pixel receives a given label depends on how likely the pixel is to belong to that category. The second term encourages the labels of neighboring pixels to be equal. The problem can be approximately solved by an algorithm based on α-expansion [24] graph cuts. As a result, a good approximation of Ŷ is produced, which is reasonable and effective both from a computational point of view and considering the accuracy of the model. The procedure of the proposed method is summarized in Algorithm 1.
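For illustration only, the effect of model (8) can be approximated with a simple iterated conditional modes (ICM) pass, which greedily relabels each pixel to lower the same energy; the paper itself uses α-expansion graph cuts, not ICM. A hedged NumPy sketch with hypothetical names:

```python
import numpy as np

def icm_smooth(log_probs, mu=5.0, n_iters=5):
    """Greedy ICM minimization of the Caps-MRF energy, a lightweight
    stand-in for the alpha-expansion graph cut used in the paper.
    log_probs: (h, w, K) per-pixel class log-probabilities from Conv-Caps."""
    h, w, K = log_probs.shape
    labels = log_probs.argmax(axis=-1)          # Conv-Caps initialization
    padded = np.full((h + 2, w + 2), -1)        # -1 marks out-of-image cells
    for _ in range(n_iters):
        padded[1:-1, 1:-1] = labels             # snapshot of current labels
        for i in range(h):
            for j in range(w):
                # 4-connected neighbors (up, down, left, right)
                nbrs = [padded[i, j + 1], padded[i + 2, j + 1],
                        padded[i + 1, j], padded[i + 1, j + 2]]
                # cost(k) = -log p(k) - mu * sum of delta(y_i - y_j)
                cost = np.array([
                    -log_probs[i, j, k] - mu * sum(1 if n == k else -1
                                                   for n in nbrs if n >= 0)
                    for k in range(K)])
                labels[i, j] = cost.argmin()
    return labels
```

ICM only finds a local minimum, whereas α-expansion gives much stronger optimality guarantees; the sketch is meant to show how the unary and pairwise terms interact, not to reproduce the paper's results.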

IV. EXPERIMENTAL RESULTS

A. DATA DESCRIPTION AND EXPERIMENTAL SETTINGS
Three public real hyperspectral datasets (see Figures 2, 3, and 4) with different spatial resolutions are used in our experiments.

VOLUME 8, 2020

Algorithm 1 Caps-MRF Hyperspectral Image Segmentation
Input: HSI patches X, smoothness parameter μ, learning rate α.
Output: output labels Ŷ.

The number of samples for each class is given in Table 2. After the 20 water absorption bands were removed, the available data set is composed of 204 bands. Finally, there is the Pavia Centre (PC) scene, provided by Prof. Paolo Gamba from the Telecommunications and Remote Sensing Laboratory, Pavia University (Italy). Pavia Centre is a 1096 × 715 pixel image. The number of spectral bands is 102, and the geometric resolution is 1.3 meters. There are nine classes in the ground truth map, and the number of samples for each class is given in Table 3. Different dataset choices can reflect the robustness of the method and are important factors in determining the final outputs and accuracy.

The sampling strategy is one of the important factors that affect the classification results of models, especially when only a limited number of known samples is available. In this article, the available labeled samples are divided into three subsets, namely training, validation, and test samples (see Figures 5, 6, and 7). Each category is displayed in the same color as in the ground-truth map. We randomly select 10% of the labeled samples of each data set as the training set, and use the performance on a further 10% of validation samples, also randomly selected, to design our network architecture. The labeled samples remaining after training and validation are used as the test set to evaluate the network's capabilities and obtain the final classification results.

All methods are numerically compared using the following three classification performance metrics: overall accuracy (OA), average accuracy (AA), and the kappa coefficient (κ). Specifically, OA is the ratio of correctly classified pixels to the total number of labeled pixels; AA is the average of the per-class accuracies; and κ measures the proportional reduction in error relative to a completely random classification.
The larger the values of these three metrics, the better the classification performance.
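The three metrics can be computed from a confusion matrix as follows (a minimal NumPy sketch; the function name is ours):

```python
import numpy as np

def classification_metrics(y_true, y_pred, n_classes):
    """Overall accuracy, average accuracy, and Cohen's kappa."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    n = cm.sum()
    oa = np.trace(cm) / n                     # correctly classified / total
    per_class = np.diag(cm) / cm.sum(axis=1)  # recall of each class
    aa = per_class.mean()                     # mean of per-class accuracies
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2  # chance agreement
    kappa = (oa - pe) / (1 - pe)              # reduction in error vs. chance
    return oa, aa, kappa
```

Note that AA differs from OA whenever the classes are imbalanced, which is why both are reported alongside κ.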
The hardware is an NVIDIA RTX 2080Ti graphics processing unit (GPU) with 11 GiB of memory and dual Intel Xeon E5-2678 v3 CPUs with 64 GiB of memory. The software environment is composed of Ubuntu 18.04 x64 as the operating system, NumPy 1.16.4, pandas 0.24.2, the TensorFlow 1.13.1 framework, and Python 3.5 as the programming language, while other experiments are run in Matlab R2014b.

In this set of experiments, we considered seven representative HSI classification methods to verify the classification performance of the proposed method: the Support Vector Machine (SVM) [25], semi-supervised 1D-CNN [26], semi-supervised 2D-CNN [27], 3D-CNN [28], P-RN [29], DenseNet [30], and a capsule network-based method, SS-Caps [13]. For the comparison methods, the parameters are set according to the recommendations in the corresponding papers. For the proposed method, the parameter settings are given in Section IV-B. In Tables 1-3, per-class results and global metrics are arranged in rows, and the different classifiers are displayed in columns. All experiments were run ten times with different random training samples, and the classification accuracy is reported as mean ± standard deviation.

B. PARAMETER ANALYSIS
In this experiment, some parameters need to be set in advance. For the Conv-Caps method, based on comparative tests, we set the network depth to 6. The first convolutional layer has 100 kernels of size 3. The second convolutional layer has a kernel size of 3. The patch size is set to 9, and all strides are set to 1. The parameters m+, m−, and λ in the loss function are set to 0.9, 0.1, and 0.5, respectively. For the proposed Caps-MRF, the smoothness parameter μ is set to 5.0. The number of training epochs is set to 100 with a learning rate of 0.001. Beyond these, several other important parameters may also affect the classification performance. In this section, we test three of them, namely the patch size, the network depth, and the capsule dimension in CapsNet, to evaluate the impact of parameter changes on classification performance. In the following experiments, we use the Indian Pines dataset and uniformly select 10% of the samples for training while varying the other parameters; the remaining data are used for testing.

1) ANALYSIS OF PATCH SIZE
The size of the hyperspectral patch determines how many features the model can capture and therefore affects the classification results: larger patches can contain more discriminative features. Nevertheless, beyond a certain size, the classification performance may deteriorate. In this section, we fix the other parameters and test the effect of different patch sizes k = {7, 9, 11, 13} on classification performance.
It can be seen from Figure 8 that for the IP data set, OA first improves as the patch size increases, but then becomes worse. Larger sample patches initially perform better because they contain more spatial information to optimize performance. However, too large a sample patch also brings other problems, such as excessive computation time. As a proper trade-off, we choose a patch size of 9 as the default setting.
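The patch-based input described above can be sketched as follows, assuming edge padding at the image borders (an illustrative NumPy version; the paper does not specify its padding scheme, and the function name is ours):

```python
import numpy as np

def extract_patches(hsi, k=9):
    """Extract a k x k x d patch centered on every pixel of an
    h x w x d HSI cube; k = 9 is the paper's default patch size."""
    pad = k // 2
    # replicate border pixels so every center has a full neighborhood
    padded = np.pad(hsi, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w, d = hsi.shape
    patches = np.empty((h * w, k, k, d), dtype=hsi.dtype)
    idx = 0
    for i in range(h):
        for j in range(w):
            patches[idx] = padded[i:i + k, j:j + k, :]
            idx += 1
    return patches
```

Each extracted patch carries the label of its center pixel, which is how per-pixel classification is turned into patch classification.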

2) EFFECT OF DIFFERENT NETWORK DEPTHS
In the Caps-MRF framework, the network depth significantly affects the training results. We determine the most suitable network depth by adding or removing non-linear layers. According to the literature, deeper deep learning models do not always yield better HSI classification results [31]. We therefore train and test the proposed method on four networks with depths of 5, 7, 9, and 11. Figure 9 shows the experimental results. As can be seen, our experiment verifies that increasing the network depth does not necessarily lead to better classification results, possibly due to the vanishing gradient problem in deeper neural networks.
We set the network depth to 7 for the best performance. In other experiments, it is used as the default setting for network depth.

3) THE DIMENSION OF THE CAPSULE
One of the critical components of CapsNet is the capsule. In the proposed model, the activity of the neurons in a capsule represents various attributes of the remote sensing scene. There are two capsule layers, PrimaryCaps and DigitCaps, which are lower-level capsules representing small entities and higher-level capsules representing more complex entities, respectively. Regarding the capsule dimension, capsules that are too small weaken the capsules' representational power, leading to confusion between similar scene categories in the image context, while capsules that are too large may contain more noise or redundant information [32]. Since dimensions that are too high or too low both hurt the classification results, we evaluate a set of (PrimaryCaps, DigitCaps) dimensions ((6, 12), (8, 16), (10, 20), (12, 24)) to study the effect of capsule size on the model. The experimental results are shown in Figure 10. The setting (8, 16) gives the best performance, so we use it in the proposed model.

C. CLASSIFICATION RESULT
In this section, we report the classification results of the proposed method and other classical methods. With the parameters selected as described above, Tables 1-3 show the classification results on the Indian Pines, Salinas A, and Pavia Centre datasets using the different methods. It can be seen that for the IP dataset, Caps-MRF has the highest OA, AA, and κ. At the same time, the proposed Conv-Caps is 0.20% higher than SS-Caps in OA. For the SA dataset, the OA of Caps-MRF is better, which indicates that the superiority of our method lies in fewer samples and higher accuracy. For the PC dataset, MRF optimization improves Conv-Caps' OA by 0.73%. Compared with the other methods, the classification accuracy of the capsule network-based methods is higher.
Finally, we evaluate the classification accuracy from a visual perspective. As shown in Figures 11, 12, and 13, we can observe how the different classification methods affect the classification results. It can be seen from the maps that the classification errors are mainly caused by fragmented structures or intra-class variation. The classification maps of all models and classifiers on the SA dataset are very satisfactory: because SA is a relatively simple data set, all models, including the SVM classifier, obtain fairly good results and accuracy.
In addition, the comparison of Figure 11h and Figure 11i shows that the use of MRF can significantly improve classification accuracy. Among the other methods, the performance of 3D-CNN is second only to Caps-MRF, mainly because it involves joint spectral-spatial features, which outperform single spectral or spatial features.
For the PC dataset, by comparing the ground reference with the classification maps, we find that the classification results produced by the SVM model are very noisy. The reason may be that only the spectral information contained in the HSI data is considered. In contrast, SS-Caps and Caps-MRF obtain a smoother appearance in the visualization results by combining spectral and spatial features, which further illustrates the advantages of the spectral-spatial approach. Overall, it can be concluded that the proposed method enhances the generalization ability of the network. We can study spectral-spatial features more effectively by considering the spatial position, the spectral features, and possible transformations of the spectral-spatial features. CapsNet can use higher levels of abstraction to characterize HSI data: to define the class boundaries more accurately, the basic information of the corresponding spectral-spatial characteristics and feature transformations is supplied as a set of instantiation parameters.
We also compared the number of parameters and FLOPs of several different methods. Taking the IP data set as an example, the results are shown in Figure 14, where the abscissa represents the number of network parameters, the ordinate represents accuracy, and the radius of each circle represents FLOPs. It can be seen that, compared with the other networks, our method obtains better results with fewer parameters.

D. FURTHER EXPLANATION
In order to further evaluate the robustness and generalization ability of the proposed method, the classification results obtained by the proposed method are compared under different training set sizes, and noise is added to study the adaptability of the method. The IP data set is still used as the instance, with the same parameters as above.

1) DISCUSSION OF TRAINING SIZE
One of the critical factors determining the performance of a model or classifier is the number of training samples. Without a large number of training samples, deep learning models may find it difficult to extract effective features. For HSI data, there are rarely enough training samples to choose from, so determining the number of training samples is crucial for HSI classification.
As shown in Figure 15, we randomly selected 5%, 10%, 15%, and 20% of the Indian Pines dataset as training samples, and compared SVM, 1D-CNN, 3D-CNN, SS-Caps, and the proposed method. Compared with 3D-CNN and SS-Caps, the OA obtained by Caps-MRF increased by 2.75% and 0.44%, respectively. As the training sample size increases, the classification accuracy also increases until it reaches a stable level. These experimental results further demonstrate the superiority of the method under small training sample sizes.
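The per-class random split protocol used in these experiments can be sketched as follows (our own illustrative NumPy code; the paper's exact sampling code is not given):

```python
import numpy as np

def split_labeled_samples(labels, train_frac=0.10, val_frac=0.10, seed=0):
    """Randomly split labeled pixel indices into train/val/test sets
    class by class, mirroring the 10%/10%/remainder protocol above.
    labels: 1-D array of class ids for all labeled pixels."""
    rng = np.random.default_rng(seed)
    train, val, test = [], [], []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        n_tr = max(1, int(len(idx) * train_frac))
        n_va = max(1, int(len(idx) * val_frac))
        train.extend(idx[:n_tr])
        val.extend(idx[n_tr:n_tr + n_va])
        test.extend(idx[n_tr + n_va:])
    return np.array(train), np.array(val), np.array(test)
```

Splitting per class rather than globally keeps every category represented in the training set, which matters for the rare classes in Indian Pines.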

2) DISCUSSION OF NOISE LEVEL
Noise robustness experiments were conducted using noise-contaminated HSI, and the classification performance of the four classical methods used in the previous section and the proposed method was compared at different noise levels. Taking the IP data set as an example, zero-mean Gaussian noise with different levels σ ∈ {0.15, 0.3, 0.45} was added for the classification experiments. The parameter settings of all methods are the same as above. It can be seen from the figure that increasing noise affects the classification performance of all methods. Among them, SVM is the most sensitive to noise and degrades the fastest, possibly because it can only extract shallow features. In contrast, the methods using spectral-spatial characteristics show excellent classification performance.
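The noise injection step can be sketched as follows (a minimal NumPy sketch, interpreting σ as the standard deviation of the zero-mean Gaussian noise; the function name is ours):

```python
import numpy as np

def add_gaussian_noise(hsi, sigma, seed=0):
    """Add zero-mean Gaussian noise of level sigma to an HSI cube,
    as in the robustness experiments (sigma in {0.15, 0.3, 0.45})."""
    rng = np.random.default_rng(seed)
    return hsi + rng.normal(0.0, sigma, size=hsi.shape)
```

In practice the HSI cube would be normalized before noise is added, so that a given σ corresponds to a comparable perturbation across datasets.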
According to Figure 16, compared with the other methods, the capsule network-based methods show superior performance under different noise levels. As the noise level σ increases, the OA of Caps-MRF decreases from 98.87% to 97.69%, which is much better than the other methods. These experiments show that even under heavy noise, our Caps-MRF still performs well.

3) USE SPATIALLY DISJOINT SAMPLES
Since a random sampling strategy can affect the reliability of classifier results [33], in order to accurately measure the true generalization ability of the model, we use strictly spatially separated training and test sets for these experiments. We again take the IP data set (available from the GRSS DASE website, http://dase.grss-ieee.org) as an example, using the methods compared in the previous section. As shown in Figure 17, our method retains an advantage over the other methods. At the same time, there are large performance differences between the methods.
The reason the performance of the spectral-spatial methods is significantly affected may be the high intra-class similarity and inter-class variability of HSI, together with the presence of noise. That is, spatial correlation between the training and testing sets reduces the quality of the conclusions obtained by spectral-spatial methods.

V. CONCLUSION
In this article, a new, improved capsule network called Conv-Caps was investigated for HSI classification. Furthermore, based on Conv-Caps, the MRF framework was introduced, and a new method called Caps-MRF was presented. Specifically, Caps-MRF can better use the spectral and spectral-spatial features in HSI data, and MRF helps maintain labeling consistency within small neighborhoods. Experimental results on three HSIs (Indian Pines, Pavia Centre, and Salinas A) illustrate that the proposed framework performs better than traditional methods, especially with a limited number of training samples.
The proposed methods explored the potential of the capsule network for hyperspectral classification. The equivariance of the capsule network effectively guarantees the retention of valid information, and the MRF spatial constraint helps achieve more accurate model convergence. In future work, we will further consider improving CapsNet in combination with other post-classification methods. Since capsules can better extract spectral features, they will play a significant role in HSI processing.