Target recognition in synthetic aperture radar image based on PCANet

Abstract: Automatic target recognition (ATR) for synthetic aperture radar (SAR) images is very important; it can be used in traffic management, national frontier safety, and other fields. Traditional algorithms for SAR ATR consist of feature extraction and classifier training, and the features are essential for classification accuracy. However, choosing good features by hand is a hard task. Deep convolutional neural networks (CNNs), which can learn features automatically, have achieved great performance on natural images. However, CNNs have many parameters and need a lot of data to train, while the remote-sensing data of SAR is limited. The authors therefore need a simple network that does not require much data and is easy to train. The principal component analysis network (PCANet) is a shallow network that performs well in recognition tasks and requires no handcrafted feature selection. Although this network has found wide application in natural images, it is rarely used in SAR images. Experimental results on the moving and stationary target acquisition and recognition (MSTAR) dataset show that the PCANet can achieve over 99% accuracy on ten target classes. This result is better than traditional algorithms and very close to the results of deep-learning methods.


Introduction
With the development of remote-sensing technology, the amount of remote-sensing data has increased rapidly [1]. Target recognition based on remote-sensing data is also increasing. Automatic target recognition (ATR) for SAR images is used in traffic management, national frontier safety, and many other fields. Synthetic aperture radar (SAR) can obtain high-resolution images in all weather, day and night [2], so SAR images are widely used in target detection and classification. A good ATR method can help us handle many tasks more effectively.
The traditional approach to classification is to extract handcrafted features and train a classifier [3]. Wang et al. proposed a classification scheme based on geometric and backscattering characteristics [4]. Ding et al. proposed a target recognition method that exploits azimuth sensitivity information (ASI) [5]; the ASI is constructed from the original SAR images and can describe the azimuth sensitivity of a certain target class at a specific azimuth. Zhu et al. proposed an improved shape-contexts method which can describe the topology and intensity of the scattering points of the targets [6]. These methods rely on features designed by researchers, and such handcrafted feature extraction has some drawbacks. First, we do not know which features are best for training the classifier. Second, the features must be changed for different target types, since different targets have different effective features. Moreover, we cannot know whether there are further features that could improve recognition performance.
With the development of deep-learning methods, many image-recognition fields have achieved great performance. Deep convolutional neural networks (CNNs) can automatically learn features from a large dataset. It is challenging to utilise deep CNNs for SAR target classification because CNNs have many parameters to train and require much training data, whereas it is hard to obtain enough SAR data to train a CNN. Some researchers have made efforts to cope with this problem. Lin et al. proposed the convolutional highway unit to train deeper networks with limited SAR data [7]. Chen et al. presented new all-convolutional networks (A-ConvNets), which consist only of sparsely connected layers, without fully connected layers [8]. Huang et al. proposed a transfer-learning-based method, making knowledge learned from sufficient unlabelled SAR scene images transferrable to labelled SAR target data [9]. These methods need less data than the usual CNN, but such networks are still complicated and not easy to train. Therefore, we need a simple network that has fewer parameters and is easy to train. The principal component analysis network (PCANet) is a shallow network that achieves great performance in recognition tasks and requires no handcrafted feature selection [10]. Although this network has found wide application in natural images, it is rarely used in SAR images.
In this letter, we attempt to use the PCANet, which can learn powerful high-level feature representations from few SAR training images, and which has fewer parameters than CNNs. The experimental result on the moving and stationary target acquisition and recognition (MSTAR) dataset shows that the PCANet can achieve over 99% accuracy on ten target classes. This result is better than that of traditional handcrafted-feature methods.
The rest of this letter is organised as follows. Section 2 introduces the PCANet. Experimental results are provided in Section 3. Conclusions and future work are provided in Section 4.

Description of PCANet
In this section, we describe the PCANet. The structure of the PCANet is illustrated in Fig. 1. Let $M$ be the number of training images, $I_i$ the $i$th image of size $l \times r$, and $s_1 \times s_2$ the patch (or 2D filter) size. The filters are learned from the training images as follows. Around each pixel we take an $s_1 \times s_2$ patch and collect all patches of the $i$th image; $p_{i,j}$ denotes the $j$th patch in $I_i$. After subtracting the patch mean from each patch, we obtain

$$\bar{P}_i = [\bar{p}_{i,1}, \bar{p}_{i,2}, \ldots, \bar{p}_{i,\tilde{m}}],$$

where $\bar{p}_{i,j}$ is a mean-removed patch and $\tilde{m}$ is the number of patches per image. Collecting the patches of all training images, we get

$$P = [\bar{P}_1, \bar{P}_2, \ldots, \bar{P}_M] \in \mathbb{R}^{s_1 s_2 \times M\tilde{m}}.$$

If the filter number in stage 1 is $L_1$, PCA minimises the reconstruction error within a family of orthonormal filters, i.e.

$$\min_{V \in \mathbb{R}^{s_1 s_2 \times L_1}} \left\| P - V V^{\mathrm{T}} P \right\|_F^2 \quad \text{s.t.} \quad V^{\mathrm{T}} V = I_{L_1},$$
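The patch collection and PCA filter learning just described can be sketched in Python with NumPy as follows. This is a minimal illustrative sketch, not the authors' implementation; the function and variable names are our own.

```python
import numpy as np

def learn_pca_filters(images, s1, s2, L):
    """Learn L PCA filters of size s1 x s2 from a list of 2D images."""
    patches = []
    for img in images:
        h, w = img.shape
        # collect every s1 x s2 patch (stride 1), vectorised as a column
        for y in range(h - s1 + 1):
            for x in range(w - s2 + 1):
                p = img[y:y + s1, x:x + s2].reshape(-1)
                patches.append(p - p.mean())      # remove the patch mean
    P = np.stack(patches, axis=1)                  # shape: s1*s2 x (no. of patches)
    # the PCA filters are the leading eigenvectors of P P^T
    eigvals, eigvecs = np.linalg.eigh(P @ P.T)
    order = np.argsort(eigvals)[::-1]              # sort eigenvalues descending
    return [eigvecs[:, order[k]].reshape(s1, s2) for k in range(L)]
```

Because `np.linalg.eigh` returns orthonormal eigenvectors, the resulting filters automatically satisfy the orthonormality constraint in the minimisation above.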
where $I_{L_1}$ is the identity matrix of size $L_1 \times L_1$. The solution is given by the $L_1$ principal eigenvectors of $P P^{\mathrm{T}}$, so the PCA filters are expressed as

$$W_l^1 = \mathrm{mat}_{s_1, s_2}\!\left( q_l(P P^{\mathrm{T}}) \right) \in \mathbb{R}^{s_1 \times s_2}, \quad l = 1, 2, \ldots, L_1,$$

where $\mathrm{mat}_{s_1, s_2}(x)$ maps a vector $x \in \mathbb{R}^{s_1 s_2}$ to a matrix $W \in \mathbb{R}^{s_1 \times s_2}$, and $q_l(P P^{\mathrm{T}})$ denotes the $l$th principal eigenvector of $P P^{\mathrm{T}}$. The second stage follows the same process as the first stage; the only difference is the input data, which is the output of the first stage. The $l$th filter output of the first stage can be expressed as

$$O_i^l = I_i * W_l^1, \quad l = 1, 2, \ldots, L_1,$$

where $*$ denotes 2D convolution. Repeating the same process with the stage-2 filters $W_\ell^2$, we get the outputs of the second stage,

$$O_i^{l,\ell} = O_i^l * W_\ell^2, \quad \ell = 1, 2, \ldots, L_2.$$

The first stage produces $L_1$ outputs, and every output of the first stage yields $L_2$ second-stage outputs, so the output number of the second stage is $L_1 \times L_2$. We then binarise the outputs to obtain $H(O_i^l * W_\ell^2)$, where $H(\cdot)$ is the Heaviside step function, and convert the $L_2$ binary outputs associated with $O_i^l$ into a single integer-valued image:

$$T_i^l = \sum_{\ell=1}^{L_2} 2^{\ell-1} H\!\left( O_i^l * W_\ell^2 \right).$$

For each of the $L_1$ images $T_i^l$, we partition it into $B$ blocks, compute the histogram of the decimal values (which range over $[0, 2^{L_2}-1]$) in each block, and concatenate all $B$ histograms into one vector, denoted $\mathrm{Bhist}(T_i^l)$. The feature of the input image is then defined as the set of block-wise histograms

$$f_i = \left[ \mathrm{Bhist}(T_i^1), \ldots, \mathrm{Bhist}(T_i^{L_1}) \right]^{\mathrm{T}}.$$

Finally, we train a support vector machine (SVM) classifier using these features; the trained PCANet and SVM can then be used to recognise a new target.
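The second stage, binarisation, hashing, and block-histogram steps can likewise be sketched in NumPy. This is a hedged illustrative sketch under our own naming conventions: we use non-overlapping blocks, take the Heaviside function to be 1 for positive inputs, and hand-roll a zero-padded "same"-size convolution to stay self-contained.

```python
import numpy as np

def conv2d_same(img, ker):
    """Same-size 2D convolution (kernel flipped, zero padding)."""
    kh, kw = ker.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img, ((ph, kh - 1 - ph), (pw, kw - 1 - pw)))
    flipped = ker[::-1, ::-1]              # flip kernel for true convolution
    h, w = img.shape
    out = np.empty((h, w))
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(padded[y:y + kh, x:x + kw] * flipped)
    return out

def block_histograms(T, block_shape, n_bins):
    """Histogram of integer values in each non-overlapping block, concatenated."""
    bh, bw = block_shape
    h, w = T.shape
    hists = []
    for y in range(0, h - bh + 1, bh):
        for x in range(0, w - bw + 1, bw):
            block = T[y:y + bh, x:x + bw].astype(int).ravel()
            hists.append(np.bincount(block, minlength=n_bins))
    return np.concatenate(hists)

def pcanet_features(img, stage1_filters, stage2_filters, block_shape):
    """Two-stage PCANet feature vector for a single image."""
    L2 = len(stage2_filters)
    feats = []
    for W1 in stage1_filters:
        O = conv2d_same(img, W1)                        # first-stage output O_i^l
        # binarise each second-stage output and pack into an integer image T_i^l
        T = np.zeros(img.shape)
        for k, W2 in enumerate(stage2_filters):
            T += (2 ** k) * (conv2d_same(O, W2) > 0)    # Heaviside, weight 2^{l-1}
        feats.append(block_histograms(T, block_shape, 2 ** L2))
    return np.concatenate(feats)
```

With $L_1$ stage-1 filters, $B$ blocks, and $2^{L_2}$ histogram bins per block, the feature vector has length $L_1 \times B \times 2^{L_2}$, and the features of all training images can then be fed to any off-the-shelf SVM trainer.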

Experimental results
The experimental data used in this letter were collected by the Sandia National Laboratory SAR sensor platform and are known as the MSTAR data set. The images were collected with an X-band HH-polarisation SAR sensor in spotlight mode at a resolution of 0.3 m × 0.3 m, with full aspect coverage. The ten target classes are listed in Table 1; they were captured at 15° and 17° depression angles. Examples of the ten target classes in SAR images and optical images are shown in Fig. 2. The MSTAR data set is widely used to test the performance of SAR recognition algorithms. In our experiment, the PCANet is evaluated on the ten-target classification problem. The training data are images captured at a 17° depression angle and the test data are images captured at a 15° depression angle. We do not apply any particular pre-processing to the SAR images. To find the best-performing parameters, we test several settings. With s_1 = s_2 = 7 and L_1 = L_2 = 8, we test the influence of the number of training images; the results are shown in Table 2. The average accuracy improves as the amount of training data increases. We then test the influence of the patch size s_1; the results are shown in Table 3. As s_1 grows, the accuracy first rises and then falls, which is related to the sizes of the targets. We also vary the number of filters L_1; the result is shown in Table 4. The essential parameter values of the PCANet in this experiment are s_1 = s_2 = 10, L_1 = 10, L_2 = 8. Table 5 shows the confusion matrix of the ten-target classification. Each row of the confusion matrix denotes the actual target class, and each column represents the class predicted by the algorithm. ZIL131, ZSU234, and BTR70 are recognised with a probability of 100%. The average accuracy over all targets is 99.22%. We also compare the performance of the PCANet with some state-of-the-art methods.
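For reference, a confusion matrix of the kind reported in Table 5, together with the average accuracy, can be computed from actual and predicted label arrays as follows. This is a generic sketch with our own function names; it is not tied to any particular MSTAR label encoding.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the actual class; columns index the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

def average_accuracy(cm):
    """Fraction of correctly classified samples (trace over total count)."""
    return cm.trace() / cm.sum()
```

A class is recognised with probability 100% exactly when its row of the matrix has all its mass on the diagonal entry.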
These methods include SVM [11, 12], AdaBoost [12, 13], the modified polar mapping classifier (M-PMC) [14], ASI [5], and A-ConvNets [8]. The classification accuracies of these algorithms are listed in Table 6. Since all the algorithms use the same data, we cite the results published in the corresponding papers. We can see that the PCANet achieves better performance even when compared with CNN-based methods.