Block Sparse Bayesian Learning over Local Dictionary for Robust SAR Target Recognition

Abstract. This paper applies block sparse Bayesian learning (BSBL) to synthetic aperture radar (SAR) target recognition. Traditional sparse representation-based classification (SRC) operates on a global dictionary constructed jointly from all classes; the similarities between the test sample and the various classes are then evaluated through the reconstruction errors. In contrast, this paper reconstructs the test sample based on local dictionaries formed by the individual classes. Considering the azimuthal sensitivity of SAR images, the linear coefficients on a local dictionary are sparse with a block structure. Therefore, BSBL is employed to solve for the sparse coefficients. The proposed method better exploits the representation capability of each class, thus benefiting recognition performance. Experimental results on the moving and stationary target acquisition and recognition (MSTAR) dataset confirm the effectiveness and robustness of the proposed method.


Introduction
Synthetic aperture radar (SAR) has been used in Earth observation since it was first developed. Automatic target recognition (ATR) is a key application of SAR image interpretation, which aims to analyze the targets of interest in images and determine their labels. Since the 1990s, SAR ATR methods have been studied widely using feature extraction and classification algorithms [1,2]. Different types of features have been applied to SAR ATR, including geometrical, transformation, and electromagnetic features. Target contour, region, shadow, etc. are typical geometrical features, which describe the sizes or shape distributions [3][4][5][6][7][8][9][10][11][12]. Ding et al. developed a matching algorithm of binary target regions for SAR ATR [3], which was further improved by Cui et al. using the Euclidean distance transform [4].
The Zernike and Krawtchouk moments were employed to describe the target regions in [5,6], respectively. Anagnostopoulos extracted outline descriptors for SAR target recognition [7]. Papson validated the utility of target shadows for SAR ATR [8].
The transformation features are usually obtained through mathematical or signal processing operations. The mathematical tools include principal component analysis (PCA) [13], kernel PCA (KPCA) [14], and nonnegative matrix factorization (NMF) [15]. In addition, some newly proposed manifold learning methods have been demonstrated to be effective for SAR ATR [16][17][18][19]. Image decomposition tools, including the wavelet transform [20], monogenic signal [21,22], and empirical mode decomposition (EMD) [23,24], have been adopted in SAR ATR with good performance. The scattering center is a typical electromagnetic feature with several applications in SAR ATR [25][26][27][28][29][30][31]. A Bayesian matching scheme of attributed scattering centers was developed in [26] for target recognition. Ding et al. used the attributed scattering centers as basic features and proposed several classification schemes [27,28]. Zhang proposed a noise-robust method using attributed scattering centers [29]. Furthermore, attributed scattering centers were employed to partially reconstruct the target to enrich the available training samples [30,31]. In addition to single-type features, many multifeature SAR ATR methods have been designed [32][33][34][35][36]. The classification algorithms are mainly introduced from the pattern recognition field. Well-known classifiers, including the support vector machine (SVM) [37,38], adaptive boosting (AdaBoost) [39], and sparse representation-based classification (SRC) [40][41][42], have been successfully applied to SAR ATR. SVM was first used by Zhao and Principe for SAR target recognition [37]. Since then, SVM has been the most popular classifier for different kinds of features in SAR ATR [3,5,38]. Sun et al. developed AdaBoost for SAR target recognition, which enhanced the classification performance by boosting several simple classifiers [39].
Based on compressive sensing theory, SRC was first validated in face recognition [43] and further used in SAR ATR in many related works [40][41][42]. With the progress of deep learning, many novel networks have been developed for SAR target recognition [44][45][46][47][48][49][50][51][52][53][54][55][56][57][58][59], among which the convolutional neural network (CNN) is the most widely used. Network architectures including the all-convolutional neural network (A-ConvNets) [46], enhanced squeeze and excitation network (ESENet) [47], gradually distilled CNN [48], cascade coupled CNN [49], and multistream CNN [50] have been developed and applied. Other works enriched the effective training samples using transfer learning, data augmentation, and so forth, thus improving the classification ability of the networks [51][52][53]. However, the performance of deep learning models is closely related to the scale of the training set. When training SAR images are scarce, the final performance is significantly impaired.
This paper proposes a novel classification scheme for SAR target recognition by improving traditional SRC. SRC performs a linear representation of the test sample over the global dictionary established from all the training samples. The reconstruction errors from different classes are then analyzed to obtain the target label. In essence, SRC compares the relative representation capabilities of the classes, but the absolute representation capability of each class is not fully exploited. Therefore, this paper represents the test sample over the local dictionaries formed by the individual training classes, so the capability of each class to describe and represent the input sample can be fully investigated. Considering the azimuthal sensitivity of SAR images [60,61], the test sample is only related to those training samples whose azimuths are close to its own. When the atoms in the local dictionary are sorted according to azimuth, the linear coefficients over the local dictionary are sparse with a block structure; i.e., the nonzero elements accumulate in a small azimuth interval. Accordingly, block sparse Bayesian learning (BSBL) [62] is employed to solve for the sparse coefficients on the local dictionary, as it can exploit the block structure with higher precision. Finally, the reconstruction errors of the individual classes are analyzed to determine the target type. To investigate the performance of the proposed method, the moving and stationary target acquisition and recognition (MSTAR) dataset is employed for testing and comparison. The results validate the superiority of the proposed method under the standard operating condition (SOC) and typical extended operating conditions (EOCs).

SRC
SRC can be regarded as a modification of the linear representation problem with the idea of compressive sensing [40][41][42][43]. The sample to be classified is represented over the global dictionary comprising all the training samples, while the linear coefficients are sparse with only a few nonzero entries. The global dictionary is denoted as $A = [A_1, A_2, \ldots, A_C]$, where $A_i$ collects the training samples of the $i$th class as its columns. For the test sample $y$, the reconstruction process is formulated as

$$\hat{x} = \arg\min_{x} \|x\|_0 \quad \text{s.t.} \quad \|y - Ax\|_2 \leq \varepsilon, \tag{1}$$

where $x$ denotes the solved coefficient vector and $\varepsilon$ is the error tolerance.
With the solution $\hat{x}$, the target label of $y$ is determined by computing and comparing the reconstruction errors of the different classes:

$$\operatorname{identity}(y) = \arg\min_{i} \|y - A\,\delta_i(\hat{x})\|_2, \tag{2}$$

where $\delta_i(\hat{x})$ extracts the coefficients of the $i$th class.
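As an illustrative sketch only, not the authors' implementation, the SRC procedure above can be written with a simple orthogonal matching pursuit solver standing in for the sparse recovery step; all function names here are hypothetical:

```python
import numpy as np

def omp(A, y, k):
    """Greedy orthogonal matching pursuit: select k atoms of A to approximate y."""
    residual, support, coef = y.astype(float), [], np.zeros(0)
    for _ in range(k):
        # pick the atom most correlated with the current residual
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

def src_classify(y, class_dicts, k=2):
    """SRC: solve a sparse code over the global dictionary, then compare
    the per-class reconstruction errors ||y - A_i delta_i(x)||_2."""
    A = np.hstack(class_dicts)               # global dictionary from all classes
    x = omp(A, y, k)
    errors, start = [], 0
    for A_i in class_dicts:
        x_i = x[start:start + A_i.shape[1]]  # coefficients of the i-th class
        errors.append(float(np.linalg.norm(y - A_i @ x_i)))
        start += A_i.shape[1]
    return int(np.argmin(errors)), errors
```

A test vector synthesized from atoms of one class should then be assigned to that class, since that class's partial reconstruction leaves the smallest residual.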
In SRC, the representation errors only embody the relative capabilities of the different classes in reconstructing the test sample; the absolute representation capability of each class cannot be effectively exploited. In other words, how well each individual class by itself can reconstruct the test sample should be further evaluated.

Block Sparse Bayesian Learning over Local Dictionary
3.1. Sparse Representation over Local Dictionary. Rather than representing the test sample over the global dictionary, this paper represents it on each local dictionary:

$$y = A_i x_i + \varepsilon_i, \quad i = 1, 2, \ldots, C, \tag{3}$$

where $x_i$ denotes the linear coefficient vector over the $i$th local dictionary $A_i$ and $\varepsilon_i$ is the reconstruction error. Figure 1 illustrates four SAR images of the BMP2 target from the MSTAR dataset, measured at different azimuths. As shown, SAR images of the same target from notably different azimuths have clearly distinct appearances. Because SAR images are sensitive to azimuth changes, only those training samples (atoms in the dictionary) whose azimuths are close to that of the test sample are useful in the linear reconstruction. By arranging the atoms in the local dictionary in ascending (or descending) order of azimuth, the nonzero elements in $x_i$ tend to cluster in a small azimuth interval. Hence, the resulting $x_i$ is a sparse vector with a block structure. To better reconstruct the test sample, BSBL is employed to estimate $x_i$, as it has been demonstrated to be more suitable for the reconstruction of block sparse signals [62].
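The azimuth-sorting step that induces the block structure can be sketched as follows; the 10° bin width and all names are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

def build_local_dictionary(samples, azimuths, block_deg=10.0):
    """Sort one class's training samples by azimuth and record block
    boundaries, so that coefficients over contiguous azimuth bins form
    the blocks exploited by BSBL.
    samples: (n_features, n_atoms); azimuths: degrees in [0, 360)."""
    order = np.argsort(azimuths)
    A_i = samples[:, order]
    sorted_az = np.asarray(azimuths)[order]
    block_ids = (sorted_az // block_deg).astype(int)  # atoms per azimuth bin
    # start index of each occupied azimuth bin (one BSBL block per bin)
    starts = np.flatnonzero(np.r_[True, np.diff(block_ids) != 0])
    return A_i, starts

# hypothetical example: 6 atoms with unsorted azimuths
rng = np.random.default_rng(1)
X = rng.standard_normal((8, 6))
az = np.array([350.0, 12.0, 8.0, 183.0, 5.0, 190.0])
A_i, starts = build_local_dictionary(X, az)
```

After sorting, a test sample at (say) 10° azimuth would be explained almost entirely by the atoms inside one or two adjacent bins, which is exactly the block-sparse pattern described above.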
Compared with the traditional global-dictionary-based SRC, sparse representation over the local dictionary can further exploit the representation abilities of the training classes. The reconstruction error in (3) reflects the absolute capability of the $i$th class in representing the test sample $y$. In addition, with the constraint of azimuthal sensitivity imposed during the linear representation, the reconstruction errors from the different classes can be used to make reliable decisions on the target label.

3.2. BSBL Framework. Assume $x$ is a block sparse signal with the block structure

$$x = [\underbrace{x_1, \ldots, x_{d_1}}_{x^{(1)T}}, \; \underbrace{x_{d_1+1}, \ldots, x_{d_1+d_2}}_{x^{(2)T}}, \; \ldots, \; \underbrace{x_{N-d_g+1}, \ldots, x_N}_{x^{(g)T}}]^T. \tag{4}$$

The signal $x$ in (4) has $g$ blocks, among which only a few are nonzero. Here, $d_i$ denotes the length of the $i$th block. Usually, the entries within the same block are closely correlated. To describe the block structure as well as the intrablock correlation, the BSBL framework [62] employs a parameterized Gaussian prior for each block:

$$p\big(x^{(i)}; \gamma_i, B_i\big) = \mathcal{N}\big(0, \gamma_i B_i\big), \quad i = 1, \ldots, g, \tag{5}$$

where $\gamma_i$ and $B_i$ are unknown deterministic parameters: $\gamma_i$ represents the confidence of the relevance of the $i$th block and $B_i$ captures the intrablock correlation. Assuming that different blocks are mutually independent, the signal model can be rewritten as

$$p\big(x; \{\gamma_i, B_i\}_{i=1}^{g}\big) = \mathcal{N}(0, \Gamma), \tag{6}$$

where $\Gamma$ is a block diagonal matrix whose $i$th principal diagonal block is $\gamma_i B_i$. The observation $y$ is modeled as

$$y = \Phi x + n, \tag{7}$$

where $\Phi$ is an $M \times N$ sensing matrix and $n$ denotes the noise term. The sensing matrix $\Phi$ is underdetermined, and the noise is modeled as zero-mean Gaussian with variance $\beta^{-1}$, $\beta$ being an unknown parameter. Therefore, the likelihood is given by

$$p(y \mid x; \beta) = \mathcal{N}\big(\Phi x, \beta^{-1} I\big). \tag{8}$$

The main body of the BSBL algorithm iterates between the estimation of the posterior $p\big(x \mid y; \{\gamma_i, B_i\}_{i=1}^{g}, \beta\big)$ and the update of the parameters. The update rules for $\gamma_i$, $B_i$, and $\beta$ are derived by the Type II maximum likelihood method, which leads to the cost function

$$\mathcal{L}(\Theta) = \log\left|\beta^{-1} I + \Phi \Gamma \Phi^T\right| + y^T \left(\beta^{-1} I + \Phi \Gamma \Phi^T\right)^{-1} y. \tag{9}$$

Based on the estimates of $\gamma_i$, $B_i$, and $\beta$, the MAP estimate of the coefficient vector $x$ is obtained as

$$\hat{x} = \Sigma \Phi^T \beta y, \quad \Sigma = \left(\Gamma^{-1} + \beta \Phi^T \Phi\right)^{-1}. \tag{10}$$
3.3. Target Recognition. By solving the block sparse coefficients on the local dictionaries, respectively, the reconstruction error of each training class is obtained as

$$r_i(y) = \|y - A_i \hat{x}_i\|_2, \quad i = 1, 2, \ldots, C, \tag{11}$$

where $\hat{x}_i$ is the coefficient vector solved over the $i$th local dictionary by BSBL. Afterwards, the target label is assigned to the minimum-error class as in (2). Figure 2 illustrates the main idea of the proposed method. During implementation, PCA is performed as a feature extraction step for both training and test samples. The detailed steps are summarized as follows:

Step 1: Arrange the training samples of each class in ascending order of azimuth
Step 2: Represent the test sample on the local dictionaries using BSBL
Step 3: Reconstruct the test sample with each class to obtain the residuals
Step 4: Make the classification decision according to the minimum error
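The steps above can be sketched as a generic pipeline in which the solver is pluggable (BSBL in the paper); here a ridge-regularized least-squares stand-in is used purely for illustration, and all names are hypothetical:

```python
import numpy as np

def classify_local(y, class_dicts, solver):
    """Steps 2-4: represent y over each class's local dictionary,
    compute residuals r_i = ||y - A_i x_i||_2, and return the
    minimum-error class. solver(A, y) returns a coefficient vector."""
    residuals = []
    for A_i in class_dicts:
        x_i = solver(A_i, y)
        residuals.append(float(np.linalg.norm(y - A_i @ x_i)))
    return int(np.argmin(residuals)), residuals

def ridge(A, y, lam=1e-3):
    """Stand-in solver: ridge-regularized least squares (not BSBL)."""
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
```

Because each class reconstructs the test sample with its own dictionary, the residuals directly measure the absolute representation capability of each class, as argued above.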

Preparation.
With its volume of measured SAR images, the MSTAR dataset has long been used for the evaluation of target recognition algorithms. SAR images of the 10 targets shown in Figure 3 are available in the dataset, collected by an X-band radar with a resolution of 0.3 m (cross range) × 0.3 m (range). Samples of each target cover 0°~360° aspect angles in both the training and test sets. Accordingly, several experimental conditions, including SOC and EOCs, can be set up to test SAR ATR methods. Several reference methods are chosen from current works for comparison with the proposed method, including SVM [37], AdaBoost [39], SRC [40], and A-ConvNet [46]. These methods aim to improve performance by updating the classification scheme. SVM, AdaBoost, and SRC are also performed on the PCA feature vectors, consistent with the proposed method, for a fair comparison. A-ConvNet is a CNN-based method trained on the original image pixels. All these methods are run by the authors on the same hardware platform as the proposed one.
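The PCA feature extraction shared by SVM, AdaBoost, SRC, and the proposed method can be sketched as follows; the feature dimensionality used here is an assumed value, not one reported in the paper:

```python
import numpy as np

def pca_features(train, test, n_components=80):
    """Project vectorized SAR image chips onto the top principal
    directions learned from the training set only.
    train, test: (n_samples, n_pixels) arrays."""
    mean = train.mean(axis=0)
    # right singular vectors of the centered training data = PCA axes
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    W = Vt[:n_components].T            # (n_pixels, n_components)
    return (train - mean) @ W, (test - mean) @ W
```

Fitting the projection on the training set alone avoids leaking test information into the features, which keeps the comparison between classifiers fair.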

Recognition Results.
In the following experiments, the SOC is first set up for classification. Afterwards, three different EOCs are considered: configuration variance, depression angle variance, and noise corruption. In each case, the four reference methods are tested and compared with the proposed one.

Recognition under SOC.
The conditions for the SOC experiment are set up as in Table 1, which involves the 10 classes of targets in Figure 3. Overall, the training and test samples are assumed to share high similarities. Specifically, the test samples of BMP2 and T72 include two configurations different from those in their training sets (denoted by serial numbers). The classification results of the proposed method in this case are displayed as a confusion matrix in Figure 4. As shown, the correct recognition rates of the different classes, recorded on the diagonal, are all higher than 97%. As an overall evaluation, the average rate of correct recognition ($AR_{cr}$) reaches 98.76%. Table 2 compares the $AR_{cr}$ values of the proposed method and the reference ones. The result of A-ConvNet is only slightly lower than that of the proposed method, owing to the strong classification capability of deep learning models. In comparison with SRC, the recognition performance is greatly enhanced by the proposed method, which validates the effectiveness of BSBL as a classification scheme. With the highest $AR_{cr}$, the proposed method achieves the best effectiveness under SOC.

Figure 2: Procedure of SAR target recognition based on BSBL over local dictionaries.

As reported in the current literature, the MSTAR dataset can be employed to set up several different EOCs with regard to target configurations, depression angles, and noise. In the following, the proposed method is tested under these three typical EOCs, respectively.
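For clarity, $AR_{cr}$ as used in these tables is simply the overall fraction of correctly classified test samples, computable from a count-valued confusion matrix; the 3-class counts below are made up for illustration:

```python
import numpy as np

def average_recognition_rate(confusion):
    """AR_cr: trace of the count-valued confusion matrix over its sum,
    i.e. the overall fraction of correctly classified test samples."""
    C = np.asarray(confusion, dtype=float)
    return float(np.trace(C) / C.sum())

# hypothetical 3-class confusion matrix (rows: true, columns: predicted)
C = [[98, 1, 1],
     [2, 97, 1],
     [0, 3, 97]]
```

Here 292 of 300 samples fall on the diagonal, so the hypothetical $AR_{cr}$ is 292/300 ≈ 97.33%.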

Configuration Variance
Military vehicles usually have several variants with structural modifications. The training and test samples under configuration variance are set up as in Table 3, with four targets to be classified. Among them, BRDM2 and BTR70 are placed in the training set but have no test samples, in order to increase the classification difficulty. The test configurations of BMP2 and T72 are totally different from those in their training samples. Figure 5 illustrates four different configurations of T72. As observed, they share similar global appearances but have some local differences. Table 4 gives the assigned labels of all the test samples of BMP2 and T72. Each configuration of BMP2 and T72 is correctly recognized with an accuracy over 96%, and $AR_{cr}$ reaches 97.18%. The $AR_{cr}$ values of the different methods under configuration variance are compared in Table 5. With the highest $AR_{cr}$, the superior robustness of the proposed method over the reference methods is validated. Specifically, in comparison with traditional SRC, the proposed method improves $AR_{cr}$ by a large margin, which demonstrates the high effectiveness of BSBL.

Depression Angle Variance
When SAR images are measured at a depression angle notably different from that of the corresponding training samples, they exhibit many differences even at the same azimuth. The training and test samples under large depression angle variances are set up as in Table 6 with three different targets. The training set comprises SAR images of the three targets at 17° depression angle, while the test set comprises two subsets at 30° and 45°, respectively. Figure 6 illustrates SAR images from the three depression angles, in which their differences can be intuitively observed. Table 7 compares the $AR_{cr}$ values of the five methods at the two test depression angles. At 30°, these methods maintain $AR_{cr}$ values higher than 94%. However, at 45° depression angle, the $AR_{cr}$ of each method degrades significantly, falling below 73%. In comparison, the highest $AR_{cr}$ values at both depression angles are achieved by the proposed method, showing its better robustness to large depression angle variances. The $AR_{cr}$ of A-ConvNet decreases greatly at 45° depression angle because the training set can hardly reflect and describe the situations occurring in the test samples; as a result, the trained network loses its validity. Compared with traditional SRC, BSBL over the local dictionaries maintains a clear performance advantage.

Figure 4: The confusion matrix of the proposed method on 10 classes of targets under SOC.

Noise Corruption
Noise is common in measured SAR images and poses an obstacle to correct target recognition. In previous works, two types of noise have been used to simulate noisy SAR images for classification: additive Gaussian noise [63] and random noise [46]. Figure 7 shows exemplar SAR images with random noise, in which some of the original pixels are replaced with randomly high values according to the noise level. At each noise level, the performance of the different methods is tested, and the results are plotted in Figure 8. As shown, the proposed method achieves the highest $AR_{cr}$ at each noise level, showing its superior robustness to noise corruption. As a compressive sensing algorithm, BSBL has good robustness to noise. Similarly, SRC generally achieves better performance than SVM, AdaBoost, and A-ConvNet under noise corruption. Compared with traditional SRC, BSBL contributes to the better performance of the proposed method.
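The random-noise corruption used in this EOC can be simulated roughly as below; the distribution of the replacement values is an assumption, since the text only states that pixels are replaced with randomly high values:

```python
import numpy as np

def add_random_noise(image, level, rng=None):
    """Replace a fraction `level` of pixels with random high values
    (here: uniform draws between the image maximum and twice that
    maximum -- an assumed distribution, for illustration only)."""
    rng = np.random.default_rng(rng)
    noisy = image.astype(float).copy()
    n_corrupt = int(round(level * noisy.size))
    idx = rng.choice(noisy.size, size=n_corrupt, replace=False)
    noisy.flat[idx] = rng.uniform(noisy.max(), noisy.max() * 2, size=n_corrupt)
    return noisy
```

Sweeping `level` over a grid (e.g., 0.05 to 0.20) and re-running each classifier would reproduce the kind of robustness curve plotted in Figure 8.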

Conclusion
BSBL is applied to SAR target recognition in this paper, performed on local dictionaries. For each training class, a reconstruction error for the test sample is produced based on the solution from BSBL. These reconstruction errors fully exploit the representation capabilities of the different classes and can be used to judge the target label. Owing to azimuthal sensitivity, the linear coefficients generated for the test sample over a local dictionary are block sparse; thus, BSBL is well suited to their solution. On the MSTAR dataset, the proposed method achieves an $AR_{cr}$ of 98.76% for 10 classes under SOC and 97.18% under configuration variance. The $AR_{cr}$ values at depression angles of 30° and 45° are 97.32% and 72.85%, respectively. The robustness under random noise corruption also surpasses that of the four reference methods. All these comparisons show the superior performance of the proposed method.

Data Availability
The MSTAR dataset used to support the findings of this study is available online at http://www.sdms.afrl.af.mil/datasets/mstar/.