Radar Target Recognition using Salient Keypoint Descriptors and Multitask Sparse Representation

In this paper, we propose a novel approach to recognize radar targets in inverse synthetic aperture radar (ISAR) and synthetic aperture radar (SAR) images. This approach is based on multiple salient keypoint descriptors (MSKD) and multitask sparse representation based classification (MSRC). To characterize the targets in the radar images, we combine the scale-invariant feature transform (SIFT) and a saliency map. The goal of this combination is to reduce the number of SIFT keypoints and their computing time by keeping only those located in the target area (salient region). Then, we compute the feature vectors of the resulting salient SIFT keypoints (MSKD). This methodology is applied to both training and test images. The MSKD of the training images is used to construct the dictionary of a sparse convex optimization problem. To achieve the recognition, we adopt the MSRC, considering each vector in the MSKD as a task. This classifier solves the sparse representation problem for each task over the dictionary and determines the class of the radar image according to all sparse reconstruction errors (residuals). The effectiveness of the proposed approach is demonstrated by a set of extensive empirical results on ISAR and SAR image databases. The results show the ability of our method to predict adequately both aircraft and ground targets.


Introduction
Nowadays, the synthetic aperture radar is becoming a very useful sensor for earth remote sensing applications, owing to its ability to work under different meteorological conditions. Recent radar image reconstruction technologies have significantly increased the amount of available radar images. Among them, we distinguish inverse synthetic aperture radar (ISAR) and synthetic aperture radar (SAR). The difference between these two types of radar images is that the motion of the target generates the ISAR images, whereas the motion of the radar produces the SAR images. Both types are reconstructed from the electromagnetic waves reflected by the target. Recently, automatic target recognition (ATR) from these radar images has become an active research topic, and it is of paramount importance in several military and civilian applications [1,2]. Therefore, it is crucial to develop a new robust generic algorithm that recognizes both aerial (aircraft) targets in ISAR images and ground battlefield targets in SAR images. The main goal of an ATR system for ISAR or SAR images is to automatically assign a target class to a radar image. To do so, a typical ATR system involves three main steps: pre-processing, feature extraction and recognition. The pre-processing locates the region of interest (ROI), which is most of the time the target. The feature extraction step aims to reduce the information of the radar image by converting it from the pixel domain to the feature domain. The main challenge of this conversion is to preserve the discriminative information of the target.

Overview of the proposed approach: MSKD-MSRC
We illustrate in Figure 1 the working mechanism of the proposed method (MSKD-MSRC). It is composed of three complementary steps. The first and second steps include the pre-processing and the characterization using the MSKD method. The last step is dedicated to the recognition task using the MSRC classifier. These steps are detailed in the next subsections.

Radar images pre-processing and characterization: MSKD
As mentioned above, we combine the saliency map and SIFT descriptor in order to compute the MSKD for the radar images.

Saliency Attention
The human visual system (HVS) can automatically locate the salient regions of visual images. Inspired by the HVS mechanism, several saliency models have been proposed to better understand how the attentional regions of images are selected. The most used model in the literature is the one proposed by Itti et al. [7]. It locates well the salient regions of an image that visually attract observers. To achieve this goal, the model exploits three channels: intensity, color and orientation. In this work, we do not use the color channel of this model due to the grayscale nature of SAR and ISAR images. Based on the intensity channel, the model creates a Gaussian pyramid I(σ), where 0 ≤ σ ≤ 8. To obtain the orientation channel from the intensity images, the model applies a pyramid of oriented Gabor filters O(σ, θ), where θ ∈ {0°, 45°, 90°, 135°} is the orientation angle for each level of the pyramid. After that, the feature maps (FM) of each channel are computed using the center-surround difference (⊖) between a center fine scale c ∈ {2, 3, 4} and a surround coarser scale s = c + µ, µ ∈ {3, 4}, as follows:

• Intensity: I(c, s) = |I(c) ⊖ I(s)|
• Orientation: O(c, s, θ) = |O(c, θ) ⊖ O(s, θ)|

In total, 30 FM are generated (6 FM for intensity and 24 FM for orientation). To create one saliency map per channel, these feature maps are normalized and linearly combined. Finally, the overall saliency map is obtained by the summation of the two computed maps (intensity and orientation).

Scale invariant feature transform (SIFT)
The scale invariant feature transform (SIFT) is a local method proposed by Lowe [17] to extract a set of descriptors from an image. This method has found widespread use in different image processing applications [34-36]. The SIFT algorithm mainly covers four complementary steps:

• Scale space extrema detection: The image is transformed to a scale space by the convolution of the image I(x, y) with the Gaussian kernel G(x, y, σ):

L(x, y, σ) = G(x, y, σ) ∗ I(x, y),  with G(x, y, σ) = (1/(2πσ²)) exp(−(x² + y²)/(2σ²))

where σ is the standard deviation of the Gaussian probability distribution function. The difference of Gaussians (DOG) is computed as follows:

D(x, y, σ) = L(x, y, pσ) − L(x, y, σ)

where p is a factor that controls the subtraction between two nearby scales. A pixel in the DOG scale space is considered a local extremum if it is the minimum or maximum of its 26 neighboring pixels (8 neighbors in the current scale and 9 neighbors in each of the two adjacent scales).

• Unstable keypoints filtering: The keypoints found in the previous step are filtered to preserve the best candidates. Firstly, the algorithm rejects the keypoints with a DOG value less than a threshold, because these keypoints have low contrast. Secondly, to discard the keypoints that are poorly localized along an edge, the algorithm uses a Hessian matrix of size 2 × 2:

H = | Dxx  Dxy |
    | Dxy  Dyy |

We note γ ≥ 1 the ratio between the larger and the smaller eigenvalues of the matrix H. Then, the method eliminates the keypoints satisfying:

Tr(H)²/Det(H) > (γ + 1)²/γ

where Tr(·) is the trace and Det(·) is the determinant.
• Orientation assignment: By selecting a region around each keypoint, we calculate the magnitude and the orientation of its gradient. After that, a histogram of 36 bins, weighted by a Gaussian window and the gradient magnitude, is built covering the 360-degree range of orientations. The orientation that achieves the peak of this histogram is assigned to the keypoint.
• Keypoint description: To generate the descriptor of each keypoint, we consider a neighboring region around the keypoint. This region has a size of 16 × 16 pixels and is divided into 16 blocks of size 4 × 4 pixels. For each block, a weighted gradient orientation histogram of 8 bins is computed. The descriptor is therefore composed of 4 × 4 × 8 = 128 values.
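The first two SIFT steps above can be sketched in a few lines of NumPy/SciPy. This is a simplified single-octave illustration, not Lowe's full implementation; the thresholds `contrast_thr` and `gamma` and the function name are our own assumptions for the example:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter

def dog_keypoints(image, sigma=1.6, p=2 ** 0.5, num_scales=5,
                  contrast_thr=0.03, gamma=10.0):
    """Steps 1-2 of SIFT: DoG extrema detection, then contrast/edge filtering."""
    # Scale space L(x, y, sigma) = G(x, y, sigma) * I(x, y) and the
    # difference of Gaussians D = L(p*sigma) - L(sigma).
    gauss = np.stack([gaussian_filter(image.astype(float), sigma * p ** i)
                      for i in range(num_scales)])
    dog = gauss[1:] - gauss[:-1]
    # A pixel is an extremum if it is the min or max of its 26 neighbours
    # (8 in its own DoG scale, 9 in each of the two adjacent scales).
    is_max = dog == maximum_filter(dog, size=(3, 3, 3))
    is_min = dog == minimum_filter(dog, size=(3, 3, 3))
    cand = (is_max | is_min) & (np.abs(dog) > contrast_thr)  # low-contrast rejection
    cand[[0, -1]] = False          # keep only scales that have two neighbours
    cand[:, [0, -1], :] = False    # skip the image border
    cand[:, :, [0, -1]] = False
    keypoints = []
    for s, y, x in zip(*np.nonzero(cand)):
        d = dog[s]
        # Edge rejection with the 2x2 Hessian of D at the candidate.
        dxx = d[y, x + 1] + d[y, x - 1] - 2 * d[y, x]
        dyy = d[y + 1, x] + d[y - 1, x] - 2 * d[y, x]
        dxy = (d[y + 1, x + 1] - d[y + 1, x - 1]
               - d[y - 1, x + 1] + d[y - 1, x - 1]) / 4.0
        tr, det = dxx + dyy, dxx * dyy - dxy ** 2
        # Keep only Tr(H)^2 / Det(H) <= (gamma + 1)^2 / gamma.
        if det > 0 and tr ** 2 / det <= (gamma + 1) ** 2 / gamma:
            keypoints.append((s, y, x))
    return keypoints
```

On a synthetic blob image, the surviving keypoints cluster on the blob center, which is the behavior the filtering steps are designed to produce.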

Multi salient keypoints descriptors (MSKD)
We illustrate in the second column of Figure 2 the distribution of the SIFT keypoints applied to an example of ISAR and SAR images. It is clear that SIFT generates a large number of keypoints to be processed. The majority of these keypoints are located in the background of the radar images. This background does not carry crucial information for radar image characterization. To handle this problem, we propose a new method called MSKD that combines the saliency attention and SIFT methods. More precisely, we first apply the saliency attention model to the radar image. An example of the saliency maps of SAR and ISAR images is illustrated in the second column of Figure 1. It is observed that this model locates and enhances the most attractive region of the input radar images, which is the target area. This saliency map is exploited as a mask to segment the radar image into background and target areas. An example of this segmentation is illustrated in the third column of Figure 1. From the segmented radar image, we compute the SIFT keypoints. In this way, we filter out the keypoints located in the background region, as illustrated in the third column of Figure 2. Finally, the descriptors matrix of one radar image is expressed as:

D = (d₁, . . . , d_k) ∈ R^(m×k)

where k is the number of salient keypoints (SKP) in the radar image and m is the size of the descriptor of each SKP, which is equal in our work to 128.
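As an illustration of the MSKD idea, the following sketch computes a simplified intensity-only saliency map (Gaussian blurs standing in for the dyadic pyramid, no Gabor orientation channel) and uses it as a mask to keep only the keypoints in the salient region. The threshold `thr` and the helper names are assumptions for this example, not details of the original method:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def intensity_saliency(image, centers=(2, 3, 4), deltas=(3, 4)):
    """Simplified intensity-only saliency: center-surround differences between
    a fine scale c and a coarser scale s = c + mu, normalised and summed."""
    img = image.astype(float)
    sal = np.zeros_like(img)
    for c in centers:
        for mu in deltas:
            s = c + mu
            # Gaussian blurs stand in for the dyadic pyramid levels.
            fm = np.abs(gaussian_filter(img, 2 ** c * 0.25)
                        - gaussian_filter(img, 2 ** s * 0.25))
            sal += fm / (fm.max() + 1e-12)   # crude per-map normalisation
    return sal / (sal.max() + 1e-12)

def salient_keypoints(keypoints, descriptors, saliency, thr=0.3):
    """Keep only the keypoints that fall in the salient (target) region."""
    mask = saliency > thr
    keep = [i for i, (y, x) in enumerate(keypoints) if mask[int(y), int(x)]]
    return [keypoints[i] for i in keep], descriptors[keep]
```

A keypoint lying on a bright target is kept while one in the dark background is discarded, which is precisely the filtering effect shown in the third column of Figure 2.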

Radar images recognition: MSRC
After obtaining the MSKD of each test and training radar image, the next step consists of its classification using the MSRC. More specifically, we first construct a dictionary containing all MSKD of the training radar images. Given the MSKD of the test radar image to classify, we solve an optimization problem that codes each descriptor (task) of the test radar image by a sparse vector. The ℓ2-norm difference between the reconstructions from these sparse vectors and the descriptors yields the residuals (reconstruction errors) of each class for each MSKD. These residuals are then summed up. Finally, the class with the minimum residual is assigned to the test radar image.

Dictionary construction
The dictionary A is obtained by the concatenation of all computed MSKD of the training radar images as follows:

A = (D₁, D₂, . . . , D_s) ∈ R^(m×n)

where s is the number of training radar images. We mention that the number of salient keypoints (SKP) differs from one radar image to another. If n is the total number of SKP in all training radar images, then the size of the dictionary A is m × n.
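The dictionary construction can be sketched as follows. The per-atom label bookkeeping (needed later by δ_c) and the unit-norm normalization of the atoms are common conventions in SRC-type methods, assumed here rather than stated in the text:

```python
import numpy as np

def build_dictionary(train_descriptors, train_labels):
    """Concatenate the MSKD matrices of the training images into A (m x n).

    train_descriptors: list of (m, k_i) arrays, one per training image,
                       where k_i is the number of salient keypoints (m = 128).
    train_labels:      class label of each training image.
    """
    A = np.concatenate(train_descriptors, axis=1)   # n = k_1 + ... + k_s columns
    # Remember which class each atom (column) comes from, for delta_c later.
    atom_labels = np.concatenate([np.full(D.shape[1], lbl)
                                  for D, lbl in zip(train_descriptors, train_labels)])
    # Unit l2-norm atoms, a common convention in SRC-type classifiers.
    A = A / (np.linalg.norm(A, axis=0, keepdims=True) + 1e-12)
    return A, atom_labels
```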

Recognition via multitask sparse framework
Given a radar image to recognize, we compute from it the set of local descriptors using the MSKD method:

Y = (y₁, . . . , y_k) ∈ R^(m×k)

To recognize Y in the sparse framework, we compute the sparse reconstruction errors (residuals) for each task y_i. Its class is then found according to its sparse linear representation over all training samples:

Y = AX

where X = (x₁, . . . , x_k) ∈ R^(n×k) is the sparse coefficient matrix. To obtain it, the following optimization problem is solved:

X̂ = argmin_X ‖X‖₁ subject to ‖Y − AX‖₂ ≤ ε    (11)

where ‖·‖₁ and ‖·‖₂ are respectively the ℓ1-norm and the ℓ2-norm, and ε denotes the error tolerance. Equation 11 represents a multitask problem since X and Y have multiple atoms (columns). It can be transformed into k ℓ1-optimization problems, one for each y_i (each task):

x̂_i = argmin_{x_i} ‖x_i‖₁ subject to ‖y_i − A x_i‖₂ ≤ ε    (12)

Equation 12 can be efficiently solved via second-order cone programming (SOCP) [37]. After obtaining the sparsest matrix X̂ = (x̂₁, . . . , x̂_k), the total reconstruction error of each class is computed as follows:

r_c(Y) = Σ_{i=1}^{k} ‖y_i − A δ_c(x̂_i)‖₂,  c = 1, . . . , nc    (13)

where nc is the number of classes and δ_c : Rⁿ → Rⁿ is the characteristic function that keeps only the coefficients associated with the c-th class and sets all others to zero. The sum fusion is thus applied among the reconstruction residuals of all tasks for each of the nc classes. Finally, the MSRC decides the class of the test sample as the class that produces the lowest total reconstruction error:

class(Y) = argmin_c r_c(Y)    (14)

Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 19 April 2018 doi:10.20944/preprints201804.0251.v1
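A minimal sketch of the MSRC decision rule is given below. Instead of the SOCP solver used in the paper, it solves a Lagrangian (Lasso) form of the per-task ℓ1 problem with plain ISTA; the solver choice, `lam` and the iteration count are illustrative assumptions, not the paper's method:

```python
import numpy as np

def ista(A, y, lam=0.05, n_iter=500):
    """Iterative soft-thresholding for min_x 0.5*||y - Ax||_2^2 + lam*||x||_1,
    a Lagrangian stand-in for the constrained l1 problem of each task."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - A.T @ (A @ x - y) / L          # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft threshold
    return x

def msrc_classify(A, atom_labels, Y):
    """Multitask SRC: code each descriptor y_i, accumulate the class-wise
    residuals ||y_i - A delta_c(x_i)||_2, and return the argmin class."""
    classes = np.unique(atom_labels)
    residuals = np.zeros(len(classes))
    for y in Y.T:                              # one task per salient keypoint
        x = ista(A, y)
        for j, c in enumerate(classes):
            x_c = np.where(atom_labels == c, x, 0.0)   # delta_c(x)
            residuals[j] += np.linalg.norm(y - A @ x_c)
    return classes[np.argmin(residuals)], residuals
```

On a toy dictionary whose class-0 atoms span one direction and class-1 atoms another, descriptors drawn near the first direction are assigned to class 0, as the sum-fusion rule predicts.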

Experimental results
In this section, we demonstrate the effectiveness of the proposed approach by conducting recognition experiments on two radar image databases. The first one is composed of ISAR images and the second contains SAR images. To the best of our knowledge, there is so far no generic approach in the literature able to recognize ISAR and SAR images with the same treatment, except our previous work [16]. Therefore, aside from our MSKD-MSRC, we also implement two ATR methods that are practically close to our method for a fair comparison. The first one uses the SIFT with matching (SIFT+matching). The second one consists of using the MSKD method in combination with matching (MSKD+matching). We note that the performance of an ATR system is related to its ability to locate the region of interest (ROI) containing the potential targets and to provide a high recognition rate from the signatures of the targets.

Database description
The ISAR images used in our work were acquired in the anechoic chamber of ENSTA Bretagne (Brest, France). The experimental setup of this chamber is depicted in Figure 3. The radar targets are illuminated with a frequency-stepped signal whose band varies between 11.65 GHz and 18 GHz. A sequence of pulses is emitted using a frequency increment ∆f = 50 MHz. By applying the inverse fast Fourier transform (IFFT), we obtain grayscale images with a size of 256 × 256 pixels. To construct the ISAR image database, we used 12 aircraft models at 1/48 reduced scale. For each target class, 162 ISAR images are generated. Consequently, the total number of ISAR images in this database is 1944. For rigorous details about the experiments conducted in the anechoic chamber, the reader is referred to [1,38]. Samples of each aircraft target class of this dataset are displayed in Figure 4.
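The IFFT image-formation principle can be illustrated in one dimension with a synthetic stepped-frequency response: two hypothetical point scatterers (the ranges and amplitudes below are invented for the example) are turned into a high-resolution range profile. The real chamber processing produces full 2-D images; this sketch only shows the range dimension:

```python
import numpy as np

c0 = 3e8                                        # propagation speed (m/s)
freqs = np.arange(11.65e9, 18.0e9 + 1.0, 50e6)  # stepped-frequency grid (Hz)

# Two hypothetical point scatterers (ranges in metres, unit-less amplitudes).
ranges = np.array([0.45, 0.80])
amps = np.array([1.0, 0.6])

# Backscattered field: each scatterer contributes a phase ramp over frequency.
field = (amps[None, :]
         * np.exp(-1j * 4 * np.pi * freqs[:, None] * ranges[None, :] / c0)).sum(axis=1)

# IFFT of the frequency response yields the high-resolution range profile;
# the range-bin spacing is c / (2 * N * delta_f).
profile = np.abs(np.fft.ifft(field))
range_axis = np.arange(len(freqs)) * c0 / (2 * 50e6 * len(freqs))
```

With the 6.35 GHz band sampled every 50 MHz (128 frequencies), the strongest peak of the profile lands at the range of the dominant scatterer.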

Target recognition results
We study in Figure 5 the influence of the number of atoms in the dictionary (the size of the training set) on the recognition rate. We randomly select 195, 390, 585 and 780 atoms, corresponding respectively to 10%, 20%, 30% and 40% of all ISAR images in the database. The remaining ISAR images are used for the test. Consequently, we adopt configurations where the number of ISAR images in the training set is smaller than in the test set. It is observed that all methods are sensitive to the number of dictionary atoms: when this number increases, the recognition rate rises as well. Additionally, comparing the matching and the MSRC for recognizing the MSKD, it is observed that as the number of atoms decreases, the recognition rate of MSKD+matching descends faster than that of the proposed method. In the upcoming experiments, we adopt 780 atoms. The comparison in terms of overall recognition rate is given in Table 1, where the best accuracies are highlighted in bold. According to this table, several observations can be made. First, the SIFT method provides the worst result. That is due to the location of keypoints in the background of the ISAR images, which is not useful for the recognition, as illustrated in Figure 2. In contrast, the MSKD contributes to enhancing the recognition rate thanks to its concentration of SIFT keypoints in the target area. With matching, using only 23,559 keypoints, which corresponds to 17.83% of the 420,027 initial keypoints, the recognition rate is improved by 5.29%. This demonstrates the benefit of the adopted filtering of the SIFT keypoints. Second, the MSRC performs better than the matching. That can be explained by the fact that the multitask sparse coding of the MSKD of ISAR images leads to an enhancement of the recognition rate. We provide in Figure 6 the confusion matrix of the proposed method as well as those of the remaining methods.
For each confusion matrix, the diagonal values correspond to the recognition rate per class, which should be high, while the remaining values represent the misrecognition rates, which must be low. The proposed method exceeds the other methods in recognition rate for all classes except the F15 target. In addition, the MSKD-MSRC gives a recognition rate of 100% for five target classes: F117, F104, A10, F14 and Mig29. The SIFT+matching gives a higher recognition rate than the other methods for only one class, the Rafale. The overall recognition rate of our method is 93.65%, which is 11.04% and 5.75% better than the SIFT+matching and MSKD+matching methods respectively. This improvement demonstrates the power of the combination of the MSKD and the multitask sparse classifier for recognizing ISAR images.

Databases description
Regarding the SAR images, the moving and stationary target acquisition and recognition (MSTAR) public dataset is used. It was developed by the Air Force Research Laboratory (AFRL) and the Defense Advanced Research Projects Agency (DARPA). The SAR images in this dataset are gathered by an X-band SAR sensor in spotlight mode. The MSTAR dataset includes multiple ground targets. Samples of each military ground target class of this dataset are displayed in Figure 7. Two major versions are available for this dataset:
• SAR images under standard operating conditions (SOC, see Table 2). In this version, the training SAR images are obtained at a 17° depression angle and the test ones at a 15° depression angle. Hence, there is a depression angle difference of 2°.
• SAR images under extended operating conditions (EOC), including:

- The configuration variations (EOC-1, see Table 3). The configuration refers to small structural modifications and physical differences. Similarly to the SOC version, the training and test targets are captured at 17° and 15° depression angles respectively.

- The depression variations (EOC-2, see Table 4). The SAR images acquired at a 17° depression angle are exploited for training, while the ones taken at 15°, 30° and 45° depression angles are used for testing.
The main difference between the SOC and EOC versions is that in the SOC, the conditions of the training and test sets are very close, contrary to the EOC. We note that, unlike the ISAR image database, the MSTAR dataset is already partitioned into training and test sets.

Target recognition results
We provide in Table 5 the quantitative comparison between the different methods on several versions of the MSTAR dataset. As can be seen from this table, the MSKD performs much better than the use of all SIFT keypoints. The reason is that not all SIFT keypoints are useful to characterize the SAR images, which is remedied by the adopted filtering method, as illustrated in Figure 2. This conclusion is an important motivation for coupling the saliency attention and the SIFT. Additionally, the use of the multitask SRC leads to a clear superiority over the matching approach, thanks to the sparse vectors extracted from each task in the MSKD. Considering the 10-class ground targets (SOC), the recognition rate of the proposed method is 80.35%, which is 35.17% and 32.52% better than the SIFT+matching and the MSKD+matching. The confusion matrices of all methods are displayed in Figure 8. The proposed method shows confusion between the BMP2, BRDM2 and BTR70 targets because they have the same vehicle type, namely armored personnel carriers. It is not able to classify the BMP2 target correctly. However, it has the highest recognition rates for all classes compared to the remaining methods.
Regarding the configuration variations (EOC-1), the highest recognition rate of 84.54% is achieved by the proposed method compared to the remaining ones. It is an improvement of 40.14% and 17.22% over the competitors. We give in Table 6 the recognition rates per class under the EOC-1 version. We remark that the recognition rates of BMP2 and T72 are clearly lower than that of BTR70. This is because they have many variants included in the test set that do not exist in the training set. We note also that the MSKD-MSRC holds a remarkable superiority on the BTR70 and T72 classes. However, it performs poorly on the BMP2 class. Moreover, the SIFT+matching method is not able to recognize any image of the BTR70 class. For the depression variations (EOC-2), the recognition rate is sharply degraded for all methods when the depression angle increases. That is due to the variation between the depression angles, especially in the cases of 30° and 45°, which represent changes of 13° and 28° compared to the training targets captured at 17°. For instance, the recognition rate of the proposed method drops from 84.18% to 68.58% and then to 36.32%. The MSKD-MSRC still achieves the highest recognition rate at the 15° and 30° depression angles, whereas the SIFT+matching and the MSKD+matching work better in the case of the 45° depression angle. Table 7 records the confusion matrices of all methods under EOC-2 using the different depression angles. The proposed method gives a balanced recognition rate per class with low misclassification values. However, for the 45° depression angle, we observe a high confusion between BRDM2 and 2S1, which drastically degrades the overall recognition rate. The 30° depression angle shows the same trend, but in a more moderate way than the 45° case.

Conclusions and Future Work
This paper proposed a new generic algorithm called MSKD-MSRC for radar target recognition in ISAR/SAR images. Our approach represents each radar image with a set of multiple salient keypoint descriptors (MSKD) located in the target area. For each test descriptor in the MSKD, a sparse reconstruction error (residual) for each class is computed. After that, we sum all obtained residuals over the whole MSKD for each class. The class with the minimum residual is assigned to the test image. Extensive experiments were conducted on the ISAR image database and on the different versions of the MSTAR dataset. We notice that the SIFT with matching performs worst in the cases of ten and eleven classes. However, it is a competitive method in the case of three classes. Considering the EOC conditions, the difference in depression angles generally has more influence on the recognition rate than the configuration variations. From all the performance comparisons in the experimental results, it can be concluded that although our approach does not provide the highest recognition rate for every class, it achieves in most cases the highest overall recognition rate with a balanced performance over all ISAR and SAR classes in a reasonable runtime. Additionally, it effectively deals with the challenge of target recognition under EOC, with a slight degradation in the case of EOC-2 (45°). Considering the minor flaws of the proposed method, future work will focus on using other local descriptors as well as testing the proposed system on other radar image databases, such as those acquired in a maritime environment.

Conflicts of Interest:
The authors declare no conflict of interest.