Automatic Image Matting of Synthetic Aperture Radar Target Chips

A matting technique to extract targets from synthetic aperture radar (SAR) images is presented. Binary segmentation is first performed to roughly identify the target boundaries. A trimap is then estimated by combining the boundary structures of the input and segmented images using a guided filter. To improve the accuracy of the estimated trimap, super-pixel-based segmentation is performed. A propagation-based matting algorithm is then applied to separate the target from the non-target region. Simulations conducted on different SAR images from the MSTAR database show the significance of the proposed technique.


Introduction
Automatic extraction of targets from images (acquired from different sensors) plays a key role in recognition and classification. Generally, two types of data are used for this purpose: optical and radar data. Figure 1 (a) shows an optical image, which comprises visible wavebands and is similar to how the human eye perceives the world. Images acquired from synthetic aperture radar (SAR) sensors, in contrast, are of great importance because they are suited to day-night operation and all-weather conditions. An example of an X-band SAR image is shown in Fig. 1 (b). It can be observed that the speckle noise present in the SAR image greatly degrades its quality, making interpretation difficult.
Target extraction from a SAR image requires a 'target chip', which is obtained after detecting and discriminating the targets in the SAR image. Target detection is performed on the entire SAR image, which may contain multiple targets, and extracts the regions that potentially contain a target. In the case of stationary targets, screening techniques can be utilized to detect multiple targets in a SAR scene while minimizing false alarms [1]. After detection and discrimination of target regions, a target chip is acquired that contains background clutter and the actual target (along with its shadow), which lies at the centre of the chip [2]. This paper aims to improve the accuracy of target extraction from background clutter for effective target recognition and classification. Figure 2 shows the screening process used to identify potential targets and the target chip obtained by extracting the region of interest.
In the past decade, different methods have been developed for target extraction from SAR images. Samanta and Sanyal [3] proposed an adaptive threshold technique based on region merging to segment SAR images. The technique produces satisfactory results for simple images but fails to perform well under complex background and noise conditions. Amoon and Rezai-Rad [4] and Tan, et al. [5] utilized mean filtering after histogram equalization to reduce the noise level. A threshold value is then estimated to detect the binary target for efficient segmentation of SAR images. However, these techniques fail to preserve useful spatial information and have limited performance for targets with several gray levels. Anagnostopulos [6] and Ding, et al. [7] applied basic thresholding and constant false alarm rate (CFAR) detection, respectively, along with morphological operations for target region extraction; however, the morphological operations fail to preserve the target shape for complex targets.
Huang, et al. [8] proposed wavelet decomposition and CFAR for segmentation of the target and shadow regions from noisy SAR images. Han, et al. [9] proposed a level set approach in which shape priors are incorporated into active contours to extract targets from SAR images; however, the approach is inefficient since it requires user intervention to define the shape priors. R. Zhang and M. Zhang [10] utilized active contours without edges after histogram equalization to segment SAR targets; however, the target boundaries are irregular when the background has a similar color.
Hu, et al. [11] adopted a super-pixel segmentation technique based on statistical region merging [12] and the nonsubsampled contourlet transform [13]. Fuzzy clustering [14] is then used to estimate smooth target boundaries; however, it fails to completely mitigate the effects of noise. Saliency-based target extraction approaches [15], [16] utilize spectral residual and Bayesian-morphological operations; however, these techniques provide limited performance for noisy images. Ambrosanio, et al. [17] proposed a Kolmogorov-Smirnov (KS) test based approach to detect the target; however, it fails to discriminate between the actual target and its shadow. Cho, et al. [2] proposed a matting-based approach to extract the target while preserving the intensity information. The approach is robust but has limited performance for noisy images.
Concisely, existing techniques fail to accurately extract the target from a SAR target chip due to the presence of noise, which makes the interpretation and extraction of targets a challenging task. In this work, a matting technique for extraction of the target from SAR target chips is proposed. Since the aim of this work is to separate the target region from the non-target region, it can be assumed that the target chip is a combination of two layers, i.e., a target layer and a non-target layer. However, in practice, the intensity of SAR images varies due to intensity variation of the sensing environment caused by target reflectance, scattering characteristics or coherent processing of SAR images. The varying intensity of the target area defines the opacity of the target, which can be estimated by image matting. (Matting aims to extract foreground regions from an image by estimating the opacity value of pixels near boundary areas, which are mixed between foreground and background.) Thus, image matting extracts the target area by treating the target as foreground, the non-target region as background and the opacity of the target as the alpha matte. The main contributions of the proposed technique are as follows:
1. The impact of noise is reduced using median filtering for correct identification of the target region.
2. Trimaps are estimated automatically, and their accuracy is improved via a super-pixel-based segmentation approach.
3. The target is accurately extracted from the target chip using a propagation-based matting approach.
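The two-layer view described above corresponds to the standard compositing model used in image matting. As a sketch, with $G$ denoting the target chip and with $F$ and $N$ introduced here as hypothetical symbols for the target and non-target layers (they do not appear in the original text):

```latex
G(m, n) = \alpha(m, n)\, F(m, n) + \bigl(1 - \alpha(m, n)\bigr)\, N(m, n),
\qquad 0 \le \alpha(m, n) \le 1
```

Pixels with $\alpha = 1$ belong entirely to the target, pixels with $\alpha = 0$ belong entirely to the non-target region, and intermediate values describe mixed pixels near the boundary.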

Proposed Methodology
Image matting deals with the problem of separating foreground objects from the background by estimating the opacity values of the foreground objects. Since it is an ill-posed problem, user intervention is required to generate a trimap (which divides the image into foreground, background and unknown regions). However, user interaction is difficult and time-consuming when processing SAR target chips. This paper proposes an automatic trimap generation technique to solve this problem (as represented in the flowchart shown in Fig. 3). For that purpose, correct identification of the target region is required. Let $G$ be the input SAR image chip with dimensions $m \times n$. The first step is to apply a median filter to mitigate the effect of speckle noise, which limits the ability to detect ground targets,
$$M = \xi(G, \phi),$$
where $\xi$ indicates median filtering and $\phi$ represents the filter size (8 × 8). Decreasing the filter size removes some of the noise but cannot remove it completely, while increasing the filter size over-smooths the image.
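The median filtering step can be illustrated with a minimal pure-Python sketch. The function name `median_filter`, the border handling (the window is clipped at the image edge) and the toy 5 × 5 input are assumptions made for illustration; the original text only specifies the filter size.

```python
from statistics import median


def median_filter(image, size=3):
    """Return a median-filtered copy of a 2-D list of numbers.

    Border pixels use only the part of the window that lies inside the
    image (an assumption; the source does not describe border handling).
    """
    rows, cols = len(image), len(image[0])
    lo = size // 2          # window extent above/left of the centre pixel
    hi = size - lo - 1      # window extent below/right of the centre pixel
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - lo), min(rows, r + hi + 1))
                for cc in range(max(0, c - lo), min(cols, c + hi + 1))
            ]
            out[r][c] = median(window)
    return out


# Toy chip: a flat 5 x 5 patch with one speckle-like outlier at the centre.
G = [[10] * 5 for _ in range(5)]
G[2][2] = 255
M = median_filter(G, size=3)
```

In this toy case the isolated bright pixel is replaced by the surrounding background value. A library routine would normally be used on real chips; the loop form is shown only to make the windowing explicit.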
In SAR images, the target region has a high mean intensity as well as a high standard deviation (due to the increased variability of pixel intensities caused by edges and scattering centres). Based on these characteristics, a patch from the corner of $G$, representing the background area, is selected (since the target lies roughly at the image centre) to identify the actual target area. The threshold value is computed from the statistics of this patch, where $\mu$ and $\delta$ represent the mean and standard deviation of the patch, respectively, and $k = 0.9$ is the bias.
The threshold is applied to each pixel of image $M$ to estimate the binary target image $B$. The threshold correctly identifies the foreground target but also leaves some noise in the non-target region. Thus, a morphological opening operation is used to remove unwanted artifacts and noise,
$$H = (B \ominus d_1) \oplus d_2,$$
where $\ominus$ and $\oplus$ represent the erosion and dilation operators, respectively, with disk structuring elements $d_1$ and $d_2$. Figure 4 shows the result of binary segmentation for initial target detection.
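The opening step (erosion followed by dilation with disk-shaped structuring elements) can be sketched in pure Python as below. The function names, the radii and the toy mask are illustrative assumptions; the radii of the structuring elements are not given in this text.

```python
def disk(radius):
    """Offsets (dr, dc) that make up a disk-shaped structuring element."""
    return [
        (dr, dc)
        for dr in range(-radius, radius + 1)
        for dc in range(-radius, radius + 1)
        if dr * dr + dc * dc <= radius * radius
    ]


def _probe(mask, offsets, combine):
    """Apply `combine` (all or any) over the element placed at each pixel.

    Pixels outside the image are treated as 0.
    """
    rows, cols = len(mask), len(mask[0])

    def at(r, c):
        return mask[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    return [
        [int(combine(at(r + dr, c + dc) for dr, dc in offsets)) for c in range(cols)]
        for r in range(rows)
    ]


def erode(mask, offsets):
    return _probe(mask, offsets, all)


def dilate(mask, offsets):
    return _probe(mask, offsets, any)


def opening(mask, r1=1, r2=1):
    """Erosion with a disk of radius r1, then dilation with radius r2."""
    return dilate(erode(mask, disk(r1)), disk(r2))


# Toy binary image: a 4 x 4 target block plus one isolated noise pixel.
B = [[0] * 8 for _ in range(8)]
for r in range(2, 6):
    for c in range(2, 6):
        B[r][c] = 1
B[0][7] = 1

H = opening(B, r1=1, r2=1)
```

In this example the isolated pixel disappears while the bulk of the block survives; the four block corners are also trimmed, which is the usual side effect of opening with a small disk.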
After the target and non-target areas have been identified, a trimap is generated by passing the binary mask $H$ and the guidance image $M$ through a guided filter, where $\omega$ represents the local window size of the guided filter.
The initial trimap $T_o$ is defined from the guided filter output, where $T_o(m, n) = 1$ represents the foreground region, $T_o(m, n) = 0$ represents the background region and $T_o(m, n) = 0.5$ defines the unknown region.
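For reference, the guided filter in its standard form fits a local linear model between the guidance image and the output in every window. The sketch below writes this with $H$ as the filter input and $M$ as the guidance. The output symbol $Q$, the window index $k$, the pixel index $i$ and the regularization constant $\epsilon$ are assumptions that do not appear in the original text, and no value of $\epsilon$ is given there.

```latex
\begin{aligned}
a_k &= \frac{\frac{1}{|\omega|}\sum_{i \in \omega_k} M_i H_i - \bar{M}_k \bar{H}_k}
            {\sigma_k^2 + \epsilon},
\qquad
b_k = \bar{H}_k - a_k \bar{M}_k, \\
Q_i &= \bar{a}_i M_i + \bar{b}_i
\end{aligned}
```

Here $\bar{M}_k$ and $\bar{H}_k$ are the means of $M$ and $H$ in window $\omega_k$, $\sigma_k^2$ is the variance of $M$ in that window, and $\bar{a}_i$, $\bar{b}_i$ are the averages of $a_k$, $b_k$ over all windows that contain pixel $i$. Because the output follows the edges of $M$, the sharp 0/1 boundary of $H$ becomes a soft transition band around the target.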
To further refine the trimap by reducing the unknown region, the image $M$ is passed through the simple linear iterative clustering (SLIC) algorithm [18], which groups the image pixels on the basis of color and texture similarities and represents them as super-pixels (Fig. 5(b)).
Let $\vartheta$ denote the total number of super-pixels, let $\chi^{(i)}$ be the $i$-th super-pixel (where $i = 1, 2, 3, \ldots, \vartheta$), and let $T_o^{(i)}(m, n)$ represent the pixels of the initial trimap in the $i$-th super-pixel. The improved trimap $T^{(i)}(m, n)$ is defined as
$$T^{(i)}(m, n) = \begin{cases} 1, & \bar{T}_o^{(i)} > \gamma_1 \\ 0, & \bar{T}_o^{(i)} < \gamma_2 \\ 0.5, & \text{otherwise,} \end{cases}$$
where $\bar{T}_o^{(i)}$ represents the mean value of the pixels belonging to the $i$-th super-pixel of the initial trimap $T_o$, and $\gamma_1 = 0.8$ and $\gamma_2 = 0.2$ are empirically selected constants with $\gamma_1 > \gamma_2$. Increasing the threshold $\gamma_1$ causes many definite foreground pixels to be marked as unknown, whereas lowering $\gamma_1$ may cause unknown pixels to become part of the definite foreground. Similarly, increasing the threshold $\gamma_2$ causes many unknown pixels to become part of the background region, while lowering $\gamma_2$ leaves many definite background pixels as unknown.
The final trimap $T(m, n)$ is obtained by assigning the value 1 to the trimap pixels that belong to super-pixel $i$ if their mean value is greater than the threshold $\gamma_1$. Similarly, the value 0 is assigned to the trimap pixels that are part of super-pixel $i$ if their mean value is less than the threshold $\gamma_2$, and 0.5 is assigned to all the remaining pixels in the super-pixel region.
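The super-pixel refinement rule can be sketched as follows. The function name `refine_trimap` and the hand-written label map are hypothetical; in practice the labels would come from a SLIC implementation, which is not shown here.

```python
from collections import defaultdict


def refine_trimap(trimap, labels, gamma1=0.8, gamma2=0.2):
    """Relabel every super-pixel of the initial trimap from its mean value.

    trimap: 2-D list with values 0, 0.5 or 1 (initial trimap).
    labels: 2-D list of super-pixel indices with the same shape.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t_row, l_row in zip(trimap, labels):
        for t, lab in zip(t_row, l_row):
            sums[lab] += t
            counts[lab] += 1

    new_value = {}
    for lab in sums:
        m = sums[lab] / counts[lab]  # mean trimap value in this super-pixel
        if m > gamma1:
            new_value[lab] = 1.0     # definite foreground
        elif m < gamma2:
            new_value[lab] = 0.0     # definite background
        else:
            new_value[lab] = 0.5     # still unknown
    return [[new_value[lab] for lab in l_row] for l_row in labels]


# Hypothetical label map with three super-pixels of four pixels each.
labels = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 2, 2],
]
T_o = [
    [1, 1.0, 0.5, 1, 0, 0.0],
    [1, 0.5, 0.5, 0, 0, 0.5],
]
T = refine_trimap(T_o, labels)
```

Super-pixel 0 has mean 0.875 and becomes foreground, super-pixel 2 has mean 0.125 and becomes background, and super-pixel 1 (mean 0.5) stays unknown, so the unknown band shrinks from four pixels to the single mixed super-pixel.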
The target chip consists of a target region (a connected region with no holes) and the background clutter. Based on these characteristics, propagation-based matting is applied to define affinities between neighbouring pixels based on intensity similarity, instead of sampling-based matting, which is useful for non-distributed color distribution. The K-nearest neighbours (KNN) matting algorithm [21] is applied to the source image $G$ with the refined trimap $T$ to generate the alpha matte, which separates the target and non-target regions.

Results and Discussion
The proposed technique is simulated using the publicly available MSTAR database [22], which contains X-band SAR images. The images are captured at different depression angles and orientations, with the target lying at the centre of the image. The targets do not have clear edges, and the presence of speckle noise in SAR images makes target detection and extraction a more challenging task.
The simulations are conducted on an Intel Core i7-3520M CPU (2.90 GHz) with 8 GB RAM using MATLAB R2014a. The proposed technique is analyzed visually and quantitatively by applying different matting techniques, including closed form matting (CFM) [19], Kullback-Leibler divergence based sparse matting (KL sparse matting) [20] and KNN matting [21]. Figure 6 shows the benefit of the super-pixel-based trimap refinement step. Figure 6 (c) shows the trimap estimated using guided filtering, and the mattes obtained by applying CFM [19], KL sparse matting [20] and KNN matting [21] are shown in Fig. 6 (d), (e) and (f), respectively. The results clearly show errors around the boundaries of the target. However, the matting results produced after refining the trimap using super-pixels (Fig. 6 (g)), shown in Fig. 6 (h), (i) and (j), are much improved, which enhances the target extraction accuracy.
The comparison of different matting techniques with the proposed trimap is shown in Fig. 7. In CFM [19], α is estimated from pixel affinities based on the color line model. KL sparse matting [20] collects a sparse set of definite foreground/background samples to compute the value of each unknown pixel. KNN matting [21] is based on the non-local principle and utilizes a closed-form solution to find pixel affinities among the k nearest neighbours. Figure 7 (a) shows the input target chips, which contain the targets lying at the centre of the chip along with the background clutter. The ground truths, indicating the actual target with the background removed, are shown in Fig. 7 (b). The performance of the different matting techniques in extracting targets from SAR images is then analyzed. It can be observed that, owing to accurate trimap estimation, all the matting techniques extract the target without losing information. However, CFM [19] produces over-smoothed results along the boundaries of the extracted target. Moreover, KL sparse matting [20] fails to completely remove noise artifacts from the background and thus produces inaccurate results.
Figure 7: (a) Input target chips (b) Ground truths (c) CFM [19] with proposed refined trimap (d) KL-Matting [20] with proposed refined trimap (e) KNN-Matting [21] with proposed refined trimap.
In comparison, KNN matting [21] extracts the target better with the refined trimap, without losing target information and with less noise along the target boundaries. Table 1 shows the confusion matrix of the proposed technique against the ground truth. A total of 1000 images, each containing a target along with its shadow and background clutter, were selected from the MSTAR database. The proposed technique correctly extracted the target and non-target regions for 973 images. There were 20 cases where targets were incorrectly identified as non-targets, and 7 cases where some non-target region was incorrectly identified as target region. Therefore, the recall score (ratio of actual targets that are correctly extracted) of the proposed technique is 97.9%.
The specificity (ratio of non-target regions that are correctly identified as non-target) is 99.2%. The precision (positive predictive value) of the proposed technique is 99.2% and the negative predictive value is 97.9%. The overall accuracy of the proposed technique is 98.6%.
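The five scores quoted above follow the usual definitions for a 2 × 2 confusion matrix. The sketch below spells those definitions out; the function name and the counts are illustrative assumptions, and the counts are deliberately not the entries of Table 1.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return summary scores for a 2 x 2 confusion matrix.

    tp: targets labelled as target          fn: targets labelled as non-target
    fp: non-targets labelled as target      tn: non-targets labelled as non-target
    """
    return {
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


# ILLUSTRATIVE counts only; these are not taken from Table 1.
scores = confusion_metrics(tp=90, fn=10, fp=5, tn=95)
```

With these toy counts the recall is 0.90, the specificity 0.95 and the accuracy 0.925.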
To verify the accuracy of the extracted target maps, the mean square error (MSE) and peak signal-to-noise ratio (PSNR) are computed by comparing the matting results with the ground truths.
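A minimal sketch of the two measures is given below. The function names are hypothetical, and the peak value of 1.0 assumes mattes and ground truths scaled to [0, 1], which the original text does not state.

```python
import math


def mse(estimate, truth):
    """Mean square error between two equally sized 2-D lists."""
    diffs = [
        (e - t) ** 2
        for e_row, t_row in zip(estimate, truth)
        for e, t in zip(e_row, t_row)
    ]
    return sum(diffs) / len(diffs)


def psnr(estimate, truth, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the images match.

    ASSUMPTION: peak=1.0 corresponds to values scaled to [0, 1].
    """
    error = mse(estimate, truth)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)


# Toy 2 x 2 matte with one half-transparent pixel that should be opaque.
alpha = [[1.0, 0.5], [0.0, 0.0]]
truth = [[1.0, 1.0], [0.0, 0.0]]
error = mse(alpha, truth)
quality = psnr(alpha, truth)
```

A lower MSE and a higher PSNR both indicate a matte that is closer to the ground truth.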

Conclusion
A matting technique to extract the target from SAR target chips is proposed. Initially, the technique roughly identifies the target through binary segmentation. A trimap is then estimated by applying a guided filter and is further refined by performing super-pixel-based segmentation. Finally, a propagation-based matting algorithm is applied to separate the target from the non-target region. Visual and quantitative analysis is performed on different SAR images from the MSTAR database to show the significance of the proposed technique.