Spectral-Spatial Anomaly Detection of Hyperspectral Data Based on Improved Isolation Forest

Abstract: Anomaly detection in hyperspectral images is affected by redundant bands and the limited utilization of spectral-spatial information. In this article, we propose a novel Improved Isolation Forest (IIF) algorithm based on the assumption that anomaly pixels are more susceptible to isolation than background pixels. The proposed IIF is a modified version of the Isolation Forest (iForest) algorithm that addresses the poor performance of iForest in detecting local anomalies and in handling high-dimensional data. Further, we propose a spectral-spatial anomaly detector based on IIF (SSIIFD) to make full use of global and local information, as well as spectral and spatial information. Specifically, we first apply the Gabor filter to extract spatial features, which are then employed as input to the Relative Mass Isolation Forest (ReMass-iForest) detector to obtain the spatial anomaly score. Next, the original images are divided into several homogeneous regions via the Entropy Rate Superpixel Segmentation (ERS) algorithm, and the preprocessed images are then employed as input to the proposed IIF detector to obtain the spectral anomaly score. Finally, we fuse the spatial and spectral anomaly scores by combining them linearly to predict anomaly pixels. The experimental results on four real hyperspectral data sets demonstrate that the proposed detector outperforms other state-of-the-art methods.




I. INTRODUCTION
A hyperspectral image (HSI), with hundreds of contiguous bands for each pixel, can provide abundant spectral and spatial information simultaneously [1]. HSIs have been widely applied in many remote sensing applications, such as anomaly detection [2], [3], classification [4], and change detection [5]. Among these applications, hyperspectral anomaly detection has received extensive attention. A wide variety of methods have been developed, which aim at distinguishing outliers, whose spectral and spatial signatures are highly distinct from their surrounding pixels or the global background, in an unsupervised way.

Manuscript received April 15, 2021. This work was supported by the National Natural Science Foundation of China (NSFC) (No. 61801455). (Corresponding author: Bin He.)
Xiangyu Song is with the Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Science (CIOMP), Changchun 130033, China, and also with the University of Chinese Academy of Science, Beijing 100049, China (e-mail: songxiangyu17@mails.ucas.edu.cn).
Sunil Aryal is with the School of Information Technology, Deakin University, 75 Pigdons Rd, Waurn Ponds, VIC, 3216, Australia (e-mail: sunil.aryal@deakin.edu.au).
In the literature, most methods have concentrated on the role of HSI spectral signatures in anomaly detection, employing exclusively the spectrum of a given pixel to determine its outlier status. Statistical model-based techniques form the first category in hyperspectral anomaly detection. One of the most well-known methods is the Reed-Xiaoli (RX) algorithm [6], proposed by Irving S. Reed and Xiaoli Yu, which is regarded as the main benchmark method. The RX detector assumes that the background can be modeled by a multivariate Gaussian distribution. The RX detector has two versions, i.e., the global RX and the local RX (LRX), where LRX models the background with neighborhood pixels. However, most real-world hyperspectral images (HSIs) cover different classes of materials and exhibit complex backgrounds, which means that the Gaussian distribution assumption is an oversimplification. Therefore, several variants of the RX detector have been proposed [7]-[12]. For example, the kernel RX detector [7] is a nonlinear version of RX, which calculates the Mahalanobis distance between the pixels to be tested and the background in a higher dimensional feature space using kernel theory. The cluster-based anomaly detector (CBAD) [8] segments the whole HSI into several clusters and then detects anomalies in each cluster with the RX detector. Zhou et al. proposed a cluster kernel RX detector [12] that accelerates the kernel RX detector by partitioning the whole HSI into several clusters and then employing a fast eigenvalue decomposition algorithm to obtain detection results.
In addition to statistical model-based methods, there are many other types of detectors. For example, the low-rank and sparse representation detector (LRASR) is a typical geometrical modeling-based method proposed in [13], which exploits the low-rank property of background pixels to distinguish the sparse anomaly pixels. The background joint sparse representation (BJSR) detector [14] is a representation-based method that selects the most representative background bases with the joint sparsity model. Background pixels are then suitably represented with the selected bases, whereas anomaly pixels cannot be represented.
Similarly, collaborative representation-, sparse representation-, and tensor representation-based anomaly detectors have also received substantial attention. For example, the prior-based tensor approximation (PTA) detector is a typical tensor representation-based method proposed in [15], which combines priors (i.e., low-rank, sparse, and piecewise smooth) with the advantages of the tensor representation of HSIs. Then, the priors are embedded into the dimensions of a tensor with different regularizations according to certain physical meaning to preserve the global structure while increasing the gap between anomaly and background pixels. Moreover, hyperspectral anomaly detectors based on support vector data description (SVDD) [16], [17], morphological and attribute filters [18], [19], deep learning [20]-[23], etc. have been investigated as well. Additionally, Li et al. proposed a novel kernel isolation forest-based detector (KIFD) [24], [25] according to the isolation forest (iForest) algorithm [26], [27] two years ago. This was the first time that iForest was introduced into remote sensing applications. Subsequently, Wang et al. [28] established a hyperspectral anomaly detector that combined multiple features and iForest (MFIFD) last year. Both methods have been demonstrated to perform well.
Although both the KIFD and MFIFD have been shown to perform well in hyperspectral anomaly detection, we have identified certain weaknesses of iForest in detecting anomalies in high-dimensional data and in detecting local anomalies. The basic motivation of our research is to enhance the detection accuracy by overcoming those two limitations of iForest-based anomaly detectors in hyperspectral anomaly detection. We propose a new improved isolation forest (IIF) algorithm. Furthermore, in this article, a novel spectral-spatial IIF-based detection framework (SSIIFD) is developed. Specifically, the main contributions of this article are as follows:

The remainder of this article is organized as follows: Section II briefly reviews iForest and its two variations, the extraction of spatial features with the Gabor filter, and the entropy rate superpixel segmentation (ERS) algorithm. The proposed method is introduced in detail in Section III. In Section IV, experimental results are presented. Finally, conclusions are discussed in Section V.

A. Isolation Forest and Its Two Variations
The isolation forest (iForest) introduced by Liu et al. [26], [27] is an outlier detector that does not employ distance or density measures. It builds an ensemble of isolation trees (iTrees) for a given data set. The main advantage of this algorithm is that it does not rely on a determined profile representing the data to find samples that do not conform to this profile. Rather, it utilizes the fact that anomalies are 'few and different', which makes them more susceptible to isolation in a binary tree structure than normal points. Hence, anomalies are isolated closer to the root of the tree, whereas normal points are isolated toward the deeper end of the tree. In other words, anomalies exhibit shorter average path lengths than those of normal points over a collection of iTrees. Here, the principle of iForest is briefly reviewed. For more details of the iForest algorithm, we refer readers to [26], [27].

Fig. 3. Architecture for our proposal for hyperspectral anomaly detection with a spectral-spatial joint optimization scheme.

Specifically, in an iForest, data are subsampled and processed in a tree structure based on random cuts in the values of arbitrarily selected features in a given data set. Each tree is grown until each instance is isolated into a leaf node. Samples that travel deeper into the tree branches are less likely to be anomalous, whereas shorter branches are indicative of anomalies. As such, the aggregated lengths of the tree branches provide a measure of anomaly, i.e., an anomaly score, for every given point. To demonstrate that anomalies are more susceptible to isolation under random partitioning, an example of the random partitioning process of a normal point versus an anomaly is shown in Fig. 1. We observe that a normal instance generally requires more separating lines to be isolated, while an anomaly instance generally requires fewer.
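The path-length principle described above can be illustrated with a minimal sketch. It is not the implementation of [26], [27]: the function names, the toy two-dimensional data, the subsample size of 64, and the number of trees are all assumptions chosen for illustration. The adjustment term `c_factor` follows the usual average-path-length correction for leaves that still hold several points.

```python
import math
import random

EULER_GAMMA = 0.5772156649


def c_factor(n):
    """Average path length of an unsuccessful BST search over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + EULER_GAMMA) - 2.0 * (n - 1) / n


def build_itree(points, depth, height_limit, rng):
    """Grow one isolation tree with random axis-parallel cuts."""
    if depth >= height_limit or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))          # random feature
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:                                 # nothing to split on
        return {"size": len(points)}
    split = rng.uniform(lo, hi)                  # random cut value
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_itree(left, depth + 1, height_limit, rng),
            "right": build_itree(right, depth + 1, height_limit, rng)}


def path_length(x, node, depth=0):
    """Depth at which x lands, plus the correction for unsplit leaves."""
    if "dim" not in node:
        return depth + c_factor(node["size"])
    child = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return path_length(x, child, depth + 1)


def mean_path_length(x, data, n_trees=100, subsample=64, seed=0):
    """Average path length of x over an ensemble of iTrees."""
    rng = random.Random(seed)
    limit = math.ceil(math.log2(subsample))
    total = 0.0
    for _ in range(n_trees):
        sample = rng.sample(data, subsample)
        total += path_length(x, build_itree(sample, 0, limit, rng))
    return total / n_trees


# Toy data (hypothetical): one Gaussian background cluster and one outlier.
data_rng = random.Random(1)
background = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(300)]
anomaly = (8.0, 8.0)
data = background + [anomaly]

normal_depth = mean_path_length((0.0, 0.0), data)
anomaly_depth = mean_path_length(anomaly, data)
print(f"normal: {normal_depth:.2f}, anomaly: {anomaly_depth:.2f}")
```

In this toy setting the outlier should obtain a clearly shorter mean path length than a point at the center of the background cluster, which is the behavior Fig. 1 illustrates.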
On the one hand, iForest often yields unsatisfactory results when detecting local anomalies in data sets containing multiple clusters of normal instances, because the local anomalies are masked by normal clusters of similar density. Hence, they become less susceptible to isolation via iTrees. In other words, iForest does not detect local anomalies because the path length measures the degree of anomaly globally; it does not consider how isolated an instance is from its local neighborhood. To address this problem, Aryal et al. [29], building on mass estimation theory [30], developed ReMass-iForest by replacing the global ranking measure based on path length with a local ranking measure based on relative mass that takes the local data distribution into consideration. ReMass-iForest applies the same implementation of iTrees as that of iForest. Empirical evaluations have indicated that ReMass-iForest performs better than iForest in terms of task-specific performance.
On the other hand, only one data dimension is randomly selected in every partition. In other words, the applied branch cuts are simply parallel to the coordinate axes, which results in certain regions, not necessarily containing many data points, ending up with many branch cuts. As such, most dimensions of the data are not considered when building iTrees, which reduces the reliability of the algorithm, especially for high-dimensional problems with a large number of attributes. Hariri et al. [31] presented an extension to iForest, namely the extended isolation forest (EIF), which uses hyperplanes with random slopes (non-axis-parallel) to split data in the creation of iTrees and thereby resolves the issues associated with the assignment of anomaly scores to given data points. The results of EIF are more reliable and robust, and in some cases more accurate, on a given data set.
In recent years, iForest has been successfully applied in remote sensing applications. Specifically, iForest was first introduced into the hyperspectral anomaly detection field by Li et al. [24], [25]. In addition, Wang et al. [28] proposed a hyperspectral anomaly detector combining multiple features and iForest. Although both methods are shown to perform well, we have identified their weakness in anomaly detection in HSIs (i.e., high-dimensional data) containing hundreds of spectral bands and multiple clusters of background pixels. In this article, we develop a novel improved iForest method optimized for hyperspectral anomaly detection, namely, the IIF-based anomaly detector (IIFD), by combining spatial texture information and spectral characteristics.

B. Gabor Filter
The Gabor filter, which is a sinusoidal function modulated by a Gaussian envelope, has been widely adopted in various applications of computer vision and image processing [32], [33]. The Gabor filter captures certain physical structures of an object in an image, such as specific orientation information, based on a spatial convolution kernel. In recent years, Gabor filters have been successfully applied in hyperspectral classification [34], [35]. The most important advantage of Gabor filters is their invariance to rotation, scale, and translation. Furthermore, they are robust against photometric disturbances, such as illumination changes and image noise. Hence, with these Gabor features, the spatial texture information of HSIs can be effectively represented.
In a two-dimensional $(x, y)$ coordinate system, the Gabor filter, including real and imaginary components, can be represented as:
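One widely used parameterization of the complex Gabor kernel is sketched below. It is given as an assumption rather than as the authors' exact notation: the symbols $\lambda$ (wavelength), $\theta$ (orientation), $\varphi$ (phase offset), $\sigma$ (width of the Gaussian envelope), and $\gamma$ (spatial aspect ratio) are not taken from the source.

```latex
g(x, y) = \exp\!\left(-\frac{x'^{2} + \gamma^{2} y'^{2}}{2\sigma^{2}}\right)
          \exp\!\left(i\left(\frac{2\pi x'}{\lambda} + \varphi\right)\right),
\qquad
x' = x\cos\theta + y\sin\theta, \quad
y' = -x\sin\theta + y\cos\theta
```

Under this form, the real component is the Gaussian envelope multiplied by $\cos(2\pi x'/\lambda + \varphi)$ and the imaginary component is the same envelope multiplied by $\sin(2\pi x'/\lambda + \varphi)$.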

C. Entropy Rate Superpixel Segmentation
A superpixel segmentation algorithm, as a preprocessing step, should exhibit low computational complexity and adhere well to object boundaries. Liu et al. [36] proposed the ERS algorithm, which selects the graph topology that maximizes an objective function under a matroid constraint. Specifically, the objective function comprises two components: the entropy rate of a random walk on a graph and a balancing term. A matroid is a combinatorial structure that generalizes the concept of linear independence in vector spaces. Furthermore, in [36], given an undirected graph $G = (V, E)$, where $V$ is the vertex set and $E$ is the edge set, the graph is partitioned into connected subgraphs by choosing a subset of edges $A \subseteq E$ such that the resulting graph $G' = (V, A)$ consists of smaller connected components or subgraphs. The objective function of the ERS algorithm combines the entropy rate $H(A)$ and the balancing term $B(A)$:

$$\max_{A} \; \operatorname{tr}\left\{ H(A) + \mu B(A) \right\} \quad \text{s.t.} \quad A \subseteq E \tag{5}$$

where $\mu \geq 0$ is the weight of the balancing term, and $\operatorname{tr}(\cdot)$ denotes the trace of a square matrix. The entropy rate $H(A)$ favors the formation of compact and homogeneous clusters, whereas the balancing term $B(A)$ encourages clusters of similar sizes. A greedy optimization scheme for the problem expressed in (5) is given in [37].

III. PROPOSED METHOD
Given an HSI, in practical applications, the detection result will be improved when considering both spatial and spectral information [38], which is beneficial for noise suppression and discrimination enhancement between anomalies and the background in HSIs. The proposed SSIIFD framework is designed to detect anomaly pixels by measuring spectral and spatial anomaly scores for every pixel. A schematic of the proposed framework is shown in Fig. 3, which consists of the following three parts:
1) The Gabor filter is applied to extract spatial information from the principal component analysis (PCA)-projected subspace. Gabor features are then employed as the input to the ReMass-iForest detection algorithm to obtain the spatial anomaly score (Part Ⅰ).
2) The original HSI is divided into several homogeneous regions via the ERS algorithm [36], which are denoted by matrices whose rows are spectral vectors of pixels. The proposed IIFD is then applied to these high-dimensional matrices to obtain the spectral anomaly score (Part Ⅱ).
3) Finally, we fuse the detection results by linearly combining the obtained spatial and spectral anomaly scores to predict the anomaly pixels given the input HSI (Part Ⅲ).

A. Gabor Feature Extraction
Let $X \in \mathbb{R}^{N \times D}$ denote the input HSI data, where $N$ is the number of pixels, and $D$ is the number of spectral bands. To extract the Gabor feature [34] of each pixel, we first obtain the projection matrix $W \in \mathbb{R}^{D \times C}$ by solving the following PCA model:

$$\max_{W} \; \operatorname{tr}\left(W^{T} X^{T} X W\right) \quad \text{s.t.} \quad W^{T} W = I$$

where $I \in \mathbb{R}^{C \times C}$ denotes the identity matrix, and $\operatorname{tr}(\cdot)$ denotes the trace of a square matrix. The top $C$ principal components of the HSI are defined as:

$$P = \left(X - \operatorname{mean}(X)\right) W$$

where $\operatorname{mean}(\cdot)$ denotes the mean function. The principal components $P$ are then convolved with Gabor filters [39] with different orientations and scales. Finally, the filtering coefficients are extracted as the Gabor feature of each pixel. The Gabor feature matrix is represented as $G \in \mathbb{R}^{N \times K}$, where $K$ is determined by the number of principal components $C$ and the orientations and scales of the Gabor filter. In this article, we employ forty Gabor filters in five scales and eight orientations and then apply these filters to the top principal component of the input HSI. Hence, $K = 5 \times 8 \times 1 = 40$.
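The construction of a bank with five scales and eight orientations can be sketched as follows. This is only an illustration under stated assumptions: the kernel size, the wavelength schedule, the ratio between the envelope width and the wavelength, the aspect ratio, the zero padding, and the use of the response magnitude as the feature are all hypothetical choices, not values reported in the source. The input `image` stands in for the first principal component of an HSI.

```python
import cmath
import math


def gabor_kernel(wavelength, theta, sigma, gamma=0.5, size=7):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * cmath.exp(1j * 2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def gabor_bank(n_scales=5, n_orientations=8):
    """One kernel per (scale, orientation) pair, 5 x 8 = 40 by default."""
    bank = []
    for s in range(n_scales):
        wavelength = 2.0 * 2 ** (s / 2)          # hypothetical scale schedule
        for o in range(n_orientations):
            theta = o * math.pi / n_orientations
            bank.append(gabor_kernel(wavelength, theta, sigma=0.56 * wavelength))
    return bank


def gabor_features(image, bank):
    """Magnitude of each filter response at each pixel (zero padding)."""
    rows, cols = len(image), len(image[0])
    feats = [[[] for _ in range(cols)] for _ in range(rows)]
    for kernel in bank:
        half = len(kernel) // 2
        for r in range(rows):
            for c in range(cols):
                acc = 0j
                for dy in range(-half, half + 1):
                    for dx in range(-half, half + 1):
                        rr, cc = r - dy, c - dx  # convolution indexing
                        if 0 <= rr < rows and 0 <= cc < cols:
                            acc += kernel[dy + half][dx + half] * image[rr][cc]
                feats[r][c].append(abs(acc))
    return feats


# Toy 8 x 8 "principal component" image (hypothetical values).
image = [[float((r * c) % 5) for c in range(8)] for r in range(8)]
bank = gabor_bank()
features = gabor_features(image, bank)
print(len(bank), len(features[0][0]))
```

Each pixel ends up with one coefficient per filter, so a single principal component yields a 40-dimensional Gabor feature vector, matching the count stated above.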

B. Constructing ReMass-iForest for Anomaly Detection in the Spatial Domain of HSIs
Given input HSI data $X \in \mathbb{R}^{N \times D}$, as mentioned before, ReMass-iForest applies exactly the same implementation of iTrees as that of iForest [29]. Each iTree is constructed from a small random subsample $X' \in \mathbb{R}^{\psi \times D}$, $\psi < N$, where $X'$ denotes $\psi$ pixels randomly selected from the input $X$. Let $X'_q$ denote all the $q$th band pixels of $X'$, and let $p$ denote a randomly selected value between the minimum and maximum of $X'_q$. We recursively divide $X'$ into two nonempty child nodes by randomly selecting a band $q$ and a split value $p$, where $q$ is a number between 1 and $D$. Specifically, if $x_{iq}$ is smaller than $p$, the $i$th selected pixel is divided into the left node, and vice versa $(0 < i \leq \psi)$. A branch stops splitting when the height of the iTree reaches the height limit $\log_2 \psi$ or the number of pixels in each node equals 1. The iTree construction process is repeated $t$ times, which indicates that the iForest comprises $t$ iTrees.
Here, we give a graphical interpretation of the structure of an iTree in Fig. 2, inspired by [24]. Each node represents a single pixel or a number of pixels with similar spectral values. Furthermore, we provide details of the construction of ReMass-iForest in Algorithms 1 and 2.

C. Constructing Improved Isolation Forest for Anomaly Detection in the Spectral Domain of HSIs
As reviewed in the previous section, the ReMass-iForest method addresses the failure of iForest to detect local anomalies by using a local ranking measure based on relative mass. The EIF method resolves the poor iForest performance on high-dimensional data by using hyperplanes with random slopes to split data in iTree construction. Because HSI data are high-dimensional and have complex backgrounds, iForest-based hyperspectral anomaly detectors face two key challenges: 1) the detection of local anomaly pixels in a complex background; 2) the selection of more separable bands during iTree construction. To address the first challenge, the proposed IIF algorithm, like ReMass-iForest, uses relative mass to formulate anomaly scores; the two differ in how they construct their iTrees.
Regarding the second challenge, the proposed IIF algorithm selects a subset of bands that contains more discriminative and informative features between the anomaly and background at each branching step in the process of building an iTree. Specifically, let $X_j$ denote all the $j$th band pixels of HSI data $X$, and let $X_j^{a}$ and $X_j^{b}$ denote the anomaly pixels and background pixels, respectively, of $X_j$, while a threshold $\tau$ is required to separate all pixels into $X_j^{a}$ and $X_j^{b}$. We propose a separability criterion inspired by [40], which is defined as:

$$\operatorname{Sep}(X_j) = \frac{\sigma(X_j) - \operatorname{avg}\left(\sigma(X_j^{a}), \sigma(X_j^{b})\right)}{\sigma(X_j)} \tag{7}$$

where $X_j^{a} \cup X_j^{b} = X_j$; $\sigma(\cdot)$ is the standard deviation function and $\operatorname{avg}(a, b)$ simply returns $(a + b)/2$. This criterion is normalized using $\sigma(X_j)$, and in terms of the standard deviation calculation, a reliable one-pass solution with low computational cost can be found in [41]. As a result, we obtain a separability index for each band to determine its separability in the identification of background and anomaly pixels. Let $\operatorname{Sep}(X_j)$ denote the separability index of the $j$th band, and let $\tau_j$ denote the best corresponding threshold, where $1 \leq j \leq D$. Once every band separability index in the given HSI data has been calculated with (7), these separability indexes can be ranked in descending order of their $\operatorname{Sep}(\cdot)$ value. The top $k$ bands of the list are chosen as the high-value band subset, denoted as $B_h$, whereas the other bands are regarded as the redundant band subset, denoted as $B_r$. Therefore, for a given $D$-band HSI, inspired by [31], the branching criterion in terms of data splitting for a given pixel $x = (x_1, x_2, \cdots, x_D)$ is as follows:

$$(x - e) \cdot n \leq 0 \tag{8}$$

where $e$ denotes a randomly selected value between the minimum and maximum of $X$, and $n$ is a $D$-dimensional normal vector, which is obtained by drawing a random number for each coordinate. Then, the coordinates of $n$ corresponding to $B_r$ are set to zero; e.g., if the fifth, ninth, and seventieth bands are regarded as redundant bands with (7), the fifth, ninth, and seventieth components of vector $n$ are set to zero. Furthermore, if the condition is satisfied, pixel $x$ is divided into the left node. Otherwise, it is moved down to the right node. These processes are described in more detail in Algorithms 3 and 4.
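The band selection and branching steps can be sketched as follows. This is a hedged illustration, not Algorithms 3 and 4: the exhaustive threshold search, the helper names (`separability`, `select_bands`, `random_normal`, `random_intercept`, `goes_left`), the use of a per-band intercept vector for $e$, the standard normal draws for the coordinates of $n$, and the toy three-band pixels are all assumptions.

```python
import random
import statistics


def separability(values):
    """Largest normalized reduction in standard deviation over all thresholds.

    Returns (index, threshold) for one band, following the form of (7).
    """
    total_sd = statistics.pstdev(values)
    if total_sd == 0.0:                      # constant band: nothing to separate
        return 0.0, values[0]
    ordered = sorted(values)
    best_gain, best_tau = 0.0, ordered[0]
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            continue                         # no threshold fits between equal values
        low, high = ordered[:i], ordered[i:]
        avg_sd = 0.5 * (statistics.pstdev(low) + statistics.pstdev(high))
        gain = (total_sd - avg_sd) / total_sd
        if gain > best_gain:
            best_gain = gain
            best_tau = 0.5 * (ordered[i - 1] + ordered[i])
    return best_gain, best_tau


def select_bands(pixels, k):
    """Rank bands by separability index and keep the top k."""
    n_bands = len(pixels[0])
    scores = [separability([p[j] for p in pixels])[0] for j in range(n_bands)]
    ranked = sorted(range(n_bands), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


def random_normal(n_bands, kept_bands, rng):
    """Random normal vector whose redundant-band coordinates are zeroed."""
    kept = set(kept_bands)
    return [rng.gauss(0.0, 1.0) if j in kept else 0.0 for j in range(n_bands)]


def random_intercept(pixels, rng):
    """One random value per band between that band's minimum and maximum."""
    n_bands = len(pixels[0])
    return [rng.uniform(min(p[j] for p in pixels), max(p[j] for p in pixels))
            for j in range(n_bands)]


def goes_left(pixel, intercept, normal):
    """Branching test of (8): (x - e) . n <= 0 sends the pixel left."""
    return sum((x - e) * n for x, e, n in zip(pixel, intercept, normal)) <= 0.0


# Toy pixels (hypothetical): band 0 separates two anomalies, band 2 is constant.
background = [(0.2 + 0.001 * (i % 5), (i * 7 % 22) / 22.0, 0.5) for i in range(20)]
anomalies = [(0.9, 0.3, 0.5), (0.9, 0.6, 0.5)]
pixels = background + anomalies

rng = random.Random(0)
kept, scores = select_bands(pixels, k=1)
gain0, tau0 = separability([p[0] for p in pixels])
normal = random_normal(3, kept, rng)
intercept = random_intercept(pixels, rng)
print(kept, [round(s, 3) for s in scores], round(tau0, 3))
```

Zeroing the coordinates of the normal vector for the redundant bands means those bands contribute nothing to the dot product, so every split is decided only by the bands with the highest separability index.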
The proposed IIF method can work directly on the original HSI data. Here, to fully utilize the local information, the original HSI data are segmented into several subregions via the ERS approach before being fed to the IIF. This preprocessing step benefits local anomaly detection and reduces the computational burden.

D. Anomaly Detection Using the Proposed Framework
This subsection focuses on anomaly detection in both the spatial and spectral domains. As shown in Fig. 3, Gabor features and an HSI segmented via ERS are fed to the constructed ReMass-iForest and IIFD, respectively, to detect anomalies. As mentioned before, the proposed IIF and ReMass-iForest algorithms share the same measure to detect anomaly pixels, namely, both algorithms rely on the relative mass to formulate anomaly scores. In each iTree $T_i$, the anomaly score of a pixel $x$ w.r.t. its local neighborhood, $s_i(x)$, can be estimated as the ratio of the data mass as follows:

$$s_i(x) = \frac{m\left(\breve{T}_i(x)\right)}{m\left(T_i(x)\right) \times \psi} \tag{9}$$

where $T_i(x)$ denotes the leaf node in $T_i$ in which $x$ falls, $\breve{T}_i(x)$ denotes the immediate parent of $T_i(x)$, and $m(\cdot)$ denotes the data mass of a tree node. Moreover, $\psi$ is a normalization term, which is the subsample size used to construct $T_i$. Obviously, $s_i(\cdot)$ lies in $(0, 1]$. The higher the score, the higher the likelihood of $x$ being an anomaly pixel. In contrast to the path length in iForest, $s_i(x)$ measures the degree of anomaly locally.
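The relative-mass score can be sketched with axis-parallel iTrees that record the number of samples (the data mass) at every node. This is an illustration under assumptions rather than Algorithms 1 and 2: the helper names, the toy data, the subsample size of 64, and the number of trees are hypothetical, and the ensemble average over trees is included only so that the example produces a stable number.

```python
import math
import random


def build_mass_tree(points, depth, limit, rng):
    """Axis-parallel iTree whose nodes store their data mass."""
    node = {"mass": len(points)}
    if depth >= limit or len(points) <= 1:
        return node
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return node
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    if not left or not right:                # keep both children nonempty
        return node
    node.update(dim=dim, split=split,
                left=build_mass_tree(left, depth + 1, limit, rng),
                right=build_mass_tree(right, depth + 1, limit, rng))
    return node


def relative_mass(x, tree, psi):
    """Mass of the leaf's parent divided by (mass of the leaf times psi)."""
    parent, node = tree, tree
    while "dim" in node:
        parent = node
        node = node["left"] if x[node["dim"]] < node["split"] else node["right"]
    return parent["mass"] / (node["mass"] * psi)


def remass_score(x, data, n_trees=100, psi=64, seed=0):
    """Average of the per-tree relative-mass scores."""
    rng = random.Random(seed)
    limit = math.ceil(math.log2(psi))
    total = 0.0
    for _ in range(n_trees):
        sample = rng.sample(data, psi)
        total += relative_mass(x, build_mass_tree(sample, 0, limit, rng), psi)
    return total / n_trees


# Toy data (hypothetical): one background cluster and one isolated pixel.
data_rng = random.Random(7)
background = [(data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)) for _ in range(127)]
anomaly = (4.0, 4.0)
data = background + [anomaly]

normal_score = remass_score((0.0, 0.0), data)
anomaly_score = remass_score(anomaly, data)
print(f"normal: {normal_score:.3f}, anomaly: {anomaly_score:.3f}")
```

A pixel that is split off early leaves a small leaf under a heavy parent, so its ratio is large; a pixel deep inside a cluster ends in a leaf whose parent holds only a few samples, so its ratio is small.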
Then, the anomaly score $S(x)$ of a test pixel $x$ can be calculated as the average of the local anomaly scores over $t$ iTrees as follows:

$$S(x) = \frac{1}{t} \sum_{i=1}^{t} s_i(x) \tag{10}$$

By performing the operations mentioned above for each pixel in the Gabor features and the HSI data segmented by ERS, the spatial anomaly score $S_{spa}$ and the spectral anomaly score $S_{spe}$ can be obtained. To take full advantage of the spatial and spectral detection results, $S_{spe}$ and $S_{spa}$ are linearly combined to distinguish anomaly pixels from the background as follows:

$$S = \omega S_{spe} + (1 - \omega) S_{spa} \tag{11}$$

where $\omega$ is a balance parameter. The spectral domain in HSI data generally contains more precise information than the spatial domain. When the value of $\omega$ is greater than 0.5, the spectral features play a more important role in the final detection result than the spatial features.
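The linear fusion step is simple enough to state directly in code. The sketch below assumes that both score maps are flattened to lists of equal length and already lie on a comparable scale; the function name `fuse_scores` and the toy values are hypothetical, and the default weight of 0.618 is the value the experiments section reports for the balance parameter.

```python
def fuse_scores(spectral, spatial, omega=0.618):
    """Weighted sum of spectral and spatial anomaly scores, pixel by pixel."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    if len(spectral) != len(spatial):
        raise ValueError("score maps must have the same number of pixels")
    return [omega * s_spe + (1.0 - omega) * s_spa
            for s_spe, s_spa in zip(spectral, spatial)]


# Toy scores (hypothetical) for four pixels.
spectral_scores = [1.0, 0.0, 0.5, 0.2]
spatial_scores = [0.0, 1.0, 0.5, 0.4]
fused = fuse_scores(spectral_scores, spatial_scores)
print([round(v, 3) for v in fused])
```

With a weight above 0.5, a pixel flagged only by the spectral detector receives a higher fused score than a pixel flagged only by the spatial detector, which is the asymmetry described above.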

IV. EXPERIMENTS
In this section, we carry out several experiments to evaluate the detection performance of the proposed SSIIFD method, and comparison results against five state-of-the-art detectors are presented. All experimental algorithms are implemented on a PC with Windows 10, an Intel Core i7-9700 CPU @ 3.00 GHz, 16 GB RAM, and MATLAB 2017b.

A. Hyperspectral Data Sets
Here, four real hyperspectral data sets captured at different scenes are employed to evaluate the effectiveness of the proposed SSIIFD method, which are listed as follows:
1) San Diego-I Data Set: The first data set was captured by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) over the airport area of San Diego, CA, USA. The spatial resolution is approximately 3.5 m/pixel, and the spectral resolution is 10 nm. It contains 224 spectral channels in wavelengths ranging from 370 to 2510 nm. After the removal of water absorption and noisy bands (1-6, 33-35, 97, 107-113, 153-166, and 221-224), 189 bands are retained in the experiments. The whole image scene covers an area of 400 × 400 pixels. A region with a size of 100 × 100 pixels is selected from the top left of the image, denoted as San Diego-I. Three airplanes, denoted by 58 pixels, are the anomalies to be detected in this scene. The sample image and ground truth map are shown in Fig. 4(a) and (b), respectively.
2) San Diego-II Data Set: The second data set has been widely used in related publications [25], [28], and [38]. In contrast to the San Diego-I data set, a region with a size of 100 × 100 pixels located at the center of the whole image is selected for anomaly detection and denoted as San Diego-II. Three airplanes, denoted by 134 pixels, are the anomalies to be detected in this scene. The sample image and ground truth map are shown in Fig. 5(a) and (b), respectively.
3) Texas Coast Data Set: The third data set was captured by the AVIRIS sensor over an urban area of Texas Coast, TX, USA. This urban scene consists of 100 × 100 pixels, with 207 spectral channels in wavelengths ranging from 450 to 1350 nm. The spatial resolution is 17.2 m/pixel. The scene mainly consists of a stretch of meadow and three highways. Houses are regarded as the anomalies in this scene. This HSI is corrupted by serious strip noise, which makes the detection of these anomaly pixels challenging. The sample image and ground truth map are shown in Fig. 6(a) and (b), respectively.
4) Gulfport Data Set: The fourth data set was captured by the AVIRIS sensor over the airport area of Gulfport, MS, USA. This airport scene consists of 100 × 100 pixels, with 191 spectral channels in wavelengths ranging from 550 to 1850 nm. The spatial resolution is 3.4 m/pixel. This scene mainly comprises an airport runway, a highway, and some vegetation. Three airplanes of different sizes are the anomalies to be detected. The sample image and ground truth map are shown in Fig. 7(a) and (b), respectively.

B. Parameter Tuning
Here, we investigate the influences of the superpixel number and the balance parameter $\omega$ on the detection performance of the proposed SSIIFD method. The superpixel number controls the number of subregions in the preprocessing step. Parameter $\omega$ controls the proportion of the spectral anomaly scores in the final detection results. Fig. 8 shows the effect of these two parameters on the area under the curve (AUC) of the receiver operating characteristic (ROC) of the IIF for each data set. Based on the parameter tuning results, we can draw three conclusions, which are listed as follows:
1) The detection performance, represented by the AUC values, tends to increase and then decrease with an increasing number of superpixels. This is mainly because too many superpixels lead to oversegmented regions, which cannot fully utilize all samples that belong to a homogeneous area, whereas too few superpixels lead to undersegmentation, which introduces samples from different homogeneous areas and cannot make full use of local information. Moreover, an excessively large number of superpixels results in each region containing a limited number of pixels, which does not guarantee the reliability and stability of the detection results; e.g., all the pixels in a given region may be anomaly pixels, which introduces a high missed detection rate.
2) The detection performance with a proper superpixel number is always better than that obtained when the superpixel number is set to 1 (which indicates that the proposed method is directly performed on the original HSI data without preprocessing). Hence, the proposed method, which takes the local homogeneity of HSIs into account, is more effective than the method without segmentation preprocessing.
3) As shown in Fig. 8(b), when parameter $\omega$ ranges from 0.01 to 1, the AUC values initially increase, reach their peaks at approximately 0.6, and then slightly decrease for the San Diego-II and Texas Coast data sets. For the San Diego-I and Gulfport data sets, the maximum AUC values occur at 0.8 and 0.2, respectively.
In summary, based on the experiments and analysis mentioned above, we obtain the optimal superpixel number for the Texas Coast, San Diego-II, and the other two data sets, at 3, 5, and 4, respectively. Additionally, in this article, inspired by [42], we set $\omega$ to 0.618 for each data set under the guidance of the golden section method.

C. Analysis of the Detection Performance With and Without Employing Spatial Information
In this section, we investigate the influence of spatial information on the detection performance of the proposed method. As shown in Fig. 9(a), the proposed IIFD detects most anomaly pixels in the original San Diego-II data set, and the AUC score of the detection result is 0.9891, from which we can draw two conclusions: 1) the IIFD effectively detects most anomalies without relying on spatial information; 2) a high false alarm rate is the main problem. Therefore, we apply the Gabor filter to extract spatial information in the PCA-projected subspace, and the extracted Gabor features are then employed as input to the ReMass-iForest detector to obtain the spatial anomaly score. In addition, the original San Diego-II data set is segmented into five subregions with the ERS approach, and each subregion is fed in turn to the IIF detector to obtain the spectral anomaly score. Finally, we fuse the detection results by linearly combining the obtained spatial and spectral anomaly scores to predict the anomaly pixels given the input San Diego-II data set. As such, some false alarms are effectively removed. Fig. 9(b) shows the final detection map, and the AUC score of the final detection map is 0.9922.
Based on this experiment, we observe that both spectral and spatial information play an important role in the detection of anomaly pixels.

D. Comparison Methods and Evaluation Indexes
In our experiments, the anomaly detection performance of the proposed SSIIFD is evaluated and compared to that of five state-of-the-art detectors: RX [6], LRASR [13], PTA [15], KIFD [24], and MFIFD [28]. Specifically, RX is a representative statistical modeling-based technique. LRASR is a typical geometrical modeling-based technique based on low-rank representation and sparse representation theories. PTA is a typical tensor representation method. The KIFD and MFIFD methods are representative iForest-based techniques. Furthermore, the parameters of dictionary learning in LRASR are set the same as those reported in [13], i.e., the number of clusters is set to 15, the number of atoms in each cluster is set to 20, and the other two parameters range from 0.01 to 1. The PTA parameters are set according to the suggestions in [15], i.e., the truncated low-rank parameter is set to 1, and the four hyperparameters are set to 1, 0.01, 0.001, and 1, respectively. In terms of the KIFD method, the subsample size is set to three percent of all pixels in the image, the number of trees is set to 1000, and the number of principal components is set to 300, which are consistent with the original work [24]. The parameters of the MFIFD method are set the same as those given in [28], i.e., the subsample size is set to 256, and the number of trees is set to 25. In summary, the parameters of the five baselines are defined in accordance with the original works [6], [13], [15], [24], [28].
In the experiments, both qualitative and quantitative evaluation approaches are employed to evaluate the detection performance. Specifically, the qualitative analysis is based on the detection map, whereas the quantitative evaluation is conducted by using the ROC curve, AUC value, and separability range. The ROC curve reflects the relationship between the probability of detection (PD) and the false alarm rate (FAR), which are obtained via thresholds ranging from 0 to 1. The PD and FAR are defined as follows:

$$\mathrm{PD} = \frac{N_d}{N_t}, \qquad \mathrm{FAR} = \frac{N_f}{N}$$

where $N_d$ denotes the number of detected object pixels, $N_t$ denotes the total number of real object pixels, $N_f$ denotes the number of false alarm pixels, and $N$ denotes the total number of pixels in the image. If a detector attains a higher PD than the other detectors at the same FAR, this detector outperforms the others. In other words, an ROC curve located near the upper left corner suggests that the detector obtains a better detection result. Furthermore, a better detector usually achieves a larger AUC value, which is calculated as the whole area under the ROC curve. More details on these two metrics have been reported in [43]. Moreover, the separability range describes the ability of a detector to distinguish anomaly pixels from the background [44]. Specifically, a good detector typically features a distinct gap between the anomaly pixels and the background, while the anomaly scores of the background are suppressed within a small range.
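As a rough illustration of how these metrics can be computed, the sketch below sweeps thresholds over [0, 1] and takes PD as the detected object pixels over all real object pixels, and FAR as the false alarm pixels over all pixels in the image, following the definitions above. The function names `roc_points` and `auc`, the min-max normalization of the scores, and the number of thresholds are assumptions made for the example; the sketch also assumes the ground truth contains at least one object pixel.

```python
import numpy as np


def roc_points(scores, truth, num_thresholds=1001):
    """Return (FAR, PD) arrays for thresholds swept from 1 down to 0."""
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    # Normalize scores to [0, 1] so that thresholds in [0, 1] are meaningful.
    span = scores.max() - scores.min()
    scores = (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)
    n_object = np.count_nonzero(truth)  # total number of real object pixels
    n_pixels = truth.size               # total number of pixels in the image
    thresholds = np.linspace(1.0, 0.0, num_thresholds)
    pd_vals = np.empty(num_thresholds)
    far_vals = np.empty(num_thresholds)
    for i, tau in enumerate(thresholds):
        detected = scores >= tau
        pd_vals[i] = np.count_nonzero(detected & truth) / n_object
        far_vals[i] = np.count_nonzero(detected & ~truth) / n_pixels
    return far_vals, pd_vals


def auc(far_vals, pd_vals):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(far_vals) * (pd_vals[1:] + pd_vals[:-1]) / 2.0))


# Toy example: two object pixels with high scores, two background pixels.
toy_scores = np.array([0.9, 0.8, 0.1, 0.2])
toy_truth = np.array([True, True, False, False])
toy_far, toy_pd = roc_points(toy_scores, toy_truth)
toy_auc = auc(toy_far, toy_pd)
```

Because FAR is normalized here by all pixels rather than by background pixels only, the curve ends at a FAR slightly below 1 on real images, where anomalies are a tiny fraction of the scene; in the four-pixel toy example the effect is exaggerated and the curve ends at FAR = 0.5.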

E. Detection Performance
In this section, we first qualitatively investigate the detection performance via detection maps. Based on the detection maps shown in Figs. 4-7, we observe that the proposed SSIIFD detects anomaly pixels more clearly and accurately, and at lower FARs, than the five comparison methods. For example, in Figs. 4 and 5, the proposed SSIIFD, PTA, KIFD, and MFIFD detect the locations and shapes of the three airplanes accurately. However, PTA, KIFD, and MFIFD also produce many false alarms, whereas there are few false alarms in the detection result obtained with SSIIFD. LRASR detects the locations of the three airplanes, but it does not recover their shapes. RX obtains a poor separation between the anomaly pixels and background. Additionally, the proposed SSIIFD achieves a robust detection performance in images corrupted by serious strip noise. As shown in Fig. 6, the proposed SSIIFD, KIFD, and MFIFD effectively detect most anomalies, while only SSIIFD effectively removes the interference of strip noise and suppresses most of the background into low-detection outputs. In other words, KIFD and MFIFD perform poorly in both background suppression and noise reduction. RX and PTA realize satisfactory background suppression, but RX misses many anomaly pixels and PTA does not mitigate the influence of strip noise as effectively as LRASR does. Regarding anomaly targets with different and irregular shapes and sizes, as shown in Fig. 7, RX yields detection results with a low contrast between the anomaly pixels and background. MFIFD fails to detect two small airplanes clearly due to the blurring effect produced in the filtering operation. The LRASR, PTA, and KIFD methods detect all three airplanes, but some background pixels are mistakenly detected as anomalies.
Moreover, the detection performances of the compared methods were quantitatively evaluated based on AUC scores, as summarized in Table I, where the highest AUC score is highlighted for each data set. SSIIFD achieved the highest scores on all data sets. RX and LRASR attained the lowest detection accuracy for the San Diego-I and Texas Coast data sets, respectively. Although the MFIFD and KIFD methods yielded a relatively stable detection performance, they failed to achieve the highest AUC score in any experiment. Additionally, the ROC curves of the different methods are shown in Fig. 10. As mentioned before, the ROC curve of a better detector lies nearer to the upper left corner (0, 1) and achieves a higher PD at the same FAR. Fig. 10 shows that the SSIIFD method is superior to the MFIFD, KIFD, PTA, LRASR, and RX methods under most conditions. The proposed SSIIFD method obtains much better ROC curves than those of the other methods, as shown in Fig. 10(a), (b), and (d), i.e., the PD value of SSIIFD is higher in every case than that of the other methods with FAR ranging from 0.0001 to 1. Regarding the Texas Coast data set, the proposed SSIIFD method achieves a higher PD value than that of the other compared methods under most conditions, as shown in Fig. 10(c).
Furthermore, the separability map is used as another quantitative evaluation to investigate the ability of each detector to separate anomalies from the background, as shown in Fig. 11. There are two boxes for each detector. The green and red boxes indicate the distributions of the background and anomalies, respectively. The position of the boxes reflects the separability between the background and anomaly pixels. In other words, the greater the distance between these two boxes, the better the detector is. As shown in Fig. 11, the proposed SSIIFD offers the best performance in terms of the separability between the anomalies and background, whereas the other methods exhibit some degree of overlap between the anomaly and background boxes. For example, as shown in Fig. 11(b), the proposed SSIIFD, RX, and LRASR effectively suppress the background within a small range. However, for both RX and LRASR, overlap occurs between the anomaly and background boxes, which suggests that they do not efficiently distinguish anomaly pixels from background pixels. In contrast, the background boxes of PTA and MFIFD reflect that these two methods do not suppress most of the background into low-detection outputs. In other words, PTA and MFIFD falsely detect many anomalies, which corresponds to their detection maps, as shown in Fig. 5(e) and (g), respectively. In terms of the KIFD method, although the background anomaly score is suppressed within a small range, the value of the background class is relatively high, which indicates that KIFD does not efficiently distinguish anomaly pixels from the background.
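The box comparison described above can be approximated numerically. The following sketch summarizes the background and anomaly score distributions by percentile boxes and reports the gap between them. The function name `separability_summary` and the choice of the 10th and 90th percentiles as box limits are assumptions made for illustration; the box limits used for Fig. 11 may differ.

```python
import numpy as np


def separability_summary(scores, truth, lower=10, upper=90):
    """Summarize background/anomaly score boxes and the gap between them.

    A positive gap means the anomaly box lies entirely above the
    background box; a negative gap means the two boxes overlap.
    """
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    bg_lo, bg_hi = np.percentile(scores[~truth], [lower, upper])
    an_lo, an_hi = np.percentile(scores[truth], [lower, upper])
    return {
        "background_box": (float(bg_lo), float(bg_hi)),
        "anomaly_box": (float(an_lo), float(an_hi)),
        "gap": float(an_lo - bg_hi),
    }


# Toy scores: background compressed near 0, anomalies near 1.
toy_scores = np.concatenate([np.linspace(0.0, 0.2, 101), np.linspace(0.8, 1.0, 11)])
toy_truth = np.concatenate([np.zeros(101, dtype=bool), np.ones(11, dtype=bool)])
summary = separability_summary(toy_scores, toy_truth)
```

Under this summary, a narrow background box at low score values together with a large positive gap corresponds to the behavior attributed to a good detector in the text.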
ReMass-iForest and iForest exhibit the same time complexity. The time complexity to construct IIF consists of three major components: 1) computation of the band separability according to (7), 2) sorting of the band separability values, and 3) calculation of the branching criterion according to (8). The time complexity associated with the construction of the trees of an IIF therefore depends on the number of trees, the subsample size, and the number of bands in the input HSI, whereas the time complexity of anomaly score evaluation depends on the number of pixels in the input HSI. Hence, the time complexity of the proposed IIF combines the construction cost and the evaluation cost.
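For orientation on the time complexity discussion above, the costs commonly quoted for the baseline iForest can be written out as below. This is background from the standard iForest analysis, not the expression derived for IIF; the symbols $t$ (number of trees), $\psi$ (subsample size), and $N$ (number of pixels to be scored) are notation assumed here.

```latex
\begin{align*}
T_{\text{construct}} &= O\!\left(t\,\psi \log \psi\right), \\
T_{\text{evaluate}}  &= O\!\left(N\,t \log \psi\right).
\end{align*}
```

The construction cost does not grow with $N$ because each tree is built on a fixed-size subsample. IIF adds the band separability computation, the sorting step, and the branching criterion listed above, so its construction cost additionally depends on the number of bands.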

F. Sensitivity to the Parameters and Discussion
In this section, we perform experiments to reveal the effect of the parameters of the proposed SSIIFD method on the detection performance. There are three parameters in the proposed SSIIFD method, i.e., the number of used bands, the number of iTrees, and the subsample size. The first parameter controls the number of spectral bands employed in the construction of the proposed IIF. Fig. 12 shows the influence of the number of iTrees on the detection performance and the running time on each data set. As shown in Fig. 12(a), the AUC value for the Gulfport data set remains nearly stable, whereas the AUC value for the San Diego-II data set fluctuates slightly within a small range. Regarding the other two data sets, the AUC values initially increase and then fluctuate within a small range. Moreover, as shown in Fig. 12(b), the running time of the proposed SSIIFD method grows nearly linearly with the number of iTrees. In addition, Fig. 13 shows the influence of the subsample size on the detection performance and running time on each data set. As shown in Fig. 13(a), the AUC value for the Texas Coast data set remains nearly stable, whereas the AUC values for the other three data sets fluctuate slightly within a small range, i.e., from 0.96 to 1. Furthermore, as shown in Fig. 13(b), the running time of the proposed SSIIFD method grows nearly linearly with the subsample size. Hence, considering both the performance and efficiency of the proposed SSIIFD method, we set the number of iTrees to 32 and the subsample size to 2.5% of the number of pixels in the input HSI for each data set as default parameter values.
Moreover, Fig. 14 shows the effect of the number of used bands on the detection performance for each data set. Regarding the San Diego-I and San Diego-II data sets, the AUC values gradually increase and tend to remain stable when the number of used bands ranges from 1 to 100. In contrast, the AUC values obtained for the Gulfport and Texas Coast data sets exhibit a larger fluctuation as the number of used bands increases. This mainly occurs because pretreatment of water absorption and noisy bands is applied to the two San Diego data sets, and almost all 189 bands exhibit a high signal-to-noise ratio (SNR), whereas the Gulfport and Texas Coast data sets, without pretreatment, contain low-SNR and poor-quality bands, especially the Texas Coast data set. As a result, for the two San Diego data sets, calculating the separability value for each band with (7) yields the separability index as expected, which accurately measures how separable each band is in the identification of background and anomaly pixels. For the other two data sets with no pretreatment, a value of this parameter that is too large or too small leads, with a high probability, to the usage of noisy and water absorption-affected bands. In other words, in low-SNR and noisy bands, the separability value does not accurately measure how separable the band is in distinguishing anomaly pixels from the background. Therefore, the number of used bands is set to one third of the number of bands for each data set in this article.

V. CONCLUSION
In this article, we propose a novel IIF algorithm to address the poor performance of iForest in regard to high-dimensional data and detecting local anomalies. Then, a novel spectral-spatial anomaly detection framework based on IIF (SSIIFD) is proposed. Gabor features and segmented HSI data are employed to construct ReMass-iForest and IIF, respectively. The advantages of the proposed SSIIFD method are threefold: first, the method fully utilizes spectral and spatial information in HSIs; second, the method fully employs global and local information in HSIs; third, the method detects anomaly pixels more clearly and accurately at lower FAR. The experiments on four real hyperspectral data sets reveal that SSIIFD is stable and superior to other state-of-the-art methods in terms of both objective and subjective evaluations. In the future, the application of SSIIFD in other remote sensing applications (e.g., change detection and shadow detection) will be investigated. In addition, how to classify and recognize the detected anomaly pixels will be the focus of our future research.

Fig. 1. A graphical example illustrating the principle of iTrees. Given a Gaussian distribution (205 points), (a) an anomaly instance is isolated through only four random partitions; (b) a normal instance requires eleven random partitions to be isolated.

Fig. 2. An example illustrating the structure of an iTree.

1) An SSIIFD is proposed, which can make full use of the spectral and spatial information, and the global and local information of HSIs.
2) An IIF algorithm is proposed for the first time, which effectively improves the poor performance of iForest in the detection of anomalies in high-dimensional data and in local anomaly detection.
3) Experiments on four real data sets demonstrate that the proposed SSIIFD can obtain the best detection accuracy.

The remainder of this article is organized as follows. Section II briefly reviews the iForest and its two variations; the extraction of spatial features with the Gabor filter and the entropy rate superpixel segmentation (ERS) algorithm are also briefly reviewed in that section. The proposed method is introduced in detail in Section III. In Section IV, experimental results are presented. Finally, conclusions are discussed in Section V.

Fig. 8. Influence of the number of superpixels and the balance parameter on the detection performance of the proposed SSIIFD on each HSI data set. (a) Number of superpixels. (b) Value of the balance parameter.

demonstrated with experimental results. Specifically, the original data are transformed into a set of submatrices, one per segmented region, whose union is the original data.

Fig. 9. San Diego-II data set. (a) Detection map without employing spatial information. (b) Detection map considering spatial information.

Fig. 10. ROC curves of the methods for (a) the San Diego-I data set, (b) San Diego-II data set, (c) Texas Coast data set, and (d) Gulfport data set.

Fig. 11. Background-anomaly separability maps of the algorithms for (a) the San Diego-I data set, (b) San Diego-II data set, (c) Texas Coast data set, and (d) Gulfport data set.

Fig. 12. Effect of the number of iTrees on each data set. (a) AUC value, (b) running time.

Fig. 13. Effect of the subsample size on each data set. (a) AUC value, (b) running time.

The compared methods are implemented in MATLAB, and the running times on the four data sets are listed in Table II. It should be noted that RX is the fastest method, whereas KIFD is the slowest. This principally occurs because KIFD employs kernel-PCA during preprocessing, and numerous iTrees are constructed to obtain stable anomaly scores. The running time of the proposed SSIIFD method is similar to that of the MFIFD method, which is much more efficient than the PTA and KIFD methods.

Fig. 14. Influence of the number of used spectral bands on the detection performance of the proposed SSIIFD on each HSI data set.
where $\lambda$ is the wavelength of the sinusoidal factor, $\theta$ is the orientation of the normal to the parallel stripes of the Gabor function, $\varphi$ is the phase offset, $\sigma$ is the standard deviation of the Gaussian envelope, and $\gamma$ is the spatial aspect ratio specifying the ellipticity of the support of the Gabor function. $\varphi = 0$ and $\varphi = \pi/2$ return the real and imaginary parts, respectively, of the Gabor filter. Parameter $\sigma$ is determined by $\lambda$ and the spatial frequency bandwidth $bw$ as:

$$\sigma = \frac{\lambda}{\pi}\sqrt{\frac{\ln 2}{2}}\cdot\frac{2^{bw}+1}{2^{bw}-1}$$

Algorithm 1: iForest(X, t, ψ)
Input: X - input data, t - number of trees, ψ - sub-sampling size
Output: a set of t iTrees
1: Initialize Forest
2: set height limit hlim = ceiling(log2 ψ)
3: for i = 1 to t do

Algorithm 2: iTree(X, h, hlim)
Input: X - input data, h - current tree height, hlim - height limit
Output: an iTree
1: if h ≥ hlim or |X| ≤ 1 then
2:   return exNode{Size ← |X|}
3: else
4:   let Q be a list of bands of X
5:   randomly select a band q ∈ Q
6:   randomly select a split value p from the max and min values of the qth band of X
7:   let x_iq be the value of the ith row and qth column of X
8:   X_l ← filter(X, x_iq < p)
9:   X_r ← filter(X, x_iq ≥ p)
10:  return inNode{Left ← iTree(X_l, h + 1, hlim), Right ← iTree(X_r, h + 1, hlim), SplitBand ← q, SplitValue ← p}
11: end if
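The tree construction in Algorithms 1 and 2 above can be sketched in Python as follows. This is a minimal sketch of the baseline iForest, not of the proposed IIF: the loop body of Algorithm 1 and the scoring step follow the standard iForest procedure, and all function names, the dictionary-based node layout, and the early exit for constant bands are assumptions introduced for the example.

```python
import math

import numpy as np


def build_itree(X, height, height_limit, rng):
    """Recursively build an iTree; rows of X are pixels, columns are bands."""
    n = X.shape[0]
    if height >= height_limit or n <= 1:
        return {"size": n}                      # external node
    q = int(rng.integers(X.shape[1]))           # randomly selected band
    lo, hi = X[:, q].min(), X[:, q].max()
    if lo == hi:                                # assumption: constant band, stop here
        return {"size": n}
    p = rng.uniform(lo, hi)                     # split value between min and max
    left = X[:, q] < p
    return {"band": q, "split": p,
            "left": build_itree(X[left], height + 1, height_limit, rng),
            "right": build_itree(X[~left], height + 1, height_limit, rng)}


def build_iforest(X, n_trees=100, subsample_size=256, seed=0):
    """Build n_trees iTrees, each on a random subsample of the rows of X."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    psi = min(subsample_size, X.shape[0])
    if psi < 2:
        raise ValueError("at least two samples are required")
    height_limit = math.ceil(math.log2(psi))
    forest = []
    for _ in range(n_trees):
        idx = rng.choice(X.shape[0], size=psi, replace=False)
        forest.append(build_itree(X[idx], 0, height_limit, rng))
    return forest, psi


def c_factor(n):
    """Average path length of an unsuccessful binary search tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(x, node):
    """Depth at which x reaches an external node, plus the size adjustment."""
    depth = 0
    while "size" not in node:
        node = node["left"] if x[node["band"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def anomaly_scores(X, forest, psi):
    """Standard iForest score in (0, 1]; higher means easier to isolate."""
    X = np.asarray(X, dtype=float)
    mean_depth = np.array([np.mean([path_length(x, tree) for tree in forest])
                           for x in X])
    return 2.0 ** (-mean_depth / c_factor(psi))
```

A short usage example on synthetic data, where one point is placed far from a Gaussian cloud and is therefore expected to receive a comparatively high score:

```python
data_rng = np.random.default_rng(1)
pixels = np.vstack([data_rng.normal(size=(200, 5)), np.full((1, 5), 10.0)])
forest, psi = build_iforest(pixels, n_trees=50, subsample_size=64, seed=0)
scores = anomaly_scores(pixels, forest, psi)
```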

TABLE I
AUC SCORES OF THE METHODS FOR THE EXPERIMENTAL DATA SETS

TABLE II
RUNNING TIME (SECONDS) OF THE METHODS FOR THE EXPERIMENTAL DATA SETS