Hyperspectral Anomaly Detection Based on Spectral Similarity Variability Feature

In traditional methods for hyperspectral anomaly detection, spectral feature mapping is used to project hyperspectral data into a high-level feature space in which different ground objects become easier to distinguish. However, the uncertainty of the mapping direction can leave the mapped features ineffective at separating anomalous targets from the background. To address this problem, a hyperspectral anomaly detection algorithm based on the spectral similarity variability feature (SSVF) is proposed. First, high-dimensional similar neighborhoods are fused into similar features using an autoencoder (AE) network; the SSVF is then obtained using a residual autoencoder. Finally, the final detection result for the SSVF is obtained using the Reed-Xiaoli (RX) detector. Compared with the most accurate of the comparison algorithms, the overall detection accuracy (AUCODP) of the SSVFRX algorithm is increased by 0.2106. The experimental results show that the SSVF has great advantages both in highlighting anomalous targets and in improving the separability between different ground objects.


Introduction
Hyperspectral remote sensing image processing technology is a branch of the signal processing field, and many signal-processing methods provide theoretical and technical support for it. Exploiting the characteristics of hyperspectral remote sensing images, remarkable achievements have been made in application directions such as hyperspectral image classification [1][2][3], unmixing [4][5][6], super-resolution mapping [7,8], and target detection [9]. In recent years, many experts and scholars have systematically reviewed different types of hyperspectral remote sensing image processing methods. Examples include hyperspectral spatial enhancement techniques or super-resolution (SR) [10], the application of machine learning to lithology mapping and mineral exploration [11], and the application of deep learning to anomaly detection [12]. These systematic reviews provide important reference and guidance for further research and development of hyperspectral remote sensing image processing, and strongly promote the continuous innovation and improvement of related technologies.
Although hyperspectral images are rich in spectral and spatial information, they still face various challenges, including the redundancy of high-dimensional data, contamination by spectral noise and atmospheric effects, mixed pixels, and the phenomena of different objects sharing the same spectrum and the same object exhibiting different spectra. Spectral dimension transformation maps hyperspectral images to a corresponding feature space through a feature processing method, making ground objects that are indistinguishable in the original feature space separable in the new feature space. In hyperspectral anomaly target detection, spectral dimension transformation can improve the separability between the background and the anomaly target. The most common feature processing methods are principal component analysis (PCA) [13], independent component analysis (ICA) [14] and nonlinear principal component analysis [15]. The essence of spectral dimension transformation is to obtain higher-level features of the original hyperspectral image by a mapping method and to improve the accuracy of anomaly detection by using its ability to improve the separability between different ground objects. The hyperspectral anomaly detection (HAD) of differential images [16] utilizes difference images to estimate background changes during the feature extraction stage, so as to suppress background signals and highlight anomaly signals. Fractional Fourier entropy [17] employs the fractional Fourier transform for pre-processing, then uses a space-frequency representation to obtain features from the intermediate region between the original spectrum and its complementary Fourier transform. Unsupervised spectral mapping and feature selection [18] highlights an anomaly target by searching for the optimal feature subset from the candidate feature space while mapping high-dimensional features to a low-dimensional space using unsupervised neural networks.
In addition, research on HAD based on linear models has also found some success. A linear model is able to obtain the error term of a hyperspectral image: for example, the hyperspectral image is mapped to a feature space of another dimension, the mapped features are re-projected to the original feature space by the inverse method, and the reconstruction error is taken as the anomaly score. Because normal samples are easier to reconstruct than anomalous samples, the samples with higher reconstruction errors are considered abnormal targets. For example, the residuals between the reconstructed image and the original image are obtained by PCA projection reconstruction, and the projection parameters are updated over several iterations [19]. The work of [20] enhances that of [19], filtering out potential anomaly targets according to the error value of each iteration. The reconstruction probability algorithm of an autoencoder (AE) [21] is also a detection model that obtains reconstruction errors through feature mapping. The joint graph detection (JGD) [22] model considers both spectral and spatial features. Through the spectral sub-model, the reconstruction error between the original hyperspectral sensing image (HSI) and the feature image after the graph Fourier transform (GFT) is mapped to fractional Fourier entropy (FrFE), which enhances the anomaly detection capability and shows an advantage in distinguishing background anomalies. To address the PCA's sensitivity to feature scales and outliers, robust PCA (RPCA) [23] decomposes data into low-rank and sparse matrices to enhance robustness to noise and outliers. RPCA integrating sparse and low-rank priors (RPCA-SL) is a new variant that achieves a more precise separation by combining prior targets and is solved using a proximal gradient algorithm. A discriminant reconstruction method based on spectral learning (SLDR) [24] first uses a spectral error map (SEM) to detect the anomaly, and then uses a spectral angle distance (SAD) to restrict the AE to follow a unit Gaussian distribution. The obtained SEM can well reflect the spectral similarity between the identification and the reconstruction. The mixture of Gaussian low-rank and sparse decomposition [25] decomposes the HSI into a low-rank background and sparse components, then infers the mixture-of-Gaussians model of the sparse components by variational Bayes, and finally calculates the anomaly by the Manhattan distance. Pixel-associate AE [26] uses super-pixel distance to build two representative dictionaries, and then obtains the hidden-layer expression of the similarity measure by AE.
In HAD, spectral dimension transformation maps the background and anomaly target of hyperspectral data to another feature space, so that a background and anomaly target that cannot be separated in the original feature space become separable, thus improving the detection accuracy. However, the traditional feature mapping method maps both the background and the anomaly target to the same feature space, which cannot effectively highlight the anomaly target. The main factor behind this problem is the uncertainty of the mapping direction; it is difficult to separate the anomaly target from the background effectively by conventional spectral dimension transformation.
To solve this problem, hyperspectral anomaly detection based on the spectral similarity variability feature (SSVF) is proposed. First, the AE network is used to fuse high-dimensional similar neighborhoods into lower-dimensional similar features, which carry information similar to the neighborhood pixels and reduce the computational burden of subsequent networks. The SSVF is then obtained using a residual autoencoder, essentially acquiring the error between the image itself and its similar neighbors. Finally, the RX detector is used to obtain the final detection result of the SSVF; the proposed algorithm is therefore called SSVFRX. In a hyperspectral image, a background pixel and its similar neighboring pixels have a high degree of similarity and can be initially judged to belong to the same ground feature. In contrast, anomalous targets and their similar pixels have a low probability of belonging to the same ground feature. The similarity variability feature allows most identical features to be mapped in the same direction, while anomaly targets are mapped in the opposite direction.
This paper evaluates, through experiments, the superiority of the SSVF in increasing the difference between the anomaly target and the background. Comparative experiments are used to judge the effect of introducing a similar neighborhood on increasing this difference. The SSVF aims to resolve the uncertainty of the mapping direction in spectral dimension transformation methods for unsupervised network models by introducing a similarity difference value, yielding a mapping direction that increases the difference between the anomaly target and the background.

Experiment Data Description
The superiority of the SSVFRX algorithm was verified using seven hyperspectral experimental datasets. The detailed parameters of the experimental datasets are shown in Table 1, and the false-color images and their ground truth images are shown in Figure 1. Additionally, the following must be explained: (1) D1 and D2 are from the Remote Sensing and Image Processing Group (RSIPG) repository [27], captured at an altitude of 1200 m on a sunny day. D1 is the full image, while D2 is a cropped portion containing an anomaly. Both datasets have undergone residual stripe removal, and D1 has been further processed with noise whitening and partial spectral band discarding. (2) D3 and D4 are from the San Diego Airport, with the anomaly targets being aircraft.
(3) D5 is from the Digital Imaging and Remote Sensing (DIRS) laboratory, which is part of the Chester F. Carlson Center for Imaging Science at the Rochester Institute of Technology. (4) The hyperspectral datasets D6 and D7 are from the personal website of Xudong Kang, School of Electrical and Information Engineering, Hunan University. The original images were downloaded from the AVIRIS website [28]. The authors extracted 100 × 100 sub-images and applied a noise level estimation method to remove the noisy bands.

Hyperspectral Anomaly Detection Based on Spectral Similarity Variability Feature
The proposed algorithm is divided into three main steps: data pre-processing, similar feature fusion (SFF) and spectral similarity variability feature extraction. The overall flow chart is shown in Figure 2. Data pre-processing involves processing the original HSI by PCA and whitening. SFF refers to the fusion of similar features from multiple similar neighborhoods, using AE networks to obtain a low-dimensional feature representation of the same dimension as the original image. Spectral similarity variability feature extraction refers to the calculation of the difference value between the similar features and the original features using a residual autoencoder network. Finally, the final detection result is obtained by the RX detector.

Data Preprocessing
In hyperspectral image processing, the pre-processing stage is crucial for improving data quality and the effectiveness of subsequent analysis. Before network training, the hyperspectral images are usually pre-processed, for example by dimension reduction and whitening. The hyperspectral dataset is represented as X = {x_1, x_2, ..., x_N}, where x_i = (x_i^1, x_i^2, ..., x_i^n), x_i^j is the jth dimension of the ith sample, N is the number of samples, and n is the sample dimension.
The data pre-processing process is shown in Figure 3. First, principal component analysis (PCA) is used to obtain the dimension-reduced feature X_p, and then whitening is used to obtain the whitened feature X_w, where Σ is the covariance matrix, λ_i is the ith eigenvalue of the covariance matrix, and Σ_k is the matrix of the first k eigenvectors of the covariance matrix.
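The pre-processing step above can be sketched in NumPy as follows. This is a minimal illustration, assuming the HSI has already been reshaped to an N × n matrix of pixels by bands; the function and variable names are illustrative rather than the authors' implementation.

```python
import numpy as np

def pca_whiten(X, k, eps=1e-8):
    """PCA dimension reduction followed by whitening.
    X: (N, n) matrix of N pixels with n spectral bands.
    Returns X_w with k decorrelated, unit-variance components."""
    Xc = X - X.mean(axis=0)               # center each band
    cov = Xc.T @ Xc / X.shape[0]          # (n, n) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]     # sort eigenvalues descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    Xp = Xc @ eigvecs[:, :k]              # project onto first k components
    Xw = Xp / np.sqrt(eigvals[:k] + eps)  # whiten: unit variance per component
    return Xw
```

After this step, the covariance of X_w is approximately the identity matrix, which is the property the later distance computations rely on.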

Similar Feature Fusion Based on Autoencoder
Hyperspectral images have strong high-dimensional properties, and images with similar features can be reconstructed in a nearly lossless manner by an autoencoder. This process helps to further improve the separability between classes through the network's own nonlinear transformations, while reducing the training burden of the subsequent residual network. The similar feature fusion model is shown in Figure 4. The Euclidean distance is used as a similarity measure to find the nearest neighboring samples in the sample set. The feature of each sample is represented as x_i and its neighborhood is represented as S_i = {S_i^1, S_i^2, ..., S_i^Q}, i.e., the set of the Q neighbors nearest to x_i in the dataset. The specific process is as follows. First, the similarity is calculated, where d_i is the similarity set of the ith sample. Then, the similarity values in d_i are sorted in ascending order and the first Q samples are selected to form the similarity neighborhood set S_i. The autoencoder is then used to perform similar feature fusion. The training sample set can be represented as S = {S^(1), S^(2), ..., S^(N)}. The network structure is shown in Figure 3. The network uses gradient descent to minimize the objective function, where α and β are the network parameters, M is the number of batches, S^(i) is the ith input similar sample, λ is the weight decay term, n_l is the number of layers of the network, and s_l is the number of nodes at layer l.
After training, the parameters are fixed and the expression of the hidden layer is obtained.
Finally, a nonlinear similar feature Y, which contains the similar-neighborhood information, is obtained.
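The neighborhood-construction step above can be sketched as follows; this is a NumPy illustration, not the authors' code. For each pixel, the Q nearest samples under Euclidean distance are gathered and stacked to form the high-dimensional input of the fusion autoencoder. The brute-force distance matrix shown here assumes the pixel count is small enough for an N × N matrix to fit in memory.

```python
import numpy as np

def similar_neighborhoods(X, Q):
    """For each sample, find its Q nearest neighbors (Euclidean distance)
    and stack them into the high-dimensional input of the fusion AE.
    X: (N, n) whitened features. Returns S with shape (N, Q * n)."""
    sq = (X ** 2).sum(axis=1)
    # pairwise squared Euclidean distances via the expansion ||a-b||^2
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    np.fill_diagonal(d2, np.inf)          # exclude the sample itself
    idx = np.argsort(d2, axis=1)[:, :Q]   # indices of the Q nearest samples
    return X[idx].reshape(X.shape[0], -1)
```

With n = 511 bands and Q = 9 this input would reach 4599 dimensions, which is why the paper settles on Q = 5.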

Spectral Similarity Variability Feature
Although hyperspectral images are very rich in spectral information, because of illumination, noise and other factors, the spectral information of a pixel exhibits the phenomenon of 'same object, different spectrum' (that is, the spectra of the same object differ). This difference is defined as spectral variation (SV), and the extracted spectral variation information is called the spectral variation feature. How to extract the spectral variation information, and how to use it to enhance the performance of anomaly detection, are the focus of this section.
Every pixel has similar pixels in the global scope, and the spectral features combined from other similar pixels are called similar features (SF). Supposing that similar pixels are different spectra of the same category, the variation between them is called the spectral similarity variability feature (SSVF). The advantages of the SSVF in hyperspectral anomaly detection lie in the following aspects. First, the differing characteristics of the background and the anomaly target mean that there is a large variation between anomaly targets, as outliers, and their similar features. Second, since different ground object types exhibit different spectral changes, the SSVF can to a large extent distinguish the ground object types in a scene.

Spectral Similarity Variability Feature Extraction Based on Residual Autoencoder
In the similar feature fusion stage, a similar fusion feature Y carrying the information of multiple neighboring pixels is obtained using the autoencoder. In order to obtain the variability feature between the SFF and the original features, the residual autoencoder network takes the SFF as its input and the original features as its labels. The structure of the residual autoencoder network is shown in Figure 5, and the SSVF is obtained as follows. First, the activation value of the network is obtained by forward propagation, where θ_1 and θ_2 are the parameters of the network and Y is the similar feature.
The purpose of the residual autoencoder is to obtain the error generated when samples in the similar feature space are mapped to the original feature space. The parameters θ_1 and θ_2 are adjusted through back-propagation to minimize the cost function J (the mean square error over the sample set), where Z^(i) represents the ith activation value and X^(i) represents the ith original hyperspectral sample.
Then, the difference between the activation value Z and the input Y of the residual network is used as the error for back-propagation to update the network parameters θ_1 and θ_2.
After training is completed, the spectral similarity variability feature E can be obtained, and the detection result is obtained as R(E) = RXdetector(E), where RXdetector(•) represents the RX anomaly detection algorithm and R(E) represents the detection result of the feature set.
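As a reference for the final step, a minimal global RX detector over a feature set E might look as follows. This is the standard Mahalanobis-distance formulation of RX; the function name and the use of a pseudo-inverse for numerical stability are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def rx_detector(E):
    """Global RX detector: Mahalanobis distance of each pixel from the
    global statistics of the feature set.
    E: (N, n) feature matrix (e.g. the SSVF).
    Returns an (N,) vector of anomaly scores."""
    mu = E.mean(axis=0)
    Ec = E - mu                            # center on the global mean
    cov = Ec.T @ Ec / E.shape[0]           # global covariance
    cov_inv = np.linalg.pinv(cov)          # pseudo-inverse for stability
    # per-pixel Mahalanobis distance: e_i^T C^{-1} e_i
    return np.einsum('ij,jk,ik->i', Ec, cov_inv, Ec)
```

Pixels whose SSVF deviates strongly from the global statistics receive large scores and are flagged as anomaly candidates.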

Comparison Algorithm
In this experiment, 10 groups of related comparison algorithms were selected to verify the superiority of the SSVFRX algorithm. The global RX detector (GRXD) [29] is the most basic detection method in the field of anomaly target detection and is widely used in a variety of anomaly detection fields. GRXD based on PCA [13] applies GRXD after PCA, the most commonly used feature extraction method. Principal component reconstruction error (PCRE) [19] is an anomaly detection method based on the residual (error) caused by PCA projection in the reconstruction of the original image. Anomaly detection based on autoencoder (ADAE) [21] detects anomaly targets through the residual of the autoencoder. Hyperspectral anomaly detection by fractional Fourier entropy (FrFE) [17] is an anomaly detection method based on feature extraction and selection. The low-rank and sparse decomposition model with a mixture of Gaussian (LSDMMoG) [25] constructs mixture-of-Gaussians models based on sparse components and a low-rank background. Information entropy estimation based on point-set topology (IEEPST) [30] combines point-set topology and information entropy theory to reveal data characteristics and data arrangement in topological space. Hyperspectral anomaly detection based on chessboard topology (CTAD) [31] uses a checkerboard topology to mine high-dimensional data features. Hyperspectral anomaly detection with guided autoencoder (GAED) [32] is a guided multi-layer autoencoder that reduces the feature representation of the anomaly target by providing feedback.


Parameter Selection
In order to better generalize the model, the parameter selection phase focuses on selecting common hyperparameters that apply to most of the data. This section therefore mainly explains how the parameters are adjusted within a certain range.
(1) The first parameter to be adjusted is Q (the number of nearest neighbors). Because the number of neighbors directly affects the dimension of the input data in the similar feature fusion phase, the value of Q should not be too large, so as not to harm computational efficiency. Taking D3 as an example, as shown in Table 2, the anomaly detection accuracy reaches its maximum when Q = 9. However, with Q = 9 and a dataset dimension of 511, the input data dimension would be as high as 4599, which would affect the computational efficiency of the algorithm. Therefore, Q is set to 5 at this stage. (2) In order to ensure the stability of the detection results, the error in detection performance is small once the network reaches the convergence state. The hyperparameters can be adjusted to control the degree and speed of network convergence and to avoid falling into a local optimum, as follows. The main parameters are the learning rate (a), the learning rate decay (b), the maximum number of iterations (T) and the batch size. The learning rate decay controls the attenuation speed of the learning rate: as the number of iterations increases, a(t) = b × a(t − 1), with a = 0.1 initially, according to experience. However, debugging showed that the algorithm converges slowly and easily falls into a local optimum when b ≠ 1, so b = 1. Batch size refers to the number of samples in one model training step. It is related to the number of training samples, and a small sample set may need only one batch of training. Although a large batch size can improve the training speed, it may also cause slow convergence, poor generalization performance and even over-fitting; if the batch size is small, data need to be loaded more frequently. Experience has shown that the batch size is usually 1% of the sample size (batch size = N × 1%). The number of iterations, T, depends on the degree and speed of convergence after the above parameters are determined and is generally set to 100 according to the convergence situation.
(3) n_0 is the hidden-layer dimension of the residual autoencoder. The mapping direction of the hyperspectral image is controlled by adjusting n_0, and different mapping spaces affect the separability of different features. Based on experience, this is usually set to n − 20, where n is the original data dimension. (4) n_1 is the dimension of the last layer of the residual network. As the algorithm needs to obtain the difference between the similar fusion features and the original data, n_1 must be consistent with the original image dimension.
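The parameter rules in items (2)-(4) can be collected into a small helper. This is an illustrative sketch of the stated heuristics (the function and variable names are hypothetical, not from the paper):

```python
def training_hyperparams(N, n, a0=0.1, b=1.0, T=100):
    """Hyperparameter choices described above.
    N: number of samples, n: original spectral dimension."""
    batch_size = max(1, round(N * 0.01))       # batch size = 1% of sample size
    lr_schedule = [a0 * b ** t for t in range(T)]  # a(t) = b * a(t-1), a(0) = a0
    n0 = n - 20                                 # hidden-layer dim of residual AE
    n1 = n                                      # output dim matches original bands
    return batch_size, lr_schedule, n0, n1
```

With b = 1, as chosen in the paper, the learning rate stays constant at a0 = 0.1 throughout training.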

Experimental Results
The basic evaluation indexes adopted in this section mainly include the three-dimensional receiver operating characteristic (3D ROC) [33], statistical separability analysis (SSA) [34] and the detection result image (DRI) [35]. Seven experimental datasets and five comparison algorithms were selected to verify the superiority of SSVFRX.
The 3D ROC is an extension of the traditional ROC curve, where the threshold τ is used as an independent variable to illustrate the three-dimensional relationship among P_D, P_F and τ. Here, P_D represents the probability of correctly identifying a target when the true value is indeed a target, also known as the probability of detection; P_F represents the probability of incorrectly identifying a target when the true value is a non-target, also known as the probability of false alarm. Figures 6-12 display the 3D ROC curves for the seven sets of experimental data, along with their corresponding 2D projections. Based on these, new quantitative performance metrics are defined. AUC(D,F) is the area under the P_D versus P_F curve, while AUC(D,τ) is the area under the P_D versus τ curve. Both of these metrics are positively correlated with target detection performance: the higher the value, the better the detection performance. AUC(F,τ) is the area under the P_F versus τ curve and is negatively correlated with background suppression performance: the lower the value, the better the background suppression. In addition, several comprehensive indicators are defined below: AUCTD = AUC(D,F) + AUC(D,τ), which represents target detectability (TD); AUCBS = AUC(D,F) − AUC(F,τ), which represents background suppressibility (BS); AUCSNPR = AUC(D,τ)/AUC(F,τ), which measures the signal-to-noise ratio by treating the target as the signal and the background as noise; AUCTDBS = AUC(D,τ) − AUC(F,τ), which represents TD within the background; and AUCODP = AUC(D,τ) + (1 − AUC(F,τ)), which represents the overall detection accuracy.
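The composite indicators above follow directly from the three base AUC values; a small sketch (function name illustrative):

```python
def composite_auc(auc_df, auc_dt, auc_ft):
    """Composite 3D-ROC metrics from the three base areas:
    auc_df = AUC(D,F), auc_dt = AUC(D,tau), auc_ft = AUC(F,tau)."""
    return {
        "AUC_TD":   auc_df + auc_dt,        # target detectability
        "AUC_BS":   auc_df - auc_ft,        # background suppressibility
        "AUC_SNPR": auc_dt / auc_ft,        # target-to-background SNR
        "AUC_TDBS": auc_dt - auc_ft,        # TD within the background
        "AUC_ODP":  auc_dt + (1 - auc_ft),  # overall detection accuracy
    }
```

Note that AUC_ODP as defined here does not include AUC(D,F); only the two threshold-dependent areas enter it.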
The aforementioned metrics across the seven experimental datasets are presented in Tables 3-9, where ↑ indicates that the value is proportional to performance, ↓ indicates that the value is inversely proportional to performance, and bold indicates the optimal solution. After analyzing the AUC results for these datasets, the following observations are made: (1) Background suppressibility (BS): AUC(F,τ) and AUCBS correlate with BS capacity.
The SSVFRX model exhibits a number of characteristics in the BS experiments. In most experimental datasets, SSVFRX has the best AUCBS (comprehensive BS) performance. Despite the low performance of AUC(F,τ) under a single hypothesis, its comprehensive BS is strong. In addition, in datasets D4 and D5, the AUCBS of SSVFRX is second only to one model (a different model in each case), and the differences are very small, 0.0013 and 0.0185, respectively, which suggests that SSVFRX has a superior performance in background suppression.
(2) Target detectability (TD): AUC(D,F), AUC(D,τ), AUCTD and AUCTDBS represent the TD in different cases.
Combining the detection results in Tables 3-9, the SSVFRX model has the best AUC(D,F) performance among all experimental data. However, SSVFRX generally performs worse in the AUC(D,τ) of a single hypothesis. This may be due to limitations in the target detection ability under different threshold conditions.
The AUCTD of SSVFRX is ranked 2nd, 1st, 2nd, 2nd, 2nd, 2nd, 1st and 3rd in D1~D7, respectively, which indicates that the target detection performance is relatively stable in different scenarios and performs well in most cases. The AUCTDBS of SSVFRX is ranked 5th, 1st, 3rd, 2nd, 2nd, 1st and 2nd in D1~D7, respectively. This indicates that, in terms of the ability of TD to remove BS, SSVFRX performs relatively consistently and excels in most cases.
(3) Overall detection accuracy: AUCODP represents the overall detection accuracy.
The overall detection results show that the SSVFRX model has higher AUCODP scores on most of the datasets, revealing an advantage in overall detection accuracy. In particular, AUCODP differs by only 0.054, 0.1675 and 0.1374 from CTAD in D4, D5 and D7, respectively; even so, SSVFRX is still the best-performing model besides CTAD. This shows that SSVFRX has a better overall detection performance than other global detection methods and outperforms local detection methods on most datasets.
SSA is used to assess the separability of the anomaly target and the background. The red box indicates the range of values for the anomaly target, and the green box indicates the range of values for the background. The distance between the lower limit of the red box and the upper limit of the corresponding green box reflects the degree of separability between the anomaly target and the background: a larger distance represents a higher degree of separability, or, in other words, a more prominent anomaly target. The height of the green box represents the degree of background suppression; the smaller the height, the stronger the suppression. As shown in Figure 13, SSVFRX significantly improves the separability between the background and the anomaly target while suppressing the background. In particular, on datasets D4, D5, and D7, SSVFRX shows a lower degree of separability than CTAD but a higher degree of background suppression.
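The two box-plot quantities read from Figure 13 (the red-green gap and the green-box height) can be computed as a numerical proxy as follows; the percentile span standing in for each box's extent is an assumption, since the text does not state which quantiles the boxes cover.

```python
import numpy as np

def ssa_stats(scores, labels, lo=2.5, hi=97.5):
    """Red-green gap (separability) and green-box height (background suppression).

    The (lo, hi) percentile span standing in for each box's extent is an
    illustrative assumption.
    """
    anom = scores[labels == 1]
    back = scores[labels == 0]
    gap = np.percentile(anom, lo) - np.percentile(back, hi)     # larger = more separable
    height = np.percentile(back, hi) - np.percentile(back, lo)  # smaller = better suppression
    return gap, height
```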
DRI is a two-dimensional flat view that uses color depth to represent an anomaly. As shown in the legend in Figure 8, the value represents the probability that the sample is an anomaly. DRI contains spatial information that can be used to observe differences between various categories, including the anomaly target and the background. As can be seen from Figure 8, the contours between the anomaly target and the background are clearer and more separable for SSVFRX than for the other comparison algorithms.
As shown by the running time in Table 10, the SSVFRX algorithm exhibits a higher computational cost than the other compared algorithms. This is mainly because the algorithm is sensitive to the data dimension and has a complex structure containing two sets of deep learning networks. Therefore, the runtime increases significantly when dealing with the higher-dimensional dataset (D2).

Discussion
The hypothesis is that SSVFRX produces a greater degree of difference between the background and the anomaly target, which can lead to better detection accuracy. The experimental results show that this hypothesis is correct. By theoretical analysis, the superiority of the SSVFRX algorithm may be due to the following reasons.
First, through network training, features of the same type are more likely to be mapped in the same direction. Second, the similar neighborhoods of an anomaly target tend to differ from the target itself to a greater extent. Third, the trained model tends to fit the characteristics of the majority of the data, so the errors arising from the small number of anomaly targets account for a lower proportion of the back-propagated error.
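The third point, that a model fitted to the majority leaves larger errors on the few anomalies, can be illustrated with a linear stand-in for the paper's residual autoencoder: a closed-form PCA reconstruction rather than the trained network, so every name below is illustrative.

```python
import numpy as np

def residual_feature(X, n_components=3):
    """Per-pixel reconstruction-residual magnitude from a linear autoencoder.

    X: (n_pixels, n_bands) spectra. PCA gives the closed-form optimum of a
    linear autoencoder, so the subspace fits the majority (background) and
    the few anomalies are left with large residuals.
    """
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:n_components]        # encoder rows (decoder is the transpose)
    recon = Xc @ W.T @ W         # encode, then decode
    return np.linalg.norm(Xc - recon, axis=1)
```

On synthetic data whose background lies on a low-dimensional subspace, the single off-subspace pixel receives the largest residual.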
The possible reasons for the superiority of SSVFRX are analyzed based on the following experimental results. First, the 3D ROC detection results (Tables 3-9 and Figures 4-12) indicate that, in most cases, SSVFRX significantly improves both target detectability and background suppression; the exceptions are datasets D4, D5, and D7, where its separability is relatively low compared with CTAD. A similar trend is observed in the SSA analysis (Figure 13): except for D4, D5, and D7, SSVFRX enhances the separability of the background and the anomaly target more effectively. A possible reason is that CTAD is a local anomaly detection method, which has some advantages in highlighting anomalies compared with the global detection of SSVFRX. However, anomalies are always relative to a particular background, and anomalies in the global scope do not necessarily belong to anomalies in the local scope, and vice versa. Therefore, CTAD exhibits relatively weaker background suppression, as evidenced by its lower performance compared with SSVFRX on datasets D4, D5, and D7 (Tables 6, 7 and 9). This is further validated by the detection results in Figure 14: comparing CTAD and SSVFRX, there are clearly more false detections in the CTAD background, while the background of SSVFRX is cleaner. A possible reason is that the background suppressed by CTAD is not the global background, so the background is easily mistaken for an anomaly.
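For reference, the global detection that the global/local contrast above turns on is the classical RX detector: the Mahalanobis distance of each spectrum to a mean and covariance estimated from the whole scene, rather than from a sliding local window as in a local method. A minimal sketch (the ridge term is an added numerical safeguard, not part of the original formulation):

```python
import numpy as np

def rx_global(X):
    """Global RX score: squared Mahalanobis distance of each spectrum to the
    scene-wide mean and covariance (the 'global background').

    X: (n_pixels, n_bands). The small ridge added to the covariance is a
    numerical safeguard, not part of the original formulation.
    """
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False) + 1e-6 * np.eye(X.shape[1])
    cov_inv = np.linalg.inv(cov)
    return np.einsum("ij,jk,ik->i", Xc, cov_inv, Xc)
```

A pixel far from the bulk of the scene statistics receives the highest score, regardless of where it sits spatially; this is exactly why a globally estimated background can be suppressed more cleanly than a locally estimated one.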
Second, it can be seen from the DRI (Figure 14) that SSVFRX obtains a clearer contour of the anomaly target, representing a better separation between the anomaly target and the background. It can be inferred that SSVFRX is able to increase the difference between the background and the anomaly target, because the features of samples are mapped in the direction of their similar neighborhoods. An anomaly target is an isolated point, which makes it very different from its similar pixels, while the other categories of ground objects in the background differ less from their similar pixels. This is why SSVFRX is able to suppress the background better.
Third, the running time of the SSVFRX algorithm (Table 10) shows a relatively high computational cost. Nevertheless, it demonstrates significant advantages in key aspects such as detection accuracy, background suppression, and anomaly target highlighting. In practical applications, it is necessary to balance the algorithm's performance and efficiency according to specific requirements. For scenarios where real-time processing is not critical but high detection accuracy is crucial, the advantages of SSVFRX may far outweigh its runtime drawbacks. Moreover, improvements can be made to reduce the computational cost. First, dimensionality reduction can be considered: the comparison of D1 and D2 shows that, when the sample size is large, a lower dimension reduces the computational cost. Second, parallel computing or GPU acceleration techniques can be utilized to increase the algorithm's execution speed.
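The first cost-reduction suggestion, reducing the spectral dimension before detection, can be sketched as a PCA band projection; the component count is an illustrative choice, not a tuned parameter from the paper.

```python
import numpy as np

def reduce_bands(X, n_components=10):
    """Project (n_pixels, n_bands) spectra onto the leading PCA directions,
    shrinking the input that the downstream networks and detector must process."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T
```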
The SSVFRX algorithm shows significant advantages in anomaly target detection tasks, improving both the separability between the background and the anomaly target and the degree of background suppression, particularly the latter. These advantages mainly stem from the efficient mapping of different classes toward their similar samples. While local detection methods may have advantages in some specific cases, the global SSVFRX is more advantageous in terms of background suppression.

Conclusions
SSVFRX is capable of capturing rich anomaly and difference information, effectively distinguishing different types of features, and highlighting anomaly targets. Experiments have shown that SSVFRX is able to improve target-background separability and background suppression at the same time. The advantages of SSVFRX are mainly reflected in the following aspects. The anomaly target appears as an isolated point whose similar features differ from its original features; SSVFRX can accurately capture such differences and improve the accuracy of anomaly detection. Meanwhile, SSVFRX maps the background in a similar direction to enhance the background suppression effect, which improves the detection capability and reduces the false alarm rate. However, there is still room for improvement in computational efficiency, such as optimizing the network structure, developing high-dimensional data processing techniques, exploring optimal parameter configurations, and leveraging parallel computing or GPU acceleration. Through continuous optimization, the efficiency and performance of SSVFRX can be further improved, so that it can play a greater role in the field of hyperspectral anomaly detection.

Figure 1. False-color image and target position of experimental data.

Figure 3. The flow chart of pre-processing.

Figure 13. Comparison of SSA of different methods on different datasets.

Table 2. The relationship between parameter k and AUC.

Table 3. AUC performance comparison of different methods on D1.

Table 4. AUC performance comparison of different methods on D2.

Table 5. AUC performance comparison of different methods on D3.

Table 6. AUC performance comparison of different methods on D4.

Table 7. AUC performance comparison of different methods on D5.

Table 8. AUC performance comparison of different methods on D6.

Table 9. AUC performance comparison of different methods on D7.

Table 10. Running time comparison of different methods on different datasets.