An Object-Oriented Deep Multi-Sphere Support Vector Data Description Method for Impervious Surfaces Extraction Based on Multi-Sourced Data

Abstract: The effective extraction of impervious surfaces is critical to monitor their expansion and ensure the sustainable development of cities. Open geographic data can provide a large number of training samples for machine learning methods based on remote-sensed images to extract impervious surfaces due to their advantages of low acquisition cost and large coverage. However, training samples generated from open geographic data suffer from severe sample imbalance. Although one-class methods can effectively extract an impervious surface based on imbalanced samples, most of the current one-class methods ignore the fact that an impervious surface comprises varied geographic objects, such as roads and buildings. Therefore, this paper proposes an object-oriented deep multi-sphere support vector data description (OODMSVDD) method, which takes into account the diversity of impervious surfaces and incorporates a variety of open geographic data involving OpenStreetMap (OSM), Points of Interest (POIs), and trajectory GPS points to automatically generate massive samples for model learning, thereby improving the extraction of impervious surfaces with varied types. The feasibility of the proposed method is experimentally verified with an overall accuracy of 87.43%, and its superior impervious surface classification performance is shown via comparative experiments. This provides a new, accurate, and more suitable extraction method for complex impervious surfaces.


Introduction
Global urbanization is rapidly accelerating, and the urban population is growing at an astonishing rate. As of 2022, 56% of the world's population live in urban areas, and the urban population of the world has surged from 751 million in 1950 to 4.4 billion in 2022 [1]. The urban population boom has led to a significant demand for urban land space, which results in the continued expansion of urban impervious surfaces. Impervious surfaces, such as building roofs, parking lots, and roads, are earth surfaces that prevent water from penetrating into the ground [2,3]. Their physical imperviousness brings a great challenge to the urban ecological environment. The transformation of permeable surfaces to impervious surfaces can negatively impact the urban thermal [4,5] and hydrological environments [6][7][8]. Therefore, the effective extraction of impervious surfaces is critical to monitor their expansion and ensure the sustainable development of cities.
Remote-sensed images are becoming increasingly important as a data source for impervious surface extraction, owing to their advantages of large-area simultaneous observation, increasingly convenient access, and high spatial resolution. Shao et al. [9] and Cao et al. [10] extracted impervious surfaces by constructing time series of Landsat images. Liu et al. [11] and Misra et al. [12] used Sentinel-2 satellite images to generate higher-spatial-resolution impervious surface products. Attarchi [13] demonstrated the potential of Advanced Land Observing Satellite/Phased Array L-band Synthetic Aperture Radar images for impervious surface detection in different urban areas. These studies each used a single type of optical or radar remote sensing image and can extract impervious surfaces quickly. However, optical images are susceptible to illumination conditions and clouds, and radar images are susceptible to speckle noise and geometric deformation, both of which reduce the accuracy of the model.
Methods that integrate multiple remote sensing data sources have been proposed to improve impervious surface extraction accuracy. Shao et al. [14] fused Gaofen-1 and Sentinel-1A images to extract urban impervious surfaces. Guo et al. [15] constructed a multi-feature-based urban impervious surface extraction method based on Sentinel-2 multispectral data and Luojia 1-01 images. Sun et al. [16] extracted impervious surfaces based on the WorldView-2 and airborne LiDAR datasets. Combining multiple remote sensing data sources can compensate for the limitations of a single image. Impervious surface extraction based on remote sensing is usually combined with machine learning algorithms, such as support vector machines (SVM), random forest, neural networks, and decision trees [17][18][19]. Despite their effectiveness, most of these machine-learning-based methods require massive training samples, which need to be manually labeled in remote sensing images.
To alleviate the manual workload of labeling impervious surface samples, researchers proposed auto-labeling methods for impervious surface extraction using open geographic data shared by ordinary people or organizations [20]. Mao et al. [21] employed OSM data to eliminate the shading effect of vegetation and improve impervious surface extraction accuracy. Huang et al. [22] used OSM data to assist in the selection of training samples for model training and generated a global artificial impervious surface area dataset at 10 m resolution. Points of Interest (POIs) are also often employed in impervious surface extraction applications [23][24][25]. In addition, social media data with geographic and human activity characteristics have emerged as another promising source of impervious surface information. Miao et al. [26] generated samples from Twitter and OSM and verified their feasibility for extracting impervious surfaces. Wu et al. [27] integrated Twitter, Weibo, POIs, and OSM data to propose a new impervious surface extraction scheme from synthetic aperture radar images. Vehicle trajectory GPS data have been successfully utilized to generate massive impervious surface samples of road types, which can be used for the automatic extraction of impervious surfaces [28].
Although open geographic data can be important auxiliary data for impervious surface extraction, most of the data are generated on impervious surfaces, resulting in strongly biased samples that only contain impervious surface samples. This creates a serious data imbalance issue in which the training sample contains only one class of target data [29,30]. To address this issue, one-class algorithms, such as one-class support vector machine (OCSVM) and support vector data description (SVDD), have been proposed and applied to impervious surface extraction based on open geographic data and remote-sensed images [26,28,31]. However, these one-class classification methods assume that impervious surface objects come from a single cluster, which ignores the fact that an impervious surface comprises varied geographic objects, such as roads and buildings. These different types of impervious surfaces present different characteristics, such as spectral and textural features, in remote-sensed images.
To overcome the challenge of data multimodality, the concept of multiple hyperspheres is introduced into one-class methods in the field of anomaly detection. Hu et al. [32] proposed a multimodal deep support vector data description (DSVDD) method. This approach constructs multiple hyperspheres to provide a better description for the target class of data for text classification. Zahra et al. [33] proposed a deep multi-sphere support vector data description (DMSVDD) method, which embeds normal data with multimodal distributions into multiple data-packed hyperspheres with minimal volume to generate useful and differentiated features. These methods provide valuable insights for overcoming the limitations of traditional one-class classification methods in handling impervious surface extraction based on open geographic data and remote-sensed images.
Given the difficulty of acquiring massive impervious surface samples with labels and the problem that the traditional one-class classification algorithm is not applicable to multimodal sample data, an object-oriented deep multi-sphere support vector data description (OODMSVDD) method is proposed in this paper. The study aims to integrate multiple open geographic data sources involving OSM, POIs, and vehicle trajectory GPS points to generate impervious surface samples automatically, which will reduce the manual workload of labeling samples. Furthermore, a new one-class classification approach with multiple hyperspheres is explored to improve impervious surface extraction accuracy. The remainder of the paper is organized as follows: The study materials and methods involved in this paper are elaborated in Section 2. Sections 3 and 4 present the experiment setup and experimental results, respectively. The discussion and conclusion are described in Sections 5 and 6.

Study Area
For our study, a portion of Shenzhen City was selected as the study area. Shenzhen is one of the world's most rapidly urbanizing cities and has been ranked as an "Alpha-" city by GaWC 2020 [34]. The study area spans 113.90°E–114.09°E and 22.60°N–22.78°N and lies mostly within the Longhua, Longgang, Nanshan, Baoan, and Guangming districts, as shown in Figure 1.

Data Sources
As shown in Figure 1, the experiments in this paper use one panchromatic multispectral image from the Gaofen-1 satellite launched by China. The image was taken on 2 October 2018, and consists of one panchromatic band and four multispectral bands that were preprocessed and fused into a multispectral image with a 2-meter resolution. The dataset contains 10,000 × 10,000 pixels, covering varied land types, such as water bodies, grasslands, woodlands, artificial buildings, and roads.
Multiple sources of open geographic data, including POIs from Amap (Figure 2a), vehicle trajectory GPS data [35] (Figure 2b), and roads and buildings data from OSM (Figure 2c,d), were collected for this study. These open geographic data are used to automatically generate impervious surface samples. In order to minimize the errors caused by the temporal differences in the data, all the open geographic data were collected in 2018 (Table 1).

In this paper, an object-oriented automatic extraction method of impervious surface based on DMSVDD is proposed. The technical flow chart of the proposed method is shown in Figure 3. The method consists of five parts: (1) automatic generation of impervious surface samples, which integrates multiple open geographic datasets to generate impervious surface samples; (2) object-oriented training sample processing based on multi-scale segmentation results; (3) construction and training of DSVDD based on a single sphere for impervious surface extraction; (4) modeling of the DMSVDD algorithm for impervious surface extraction, which takes into account the diversity of impervious surface types; (5) optimization of impervious surface extraction results based on object blocks.

Automatic Generation of Impervious Surface Samples
Multi-sourced open geographic data are utilized to generate the impervious surface samples with various land cover categories. To ensure consistency with the study area, the datasets are cropped based on the fused GF-1 image and projected to UTM projection with WGS-1984 datum. The data used to generate impervious surface samples include POIs, vehicle trajectory GPS points, and roads and buildings from OpenStreetMap (OSM). To generate impervious surface raster sample data, all the open geographic datasets are classified into two types: points, and polylines or polygons. A separate sample generation method is developed for each type (Figure 4).
(1) The point datasets, including POIs and vehicle trajectory GPS points, are converted into raster data. Each pixel value is the frequency of points falling within that pixel; if no point falls within a pixel, its value is set to zero.
(2) The roads and buildings data are converted into rasters. Areas with data (i.e., covered by roads or buildings) are assigned a value of "1", and areas without data are assigned a value of "0".
The more open geographic data generated on impervious surfaces in a certain area, the higher the probability that the area is covered by impervious surfaces. However, due to the users' opportunistic observation efforts, it is inevitable that there exists spatial bias in open geographic data, especially in data shared by ordinary people [36][37][38]. To address this issue, a threshold method is used to filter the generation of impervious surface samples (Figure 4). Firstly, maximum-minimum normalization is implemented to compress the frequencies of the various multi-sourced open geographic data to the interval [0, 1]. Secondly, a sliding window is used to crop the normalized frequency data and remote sensing data. The size of the cropped block is determined based on the network structure of the model. In our paper, the cropped block size is 14 × 14 pixels, and the coverage area is 28 m × 28 m. Thirdly, for each cropped block, the sum of its pixel values is computed and denoted as Y, and a threshold y is set. If Y is not less than the threshold y, then this block has a high probability of being an impervious surface. All pixel blocks that meet the threshold requirement are used as impervious surface samples to train the impervious surface model.
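The three filtering steps above (normalization, sliding-window cropping, thresholding on the block sum Y) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes a single combined frequency raster stored as a 2-D list, and the function names `min_max_normalize` and `select_sample_blocks` are hypothetical. The default stride of half the window length follows the 50% overlap described in the experimental setup.

```python
def min_max_normalize(grid):
    """Rescale a 2-D list of frequencies to the interval [0, 1]."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant raster: avoid dividing by zero
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def select_sample_blocks(freq, size=14, stride=7, threshold=30.0):
    """Return (row, col) origins of windows whose summed value Y >= threshold.

    Assumption: `freq` is one raster combining all data sources; how the
    sources are merged is not specified here.
    """
    norm = min_max_normalize(freq)
    n_rows, n_cols = len(norm), len(norm[0])
    kept = []
    for r in range(0, n_rows - size + 1, stride):
        for c in range(0, n_cols - size + 1, stride):
            # Y: total of the normalized pixel values inside the window
            total = sum(norm[i][j]
                        for i in range(r, r + size)
                        for j in range(c, c + size))
            if total >= threshold:  # "Y is not less than the threshold y"
                kept.append((r, c))
    return kept
```

The returned origins identify which 14 × 14 blocks of the remote sensing image would be cropped as candidate impervious surface samples.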

Object-Oriented Processing of Training Samples
Due to the difference between sample blocks and geographical objects, it is possible that the filtered training samples still contain a small fraction of permeable materials, which are noise samples. In order to eliminate the permeable surface data from the training samples, the object-oriented analysis technique is adopted to process each filtered sample individually. This can make the final training samples as "pure" as possible to reduce the uncertainty brought by noise samples.
Object-oriented analysis techniques can segment remote sensing images into geographical objects as the basic processing unit. In our experiments, we used eCognition software (version 9.0, Trimble Germany GmbH, Munich, Germany, 2014) to segment remote sensing images. The eCognition software is currently one of the most popular tools for image segmentation [39]. For each filtered sample, we performed an object-oriented process, as follows (Figure 5): (1) If the sample block contains only one object block, the sample is used directly for model training; (2) if the sample contains more than one object block, only the image spectral data of the largest object block are retained, while the other regions within the sample are filled with zero values. These samples processed by the object-oriented method are used to train the model.
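The two-case rule above can be illustrated with a short sketch. It assumes, for simplicity, a single-band sample block and a matching grid of segment identifiers exported from the segmentation step; the function name `keep_largest_object` and the tie-breaking behaviour (the first segment encountered wins) are illustrative assumptions, not details from the paper.

```python
from collections import Counter


def keep_largest_object(block, segments):
    """Zero every pixel that lies outside the largest object in a sample block.

    block    -- 2-D list of pixel values (one band, for illustration)
    segments -- 2-D list of segment ids with the same shape as `block`
    """
    counts = Counter(seg for row in segments for seg in row)
    if len(counts) == 1:
        # Case (1): a single object block, the sample is used unchanged
        return [list(row) for row in block]
    # Case (2): keep only the object covering the most pixels
    largest, _ = counts.most_common(1)[0]
    return [[px if seg == largest else 0
             for px, seg in zip(prow, srow)]
            for prow, srow in zip(block, segments)]
```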

Impervious Surface Extraction Based on DSVDD
The DSVDD approach couples SVDD and neural network approaches [40]. DSVDD uses the one-class classification objective function in Equation (1) to learn the feature representation of an impervious surface, which in turn is used for classification. As shown in Figure 6a, the positive class samples are mapped into an optimal hypersphere during the training process of DSVDD. The hypersphere should be as small as possible and contain as many positive sample points as possible. The sample points falling outside the hypersphere are considered negative classes.
where $R$ denotes the radius of the hypersphere; $c$ is the center of the hypersphere; and $n$ is the number of samples. $\phi(\cdot; W): X \to F$ represents the neural network; for a sample $x_i \in X$, its feature representation under the weights $W$ is $\phi(x_i; W)$, where $W = \{W^1, \ldots, W^L\}$ is the set of weight matrices of a network with $L$ hidden layers.
Parameter $\nu \in [0, 1]$ is used to control the trade-off between the sphere volume and the boundary. The first part of the formula, the term in $R$ under the minimization, represents a constraint on the hypersphere. The second part of the equation, $\frac{\lambda}{2}\sum_{l=1}^{L}\|W^{l}\|_F^2$, is a constraint on the network that is used to prevent the network from overfitting.
To determine whether a test sample x i belongs to the positive or negative class, the DSVDD approach calculates the anomaly score of the sample using Equation (2), where W * is the weight matrix of the trained neural network.
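For reference, a sketch of what Equations (1) and (2) plausibly look like, assuming they follow the standard soft-boundary Deep SVDD formulation of [40]. The symbols match the definitions given in the text ($R$, $c$, $n$, $\phi$, $W$, $\lambda$, with $\nu$ as the trade-off parameter); the exact form printed in the original paper may differ.

```latex
\min_{R,\,W} \; R^{2}
  + \frac{1}{\nu n} \sum_{i=1}^{n}
    \max\left\{ 0,\; \left\| \phi(x_i; W) - c \right\|^{2} - R^{2} \right\}
  + \frac{\lambda}{2} \sum_{l=1}^{L} \left\| W^{l} \right\|_{F}^{2}
\tag{1}

s(x_i) = \left\| \phi(x_i; W^{*}) - c \right\|^{2} - \left(R^{*}\right)^{2}
\tag{2}
```

Under this assumed form, a sample with $s(x_i) \le 0$ lies inside the trained hypersphere and is assigned to the positive (impervious surface) class, while $s(x_i) > 0$ indicates the negative class.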

Impervious Surface Extraction Based on DMSVDD
DMSVDD [33] is an improvement of DSVDD that addresses the tendency of positive class data to exhibit a multimodal distribution (Figure 6b). The idea of DMSVDD lies in embedding the mapping of positive class data with multimodal distribution into multiple data hyperspheres with minimal volume to generate useful data representations. The objective function of DMSVDD is defined as follows.
where $R$, $K$, and $c$ represent the radius of the hypersphere, the number of hyperspheres, and the center of the hypersphere, respectively. For each sample $x_i$, assuming that its corresponding hypersphere center is $c_j$, all hyperspheres are constrained through the minimization over $R$ and $W$ such that the data are reasonably distributed among multiple hyperspheres, i.e., as many samples as possible fall into the spheres and the total volume of all spheres is minimized. The second part, $\frac{\lambda}{2}\sum_{l=1}^{L}\|W^{l}\|_F^2$, serves the same purpose as in the previous section and is used to prevent the network from overfitting. In the DMSVDD method, the anomaly scores of all the samples are also compared with the trained hypersphere radius to determine the class of the samples. The formula for determining the sample class in DMSVDD is shown in Equation (4), where $c_k$ is the center of the hypersphere to which sample $x_i$ belongs, which is determined according to the nearest neighbor principle. $W^{*}$ and $R^{*}$ refer to the trained neural network weight matrix and radius, respectively.
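For reference, a sketch of what Equations (3) and (4) plausibly look like, assuming they follow the multi-sphere formulation of [33] and reuse the notation of the single-sphere case. The per-sphere radius $R_k$, the averaging over $K$ spheres, and the $\arg\min$ assignment are assumptions consistent with the description above; the exact form printed in the original paper may differ.

```latex
\min_{R_1,\ldots,R_K,\,W} \;
  \frac{1}{K} \sum_{k=1}^{K} R_k^{2}
  + \frac{1}{\nu n} \sum_{i=1}^{n}
    \max\left\{ 0,\; \left\| \phi(x_i; W) - c_j \right\|^{2} - R_j^{2} \right\}
  + \frac{\lambda}{2} \sum_{l=1}^{L} \left\| W^{l} \right\|_{F}^{2}
\tag{3}

s(x_i) = \left\| \phi(x_i; W^{*}) - c_k \right\|^{2} - \left(R_k^{*}\right)^{2},
\qquad
k = \arg\min_{j \in \{1,\ldots,K\}} \left\| \phi(x_i; W^{*}) - c_j \right\|
\tag{4}
```

Here $c_j$ and $R_j$ in Equation (3) denote the center and radius of the sphere assigned to $x_i$. With $K = 1$ the expressions reduce to the single-sphere DSVDD case.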

Result Optimization Based on Object Blocks
The images to be predicted are fed into the trained model to obtain a classification map, which is considered as the rough impervious surface classification results of OODSVDD or OODMSVDD. A segmented object block as a whole has theoretically consistent classification results. However, due to the network input block size of 14 × 14, which is sometimes smaller than an object block size, and the prediction errors of the neural network model, an object block may be classified as multiple results. To ensure the overall consistency of the object block, object-oriented methods can be applied to optimize the coarse classification results.
To finely correct the classification results using object-oriented methods, we adopt a spatial statistical analysis method. If multiple predicted values occur within a block of objects, the predicted results (impervious surface/permeable surface) covering a larger area of the block are given higher priority. Thus, the predicted labels within an object block are counted and the final labels of that object block are taken according to the predicted label with the largest area within that object block. The resulting classification map represents the final fine classification results of OODSVDD or OODMSVDD after processing as above.
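The area-based correction described above amounts to a majority vote within each object block. A minimal sketch follows, assuming the coarse predictions and the segmentation are available as 2-D lists of the same shape with one entry per pixel (so pixel counts stand in for area); the function name `smooth_by_object` is hypothetical.

```python
from collections import Counter, defaultdict


def smooth_by_object(pred, segments):
    """Relabel every pixel with the majority prediction of its object block.

    pred     -- 2-D list of coarse labels (e.g., 1 = impervious, 0 = permeable)
    segments -- 2-D list of segment ids with the same shape as `pred`
    """
    votes = defaultdict(Counter)
    for prow, srow in zip(pred, segments):
        for label, seg in zip(prow, srow):
            votes[seg][label] += 1  # pixel count approximates covered area
    # The label covering the largest area wins within each object block
    winner = {seg: c.most_common(1)[0][0] for seg, c in votes.items()}
    return [[winner[seg] for seg in srow] for srow in segments]
```

In the case of an exact tie, this sketch keeps whichever label was encountered first; the paper does not state how ties are resolved.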

Assessment of Results
In order to verify the feasibility of the proposed method, five metrics commonly used for classification studies are calculated: Overall Accuracy (OA), Precision, Recall, F1-score, and Area Under the Curve (AUC). True positive, true negative, false positive, false negative, false positive rate, and true positive rate are denoted as TP, TN, FP, FN, FPR, and TPR, respectively. AUC represents the area under the receiver operating characteristic curve, which is formed by connecting the (FPR, TPR) values of the samples in a Cartesian coordinate system. The other metrics are calculated as follows.
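The count-based metrics follow their standard definitions: OA = (TP + TN) / (TP + TN + FP + FN), Precision = TP / (TP + FP), Recall = TPR = TP / (TP + FN), FPR = FP / (FP + TN), and F1 = 2 · Precision · Recall / (Precision + Recall). A small sketch of these definitions is given below; the function name `classification_metrics` is illustrative, and AUC is omitted because it requires the full set of anomaly scores rather than four counts.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute OA, Precision, Recall (TPR), FPR, and F1 from confusion counts."""
    total = tp + tn + fp + fn
    oa = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0  # identical to TPR
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"OA": oa, "Precision": precision, "Recall": recall,
            "FPR": fpr, "F1": f1}
```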

Experimental Setup
(1) Filtering threshold for training samples
To determine a reasonable threshold setting range for training sample filtering, experiments were conducted to count the number of automatically generated training samples under different threshold screening criteria, as shown in Figure 7. The number of filtered training samples decreases sharply as the threshold value is set higher. These training samples are obtained by cropping the remote sensing images through sliding windows (Figure 4), and the step of each slide is set to one half of the sample edge length, so that there is a 50% overlap between two consecutive samples. The threshold should not be set too high, so that model training has an adequate number of samples distributed over different geographical locations. If the threshold is set too high, the filtered samples will be overly concentrated in the central, core business district of the city, and impervious surface training samples slightly outside the central area will be lacking. Finally, considering the adequacy of the number of samples and the homogeneity of the geographic distribution of samples, five experimental scenarios with thresholds y (y ∈ {10, 20, 30, 40, 50}) were set, and 30,000 training samples were randomly selected from the set of samples after threshold screening for model training.
(2) Structure of neural network
According to the basic convolutional network structure, a convolutional neural structure applicable to impervious surface feature extraction is constructed for the DSVDD, DMSVDD, OODSVDD, and OODMSVDD methods in this paper (Figure 8). The input to the network is a three-band block of impervious surface images, with a size of 14 × 14 pixels. The network is made up of a convolutional layer, a pooling layer, and a fully connected layer, connected in sequence. A convolution kernel of size 5 × 5 is used to extract the local features of the input image, and a filter of size 2 × 2 is used to compute the maximum pooling. Then, these features are flattened to a fully connected layer of 98 nodes. Finally, these features are described by the methods to achieve the classification of samples. The objective functions of the OODSVDD and OODMSVDD networks are shown in Equations (1) and (3), respectively. To achieve the best results, the values of the parameters ν and λ are determined by network parameter tuning.
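The text gives the input size (14 × 14), the kernel size (5 × 5), the pooling size (2 × 2), and the 98-node fully connected layer, but not the padding or the number of convolution filters. The shape arithmetic below shows one configuration that is consistent with those numbers: "same" padding of 2 pixels and 2 output channels, giving 2 × 7 × 7 = 98 flattened features. Both values are assumptions for illustration only, as are the helper names `conv_out` and `flattened_features`.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size=14, kernel=5, padding=2, pool=2, channels=2):
    """Number of features after conv -> max-pool -> flatten.

    padding=2 and channels=2 are assumed values, not taken from the paper.
    """
    after_conv = conv_out(input_size, kernel, padding)  # 14 with padding=2
    after_pool = after_conv // pool                     # 7 after 2x2 pooling
    return channels * after_pool * after_pool
```

Without padding, the same layers would yield a 5 × 5 map per channel, which cannot produce 98 features for any whole number of channels; this is why padded convolution is assumed here.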

Experimental Scenarios
In order to test the performance of the proposed method under different conditions, the following three sets of comparison experiments are set up. The performances of all scenarios are evaluated using the average performance of five random seeds and using the same test samples.
(1) Filtering threshold for training samples To determine a reasonable threshold setting range for training sample filtering, experiments are conducted to compute the number of automatically generated training samples under the different thresholds (Figure 7). The number of filtered training samples decreases sharply as the threshold value increases. These training samples are obtained by cropping the remote sensing images with a 14 × 14 sliding window (Figure 4), and the step of each slide is set to one half of the sample edge length. This means that there is a 50% overlap between two adjacent samples. To train the models, 30,000 samples are randomly selected from the generated impervious surface samples in this study. With thresholds increased in steps of 10, the number of samples falls below 60,000 once the threshold exceeds 50. Therefore, considering the number and the spatial homogeneity of the samples, five experimental scenarios with thresholds y (y ∈ {10, 20, 30, 40, 50}) are set.
(2) Different methods A set of control experiments is set up to verify the effectiveness of the object-oriented models involving DSVDD, DMSVDD, and OODMSVDD. For the above three methods, the samples are generated based on the same datasets and from the same region. The only difference is that the training samples of OODMSVDD undergo the object-oriented process (see Section 2.3.3), while the training samples for the DSVDD and DMSVDD methods are generated directly from the multi-sourced geographic data.

Experiment 1: Comparison between OODMSVDD and OODSVDD
As shown in Table 2, the OODMSVDD method achieved the best overall results for impervious surface extraction when the threshold value was set to 30 (the map of impervious surface extraction results is shown in Figure 9). The method obtained the highest OA value of 87.43%, while the Precision and Recall values were 84.68% and 91.42%, respectively. Typically, it is challenging to achieve high performance in both the Precision and Recall of a model, so the F1-score, the harmonic mean of these two metrics, is usually used to evaluate the balance between them. The OODMSVDD method achieved the maximum F1-score and AUC values in the experiment, at 87.91% and 92.40%, respectively, when a threshold of 30 was used. This once again demonstrates the excellent performance of the OODMSVDD method for impervious surface extraction when a threshold of 30 is used. The horizontal comparison of the performance of OODMSVDD and OODSVDD highlights the importance of the number of hyperspheres for the model. A typical demonstration area was selected to show the classification results and make the model performance at different thresholds easier to interpret, as shown in Figure 10. The experiments show that the Precision and OA of OODMSVDD exceed those of the corresponding OODSVDD method at filtering thresholds of 10, 20, 30, and 40. This indicates that the multi-sphere method can improve the Precision and OA of the model in most cases. However, the results differ from the above conclusions when the threshold value is 50. By analyzing the distribution of training samples at different threshold values, it is found that, when the threshold value is 50, the samples are mainly concentrated in building types and road types are covered less. This can affect the accuracy of the model because few samples cover roads.
Moreover, most of the building roofs have more consistent surface materials, resulting in a more homogeneous sample data type that is more suitable for single-sphere models. Nevertheless, even in this case, the OA of the multi-sphere model is only 0.62% lower than that of the single-sphere model. Therefore, the OODMSVDD method is more effective for the extraction of impervious surfaces, especially when the impervious surfaces include many different types.

Experiment 2: Comparison between OODMSVDD and DSVDD/DMSVDD
As shown in Table 3, the optimal OA performance of the DSVDD method is 86.95%. Meanwhile, the best OA result for the DMSVDD method is 86.96%. The OA results based on DSVDD and DMSVDD are 0.48% and 0.47% below the optimal OA of 87.43% for the OODMSVDD method, respectively. This indicates that OODMSVDD has better impervious surface extraction performance than DSVDD and DMSVDD, given suitable sample selection. In particular, the best OA results of the multi-sphere methods, including DMSVDD and OODMSVDD, are reached at the sample filtering threshold of 30, while the DSVDD method achieved the highest OA at the threshold of 20. This result indicates that the selection of the sample filtering threshold is crucial for impervious surface extraction, and that the optimal threshold differs between the multi-sphere SVDD methods and the DSVDD method.
In terms of multi-sphere methods, the results based on the DMSVDD method and the proposed OODMSVDD method are analyzed. The worst classification performance of the DMSVDD method is obtained at a threshold value of 40, and the worst classification OA is 83.68%, which is 0.53% lower than the worst OA of the OODMSVDD method of 84.21%. We observed that the differences between the maximum and minimum results of the five metrics in the OODMSVDD method are smaller than those of the DMSVDD method. This indicates that the object-oriented analysis method can improve the robustness of the multi-sphere algorithms. Furthermore, the corresponding Recall of the OODMSVDD method is higher than that of the DMSVDD method when the threshold values are 10, 30, and 50. This shows that the object-oriented analysis method can improve the Recall of the model in most cases.

Experiment 3: Evaluation of OODMSVDD Based on Different Data Sources
Table 4 presents the classification performance based on different data sources. It reveals that combining multiple sources of data to automatically generate training samples yields the best results for impervious surface extraction. When only one type of data is used, the POI-data-based impervious surface classification results have the highest Precision, OA, F1-Score, and AUC scores. However, the Recall is the lowest in this case, differing from the highest by 7.55%. On the other hand, the extraction results based on building data rank low in all metrics except Recall. This is because the roofs of buildings are covered with various objects, including impervious surfaces, such as concrete and asphalt, in addition to permeable surface constituents, such as natural moss and artificially planted greenery on some old residential buildings. This leads to the inclusion of permeable surface features in the training samples, which causes the model to learn the features of impervious surfaces inaccurately and ultimately results in the low accuracy and high recall of the model.
Compared to the road dataset, the model trained based on the vehicle trajectory GPS points shows better performance for all metrics. Meanwhile, the model trained with the road data has poorer overall performance for the impervious surface extraction. The Precision and Recall of the models based on the trajectory GPS data samples are relatively balanced, achieving higher scores for both F1-Score and AUC compared to the road data. This is because the trajectory GPS data cover a wider range of roads than the OSM road data. The trajectory data are collected by vehicle GPS, which can provide more comprehensive, realistic, and timely road data. Therefore, although both data sources can be used to generate road samples, the model based on trajectory GPS data achieves better results in impervious surface extraction than the model based on road data from OSM. When two or three data types are used to generate samples to train the model, the impervious surface extraction results are comprehensively better than using a single data source. When POI is overlaid with any kind of data, the model achieves better impervious surface extraction performance. The reason for this may be that POI data cover more types of impervious surface samples than other data and therefore can improve the model results for other data. When the four data types mentioned above are combined to generate samples, the impervious surface extraction achieves the best overall performance with an OA of 87.43%, and the Recall and Precision are well balanced. This demonstrates that each type of dataset can provide different kinds of information on impervious surfaces and that they complement each other to improve the results. In addition, samples generated from multiple data sources are more suitable for the multi-sphere algorithm.

Discussion
To investigate the performance of the proposed method for impervious surface extraction with different training sample sizes, we vary the sample size between 10,000 and 40,000, with an interval of 10,000. The top-ranked result of each threshold has been bolded. The results in Table 5 demonstrate that the model consistently achieves the best classification performance when the threshold value is set to 30, regardless of the sample size variation. Conversely, when the threshold is set to 10 or 50, the classification performance of the proposed method suffers. This is attributed to the presence of noisy data in the impervious surface samples generated from the multi-sourced open geographic data. A low threshold fails to effectively filter out irrelevant training samples, which in turn reduces the accuracy of impervious surface extraction. On the other hand, a high threshold filters out training samples with high impervious surface probability. This can result in a reduction in the diversity of impervious surface samples and consequently affect the model results. Hence, it is essential to determine an appropriate threshold for filtering the training samples.
The overall performance of the model is stable for a fixed threshold, with similar overall accuracies (OA) observed at sample sizes of 20,000, 30,000, and 40,000. The best classification accuracy is usually achieved with sample sizes of 20,000 or 30,000, while the worst is achieved with a sample size of 10,000. The metric scores of the model show that, with a small sample size, the lack of information leads to low precision. However, when the sample size is large enough and contains enough information, the accuracy of the proposed model does not change significantly with the increase in the number of samples. This feature facilitates the extension of the proposed method for impervious surface extraction in areas with few training samples generated from multiple sources of data.

Conclusions
In the context of rapid urbanization, it is crucial to develop an effective impervious surface extraction method for monitoring changes in urban ground cover materials. The increase in impervious surfaces poses significant challenges to the health of cities. In this paper, we propose an object-oriented deep multi-sphere support vector data description (OODMSVDD) method that considers multiple types of impervious surfaces. Our approach uses multiple sources of data, such as vehicle trajectories and OSM, to automatically generate single-class impervious surface samples. This greatly reduces the burden of the manual labeling of training samples. We employ object-oriented analysis techniques and extend the number of support vector data description (SVDD) spheres to improve the accuracy and robustness of the impervious surface extraction model. Our improved model is better suited to extract impervious surfaces consisting of diverse geographic objects.
The use of multi-source data to address the challenge of acquiring labeled samples in impervious surface extraction has yielded promising results. The results show that it is essential to filter the training samples using a suitable method, such as the threshold method employed in this study. Furthermore, selecting a moderate number of training samples can strike a balance between sample size and classification quality. Even in situations where data sources are insufficient, the proposed method can still achieve relatively good results for impervious surface extraction. However, there is a need for further research on the impact of data uncertainty, such as position drift in trajectory data, POI localization errors, and other issues. Additionally, exploring a precise method to filter the automatically generated training samples and reduce the impact of data uncertainty on the results will be a focus of future research.