Object-Oriented Remote Sensing Image Change Detection Based on Color Co-Occurrence Matrix

detection for remote sensing images. Abstract: Aiming at the problem of misdetection caused by the traditional texture characteristic extraction model, which does not describe the correlation among multiple bands, an object-oriented remote sensing image change detection method based on a color co-occurrence matrix is proposed. First, the image is divided into multi-scale objects by graph-based superpixel segmentation, and the optimal scale is determined by the overall goodness F-measure (OGF). Then, except for the extraction of the spectral features, the multi-channel texture features based on the color co-occurrence matrix (CCM) are extracted to consider the correlation among multiple bands. To accurately ﬁnd the representative features to overcome the impact of feature redundancy, a cumulative backward search strategy (CBSS) is further designed. Finally, the change detection is completed by inputting the difference image of dual time points to the trained random forest model. Taking Shenzhen and Dapeng as the study areas, with Sentinel-2 and Skysat images under different spatial resolutions, and the forest–bareland change type as an example, the effectiveness of the proposed algorithm is veriﬁed by qualitative and quantitative analyses. They show that the proposed algorithm can obtain higher detection accuracy than the texture features without band correlation.


Introduction
With the development of society and the economy, the depth and breadth of human utilization of land resources have been greatly improved, leading to a significant increase in the frequency of land changes [1][2][3]. Accurately acquiring information about land changes is of great significance for understanding the relationship between humans and nature and for promoting sustainable development of the social economy and ecological environment, such as deforestation, forest fires, and landslide disaster monitoring [4,5]. Remote sensing technology can achieve large area observations in a short time and has become one of the most important means to extract the change information of land [6][7][8]. However, with the development of spatial resolution of remote sensing sensors, the geometric and texture information is prominent. Although it provides rich information for change detection, it still faces new challenges due to the serious phenomenon of "same object, different spectrum" and "foreign object, same spectrum" [9][10][11].
The change detection methods include two main categories: detection after classification and detection before classification [12]. In the first one, the accuracy of detection depends on the classification result, and errors can accumulate and propagate. In addition, the computing efficiency of change detection will be affected by the classification operation when the image size is too large. The other one detects the changing area according to its characteristics, which can enable the detection of the targeted areas and improve efficiency. It includes the difference method, ratio method, change vector analysis (CVA), and model-based methods [13][14][15]. The difference and ratio methods are usually used for the single-band images, and CVA can be regarded as a multi-band extension of the difference method. However, they have some common problems; for example, the threshold is difficult to determine, and using only the threshold as the judgment criterion will inevitably lead to error detection. Model-based methods train the detection model according to the samples, which can continually optimize the model parameters and obtain the optimal change detection result. Thus, model-based methods have become the widely used methods, wherein the most popular models are deep learning and random forest. For example, Lv et al. [16] proposed a spatial-spectral attention network to inhibit the pseudo-change noise, preserve the shape of the changed areas, and improve the change detection accuracy. However, deep learning often requires a large number of samples to train the model well, and it has poor transferability. In the case of small samples in specific areas, the random forest is also considered to be a good tool [17].
From another perspective, change detection methods can be divided into two types according to the basic units of operation: pixel-based and object-oriented. Traditional pixel-based algorithms take a single pixel as the basic unit, which can easily cause a "salt and pepper" phenomenon in the detection results due to the absence of context information [18,19]. Object-oriented algorithms take a homogeneous region as the basic unit. The homogeneous region is composed of a group of connected pixels with the same characteristics, and its shape can be irregular. This special characteristic makes it possible to describe spatial relevance information, which can improve the "salt and pepper" phenomenon [20,21]. Ma et al. [22] demonstrated that object-oriented methods are more accurate than pixel-based methods by testing four common unsupervised change detection methods on the basis of the spectral information. Lv et al. [23] proposed the object-oriented sorted-histogram comparison methods, which achieved satisfactory results. The characteristics that are extracted based on objects possess a higher-level description of the image [24]. False detection and missing detection can easily occur when relying only on spectral features for change detection. It is an inevitable trend to introduce deep features, such as texture characteristics [25].
There are two classical texture feature extraction methods: local binary pattern (LBP) [26] and gray level co-occurrence matrix (GLCM) [27]. LBP describes the texture by using the relationship between the central pixel and the adjacent pixel in the local neighborhood. For example, Lei et al. [28] combined LBP and speeded-up robust features (SURF) based on the fixed-size objects to complete the change detection of Google Earth images. The traditional LBP is too simple and has poor noise resistance. With this development, a series of improved LBP-based texture extraction algorithms are proposed, such as binary rotation-invariant and noise-tolerant (BRINT) [29], Locally Encoded TRansform feature hISTogram (LETRIST) [30], local directional zigzag pattern (LDZP) [31], and overlapped multi-oriented tri-scale local binary pattern (OMTLBP) [32]. They combine the mean method, scale method, and histogram method to increase the noise resistance, breadth, and wholeness of the feature description. LBP can obtain the local texture information, but it is difficult to capture the overall information of an image [33]. GLCM describes the texture through the joint conditional probability among image gray levels. It can not only take into account the overall situation of the image but also describe the texture information in multiple directions by extracting different statistics. For example, Li et al. [34] proposed a change-detection method based on spectral and GLCM texture features, which can significantly increase the robustness of noise and spectral similarity. However, the traditional texture feature extraction methods can extract information only from single-band images. For multispectral remote sensing images with multiple bands, the bands must be treated as independent, which will lose some information hidden among the bands that could be used to distinguish changing areas. Thus, the description of texture features among bands is also important and cannot be ignored.
To solve the above problems, the object-oriented remote sensing image change detection method based on a color co-occurrence matrix is proposed in this paper. The color co-occurrence matrix (CCM) [35] is an extension of the GLCM, which can describe the texture features not only within the band but also among the bands. First, the band combination of the remote sensing images of two periods is implemented to ensure the spatial consistency of the objects, where the multi-scale objects are generated on the basis of the graph-based superpixel segmentation method and the fractal net evolution approach (FNEA). Second, to obtain the optimal scale to establish the optimal objects, the overall goodness F-measure (OGF) index was used to evaluate the quality of the superpixels. Then, after extracting the spectral features of the object, the texture statistical variables, including band correlation, are extracted on the basis of the CCM. To accurately find representative features to overcome the Hughes phenomenon, where too many features lead to information redundancy and affect the detection accuracy, a cumulative backward search strategy (CBSS) is further designed on the basis of the ranking of the importance of the features. The main idea of the CBSS is to remove the features with the lowest contribution in turn until the number of features reaches the minimum set value. The sorting standard is the cumulative value of the importance of features obtained through multiple rounds of training. The combination with the smallest feature number and the highest accuracy is regarded as the best feature combination. Finally, change detection results are obtained by putting the differential image of selected features into the random forest model. The main contributions of this paper are summarized as follows: (1) The proposed approach abandons the traditional method of describing the texture features in multi-channel images, which involves separately extracting texture features from single-band gray images and then concatenating them to describe multi-channel texture features. The CCM is innovatively used to establish inter-channel correlation and directly extract color texture features from multi-channel images for change detection. (2) The cumulative backward search strategy (CBSS) is proposed to eliminate the impact of instability in feature sorting under the one-time training on feature selection, which is beneficial for finding more representative and effective feature subsets.
In this study, Shenzhen and Dapeng are selected as the research areas, and Sentinel-2 and Skysat images constitute the research data. Considering that forest protection and bareland supervision are the focus of attention in Shenzhen, the forest-bareland change type is selected as an example to verify the effectiveness of the proposed method. The remainder of this manuscript proceeds as follows: Section 2 introduces each aspect of the proposed algorithm, Section 3 provides the experimental results and discussion, and Section 4 presents the conclusion and prospect.

Methods
In this paper, on the basis of object orientation, change detection is carried out under the framework of a random forest based on the color co-occurrence matrix model. The overall process is shown in Figure 1. It includes four main steps: (a) Optimal object construction based on multi-scale segmentation: In order to ensure the spatial consistency of the two periods of remote sensing images, band combination is carried out. On the basis of over-segmentation based on graph-based superpixel segmentation, the FNEA algorithm is utilized to establish multi-scale objects, and then the optimal segmentation scale is determined according to the maximum criterion of OGF to obtain the optimal object. (b) Multi-feature extraction based on object: There are two types of features: spectral features and texture features. Spectral features include the mean and variance of bands, and texture features are extracted by CCM on the basis of a pairwise combination of bands, including 8 directions and 4 statistical variables, i.e., angular second moment (ASM), contrast (Con), correlation (Cor), and inverse different moment (IDM). (c) Feature filtering based on CBSS: On the basis of the random forest model, the contribution degree of characteristics of one-time training is estimated. The cumulative value of the contribution degrees obtained from multiple rounds of training is considered as a standard to delete the least important feature in turn until the given minimum number of features is reached. Then, the feature combination with the highest accuracy and the smallest number of features is selected as the optimal feature combination. (d) Object-oriented change detection based on the random forest: The differential image based on the optimal feature vector set after screening is incorporated into the random forest model for change detection, and the detection results are evaluated qualitatively and quantitatively. (a) Optimal object construction based on multi-scale segmentation: In order to ensure the spatial consistency of the two periods of remote sensing images, band combination is carried out. On the basis of over-segmentation based on graph-based superpixel segmentation, the FNEA algorithm is utilized to establish multi-scale objects, and then the optimal segmentation scale is determined according to the maximum criterion of OGF to obtain the optimal object. (b) Multi-feature extraction based on object: There are two types of features: spectral features and texture features. Spectral features include the mean and variance of bands, and texture features are extracted by CCM on the basis of a pairwise combination of bands, including 8 directions and 4 statistical variables, i.e., angular second moment (ASM), contrast (Con), correlation (Cor), and inverse different moment (IDM). (c) Feature filtering based on CBSS: On the basis of the random forest model, the contribution degree of characteristics of one-time training is estimated. The cumulative value of the contribution degrees obtained from multiple rounds of training is considered as a standard to delete the least important feature in turn until the given min-

Optimal Object Construction Based on Multi-Scale Segmentation
Due to the diversity of the surface object types and the complexity of the spatial distribution in remote sensing images, different regions also require different segmentation scales [36]. In order to obtain the optimal segmentation scale, the remote sensing image is first over-segmented into a series of smaller superpixels on the basis of the graph theory segmentation method, and then the adjacent superpixel blocks are merged on the basis of the FNEA algorithm to obtain the different scales of superpixel segmentation results, and the optimal segmentation scale is determined by OGF index to obtain the best object.
The main idea of superpixel segmentation based on graph theory is to map the image into a weighted undirected graph G = (V, E), in which the set of vertexes V is composed of pixels of the image, and set of edges E corresponds to the adjacent relationship between pixels. Each edge has a corresponding weight w, which is the dissimilarity between neighboring elements. The pixels are merged according to the criteria, and then the superpixels are formed [37]. The merger guidelines are as follows: where Diff (C 1 , C 2 ) is the difference between regions C 1 and C 2 , which is defined as the minimum weight edge connecting the two components. Int(C) is the internal difference of the regions C, and it is defined as the largest weight in the minimum spanning tree of the component. τ(C) = B/|C| is the threshold function to regulate the extreme case where both regions contain only one pixel; B is a constant parameter, and |C| is the size of component C. FNEA is one of the most widely used methods of multi-scale segmentation, which has been commercially applied in eCognition software. Its principle is to form different scales by merging connected superpixels [38,39]. The merge cost function is: where h color and h shape are the costs of spectral heterogeneity and shape heterogeneity after the merger, respectively, and w color is the weight of spectral heterogeneity. Spectral heterogeneity h color is defined as: where obj 3 is the superpixel after merging obj 1 and obj 2 , c is the band index, w c is the band weight, σ is the standard deviation, and n is the number of pixels. Shape heterogeneity h shape is defined as: where w compt is the compactness weight. Assuming that the circumference of the superpixel is represented by l, the compactness h compt and edge smoothness h smooth are: where b is the perimeter of the object's bounding box.
In order to determine the optimal segmentation scale among many scales, the segmentation results are evaluated on the basis of the OGF maximization criteria: where MI norm and LV norm are the results after the normalization of the weighted variance LV of the segmentation object and Global Moran's I index MI (for detailed information, please see reference [40]). α controls its relative weight, where α > 1 indicates higher weights of LV norm , and finer results are preferred, and α < 1 indicates higher weights of MI norm .

Spectral Features
The spectrum value is the most direct and basic information of the image; it characterizes the intensity of the target object to electromagnetic wave reflection. In this study, the mean and variance of all pixel spectral features in the optimal segmentation object are selected as the object spectral feature vector.
Given a remote sensing image I = {I i : i = 1, · · · , n}, where i is the pixel index and n is the number of pixels, we have I i = {I ic : c = 1, . . . , h}, where c is the band index, and h is the number of bands. Then, the mean and variance characteristics of object j in c band can be expressed as: where Ω j represents the total number of pixels in the object j, and i ∈ Ω j represents the pixels belonging to object j.

Texture Features Based on Color Co-Occurrence Matrix
The remote sensing image is divided into N gray levels, that is In particular, when c = s and d = 1, the color co-occurrence matrix degenerates into the traditional gray level co-occurrence matrix. According to the number of bands, C 2 h + h co-occurrence matrices can be established. If h = 3, i.e., R, G, B, the six co-occurrence matrices will be CCM RR , CCM GG , CCM BB , CCM RG , CCM RB , and CCM GB .
Taking the 3 × 3 local region as an example, we assume that there are 3 bands and 8 gray levels in the local region, as shown in Figure 2a. The number of pixel pairs of the color co-occurrence matrix about bands R and G with d = 1 and θ = 0 is shown in Figure 2b. In the local region, there are two pairs of pixels that meet the requirements of I ic * = 1, I i c * = 1, and |i − i | = 1, ∠ii = 0, as shown by the red lines. In fact, the local region can have an irregular shape because the real boundary is constrained by objects, which ensures the homogeneity of the texture feature description.
Based on the CCM, the four statistical variables of the texture features are extracted [27], as shown in Table 1, including angular second moment (ASM), contrast (Con), correlation (Cor), and inverse different moment (IDM). The ASM represents the evenness of the grayscale distribution of the image, Con represents the sharpness, Cor represents the consistency of image texture, and IDM represents the homogeneity of image texture. Based on the CCM, the four statistical variables of the texture features are extracted [27], as shown in Table 1, including angular second moment (ASM), contrast (Con), correlation (Cor), and inverse different moment (IDM). The ASM represents the evenness of the grayscale distribution of the image, Con represents the sharpness, Cor represents the consistency of image texture, and IDM represents the homogeneity of image texture. , is the probability value of the occurrence of the element at the position (e, o) in the matrix, and and are the mean and variance of the corresponding gray level, respectively. The mean values of the statistical variables in different directions compose the color texture feature set of the object. The evenness of the grayscale distribution of the image

Feature Filtering Based on the Cumulative backward Search Strategy
The random forest (RF) algorithm is an integrated algorithm that takes the decision tree model as the basic classification model. The final result is obtained by establishing multiple decision tree classifiers and voting, or averaging, their predicted values. Compared with a single decision tree, the RF has higher accuracy and stability by using random sampling of samples and features. The RF can solve classification and regression problems and can estimate the importance of features at the same time [41,42].
In order to effectively utilize the characteristic information of changed and unchanged samples and to ensure that the selected features are stable and representative, a cumulative backward search strategy is proposed based on the random forest framework. In the proposed strategy, the traditional way of determining feature ranking by a single search is abandoned [43], and the cumulative value of the feature contributions by multiple searches is regarded as the foundation for sorting the features. This approach can en-

Feature Filtering Based on the Cumulative backward Search Strategy
The random forest (RF) algorithm is an integrated algorithm that takes the decision tree model as the basic classification model. The final result is obtained by establishing multiple decision tree classifiers and voting, or averaging, their predicted values. Compared with a single decision tree, the RF has higher accuracy and stability by using random sampling of samples and features. The RF can solve classification and regression problems and can estimate the importance of features at the same time [41,42].
In order to effectively utilize the characteristic information of changed and unchanged samples and to ensure that the selected features are stable and representative, a cumulative backward search strategy is proposed based on the random forest framework. In the proposed strategy, the traditional way of determining feature ranking by a single search is abandoned [43], and the cumulative value of the feature contributions by multiple searches is regarded as the foundation for sorting the features. This approach can enhance the accuracy of the feature sorting and avoid the impact of randomness and volatility of the model on the sorting results. The details are elucidated as follows, and the flow chart is shown in Figure 3.
First, after determining the changed and unchanged object samples, 70% of the samples are selected as training samples to train the model. The test set, including all the samples and many additional superpixels, is used to evaluate the model training accuracy. The difference images of the spectral and color texture feature sets are input into the random forest, and the importance of each feature of a single training round is estimated. In order to ensure the stability of filtering features, the model is trained many times, and the features are re-sorted according to the cumulative value of the feature importance of each training. On the basis of the spectral features, the least important color texture features are eliminated successively, and a new feature combination is formed and re-input into the random forest for training. The training is stopped once the number of features reaches the minimum given value. Finally, according to the fluctuation of the test accuracy under the condition of multiple feature combinations, the feature combination with fewer features and higher test accuracy is selected as the optimal feature combination.
ples and many additional superpixels, is used to evaluate the model training accuracy. The difference images of the spectral and color texture feature sets are input into the random forest, and the importance of each feature of a single training round is estimated. In order to ensure the stability of filtering features, the model is trained many times, and the features are re-sorted according to the cumulative value of the feature importance of each training. On the basis of the spectral features, the least important color texture features are eliminated successively, and a new feature combination is formed and re-input into the random forest for training. The training is stopped once the number of features reaches the minimum given value. Finally, according to the fluctuation of the test accuracy under the condition of multiple feature combinations, the feature combination with fewer features and higher test accuracy is selected as the optimal feature combination.

Change Detection Based on Random Forest
After the optimal feature combination is obtained, the feature vector of pixels in the image is cleaned, and only the vector information of the optimal feature combination is retained. In this way, the random forest classification model is reconstructed. The final binary classification result is the object-oriented change detection result of the remote sensing images.

Study Area and Experimental Data
Shenzhen is located in the southern part of Guangdong Province, between 113°43′ to 114°38′ east longitude and 22°24′ to 22°52′ north latitude, adjacent to Hong Kong, as shown

Change Detection Based on Random Forest
After the optimal feature combination is obtained, the feature vector of pixels in the image is cleaned, and only the vector information of the optimal feature combination is retained. In this way, the random forest classification model is reconstructed. The final binary classification result is the object-oriented change detection result of the remote sensing images.

Study Area and Experimental Data
Shenzhen is located in the southern part of Guangdong Province, between 113 • 43 to 114 • 38 east longitude and 22 • 24 to 22 • 52 north latitude, adjacent to Hong Kong, as shown in Figure 4a. It belongs to the subtropical monsoon climate, and the annual rainfall is 1933.3 mm. Shenzhen is the national economic center, scientific and technological innovation center, regional financial center, and trade and logistics center. Dapeng is located in the southeast of Shenzhen, and it is an important node in the Guangdong Hong Kong Macao Greater Bay Area (GBA). In the past 40 years of urbanization, Shenzhen has made remarkable achievements in economic and social aspects. The urbanization rate of Shenzhen has reached 100%; the land possesses high spatial heterogeneity and changes frequently. At the same time, the relevant departments attach great importance to the ecological environment construction in Shenzhen. The forest protection and bareland management have gradually developed into the center of urban supervision. In view of this, we take Shenzhen and Dapeng as the research areas and the forest-bareland change as an example to verify the effectiveness of the proposed algorithm. The locations of the research areas are shown in Figure 4. zhen has reached 100%; the land possesses high spatial heterogeneity and changes frequently. At the same time, the relevant departments attach great importance to the ecological environment construction in Shenzhen. The forest protection and bareland management have gradually developed into the center of urban supervision. In view of this, we take Shenzhen and Dapeng as the research areas and the forest-bareland change as an example to verify the effectiveness of the proposed algorithm. The locations of the research areas are shown in Figure 4. Considering that Shenzhen is often cloudy and rainy and the land is fragmented, the Sentinel-2 images of Shenzhen with 10 m resolution in January of winter 2020 and 2021 are selected as the Experimental Data 1. In order to ensure the full coverage of the image, the target image is synthesized by finding the useful image set under a three-month time window centered on January, with a cloud content less than 5%, as shown in Figure  5a1  Considering that Shenzhen is often cloudy and rainy and the land is fragmented, the Sentinel-2 images of Shenzhen with 10 m resolution in January of winter 2020 and 2021 are selected as the Experimental Data 1. In order to ensure the full coverage of the image, the target image is synthesized by finding the useful image set under a three-month time window centered on January, with a cloud content less than 5%, as shown in Figure 5a1

Determination of the Optimal Segmentation Scale
In remote sensing images, the scale has a very important significance, as different scales reflect different details of ground objects. Selecting the appropriate scale for certain research applications has been one of the hot issues in the field of remote sensing research. In order to determine the optimal scale, the OGF is used as the evaluation criterion. Figure 6 shows the changes in the OGF curve with the increasing scale, where Figure 6a,b represent Sentinel-2 and Skysat images, respectively. It can be seen that the OGF shows a trend of increasing first and then decreasing with the increase of the segmentation scale. For Sentinel-2, the maximum OGF = 0.71 corresponds to a scale of 41. For Skysat, the maximum OGF = 0.67 corresponds to a scale of 268. It also shows that the optimal scale will increase as the resolution increases.
In order to display the results of the optimal segmentation scale more intuitively, the representative regions are intercepted from the study area and are shown in Figures 7 and 8, where Figure 7a1-a3,b1-b3,c1-c3 show the segmentation results of different representative regions of Sentinel-2 images when the segmentation scale is 21, 41, and 61, respectively. Figure 8a1-a3,b1-b3,c1-c3 show different representative regions of Skysat images when the segmentation scale is approximately 100, 300, and 1500, respectively. It can be seen that the smaller scale will lead to over-segmentation, requiring unnecessary calculations and damaging the integrity of the surface, as shown in Figures 7 and 8a1-c1. The larger scale will lead to under-segmentation, and the bareland cannot be segmented from the surrounding vegetation and buildings, as shown in Figures 7 and 8a3-c3. In Figures 7 and 8a2-c2, the segmentation results are more closely in accord with human understanding of the earth's surface. Whether qualitative or quantitative, this can prove the effectiveness of the OGF index in the selection of the best scale. Thus, 41 and 268 can be regarded as the optimal scales to obtain the optimal objects in the subsequent experiments. Appl

Determination of the Optimal Segmentation Scale
In remote sensing images, the scale has a very important significance, as different scales reflect different details of ground objects. Selecting the appropriate scale for certain research applications has been one of the hot issues in the field of remote sensing research. In order to determine the optimal scale, the OGF is used as the evaluation criterion. Figure  6 shows the changes in the OGF curve with the increasing scale, where Figure 6a,b represent Sentinel-2 and Skysat images, respectively. It can be seen that the OGF shows a trend of increasing first and then decreasing with the increase of the segmentation scale. For Sentinel-2, the maximum OGF = 0.71 corresponds to a scale of 41. For Skysat, the maximum OGF = 0.67 corresponds to a scale of 268. It also shows that the optimal scale will increase as the resolution increases.

Determination of the Optimal Segmentation Scale
In remote sensing images, the scale has a very important significance, as different scales reflect different details of ground objects. Selecting the appropriate scale for certain research applications has been one of the hot issues in the field of remote sensing research. In order to determine the optimal scale, the OGF is used as the evaluation criterion. Figure  6 shows the changes in the OGF curve with the increasing scale, where Figure 6a,b represent Sentinel-2 and Skysat images, respectively. It can be seen that the OGF shows a trend of increasing first and then decreasing with the increase of the segmentation scale. For Sentinel-2, the maximum OGF = 0.71 corresponds to a scale of 41. For Skysat, the maximum OGF = 0.67 corresponds to a scale of 268. It also shows that the optimal scale will increase as the resolution increases.  tions and damaging the integrity of the surface, as shown in Figures 7 and 8a1-c1. The larger scale will lead to under-segmentation, and the bareland cannot be segmented from the surrounding vegetation and buildings, as shown in Figures 7 and 8a3-c3. In Figures 7  and 8a2-c2, the segmentation results are more closely in accord with human understanding of the earth's surface. Whether qualitative or quantitative, this can prove the effectiveness of the OGF index in the selection of the best scale. Thus, 41 and 268 can be regarded as the optimal scales to obtain the optimal objects in the subsequent experiments. representative regions are intercepted from the study area and are shown in Figures 7 and  8, where Figure 7a1-a3,b1-b3,c1-c3 show the segmentation results of different representative regions of Sentinel-2 images when the segmentation scale is 21, 41, and 61, respectively. Figure 8a1-a3,b1-b3,c1-c3 show different representative regions of Skysat images when the segmentation scale is approximately 100, 300, and 1500, respectively. It can be seen that the smaller scale will lead to over-segmentation, requiring unnecessary calculations and damaging the integrity of the surface, as shown in Figures 7 and 8a1-c1. The larger scale will lead to under-segmentation, and the bareland cannot be segmented from the surrounding vegetation and buildings, as shown in Figures 7 and 8a3-c3. In Figures 7  and 8a2-c2, the segmentation results are more closely in accord with human understanding of the earth's surface. Whether qualitative or quantitative, this can prove the effectiveness of the OGF index in the selection of the best scale. Thus, 41 and 268 can be regarded as the optimal scales to obtain the optimal objects in the subsequent experiments.

Optimal Feature Combination Filtering
The features reflect the ground object information of the remote sensing images. Generally speaking, a large number of features can provide rich information, but too many features not only bring redundancy but also affect the accuracy of detection. In this study, all features are filtered through a cumulative backward search strategy, and the im-

Optimal Feature Combination Filtering
The features reflect the ground object information of the remote sensing images. Generally speaking, a large number of features can provide rich information, but too many features not only bring redundancy but also affect the accuracy of detection. In this study, all features are filtered through a cumulative backward search strategy, and the importance of the features is calculated through training for all the features. The ranking results of the feature importance in a single training will be slightly different. In this study, after repeating the training many times, the cumulative importance value of each feature is taken for the feature ranking. The experiments show that the cumulative ranking result is almost consistent with the statistical sorting result, which sorts features on the basis of the highest number of occurrences at the location.

Object-Oriented Change Detection
To verify the effectiveness of the proposed method, six comparative experiments are designed, as shown in Table 2. In order to ensure the consistency of the experimental environment, all experiments are trained and evaluated on the basis of the random forest model.
(a) Sentinel-2 data For the Sentinel-2 images, there are 280 samples, where the number of positive samples is 90. According to the machine learning rules, 70% of the total samples randomly selected are used for training. The test set includes all samples and 50 additional superpixels. If the difference in the number of positive and negative samples in the training data set is too large, the training samples are redetermined until they meet the requirements. Figure 9 illustrates the change detection results of Sentinel-2, where Figure 9a-g show Comparisons 1-6 and the proposed algorithm, and Figure 9h shows the standard change detection result determined through manual visual interpretation. The white area represents unchanged superpixels, the green area represents the changing area corresponding to the standard, the red area represents false detection, and the blue area represents missing detection. It can be seen that there are many missing and false detections when only the spectral characteristics are considered, as shown in Figure 9a. After adding the texture features extracted by the GLCM and the LBP family, the errors are significantly reduced, as shown in Figure 9b-f. This effectively verifies the benefit of the texture characteristics in remote sensing change detection. The proposed algorithm extracts the texture characteristics on the basis of the CCM by considering the correlation among bands, and it filters the features by the CBSS, which greatly reduces the rate of missing and false detection. There are almost no red or blue regions, as shown in Figure 9g. pixels. If the difference in the number of positive and negative samples in the training data set is too large, the training samples are redetermined until they meet the requirements. Figure 9 illustrates the change detection results of Sentinel-2, where Figure 9a-g show Comparisons 1-6 and the proposed algorithm, and Figure 9h shows the standard change detection result determined through manual visual interpretation. The white area represents unchanged superpixels, the green area represents the changing area corresponding to the standard, the red area represents false detection, and the blue area represents missing detection. It can be seen that there are many missing and false detections when only the spectral characteristics are considered, as shown in Figure 9a. After adding the texture features extracted by the GLCM and the LBP family, the errors are significantly reduced, as shown in Figure 9b-f. This effectively verifies the benefit of the texture characteristics in remote sensing change detection. The proposed algorithm extracts the texture characteristics on the basis of the CCM by considering the correlation among bands, and it filters the features by the CBSS, which greatly reduces the rate of missing and false detection. There are almost no red or blue regions, as shown in Figure 9g.  Figure 10 illustrates the enlarged images of the change detection results in the different areas that were indicated in Figure 9. Figure 10a1-a8 show the spectral-based, GLCM, LETRIST, LDZP, OMTLBP, and GLCM-SBS methods as well as the proposed algorithm and the standard of Area 1; Figure 10b1-b8 shows the corresponding images of Area 2. Through an analysis of a large number of areas, it is found that the spectral-based method can detect only the changing area that completed the change, and the spectrum is abso-  Figure 10 illustrates the enlarged images of the change detection results in the different areas that were indicated in Figure 9. Figure 10a1-a8 show the spectral-based, GLCM, LETRIST, LDZP, OMTLBP, and GLCM-SBS methods as well as the proposed algorithm and the standard of Area 1; Figure 10b1-b8 shows the corresponding images of Area 2. Through an analysis of a large number of areas, it is found that the spectral-based method can detect only the changing area that completed the change, and the spectrum is absolutely homogeneous. Among the four texture feature extraction algorithms, the traditional GLCM is inferior to the improved LBP family, and the LBP family has some missing detection on the absolute change area. The LETRIST, LDZP, and OMTLBP methods have similar detection results regarding anti-interference, but the proposed method is the most similar to the standard results, as shown in Figure 10a7,a8,b7,b8.  In order to verify quantitatively the effectiveness of the proposed method, the missed detection rate, false detection rate, overall accuracy rate, and Kappa coefficient are calculated, the results of which are listed in Table 3; this is accomplished by counting the number of correctly detected superpixel blocks. It can be seen that the accuracy of Comparison 1 is the lowest-the missed detection rate is 48.15%, the error detection rate is 39.13%, and In order to verify quantitatively the effectiveness of the proposed method, the missed detection rate, false detection rate, overall accuracy rate, and Kappa coefficient are calculated, the results of which are listed in Table 3; this is accomplished by counting the number of correctly detected superpixel blocks. It can be seen that the accuracy of Comparison 1 is the lowest-the missed detection rate is 48.15%, the error detection rate is 39.13%, and the overall accuracy is only 60%. The addition of textures achieves an accuracy of greater than 90%. Among them, the error of GLCM is larger, and its false detection rate is 9.09%. The LETRIST method tends to false detection, as the false detection rate is much larger than the missing detection rate, and the results of the LDZP are just the opposite. The OMTLBP effectively improves the above issues by introducing different scales and unifies the missing and false detection rates-the OA is 97.83% and the Kappa coefficient is 0.95. The GLCM-SBS method performs better than the GLCM, but worse than the OMTLBP. The proposed method achieves the highest values due to the help of considering channel correlation-its OA is 98.15%, and the Kappa coefficient is 0.96.
(b) Skysat data For the Skysat images, there are 210 samples, where the number of positive samples is 90. As in the case of the Sentinal-2 data, 70% of the total samples randomly selected are used for training. The test set includes all the samples and 50 additional superpixels. Figure 11 is the change detection results of Skysat, where Figure 11a-g illustrate Comparisons 1-6 and the proposed algorithm, and Figure 11h shows the standard change detection result, which is determined through manual visual interpretation. The white area represents the unchanged superpixels, the green area represents the changing area corresponding to the standard, the red area represents false detection, and the blue area represents missing detection. It can be seen that the missing detections are distributed in the upper right corner, and each algorithm has varying degrees of missed detections. The LBP family has difficulty describing whole characteristics because of its local texture descriptor, and there are a lot of false detections, as shown in Figure 11c-e. As can be seen, the better results are attributed to the GLCM-SBS method and the proposed algorithm. Figure 12 shows the enlarged images of the change detection results in the different areas that were indicated in Figure 11. Figure 12a1-a8 illustrates the spectral-based, GLCM, LETRIST, LDZP, OMTLBP, and GLCM-SBS methods as well as the proposed algorithm and the standard of Area 1; Figure 12b1-b8 show the corresponding images of Area 2. It can be seen that the false detection instances are distributed around the previous bare soil areas, and the missing detections are around the urban areas. The spectral-based, GLCM, LETRIST, and LDZP methods have a lot of missing detections and false detections, as shown in Figure 12a1-a4,b1-b4. In general, among these algorithms, the proposed algorithm has higher similarity with the standard image. detection result, which is determined through manual visual interpretation. The white area represents the unchanged superpixels, the green area represents the changing area corresponding to the standard, the red area represents false detection, and the blue area represents missing detection. It can be seen that the missing detections are distributed in the upper right corner, and each algorithm has varying degrees of missed detections. The LBP family has difficulty describing whole characteristics because of its local texture descriptor, and there are a lot of false detections, as shown in Figure 11c-e. As can be seen, the better results are attributed to the GLCM-SBS method and the proposed algorithm.  Figure 12 shows the enlarged images of the change detection results in the different areas that were indicated in Figure 11. Figure 12a1-a8 illustrates the spectral-based, GLCM, LETRIST, LDZP, OMTLBP, and GLCM-SBS methods as well as the proposed algorithm and the standard of Area 1; Figure 12b1-b8 show the corresponding images of Area 2. It can be seen that the false detection instances are distributed around the previous bare soil areas, and the missing detections are around the urban areas. The spectral-based,  as shown in Figure 12a1-a4,b1-b4. In general, among these algorithms, the proposed algorithm has higher similarity with the standard image. In order to verify quantitatively the effectiveness of the proposed method in different data sets, the missing detection rate, false detection rate, overall accuracy rate, and Kappa coefficient are calculated and are listed in Table 4. It can be seen that with the increase in spatial resolution, the accuracy of detection has generally improved; for example, the accuracy of the spectral-based method is 92.45%. However, at the same time, the pixels among the superpixels increased dramatically, the texture features described by LETRIST and LDZP have difficulty representing global texture features, and the accuracies are rel- In order to verify quantitatively the effectiveness of the proposed method in different data sets, the missing detection rate, false detection rate, overall accuracy rate, and Kappa coefficient are calculated and are listed in Table 4. It can be seen that with the increase in spatial resolution, the accuracy of detection has generally improved; for example, the accuracy of the spectral-based method is 92.45%. However, at the same time, the pixels among the superpixels increased dramatically, the texture features described by LETRIST and LDZP have difficulty representing global texture features, and the accuracies are relatively low, only 88.68% and 89.15%, respectively. The OMTLBP uses a multi-scale approach to describe the whole texture characteristics, resulting in significant improvements. As a whole, the LBP family performs equal to or worse than the GLCM-based method, and the traditional GLCM-based methods perform worse than the proposed algorithm that describes the texture characteristics by considering the band correlation.

Discussion
To analyze deeply the performance of the comparison and proposed algorithms, the change situation is divided into four types, i.e., complete changing of forest-bareland, incomplete changing of forest-bareland, other changes, and unchanged. Here, complete changing means that dense forests change to bareland without green vegetation, and incomplete changing means that sparse forests change to bareland without green vegetation or that dense forests change to bareland with secondary vegetation. Typical examples of the detection results of these four types under each algorithm are shown in Figure 13a-d. For the complete changing type and unchanged type, most algorithms can achieve relatively perfect detection, except for the spectral-based and GLCM algorithms, as shown in Figure 13a,d. The main reason is the lack of sufficient description of the features of the superpixels. As the spatial resolution increases, the number of pixels in the superpixels increases, and the missing phenomenon improves, as shown in the third and fourth columns of Figure 13a. For the incomplete changing type, the distinguishability of the superpixel changes is reduced, which increases the difficulty of the change detection. The LBP family all have missing detection in the Sentinel-2 data, and the LETRIST and LDZP also exhibit this phenomenon in the Skysat data, as shown in the fifth and sixth columns of Figure 13b. The main reason is that the LBP family's ability to describe global texture features is not good enough, and the feature dimension is too high. For other changes, the LBP family has severe false detection compared with the statistics-based approach, i.e., GLCM-based, as shown in Figure 13c. In addition, algorithms with feature filtering are more effective and stable than those without feature filtering, as shown in the eighth and ninth columns of

Conclusions
In the study presented in this paper, an object-oriented remote sensing image change detection method based on a color co-occurrence matrix is proposed. It extends the gray co-occurrence matrix and constructs the color co-occurrence matrix by considering the correlation among bands. It can describe not only the texture in a band but also the texture

Conclusions
In the study presented in this paper, an object-oriented remote sensing image change detection method based on a color co-occurrence matrix is proposed. It extends the gray co-occurrence matrix and constructs the color co-occurrence matrix by considering the correlation among bands. It can describe not only the texture in a band but also the texture among bands, and the object-level texture can be more conducive to describing the overall characteristics. In addition, to avoid the impact of feature information redundancy on the detection accuracy, a cumulative backward search strategy is proposed to filter the features. It can effectively avoid the error filtering caused by the random fluctuation of feature ranking and obtain a stable and representative feature combination. Through the qualitative and quantitative analyses of the detection results of the proposed and comparison algorithms in different resolution remote sensing images, it is found that the proposed method is superior to the other methods in both false detection and missing detection. In the future, the proposed algorithm will be extended to incorporate multiple changing types, which can lay the foundation for mining special change patterns.