Fengyun-3D/MERSI-II Cloud Thermodynamic Phase Determination Using a Machine-Learning Approach

Abstract: Global cloud thermodynamic phase (CP) is normally derived from polar-orbiting satellite imaging data with high spatial resolution. However, constraining conditions and empirical thresholds used in the MODIS (Moderate Resolution Imaging Spectroradiometer) CP algorithm are closely associated with spectral properties of the MODIS infrared (IR) spectral bands, with obvious deviations and incompatibility induced when the algorithm is applied to data from other similar space-based sensors. To reduce the algorithm dependence on spectral properties and empirical thresholds for CP retrieval, a machine learning (ML)-based methodology was developed for retrieving CP data from China’s new-generation polar-orbiting satellite, FY-3D/MERSI-II (Fengyun-3D/Moderate Resolution Spectral Imager-II). Five machine learning algorithms were used, namely, k-nearest-neighbor (KNN), support vector machine (SVM), random forest (RF), Stacking, and gradient boosting decision tree (GBDT). The RF algorithm gave the best performance. One year of EOS (Earth Observation System) MODIS CP products (July 2018 to June 2019) were used as reference labels to train the relationship between MODIS CP (MYD06 IR) and six IR bands of MERSI-II. CALIOP (Cloud-Aerosol Lidar with Orthogonal Polarization), MODIS, and FY-3D/MERSI-II CP products were used together for cross-validation. Results indicate strong spatial consistency between ML-based MERSI-II and MODIS CP products. The hit rate (HR) of the RF CP product could reach 0.85 compared with MYD06 IR CP products. In addition, when compared with the operational FY-3D/MERSI-II CP product, the RF-based CP product had higher HRs. Using the CALIOP cloud product as an independent reference, the liquid-phase accuracy of the RF CP product was higher than that of operational FY-3D/MERSI-II and MYD06 IR CP products. This study aimed to establish a robust algorithm for deriving an FY-3D/MERSI-II CP climate data record (CDR) for research and applications.


Introduction
Clouds are important factors in regulating the global energy exchange and water cycle, reflecting and absorbing incident solar radiation and Earth's outgoing long-wave radiation [1]. As an important geophysical parameter, the cloud thermodynamic phase (CP) product derived from space-based imaging sensors such as MODIS (Moderate Resolution Imaging Spectroradiometer), which distinguishes ice, 'uncertain', and liquid-water phases, aids further understanding of Earth's weather and climate systems on global scales. The CP products derived from measurements of satellite imaging sensors [2][3][4] provide crucial a priori knowledge on cloud-top height (CTH), cloud optical thickness (COT), and cloud-top effective particle size (CPS).
Various retrieval methods for space-based imaging sensors have been developed in the past 20 years to improve the understanding of the natural characteristics of CP.
with the CALIPSO satellite for ML training and testing; they always pass over each other in the polar regions.
Therefore, an ML-based CP methodology was developed for FY-3D/MERSI-II using MODIS CP products as label data. To reduce algorithm dependence on spectral response features and the empirical thresholds of physical retrieval methods, one year of Aqua/MODIS CP products (July 2018 to June 2019) were used as reference label data to train the relationship between MODIS IR CP and FY-3D/MERSI-II six IR band radiance measurements. The official CALIOP cloud product (v. 4.2) and MODIS CP product, along with the operational FY-3D/MERSI-II CP product from the National Satellite Meteorological Center of China, were used for independent cross-validation. This ML-based CP algorithm is expected to mitigate deviations caused by differences in instrument spectral responses, as well as aid the development of a consistent global CP climate data record (CDR) through reprocessing the historical FY-3D/MERSI-II measurements.

The Optimal Machine-Learning Algorithm
ML techniques provide highly effective solutions to pattern recognition problems [27]. Here, five classical ML algorithms were used to train the prediction model for the FY-3D/MERSI-II CP, and the optimum algorithm for CP retrieval was obtained through comparing the results from five independent ML algorithms. The specific implementation steps were as follows: 1.
The ratio between training data and validation data was set. Note that for selecting the algorithm, only 1% of samples were randomly selected for training and testing with a ratio of 7:3 to reduce the memory occupation and time consumption; 3.
The performances of five ML algorithms were compared on the training sample set, namely, KNN [28,29], Stacking [30], RF [31], AdaBoost [32], and GBDT [33]. Adjustment parameters and dynamic ranges of the five algorithms are shown in Table 1 [19,34,35]. Through these comparisons, the GridSearchCV module in Sklearn, which gave relatively high accuracy with the shortest running time, was selected to adjust the parameters automatically and iteratively (Table 2).
Table 1. Adjustment parameters and dynamic ranges of different ML algorithms.
The hit rate (HR), probability of detection (POD), and false-alarm ratio (FAR) were used to score each algorithm:

HR = (A + D)/(A + B + C + D),
POD = A/(A + C),
FAR = B/(A + B),

where A is the number of liquid-water-phase pixels also identified as such by the ML training model, B is the number of ice-phase pixels identified as liquid-water phase by the ML training model, C is the number of liquid-water-phase pixels classified as ice phase by the ML training model, and D is the number of ice-phase pixels also identified as such by the ML training model. HR represents the ratio of the number of liquid-water-phase pixels and ice-phase pixels correctly identified by the ML training model to the total number of pixels. It signifies the overall retrieval accuracy for liquid-water-phase and ice-phase pixels. POD represents the ratio of the number of liquid-water-phase pixels correctly identified by the ML training model to the total number of liquid-water-phase pixels. Therefore, a higher POD denotes a higher accuracy of liquid-water-phase retrieval. FAR represents the ratio of the number of ice-phase pixels classified as liquid-water-phase pixels by the ML training model to the total number of liquid-water-phase pixels retrieved by the ML training model. In other words, it represents the misjudgment rate of the liquid-water phase. Of the five independent ML methods, the RF had the shortest running time, with high HR and POD scores and a relatively low FAR (Table 2). The RF algorithm therefore gave the best performance and was applied in the subsequent model building.
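The comparison described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic (six features stand in for the six MERSI-II IR inputs, label 1 for liquid water and 0 for ice), the hyperparameters are arbitrary small values rather than the ranges of Table 1, and the base learners of the Stacking model are chosen only for illustration.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for collocated samples (assumption): 6 features,
# label 1 = liquid water, label 0 = ice.
X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)

# 7:3 split between training and testing, as stated in the text.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Illustrative settings only; these are not the tuned values of the paper.
candidates = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "RF": RandomForestClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "GBDT": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "Stacking": StackingClassifier(
        estimators=[
            ("knn", KNeighborsClassifier(n_neighbors=5)),
            ("rf", RandomForestClassifier(n_estimators=25, random_state=0)),
        ],
        final_estimator=LogisticRegression(),
    ),
}

results = {}
for name, model in candidates.items():
    start = time.perf_counter()
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    elapsed = time.perf_counter() - start

    # With labels=[1, 0], rows are the reference (liquid, ice) and columns
    # are the prediction (liquid, ice), which gives A, C on row 0 and B, D on row 1.
    (A, C), (B, D) = confusion_matrix(y_test, y_pred, labels=[1, 0])
    results[name] = {
        "HR": float((A + D) / (A + B + C + D)),
        "POD": float(A / (A + C)),
        "FAR": float(B / (A + B)),
        "seconds": elapsed,
    }

for name, scores in results.items():
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

Running time is measured here as fit plus predict on one machine, so the ordering of the algorithms by speed may differ from Table 2.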
As a classical bagging ensemble classification and regression technique, the RF algorithm can easily run in a parallel computing mode and capture nonlinear or complex relationships between predictor and predictand [31]. This method trains a large number of decision tree predictors and then averages their outputs to improve prediction accuracy and reduce overfitting [19]. Tuning the RF model involves several parameters, including the number of trees in the forest (n_estimators), the maximum depth of the tree (max_depth), the minimum number of samples required to split an internal node (min_samples_split), and the minimum number of samples required to be at a leaf node (min_samples_leaf). It might seem that larger parameter values lead to better model precision; however, they can also cause overfitting and higher memory consumption. The final selection of the optimal model parameters depends on the change in out-of-bag (OOB) scores, which provide an approximately unbiased estimate of the performance of regression or classification models. In this study, the model with the shortest running time and highest fitting degree was selected on the basis of the variation trend of the OOB score for the next retrieval step.

Training Scheme and Model Configuration
The training and validation sets were derived from overlapping data from Aqua/MODIS and FY-3D/MERSI-II for 2 years (July 2018 to June 2020). The orbits of the two satellites do not coincide completely; hence, an orbit prediction method was adopted to find overpasses of the same region by both satellites with a time difference of less than 5 min. The different satellites also have spatial and parallax-effect differences [36]. To reduce image deformation, only pixels in the overlap region with zenith angles of 0°-45° for both the Aqua and the FY-3D satellites were retained. To ensure that the training samples were representative on a global scale, a sampling scheme was used to account for the influence of latitude, season, and overpass time (Figure 1). A total of 70,313,100 geo-located pixels were involved, with 49,546,611 being used for validation. As found in previous studies [19,35,37], an increase in the number of samples may not significantly improve model performance under similar distributions. There were a large number of collocated pixels for FY-3D and Aqua during the June 2018 to July 2019 period; thus, a sensitivity analysis was undertaken to determine the optimum number of samples. Totals of 3000, 5000, 10,000, 30,000, 50,000, 100,000, 300,000, or 500,000 pixels of training data were used, with 100,000 pixels being the optimum, above which further increases had no significant effect on model accuracy.
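The two screening rules in this paragraph (a time difference below 5 min and zenith angles within 0°-45° for both satellites) can be expressed as a boolean mask over matched pixel pairs. The sketch below is illustrative only: the function name, the argument names, and the assumption that pixel pairs have already been matched in space are hypothetical and not taken from the paper.

```python
import numpy as np

MAX_TIME_DIFF_S = 5 * 60   # 5 min limit stated in the text
MAX_ZENITH_DEG = 45.0      # upper bound of the 0-45 degree zenith-angle window


def collocation_mask(t_modis_s, t_mersi_s, zen_modis_deg, zen_mersi_deg):
    """Return True where a matched MODIS/MERSI-II pixel pair passes both tests.

    Times are observation times in seconds on a common clock; zenith angles
    are in degrees. All four inputs are hypothetical, pre-matched arrays.
    """
    t_modis_s = np.asarray(t_modis_s, dtype=float)
    t_mersi_s = np.asarray(t_mersi_s, dtype=float)
    zen_modis_deg = np.asarray(zen_modis_deg, dtype=float)
    zen_mersi_deg = np.asarray(zen_mersi_deg, dtype=float)

    close_in_time = np.abs(t_modis_s - t_mersi_s) < MAX_TIME_DIFF_S
    small_angles = (
        (zen_modis_deg >= 0.0) & (zen_modis_deg <= MAX_ZENITH_DEG)
        & (zen_mersi_deg >= 0.0) & (zen_mersi_deg <= MAX_ZENITH_DEG)
    )
    return close_in_time & small_angles


# Three example pairs: the second fails the time test (350 s apart),
# the third fails the angle test (50 degrees for MODIS).
mask = collocation_mask(
    t_modis_s=[0.0, 100.0, 1000.0],
    t_mersi_s=[200.0, 450.0, 1000.0],
    zen_modis_deg=[10.0, 20.0, 50.0],
    zen_mersi_deg=[30.0, 40.0, 20.0],
)
print(mask.tolist())  # [True, False, False]
```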
According to RF software package documentation, the empirical default value of random-split predictor variable max_features for the RF classification model is equal to the square root of the total number of predictive variables or features (http://scikit-learn.org/stable/modules/ensemble.html, accessed on 12 April 2021), and this parameter was set on the basis of model input-variable data. Due to the change in the amount of training data, related parameters of the RF model were retrained iteratively. The OOB score represents the fitting result of the unbiased estimation of the RF model; a higher OOB score denotes a better fit of the model. A higher number of trees in the forest (n_estimators) and a greater maximum depth of the trees (max_depth) lead to higher model fitting accuracy and a more complex model. Therefore, it is necessary to find a balance between model accuracy and running time by iterative training.
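The trade-off between OOB score and running time can be explored with a small parameter scan. The sketch below assumes scikit-learn's RandomForestClassifier with oob_score=True and uses synthetic data with six features; the grid values are deliberately small and are not the ranges examined in the paper.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for the training pixels (assumption).
X, y = make_classification(n_samples=1500, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)

scan = []
for n_estimators in (50, 100):
    for max_depth in (5, 10, 20):
        model = RandomForestClassifier(
            n_estimators=n_estimators,
            max_depth=max_depth,
            max_features="sqrt",   # square root of the number of features
            oob_score=True,        # score each sample with trees that did not see it
            random_state=0,
        )
        start = time.perf_counter()
        model.fit(X, y)
        scan.append({
            "n_estimators": n_estimators,
            "max_depth": max_depth,
            "oob_score": model.oob_score_,
            "seconds": time.perf_counter() - start,
        })

# Highest OOB score; ties are broken by the shorter fit time.
best = max(scan, key=lambda row: (row["oob_score"], -row["seconds"]))
for row in scan:
    print(row)
print("selected:", best["n_estimators"], best["max_depth"])
```

In practice the selection would weigh OOB score against running time rather than take the maximum score alone, as the text describes.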
When n_estimators = 400 and max_depth = 20, the OOB score was higher than that of other models with similar running times (Figure 2); this configuration had a short running time while maintaining the OOB value. Similarly, the balance of liquid-water and ice phase sample sizes can also significantly affect the accuracy of the final prediction [19,37]. Statistics for the number of liquid-water and ice phase samples during July 2018 to June 2019 indicated a liquid-water-to-ice-phase ratio during the northern mid-latitude summer of 1.56:1, with winter and spring-autumn ratios of 1.02:1 and 1.21:1, respectively. At low and high latitudes, the ratios were 1.35:1 and 0.8:1, respectively. Accordingly, the liquid-water-to-ice-phase ratio was set to the mean of 1.18:1. After iterative training and filtering, the optimal model configuration had the following parameters: n_estimators = 400; max_depth = 20; min_samples_split = 2; min_samples_leaf = 7; max_features = 4; N (number of pixels in training set) = 100,000.
The sensitivity of input variables can be calculated using the feature_importances_ attribute of the RF model, with the sum of importance of all variables being 1. Each input variable has its own physical characteristics and has a close relationship with the cloud phase. The higher the importance of a variable, the more sensitive the model is to it during training. The order of sensitivity of variables and their importance is shown in Table 3.
Table 3. The importance scores of predictive variables in the RF model and their corresponding rankings based on the configuration n_estimators = 400, max_depth = 20, min_samples_split = 2, min_samples_leaf = 7, and max_features = 4 for CP classification.
Stronger increases in the absorption of ice particles can be found at 10-11 µm than at 11-12 µm, while the effect on water particles is the opposite. This allows distinguishing between ice and water particles.
The water vapor absorption channel is very sensitive to the amount of water vapor
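The final configuration and the importance ranking described in this section could be reproduced along the following lines. This is a sketch under assumptions: the data are synthetic, the six feature names are hypothetical placeholders (the actual predictors are those listed in Table 3), and only the hyperparameter values are taken from the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Hypothetical placeholder names for the six predictors (see Table 3 for the
# actual variables used in the paper).
FEATURES = [f"input_{i}" for i in range(1, 7)]

# Synthetic stand-in for the 100,000 training pixels (assumption).
X, y = make_classification(n_samples=2000, n_features=6, n_informative=4,
                           n_redundant=0, random_state=0)

# Hyperparameters as reported for the optimal configuration.
model = RandomForestClassifier(
    n_estimators=400,
    max_depth=20,
    min_samples_split=2,
    min_samples_leaf=7,
    max_features=4,
    oob_score=True,
    random_state=0,
)
model.fit(X, y)

# Impurity-based importances, normalised by scikit-learn so that they sum to 1.
importances = model.feature_importances_
ranking = sorted(zip(FEATURES, importances), key=lambda pair: pair[1], reverse=True)

print(f"OOB score: {model.oob_score_:.3f}")
for rank, (name, value) in enumerate(ranking, start=1):
    print(rank, name, round(float(value), 3))
```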

Reference Pixel Label
The Aqua polar-orbiting satellite was launched on 4 May 2002 into a sun-synchronous orbit with a 1:30 p.m. local equator-crossing time [38], similar to the FY-3D satellite. The MODIS sensor aboard Aqua has 36 spectral bands, covering the spectrum from VIS to IR (0.4-14 µm). The EOS/MODIS Collection-5 CP product and earlier collections combine 8-11 µm brightness temperature (BT) differences (BTDs) and 11 µm BT to distinguish ice, liquid-water, and mixed-phase clouds through a series of decision trees and thresholds [3]. To further reduce the influence of land surface radiation, the University of Wisconsin-Madison team improved the CP algorithm for MODIS Collection-6 [3,8,9], providing an additional 1 km resolution CP product based on the 7.3, 8.5, 11, and 12 µm bands, which are used to construct the β index and BTDs for distinguishing cloud phases through decision trees [9]. The cloud phase is usually classified into three categories: ice, liquid-water, and 'uncertain'; however, because of the difficulty in distinguishing mixed-phase and uncertain categories in the MODIS Collection-6 CP product, they were merged into one 'uncertain phase' category [3]. Here, we conducted training only for certain ice-phase and liquid-water-phase cloud samples (the 'uncertain' phase was not considered).
Considering the rapid movement and evolution of clouds, the MODIS Collection-6 CP product (1 km resolution) from July 2018 to June 2019 was carefully geo-located with FY-3D/MERSI-II Level-1B data, within 5 min temporal difference. The Aqua/MODIS Collection-6 CP product was obtained from the US National Aeronautics and Space Administration (NASA) website (https://modis.gsfc.nasa.gov/, accessed on 12 April 2021).

FY-3D/MERSI-II
FY-3D/MERSI-II is capable of global observations with two IR split-window bands of 250 m resolution, enabling high-precision quantitative atmospheric, land, and oceanic products such as cloud, aerosol, water vapor, land surface characteristics, and ocean water color [39]. L1 data were obtained from the Fengyun Satellite Data Service Network (http://satellite.nsmc.org.cn/, accessed on 12 April 2021). Compared with the previous MERSI-I [40], the improved MERSI-II added NIR and IR spectral bands with central wavelengths of 1. 38 of central wavelength 11.25 µm (bandwidth 2.5 µm) was converted to two split-window bands of central wavelengths 10.8 and 12 µm. These IR bands also allow CP determination at night. Moreover, FY-3D/MERSI-II has six IR spectral bands similar to those of MODIS, which were used for CP training. The bandwidths of some FY-3D/MERSI-II IR bands are slightly wider than those of Aqua/MODIS, and their central wavelengths differ (see Figure 3). These differences in sensor spectral features may lead to noticeable deviations in CP retrievals if the MODIS algorithm is directly applied to FY-3D/MERSI-II.
Remote Sens. 2021, 13, x FOR PEER REVIEW 9
The National Satellite Meteorological Center/China Meteorological Administration (NSMC/CMA) has made an operational CP product based on MERSI-II available since October 2018. This operational CP product is developed using a combination of MERSI-II VIS (0.88-0.68 µm), NIR (1.55-1.64 µm and 3.55-3.93 µm), and two IR (10.3-11.3 µm and 11.5-12.5 µm) spectral bands [18]. Both the spectral and the texture characteristics of VIS, NIR, and IR bands are used to determine CP on a pixel basis with a series of thresholds for classifying liquid-water, ice, or mixed phases. The definition of the mixed phase in the FY-3D/MERSI-II CP product differs from that in the MODIS product, with the former being defined as the mixed-phase state of liquid-water and ice phases. When the reflectivity in the 1.65 and 3.75 µm bands is greater than a given threshold value, the phase is identified as a supercooled water cloud or mixed phase. In spite of both supercooled water and mixed-phase clouds exhibiting a liquid-water phase, they are categorized as ice phase due to their relatively low temperature [41] (<0 °C). For MODIS, water droplets at the top of the cloud layer and fuzzy ice particles that grow within the cloud (and fall through the cloud base) are identified as mixed phase, with mixed-phase and 'undetermined' classes being combined to reduce ambiguity [3]. The use of VIS bands in the FY-3D/MERSI-II CP algorithm means it can generate the CP product only during daytime. Li et al. (2019) reported that the FY-3D/MERSI-II CP product has biases in ice clouds.

CALIOP Cloud Products
The CALIPSO satellite was launched in 2006 with CALIOP, a wide-field camera (WFC), and an infrared imaging radiometer (IIR) aboard [42]. CALIOP is the first spaceborne cloud and aerosol lidar with three detection channels (1064 and 532 nm vertical and parallel channels) providing accurate high-resolution vertical profiles of aerosols and clouds globally [43]. The CALIOP cloud classification product includes liquid-water, ice, oriented ice crystals, and 'unknown' types. Validation products were derived from the CALIPSO 1-km cloud product (v. 4.20) with CALIOP cloud-top phase information [44]. Since Aqua and CALIPSO are in the 'Afternoon (A)-train' constellation, they have the same trajectories and cover the same areas at adjacent times [25]. To reduce the influence of vertically distributed mixed-phase cloud on validation, only single-layer cloud samples detected by CALIOP were used here.

Validation Using Independent MODIS CP Product
Spectral surface emissivity, surface type, and snow and ice coverage are all related to cloud and aerosol retrievals [45,46]; thus, for validation, different surfaces were classified according to latitude and season. Data for July 2019 to June 2020 (Section 3.1) were input into the trained RF model for validation. For subsequent product comparisons, the consistency of product phase states must be ensured. Here, liquid water was defined as positive and ice was defined as negative. Five classical indices were used to evaluate the classification results of liquid and ice phases: POD, FAR, HR, critical success index (CSI, optimal = 1), and Heidke skill score (HSS, optimal = 1). These are defined as follows:

POD = A/(A + C),
FAR = B/(A + B),
HR = (A + D)/(A + B + C + D),
CSI = A/(A + B + C),
HSS = 2(AD − BC)/[(A + C)(C + D) + (A + B)(B + D)],
where A is the number of pixels that both the MODIS reference CP and the FY-3D/MERSI-II CP retrieved from the ML model (this study) classified as liquid-water phase, B is the number of pixels identified as ice phase by MODIS but classified as liquid-water phase by MERSI-II in this study, C is the number of pixels labeled as liquid-water phase by MODIS but classified as ice phase by MERSI-II in this study, and D is the number of pixels that both the MODIS reference CP and the MERSI-II CP of this study classified as ice phase. A high POD value indicates high accuracy in liquid-water phase identification, while a low FAR value indicates high accuracy in ice-phase identification; a higher CSI indicates a higher success rate of the retrieval model for the liquid-water phase. As can be seen from Table 4, except for mid-latitude winters, the POD values of all categories were >0.9, FAR values were <0.2, and HR values were >0.8. The four evaluation indices for the mid-latitude summer were all relatively high with the best performance. POD in the mid-latitude winter was only 0.8, but FAR was relatively low. A low liquid-water cloud detection rate led to reduced HR. The influence of snow cover may have contributed to the low detection rate of liquid water in mid-latitude winters: when snow cover is not uniform, the mixed pixel effect and high snow reflectivity affect accuracy for other surface types near the snow-covered area, which, in turn, influences the accuracy of classification by the RF model. Wang et al. (2020) found that the MODIS CP product accuracy for snow, ice, and barren surface types is much lower than that of other types, leading to the identification of too many liquid-water phases as ice phases, consistent with the MERSI-II results in this study.
The quality of the MODIS CP product resulted in a reduction in liquid-water phase detection capability; performances in mid-latitude spring and autumn and at low latitudes were generally similar, with relatively high POD and high FAR, indicating the classification of too many ice-phase clouds as liquid-water phase. At high latitudes, where there are large areas of ice and snow cover, the total annual POD reached 0.94, and FAR was relatively low, indicating good ML performance, which differs from the results of Wang et al. (2020). This inconsistency may be due to the FY-3D and Aqua satellite orbits generally overlapping at high latitudes, since both training and validation data with small satellite viewing angles at high latitudes are more prevalent than in the mid-latitude winter.
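The five validation indices used in this section can be computed directly from the four contingency counts. The sketch below is a minimal helper with a hypothetical function name and illustrative counts (not values from Table 4); for HSS it assumes the standard Heidke form with denominator (A + C)(C + D) + (A + B)(B + D).

```python
def phase_scores(A, B, C, D):
    """Compute POD, FAR, HR, CSI, and HSS with liquid water as the positive class.

    A: reference liquid, retrieved liquid   B: reference ice, retrieved liquid
    C: reference liquid, retrieved ice      D: reference ice, retrieved ice
    """
    total = A + B + C + D
    return {
        "POD": A / (A + C),
        "FAR": B / (A + B),
        "HR": (A + D) / total,
        "CSI": A / (A + B + C),
        # Standard Heidke skill score (assumed form, see lead-in).
        "HSS": 2 * (A * D - B * C) / ((A + C) * (C + D) + (A + B) * (B + D)),
    }


# Illustrative counts only.
scores = phase_scores(A=90, B=10, C=10, D=90)
print({name: round(value, 3) for name, value in scores.items()})
# {'POD': 0.9, 'FAR': 0.1, 'HR': 0.9, 'CSI': 0.818, 'HSS': 0.8}
```

For these balanced counts the chance-level hit rate is 0.5, so an HR of 0.9 corresponds to HSS = (0.9 − 0.5)/(1 − 0.5) = 0.8, which matches the value returned by the helper.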

Comparison of Spatial Distributions
To better understand the reliability of the FY-3D/MERSI-II CP retrieval method based on the ML approach, the trained RF CP product was further compared with the MODIS CP product, as well as the operational FY-3D/MERSI-II CP product. Five images were randomly selected from each region of mid-latitude winter, mid-latitude spring and autumn, mid-latitude summer, low, and high latitudes. In the RF classification process, each pixel was assigned a score: if >0.5, the classification tended toward the liquid-water phase; if <0.5, the classification tended toward the ice phase; if around 0.5, the classification model had no obvious CP classification. Pixels with a score of 0.48-0.52 were defined as being of an 'uncertain' phase. For each latitude and season, a total of 15 random images of FY-3D and Aqua with coincident areas were selected for comparison. Overlapping pixels of liquid-water or ice phase in each image were extracted for comparison, and the POD, FAR, CSI, HR, and HSS precision indices were calculated. Results are shown in Table 5.
The MODIS, RF FY-3D, and operational FY-3D CP products are compared in Figures 4-8, where images highlight the spatial differences in the three products. As the liquid-water phase temperature is higher than that of the ice phase, the FY-3D/MERSI-II band 24 (10.3-11.3 µm) BT can be used as the reference image, whereby bluer colors denote lower temperatures and redder colors denote higher temperatures. Hence, cooler clouds are in blue, while warmer clouds are in red.
Figure 4 demonstrates that, in mid-latitude spring and autumn, the RF CP product (Figure 4b) is consistent with the MODIS CP product (Figure 4a). In the area of 60°N 10°W, the operational FY-3D/MERSI-II CP product (Figure 4c) identifies most liquid-water phases as ice and mixed phases, as reflected in the BT image (Figure 4d). In the region near 61°N 12°W, the RF CP indicated an ice phase in most liquid-water-phase regions of the MODIS CP product, with the RF product being able to identify fine ice-phase regions. CP retrieval results obtained by the RF algorithm were significantly improved compared with those of the operational FY-3D algorithm (Table 5), with POD increasing from 0.82 to 0.90.
In the mid-latitude winter, RF products and MODIS CP products had strong spatial consistency (Figure 5). In the area 48°S 15°E, many pixels were identified as liquid-water phase by both RF (Figure 5b) and MODIS CP products (Figure 5a). However, for the operational FY-3D/MERSI-II CP product (Figure 5c), these pixels were classified as mixed phase because the definition of mixed phase in the operational FY-3D/MERSI-II CP product differs from the definition of uncertain phase in the MODIS algorithm. Apart from the mixed phase, the mid-latitude winter RF CP product performed comparably with the operational FY-3D CP product (Table 5).
Very few uncertain phases were produced by the RF method in the mid-latitude summer (Figure 6), due mainly to the relatively strict threshold for that phase. More pixels were identified as uncertain or mixed phases for the MODIS (Figure 6a) and operational MERSI-II CP products (Figure 6c). In the area near 65°N 50°E, the operational MERSI-II CP product gave a large number of pixels of mixed phase, whereas, for the MODIS and RF products (Figure 6b), this region was covered mainly by liquid-water and ice phases, respectively. The significant reduction in FAR for the RF CP product represented an improvement in ice-phase detection accuracy (Table 5), and its POD decrease indicated a reduction in liquid-water phase detection capability. The CSI, HR, and HSS of the RF CP product all increased significantly while FAR decreased, indicating the improved ice-phase detection capability of the RF CP product associated with an overall increase in detection accuracy.
The three products had strong spatial consistency in the low-latitude region (Figure 7), although the cloud detection results of MODIS (Figure 7a) and FY-3D (Figure 7c) were significantly different. In the area near 7°N 40°W, the operational FY-3D/MERSI-II cloud mask product was significantly different from the MODIS CP product. Just as in the mid-latitude spring and autumn case (Figure 6), the operational FY-3D/MERSI-II CP product identified ice and mixed phases in areas where the other two products identified a liquid-water phase. With the reduction in FAR, the accuracy of ice-cloud detection for the RF CP product was improved, with CSI increasing significantly (Table 5).
The RF CP product obtained at high latitude was clearly more consistent with the MODIS CP product than the FY-3D product was (Figure 8). The FY-3D CP product (Figure 8c) generally inverted liquid-water phase to ice phase, with almost all liquid-water phases being wrongly identified as ice phases in the operational FY-3D CP product in high-latitude images (Table 5). The RF CP product (Figure 8b) significantly improved liquid-water phase detection capability while ensuring the accuracy of ice-phase identification, which also significantly increased CSI, HR, and HSS.
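The scores quoted in these case comparisons (POD, FAR, CSI, HR, and HSS) can be illustrated with a minimal sketch built on a 2 × 2 contingency table. It assumes the standard forecast-verification definitions and an arbitrary choice of which phase counts as the "event"; the function name `phase_scores` and the counts in the usage note are hypothetical and are not taken from Table 5.

```python
def phase_scores(hits, false_alarms, misses, correct_negatives):
    """Contingency-table scores for a two-class phase comparison.

    Assumed convention (not confirmed by this section): one phase is the
    "event"; hits = event in both products, false_alarms = event only in
    the tested product, misses = event only in the reference product.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    return {
        "POD": a / (a + c),          # probability of detection
        "FAR": b / (a + b),          # false alarm ratio
        "CSI": a / (a + b + c),      # critical success index
        "HR": (a + d) / n,           # hit rate (fraction correct)
        # Heidke skill score: accuracy relative to random chance
        "HSS": 2 * (a * d - b * c) / ((a + c) * (c + d) + (a + b) * (b + d)),
    }


if __name__ == "__main__":
    # Hypothetical pixel counts, for illustration only.
    for name, value in phase_scores(80, 10, 20, 90).items():
        print(f"{name}: {value:.3f}")
```

Under this convention, a lower FAR with an unchanged POD raises CSI, which matches the pattern described for the low-latitude case.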
From the above five comparisons, it can be seen that RF-derived CP and MODIS CP were generally consistent. Compared with these two products, too many liquid-water phase pixels were identified as ice phase in the operational FY-3D CP product in almost all cases regardless of season and latitude. The accuracy of the RF CP product in each case was higher than or equal to that of the operational FY-3D CP product, indicating that the consistency between MODIS and RF CP was much better than that between MODIS and the operational FY-3D/MERSI-II product. However, there were two disadvantages with the RF approach: (1) because retrieval is performed pixel by pixel, the derived CP image might appear discontinuous, i.e., isolated pixels may have a cloud phase different from that of neighboring pixels; (2) the threshold range chosen for the uncertain phase determines the size of the uncertain-phase region, and the threshold here was set according to experience, which might have influenced the final results.
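The sensitivity of the uncertain-phase region to its threshold can be illustrated with a small sketch. This section does not state the exact rule used, so the code assumes a simple one: a pixel is labeled uncertain when neither class probability from the classifier reaches a minimum confidence. The names `assign_phase` and `uncertain_fraction`, and the value 0.7, are hypothetical.

```python
def assign_phase(p_liquid, p_ice, min_confidence=0.7):
    """Label one pixel from its two class probabilities.

    Assumed rule (hypothetical): if the larger probability is below
    min_confidence, the pixel is 'uncertain'; otherwise it takes the
    more probable phase.
    """
    if max(p_liquid, p_ice) < min_confidence:
        return "uncertain"
    return "liquid" if p_liquid >= p_ice else "ice"


def uncertain_fraction(probabilities, min_confidence):
    """Fraction of pixels labeled 'uncertain' for a given threshold."""
    labels = [assign_phase(pl, pi, min_confidence) for pl, pi in probabilities]
    return labels.count("uncertain") / len(labels)


if __name__ == "__main__":
    # Hypothetical (p_liquid, p_ice) pairs for four pixels.
    pixels = [(0.9, 0.1), (0.55, 0.45), (0.2, 0.8), (0.65, 0.35)]
    for threshold in (0.6, 0.7):
        print(threshold, uncertain_fraction(pixels, threshold))
```

Raising the minimum confidence enlarges the uncertain-phase region and lowering it shrinks the region, which is why an empirically chosen value can influence the final maps.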

Comparison with Active CALIOP CP Data
When the optical thickness is low, cloud products retrieved from spaceborne active lidar systems (e.g., CALIOP) are often regarded as true values in determining the quality of passively observed cloud products [42]. Here, the MODIS CP, RF, and operational FY-3D/MERSI-II products were further validated using CALIOP cloud-top phase data from the CALIPSO 1 km resolution cloud product. As the liquid-water and ice phases of the MODIS IR CP product were used as reference labels, the quality of the product should be assessed to test the accuracy of the MODIS CP product against 'true' liquid-water and ice phase detections. The accuracy of the RF and operational FY-3D CP products should also be assessed relative to the CALIOP CP product. These validations were undertaken as described below.
First, MODIS, RF, operational FY-3D/MERSI-II, and CALIPSO/CALIOP cloud products collected within 5 min of each other from July 2019 to June 2020 were collocated, and a dataset of collocated pixels was generated. From this dataset, only pixels identified as liquid-water or ice phase in the MODIS CP product and labeled as single-layer clouds in CALIOP were retained (other pixels in the dataset were removed). The numbers of liquid-water and ice phase pixels in MODIS CP, RF, and CALIOP were calculated (Table 6). For the operational FY-3D/MERSI-II product, the phase corresponding to overlapping pixels included liquid-water, ice, and mixed phases. As the temperature of the mixed phase is below the freezing point (0 °C), it was categorized as ice phase. The accuracies of the MODIS, RF, and operational FY-3D/MERSI-II products were compared and validated against the CALIOP CP benchmark.

Table 6. CALIOP cloud-top phase cross-validation with MODIS IR, RF, and operational FY-3D/MERSI-II CP products based on season and latitude (the uncertain phase in MODIS and RF products was eliminated, and the mixed and ice phases were merged in the operational FY-3D/MERSI-II product). Yellow shading represents the correct probability of the CP products.

The RF and MODIS CP products generally had comparable accuracy (Table 6). In mid-latitude summer and mid-latitude spring and autumn, the RF approach demonstrated slightly better ability in identifying the ice phase. In high latitudes, mid-latitude summer, and mid-latitude spring and autumn, the RF approach detected more liquid-water phases than MODIS. The performance of the RF model in the mid-latitude winter was significantly poorer than that of MODIS, which remains a problem to be addressed. The retrieval accuracy of the operational FY-3D/MERSI-II CP product was inferior to that of the RF product for liquid-water and ice phases (Table 6), and the liquid-water phase accuracy of the RF product was higher than that of the MODIS and operational FY-3D/MERSI-II product. A possible reason is the satellite zenith angle range control during training. The RF CP product, thus, improved liquid-water phase inversion.
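The selection and scoring steps of this validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' processing chain: the `Match` record, its field names, and the per-phase accuracy definition (fraction of CALIOP pixels of a given phase that a product assigns to the same phase) are hypothetical. Only the 5 min window, the single-layer and MODIS liquid-water/ice filters, the removal of uncertain pixels, and the merging of mixed into ice phase come from the text.

```python
from dataclasses import dataclass

MAX_TIME_DIFF_S = 5 * 60  # 5 min collocation window, as stated in the text


@dataclass
class Match:
    """One collocated pixel (hypothetical record layout)."""
    dt_seconds: float          # passive-sensor time minus CALIOP time
    caliop_phase: str          # 'liquid' or 'ice'
    caliop_single_layer: bool
    modis_phase: str           # 'liquid', 'ice', or 'uncertain'
    rf_phase: str              # 'liquid', 'ice', or 'uncertain'
    fy3d_phase: str            # 'liquid', 'ice', or 'mixed'


def select_matches(matches):
    """Keep pixels inside the time window that are single-layer in CALIOP
    and liquid-water or ice phase in the MODIS CP product."""
    return [
        m for m in matches
        if abs(m.dt_seconds) <= MAX_TIME_DIFF_S
        and m.caliop_single_layer
        and m.modis_phase in ("liquid", "ice")
    ]


def accuracy_by_phase(matches, product):
    """Per-phase agreement of one product with CALIOP (assumed definition)."""
    totals = {"liquid": 0, "ice": 0}
    correct = {"liquid": 0, "ice": 0}
    for m in matches:
        predicted = getattr(m, product)
        if predicted == "mixed":
            predicted = "ice"      # mixed phase merged into ice phase
        if predicted == "uncertain" or m.caliop_phase not in totals:
            continue               # uncertain pixels are eliminated
        totals[m.caliop_phase] += 1
        correct[m.caliop_phase] += predicted == m.caliop_phase
    return {p: correct[p] / totals[p] for p in totals if totals[p]}
```

A call such as `accuracy_by_phase(select_matches(records), "rf_phase")` would then give one liquid-water and one ice value for a set of collocated records, grouped beforehand by season and latitude band if a layout like Table 6 is wanted.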

Summary
This study aimed to establish a global, all-day FY-3D/MERSI-II algorithm for long-term CP CDR Fengyun satellite data. To reduce algorithm dependence on the spectral response properties and the empirical thresholds of physical methods, an ML-based methodology was developed for retrieving CP from China's polar-orbiting satellite FY-3D/MERSI-II. The MODIS CP product was used as reference data for training, with five ML algorithms being used to train the sample set. The RF model, with relatively high accuracy and the shortest running time, was selected for use in training and retrieval. In verification, the RF algorithm yielded POD values >0.9 for all categories except mid-latitude winter, with FAR values <0.2 and HR values >0.8. The RF CP product was, thus, consistent with the MODIS CP product. Derived CP images of different representative regions were selected for comparisons, with the HR of each RF CP product image being higher than that of the corresponding operational FY-3D/MERSI-II product. When compared with CALIOP cloud products, the accuracy of liquid-water phase detection by the RF product was higher than that of the operational FY-3D/MERSI-II CP product. The following conclusions were drawn from the validation analyses:

1. The RF CP product is spatially consistent with the MODIS CP product, and its accuracy is comparable with that of the MODIS CP product when compared with CALIPSO cloud products.

2. The RF-based CP algorithm has the highest accuracy at high latitudes and the lowest accuracy at mid-latitude winter compared with the MODIS CP product.

3. The RF product developed here may supplement the lack of data from existing MERSI-II CP products at night; it also indicates an improvement in accuracy over the operational FY-3D/MERSI-II CP product.

Although the accuracy of the RF CP product is comparable to that of the MODIS CP product, large uncertainties remain concerning the threshold of the uncertain phase. The ML method has potential for use in exploiting image data from FY-3D weather satellites and in providing a global CP product with higher spatial resolution (e.g., 250 m).