A satellite-derived dataset on vegetation phenology across Central Asia from 2001 to 2023

Satellite-observed land surface phenology (LSP) data have helped us better understand terrestrial ecosystem dynamics at large scales. However, uncertainties remain in comprehending LSP variations in Central Asian drylands. In this article, an LSP dataset covering Central Asia (45–100°E, 33–57°N) is introduced. This LSP dataset was produced based on Moderate Resolution Imaging Spectroradiometer (MODIS) 0.05-degree daily reflectance and land cover data. The phenological dynamics of drylands were tracked using the seasonal profiles of near-infrared reflectance of vegetation (NIRv). NIRv time series processing involved the following steps: identifying low-quality observations, smoothing the NIRv time series, and retrieving LSP metrics. In the smoothing step, a median filter was first applied to reduce spikes, after which the stationary wavelet transform (SWT) was used to smooth the NIRv time series. The SWT was performed using the Biorthogonal 1.1 wavelet at a decomposition level of 5. Seven LSP metrics were provided in this dataset, and they were categorized into the following three groups: (1) timing of key phenological events, (2) NIRv values essential for the detection of the phenological events throughout the growing season, and (3) NIRv value linked to vegetation growth state during the growing season. This LSP dataset is useful for investigating dryland ecosystem dynamics in response to climate variations and human activities across Central Asia.


[...] Sustainable Development Goal (SDG) 15 [1], which is related to ecosystem productivity, biodiversity, and land degradation.

Background
Information on land surface phenology (LSP) can facilitate our understanding of land surface dynamics [2, 3]. Knowledge of LSP has been significantly enhanced by the rapid growth of time-series remote sensing data [3]. Nonetheless, a substantial knowledge gap remains regarding LSP in Central Asia, a region characterized by extensive drylands [3, 4]. The degradation of ecosystem functions and services in drylands is a barrier to sustainable development in Central Asia [5]. Data on LSP are important for comprehending the dynamics of several key ecological indicators related to SDG 15 [6]. Although global LSP products are available, information on LSP in Central Asia remains insufficient, partially due to widespread missing data for sparse vegetation and uncertainties in modeling complex vegetation index (VI) time series [7]. In this context, I developed an LSP dataset specific to the drylands in Central Asia [8] by utilizing satellite-derived images of the near-infrared reflectance of vegetation (NIRv) [9], considering the low vegetation cover and complexity of satellite VI time series in drylands.

Data Description
The spatio-temporal characteristics of the LSP dataset for Central Asia (LSPCA) are outlined in Table 1. The seven LSP metrics in the dataset were categorized into the following three groups: (1) timing of key phenological events, (2) NIRv values essential for detecting the phenological events throughout the growing season, and (3) NIRv value linked to vegetation growth state during the growing season (Table 2). Additionally, other phenological metrics, such as the length of the vegetation growing season (LOS), can be calculated using the provided LSP metrics.
The images of the LSP metrics are included in the compressed file 'LSPCA.zip'. Each LSP metric is stored in an individual subfolder, and the name of the subfolder is the same as that of the LSP metric. Each subfolder contains 23 images, one per year. The name of an image denotes the LSP metric and the corresponding year. For example, 'SOS_2022.tif' indicates the SOS image for 2022.
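The folder layout described above can be sketched as a small path builder. This is only an illustration: the function name `lspca_image_paths`, the root folder name `LSPCA` (the assumed extraction directory), and the assumption that every metric follows the `<METRIC>_<YEAR>.tif` pattern of the SOS example are not confirmed by the dataset description.

```python
from pathlib import PurePosixPath


def lspca_image_paths(metric, root="LSPCA", first_year=2001, last_year=2023):
    """List the expected image paths of one LSP metric, one file per year.

    Assumes the archive was extracted into `root` and that each file is
    named <METRIC>_<YEAR>.tif inside a subfolder named after the metric.
    """
    return [
        PurePosixPath(root) / metric / f"{metric}_{year}.tif"
        for year in range(first_year, last_year + 1)
    ]


sos_paths = lspca_image_paths("SOS")
```

Under these assumptions, `sos_paths` holds 23 entries, matching the number of images per subfolder stated above.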
The values of POS, SOS, and EOS fall within specific valid ranges, as outlined in Table 2. The POS is within the range of day of year (DOY) 32 to 258. In the southern regions of Central Asia, the SOS may occur during the last two months of the preceding calendar year. For example, the SOS of a pixel was recorded on December 9, 2021. In this case, the year of the SOS is nevertheless identified as 2022, given that the peak NIRv of this growing season occurs in 2022 [10]. Consequently, the SOS value is negative (-22 for an SOS on December 9, 2021). The spatial pattern of the SOS in 2022, as depicted in Fig. 1, illustrates numerous instances of negative SOS values. The yearly identifiers for all other LSP metrics are likewise assigned according to the POS [10]. This process allows users to track an entire growing season across two calendar years [10]. Notably, NIRv_MEAN represents the mean NIRv value from the SOS to the EOS, rather than the mean value over a calendar year. Fig. 2 shows the spatial distribution of NIRv_MEAN during the growing season identified as 2022. Additionally, the values of the masked pixels are set to -20,000.
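The DOY convention above can be illustrated with a short sketch. It assumes that DOY 1 is January 1 of the year assigned to the growing season, so that DOY 0 is December 31 of the preceding year, which is consistent with the value of -22 for December 9, 2021. The function names are hypothetical, and taking LOS as EOS minus SOS is an assumed convention rather than a definition given in this article.

```python
from datetime import date, timedelta

FILL_VALUE = -20000  # value of masked pixels, as stated in the text


def doy_to_date(doy, season_year):
    """Convert a DOY (possibly zero or negative) to a calendar date.

    Assumes DOY 1 is January 1 of `season_year`; non-positive values
    fall in the preceding calendar year.
    """
    if doy == FILL_VALUE:
        return None  # masked pixel, no date available
    return date(season_year - 1, 12, 31) + timedelta(days=doy)


def length_of_season(sos, eos):
    """LOS in days, taken here as EOS minus SOS (assumed convention)."""
    if FILL_VALUE in (sos, eos):
        return None
    return eos - sos


sos_date = doy_to_date(-22, 2022)  # the example pixel discussed above
```

Because SOS and EOS share the same yearly identifier, the subtraction remains valid when the SOS is negative.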

Experimental Design, Materials and Methods
The MODIS MCD43C4 V061 0.05° daily reflectance data [11] for the period of 2000–2023 and the MCD12C1 V061 land cover data [12] for 2001 were obtained from https://search.earthdata.nasa.gov/search. The NIRv was calculated to track the vegetation seasonality dynamics (Eq. (1)). NIRv is suitable for observing vegetation dynamics in drylands with a low vegetation cover [9, 13]. A previous study reported that NIRv can be used to effectively estimate the gross primary productivity of vegetation in sparse drylands [13].
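Eq. (1) is not reproduced in this text. As a hedged illustration, NIRv is commonly computed as the product of the normalized difference vegetation index (NDVI) and the near-infrared reflectance. The sketch below uses that common form; whether the dataset applied exactly this form, for example without an NDVI offset, is an assumption, and the function name `nirv` is hypothetical.

```python
def nirv(red, nir):
    """NIRv from red and near-infrared reflectance (both as fractions).

    Uses the common form NDVI * NIR; this is an assumption, since
    Eq. (1) of the article is not reproduced in the text.
    """
    ndvi = (nir - red) / (nir + red)
    return ndvi * nir


example = nirv(red=0.05, nir=0.30)  # hypothetical reflectance values
```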
The following steps were conducted to process the NIRv time series of a land pixel (as identified by the land cover data): identifying low-quality observations, smoothing the NIRv time series, and retrieving the LSP metrics from the smoothed NIRv time series.

Identifying Low-Quality Observations
Observations with a Bidirectional Reflectance Distribution Function (BRDF) quality value greater than 3 were considered to be low quality [14]. Additionally, data with NIRv < 0 or NIRv > 0.6 were identified as low-quality data. The corresponding data points were excluded from the NIRv time series. The normalized difference snow index (NDSI) [15] was employed to detect snow-affected NIRv values, defined as those with NDSI > 0.1 [16]. The 5th percentile of high-quality NIRv values observed within a five-year temporal window was used to replace the snow-affected observations in the NIRv time series [10]. Any remaining data gaps in the NIRv time series were filled using linear interpolation.
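The screening and gap-filling rules above can be sketched as follows. This is a minimal illustration, not the code used to produce the dataset. The function name `screen_and_fill` is hypothetical, the snow background value `snow_fill` is passed in rather than computed (in the article it is the 5th percentile of high-quality NIRv within a five-year window), and holding the nearest valid value at the ends of the series is an assumption.

```python
def screen_and_fill(nirv, brdf_quality, ndsi, snow_fill):
    """Screen one pixel's NIRv series and fill the resulting gaps.

    nirv, brdf_quality, ndsi: equal-length daily sequences.
    snow_fill: background value used for snow-affected days.
    """
    cleaned = []
    for value, quality, snow_index in zip(nirv, brdf_quality, ndsi):
        if quality > 3 or value < 0 or value > 0.6:
            cleaned.append(None)        # low quality: excluded
        elif snow_index > 0.1:
            cleaned.append(snow_fill)   # snow-affected: replaced
        else:
            cleaned.append(value)

    valid = [i for i, v in enumerate(cleaned) if v is not None]
    if not valid:
        raise ValueError("no high-quality observations in the series")

    filled = list(cleaned)
    for i, v in enumerate(cleaned):
        if v is not None:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is None:                # gap at the start (assumption)
            filled[i] = cleaned[right]
        elif right is None:             # gap at the end (assumption)
            filled[i] = cleaned[left]
        else:                           # linear interpolation in time
            weight = (i - left) / (right - left)
            filled[i] = cleaned[left] + weight * (cleaned[right] - cleaned[left])
    return filled
```

The neighbour search is written for clarity rather than speed; a production version would vectorize it over the whole image stack.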

Smoothing the NIRv Time Series
Droughts can lead to complex seasonal NIRv profiles of drylands in Central Asia. Here, I applied a median filter to detect spikes in the NIRv time series [17]. If the absolute difference between the raw and the filtered NIRv values (11-day temporal window) was greater than 2.5 times the standard deviation of the raw NIRv values within a 21-day window, the raw NIRv value was replaced with the filtered value. The stationary wavelet transform (SWT), a local filtering method, was then applied to further reduce the noise in the NIRv time series [18, 19]. SWT was performed on the NIRv time series at a decomposition level of 5 using the Biorthogonal 1.1 wavelet. The SWT decomposed the NIRv time series into an approximation signal and five detail signals. Typically, the detail signals contain information on high-frequency fluctuations in the NIRv time series, which may be considered noise [19]. Here, all detail signals were eliminated to obtain a smooth NIRv time series [20]. An example of the final NIRv time series reconstructed through the two aforementioned steps is shown in Fig. 3.
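The two smoothing steps can be sketched as follows. This is an illustrative pure-Python sketch under stated assumptions, not the code used to produce the dataset. The names `despike` and `swt_smooth` are hypothetical. The use of the population standard deviation and of truncated windows at the series edges are assumptions. `swt_smooth` implements an undecimated two-tap (Haar) transform with periodic boundaries, relying on the fact that the Biorthogonal 1.1 filters coincide with the Haar filters; a wavelet library would normally be used instead, and its boundary handling may differ.

```python
import math
import statistics


def despike(nirv, median_window=11, std_window=21, k=2.5):
    """Replace spikes with the local median (windows in days)."""
    half_m, half_s = median_window // 2, std_window // 2
    out = list(nirv)
    for i, value in enumerate(nirv):
        local_median = statistics.median(nirv[max(0, i - half_m): i + half_m + 1])
        local_std = statistics.pstdev(nirv[max(0, i - half_s): i + half_s + 1])
        if abs(value - local_median) > k * local_std:
            out[i] = local_median
    return out


def swt_smooth(nirv, level=5):
    """Smooth a series by discarding all detail signals of a Haar SWT."""
    n = len(nirv)
    approx = list(nirv)
    # Forward transform: keep only the approximation at each level.
    for j in range(level):
        step = 2 ** j
        approx = [(approx[i] + approx[(i + step) % n]) / math.sqrt(2)
                  for i in range(n)]
    # Inverse transform with every detail signal set to zero.
    for j in reversed(range(level)):
        step = 2 ** j
        approx = [(approx[i] + approx[(i - step) % n]) / (2 * math.sqrt(2))
                  for i in range(n)]
    return approx
```

With the detail signals removed, each level acts as a symmetric three-point weighted average at a spacing of 1, 2, 4, 8, and 16 days, so the level-5 result preserves a constant series and introduces no time shift.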

Retrieving the LSP Metrics
The LSP metrics were estimated annually within a 14-month temporal window. For example, when retrieving the LSP metrics for 2005, NIRv data from November 2, 2004 to December 31, 2005 were used. A single growing season was considered for the LSPCA dataset. The maximum NIRv peak (NIRv_PEAK) and the corresponding date (POS) were initially detected. POS was assumed to occur no later than DOY 258. The following steps were performed if the detected NIRv_PEAK exceeded 0.015. First, the seasonal minimum NIRv values before and after POS were detected. Then, the seasonal amplitudes of the NIRv for the vegetation green-up (AMP_G) and senescence (AMP_S) phases within the temporal window were calculated. Finally, thresholds set to 20% of AMP_G and AMP_S were applied to retrieve the SOS and EOS, respectively [7]. If the detected POS occurred prior to DOY 32, all LSP metrics for the corresponding growing season were masked, as the NIRv time series for that growing season may be abnormal.
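The retrieval rules above can be sketched for a single pixel as follows. This is a hedged illustration rather than the author's implementation. The function name `retrieve_lsp` is hypothetical. The window is assumed to start at DOY -59 (November 2 of the preceding year when DOY 0 is December 31). The thresholds are assumed to be measured upward from the seasonal minima, masking is assumed when NIRv_PEAK does not exceed 0.015, and the SOS and EOS are taken as the first and last days of the continuous run around the POS on which the NIRv is at or above the respective threshold.

```python
import math


def retrieve_lsp(nirv, first_doy=-59, peak_min=0.015, frac=0.2,
                 pos_min=32, pos_max=258):
    """Return LSP metrics for one growing season, or None if masked."""
    doys = [first_doy + i for i in range(len(nirv))]

    # Peak of season: maximum NIRv among dates up to pos_max.
    candidates = [i for i, d in enumerate(doys) if d <= pos_max]
    p = max(candidates, key=lambda i: nirv[i])
    peak = nirv[p]
    if peak <= peak_min or doys[p] < pos_min:
        return None  # masked growing season

    # Seasonal minima before and after the peak.
    lo_before = min(range(0, p + 1), key=lambda i: nirv[i])
    lo_after = min(range(p, len(nirv)), key=lambda i: nirv[i])
    amp_g = peak - nirv[lo_before]   # green-up amplitude (AMP_G)
    amp_s = peak - nirv[lo_after]    # senescence amplitude (AMP_S)
    thr_g = nirv[lo_before] + frac * amp_g
    thr_s = nirv[lo_after] + frac * amp_s

    s = p
    while s > lo_before and nirv[s - 1] >= thr_g:
        s -= 1                       # walk back along the green-up limb
    e = p
    while e < lo_after and nirv[e + 1] >= thr_s:
        e += 1                       # walk forward along the senescence limb

    season = nirv[s:e + 1]
    return {"SOS": doys[s], "POS": doys[p], "EOS": doys[e],
            "NIRv_PEAK": peak, "NIRv_MEAN": sum(season) / len(season)}


# Synthetic single-peak season (hypothetical values) on a 425-day window.
demo = [0.02 + 0.2 * math.exp(-(((d - 180) / 40) ** 2)) for d in range(-59, 366)]
metrics = retrieve_lsp(demo)
```

The synthetic series peaks at DOY 180, so the returned metrics describe a season centred on that date, with NIRv_MEAN averaged from the SOS to the EOS rather than over the calendar year.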

Limitations
In vegetated drylands, droughts can result in extremely low vegetation productivity and abnormal seasonal NIRv profiles. Consequently, cases of missing data can occur in the 23-year time series of an LSP metric, particularly for desert vegetation pixels. In such cases, a long-term trend analysis of the LSP metrics for a specific pixel cannot be carried out.

Ethics Statement
The author confirms that the current work does not involve human subjects, animal experiments, or any data collected from social media platforms.

Fig. 1. Spatial distribution of the SOS in 2022 derived from the land surface phenology dataset for Central Asia.

Fig. 2. Spatial distribution of the NIRv_MEAN in 2022 derived from the land surface phenology dataset for Central Asia.

Fig. 3. An example of the reconstructed near-infrared reflectance of vegetation (NIRv) time series.
© 2024 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Table 1
Spatio-temporal characteristics of the land surface phenology dataset for Central Asia.

Table 2
Characteristics of the land surface phenology metrics.