Observations of Satellite Land Surface Phenology Indicate That Maximum Leaf Greenness Is More Associated With Global Vegetation Productivity Than Growing Season Length

Vegetation green leaf phenology directly impacts gross primary productivity (GPP) of terrestrial ecosystems. Satellite observations of land surface phenology (LSP) provide an important means to monitor the key timing of vegetation green leaf development. However, differences between satellite‐derived LSP proxies and in situ measurements of GPP make it difficult to quantify the impact of climate‐induced changes in green leaf phenology on annual GPP. Here, we used 1,110 site‐years of GPP measurements from eddy‐covariance towers in association with time series of satellite LSP observations from 2000 to 2014 to show that while satellite LSP explains a large proportion of variation in annual GPP, changes in green‐leaf‐based growing season length (GSL, leaf development period from spring to autumn) had less impact on annual GPP by ∼30% than GSL changes in GPP‐based photosynthetic duration. Further, maximum leaf greenness explained substantially more variance in annual GPP than green leaf GSL, highlighting the role of future vegetation greening trends on large‐scale carbon budgets. Site‐level variability contributes a substantial proportion of annual GPP variance in the model based on LSP metrics, suggesting the importance of local environmental factors altering regional GPP. We conclude that satellite LSP‐based inferences regarding large‐scale dynamics in GPP need to consider changes in both green leaf GSL and maximum greenness.

(ΣGPP) to GPP-based photosynthetic seasonal and physiological covariates. For example, Xia et al. (2015) found that over 90% of the ΣGPP variation can be explained by photosynthetic GSL and annual maximum GPP. Following the same method, Zhou et al. (2016) and Zhou et al. (2017) supported the conclusion and found that changes in maximum GPP explain more variability in ΣGPP than the start and end of photosynthetic timing. However, whether a similar relationship holds for leaf-greenness-based phenology and physiology, and how the relationship varies by biome and space, remains unclear. More importantly, although GPP upscaling methods that relies on leaf greenness related measurements (Jung et al., 2019;Le Quéré et al., 2016) have been developed to estimate large spatial scale vegetation productivity, the comparison between impacts of GSL changes in green leaf development on vegetation productivity with that of GSL changes in photosynthetic duration over large spatial scales have rarely been studied.
In contrast, satellite remote sensing provides spatially continuous long-term observations of green leaf development and GSL at global scale (Ganguly et al., 2010;Gray et al., 2019). Satellite-observed land surface phenology (LSP) data are the only source of global green leaf GSL information (Caparros-Santiago, 2021;Piao et al., 2019) and has been widely incorporated into both process-based and data-driven ecological models to upscale field-based measurements of GPP from eddy covariance towers to larger areas (Falge et al., 2002;Richardson et al., 2010Richardson et al., , 2013. However, scaling up GPP by directly using LSP may be challenging because LSP measures the period of green leaf development rather than photosynthesis, ecosystem-scale phenology measured by LSP data sets do not always accurately capture the seasonality of GPP, and annual maximum GPP in Xia et al. (2015) is not necessarily reflected by annual maximum of leaf greenness. Moreover, although large-scale ΣGPP responses to leaf greenness changes have been investigated (Huang et al., 2018;Keenan et al., 2014), few studies have explored how changes in green-leaf-based GSL and maximum greenness jointly regulate geographic and interannual variation in ΣGPP across large spatial scales, and how the effects vary by biome and space, even though both processes occur simultaneously in many ecosystems. Additionally, many previous studies used methods that required multiple years of EC data for each site, thus excluding a large number of sites with insufficient data. This constrained our knowledge about the relationship between satellite green-leaf observations and ΣGPP to relatively few sites and biomes (F. J. Zhang et al., 2022). Therefore, investigating the nature and magnitude of how satellite LSP observations explain ΣGPP variation, as well as the joint effect of changes in green-leaf-based GSL and maximum greenness on ΣGPP at large spatial scales across multiple biomes, are helpful to understanding how satellite observations can be used to upscale knowledge derived from field-measured photosynthesis and study climate change induced dynamics in the global carbon cycle.
In this study, by using extensive EC measurements provided by the FLUXNET2015 project (Pastorello et al., 2020) in combination with global time series of LSP observations from NASA's Moderate Spatial Resolution Imaging Spectroradiometer (MODIS) from 2000 to 2014, we investigated the strength of the covariance between LSP and GPP seasonality and modeled how satellite-observed green leaf phenology and maximum greenness control ΣGPP across global terrestrial ecosystems. We explored questions including How well do seasonal metrics of satellite-observed greenness explain large-scale variation in ΣGPP? Do changes in phenology have biome-dependent effects on ΣGPP? And, how effective would satellite-observed metrics be in estimating ΣGPP spatially and interannually? We first compared the consistency between satellite observed leaf phenology and EC-derived GPP seasonality, then replicated previous studies using EC-derived photosynthetic GSL and maximum GPP to estimate ΣGPP, and finally evaluated the performance of using satellite LSP-derived green leaf GSL and maximum greenness in inferring ΣGPP. We used Bayesian hierarchical models that allowed us to use all available observations, even at sites with only a few years of EC data, to quantify the variation of the relationship between ΣGPP and green leaf phenology within and across global biomes and flux sites. We found that satellite LSP-derived GSL had less ΣGPP sensitivity to GPP derived GSL by ∼30% and that the GSL-ΣGPP relationship varies by biome type and flux site. This site-level variability might highlight the role of local climate factors in controlling productivity. Importantly, we also found that maximum greenness exerted stronger control on ΣGPP than LSP-derived GSL, suggesting that future leaf greening trends, represented by trends of increasing maximum greenness, would increase global vegetation productivity more than extending the time period of green leaf development.

Data Sets
We obtained GPP measurements of 166 EC flux sites distributed across the globe (Figure 1) from the FLUX-NET2015 project (https://fluxnet.org/data/fluxnet2015-dataset/). Although the flux data set provides decades of EC measurements until 2014, we only used the EC data within 2000-2014 because this time period matches with MODIS observations. Then, we filtered out site-years data that contain continuous missing values with more than 45 days. After data preprocessing, there were 1,110 site-years data left, representing 11 biome types ( Figure 1). To identify which GPP variable in the data set was better for studying phenology, we undertook a sensitivity analysis ( Figure S1 in Supporting Information S1) by calculating GPP-based phenometrics from each daytime GPP variable in the FLUXNET2015 data set. As no significant differences were found among those GPP variables, we chose GPP_DT_VUT_REF, which represents daytime GPP using a Variable U-star (U*) Threshold (Pastorello et al., 2020), to conduct this research. We extracted daily GPP time series (FULLSET_DD) as well as ΣGPP measurements for each site-year (FULLSET_YY) from the FLUXNET2015 data to conduct this analysis. Here, a site-year is defined as a calendar year at each site.
To analyze EC measurements with satellite LSP observations, we extracted satellite LSP data from 2000 to 2014 for flux site locations from the MCD12Q2 v6 product (https://lpdaac.usgs.gov/products/mcd12q2v006/) (Ganguly et al., 2010;Gray et al., 2019). This data product provides global, annual LSP estimates at 500 m spatial resolution based on daily time series of the two-band enhanced vegetation index (EVI2, Jiang et al., 2008). The MCD12Q2 LSP product uses spline functions to smooth the EVI2 time series and percentage thresholds to extract phenological dates. The estimated phenometrics include Greenup, MidGreenup, Maturity, Peak, Senescence, MidGreendown, and Dormancy, representing 15%, 50%, 90%, and 100% of annual EVI2 amplitude in spring and autumn, respectively.

GPP-Based Seasonality Estimation and Evaluation
To compare with satellite LSP phenometrics, we applied the MCD12Q2 LSP algorithm (Gray et al., 2019) to the daily GPP time series to retrieve annual GPP-based seasonality metrics as well as annual maximum GPP (GPP max ) and annual minimum GPP (GPP min ). We followed the same procedure of MCD12Q2 phenometrics estimation, whereby for any particular year, we first gathered daily GPP measurements for the full calendar year plus 6 months before and after and then smoothed the time series using a spline function. The GPP-based seasonality metrics were then estimated by the same percentage thresholds as MCD12Q2 but were based on GPP amplitude. We obtained 866 site-years of GPP seasonality metrics out of the 1,110 site-years of EC measurements from the algorithm. There were 244 site-years data that had GPP values but the annual amplitudes derived from the smoothed spline function were too low (<4 gCm −2 yr −1 ) to estimate reliable seasonality.
To evaluate the consistency between the GPP-based seasonality metrics and the satellite LSP data, we iteratively searched through each GPP seasonality metric for a potential match of the same type of LSP phenometric within a certain time period window (±185 days), centered on the GPP metric date. If all of the corresponding LSP phenometrics in a site-year were found, the site-year was marked as "match," otherwise "no-match." The search window is wide enough to capture potential matches, and the consistency between the GPP seasonality metrics and the LSP phenometrics was then evaluated by linear regression. Among the matched phenometrics and GPP metrics, we linearly regressed GPP seasonality metrics against LSP phenometrics. Besides reporting R 2 and root mean squared error (RMSE) from the linear regression, the mean relative deviation (MRD) and mean absolute deviation (MAD) were also calculated to quantify the differences between LSP phenometrics and GPP seasonality metrics. Specifically, MRD evaluates bias in the GPP and EVI2 metrics, while MAD quantifies the absolute deviation between the two data sources. With N defined as the total number of matched site-years, the formulas of MRD and MAD were used as follows: The "no-match" site-years were also investigated by comparing their time series data of MODIS EVI2 and EC measured GPP, respectively. Five reasons for their mismatch were investigated: EVI2 data missing, GPP data missing, EVI2 amplitude too low to retrieve LSP, GPP amplitude too low to retrieve GPP seasonality metrics, and both LSP and GPP metrics exist but do not match. Note that since the MCD12Q2 LSP product only processed pixels with annual EVI2 amplitude greater than 0.1 to distinguish between EVI2 data missing and EVI2 amplitude too low to retrieve LSP, we utilized the MCD43A4 Nadir Bidirectional Reflectance Distribution Function (BRDF)-Adjusted Reflectance (NBAR) data set (Schaaf & Wang, 2015).
We also investigated the consistency of GPP seasonality metrics with LSP from single pixels, 3-by-3 pixel windows, and 5-by-5 pixel windows, respectively. The values in pixel windows were aggregated by mean and median. We found that single pixel LSP represents GPP seasonality metrics the best with higher R 2 values, lower RMSE values, compared with pixel window based aggregated LSP (result not shown). So, we used single-pixel LSP to conduct further analysis in this study.

Annual GPP Model Analysis
To understand the phenology and physiology effects on ΣGPP, inspired by Xia et al. (2015), we introduced five nested Bayesian hierarchical regression models representing different hypotheses. The Bayesian framework allows partial pooling (Gelman & Hill, 2006) that accounts for the unbalanced number of site-years for categories of biome types and flux sites. More importantly, the Bayesian hierarchical models help us understand the site-level and biome-level effects by capturing the variability of model coefficients among flux sites and biome types. The models considered are as follows: Model 2 ∶ ∼ 0 + 1 + 2 max + 3 min, 2 Model 4 ∶ ∼ 0 + 1 + 2 max + 3 min, 2 Model 5 ∶ ∼ 0 + 1 + 2 max + 3 min, 2 ∼ , 2 0 ∼ , 2 , = 0, 1, 2, 3 where Y represents ΣGPP; j represents biome type; k represents flux site; Z max and Z min represent the annual maximum and minimum of GPP or EVI2; μ n , η n , n = (0, 1, 2, 3) represent the mean values of the population distribution of intercepts and slopes at biome and site levels, respectively; 2 , 2 , and 2 , = (0, 1, 2, 3) are the corresponding variances. Model 1 is the simplest model that explores whether ΣGPP can be explained by GSL and Z max and Z min ; Model 2 considers background GPP/EVI2 variability among biome types; Model 3 explores the variability of covariates among biome types; Model 4 adds site-level intercepts to test the importance of considering site-level background GPP/EVI2 rates; Model 5 is the full model that considers both site-level and biome-level variability. Cross validation was used to determine the significance of model fit improvements when the model complexity increases. By using GPP-metrics-based models as benchmarks, we are able to understand the effects of phenology and physiology on annual carbon uptake and evaluate the capability of satellite observations in capturing these effects. The GSL values were calculated by the duration between MidGreenup and MidGreendown dates, representing the timing of the time series reaching 50% of amplitude in spring and autumn, estimated from GPP time series for GPP-metrics-based models or obtained from the MCD12Q2 LSP product for EVI2-metrics-based models. MidGreenup and MidGreendown phenometrics were selected to characterize the timing of productivity because they are highly correlated with other metrics (e.g., earlier or later) but are retrieved more robustly from satellite EVI2 time series. Note that evergreen plants also have leaf-based seasonality and that can be captured by satellite observations (R. Wang et al., 2019). To compare the sensitivity of ΣGPP to model covariates, the standardized models with centered and scaled covariates were also implemented. Note that to exclude the influence of multiple phenological cycles and cross-calendar-year phenology on ΣGPP calculation, we focused on the data with a single phenological cycle and locate at Northern Hemisphere only in this modeling analysis. The GPP-based phenological cycles were determined by the fitted spline function same as producing the MCD12Q2 LSP product (Gray et al., 2019).
The Bayesian hierarchical models were implemented by JAGS software (v4.3.0) and R programming language (v3.6.3). All parameters in the models were assigned uninformative prior distributions to let data dominate the calculation of posterior distributions. We summarized the parameter posterior distributions by median values and 95% credible intervals (CIs) obtained from samples of Markov Chain Monte Carlo by JAGS. All the R scripts are available online (See Data Availability Statement).

Phenometrics Comparison
To compare the GPP seasonality metrics with MODIS LSP phenometrics, we first aligned their dates individually ( Figure S2 in Supporting Information S1). Of the 1,110 site year GPP seasonality metrics data, 758 (68%) matched phenometrics with the MODIS LSP; 124 (11%) did not have GPP seasonality metrics due to missing GPP values; 96 (9%) were left because MODIS LSP did not provide phenometric values; 11 partially matched with the MODIS LSP phenological cycle. That is, not all GPP seasonality metrics in a phenological cycle could be definitively matched with equivalent MODIS LSP phenometrics. Note that the MCD12Q2 algorithm can produce up to 2 phenological cycles in a year; and only one of them did not find a match at all. 120 (11%) of the site-years of data had GPP and EVI2 amplitudes that were too low to reliably estimate GPP seasonality metrics and MODIS LSP phenometrics so they were assigned NAs; this could also be considered consistent between EC measurements and satellite LSP observations.
The regression analysis suggests that LSP measurements from MODIS had general agreement with EC measurements across sites but exhibit systematic bias among deciduous and evergreen vegetation. Specifically, MidGreenup and MidGreendown from LSP measurements estimate later start of GSL for evergreen vegetation and later end of GSL for deciduous vegetation, relative to corresponding metrics from EC measurements (Figure 2, Table 1). For deciduous sites, the satellite-observed MidGreenup showed strong agreement with corresponding estimates derived from EC measurements (MRD = 2 days, R 2 = 0.78, p-value < 0.05). However, MidGreendown LSP measurements showed significant systematic bias relative to EC-derived estimates (MRD = −14 days, R 2 = 0.79, p-value < 0.05). At evergreen sites, however, MidGreenup from LSP observation was systematically late relative to EC estimates (MRD = −23 days, R 2 = 0.47, p-value < 0.05) and MidGreendown was only weakly correlated with EC measurements (MRD = −3 days, R 2 = 0.16, p-value < 0.05). The consistent MidGreenup and MidGreendown biases for deciduous and evergreen are representative for most phenometrics considered in spring and autumn ( Figure S6 and Table S1 in Supporting Information S1).

Annual GPP Sensitivity
Nearly all of the variance in large-scale ΣGPP is explained by three characteristics of seasonality in the EC measurements: GSL, growing season maximum GPP (GPP max ), and growing season minimum GPP (GPP min ) (Figures 3a and 3b); the model with both biome-and site-level effects (Model 5) explained 98% of variance in ΣGPP, with a RMSE of 73.65 gCm −2 yr −1 (Figure 3b). This corresponds to roughly 5% of the average ΣGPP across all site-years. These results are consistent with results from previous studies (Xia et al., 2015;Zhou et al., 2016) but are based on more site-years of EC measurements and a different model structure.
The results in Figures 3a and 3b demonstrate that GPP phenology effectively explains geographic and interannual variance in ΣGPP measured at EC tower sites. However, because EC towers provide a sparse and nonrepresentative sample of global terrestrial ecosystems, it is difficult to use these data to draw inferences regarding large-scale dynamics in GPP arising from changes in phenology. To explore how well leaf greenness based phenological and Note. MRD = mean relative deviance and MAD = mean absolute deviation. . Note that no significance tests were performed for the internal box plots as they already show the big differences between flux and MODIS phenometric groups; small differences (e.g., <5 days) could be caused by uncertainties contained in both data sets, although they may be significant. We provide mean absolute deviation (mean relative deviation) in the corresponding table to better quantify pairwise differences.
physiological metrics derived from satellite LSP observations explain ΣGPP, we estimated a Bayesian hierarchical model with the same basic form but using the matched MODIS LSP metrics (GSL, minimum and maximum EVI2) at Northern Hemisphere as proxies for corresponding metrics derived from EC measurements (Figures 3c  and 3d). Compared to models fitted using EC-derived seasonality metrics, models estimated using MODIS LSP metrics yielded weaker agreement with in situ measurements of ΣGPP (Figures 3c and 3d). The LSP-based model with site-level effects (Model 5) showed strong overall correlation with ΣGPP (R 2 = 0.88) but the RMSE was nearly double than that obtained using metrics based on EC measurements (190.45 gCm −2 yr −1 ). These results indicate that satellite-based LSP metrics are able to estimate ΣGPP but include substantial uncertainty. Consistent with results based on EC-derived seasonality metrics, the strong positive relationship between ΣGPP and LSP-derived GSL and EVI2 max (Figure 3e) demonstrate that satellite-based observations of green leaf duration and maximum greenness (e.g., Keenan et al., 2014;Park et al., 2016) explain a large proportion of variability in ΣGPP across global terrestrial ecosystems.
Models estimated from MODIS LSP metrics suggest a smaller magnitude of green-leaf-based GSL effect on ΣGPP relative to the EC metrics-based models. To quantify this, we estimated models using standardized ECand LSP-derived metrics, which allowed us to compare the magnitude of coefficients (i.e., the relative sensitivity of ΣGPP) for each predictor variable across models (Figure 3e). After controlling for EVI2 max and EVI2 min , the influence of satellite LSP-derived GSL was roughly half the magnitude of GSL derived from EC measurements after controlling for GPP max and GPP min . An increase of one standard deviation in EC-derived GSL increased the standard deviation in ΣGPP by 0.48 (0.33-0.63, 95% Bayesian CI), versus 0.27 (0.04-0.50) for GSL derived from MODIS LSP, controlling for GPP max and GPP min . Controlling for GPP max and GPP min and extending the photosynthetic GSL by 1 day in the model estimated from EC-derived metrics leads to an increase in ΣGPP of 7.2 gCm −2 yr −1 but only 5.0 gCm −2 yr −1 in the corresponding LSP-metrics-based model controlling for EVI2 max and EVI2 min . Stated more directly, the result suggests that GSL changes in green leaf development had roughly 30% less effect on ΣGPP on average across biomes compared to changes in photosynthetic duration. The magnitude of the GSL effect in models estimated using both EC metrics and MODIS LSP metrics varied across biomes. Overall, variance in the GSL effect across biomes was smaller in the EC metrics-based models (4.7 gCm −2 yr −1 ) than in the LSP metrics-based models (7.6 gCm −2 yr −1 ) (Figures 3e and 4, Figure S3 in Supporting Information S1). Croplands (CRO) and deciduous broadleaf forest had the largest GSL effects, with values of 8.4 and 8.2 gCm −2 yr −1 , respectively, in the EC metrics-based model, and 7.2 and 6.2 gCm −2 yr −1 , respectively, in the LSP metrics-based model. Evergreen needleleaf forest and grasslands (GRA) showed lower GSL effects, with a value of 6.0 and 6.3 gCm −2 yr −1 estimated by the EC metrics-based model, compared with 3.9 and 1.9 gCm −2 yr −1 estimated by the LSP metrics-based model. While the EC metrics-based model identified a substantial GSL effect on ΣGPP in Wetlands (WET) (7.67 gCm −2 yr −1 ), the LSP metrics-based model found almost no effect of GSL on GPP. In general, the LSP metrics-based model had a smaller estimated GSL effect on ΣGPP and larger uncertainty ranges for most biome types compared to the EC metrics-based model. The magnitude of the differences regarding the role of GSL between EC metrics-versus LSP metrics-based models varies among different places on Earth depending on their dominant vegetation types.
Relative to the EC metrics-based models, the LSP metrics-based models showed greater sensitivity to site-level variability in ΣGPP. In the EC metrics-based models, the biome-level model (Model 3) explained the large majority of variance in ΣGPP (R 2 = 93%), and accounting for site-level variability (Model 5) provided only modest improvement (R 2 = 98%) (Figures 3a and 3b). Indeed, cross-validation experiments indicate that accounting for site-level variability did not significantly improve the model ( Figure S7 in Supporting Information S1). In the LSP metrics-based models, however, inclusion of site-level variability (Model 5) increased the proportion of explained variance in ΣGPP from 55% to 84% (Figures 3c and 3d), with similar results achieved in cross-validation experiments ( Figure S7 in Supporting Information S1). Stated another way, ΣGPP modeled using LSP-derived phenology metrics is more sensitive to site-to-site variation in phenological metrics than corresponding metrics and models based on EC measurements. In fact, inclusion of site-specific intercepts (Model 4) explained the largest proportion of variance in ΣGPP in the LSP metrics-based models ( Figure S7 in Supporting Information S1). This suggests that phenological and physiological metrics derived from LSP observations do not capture differences in overall productivity across the EC sites included in this analysis; that is, sites with the same GSL, EVI2 max , and EVI2 min derived from LSP metrics can have significantly different ΣGPP.
The EC metrics-and LSP metrics-based models indicate that ΣGPP is more sensitive to GPP max and EVI2 max , respectively, than GSL (Figure 3e). Normalized GPP max and EVI2 max effects were both about 60% larger than normalized GSL effects. Holding GSL and GPP min constant, an increase in GPP max of one standard deviation increases ΣGPP by 0.77 (0.58-0.94) standard deviations. The corresponding sensitivity in the LSP-based model was 0.43 standard deviations (0.17-0.70). Similar to GSL, biome-level variability in ΣGPP associated with variability in GPP max and EVI2 max was higher in the EC metrics-based model. In fact, EVI2 max , which represents maximum leaf greenness was nonlinearly related to the GPP max at each site ( Figure S8 in Supporting Information S1). At the same time, variance in GPP max increases with EVI2 max (Figure S8 in Supporting Information S1), which suggests that while maximum leaf greenness is a good indicator of mean maximum vegetation productivity, other factors exert substantial control on GPP max at local scale.
GPP min and EVI2 min play a modest role in regulating ΣGPP uptake in most, but not all, biome types (Figure 3e). The normalized coefficient of GPP min ranged from −0.03 to 0.27 (40% less than the normalized GPP GSL effect) and the normalized coefficient of EVI2 min ranged from −0.07 to 0.33 (20% less than the normalized EVI2 GSL effect). The effects of GPP min and EVI2 min were both weakly negative in mixed forest (MF), while other biomes were weakly positive. Since GPP min and EVI2 min were contributed by evergreen trees in MF while GPP max and EVI2 max were contributed by both evergreen and deciduous trees, when controlling for the EVI2 max /GPP max and GSL, a higher EVI2 min /GPP min value means a lower amplitude and fewer leaves being developed by deciduous trees, yielding a lower overall ΣGPP. However, in pure evergreen ecosystems, a lower amplitude does not necessarily mean a lower annual GPP. The effects of both GPP min and EVI2 min variables are slightly lower than 95% significance level (Amrhein et al., 2019) based on our data, but ignoring the minimum seasonal productivity or aggregating maximum and minimum metrics into the seasonal amplitude may obscure important factors that are diagnostic of total seasonal carbon update, especially in evergreen systems which have higher minimum greenness and smaller greenness amplitude.

Discussion
The result that EVI2 max had a larger effect on ΣGPP indicates that increases in maximum leaf greenness are more associated with ΣGPP than increases in the growing season duration of green leaves. Generally speaking, more green leaves results in larger increases in annual productivity compared to longer green leaf duration due to advanced spring and/or delayed autumn. Previous studies have shown that while regional decreases in leaf greenness are present in the satellite record (Jong et al., 2012;Sulla-Menashe et al., 2018), so-called "greening" of global vegetated land areas has been ongoing since at least the early 1980s (C. Chen et al., 2019;Huang et al., 2018). Satellite observations and ecological models suggest that this greening is diagnostic of enhanced terrestrial vegetation productivity and has potential to mitigate climate warming by increasing the terrestrial carbon sink (Piao et al., 2020). However, although no consensus has been reached, a variety of studies have also suggested that increases in early and mid-growing season productivity might negatively impact end-of-season GPP, effectively offsetting early season increases in GPP (Buermann et al., 2018;Lu & Keenan, 2022;Piao et al., 2008;Zani et al., 2020). Therefore, improved understanding of how changes in leaf greenness and GSL jointly impact ΣGPP is required to forecast future change in large-scale carbon budgets. Our results (Figure 3e) indicate that greening trends casued by maximum greenness increasing might have a larger impact on the net carbon uptake of terrestrial vegetation than changes in GSL of leaf development.
Our results showing that satellite LSP-derived metrics had a smaller GSL effect on ΣGPP compared to EC-based metrics might have important implications for the use of remotely sensed LSP metrics to infer vegetation productivity at regional, continental, and global scales (e.g., Keenan et al., 2014;Richardson et al., 2010Richardson et al., , 2013. The smaller magnitude of green leaf based GSL effect on ΣGPP has the potential to bias understanding regarding if and how changes in the satellite LSP-derived growing season of terrestrial ecosystems impact the sign and magnitude of net carbon fluxes. Future warming is expected to extend both the leaf and photosynthetic GSLs in many ecosystems, thereby potentially increasing ΣGPP (Hua et al., 2021;Piao et al., 2019). However, our results suggest that leaf GSL changes had smaller effect on ΣGPP than photosynthetic GSL changes, but the extended leaf GSL in spring and autumn (Buermann et al., 2018;Piao et al., 2007Piao et al., , 2008Wolf et al., 2016) might increase carbon loss by ecosystem respiration, and thus reduce the total net carbon uptake.
Differences in the timing of phenology from LSP and EC measurements may explain differences in model results from each data source. Our comparisons of phenophase transition dates derived from LSP and EC measurements of GPP are broadly consistent with prior work (D'Odorico et al., 2015;Lu et al., 2018;Shen et al., 2014) but are based on a much larger data set that supports additional and more nuanced interpretation. First, at deciduous sites, the timing of autumn phenology in LSP measurements was biased late compared to the timing estimated from EC measurements (Figures 2b and 5a). The reasons for this are unclear, but this result almost certainly reflects complexity in the relationship between the timing of leaf coloration and decline in photosynthesis late in the growing season (X. Wang et al., 2020). As a consequence, LSP-derived leaf GSL was systematically longer than EC-derived photosynthetic GSL ( Figures S4 and S5 in Supporting Information S1), which explains why the LSP-based model showed smaller GSL effect on ΣGPP for deciduous sites. Second, at evergreen sites, the timing of spring phenology from LSP measurements is biased late relative to corresponding timing from EC measurements (Figures 2a and 5d), and the timing of autumn phenology from the two sources was only weakly correlated ( Figure 2, Table 1). This result has been previously noted (e.g., Melaas et al., 2013) and arises from the fact that photosynthesis in conifers starts well before the timing of leaf flushing and pigment changes later in the spring (Barr et al., 2009;Gao et al., 2021). These differences yielded shorter leaf GSL from LSP measurements relative to photosynthetic GSL from EC measurements and large site-level uncertainty for evergreen vegetation observed by satellite LSP observations compared to EC measurements ( Figures S4 and S5 in Supporting Information S1). More importantly, these results indicate that biome type information is critical in upscaling EC-measured GPP seasonality using green-leaf-based LSP observations to infer changes in global photosynthetic productivity over large spatial scales. Additionally, these results also highlight the importance of developing methods to better match remotely sensed phenology with vegetation photosynthetic activities such as solar-induced chlorophyll fluorescence and better vegetation indices (Gonsamo et al., 2012;Jin & Eklundh, 2014;Mohammed et al., 2019), although these new methods may have their own limitations such as coarser spatial resolution and shorter available time periods. We used the phenometrics available from the only global operationally produced LSP data, MCD12Q2, and used the same amplitude thresholds for determining GPP-based phenometrics. However, it is possible that different definitions may result in higher correlations between LSP and ΣGPP and that could vary by biome or otherwise across space.
The result that site-level variability contributes a substantial proportion of total variance in ΣGPP modeled by LSP metrics across sites and years is consistent with Butterfield et al. (2020), Richardson et al. (2010Richardson et al. ( , 2013, who found that the remotely sensed phenology-productivity relationship was strong across flux sites but does not capture interannual variability in ΣGPP at individual sites. Local environmental factors such as temperature, precipitation, forest age, and soil moisture are more important regulators of GPP than leaf phenology and physiology at fine spatial scales (Barr et al., 2009;Churkina et al., 2005;Piao et al., 2009;Richardson et al., 2010). However, when investigating ΣGPP variability across large spatial scales, we found these local environmental factors tended to be averaged out, so the effects of remotely sensed leaf phenology and physiology on ΣGPP were stabilized. In addition, factors complicating the relationship between GPP measurements and remotely sensed LSP metrics also contribute to the site-level variability (X. Chen et al., 2018;Peng et al., 2017;. Further, EC measurements are affected by site-specific characteristics such as wind direction and measurement height (Chu et al., 2021;Schmid, 2002), factors that cannot be captured by satellite LSP observations. Thus, the magnitude of the estimated GSL-ΣGPP relationship at any particular site depends on both the natural variability of the relationship and the interaction with local characteristics. Our results support the conclusion that it is feasible to infer large-scale spatiotemporal patterns in ΣGPP from satellite-observed leaf GSL but large uncertainty at fine spatial scales. Developing ways to explain this site-level variability, perhaps using ecological covariates, has the potential to substantially improve our models designed to upscale EC measurements and infer large scale ΣGPP using satellite LSP observations.

Conclusion
In summary, this study suggests that satellite LSP-based green leaf phenological and physiological metrics are capable of inferring vegetation productivity over large spatial areas for most biome types, and satellite observed leaf GSL trends are meaningful for projecting carbon cycle impacts into the future. However, caution must be used as satellite observed leaf GSL changes do not synchronize photosynthetic GSL changes for evergreen vegetation in spring and deciduous vegetation in autumn. Changes in leaf GSL had a smaller effect on ΣGPP compared to changes in photosynthetic GSL. Moreover, although changes in leaf GSL have a significant impact on ΣGPP, trends of vegetation greening or browning indicated by maximum leaf greenness changes might have more carbon impacts than the extended leaf GSL caused by current climate warming. Therefore, changes in both leaf GSL and maximum greenness need to be considered in satellite LSP-based inferences regarding large-scale dynamics of vegetation productivity.