Estimating the CO2 Fertilization Effect on Extratropical Forest Productivity From Flux‐Tower Observations

The land sink of anthropogenic carbon emissions, a crucial component of mitigating climate change, is primarily attributed to the CO2 fertilization effect on global gross primary productivity (GPP). However, direct observational evidence of this effect remains scarce, hampered by challenges in disentangling the CO2 fertilization effect from other long‐term confounding drivers, particularly climatic changes. Here, we introduce a novel statistical approach to separate the CO2 fertilization effect on photosynthetic carbon uptake using eddy covariance (EC) records across 38 extratropical forest sites. We find the median stimulation rate of GPP to be 3.2 ± 0.9 gC m−2 yr−1 ppm−1 (or 16.4 ± 4.2% per 100 ppm) under increasing atmospheric CO2 across these sites, respectively. To validate the robustness of our findings, we test our statistical method using factorial simulations of an ensemble of process‐based land surface models. We address additional factors, including nitrogen deposition and land management, that may impact plant productivity, potentially confounding the attribution to the CO2 fertilization effect. Assuming these site‐specific effects offset to some extent across sites as random factors, the estimated median value still reflects the strength of the CO2 fertilization effect. However, disentanglement of these long‐term effects, often inseparable by timescale, requires further causal research. Our study provides direct evidence that the photosynthetic stimulation is maintained under long‐term CO2 fertilization across multiple EC sites. Such observation‐based quantification is key to constraining the long‐standing uncertainties in the land carbon cycle under rising CO2 concentrations.


Introduction
Forests play a crucial role in the global carbon cycle, acting as a significant sink for anthropogenic carbon emissions.Approximately 25% of annual carbon emissions are estimated to be sequestered and stored by forests via photosynthesis, with boreal and temperate forests making substantial contributions (Pan et al., 2011).The physiological effects of increasing atmospheric carbon dioxide (CO 2 ) on plant productivity, known as the CO 2 fertilization effect, are expected to stimulate photosynthesis and drive the enhanced carbon uptake.However, obtaining observational evidence for these effects in natural ecosystems and understanding how this process has changed historically remains a key knowledge gap.
Multiple lines of evidence support an enhancement in photosynthesis (or gross primary production; GPP) in response to an increase in CO 2 : CO 2 enrichment experiments (Norby et al., 2010;Walker et al., 2021), ecosystem monitoring (Fernández-Martínez et al., 2017;Keenan et al., 2013;Mastrotheodoros et al., 2017) and indirect proxies based on long-term atmospheric carbonyl sulfide records (Campbell et al., 2017) or isotopomer signal (Ehlers et al., 2015).Process-based land surface models, which simulate the physiological responses of vegetation to environmental changes, also predict a stimulation of photosynthesis with increasing CO 2 levels (Sitch et al., 2015).However, multi-model projections of the CO 2 effect on long-term GPP diverge considerably due to uncertainties in process parameterizations and feedback mechanisms, particularly in response to meteorological extremes and climatic changes associated to rising CO 2 (De Kauwe et al., 2013;Rogers et al., 2014;Zaehle et al., 2005).Constraining the CO 2 fertilization effect in models through direct observational evidence is a longcalled-for necessity to advance our understanding of carbon cycling and essential for more reliable future projections of carbon sequestration.
The global network of eddy covariance (EC) flux towers observing the exchange of CO 2 at the ecosystem scale provides a valuable resource, as it has accumulated sufficiently long time series data to potentially provide direct evidence of the CO 2 fertilization effect (Baldocchi, 2020;Knauer et al., 2017;Zhan et al., 2022).Previous studies have attempted to attribute the CO 2 fertilization effect on GPP by utilizing EC records (Table 1).Chen et al. (2022) used an eco-evolutionary optimality framework to reproduce EC-inferred GPP and subsequently attribute the CO 2 fertilization effect on GPP.Their analysis estimated a global GPP enhancement of 2.24 gC m 2 yr 1 ppm 1 from 2001 to 2014 due to rising CO 2 .Similarly, Ueyama et al. (2020) utilized a model constrained with data from 104 EC towers and estimated a 0.43 gC m 2 yr 1 ppm 1 increase in GPP.These studies only relied on an indirect use of measurements for attribution, such as using EC data to calibrate a predefined model.These pre-defined models explicitly incorporate known biological processes, such as the photosynthesis and stomata response to elevated CO 2 .The model structure determines the model's capacity to capture complex patterns in the ecosystems and could influence the interpretability.Data-driven methods such as machine learning algorithms are highly flexible and can adapt to complex relationships (e.g., non-linear relationships) present in the data with fewer assumptions.Fernández-Martínez et al. (2017) employing a data-driven method, that is, generalized mixed linear models, attributed an increase of 4.49 ± 0.75 gC m 2 yr 1 ppm 1 in the GPP from 1995 to 2011 at 23 forest sites to CO 2 .The contribution of CO 2 was calculated as the difference in the GPP trend by maintaining the temporally varying predictors but constant CO 2 .The co-linearity among predictors still affects the estimation to an extent with this approach.These variations in CO 2 fertilization estimations emphasize the importance of disentangling the CO 2 effect on GPP directly, that is, not using pre-defined model structures, and by carefully considering confounding drivers in leveraging the continuously growing EC records.
In this study, we aim to disentangle the CO 2 fertilization effect on photosynthetic uptake directly and exclusively from long-term multi-site flux measurements and accompanying meteorological data.Several factors, such as CO 2 , climate changes, land-use and land-cover changes (LULCCs), affect ecosystem productivity and are correlated and confounded on the multi-decadal time-scale.We introduce the so-called GPP residual method that statistically captures the sensitivity of GPP to CO 2 and climate variables at different time scales to account for colinearities among the drivers.Instead of treating all potential drivers at once, by separating the long-term and short-term signals and deriving the sensitivity of GPP to climate at short-term scales, ideally the effect of climate variables could be eliminated at the long-term scales.First, we detrend the GPP time series to separate the longterm variability of GPP (the trend) primarily driven by CO 2 and climate, from the shorter-term variability (anomalies) primarily driven by climate variabilities.The method estimates the sensitivity of GPP to climate based on these anomalies.Analogous to Arora et al. (2020), the carbon-climate feedback parameter (γ) can be defined as the change in CO 2 flux (here GPP, where Arora et al. uses the total carbon pool) in response to change in temperature, where the temperature is used as a proxy to encapsulate all climatic changes.Next, we can quantify the γ effect based on long-term changes assuming the sensitivity remains consistent over the time scales of a few decades.The difference between the original observed GPP trend and the inferred GPP trend due to climate changes yields the unexplained GPP residual, which can be attributed to the long-term CO 2 effect on GPP, here referred to as the β effect.Specifically, we define the β factor as the sensitivity of GPP to increasing CO 2 concentration (gC m 2 yr 1 ppm 1 ), and the γ factor as the sensitivity of GPP to surface air temperature (gC m 2 yr 1 K 1 ).The standardized β (% per 100 ppm) and γ (% per Kelvin degree) are reported to account for different baseline levels of productivities across ecosystems.It's important to note that the GPP residual may also include other long-term effects specific to individual sites, such as signals related to nitrogen deposition or land management, which we test in a later section of the paper.First, we test the robustness of the GPP residual method using site-level simulations with the QUINCY model (QUantifying Interactions between terrestrial Nutrient CYcles and the climate system; Thum et al., 2019).We use the GPP residual method to further estimate the β and γ effects in both GPP and daily maximum net ecosystem production (NEE) (NEP max ) using long-term EC records.
At last, we compare the results based on the EC records to factorial simulations from a set of land surface models (TRENDY version 9; Sitch et al., 2015).Additionally, we test and discuss the relevance of other potential longterm impacts on GPP in more detail.

Eddy Covariance Data
This study comprises 38 forested EC sites (Table S1 in Supporting Information S1), where CO 2 flux data and meteorological data are collected by flux towers from Integrated Carbon Observation System (ICOS; Rebmann et al., 2018), and AmeriFlux (Novick et al., 2018).We focus on tree-dominated ecosystems due to the strong sensitivity of grass-dominated ecosystems to short-term climate variability, adding complexity to the disentanglement of their response to CO 2 (Hovenden et al., 2014;Reich et al., 2018).The sites span geographically across Europe and North America.The forest types can be broadly classified into: deciduous broadleaf forest (DBF; 12 sites), evergreen need-leaved forest (ENF; 20 sites), and mixed deciduous-coniferous forest (MF; 6 sites).
We obtain long-term recorded (≥10 years) EC data at daily scale and NEE at half-hourly scale from 1994 to 2022 (Table S2 in Supporting Information S1).GPP is estimated from the nighttime partitioning algorithm (Reichstein et al., 2005).Meteorological variables used in this study include temperature, incoming shortwave radiation and vapor pressure deficit (VPD).Growing degree days (GDD) are calculated as the sum of daily temperature above zero degree to estimate the timing of various phenological events.Due to the limited depth of soil moisture measurements (Yu et al., 2022), we calculate a water availability index (WAI; Tramontana et al., 2016) following a bucket model approach.The maximum cumulative water deficit (Aragão et al., 2007) represents the available water content (awc).WAI is calculated as the balance of precipitation recharge and observed evapotranspiration as follows: input(t) = min(Precipitation(t), awc WAI(t)) (1) where t represents the timestep t.
At the half-hourly resolution, the quality flags (QF) indicate if the data record is measured (QF = 0) or the quality level of the gap-filling applied to the data record (e.g., QF = 1 better, QF = 3 worse quality).At the daily scale, the quality control indicators represent the percentage of measured or high-quality gap-filled records (i.e., QF = 0 or 1).We exclusively include data with daily quality control indicators for NEE and meteorological variables surpassing 0.6, denoting a 60% or higher percentage of measured and high-quality gap-filled.In summary, EC sites are selected for this study based on three criteria: (a) only sites dominated by tree-ecosystems are selected; (b) there has to be a long-term record of EC observations (≥10 years) after the quality control when (c) at least 60% data per day is measured or gap-filled with good quality.In total we estimate β and γ for 228 site-months.

Atmospheric CO 2 Data
Due to the systematic biases (e.g., sensor calibrations) of the atmospheric CO 2 concentration measurements in the EC data, for consistency, we replace the CO 2 concentration measurement with the CO 2 product CAMS CF-1.6 (Chevallier et al., 2005(Chevallier et al., , 2010) ) from the nearest pixel to each EC site.The CO 2 reanalysis data spans from 1994 to 2022 with daily resolution, thus sufficient to match the time period of the EC data.

Nitrogen Deposition Data
We use annual nitrogen deposition (N dep ) data collected from the European Monitoring and Evaluation Programme (EMEP) Meteorological Synthesizing Center-West (MSC-W) model and the CMIP6 forcing database (Hegglin et al., 2016).Both N dep data set includes four components: dry oxidized nitrogen, dry reduced nitrogen, wet oxidized nitrogen and wet reduced nitrogen.To be consistent with the EC records (Table S1 (De Vries et al., 2006).An emerging relationship between N dep and β suggests a bias in the estimated β due to the influence of N dep .However, several studies indicate that the beneficial effects of N dep can transition into adverse impact on productivity once N dep surpasses a certain threshold, leading to a non-linear response of vegetation growth to N dep (Etzold et al., 2020;Kint et al., 2012;Schmitz et al., 2019;Schulte-Uebbing & de Vries, 2018;Tian et al., 2016).When analyzing the relationship between N dep and β, we exclude pixels near sites where annual N dep values exceed the certain threshold (i.e., 24 KgN ha 1 yr 1 ) in Etzold et al. (2020).

Estimating β and γ Using the GPP Residual Method
We develop the GPP residual method (Figure 1) to isolate the CO 2 fertilization effect (β) from other confounding factors (e.g., climate).β is inferred for each site and each month-of-year separately, using the median values of GPP and hydro-meteorological data across 5-day intervals within the considered months to filter out synoptic weather variability and its impact on GPP dynamics.The calculation of β consists of three steps: (a) Data preparation (Figure 1b).The growing season when plant photosynthesis is active is defined based on the mean seasonal cycle of GPP (averages by day-of-year) across the time series.A month is considered within the growing season, if there are more than 20 days when GPP is greater than 25% of the maximum of GPP as inferred from the mean seasonal cycle.Within each month, the median values of all variables are retrieved for every 5-day interval.
We then calculate anomalies using the median values by subtracting the long-term trend of the linear-fit for each month-of-year (e.g., July in 1999, …,2020).These anomalies are only used to train the model in step (2).We rescale the anomalies of all variables by adding the average value across the considered time period.The rescaling allows the random forest model in step ( 2) is trained and applied at an identical magnitude to extrapolate; (b) Model training and predicting climatic effects on GPP (Figure 1c).We train a random forest regression model for GPP anomalies using anomalies of hydro-meteorological variables (i.e., temperature (Temp), incoming shortwave radiation (SW in ), vapor pressure deficit (VPD), water availability index (WAI), growing degree days (GDD)).We use the model to predict GPP using the actual hydro-meteorological data (including both trends and anomalies) at each month-of-year.The predicted GPP (GPP climatic ) thus only reflects the effect of climate.Next to the random forest regression model, we call a multivariate linear regression model to test the robustness of the results from the random forest regression model (Figure S1 in Supporting Information S1); (c) Isolating nonclimatic effects on GPP (Figure 1d).At each month-of-year, the non-climatic effects on GPP (GPP residual ) are derived by removing the GPP climatic from the actual GPP time series (GPP residual = GPP-GPP climatic ).The change of GPP in response to CO 2 (i.e., β) is derived as the trend of the linear-fit between CO 2 concentration and GPP residual .Similarly, the sensitivity of GPP to temperature (i.e., γ) is derived as the trend of linear-fit between temperature and GPP climatic : Absolute change in GPP might not accurately reflect the response of ecosystems to CO 2 increase because ecosystems have different baseline levels of productivity.Considering that, we alternatively calculate relative change in GPP in response to increasing CO 2 and temperature, to allow for better standardization across different ecosystems, locations, and studies: where β relative is in the unit of % per 100 ppm, and γ relative in the unit of % per Kelvin degree.GPP baseline is calculated as the mean GPP over the first 2 years at a specific month.
The advantage of the GPP residual method is that we separate the confounding factors at different time scales, thus, it can solve the issue of multicollinearity to some extent, when the independent variables are highly correlated to one another.To show the different results yield from the GPP residual method and a multivariate regression method, we adopt a simple multivariate regression model as the following: where β is the sensitivity of GPP to CO 2 .After obtaining β values for each site-month, we calculate the median β.This approach helps to mitigate the influence of outliers.We further estimate the uncertainty of the median β using the bootstrap method.By repeatedly sampling from the considered β distribution, we create multiple bootstrap samples.Each sample is then used to calculate the median β.The standard deviation across these bootstrap estimates provides an estimate of the uncertainty associated with median β.We calculate the median γ and its uncertainty in the same way.
To consider the seasonal and spatial variation of GPP, we further calculate annual β and γ by aggregating monthly β and γ weighted by monthly GPP baseline : where i represents a specific month, and n is the total number of months.GPP gs is the sum of baseline GPP across the considered growing season.Similarly, the annual β and γ at each site can be further aggregated across space: where i represents a specific site, and n is the total number of sites.GPP sum is the sum of baseline GPP across all sites.To assess the robustness of the median β or γ values and determine if they are influenced by site selection, we compare the mean β or γ calculated across all sites, weighted by baseline GPP, with the median β or γ derived from the distribution of monthly β or γ values.If the median value remains relatively stable and comparable to the mean value across all sites, it suggests that the selection of sites does not significantly impact the robustness of the median β or γ estimations.
The EC technique allows for direct measurement of NEE, which is the difference between GPP and ecosystem respiration (RECO): The maximum net ecosystem production (NEP max ) is defined as the negative sign of the minimum NEE during a day from half-hourly measurement: In addition to GPP, we further estimate β or γ for NEP max following the same method.

Test the GPP Residual Method With Synthetic Data: A Proof-Of-Concept Analysis
We use the terrestrial biosphere model QUINCY (QUantifying Interactions between terrestrial Nutrient CYcles and the climate system;  et al. (2022).For better comparison with EC records, we take the last 20 years (1999-2018) in the simulations as the time period of the validation.
The advantage of this method is that we can compare the β estimated by the GPP residual method (β estimated ) with β modeled by QUINCY (β QUINCY ), which represents theory of photosynthetic responses to CO 2 , climate and water availability.β QUINCY is calculated as the sensitivity of the CO 2 -induced change in GPP to CO 2 concentration, in which GPP is calculated as the difference between simulations forced with transient CO 2 and constant CO 2 during the considered time period.β is calculated for each site and each month-of-year.The selection of months follows the same rule as the data preparation in the previous section.β estimated is calculated using the GPP residual method elaborated in the previous section from the simulation forced with transient CO 2 .
We evaluate the agreement of β estimated and β QUINCY for each synthetic forested site (166 sites in total) in the model.We use the root-mean-square error (RMSE) to measure the difference between β estimated and β QUINCY across the growing season.The RMSE of β estimation per site is calculated as: where n is the number of months when β is estimated; β estimated i and β QUINCY i is the β estimated and modeled at month i, respectively.In this study, we use the validated GPP residual method to estimate β in tree-dominated ecosystems based on measured meteorological data and the CO 2 atmospheric inversion product.

Determining β and γ in Simulations of the TRENDY v9 Model Ensemble
We use simulations from twelve process-based global dynamic vegetation models (DGVM) within the TRENDY projects (Le Quéré et al., 2018;Sitch et al., 2015) to derive the modeled β and γ.We use four simulations (called S0, S1, S2, S3 in the TRENDY v9 protocol; see Table S3 in Supporting Information S1) with and without LULCCs under both transient (historically observed) and pre-industrial (constant) environmental conditions.CO 2 effect on GPP modeled by TRENDY (β S1-S0 ) is calculated as the difference between output from S1 and S0, to avoid the effect from climate recycling.To test the robustness of the GPP residual method and the potential LULCCs effect, we apply the same statistical method (i.e., the GPP residual method) on simulations in S2 and S3, respectively.We derive γ from S2 simulations by calculating the sensitivity of GPP in S2 to temperature.We select grid-cells containing the 38 considered EC sites in all models.β and γ is calculated for the same site-months as data analyzed in EC records.

Evaluating the GPP Residual Method With a Land Surface Model: A Proof-Of-Concept Analysis
We develop and test the GPP residual method with QUINCY model simulations (Methods) from which we cannot only infer β with our method, but also directly compare with the modeled β.This approach offers a systematic and transparent framework for evaluating the method's performance at 166 synthetic model sites under known conditions, providing insights into its strengths and weaknesses.Overall, we find that our method can satisfactorily estimate β, and can capture the seasonal variations of β across biomes.We find β in tropical forest is overall well reproduced by our statistical method, supported by a mean root-mean-square error (RMSE) of 1.2 gC m 2 yr 1 ppm 1 .However, the performance in the cold northern high latitude regions, where part of the boreal needle-leaved forests and temperate forests are located, is slightly diminished, with a mean RMSE of 1.5 gC m 2 yr 1 ppm 1 (Figures 2a and 2b).When evaluating different methods using the model simulations, the GPP residual method with implemented random forests has the lowest RMSE (1.5 gC m 2 yr 1 ppm 1 ), in contrast to the other two methods (i.e., the two multivariate regression methods, Methods), both amounting to an RMSE of 1.6 gC m 2 yr 1 ppm 1 .The RMSE in Figure 2a is notably lower at each forest type compared with Figures S1a and S2a in Supporting Information S1.Particularly, the GPP residual method with implemented random forests exhibits better estimation in temperate broadleaf summergreen trees (TeBS), compared with both multivariate regression methods that estimate negative β in TeBS during the beginning and the late growing season (Figures S1a, S2a in Supporting Information S1).Consequently, by testing all methods at 166 synthetic forested sites, we find that the GPP residual method with implemented random forests exhibits relatively the best performance compared to the GPP residual method based on the multivariate regression approaches.By identifying climate effects on shorter-time scales and capturing their impact, including the rising CO 2 effect at longertime scales, we can more effectively isolate confounding factors affecting GPP drivers.
Additionally, we find a slight overestimation accompanied with higher RMSE, during summer months in boreal needleleaf evergreen (BNE) forested sites and boreal needleleaf summergreen (BNS) forested sites.The discrepancy between estimated β and modeled β can be attributed to the limitations associated with constructing a statistical model to estimate the sensitivity of GPP to climate variables relying on interannual variabilities.This means this statistical model does not account for vegetation acclimation on climatic variability in the long-term, such as phenological changes, which cannot be learned from interannual variabilities.Thus, the statistical method exhibits decreased accuracy, particularly in ecosystems where seasonality exerts strong control.
In addition to the limitation of capturing vegetation phenology, we individually consider the effect of rising CO 2 and the effect of changing climatic conditions.Thus, the synergetic effect of rising CO 2 and temperature (Drake et al., 1997) is not considered in our approach, where for example, increasing CO 2 can modify plants' response to temperature.This simplification could result in the overestimation of the CO 2 fertilization effect on GPP.On the other hand, the anomalies associated with extreme events can be theoretically reproduced by the statistical method.However, given that only a few instances of extreme events are in the training data set, we acknowledge that the non-linear relationship between climate and GPP during extreme conditions can induce errors in the estimation of β.Overall, we find encouraging consistency between the β estimated by the GPP residual method and β modeled by QUINCY.

CO 2 Fertilization Effect in Forested Ecosystems Inferred From Eddy Covariance Records
Using the GPP residual method, we estimate the strength of the CO 2 fertilization effect on photosynthetic carbon uptake as recorded in eddy covariance (EC) time series at 38 forested sites (Figure 3).We assess the sensitivity of GPP to CO 2 , denoted β (Methods), separately for each individual month across the years of the time series to account for seasonal variations.The median β value across all sites and months is 3.18 ± 0.92 gC m 2 yr 1 ppm 1 (or 16.40 ± 4.27% increase in GPP per 100 ppm rise in atmospheric CO 2 ).This is comparable to the β factor reported in Fernández-Martínez et al. ( 2017) and Chen et al. (2022), where β amounts to 4.49 ± 0.75 and 2.24 gC m 2 yr 1 ppm 1 (Table 1).The difference among these estimates likely arises from differences in study periods, vegetation types, or inherent methodological uncertainties.Notably, Ueyama et al. (2020) reported a β factor of only 0.43 gC m 2 yr 1 ppm 1 , about one-tenth of the value found in this study.While Ueyama et al. ( 2020) utilized a photosynthesis model constrained by EC data, we employ a data-driven method which does not rely on pre-defined model structures.The conceptually distinct approaches as well as different study periods and sites likely result in these significant differences among the identified β factors.
While β displays considerable variability across sites and months, positive β values are consistently observed in 61% of sites for at least 2 months in the record.The strongest enhancement of β occurs during the boreal summer months (Figure 3a), although a selection of sites exhibits stronger effects in spring (e.g., CA-Ca3) or autumn (e.g., US-GLE).
Among the analyzed sites, the top seven sites listed from the top to the bottom in Figure 3a (DE-Hzd, CA-LP1, CA-TP4, US-GLE, CA-Ca3, CA-Cbo, IT-Ren) exhibit the most pronounced GPP enhancement, with their site- specific annual mean β values surpassing the top 20% of all the sites.In addition to the median β value across all sites and months, we aggregate monthly β based on the monthly baseline GPP, to represent the mean β across selected sites.The aggregated mean β is 2.10 gC m 2 yr 1 ppm 1 , indicating the median β is representative and not biased by the site selection, considering the variation in GPP across sites.
The presented approach lacks the ability to isolate additional hidden long-term effects stemming from human activities, and these effects may manifest in an over-or under-estimation of β derived at individual sites.Notably, certain sites (e.g., DE-Obe, CH-Dav, FI-Let, DE-Hai) exhibit negative β values consistently throughout the growing months.Following by carefully checking the data and our statistical method, we propose three hypotheses for the occurrence of negative and potential over-or under-estimations of β: (a) Disturbance or managements.For instance, the negative β identified at the "CH-Dav" site may be associated with a disturbance event, specifically a harvest conducted in the year 2006.Similarly, the thinning activity at the "FI-Let" site in 2016 induced a declining of GPP trend, leading to a negative β estimate throughout the year.(b) Forest age.The forest at the "DE-Hai" site, for example, is considered of old age (around 150 years) and is recovering from a severe drought.(c) Biases introduced by our method.Although the method has been thoroughly tested with representative synthetic data (Figure 2), we acknowledge that the method has limitations and do not perfectly capture the GPP response to CO 2 , thus introducing errors.Conversely, other drivers, mainly nitrogen deposition (N dep ) at nitrogen-limited sites (De Vries et al., 2006, 2014;Sutton et al., 2008) or forests undergoing succession (Pugh et al., 2019), can induce a long-term increase of GPP, potentially resulting in an overestimation of β.Using N dep data at daily scale, we test its influence and incorporate N dep into the GPP residual method following two different strategies, (a) by including N dep as a predictor of GPP alongside climate variables (Figures 1b and 1c); (b) by accounting for the cumulative effect of N dep in the GPP residual as a long-term factor, and assessing its separability from the CO 2 signal using a linear regression model (Supplementary Text S1 in Supporting Information S1).The β estimated in the first strategy (Figure S3 in Supporting Information S1) exhibits no significant difference from Figure 3, indicating that the N dep effect is negligible at short-term scales.The response to N dep change is likely a long-term and cumulative phenomenon and might even show some degree of hysteresis, associated with the slowturnover pools of nitrogen (N) in wood and soil (Gilliam et al., 2019), and the redistribution of N within the plantsoil system (Gruber & Galloway, 2008).In the second strategy, that is, incorporating the cumulative N dep effect, we do not arrive at definitive results (details in Supplementary Text S1 in Supporting Information S1), as the uncertainties of the cumulative N dep effect and its time scales rooted in uncertainties of the nitrogen balance within the ecosystems, are too high (Davies-Barnard et al., 2020;Galloway et al., 2004;Sokolov et al., 2008;Zaehle & Dalmonech, 2011).In Section 3.3, we further investigate whether one could use spatial heterogeneity to estimate the nitrogen deposition effect and analyze spatial variations in β and their relationships with patterns of N dep .
Overall, these trends in the potential other drivers are rather site-specific and vary in magnitude and even sign, while CO 2 is rising uniformly with no distinct spatial signature.With a well-distributed and sufficiently large sample, the effects of other drivers may offset each other, so that the median value of β across all sites predominantly reflects the widespread CO 2 fertilization effect.This notion is supported by excluding known disturbance sites (e.g., forest thinning) from the analysis, resulting in a median β (2.96 ± 0.99 gC m 2 yr 1 ppm 1 ) that does not significantly differ from the β estimated using all sites (Figure 3b).
The daily maximum NEE (NEP max ) provides insights into the peak photosynthetic activity of the ecosystem during optimal conditions within a day.It is valuable for understanding the ecosystem's contribution to carbon sequestration.In addition to GPP, we identify the CO 2 fertilization effect on NEP max as 9.45 ± 2.38 gC m 2 yr 1 ppm 1 (or 17.16 ± 3.69% per 100 ppm).The temporal and spatial variation of β in NEP max is consistent with β in GPP (Figures 3 and 4), adding additional observational evidence of the CO 2 fertilization effect for better understanding of the global carbon cycle dynamics.
Next, we assess the robustness of our findings by testing multiple regression methods in estimating the GPP sensitivity to climatic changes, and evaluate their statistical performance.Testing a multivariate linear regression instead of a random forest regressor, we find that the median β yields a slightly different estimate of 2.99 ± 0.94 gC m 2 yr 1 ppm 1 (Figure S4 in Supporting Information S1).If we however apply the multivariate regression model without accounting for confounding drivers of rising CO 2 and climatic changes (Methods), the median β is notably lower and amounts to 2.38 ± 0.87 gC m 2 yr 1 ppm 1 (Figure S5 in Supporting Information S1).We utilize the "Out-of-Bag" (OOB) score to estimate the performance of the random forest regressor on unseen data without the need for a separate validation set (Methods).Although there are instances of relatively low OOB score at certain sites and months, no clear relationship emerges between estimated β values and model performances (Figure S6 in Supporting Information S1).

Exploring the Spatial Variation of the CO 2 Fertilization Effect
We further explore the spatial variability in estimated β. Thereby, we assess the roles of plant functional types (PFTs), forest age, temperature, VPD, maximum LAI and N dep .Past studies have found stronger stomatal responsiveness to changes in CO 2 in deciduous trees versus conifers (Brodribb et al., 2009;Klein & Ramon, 2019;Medlyn et al., 2001;Saxe et al., 1998), although variability exists when assessing their different responses in photosynthesis (Saxe et al., 1998).Overall, we find a greater enhancement in GPP in evergreen needle-leaved forest (ENF) in response to increasing CO 2 (Figure S7 in Supporting Information S1), compared with DBF.
The difference in GPP responses to increasing CO 2 across PFTs may vary with scales, or complex environmental conditions (e.g., under stress or not).Future work may focus on this difference in more detail.An open question is whether mature forests, which may be approaching a quasi-equilibrium state, are responding to CO 2 and climate in the same fashion as younger stands (Kira & Shidei, 1967;Luyssaert et al., 2008;Odum, 1969).We find no significant relationship between β and forest age, but we show a tendency of GPP enhancement to decline toward older stands (Figure S8a in Supporting Information S1).Theoretically, the enhancement in GPP would relate to differences in growing season temperature and VPD, with greater enhancement at warmer growth temperatures (suppression of photorespiration; Baig et al., 2015).However, we find no significant but slight negative trend in the relationship between β and temperature as well as VPD (Figures S8b and S8c in Supporting Information S1).This trend may be attributed to the combined impact of temperature and VPD.We find a positive tendency of β with increasing site maximum LAI (Figure S8d in Supporting Information S1), which could be an interaction between rising CO 2 and growth phase (i.e., regrowth).Photosynthetic carbon sequestration is expected to increase with elevated N dep , since nitrogen is considered to be usually the limiting nutrient in forests (De Vries et al., 2006).Both N dep and CO 2 effects are supposed to simultaneously act as long-term stimulator of GPP.Given that N dep and CO 2 are often correlated over time, it is challenging to isolate one from the other (Supplementary Text S1 in Supporting Information S1).We analyze the site-level GPP response to increasing CO 2 β against the N dep data derived from models (Methods).If a discernible relationship emerges, showing higher β values at higher N dep levels, or vice versa, it suggests that the β estimation might be largely biased by the nitrogen deposition effect and misattributed to increasing CO 2 effect.We find no significant relationship between estimated β and site-mean N dep across all sites excluding disturbed sites (Figure S9 in Supporting Information S1).This analysis suggests that a misattribution, if existent, appears to be negligible.However, we acknowledge that the disparity between local and modeled N dep , stemming from intricate deposition processes (Lamarque et al., 2013), may introduce biases into the comparison between site-level estimation and gridded data sets simulated by models.Moreover, while the relationship between nutrient availability and GPP is highly complex, evidence suggests that higher nutrient availability may be more beneficial for woody biomass increase rather than GPP in forest ecosystems (Vicca et al., 2012).

Comparing the CO 2 Fertilization Effect Inferred From Eddy Covariance Sites and TRENDY Ensemble
We compare our EC based β estimates with an ensemble of twelve process-based global dynamic vegetation models (DGVM) following the TRENDY simulation protocol (Le Quéré et al., 2018;Sitch et al., 2015).The TRENDY ensemble consists of four experiments (Table S3 in Supporting Information S1): the pre-industrial control run (S0), the run considering only CO 2 changes (S1), the run considering CO 2 and climate change forcings (S2), and the latter with additional prescribed LULCCs (S3).To conduct the comparison, we extract model time series from the individual grid-cells containing the 38 considered eddy-covariance sites.The modeled CO 2 fertilization effect inferred by calculating the difference β S1-S0 , exhibits a large spread among the TRENDY models.The median β S1-S0 across the grid-cells and models is 2.07 gC m 2 yr 1 ppm 1 , which is remarkably close to the median β obtained through the GPP residual method using EC records.Except for the ISBA-CTRIP and the CLM5.0 models, all of the other 10 models exhibit lower value for β than the median β estimated from observations (Figure 5).As different models and ecosystems have different baseline levels of productivity, we also compare the standardized β, which is the relative change in GPP in response to increasing CO 2 (β relative ; Methods).We find that seven models fall within the bootstrapped uncertainty range of the median β relative estimated from observations (Figure S11 in Supporting Information S1).We acknowledge the limitation of this comparison, as it involves contrasting site-level estimates with grid-level results, which is influenced by the heterogeneity within each grid-cell.Nevertheless, we argue that the median values across the sites and grid-cells provide a more aggregated perspective that helps mitigate the influence of sub-grid heterogeneity.
Other factors, such as nitrogen deposition, disturbances, and particularly land management, can influence ecosystem productivity as recorded in EC data.These factors potentially influence the estimation of β using the GPP residual method.To assess the effect of LULCCs on β estimation, we compare β derived from the TRENDY S2 and S3 simulations using the GPP residual method.The ensemble mean of β S2 closely aligns with β S3 (Figure 5), suggesting that the neglected effects of LULCCs do not substantially affect the β estimation.Furthermore, in line with testing the method using QUINCY simulations, the GPP residual method tends to slightly overestimate β when comparing β S2 with β S1-S0 derived from TRENDY.This further emphasizes that the method cannot account for long-term vegetation acclimation and phenological changes; however, these effects are minor within the considered time period.

Influence of Climatic Changes on Plant Productivity Throughout the Season
Conventionally, the γ factor is defined as the sensitivity of land carbon storage to climate variations using temperature change as the proxy (Arora et al., 2020;Friedlingstein et al., 2006;Gregory et al., 2009).Analogously, we define γ as the change in the climate-driven GPP component over temperature change, which is already obtained in the GPP residual method (Methods).The median γ in GPP estimated from the EC data set is 39.44 ± 4.96 gC m 2 yr 1 K 1 (Figure 6) or 1.78 ± 0.16% per Kelvin degree.Comparing this to the sensitivity of GPP to CO 2 , assuming a 100 p.m. increase in atmospheric CO 2 concentration is roughly equivalent to 1 K degree temperature increase in the historical period, we find that γ is considerably lower than β, in line with previous studies (Chen et al., 2022;Fernández-Martínez et al., 2017).The median γ in NEP max ( 17.33 ± 12.15 gC m 2 yr 1 K 1 ) is much lower than γ based on GPP, reflecting a negative response of NEP max to temperature variations, particularly at the peak of the growing season (Figure S12 in Supporting Information S1).The median γ estimated from the TRENDY ensemble (S2 simulations) is 13.62 ± 2.46 gC m 2 yr 1 K 1 (Figure S13 in Supporting Information S1).Also, γ exhibits a large spread among models compared to γ from EC, suggesting a more pronounced uncertainty in the process representation in estimating ecosystem responses to climate changes among the land surface models (Figure S14 in Supporting Information S1).A clear seasonality of γ emerges in both observations and models (Figure 6, Figures S12 and S13 in Supporting Information S1).While γ is higher at the beginning and the end of the growing season for most of the sites, most sites show negative γ in at least one month of the growing season (26 out of 38 in the EC estimated GPP; 36 out of 38 in the EC measured NEP max ; 33 out of 38 in the TRENDY ensemble mean).Our results indicate that warming may have a positive effect on vegetation productivity at colder conditions at the shoulder seasons and a potential negative effect in summer around the peak of the growing season.High temperatures are usually accompanied by a high VPD, down-regulates the stomatal conductance and limits plant productivity (Novick et al., 2016;Park Williams et al., 2013).On the other hand, plant productivity can be stimulated by increasing temperature at conditions when water is not limiting the ecosystem functioning (Fernández-Martínez et al., 2019).These dynamics highlight the complexity of climate-vegetation interactions and underscore the need for further research to fully understand their implications for ecosystem functioning and carbon dynamics.Overall, we recognize the inherent limitations in EC-based data acquisition, the assumptions of the GPP residual method, and the potential influence of other long-term factors such as human activities, which can introduce biases in the estimation of β and γ from observations.Despite these challenges, the GPP residual method is tested using synthetic data from a land surface model.The seasonality of β and γ aligns with vegetation functions described in previous sections.Furthermore, utilizing the TRENDY simulations, we demonstrate that the discrepancy in β estimation, influenced by LULCCs, remains within an acceptable range.

Conclusions
Our study isolates a robust, multi-decadal enhancement in vegetation productivity ( β GPP = 3.18 ± 0.92 gC m 2 yr 1 ppm 1 or 16.40 ± 4.27% per 100 ppm, β NEP max = 9.45 ± 2.38 gC m −2 yr −1 ppm −1 or 17.16 ± 3.71% per 100 ppm) across Northern Hemisphere forests in response to the rising atmospheric CO 2 concentration.We further diagnose the median value of GPP sensitivity to temperature (γ; as proxy for climatic changes) of 39.44 ± 4.96 gC m −2 yr −1 K −1 or 1.78 ± 0.16% per Kelvin degree, and find evidence of a negative effect of temperature on photosynthesis at the peak of the growing season.Assuming a 100 p.m. increase in CO 2 concentration is equivalent to 1 K degree temperature increase, the negative temperature effect at summer months potentially masks the positive increasing CO 2 effect on GPP.The average values of the TRENDY ensemble range at a lower β and a lower γ compared with those inferred from EC records.However, there is a large spread in β and γ among the various land surface models.Further research should focus on identifying the origins of this intermodel spread and testing methods where process formulations can be directly informed by the growing number of observations (e.g., ElGhawi et al., 2023).This study paves the way for future investigations into long-term drivers of change and ecosystem functioning.We anticipate that our approach could be readily applied to other ecosystems (e.g., drylands), other data sets (long-term satellite records of change, i.e., vegetation greenness, etc.), and other variables that describe ecosystem function (e.g., evapotranspiration).This study, alongside the growing body of studies determining large-scale ecosystem responses to rising CO 2 in observational data sets, build the foundation to constrain the future land sink of anthropogenic carbon and thus the remaining carbon budget.

Figure 1 .
Figure 1.Schematic of the statistical GPP residual method to isolate the CO 2 fertilization effect and climatic effect in observational data of GPP.(a) Hypothesis.The overall goal of the GPP residual method is to isolate the CO 2 fertilization effect on GPP by removing long-term climate effects inferred from short-term variability.(b) Data preparation.All the time series of climate variables and GPP are detrended and individually rescaled to the long-term mean of each variable.The black lines denote the actual time series for each variable, and the red line denotes the detrended time series.(c) A random forest model or multivariate linear regression model is trained to learn the sensitivity of GPP to the climate variables based on the detrended time series, that is, interannual variability.(d) The trained model predicts the long-term changes in GPP caused by climate changes using the original time series of climate predictors, including the long-term trend.The non-climate-induced effect on GPP is therefore estimated from the residual of absolute GPP minus climate-induced GPP, shown as the red line.

Figure 2 .
Figure 2. Testing the GPP residual method with QUINCY model simulations.(a) Seasonal variation of β across vegetation types estimated by the GPP residual method with a random forest model in red and QUINCY in black (166 synthetic sites with 1220 site-months).The red and black shaded area depicts one standard deviation around the mean value of β across multiple site-months (solid lines).(b) The map shows the root mean square error (RMSE) between estimated β and modeled β in the growing season for each site in the QUINCY model.Brighter color indicates lower bias and thus a better performance of the GPP residual method.

Figure 3 .
Figure 3. Estimation of the CO 2 fertilization factor β from eddy covariance (EC) data using the GPP residual method with a random forest model.(a) Plot showing the estimated β for each EC site across months in the growing season.The size of circles represents the magnitude of monthly baseline GPP.Sites are shown in descending order of the annual mean β (Methods).Site-codes marked by a star are presented separately at the end of the list, indicating that disturbances have been recorded at those specific sites.Site-codes shown in blue and black color locate in Europe (c) and North America (d).(b) The 10%-90% distribution of β values shown in panel (a).The gray (yellow) vertical dashed lines denote the median β (Methods) estimated from all sites and months (excluding the disturbed sites).The gray (yellow) shaded area indicates the bootstrap estimates for the uncertainty of median β from all sites (excluding the disturbed sites).Maps (c) and (d) display the annual mean β values at each site.

Figure 4 .
Figure 4. Estimation of β from eddy covariance data using the GPP residual method with a random forest model.Analogous to Figure 3 but β is estimated for NEP max .

Figure 5 .
Figure 5. Comparing β estimated from eddy covariance (EC) data and the TRENDY model ensemble.The medians and interquartile ranges of β are shown for each model and for the ensemble mean, as horizontal lines within the boxes, and the upper and bottom lines of the box, respectively.Each box includes grid-cells containing the 38 considered EC sites.Box plots for individual models are in an ascending order based to the median β.The dotted red line represents the median β derived from EC records (as shown also in Figure 3), with the uncertainty showing in shaded area.

Figure 6 .
Figure 6.Estimation of γ from eddy covariance (EC) data set using the GPP residual method with a random forest model.(a) Plot showing the estimated γ for each EC site across months in the growing season.The size of circles represents the magnitude of monthly baseline GPP.Sites are shown in descending order of the annual mean γ (Methods).Site-codes marked by a star are presented separately at the end of the list, indicating that disturbances have been recorded at those specific sites.Site-codes shown in blue and black color locate in Europe (c) and North America (d).(b) The 10%-90% distribution of γ values shown in panel (a).The gray (yellow) vertical dashed lines denote the median γ (Methods) estimated from all sites and months (excluding the disturbed sites).The gray (yellow) shaded area indicates the bootstrap estimates for the uncertainty of median γ from all sites (excluding the disturbed sites).Maps (c) and (d) display the annual mean γ values at each site.

Table 1
Estimation of CO 2 Effect on Gross Primary Productivity (Unit: gC m 2 yr 1 ppm 1 ) From Eddy Covariance Sites Across Previous Studies Thum et al., 2019), which has been evaluated against a subset of FLUXNET sites across large geographical ranges and different ecosystem types, to test the GPP residual method.The use of model experiments works as an integral part of our iterative development process, guiding refinements and ensuring that the method is well-prepared for subsequent application in real-world ecosystems.We perform two simulations with identical climate but varying CO 2 concentrations (transient CO 2 as observed, and constant CO 2 at levels of the year 1988) at 166 forested sites randomly distributed across the globe.The 166 sites are synthetic sites in the QUINCY model world that, unlike real FLUXNET sites, are well distributed across different biomes and climate zones.We use this comprehensive collection of simulated QUINCY model sites to provide the proof-of-concept for our statistical.The model setup and model simulations are identical with the "freeze-CO 2 experiment" in Zhan