Improved process representation of leaf phenology signiﬁcantly shifts climate sensitivity of ecosystem carbon balance

,


Introduction
Terrestrial ecosystems play a critical role in the Earth's climate system due to their varied couplings to and feedbacks between carbon, water, and energy with the atmosphere.Improving our ability to quantify and predict the response of terrestrial ecosystems to climate is essential to advancing our understanding of these feedbacks and predicting future climate change (Booth et al., 2012).Despite their importance, there remains considerable uncertainty in our understanding of the terrestrial carbon cycle, undermining our ability to make accurate predictions of future carbon-climate feedbacks (Friedlingstein et al., 2014;Huntzinger et al., 2017;Piao et al., 2013).
Vegetation phenology is a key component of terrestrial ecosystem dynamics as it is directly linked to key processes in the carbon, water, and energy cycles (e.g.photosynthesis, autotrophic respiration, evapotranspiration, and surface albedo), making it an area of focus in understanding the climate response of ecosystems.Phenology refers to the timing of periodic events in plant development such as reproduction, bud burst, canopy senescence, activity-dormancy cycles, and carbon allocation.
Quantitatively, the leaf area index (LAI) represents the one-sided surface area of leaves per area of ground surface.LAI is a cornerstone biophysical quantity for monitoring vegetation phenology, as it can be observed globally from space, and for representing the canopy in terrestrial biosphere models (TBMs), a key component of Earth system models (Sellers et al., 1997).LAI mediates the canopy interception of radiation, thus it directly controls processes such surface albedo and the rates of photosynthesis and transpiration.Indirectly, LAI also has significant impacts such as influencing how much precipitation reaches the soil surface altering plant available water and evaporation from the soil and canopy surfaces.It also represents the mass of foliar carbon in the canopy, coupling these processes to the cycling of carbon within the plant and litterfall that supplies carbon to the soil (Richardson et al., 2013).
A variety of concepts have been used to represent LAI dynamics in TBMs, including ecohydrological equilibrium (Yang et al., 2018), optimality principles such as the maximization of plant net carbon gain (Caldararu et al., 2014;Manzoni et al., 2015), direct carbon-supply (Xin et al., 2020), demand for growth (Schiestl-Aalto et al., 2015), and approaches that consider climate and biophysical controls more directly (Jolly et al., 2005;Knorr et al., 2010;Stöckli et al., 2008).Mechanistic modeling approaches are lacking, a reflection of our limited fundamental understanding of processes such as bud burst, leaf longevity, and canopy senescence and their variability across species (Cooke et al., 2012).Observations have shown that the dynamics of LAI are correlated with environmental variables (Clelend et al., 2007); in particular temperature, water availability, and photoperiod (Delpierre et al., 2016;Iio et al., 2014;Richardson et al., 2013), although variability also occurs within and across species for a given climate (Cole and Sheldon, 2017;Marchand et al., 2020).In the absence of mechanistic understanding, it is important that model formulations are generalized and calibrated to available observations ::::::::::::::::::::::::: (e.g.Wheeler and Dietze, 2021).Therefore, many TBMs use semi-empirical representations of LAI which depend upon an understanding of these correlations.While many of these models have shown fidelity in representing LAI dynamics (e.g.Stöckli et al., 2008), vegetation phenology remains a large source of uncertainty in models and is therefore an ongoing area of research (Migliavacca et al., 2012;Richardson et al., 2012;Seiler et al., 2022).
In the context of carbon-climate feedbacks, it is critical to understand the role of LAI phenology in mediating the carbon cycle, particularly net ecosystem exchange of carbon (NEE) (Richardson et al., 2012).However, few studies have investigated whether a more complex model representation of LAI can actually improve predictions of NEE or how these improvements affect the sensitivity of the terrestrial carbon balance to climate.These additional steps are needed to evaluate how the representation of specific processes in models ultimately affect the integrated response of NEE to climate (Fisher and Koven, 2020).
Neglecting these steps runs the risk of biased predictions of future carbon-climate feedbacks.
In this study we use a Bayesian :::::::: parameter : data assimilation system to generate a data-informed representation of LAI, its coupling to NEE, and the climate sensitivity of both LAI and NEE.Data assimilation or model-data fusion (MDF) provides a framework for systematically combining observations with a model (Rayner et al., 2019).For understanding a complex system like the terrestrial carbon cycle, MDF is a useful approach to improve model performance and mechanistic understanding by constraining the diverse set of processes, and their interactions, contributing to carbon exchange (Fisher and Koven, 2020;MacBean et al., 2016).
For the present study, we consider two key aspects of model uncertainty: (i) that the model formulation must accurately represent the main processes that govern LAI, NEE, and their response to climate (Schwalm et al., 2019), and (ii) that the parameters of the model must be appropriately assigned (Prentice et al., 2015).For a given model, MDF can provide a parameterization that is statistically consistent with observations and their uncertainties.When applied equally across multiple model structures, MDF can be used to evaluate different model structures and their impact on the data-informed processes (e.g.Famiglietti et al., 2021).Here, we investigate two model formulations for LAI and perform MDF at six flux tower sites distributed across diverse ecosystems from the tropics to high-latitudes (Baldocchi, 2008).We implement a prognostic, climate-sensitive LAI model into a TBM and benchmark this against an empirical diagnostic LAI model used in a previous version of the same TBM (Bloom and Williams, 2015;Quetin et al., 2020).We constrain both models using multiple observations of carbon states and fluxes, and then use these data-informed models to infer the climate sensitivity of NEE.The main objectives of this study are to: (i) Investigate the impact of a more complex process representation of LAI on predicting LAI and NEE dynamics in a MDF system, (ii) evaluate the impact of a more complex process representation of LAI on inferring the processes underlying NEE: GPP and R e , and (iii) evaluate how a change in LAI process representation affects the climate sensitivity of the terrestrial carbon cycle at seasonal and annual time-scales.

Study Sites
The study is focused at six sites distributed from the tropics to high-latitudes (Fig. 1) that are part of a global network of eddy covariance flux sites, FLUXNET (Pastorello et al., 2020).These sites cover a range of climate zones and phenological strategies (Table 1), allowing for more robust model evaluation and climate sensitivity analysis in the global context.Following Famiglietti et al. (2021), we selected these sites based on the following criteria: (i) meteorological forcing data availability, (ii) observational data availability including repeat woody biomass observations and eddy covariance measurements of carbon dioxide and water vapor fluxes, (iii) temporal coverage of at least ten years, (iv) no intensive human management (e.g., agriculture or logging), and (v) vegetation is dominated by C3 photosynthetic pathway.

Model-data fusion
To quantitatively evaluate the impacts of the process representation of LAI on the net carbon balance and its climate sensitivity, we utilized the CARbon DAta-MOdel fraMework (CARDAMOM, Bloom and Williams, 2015;Bloom et al., 2016).
CARDAMOM is a Bayesian MDF system, used to retrieve time-invariant parameters and initial conditions for the Data Assimilation Linked ECosystem model (DALEC) and has been used widely as a diagnostic tool to infer stocks and fluxes of carbon  and water (Bloom et al., 2020;Quetin et al., 2020;Yin et al., 2020;Smallman et al., 2021).CARDAMOM has the capability to 90 assimilate a diverse range of observations (Bloom et al., 2020;Famiglietti et al., 2021) and shows comparable performance to more complex terrestrial biosphere models when it is constrained by observations (Quetin et al., 2020).It has advantages over other MDF frameworks as it does not rely on definitions of plant functional types or on steady-state assumptions.
CARDAMOM is also capable of utilizing different DALEC model formulations (Famiglietti et al., 2021).A number of DALEC model versions have been developed for various purposes (Williams et al., 2005;Famiglietti et al., 2021).Recent

95
developments have incorporated more processes, as CARDAMOM is increasingly being used to diagnose climate sensitivity of terrestrial carbon and water cycles (Bloom et al., 2020;Ge et al., 2022;Smallman and Williams, 2019;Yang et al., 2022).
With this in mind, we implemented a climate-sensitive LAI phenology model into a version of DALEC, which is described further below.

Observations and Model Forcing
Multiple types of observations were used to constrain the processes relevant to LAI, NEE, and their interactions.This helps prevent over-fitting and ensures a consistent view of the terrestrial carbon cycle is achieved between model and data (Kaminski et al., 2013).Observations used include monthly LAI, monthly NEE, and annual woody biomass.Observations of LAI were retrieved from the Earth Observation Copernicus 1 km gridded product over each site, which includes a time-varying uncertainty estimate that we utilized (Fuster et al., 2020;Verger et al., 2014).Observations of NEE using the eddy covariance technique are from the FLUXNET2015 database (Pastorello et al., 2020).The time-varying uncertainty estimate for NEE was based upon propagating instrumentation error (0.58 g C m −2 d −1 Hill et al., 2012) and temporal aggregation due to missing sub-monthly time steps.Uncertainty due to temporal aggregation was estimated based on site specific statistical models derived from sub-sampling time periods without missing values.Aggregation and instrumentation uncertainty was then combined assuming uncertainties are fully correlated.The in-situ woody biomass observations were converted into estimates of above-and below-ground biomass (ABGB) using allometric scaling based on principle investigator advice at each site.Further details on all of the observations used can be found in Famiglietti et al. (2021).
The study period at each site ranges from 11 to 16 years (Table 1).The data was split into two periods, a training window (calibration) and a prediction window (validation).The first five years was used for calibration and the remaining data was used for validation.Climate forcing data for the model consisted of downward shortwave radiation, air temperature (average 2 m minimum and maximum), precipitation, vapor pressure deficit, and atmospheric carbon dioxide concentration.
To support model evaluation, we used gross primary productivity (GPP) and ecosystem respiration (R e ) fluxes from the FLUXNET partitioned NEE (Pastorello et al., 2020).We selected GPP and R e estimates derived using night-time partitioning, i.e.R e is determined as a night time R e fitted to a function of temperature extrapolated into the day time, thus, GPP is estimated as the residual between R e and NEE.This data was withheld from the model calibration step, thus providing a stringent metric of model skill in representing the processes governing carbon cycling.There are three primary motivations for withholding this data for validation: (i) These are model-based products, (ii) the NEE observations and the GPP and R e estimates are not wholly independent, and (ii) we only assimilate observations that can be produced directly from Earth observations to permit global application of this framework in future work.

Model Description
The DALEC model version used here has been fully described elsewhere (Bloom and Williams, 2015;Bloom et al., 2016;Quetin et al., 2020;Yang et al., 2022;Yin et al., 2020).We describe, in brief, the representation of the carbon cycle in DALEC.
Full description of the water balance can be found in Bloom et al. (2020), which includes a plant-available and a plantunavailabale soil water pool.Following this, we describe the two separate implementations of LAI phenology used in this study that are linked to same representation of carbon and water cycles.Common model parameters between the two models are shown in Table A1.

Carbon Balance
The carbon cycle in DALEC consists of six carbon pools (labile, foliar, wood, fine root, litter, and soil) and simulates pool transfers using ordinary differential equations.The NEE of an undisturbed ecosystem is calculated as: ::::::::::::::::::::::::::::::::: (1) Where GPP is the gross primary productivity, R h is the heterotrophic respiration from litter and soil carbon, R a is the autotrophic respiration, and R e is the ecosystem respiration representing the sum of R h and R a .The representation of GPP is based on the Aggregated Canopy Model (Williams et al., 1997, ACM), with the specific implementation described in (Bloom et al., 2016).The ACM model is a parsimonious approach for representing GPP with calibration (Williams et al., 1997).It requires inputs of temperature, carbon dioxide concentration, downward shortwave radiation, and LAI.The GPP simulated by ACM is then scaled by a soil moisture limitation factor that is calculated using the plant available soil water and a parameter for the wilting point.

LAI Phenology Models
The focus of this study is upon the representation of LAI phenology.We implemented a new model for LAI phenology into DALEC, based upon the model of Knorr et al. (2010).As a benchmark, we also utilized a diagnostic LAI phenology model commonly used in CARDAMOM studies, the Combined Deciduous-Evergreen Analytical model (CDEA).These two DALEC model formulations are denoted DALEC Knorr and DALEC CDEA , respectively.

CDEA Model
The CDEA model is a relatively simple model for simulating the phenology, growth, and turnover of LAI, with full descriptions described elsewhere (Bloom and Williams, 2015;Famiglietti et al., 2021;Quetin et al., 2020), with the specific formulation the same as that of Bloom et al. (2020).In brief, the CDEA model computes leaf onset and leaf fall factors that govern the flux of carbon from the plant labile carbon pool (C lab ) to the foliar carbon pool (C fol ) and flux of carbon from C fol to the litter carbon pool (C lit ), respectively.Carbon can also be supplied directly from GPP via a fractional allocation parameter.The leaf onset and leaf fall factors are based on a day-of-year approach that govern the timing of phenological events, including peak day of year for labile turnover (supporting leaf growth) and for foliar turnover (controlling litterfall).A parameter that governs the leaf longevity determines how much of the canopy is turned over each year.The CDEA model is a relatively simple and generic representation of LAI phenology and consists of eight parameters as well as two initial condition parameters for C lab and C fol (Table B1).There is no representation of direct environmental control on the timing of phenological events (e.g.spring onset, fall senescence), hence these are fixed from year to year.However, environmental effects on GPP can propagate through to LAI via changes in carbon allocation, thus allowing for the magnitude of LAI to be indirectly sensitive to climate via changes in carbon supply.

Knorr Model
The new LAI phenology model implemented into CARDAMOM is a development of the model by Knorr et al. (2010).This is a prognostic model governed by environmental constraints on the timing and growth of the canopy.Full description of the model can be found in Knorr et al. (2010).Here, we briefly describe the model and its novel implementation in DALEC which includes coupling to the carbon and water cycles.The Knorr LAI model as implemented in DALEC consists of ten parameters as well as four initial condition parameters (Table C1).
The prevailing understanding of the dynamics of LAI across global biomes is that there are three primary environmental controls: temperature, photoperiod, and water availability (Richardson et al., 2012).The Knorr model considers all three as potentially limiting factors, with the specific dynamics governed by climate and model parameterization.In addition, we couple LAI phenology to the plant carbon balance and incorporate a function for carbon supply limitation on LAI growth thus providing a fourth potentially limiting factor.

Representing Activity-Dormancy Triggers in a Population
A common approach to modeling leaf phenology is to use one or more growth triggers (or thresholds) that transition a plant into or out of an active growth state.This is problematic as it is often modeled using a discrete, binary formulation, which makes these functions unrealistic when representing a population of individuals (e.g.within a model grid cell).In reality, plants within a given population do not reach these thresholds simultaneously (Cooke et al., 2012), thus a distribution of threshold values to represent the population of individuals is more realistic, and this is likely to result in a relatively smooth transition toward the new growth state when integrated over the population.A discrete formulation is also non-differentiable, which is problematic for derivative-based MDF techniques.Knorr et al. (2010) developed a convenient solution to this problem by representing threshold parameters with a normal distribution in space.
Two temperature thresholds, one for temperature and one for photoperiod, are each represented by a cumulative normal distribution function (Φ).The multiplication of these two cumulative normal distributions gives the fraction of individuals within the population that are in an active growth state, f : f , as: Where T represents the air temperature memory, analogous to the growing degree days concept (Eq.20, Knorr et al., 2010), T φ is a parameter representing the mean temperature threshold for leaf onset, T r is a parameter representing the one sigma spatial range of T φ , t d is the day length, t c is a parameter representing the mean day length threshold for leaf senescence, t r is a parameter representing the one sigma spatial range of t c .

Temporal Dynamics of LAI
The temporal evolution of LAI is represented by the following ordinary differential equation: :::::::

Coupling to the Carbon Balance
While the fundamentals of the Knorr model are grounded in biophysical concepts for activity-dormancy of individuals in a population of plants, it only predicts the net change in LAI.Coupling LAI dynamics to the carbon cycle requires additional assumptions which were not defined in the original model description (Knorr et al., 2010).First, in both DALEC CDEA and DALEC Knorr , LAI is related to C fol via a parameter for the leaf mass of carbon per area, LMA, as follows: Second, we must consider that inputs to C fol come from plant carbon allocation and outputs go to C lit , with the rate of change in C fol represented by: :: Where F C,lab2fol :::::::: F C,lab2f ol : represents the flux of carbon from C lab to C fol , and F C,fol2lit :::::::: The carbon supplied via net primary productivity (i.e.GPP -R a ) into C lab provides the substrate to grow new leaves.To represent F C,lab2fol :::::::: F C,lab2f ol : we consider both the supply and demand of carbon for new foliar growth.The supply of labile carbon is the sum of new labile production at time t (Eq.C1) and C lab at end of the previous time step (expressed as a flux over the time step, ∆t), represented by:

) ∆t
::::::::: This formulation implies that the entire C lab pool is available for foliar growth at any given time step which is consistent with findings that C lab does not follow first-order decay kinetics (Martínez-Vilalta et al., 2016).We do not consider constraints on the release of stored labile carbon such as the phloem loading rate (e.g.Trugman et al., 2018).
For the demand for new foliar growth, we make the assumption that there is a gross demand flux of carbon from C lab to C fol when the canopy LAI is in a net growth state.Conversely, when the canopy (LAI) is in a net senescent state, there is zero gross demand flux of carbon from C lab .This is represented by: Where dLAI(t) dt ::::::: is computed from Eq. 3. The actual F C,lab2fol :::::::: F C,lab2f ol : is computed as the smoothed minimum of the supply and demand fluxes as follows: Where υ : ν represents a quadratically smoothed minimum function (see Eq. C2).This formulation ensures that new foliar growth only occurs when carbon substrate is available.
The flux F C,fol2lit , or litterfall flux, :::::::: F C,f ol2lit , : also depends on whether the canopy is in a phase of growth or senescence.
Here, we incorporate an additional term that is necessary to represent litterfall.We note that when the Knorr model is in a fully active growth phase (i.e.f = 1,) which may occur during canopy closure or in evergreen systems, the model (Eq.3) would predict zero LAI loss and hence zero litterfall.Observations across the major global biome types show that litterfall never goes completely to zero (Zhang et al., 2014), as leaves are being turned over constantly at a rate governed by factors such as longevity, herbivory, and disturbance, even if its not evident from ecosystem scale LAI observations (see Albert et al., 2019) :::::::::::::::: (Albert et al., 2019).To overcome this limitation, we add an additional term for loss of LAI via a nominal background turnover rate (θ f ol ).Therefore, the litterfall flux is computed as: This formulation ensures that some litter production occurs regardless of the growth-senescence state of the Knorr LAI model while ensuring conservation of mass.

Optimization Algorithm
Following previous CARDAMOM efforts, we jointly retrieve the probability distribution of DALEC time-invariant process parameters and initial state conditions (henceforth vector x) given the observational constraints (henceforth vector O) using a standard Bayesian inference formulation, where; p(x) is the prior probability distribution of x, and p(O|x) is proportional to the likelihood of x given observations O (L(x|O)).The prior probability of x, p(x) is characterized as the product of (i) a log-uniform prior distribution based on ecologically plausible minimum and maximum values, and (ii) ecological and dynamical constraints (EDCs), where p(x) is equal to 0 if DALEC parameter combinations or simulation outputs meet ecological conditions; these are described in (Bloom and Williams, 2015) and (Bloom et al., 2016).
For a given DALEC run, the likelihood, L(x|O), is defined as: Where L LAI and L NEE ::::: L LAI , :::::: L N EE ::: and :::::::: L biomass : are the model-observation mismatches.Each likelihood term is derived as: Where M i , O i and U i represent the model output, corresponding observation, and uncertainty, which represents the combined effects of model and observation error on model-data mismatch.
To sample p(x|O), we used a Differential Evolution metropolis hastings Markov Chain Monte Carlo (DE-MCMC) algorithm (Ter Braak, 2006) with 200 walkers (Levine et al., in prep) :::::::::::::::: (Levine et al., 2023); in previous efforts we used a adaptive metropolis hastings MCMC (Haario et al., 2001).We found that the two algorithms overall give statistically similar results with comparable run times, however the DE-MCMC algorithm was found to be more stable and less likely to generate chains trapped in local minima.

Parameter Uncertainty Reduction
Following model calibration using CARDAMOM, it is useful to evaluate the constraint that the observations provide upon the model parameters.The prior probability density function (PDF) for the parameters is log uniform between the assumed minimum and maximum prior limits.The posterior PDFs for each parameter are represented by the sub-sampled solutions from the CARDAMOM optimization, hence the posterior PDF can take any form.Uncertainty reduction of the parameters was quantified using the relative change in interquartile range (IQR) from the prior to posterior, given by 100 Note that this is calculated after transforming posterior parameter sub-samples into log space, to ensure consistency with the prior PDFs.The relative change metric is analogous to the relative uncertainty reduction calculated by the change in one sigma uncertainty used in other MDF studies (e.g.Knorr et al., 2010;Norton et al., 2019), but here the PDFs can be non-normal, therefore the IQR provides a simple and more representative metric of the uncertainty without the assumption of normality.

Model Performance
The model-data fit and predictive skill were evaluated using multiple statistical metrics.For the model output we used the ensemble median of CARDAMOM sub-samples at each time step.We used Pearson's correlation coefficient (r) to evaluate to the model skill at capturing the variability, the root mean squared error (RMSE) to evaluate the magnitude of the modelobserved residuals, and mean bias (bias) to evaluate the model prediction bias.The best benchmark for model performance is whether it can predict data outside of the calibration period, therefore all model skill metrics presented are computed over the validation period.
The year-to-year variation, or interannual variability (IAV), in carbon cycle processes is better related to climate-carbon cycle relationships (Piao et al., 2020).Therefore, on top of the monthly variability, we report model skill at capturing IAV in LAI, NEE, GPP, and R e over the validation period.We computed the IAV of the annual means, as well as on a seasonal basis.
For the tropical savanna site, AU-How, we define the annual mean by the sites hydrological year which goes from September to the following August (Hutley and Beringer, 2010), and compute seasonal IAV based on Austral seasons.
Trend analyses were performed using linear regression over the entire simulation period (calibration and validation) to increase temporal coverage.Only months where observations are available were included to ensure a direct comparison between the modeled and observed trend.

Climate Sensitivity
A key aim of this study is to evaluate the climate sensitivity of the carbon cycle following MDF with the two model formulations.We focused on two climatic drivers, temperature and precipitation.Temperature can impact a number of carbon cycle processes including turnover rates of carbon pools and the physiological response of GPP and LAI.Precipitation impacts plant available soil water, W , and evapotranspiration, E. Hence, precipitation can impact GPP in both model formulations via a soil moisture factor, and Knorr LAI directly via the balance between W and E. We also computed the climate sensitivity to vapor pressure deficit, but this sensitivity was found to be multiple orders of magnitude smaller than temperature and precipitation so we did not include it in the analysis.
We used the finite difference method to compute the intrinsic climate sensitivity of LAI and NEE to precipitation and temperature.All simulations were performed using the forward model, M , and CARDAMOM posterior parameter set, p opt .
First, the model was run using the prescribed forcing data (F ) and p opt to generate the control simulation.Second, we perturbed the precipitation and temperature forcing data (denoted F ), independently, over the entire simulation period and ran the model forward to generate the perturbed simulations.The size of the forcing perturbation needs to be sufficiently small to avoid a non-linear response in the model.For the precipitation (P ) perturbation we used δF = δP = 1 × 10 −8 mm d −1 and for the temperature (T ) perturbation we use δF = δT = 1 × 10 −5 • C (applied to both the minimum and maximum air temperature).
With the control and perturbed simulations, we computed the derivative of the model output, LAI and NEE, with respect to the climate forcing variables, precipitation and temperature, by: Where X represents the model output (LAI or NEE) and F represents either precipitation or temperature.This gave a time-series of the intrinsic precipitation and temperature sensitivities of LAI and NEE.We then decomposed these intrinsic sensitivities into (i) seasonal sensitivity by computing the monthly climatology over all simulation years, and (ii) average annual sensitivity by computing the average over the last n years of the simulation period.
To compare and evaluate the relative strength of precipitation sensitivity and temperature sensitivity, it was necessary to normalize the intrinsic sensitivities to a common unit.To do this, we scaled the intrinsic sensitivities by the respective climate variability.For the seasonal sensitivity analysis we multiplied the monthly average intrinsic sensitivity by the monthly interannual variability, computed as the standard deviation of each month in the simulation period.For the annual sensitivity analysis we multiplied the annual average intrinsic sensitivity by the interannual variability, computed as the standard deviation of the annual mean temperature or annual total precipitation.This is calculated using: Where S F X provides a measure of the climate sensitivity, S, of quantity X to the variability in forcing F .This generated four sensitivity metrics, two for LAI (S T LAI , S P LAI ) and two for NEE (S T N EE , S P N EE ), which are evaluated on seasonal and annual time-scales.
3 Results and Discussion

Model-Data Fit
The time-series of LAI and NEE at each site is shown in Fig. 2 for the calibration and validation periods, including the two models and the observations.This shows the CARDAMOM posterior PDF for both models at each site.Predictive skill over the validation period (r, RMSE, bias) at each site and for all site data combined is summarized in Fig. 3  Predictive skill for LAI (Fig. 3) shows that DALEC Knorr captures a larger proportion of the variability at one site (FR-Pue) and less at three sites (FI-Hyy, FR-LBr, GF-Guy), while both models perform similarly well at the remaining two sites This suggests that DALEC CDEA can produce unrealistic trends in both LAI and NEE, whereas DALEC Knorr tends to be more stable at these time-scales.
Evaluation of the model-data fit to interannual variability (IAV) of LAI and NEE at annual and seasonal time-scales reveal distinct patterns (Appendix Fig. A2).On a seasonal basis, the IAV in LAI for winter, spring, and summer is represented similarly well by both models.However, fall IAV in LAI is captured better by DALEC CDEA .This leads to a slightly better performance by DALEC CDEA in capturing LAI IAV on an annual basis.For NEE, DALEC Knorr performs slightly better at capturing IAV across sites with r=0.45 and RMSE=0.33 g C m −2 d −1 , while DALEC CDEA is r=0.33 and RMSE=0.35g C m −2 d −1 , with the largest differences in fit occurring during fall.
CARDAMOM is fitting a global cost function which considers all observations in the MDF simultaneously.Therefore, tradeoffs can occur between the fit to different observations, both in time and across the observation types (Kato et al., 2013).The results demonstrate that changing the model structure modifies how CARDAMOM converges to an optimal fit for the global cost function.There can therefore be compensatory effects between the fit to LAI and NEE observations.This occurs for the fall IAV in LAI and NEE, where DALEC Knorr better captures observed fall IAV in NEE, yet it performs worse at capturing observed fall IAV in LAI (Fig. A2).In other cases, a different model structure can lead to improved fit in both LAI and NEE, such as at FR-Pue for DALEC Knorr , suggesting the integrated model structure is overall improved.In any case, assimilating multiple observational data streams simultaneously has benefits over assimilating data streams in separate, sequential steps (Kaminski et al., 2013).A sequential approach requires all uncertainties from each step to be propagated to the next, and this can be challenging when dealing with non-linear models.By assimilating multiple data streams simultaneously, the complementarity of the observations can be exploited and a more consistent view of the terrestrial carbon cycle can be achieved.

Underlying Parameters and Process Constraints
Here, we describe the estimated model parameters and their uncertainty reduction.First, we focus on the Knorr LAI phenology parameters, which govern the link between local climate and phenology, then follow up with a comparison of the remaining shared parameters of the two models.
The parameter uncertainty reduction, calculated as the relative reduction in IQR from the prior to posterior (see methods), for the Knorr LAI phenology parameters differs across sites (Appendix Fig. A5).There are nine process parameters in the Knorr LAI model (excluding initial conditions) across six sites, giving a total of 54 estimated LAI phenology parameters.Of these, 14 parameters show an uncertainty reduction of more than 80%, 15 show an uncertainty reduction between 50-80%, and 8 show an uncertainty reduction between 20-50%.The temperate deciduous forest site, US-Ha1, sees the strongest uncertainty reductions in LAI phenology parameters, with seven of the nine parameters being constrained by more than 50%.The weakest uncertainty reductions in LAI phenology parameters occur at the tropical evergreen forest site, GF-Guy, with just two parameters showing uncertainty reductions greater than 50%.
Across all sites, the parameter representing the structural maximum LAI, Λ, is well-constrained, showing an approximate normal posterior PDF and uncertainty reduction of more than 80%, indicating that Λ is well characterized by the MDF regardless of the site.The parameter for the mean temperature threshold at leaf onset, T φ , shows strong uncertainty reductions of 66-80% at the four cooler sites (FI-Hyy, FR-LBr, FR-Pue, US-Ha1) and weaker uncertainty reductions of 35-39% at the two warmer tropical sites (AU-How, GF-Guy).This highlights the stronger role of temperature on LAI phenology at the cooler sites.The parameter for the mean photoperiod at leaf senescence, t c , shows strong uncertainty reductions at all sites except the tropical evergreen forest (GF-Guy).There are small reductions in uncertainty of 5-18% for the parameter governing water limitation on LAI (i.e.drought deciduousness), τ W , with the strongest reductions (15-18%) at AU-How, FI-Hyy, and US-Ha1.The parameter τ W is a proxy for the drought-deciduous behavior of phenology, with larger values indicating a stronger droughtdeciduous strategy.The FR-LBr, FR-Pue, and GF-Guy sites show the strongest drought-deciduous behavior, while FI-Hyy and US-Ha1 sites show the weakest.The leaf growth rate parameter, ξ, the leaf longevity parameter, k leaf , and the background foliar turnover rate, θ foliar , tend to show moderate uncertainty reductions across sites, although GF-Guy shows lower constraint on ξ and k leaf .
These results are distinguished from previous MDF studies that used the same phenology model, where phenological types and strategies were differentiated a priori (Kaminski et al., 2012;Kato et al., 2013;Knorr et al., 2010).In these studies, each plant functional type used a different prior PDF, while certain parameters were set as constants, effectively switching off some environmental factors that govern the growth triggers and senescence of LAI.As outlined in Knorr et al. (2010), this has advantages if there is sufficient prior knowledge about the species at each study site, however in most cases these details are based on limited evidence or entirely unknown.Extending these priors beyond well-characterized sites or for global scale analyses with satellite data, requires further assumptions, often by classifying plant functional types which has significant shortcomings (see Van Bodegom et al., 2012).Over confidence in prior knowledge can be problematic, considering the prior PDFs have a significant impact on the inferred parameters and, therefore, the climatic controls of both LAI and NEE.Here, we applied equivalent prior PDFs at all sites, thus allowing CARDAMOM to infer the controls based only on the observations and local climate.The posterior LAI phenology parameters generally show moderate to strong uncertainty reductions, indicating the ability of CARDAMOM to improve knowledge on phenological controls.The advantage of this approach is that environmental controls on LAI phenology emerge from the MDF system, exemplified by the stronger temperature control of spring leaf onset in cooler climate forests compared to warmer tropical sites.Furthermore, Knorr et al. (2010) fixed a priori the LAI water limitation parameter, τ W , to be zero for cooler forest plant functional types so that LAI is never impacted by water availability.
We find that even for the cooler forest sites, the posterior τ W is non-zero, indicating that water limitation plays a role on LAI dynamics, a finding that is also supported by empirical evidence (Buermann et al., 2018;Zhang et al., 2020).We note that switching off the photoperiod regulation process for canopy senescence, as was done in Knorr et al. (2010) for some PFTs, would likely lead to much larger τ W values than those in our study, and this would have implications on the link between water availability and LAI dynamics.
There are a number of common parameters between DALEC Knorr and DALEC CDEA (Table A1).The posterior PDFs for these shared parameters are shown in Appendix Fig. A6.The canopy efficiency, a key parameter for GPP, has a higher median in DALEC Knorr at four of the six sites, while it is the same at FR-Pue and lower at GF-Guy.The inferred carbon-use efficiency for DALEC Knorr , equal to the ratio of NPP to GPP (defined by 1-f auto ), is approximately the same at three sites (FR-LBr, GF-Guy, US-Ha1), higher at two sites (AU-How, FR-Pue) and lower at one site (FI-Hyy).The inferred LM A at each site is systematically lower in DALEC Knorr .Parameters that govern plant carbon allocation and turnover show distinct differences between DALEC CDEA and DALEC Knorr .The DALEC Knorr model shows lower fractional allocation of NPP to the foliar pool across sites, consistent with the lower LM A as less carbon is required to maintain the same LAI.DALEC Knorr also shows systematically lower fractional allocation to roots yet a higher fractional allocation to wood.Despite this, the woody residence time is about 60-80% lower in DALEC Knorr , as the turnover rate of woody biomass is significantly higher at all sites.Combined, these differences lead to a lower carbon residence time in vegetation in the DALEC Knorr model.
A key process linked with LAI phenology is litterfall production.We find that the litterfall rates predicted by DALEC CDEA and DALEC Knorr models are substantially different.In DALEC Knorr , average annual litterfall rates range between 5.0-12.9Mg C m −2 yr −1 .These fall well within the range of litterfall rates reported across global forest ecosystems, which range from 1-14 Mg C m −2 yr −1 (Zhang et al., 2014).The DALEC CDEA model predicts significantly higher litterfall rates at four sites (AU-How, FI-Hyy, FR-LBr, US-Ha1), ranging between 15.6-37.0Mg C m −2 yr −1 , which greatly exceeds observed rates documented by Zhang et al. (2014).DALEC Knorr infers litterfall rates at GF-Guy, the tropical evergreen site, that are well within the range observed for that ecosystem type, while DALEC CDEA predicts almost a factor two lower litterfall, which is below the range observed for tropical forests.Overall, this implies that the carbon allocation, turnover, and, subsequently, litterfall is more realistic in DALEC Knorr .
Parameters governing the water cycle are generally poorly constrained by the observations and both models tend to infer similar median parameter values.With no hydrology observations used here, there is no expectation of strong constraint on these processes.Despite this, it is notable that DALEC Knorr provides relatively strong constraint (∼80%) on the initial water pool size (W i ) across sites, perhaps a consequence of the process link between W and the water-limited LAI (Eqs.4, 5).

Validation of Inferred GPP and R e Fluxes
The two model structures implemented in CARDAMOM lead to differences in simulated GPP and R e , the component fluxes of NEE.Comparison of the model inferred GPP and R e to the FLUXNET partitioned GPP and R e over the validation period is shown in Fig. A4 and summary statistics in Fig. 3.We reiterate that no GPP or R e data was used during calibration.Almost invariably, DALEC Knorr has a higher correlation, lower RMSE, and lower bias in GPP and R e relative to DALEC CDEA .For GPP, with all site data combined, the model-data fit of DALEC Knorr gives r=0.78,RMSE=2.
while DALEC CDEA gives r=0.63,RMSE=3.5 g C m −2 d −1 , and bias=-1.8g C m −2 d −1 .Qualitatively, the fit to GPP shows similar performance to another recent CARDAMOM study that assimilated GPP and ET data (Smallman and Williams, 2019).
The inferred R e shows a similar RMSE and bias as the GPP model-data fit, although the correlations tend to be lower than for GPP in both models, with r=0.51 for DALEC Knorr and r=0.20 for DALEC CDEA .These results indicate that use of DALEC Knorr in CARDAMOM leads to a better representation of the component fluxes underlying NEE, despite a similar fit to assimilated data streams NEE and LAI.
The improvement in predictive skill of GPP by DALEC Knorr occurs predominantly at four of the six sites: AU-How, FR-LBr, FR-Pue, and GF-Guy.Both models simulate GPP as a function of the local climate, LAI, the canopy efficiency parameter, and the wilting point parameter that scales with W to impose soil moisture limitation.The difference in inferred GPP between the two models can be traced to the latter three of these factors.We find that the improved prediction of GPP by DALEC Knorr occurs due to a higher canopy efficiency at AU-How and FR-LBr, and a weaker soil moisture limitation at FR-Pue and GF-Guy.For the FI-Hyy site, even though the GPP predictions are similar between Knorr and DALEC CDEA , the underlying mechanisms are different as DALEC Knorr has a 46% higher canopy efficiency, which compensates for the lower LAI.At US-Ha1, the inferred mechanisms controlling GPP are very similar between the two models.
At the two cooler climate forest sites, FI-Hyy and US-Ha1, both DALEC Knorr and DALEC CDEA show similar performance against FLUXNET GPP and R e data.This suggests that DALEC CDEA is adequate at inferring NEE component fluxes at these sites.While difficult to trace precisely why this occurs, it may relate to the development of the ACM GPP model which was originally formulated to represent cool climate forests (Williams et al., 1997).It is notable that the model inferred GPP and R e performs worst at the tropical evergreen site, GF-Guy, with low correlation and the highest residuals (RMSE and bias) of any site.This follows from the relatively poor model-data fit to LAI and NEE at GF-Guy (Fig. 3).At this site the seasonal variability in LAI and NEE is small relative to the data uncertainty, so seasonal variability carries little weight in the MDF system, potentially making it difficult for CARDAMOM to resolve the seasonal dynamics or its controls.Modeling GPP and LAI at tropical sites has been a longstanding challenge.Evidence has shown that the coupling of leaf age-dependent changes in photosynthetic capacity drives seasonal dynamics GPP, suggesting that models need to separate LAI into cohorts to better represent these leaf demography processes (Wu et al., 2016), a level of complexity that is not currently considered in any version of the DALEC model.

Climate Sensitivity of LAI and NEE
Here, we present and discuss the inferred climate sensitivities of LAI and NEE for both models.The focus is upon DALEC Knorr as it provides the relatively more process-based representation of LAI, however, we also explore the results from the DALEC CDEA model as it provides a useful test case for the effect of model structure on inferred climate sensitivities.

Temperature Sensitivities
For both models and all sites, there is a positive ::: The ::::::: median S T LAI at :: on an annual timescale , indicating enhanced LAI with an increase in temperature ::::: ranges :::: from :::: zero :: to ::::::: strongly ::::::: positive ::::::::: depending ::::: upon ::: the ::::: model ::: and :::: site.At all six sites, DALEC CDEA shows a larger S T LAI than DALEC Knorr , perhaps indicative of the strong dependence of the CDEA formulation on temperature and lack of other ::: via :::::: carbon :::::: supply :::: and : a :::: lack ::: of : direct climate controls.Both the Knorr :::::::::: DALEC Knorr : and DALEC CDEA models infer the largest annual S T LAI at the two colder forest sites (FI-Hyy, US-Ha1), which is consistent with theoretical understanding of limiting factors on LAI (Caldararu et al., 2014;Richardson et al., 2013).DALEC Knorr infers a low annual S T LAI at the remaining four sites, with a median annual S T LAI of near zero for AU-How, FR-Pue, and GF-Guy, while there is a small positive median annual S T LAI at the FR-LBr site.DALEC CDEA , however, infers moderately strong S T LAI at the remaining four sites, and a median S T LAI for the two warm tropical sites (AU-How, GF-Guy) that exceed the S T LAI of two cooler temperate forest sites (FR-LBr, FR-Pue).: In ::::::: neither ::::: model :::: does ::: the :::: LAI ::::: show : a ::::::: negative ::::::::: sensitivity :: to :::::::::: temperature.: The seasonality of the inferred S T LAI from DALEC CDEA and Knorr models show distinct differences .DALEC CDEA tends to show a strong positive S T LAI year-round, often with a peak during the middle or late part of the growing season.DALEC CDEA also infers strong positive values during winter, even at the winter-dormant forest sites (FI-Hyy, US-Ha1) which goes against ecophysiological understanding at these sites (Richardson et al., 2013).For DALEC Knorr , the seasonality of S T LAI is centered on spring onset, with very low S T LAI at other times of the year.These seasonal patterns of S T LAI suggest that DALEC Knorr is more consistent with empirical understanding of temperature effects on LAI that show the temperature sensitivities is strongest during spring onset (Richardson et al., 2012;Piao et al., 2019).
Generally, at sites with a strong positive S T LAI at either annual or seasonal timescales, there is also a strong negative S T N EE (stronger carbon uptake with increased temperature), indicating the impact of LAI climate sensitivity on the sensitivity of NEE and the interrelated nature of these two processes.For DALEC CDEA , the larger positive S T LAI generally leads to more negative S T N EE , evident at seasonal timescales and by the negative annual S T N EE at across all sites.At four sites, the inferred median S T N EE by the two models differs in sign and magnitude, highlighting the impact of changes to the model formulation of LAI on the inferred temperature sensitivity of NEE.DALEC Knorr shows more variable annual S T N EE across sites, as the inferred median annual S T N EE can be either positive or negative depending upon the site.Seasonally, when there is a strong positive S T LAI (e.g.spring at FI-Hyy, FR-LBr, and US-Ha1) there is a concomitant negative S T N EE , demonstrating how a positive LAI temperature sensitivity leads to stronger carbon uptake and that there is a strong seasonal dependence of this link in DALEC Knorr .At these same three sites, other times of the year show a positive S T N EE , suggesting that LAI plays less of a role in in governing S T N EE outside of spring (Figs.A7-A8).

Precipitation Sensitivities
At all sites, S P LAI in DALEC CDEA is effectively zero (AU-How, FI-Hyy) or highly uncertain (FR-LBr, FR-Pue, GF-Guy, US-Ha1).This may reflect the weak process link between water availability and LAI dynamics in DALEC CDEA , considering the connection is mediated via the soil moisture limitation of GPP and subsequent changes in allocation of carbon to the foliar pool, which itself is buffered by the labile carbon pool, making LAI relatively insensitive to changes in water availability.In DALEC Knorr , there is often a weak but consistently positive S P LAI with a more tightly constrained uncertainty, which implies that increases in precipitation lead to small increases in LAI.This is consistent with the Knorr model formulation for waterlimitation on LAI (Eq.5), where changes in evapotranspiration (E) and/or plant available soil water (W ) can mediate LAI directly via the τ W parameter.More specifically, an increase in the ratio of W to E can increase LAI as there is less water limitation.The opposite is therefore also true, where a decline in the ratio of W to E will lead to a reduction in LAI, for example due to a decline in precipitation.The inferred S P LAI from DALEC Knorr is strongest at the AU-How and FR-LBr sites, with smaller S P LAI values at the three temperate and boreal forest sites, and effectively zero S P LAI at GF-Guy.Despite the lower DALEC CDEA -8.5 (-11.2 to -6.4) -3.1 (-4.4 to -1.8) -0.8 (-1.6 to 0.1) -0.6 (-1.3 to 0.0) -0.8 (-1.5 to -0.4) -0.8 (-2.7 to 2.2) Knorr model :::::::: DALEC Knorr 0.5 (-1.4 to 3.0) -1.8 (-5.5 to 0.9) 1.5 (0.2 to 3.3) 4.3 (1.7 to 7.3) -0.1 (-0.6 to 0.7) 0.8 (-9.4 to 7.3)  τ W inferred at AU-How across sites, there is a relatively stronger S P LAI which may be due to higher evaporative demand at this tropical site causing larger imbalances between E and W .
The influence of water availability on Knorr LAI has a clear seasonality at some sites with S P LAI typically being largest during the mid to late growing season .This seasonal dependence is consistent with recent evidence of late growing season water limitation on productivity at many forest sites (Buermann et al., 2018;Zhang et al., 2020).The evident seasonal dependence of water limitation on LAI in the Knorr model gives promise for an MDF approach to exploring the compensatory effects of temperature and water limitation on growing season phenology and productivity.
At most sites the inferred annual median S P N EE from both DALEC CDEA and Knorr models is negative, indicating an increase in net carbon uptake with increased precipitation.Only US-Ha1 shows a consistently positive annual S P N EE , across the uncertainty range and for both models.This is due to the strong positive sensitivity of R e to precipitation and very small positive sensitivity of GPP to precipitation inferred at the US-Ha1 site (Fig. A8).The seasonality in S P N EE shows considerable differences across sites, but it is evidently influenced by S P LAI .For example, DALEC Knorr shows a small positive S P LAI during the peak growing season at AU-How (Austral summer), and this leads to a stronger sensitivity of summer carbon uptake to precipitation.Similarly, the small positive S P LAI at FI-Hyy during the late growing season produces a slight seasonal shift in S P N EE to later in the year.In other sites the low S P LAI from DALEC Knorr is coupled with a very low S P N EE (FR-Pue, GF-Guy), whereas DALEC CDEA infers large and highly uncertain S P N EE values at these sites.The addition of the process-based Knorr model seems to help stabilize the sensitivity of NEE to precipitation at these two sites, as the exceptionally large and uncertain S P LAI from DALEC CDEA maps into S P N EE .

Significance and Limitations
Developing models that balance process realism with reliability and robustness is key to better predictions of carbon-climate feedbacks (Prentice et al., 2015).The climate sensitivity of LAI is an important mechanism of terrestrial carbon-climate feedbacks, as it is closely coupled to both GPP and NEE (Richardson et al., 2013).Here, we implemented a new model for LAI phenology in CARDAMOM that includes processes such as temperature and photoperiod controls on growth and senescence triggers, water limitation on LAI growth rate, and couple LAI dynamics to plant carbon allocation and litterfall.Many previous studies have developed and tested climate-sensitive LAI phenology models against LAI data directly (e.g.Fox et al., 2022;Jolly and Running, 2004;Stöckli et al., 2008;Viskari et al., 2015).However, in the context of carbon-climate feedbacks it is critical that both model formulation and parameterization of LAI considers its close coupling to other processes in the carbon cycle.By confronting multiple processes with observations simultaneously in CARDAMOM we were able to generate a parameterization that provides a more integrated view of the carbon cycle (Kaminski et al., 2013).Furthermore, by extending the validation beyond the assimilated data streams, we were able to rigorously evaluate model skill.This is a key step toward a better model representation of the aggregate behavior of the terrestrial carbon cycle (Fisher and Koven, 2020).
Including the two model formulations allowed us to evaluate how the increase in complexity of LAI phenology influenced the fit to assimilated data streams (LAI, NEE, biomass) and wholly withheld data streams (GPP, R e ).Evidently, the DALEC CDEA model is a parsimonious approach for fitting data and appears to give reliable predictions of the assimilated data streams (Fig. 3) and even outperforms DALEC  (Fang et al., 2019).For example, the seasonal amplitude of satellite LAI is often overestimated in boreal evergreen systems (Heiskanen et al., 2012).Other potential error sources include spatial sampling differences between the eddy covariance tower footprint (NEE, GPP, R e ), biomass samples, and the footprint of Copernicus satellite LAI data.The limitations of the observations imply caution when over-fitting to any one data stream, and that correcting these sampling biases should allow us to better reconcile model and data.Further evaluation of the two models suggest that DALEC Knorr better captures temporal variability and magnitude of FLUXNET GPP and R e data, and predicts more realistic annual litterfall rates compared to a global compilation of observations (Zhang et al., The temporal structure of the inferred climate sensitivities has implications for the response of ecosystem carbon cycling to a changing climate.Variability in climate forcing, due to both natural and anthropogenic factors, is not uniform in time or space, with seasonal dependencies on both the variability and trend (Franzke et al., 2020).The inferred sensitivities to temperature and precipitation  show strong seasonality, so the seasonal structure of future climate change will have a strong impact on the response of the carbon cycle.For example, a change in spring temperature forcing will have markedly different impacts on LAI and NEE than an equivalent change in fall temperature forcing due to seasonal differences in the intrinsic sensitivity of the terrestrial carbon cycle.The intrinsic sensitivity to climate is the combination of a number of distinct yet interrelated processes, each of which can have different effects on the net carbon balance.Here, the inferred temperature sensitivities of LAI, S T LAI , are positive definite in all cases.However, the inferred temperature sensitivities of NEE, S T NEE , can be positive or negative depending upon the site and season.The climate sensitivities of NEE integrate over a number of underlying processes, each of which can have differences in sign and magnitude in their sensitivities (e.g.GPP and R e can respond oppositely to precipitation, Appendix Fig. A7-A8).Constraining these sensitivities using diverse observations provides a robust way toward better representation of carbon-climate feedbacks.
Many phenological processes operate on time scales shorter than the monthly time step used in this analysis.Therefore, accurately resolving specific phenological events such as spring onset or fall senescence events, which can occur within days to weeks, is outside the scope of this study.However, the necessary mechanisms are included in the Knorr LAI model, providing a path toward finer time-scale analyses, which would help to characterize phenological responses to climate and the relationship with NEE.In any case, as outlined by Keenan et al. (2020) data-informed process modeling is key to resolving these processes, as implemented here, so that explicit consideration of processes and observational uncertainties can be mapped onto the inferred climate sensitivities.

Conclusions
This study demonstrates a holistic approach to model development for the purpose of improving model representation of the climate response of the terrestrial carbon cycle.We integrated a new formulation for LAI phenology into the DALEC terrestrial biosphere model, outlining the coupling to the carbon and water cycles, and performed MDF using CARDAMOM to calibrate the model against diverse Earth observations.Relative to the previous LAI phenology model in DALEC, the new DALEC model showed improved representation of the underlying processes governing the net ecosystem carbon balance, GPP and R e , yet similar performance against the assimilated observations, LAI, NEE, and biomass.This analysis was carried forward to evaluate the data-informed climate sensitivity of the new model structure and parameterization, which showed large changes in the seasonality, sign, and magnitude of LAI and NEE sensitivities to temperature and precipitation.The added process realism of LAI phenology in DALEC/CARDAMOM provided more realistic and robust predictions of the terrestrial carbon cycle and its response to climate, highlighting the important role that LAI phenology plays in representing the terrestrial carbon cycle especially when considered in a MDF system.The labile production flux is computed by: F labprod (t) = (GP P − R a )f lab = N P P f lab (C1) where GP P is the gross primary productivity, R a is the autotrophic respiration, N P P is the net primary productivity, and f lab is a parameter representing the fraction N P P allocated to the labile pool.
and scatter plots in Appendix Fig.A1.Pearson's r shows that NEE temporal variability is better captured by the DALEC Knorr model at four of the six sites (AU-How, FR-LBr, FR-Pue, GF-Guy), while both models show comparable performance at the remaining two sites (FI-Hyy, US-Ha1), with equal correlations between the two models with all site data combined (r=0.83).The RMSE in NEE for both models is comparable for each site, with smaller residuals from DALEC Knorr at three sites (AU-How, FR-LBr, FR-Pue), larger residuals at two sites (FI-Hyy and US-Ha1), and equivalent residuals at GF-Guy.For all site data combined, RMSE=0.96g C m −2 d −1 for both models.There is a small high bias in model NEE at most sites (<0.5 g C m −2 d −1 ), with DALEC CDEA showing a slightly lower bias for all site data combined (0.16 g C m −2 d −1 ) compared to DALEC Knorr (0.21 g C m −2 d −1 ), indicating the both models slightly underestimate net carbon uptake across sites.

(
. With all site data combined there is equal correlation with r=0.93.The RMSE for LAI shows that the DALEC CDEA residuals are smaller than DALEC Knorr at three sites (AU-How, FI-Hyy, US-Ha1) indicating better performance, while DALEC Knorr shows better performance at FR-LBr and FR-Pue, and finally both models show similar performance at GF-Guy.Across sites, there is a marginally better performance by DALEC CDEA (RMSE=0.55 m 2 m −2 ) relative to DALEC Knorr (RMSE=0.57m 2 m −2 ).Both models tend to systematically underestimate LAI, as both models are biased low by 0.23 m 2 m −2 for all site data.Both models capture the across site variability in ABGB with r=0.99, as shown in the Appendix Fig.A1.However, DALEC CDEA has smaller residuals for all site data combined, with a RMSE=472 g C m −2 and bias=-217 g C m −2 , versus a RMSE=952 g C m −2 and bias=-609 g C m −2 for DALEC Knorr .Observed and modeled trends for LAI and NEE are shown in Appendix Fig.A3.Observed LAI shows a significant trend at two sites, FR-LBr (p <0.05) and FR-Pue (p <0.001).DALEC CDEA also shows a significant trend at these two sites but overestimates the magnitude of the trend at FR-LBr by a factor of two and misrepresents the sign of the trend at FR-Pue.DALEC CDEA also produces a significant positive trend at FI-Hyy (p <0.05) and negative trend at GF-Guy (p <0.001), neither of which are shown by the observations.DALEC Knorr does not predict a significant trend in LAI at any site.No site shows a significant trend in observed NEE, which the DALEC Knorr is consistent with.DALEC CDEA , however, predicts a significant positive NEE trend (p <0.05) suggesting a weakening carbon sink, consistent with the strong negative trend in modeled LAI.

Figure 2 .
Figure 2. Model-data fit shown as time-series at each site and for each CARDAMOM LAI model formulation, for LAI (top panel) and NEE (bottom panel).Assimilated observations are shown as black markers (calibration) and withheld observations as red markers (validation).The gray shading shows the DALECCDEA model and the green shading shows the DALECKnorr model.

Figure 3 .
Figure 3. Top panel shows the model-data fit statistics to the assimilated observational data streams, LAI (top) and NEE (bottom), over the validation period.Bottom panel shows the model-data fit statistics to the withheld observational data streams, GPP (top) and Re (bottom), over the validation period.Markers show the Pearson's correlation coefficient (r), RMSE, and bias, per study site and for all site data combined.The corresponding time-series showing the model-data fits are shown in Fig. 2 for LAI and NEE, and Fig. A4 for GPP and Re.All site data combined scatter plots are shown in the Appendix Fig. A1.

Figure 4 .
Figure 4.The seasonal pattern of model-data fit to LAI and NEE, and the inferred climate sensitivity of LAI and NEE to interannual variations in precipitation and air temperature.

Figure 5 .
Figure 5.The seasonal pattern of model-data fit to LAI and NEE, and the inferred climate sensitivity of LAI and NEE to interannual variations in precipitation and air temperature.

Figure A1 .
Figure A1.Model-data fit to assimilated observations (NEE, LAI, ABGB) and wholly withheld observations (GPP, Re) over the validation period, for the DALECCDEA model (left) and DALECKnorr model (right).Each color represents a different site.

Figure A2 .
Figure A2.Model-data fit statistics for the interannual variability on an annual and seasonal basis against the assimilated observations (LAI, NEE) and wholly withheld data (GPP, Re) over the validation period, with all sites combined.Markers show the Pearson's correlation coefficient (r), RMSE, and bias, per study site and for all site data combined.

Figure A4 .
Figure A4.Model-data fit shown as time-series at each site and for each CARDAMOM LAI model formulation, for GPP (top panel) and Re (bottom panel).Both GPP and Re data were withheld from the MDF for validation.Observations are shown in red markers (validation data).The gray shading shows the DALECCDEA model and the green shading shows the DALECKnorr model.

Figure A5 .
Figure A5.LAI phenology posterior parameter PDFs for the Knorr model, shown as the log-normalized PDFs between the minimum (normalized constraint=0) and maximum (normalized constraint=1) parameter bounds.The violin plot shows the full posterior PDF, the solid vertical bars indicate the 25th to 75th percentile range (IQR), and the horizontal solid lines indicate the median.The percentages at the top of the figure indicate the parameter uncertainty reduction from prior to posterior, reported as the reduction in log-normalized IQR (see methods).

Figure A6 .
Figure A6.Posterior parameter PDFs for the common process parameters of the DALECCDEA model (gray) and DALECKnorr model (green), excluding initial conditions.Violin plots of 'normalized contraint' show the log-normalized PDFs between the minimum (normalized con-straint=0) and maximum (normalized constraint=1) parameter bounds.The violin plot shows the full posterior PDF, the solid vertical bars indicate the 25th to 75th percentile range (IQR), and the horizontal solid lines indicate the median.The percentages at the top of the figure indicate the parameter uncertainty reduction from prior to posterior, reported as the reduction in log-normalized IQR (see methods).

Figure A8 .
Figure A8.The seasonal pattern of model simulated GPP and Re, and the inferred climate sensitivity of GPP and Re to interannual variations in precipitation and air temperature.

Table 1 .
Description of the study sites including the site name, FLUXNET code, dominant plant functional type (PFT), location, mean annual temperature (MAT) and mean annual total precipitation (MAP).

Table 2 .
Annual temperature sensitivity per site, variable (LAI, NEE), and model in CARDAMOM.The values represent the median annual temperature sensitivity, with the 25th to 75th percentile range in brackets.* LAI sensitivities are scaled by 10 3 .

Table 3 .
Annual precipitation sensitivity per site, variable (LAI, NEE), and model in CARDAMOM.The values represent the median annual precipitation sensitivity, with the 25th to 75th percentile range in brackets.* LAI sensitivities are scaled by 10 3 .
Knorrat capturing LAI IAV.The close coupling of LAI with GPP and plant allocation in DALEC CDEA may provide flexibility in fitting LAI data.However, at four sites DALEC CDEA predicts significant positive (FI-Hyy) or negative (FR-LBr, FR-Pue, GF-Guy) trends in LAI that are not present in the observations, nor are they predicted by DALEC Knorr .At one site (FR-LBr), this leads to a significant trend in DALEC CDEA model NEE that is not supported by the observations, suggesting that DALEC CDEA may lead to large biases in the predicted net carbon balance over longer periods.The tighter coupling between climate and LAI phenology in DALEC Knorr may help moderate LAI and GPP dynamics during the prediction period, preventing unrealistic model trends.This difference in model formulation also leads to a significantly reduced biases in GPP by DALEC Knorr , and may imply that the DALEC CDEA model is over-fitting to the LAI data.It is also important to consider that the satellite LAI observations can have systematic biases that we do not consider in the MDF system, as CARDAMOM only considers random errors.Satellite LAI retrievals can be particularly challenging over boreal (e.g.FI-Hyy), tropical (e.g.GF-Guy), and open woody savannas (e.g.AU-How) due to effects such as low solar zenith angle, snow and cloud contamination, visibility of the understory, and a lack of validation data

Table B1 .
DALECCDEA LAI phenology parameters including process parameters and initial conditions, along with their prior range.a This parameter is used in DALECCDEA to allow for direct carbon allocation to C f ol , whereas DALECKnorr does not, as C f ol is only supplied with carbon from C lab .b Uses a circular prior range that extends beyonda 365.25 to prevent edge jumping during optimization (e.g.Dec 31 to Jan 1), as this parameter is used in a sine function with an annual period, so the actual day-of-year value can be computed as modulo 365.25.
a Only during canopy senescence.b Defined as the fraction of LAImax.