Large inter-annual variation in carbon sink strength of a permanent grassland over 16 years: Impacts of management practices and climate

Permanent grasslands cover one third of the European agricultural area and are known to store large amounts of carbon (C) in their soils. However, long-term assessments of their C sink strength are still scarce. Thus, we investigated the C budget of an intensively managed, permanent grassland in Switzerland over 16 years, compared the results to changes in soil C stocks


Introduction
The atmospheric CO 2 concentration has been increasing since the pre-industrial era contributing to climate change (IPCC, 2021).Ecosystems can contribute to and at the same time mitigate the effects of climate change (IPCC, 2023, their Figure 4.4).In particular, managing agroecosystems bears the potential to decrease their greenhouse gas (GHG) emissions and to increase their carbon (C) sink strength (Smith et al., 2008a;IPCC, 2023).
Grassland ecosystems make up for 34% of the agricultural area in Europe (Eurostat, 2016).In Switzerland, this share is even higher, with about 70% of the agricultural area (BFS, 2021), typically as permanent grassland, i.e., not included in any crop rotation.Grassland renewal, i.e., ploughing and reseeding, is less frequent in permanent grasslands than in temporary grasslands in crop rotations, and is usually related to high GHG emissions (Merbold et al., 2014).When aiming to reduce GHG emissions from agriculture, grasslands are important ecosystems to consider, not only because they are widespread and harbor high biodiversity compared to croplands (Schils et al., 2022), but also because they have the potential to act as C sinks (Soussana et al., 2007;Schulze et al., 2009;Kämpf et al., 2016;EASAC, 2022).
The grassland CO 2 exchange can directly be measured using the eddy covariance (EC) technique (Aubinet et al., 2012).It provides observations of the net ecosystem exchange (NEE), which represents the difference between two large CO 2 fluxes: gross primary production (GPP) (i.e., how much CO 2 is fixed by gross photosynthesis), and ecosystem respiration (Reco) (i.e., how much CO 2 is lost via plant and soil respiration).NEE is influenced by both, environmental conditions and grassland management, as recent studies in Swiss grasslands have shown (Ammann et al., 2020;Rogger et al., 2022).Environmental variables such as incoming shortwave radiation, water vapor pressure deficit, or air temperature influence grassland NEE (e.g., Rogger et al., 2022, Wohlfahrt et al., 2008a).This results in NEE being highly dynamic, varying on diel as well as seasonal scales.Grassland C sink strength is usually addressed via annual budgets of net biome production (NBP) (Kirschbaum et al., 2001;Schulze and Heimann, 1998).NBP is based on flux measurements (NEE) while also considering other budget components from ecosystem management, i.e., C imports (e.g., C added via organic fertilizer amendments), and C exports (i.e., C removed via harvest or grazing), to quantify the grassland C budget.Other studies also use the term NECB (net ecosystem carbon balance) (Ammann et al., 2020), which simply has a different sign convention than NBP (micrometeorological sign convention for NBP vs. ecological sign convention for NECB).However, information about the long-term variation of NBP (or NECB) for grasslands and its environmental and management drivers remains limited.
Considering the annual NEE budget, i.e., the cumulative sum of net ecosystem CO 2 fluxes over one year, grasslands usually fix more CO 2 via gross photosynthesis than they respire via plant and soil respiration (Soussana et al., 2007;Zeeman et al., 2010).However, in order to evaluate if grasslands act as C sinks or C sources, C imports and exports need to be considered (NBP) (Ammann et al., 2007;Schulze et al., 2009;Prescher et al., 2010;Rogger et al., 2022).Moreover, when considering all three GHGs, i.e., CO 2 , nitrous oxide (N 2 O) and methane (CH 4 ), in the net GHG exchange (NGHGE), the grassland CO 2 exchange (NEE) typically counteracts the fluxes of the other two GHGs in terms of CO 2 -equivalents, resulting in a net GHG sink (Allard et al., 2007;Hörtnagl et al., 2018;Jones et al., 2017;Soussana et al., 2007), except in renewal years when the grassland is resown (Drewer et al., 2017;Merbold et al., 2014).However, when using NBP instead of NEE in the GHG budget (NGHGB), grasslands can act as both GHG sinks (Schulze et al., 2009;Soussana et al., 2007), but also as GHG sources in renewal and post-renewal years (Ammann et al., 2020).Thus, not only climate variability, but also knowledge about management is essential when quantifying the climate change mitigation potential of grasslands.
Changes in soil C stocks over time indicate if a grassland has been a source or a sink of C, i.e., if C has been lost from or sequestered in the soil profile.The potential of soil C sequestration in croplands and grasslands globally is estimated as 0.4-8.6Gt CO 2 -eq yr − 1 (Jia et al., 2019).While NBP allows to investigate the influence of episodic management events (e.g., harvests) on changes in NEE, short-term changes in soil C stocks are more difficult to detect.Changes in soil C stocks or NBP are often related to a change in land use (e.g., conversion from forest or cropland to grassland) or a change in agricultural management (e.g., increasing organic fertilization; Conant et al., 2001).Results from NBP and soil organic C assessments do not always agree, and only a few studies used both methods (Leifeld et al., 2011;Skinner and Dell, 2015;Emmel et al., 2018;Ammann et al., 2020;Jones et al., 2017).Multiple reasons for this shortage of studies can be given: (1) data availability.Time series with more than 10 years of CO 2 flux measurements are scarce, while changes in soil C stocks might only show after a decade (but see Schrumpf et al., 2011).In addition, data for the whole soil profile, including bulk density measurements, are difficult to obtain (e.g., Gattinger et al., 2012); (2) discrepancy in temporal scales.While NBP can be reliably estimated after few years of flux measurements, soil C stocks can be determined with affordable efforts (both in terms of sampling numbers and finances) only after a long period, typically 10+ years; (3) long-term data documentation.Meta-data on fluxes and management must be available and kept over time.This may often be hampered by short funding cycles; soil sampling locations must be found again after a decade, often by different persons; setup and maintenance of the flux station must be well documented over many years; accurate and sufficiently detailed management data must be collected regularly over long periods; (4) long-term funding must be available.Thus, the number of flux sites for which all these aspects are fulfilled, is limited, particularly for intensively managed grassland with frequent management interventions.
Therefore, this study focused on multi-year C budgets (NBP) of an intensively managed, permanent grassland in Switzerland.We asked the following questions: 1 Did the permanent grassland act as a C sink or a C source over the last 16 years and are there trends in NBP? 2 Does the cumulative NBP reflect the changes in soil C stocks? 3 How do management practices and environmental variables influence the CO 2 exchange and the C budget of this grassland at different time scales?

Site description and management
The grassland site Chamau (CH-Cha in FLUXNET; 47.210222 • N, 8.410444 • E) at 393 m a.s.l. is located in the Reuss river valley on the Swiss Plateau, about 30 km south of the city of Zurich, Switzerland.It is part of a farm, which originally belonged to the ETH Zurich research facility Chamau, until it was sold in 2017 to the canton of Zug (ZG), and now is managed by the local agricultural consultation and education center (Landwirtschaftliches Beratungs-und Bildungszentrum (LBBZ) Schluechthof Cham, ZG).The average annual temperature is 9.9 • C, and the average annual precipitation sum is 1134 mm (2005-2020; Fig. B1).The site is located north of the Swiss Alps and shows two prevailing wind directions, following a mountain-valley wind system (day: up valley flow from NW/N, night: down valley flow from SE).The soil at Chamau was classified as Cambisol/Gleysol, with a silt loam soil texture and a pH of 5.3 in the top 10 cm of the soil (Roth, 2006).
In July 2005, a meteorological and flux measurement station was installed in the intensively managed, permanent grassland (for details on instrumentation, see Section 2.2).The area surrounding the flux station had been converted from cropland to grassland in 1998, sown as a ryegrass-white clover seed mixture, and stayed a permanent, intensively managed grassland since then (Gilgen et al., 2010;Roth, 2006).The only exception was the 2001 growing season when the area Table 1 Site management information per parcel.Number of cuts per year, number of grazing events, average number of grazing days for years with grazing, yield (mean annual harvest for the specified time periods), amount of C in the yield (Yield C), amount of N in the yield (Yield N), amount of C in the fertilizer (Fertilizer C), and amount of N in the fertilizer (Fertilizer N) are presented for the two parcels surrounding the EC station (A and B).Averages over three periods (2002-2004 premeasurements; 2005-2014 similar management of the two parcels; 2015-2020 clover experiment) are given.Between 2015 and 2020 (clover experiment), parcel B had a higher clover proportion than parcel A and no fertilization (see Fuchs et al., 2018 for further details).northeast of the station was used for silage maize, but already in 2002, it was resown as grassland again.The farm is managed as an integrated production system, i.e., permanent grassland management to provide feed for the livestock on the same farm.The Chamau grassland site is composed of two parcels whose outlines changed twice over the last 16 years, in 2014 and in 2019, resulting in three periods with constant areas (2005 to 2013, 2014 to 2018, 2019 to 2020; Fig. B2).The parcel south of the station is referred to as A1, A2, and A3, while the parcel mostly north of the station is referred to as B1, B2, and B3 for the periods 2005 to 2013, 2014 to 2018, and 2019 to 2020, respectively.Almost every year from 2002 to 2012, both parcels (A1, B1) had been oversown with a seed mixture, including white clover (Trifolium repens), English ryegrass (Lolium perenne) and common meadow-grass (Poa pratensis).Oversowing was usually preceded by harrowing to remove the undesired Poa trivialis (rough meadow-grass), followed by rolling to press the seeds into the soil.Despite the frequent resowing, also other plant species established at the site, e.g., Italian ryegrass (Lolium multiflorum), red clover (Trifolium pratense), and dandelion (Taraxacum officinale).For further details about species composition until 2012, please see Zeeman et al. (2010) and Gilgen et al., 2010.During some years, herbicide against Rumex spp.(e.g., Rumex obtusifolius) and Senecio spp. was applied, while in 2012, there was one molluscicide application.In 2012, a major sward renewal with ploughing and reseeding took place (Merbold et al., 2014).During 2002 to 2013, both parcels (A1, B1) were managed similarly with four to six cuts per year, usually followed by organic fertilizer applications (slurry).Fertilization with farmyard manure and mineral fertilizer was rare during the 16 years of study (only two and three times, respectively) compared to the frequent fertilization with slurry.Occasionally, the parcels were also grazed by sheep.In 2014, the areas of the parcels changed while the management stayed the same (A2, B2; Table 1).Starting in 2015, the management differed between the two parcels to test a N 2 O mitigation strategy (clover experiment; Fuchs et al., 2018).Parcel A2 was managed as before (2005 to 2013), while parcel B2 had a higher legume fraction and received no fertilizer any longer.Both parcels were still mowed at the same date, about every month from May to September, resulting in 4 to 5 cuts per year (Table 1).The mowing usually takes place during a period of fair weather, when the vegetation is high and active (large net CO 2 uptake).In 2019, parcel A2 was reduced in size (A3; because the area close to the barn was used as pasture), while the area of parcel B2 stayed the same (called B3 for consistency).The management in the framework of the clover experiment continued for both parcels A3 and B3 until the end of 2020.For further details on the management between 2005 and 2020, see Table B1 and Table B2.

Instrumentation
Since July 2005, concentrations of CO 2 and H 2 O vapor were measured with LI-7500 open path infrared gas analyzers (IRGAs) (LI-COR Biosciences, USA) at 2.42 m measurement height (six different instruments over 16 years).Instruments were calibrated typically once a year, and measurements were recorded at 20 Hz resolution.The IRGAs were mounted inclined to reduce the deposition of water droplets on the open path windows.3D wind speed (u, v, w) and wind direction were measured with ultrasonic anemometers (R3-50, Gill Instruments Ltd., UK) at 2.41 m height (three different instruments over 16 years).Data were also recorded at 20 Hz and merged with the IRGA data in real-time using a data acquisition system described in Eugster and Plüss (2010).The 20 Hz data were stored on the station computer and transferred to ETH Zurich.

Flux processing and post-processing
The 20 Hz data from all years were processed using EddyPro (LI-COR Biosciences, USA, EddyPro v6.1.0for the years 2005-2015; v6.2.0 for 2016; v7.0.6 for the years 2017-2020) to calculate half-hourly averages of the turbulent fluxes using the covariance between the turbulent fluctuations of gas concentration and vertical wind speed (Aubinet et al., 2012).Corrections for density fluctuations (Webb et al., 1980), and angle-of-attack (Nakai et al., 2006) as well as axis rotation for tilt correction using double rotation (Wilczak et al., 2001) were applied.Time lags were detected using covariance maximization, with a default time lag if no time lag was found.Raw data were tested for spikes and drop-outs (Vickers and Mahrt, 1997).Furthermore, spectra were corrected for high-pass (Moncrieff et al., 2004) and low-pass filtering effects (Horst, 1997).To minimize IRGA self-heating, the instruments were mounted inclined, and no self-heating correction was applied (Burba et al., 2008;Reverter et al., 2011).
Further quality checks were applied to the half-hourly output from EddyPro.Periods when the vertical wind speed (w) was out of range (in 2008, 2009 and 2016; probably due to malfunctioning of the ultrasonic anemometer), got flagged.A combined quality flag was calculated using the Python library diive v0.21.0 (available at: https://gitlab.ethz.ch/diive/diive-legacy), which (1) considered a steady-state and an integral turbulence characteristics test (Mauder and Foken, 2004) (0-1-2; fluxes with quality flag 2 were excluded), and checked (2) the spectral correction factor for out-of-range values (values with spectral correction factor > 4 were excluded), (3) the window dirtiness of the open path mirrors (if > 93%, values were excluded), (4) the completeness of the 20 Hz files (if < 95% or < 34,200 records, values were excluded), and (5) the occurrence of spikes and drop-outs in the 20 Hz data after the statistical tests by Vickers and Mahrt (1997).This combined quality flag was applied, and storage terms were added to the fluxes to calculate the net ecosystem exchange (NEE) of CO 2 (Aubinet et al., 2001).
Since the open path sensors are subject to noise, a de-spiking of the flux time series was performed, using a Hampel filter (Borchers, 2021).The filter's window width was 432 values (9 days) for CO 2 , values outside ±5 times the standard deviation range during this time window got flagged, eight iterations were performed.Subsequently, the 1st and 99th percentiles of the nighttime CO 2 fluxes were trimmed to eliminate further outliers.Values outside a range of ±50 µmol CO 2 m − 2 s − 1 were excluded.After these post-processing steps, 52% of 280′512 possible half-hourly CO 2 fluxes for the time period 2005 to 2020 were available.
Gapfilling for the calculation of annual CO 2 fluxes was done using the R package Reddyproc (v1.2.2, Wutzler et al., 2018).Seasonal thresholds for the friction velocity (u * ) were determined to account for different surface roughness conditions.Meteorological seasons were used (winter: DJF; spring: MAM; summer: JJA; autumn: SON), and the 50% percentile of the seasonal u * threshold distribution was used to filter the data.Seasonal u * thresholds ranged from 0.03 to 0.08 m s − 1 for the period 2005-2020.After u * filtering, the data coverage was 46.4%.Marginal distribution sampling (MDS; Reichstein et al., 2005), which is a combination of look-up tables (LUT; using global radiation (Rg), air temperature (Tair) and vapor pressure deficit (VPD), with 50 W m − 2 , 2.5 • C, and 5.0 hPa margins, respectively) and the mean diurnal course method (MDC; no meteorological data necessary) (Falge et al., 2001;Wutzler et al., 2018), was chosen as the gapfilling method.Long gaps > 2 months (in 2005, 2008, 2009 and 2016) which could not be filled with MDS, were filled using random forest (RF) models (Breiman, 2001).The RF models were trained in R (Liaw and Wiener, 2002) with data from years adjacent to the long gaps, with the same predictors as for MDS (Rg, Tair, VPD).Additional predictors in the RF models were the timestamp and a numeric variable which indicated the time that had passed since a defoliation event to also take into account the effect of mowing or grazing on CO 2 fluxes.If the MDS or the RF gapfilling introduced negative values for the CO 2 flux at night (when potential radiation equaled zero), they were set to missing values again and filled with the average NEE of good quality measured and gapfilled fluxes of the respective night.Measured negative fluxes during the night were kept in the dataset if they passed all quality checks.
Gaps in the half-hourly meteorological data (air pressure, global radiation, air temperature, precipitation, and relative humidity) were filled with data from nearby weather stations at Cham (47.188278

Carbon budgets
We use the micrometeorological sign convention with negative fluxes being fluxes from the atmosphere to the surface (CO 2 uptake, C imports) and positive fluxes being fluxes from the surface to the atmosphere (CO 2 emissions, C exports).We accounted for C imports and C exports when calculating the C budget (net biome production, NBP) of the site The unit for every term is g C m − 2 , and NBP was calculated on an annual basis using the yearly sums of NEE, C imports and C exports.
C imports comprised the frequent fertilization events with mostly slurry (Table B1, Table B2).C imports via oversowing were not considered since seed C inputs are negligible (Emmel et al., 2018).To quantify the C and N inputs via fertilization, the amount of fertilizer applied (mineral fertilizer, manure and slurry) was recorded by the farmer, and fertilizer samples were taken during each application (Zeeman et al., 2010).The chemical composition (i.e., C and N concentrations) of the manure and slurry samples was determined by a laboratory for soil and environmental analyses (Labor für Boden-und Umweltanalytik (LBU), Erich Schweizer AG, Thun, Switzerland).C concentration of the slurry was 404±35 g kg − 1 dry matter (DM; average from 2005 to 2020 ±SD, all samples from both parcels), while nitrogen (N) concentration was 70±21 g kg − 1 DM.
C exports represented the regular mowing events and were calculated from the harvested biomass, while C exports via leaching were considered negligible (Kindler et al., 2011;5.3g DOC loss m − 2 yr − 1 on average for grasslands).C exports via grazing were not considered in the C budget, since grazing events (mainly with sheep) were rare compared to mowing (Table 1), and other studies showed that exported C with the grazed biomass is balanced on site by feces C imports (Smith et al., 2008b).
Between 2005 and 2020, harvest yields at each mowing date were documented for each parcel by the farmer as the number of silage bales, hay bales, truckloads or estimated in decitonnes of fresh matter (dt FM; 1 dt = 100 kg).The weight of the harvest units was known, and DM yield could be calculated (i.e., fresh weight multiplied by dry matter contents, DMC).Depending on the kind of export (e.g., silage or hay), different DMC values were used: 37% for silage (mostly fresh biomass), and 86% for hay (almost fully dried biomass) (average DMC values from measurements by the farmer).The amount of C exported was then calculated, using a C concentration of 44% (average from C measurements 2015-2020), multiplied with the DM yield (in kg DM ha − 1 ) for the years 2005 to 2014.For the years 2015 to 2020, C concentration measured in harvested biomass was used (values between 41% and 45%).In addition, manual measurements of the yield were taken at each harvest date between 2015 and 2020.A metal frame with a known area (0.1 m 2 ) was placed at 10 random locations per parcel around the measurement station.The vegetation inside the frame was cut at frame height which corresponded to the approximate cutting height (4 cm).Subsequently, the biomass samples were dried until weight constancy at 60 • C for at most seven days, then weighed and scaled up to DM yield per one ha.The dried biomass was first crushed into smaller pieces, and a subsample was ground with a ball mill (MM200, Retsch, Germany), and weighed into tin capsules for the analysis of C and N concentrations (expressed in %) determined with a Flash EA 1112 series elemental analyzer (Thermo Italy, former CE Instruments, Rhodano, Italy).
For the C budget (NBP) of the site, we calculated the weighted average for C imports and for C exports based on the contributions of the two parcels to the measured CO 2 flux (NEE), since the parcels varied in outlines and areas over the 16 years of this study (Fig. B2).Especially for the period of 2005-2013, parcel B was expected to have a larger influence on NBP than parcel A, since the area of parcel B covered more than 80% of the footprint (Fig. B2, Fig. B3).Contributions of each parcel were calculated using the Kljun et al. (2015) footprint model (Fig. B3, Table B3), which provided relative contributions for each pixel (1 m spatial resolution) within the footprint for multiple half-hourly timestamps.These contributions per pixel were averaged over the three periods (2005-2013, 2014-2018, 2019-2020) and summed up according to the parcel's area.In all three periods, the two parcels together accounted for >85% of the total footprint of the flux measured at the station (Fig. B3).For calculating NBP, the contributions of the two parcels together were scaled to 100% (Table B3), and the C imports and C exports from the two parcels were averaged according to their respective contribution to the flux footprint (see Fig. B3) instead of simply assuming a 50:50 contribution of the two parcels.
After applying the quality checks and corrections to the flux data as described above (Section 2.3), many sources of errors had been reduced.However, we identified three remaining sources of uncertainty for the annual NBP budgets: (1) random uncertainty in NEE, (2) uncertainty in NEE due to applying different u * thresholds, and (3) uncertainty in estimating C imports and C exports.Thus, we calculated standard deviations for every year and the full 16-year period using error propagation, assuming that the different years did not correlate (similar to Buysse et al., 2017).The resulting values were used as an estimate of uncertainty for all the components of the C budget (NEE, C exports, C imports and NBP).Further details about these uncertainty calculations can be found in the Appendix A. For the long-term averages of the C budget components, we report the mean±95% confidence intervals (from two-sided t-tests).

Soil carbon stocks
The soil was sampled twice in the past 16 years to determine the changes in soil C stocks, first in summer 2005 (43 locations across the footprint; Roth, 2006) and then repeated at the same coordinates (determined with a Trimble® R10 GNSS System) for a subset of locations in summer 2018 (29 locations; Fig. B4a).To ensure comparability, we used similar methods as Roth (2006), but increased the soil depth from 0.1 m to 0.3 m at these 29 locations.We used a core drill (0.05 m in diameter) to take the soil samples and divided the soil cores into three layers (0-0.1 m, 0.1-0.2 and 0.2-0.3m).Sampling of the 0.2-0.3m soil layer was limited due to bedrock (only possible at 24 of the 29 locations).Additionally, we took soil samples from 0.3 m to a depth of at least 0.65 m at four of these 29 locations using a Pürckhauer core drill (0.02 m in diameter) (Fig. B4a).In 2005, only two locations were sampled deeper than 10 cm from two open soil pits (C1 and C2: Fig. B4a).While at C2 bedrock limited sampling deeper than 30 cm, C1 was sampled until 0.65 m in 2005.Thus, in total, 29 of the original 43 locations (Roth, 2006) were resampled in 2018, with a higher number of samples for depths >0.1 m in 2018 than in 2005.
Bulk density (BD) samples in the top 0.1 m of the soil (0-0.1 m depth) were taken horizontally with a metal ring (volume of 100 cm 3 ) within a 0.1 m radius at all 29 sampling locations in 2018.In 2005, bulk density was determined directly from the core drill samples using the weight of the soil and the core drill's volume (except at C1 and C2 where BD samples were taken with metal rings from the wall of the soil pits).Samples for C and N analyses were dried at 40 • C for at least two days until weight constancy.BD samples were dried at 105 • C for two days.All samples were weighed, subsequently pestled in the lab, and then sieved through a 2 mm sieve to separate fine earth from skeleton (i.e., mostly gravel, roots, and plant residues).The dry weights of the skeleton were recorded.Fine earth subsamples were milled using a ball mill (MM200, Retsch, Germany).Prior to weighing another subsample into tin capsules, the milled samples were dried again at 40 • C to eliminate all humidity.C and N concentrations in the fine earth were determined with a Flash EA 1112 series elemental analyzer (Thermo Italy, former CE Instruments, Rhodano, Italy).For soil deeper than 0.6 m, some samples were below the detection limit (<0.1% C in the sample).
The BD (in kg m − 3 ) was calculated according to Eq. ( 2) as done in Roth (2006) after Schroeder (1992): where DW soil is the dry weight of the sieved fine earth in the cylinder (in kg), DW skel is the dry weight of the skeleton (particles >2 mm; in kg), and V 100 = 100 cm 3 = 0.0001 m 3 is the volume of the sampling cylinder.BD was then used to determine the C pool for a certain depth interval (C stocks) using Eq. ( 3): where C conc is the C concentration (in%) of the fine earth, and Δz is the depth interval in m (0.1 m).
Since the 2005 profile data were taken by soil horizons, BD and C concentration for the respective depths (0.1 m intervals) in 2018 were averaged.An average BD from C1 and C2 (measured in 2005) was calculated for every 0.1 m depth interval (0.1-0.2 m until 0.6-0.7 m depth).This BD was then used for depths >0.1 m to calculate C stocks (2005 and 2018), assuming that the BD values below 0.1 m had not changed between 2005 and 2018.
For the layer 0.1-0.2m, a special approximation was used for the 2005 data as this layer was influenced by ploughing (tillage down to 0.23 m at this site) and therefore was more similar to the top 0.1 m.C stocks in the 0.1-0.2m layer were estimated to be 84% of the C densities in the top layer (0-0.1 m; based on data taken in 2018).
For the depth of 0.2 m until a depth of 0.7 m (in 0.1 m intervals), we estimated the C density, i.e., the C concentration multiplied by the BD (in g cm -3 ), at a certain depth based on the C density in the top soil layer (0-0.1 m), using an exponential depth function by Hilinski (2001): where C(z) is the C density at a certain depth (g cm − 3 ), Cb is the C density at the bottom of a profile (g cm − 3 ), C0 is the C density at the surface (0-0.1 m, g cm − 3 ), and K is a scale constant (cm − 1 ).C0 changed depending on sampling location, while Cb was assumed to be the same for all locations, averaged from measurements at 0.6-0.7 m depth.
Values for K were determined by using the measured C densities from five profile measurements (one from 2005 and four in 2018) for depths from 0.2 to 0.7 m for every depth layer.The average K (0.04) was then used to calculate the C densities for the missing locations (in 2005 and 2018) at depths from 0.2 to 0.7 m in 0.1 m intervals.C densities (in kg m - 3 ) multiplied by depth interval (0.1 m) resulted in an estimation for soil C stocks per area (kg m − 2 ).
Mean C stocks per depth layer were calculated from measured (where available) and modeled data.To estimate the total C stock until 0.7 m depth for the two years (2005 and 2018), the C stocks from the seven soil layers were added, and the mean from all 29 locations was calculated.Uncertainties of the C stocks per depth layer (i.e., standard deviation) were calculated based on standard deviations of C concentration and BD measurements in the 0-0.1 m soil layer using uncertainty propagation rules for multiplications.In order to keep comparability to the flux data, we indicated a C loss from the soil as positive number, while a C gain would be negative, according to the micrometeorological sign convention (see Section 2.4; Fig. B4).

Modeling the effects of environmental variables and management on NEE and NBP
To investigate which variables were the most important drivers for NEE and NBP on different time scales, we used a multiple linear regression model with data at three temporal resolutions (half-hourly, daily, and yearly).To increase reliability of the model, only measured NEE data of best quality (flag=0) were used for the half-hourly model.In addition, concurrent records of auxiliary variables needed to be available together with NEE, which resulted in about 27% of the possible data or 75,738 observations for the half-hourly model.For the daily model, daily sums based on the gapfilled half-hourly fluxes were used, but periods with long gaps (see Section 2.3) were eliminated from the dataset as well as periods when environmental variables were missing, which resulted in a data availability of 87%.NEE and NBP on the yearly scale were also modeled to distinguish between the effects of climate and management.For the yearly models, the year of sward renewal ( 2012) was removed from the data set as this was considered a rare management event.Data from 15 annual observations were used in subsequent analyses.The variable selection for all models was based on the most important environmental drivers of NEE based on literature (e.g., Peichl et al., 2011;Wohlfahrt et al., 2008a), which at first included the vapor pressure deficit (VPD), soil water content (SWC), precipitation (PREC), air temperature (Tair), and PPFD.However, SWC and PREC were dropped in a second step, since no linear relationship with NEE was found (SWC), and the distribution of the PREC data was right-skewed (many zero values).In addition to environmental variables, agricultural management, specifically the frequent defoliation by mowing (and grazing), directly affects NEE (Fig. B5).Thus, we included a management variable in the half-hourly and daily models (with five factor levels: "No Management", "Mowing", "Grazing", "Fertilization", "Ploughing") for the dates when the grassland was mown or grazed.The time of management for the half-hourly data was assumed as 12:00 at the respective management date.As the destructive impact of these management events was expected to persist over several days after the event, the factor was set to "Mowing" or "Grazing" for 10 days after defoliation (i.e., until daily average NEE became negative again) for the half-hourly and daily data.For "Fertilization", a period of 5 days was used, while "Ploughing" was considered for 3 months.For the yearly model, we marked the years when the clover experiment took place (2015-2020) in the management factor.For the half-hourly and daily models, residuals showed a temporal autocorrelation.Therefore, we used generalized least square (GLS) models (Pinheiro et al., 2022), with an AR1 correlation structure grouped by year (assuming that the years do not correlate).For all predictors, a linear relationship to NEE or NBP was assumed, which resulted in the following (simplified) model equation: with NEE or NBP in g C m − 2 (half-hourly, daily, or yearly sums), VPD (hPa), Tair ( • C), PPFD (μmol m − 2 s − 1 ), PREC (mm; yearly model only), and mgmt as a factor, which was slightly different for the different temporal scales.β 0 is the intercept, β 1− 4 are the coefficients to fit for each predictor, and ε is a stochastic disturbance term.Predictors in the GLS models (half-hourly, daily) were evaluated depending on how much the AIC changed when removing the respective predictor from the model.The relative importance of each predictor in the linear models without a correlation structure (yearly) was calculated according to Lindeman et al. (1980) using the R package "relaimpo" (Grömping, 2006), considering how much each predictor contributed to the R 2 of the model.All statistical analyses were conducted in Rstudio (R version 4.1.2,R Core Team, 2021).

Environmental conditions and NEE over 16 years
The site Chamau experiences a temperate climate, with most of the annual precipitation falling in the summer months (May to August) and the highest temperatures occurring in July (Fig. B1).Environmental conditions varied over the 16 years of the study (Fig. 1a-c).During 2005 and 2020, monthly mean temperatures were between − 3.4 • C (February 2012) and 21.8 • C (July 2015).While sometimes monthly air temperature dropped below 0 • C, the soil almost never froze (soil temperature in 0.04 m >0 • C) (Fig. 1a).Only in January 2006, monthly soil temperature was below 0 • C. Similarly, half-hourly soil temperature values rarely dropped below 0 • C, only in 2006, 2008 and 2012 (data not  shown).The coldest year was 2010, with air temperature below average throughout the year (especially in January and December), and a mean annual air temperature of 8.6 • C (Fig. B6; Table B4).In contrast, the hottest year was 2018, followed by 2020, with mean annual temperatures of 11.0 • C and 10.6 • C, respectively (Table B4), and air temperature above average from April to September in 2018 (Fig. B6b).Over the 16 years, annual mean temperature increased by 1.26 • C (+0.079 • C yr − 1 , linear trend, p< 0.01, R 2 =0.4).Substantial snow cover (more than 10 days with snow) occurred in seven out of 16 winters, which is common for the site's low elevation of 393 m a. s. l. (Fig. 1a).
Precipitation showed a high variability and ranged from 0.1 mm in December 2016 to 241 mm in July 2014 (Fig. 1b).Periods with belowaverage summer precipitation were observed during summer 2018 and summer 2019 (Fig. 1b; Fig. B6b).The year 2018 was the driest year, with an annual precipitation of 906 mm, while 2016 was the wettest year, with an annual precipitation of 1351 mm (Table B4).Low soil moisture conditions usually occurred in summer (Fig. 1c) and coincided with high VPD.Such conditions, indicating potential drought stress to plants, occurred more often in recent years, e.g., in 2015, 2018, and 2019, which were also years with below-average summer precipitation (Fig. 1b, Fig. B6b) and above-average annual air temperature (Table B4).
NEE of the permanent grassland at Chamau showed clear annual cycles, with highest CO 2 uptake rates during spring and summer (i.e., lowest NEE), and comparably low CO 2 emission rates (i.e., high NEE) during the winter months (Fig. 1d).The high emissions and reduced CO uptake in 2012 were due to a renewal of the grassland sward (Merbold et al., 2014;Merbold et al., 2021).In the following years (2013-2015), monthly net CO 2 uptake rates were highest among all 16 years, reaching NEE of − 214 g C m − 2 month − 1 in April 2014 (Fig. 1d).The years 2016, 2018 and 2019 showed rather low net CO 2 uptake rates or even net CO losses during the summer months, with 2019 being the year with the second largest summer CO 2 loss (i.e., highest NEE) after 2012.

Annual net ecosystem exchange and carbon budgets
Temporal variations in NEE were highly influenced by management, triggered by environmental factors (Fig. 1), and consequently by the phenology of the grassland sward.The timing of the frequent cuts during the growing season usually coincided with fair-weather periods.As a result, high CO 2 uptake rates like in May 2013 abruptly decreased (Fig. 2a), approaching zero or even showing CO 2 losses over a couple of days (beginning of June 2013), until vegetation regrew, and daytime NEE became negative again (end of June, beginning of July 2013).Depending on the meteorological conditions, CO 2 uptake was drastically reduced, e.g., during the hot and dry summer season in 2018 (Fig. 2b).
Moreover, during mild, snow-less winter months (January to March), notable CO 2 uptake occurred, as seen in 2007, 2011, or 2014 (Fig. 2c), much in contrast to the CO 2 losses during the winter months in 2012.The year 2012 showed the highest CO 2 emissions in the first half of the year of all 16 years, due to the renewal of the sward in February, i.e., when the soil was ploughed, and the sward reseeded (Fig. 2c).End-ofthe year NEE budgets were highly variable, ranging from − 670 g C m − 2 yr − 1 in 2014 to +189 g C m − 2 yr − 1 in 2012 (Table A1).Generally, the annual NEE budget was negative, i.e., the grassland fixed more CO 2 than it respired, in 15 out of the 16 years, except for 2012 (Fig. 2).
Management practices changed over the 16 years of this study (Fig. 3a).Management intensity was very high in the first four years of the measurements (2005)(2006)(2007)(2008), with up to seven cuts per year and the same number of fertilizer applications.In the following years (2009-2020), cutting frequency dropped to four to five times per year (Table B1).Mowing and fertilization with slurry were by far the most frequent management events over all years, while fertilization with manure and mineral fertilizer was very rare.Further reoccurring management events were grazing with sheep during the winter seasons.During 2012, the grassland was completely renewed (Fig. 3a).Starting in 2015, the management changed due to the clover experiment, in particular the fertilization regime was affected (no fertilization anymore on parcel B2; B3).
Therefore, the components of the annual C budgets showed a high inter-annual variability over the 16 years of this study (Fig. 3b).
Fertilizer inputs (C imports) were rather high in the earlier years and the renewal year 2012, while they decreased after the renewal (2013-2020), which was also due to the clover experiment starting in 2015 (Fuchs et al., 2018).NEE showed the highest CO 2 uptake in the years after the renewal (2013-2017), well above the long-term average, except for the very wet year 2016 (1351 mm, 217 mm (19%) above the 2005-2020 average).On the other hand, NEE was substantially reduced during years with extremely warm and dry summer seasons, i.e., and 2019.C exports were below the long-term average before the renewal (2005-2011; sward was originally sown in 1998, see Section 2.1), but well above average for all the years after the renewal (2013-2020), except for 2018 and 2019.C exports were lowest in 2010, which was also the coldest year of the 16-year record and parcel B1 was grazed more intensively than on average (Table B1).As found in earlier studies (Soussana et al., 2007;Ammann et al., 2020), there was only a weak relationship between annual C exports as harvests and NEE as net CO 2 uptake (data not shown; R 2 = 0.4).Since NEE is the difference between GPP and Reco, C exports can exceed NEE in magnitude, as the C exports do not fully reflect all respiration terms (e.g., soil respiration) included in NEE.
As a result of all three component fluxes, NBP showed the highest sink behavior (− 369 g C m − 2 yr − 1 ) just before the renewal (2011; Thus, from the beginning of the measurements in 2005 to end of 2020, the sum of all annual NBP budgets of the Chamau grassland showed a C sink (Fig. 3c).The special event of the sward renewal in 2012 was clearly reflected as a temporary, pronounced increase in cumulative NBP, while the following three years (2013)(2014)(2015) were small C sinks, keeping the overall trend of a C sink.However, starting with the very wet year 2016, we observed a clear change in the cumulative NBP trend, with small or even positive NBP values (Fig. 3c).Thus, the grassland acted as a C source during those last years.This change in trend of cumulative NBP coincided with the period of the clover experiment when parcel B in the north of the station was oversown with a clover seed mixture and did not receive any fertilizer anymore (C imports; see Fig. 3b).The two hot and dry summer seasons in 2018 and 2019 were also part of this increasing NBP trend.At the end of 2020, the grassland at Chamau had sequestered 1119 g C m − 2 (70 g C m − 2 yr − 1 ) over a period of 16 years (Fig. 3c, Table A1).

Comparison of long-term NBP to changes in soil carbon stocks
To validate this result, soil C stocks were sampled twice within the main footprint of the flux station (Fig. B4), in 2005 and 2018.For the top 70 cm of the soil profile, soil C stocks were 12.603 kg C m − 2 in 2005 and 12.245 kg C m − 2 in 2018 (Table 2).Flux data combined with management data suggested that the Chamau grassland sequestered 1422 g C m − 2 (NBP; on average 102±110 g C m − 2 yr − 1 ) between 2005 and (Table A1).However, there was no significant change in the measured C concentration of the top 0.1 m of the soil (paired t-test, n = 29, p = 0.17), which showed a C concentration of 3.1% in 2005 and 3.0% in (Fig. 4), and respective BD values of 1.061 g cm − 3 and 1.058 g cm − 3 .Relative uncertainties of the C stocks in the 0-0.1 m soil layer were estimated to be 16.7% (±549 g C m − 2 ) in 2005, and 20.4% (±636 g C m − 2 ) in 2018 (see also error bars in Fig. 4).
Soil C data coverage for depths >0.1 m was scarce in 2005 (Fig. 4), but on average, C stocks from 0 to 0.3 m soil depth were similar in and 2018, with 7.784 kg C m − 2 in 2005 vs. 7.414 kg C m − 2 in (Table 2).Taking into account the entire soil profile (0-0.7 m soil

Influence of meteorology and management on NEE and NBP
Using multiple linear regression models, we identified the most important drivers of NEE and NBP at the Chamau grassland site at different time scales (half-hourly, daily, yearly).In the half-hourly model, the most important driver of NEE was PPFD (Table 3), indicated by the large difference in AIC when removing this variable from the model.The explanatory power of VPD and management were similar, as seen in the similar ΔAIC values, while Tair explained very little of the NEE variations, shown by the low ΔAIC value (Table 3).In summary, these four variables explained 56% of the variance in NEE (R = 0.563; Table 3).
For the daily model, PPFD was still the most important predictor (highest ΔAIC; Table 3), followed by management and then by Tair.All management events had a positive effect on NEE (Table 3), with the largest effects by mowing and ploughing.E.g., mowing increased NEE by 0.2 g C m − 2 and 2.7 g C m − 2 on the half-hourly and daily time scales, respectively, compared to times with no management (Table 3).However, the overall explanation for the daily NEE model was only 45% (R 2 ) compared to 56% for the half-hourly model.
On the annual scale, both NEE and NBP were analyzed for environmental and management effects (Table 4).For NEE, neither effects from meteorological nor the management variable (which included information about the clover experiment) were significant, and the model fit was very weak (R 2 =0.082,Table 4).On the other hand, a significant positive effect (p = 0.006) of the clover experiment on NBP was found, indicating that in the years of the clover experiment (i.e., increased clover fraction and reduced fertilization), NBP was positive, indicating C losses (also seen in Fig. 3).Management explained approximately 53% of the variation in NBP, followed by Tair which was the second most important predictor (12%).R 2 was substantially higher for the NBP compared to the NEE model, about 68% of the variation in NBP could be explained by mainly the management and annual air temperature.

Comparison to other grassland C budgets
Overall, NEE and NBP showed a large inter-annual variation over the 16 years investigated.NEE ranged from − 670 to 189 g C m − 2 yr − 1 , while NBP ranged from − 369 to 253 g C m − 2 yr − 1 (Fig. 3).The 16-year average NEE and NBP were − 284±115 g C m − 2 yr − 1 and − 70±106 g C m − 2 yr − 1 , respectively.Peichl et al. (2011) reported a similar value for average NEE over six years (− 277 g C m − 2 yr − 1 ), but with much smaller variation among years (annual NEE was between − 245 to − 352 g C m − yr − 1 ).A lower NBP value (i.e., higher C sinks) was reported by Ammann et al. (2007) (− 147±130 g C m − 2 yr − 1 , 3 years) for a similarly managed (cut) grassland in the Swiss lowlands, while Soussana et al. (2007) reported a comparable average NBP of − 104±73 g C m − 2 yr − 1 for multiple European grasslands (9 sites, 2 years).Rogger et al. (2022) calculated an average NEE of − 357±76 g C m − 2 yr − 1 and an average NBP of − ±80 g C m − 2 yr − 1 for an extensively managed Swiss grassland at higher elevation (Früebüel, 1000 m a.s.l., 15 years), despite similar C imports but lower C exports.Thus, while our study supports the magnitudes of NEE and NBP reported before, it covered a wider range of environmental conditions and different management practices, suggesting that measurements over a long period are needed to represent the C sink strength of any given managed site.4)).The data sampled in 2005 (Roth, 2006) are displaced vertically for better visibility.

Table 3
Model output predicting half-hourly and daily NEE based on meteorological and management variables.Model predictors for two generalized least square (GLS) models are listed in the column "Variable".mgmt is a factor with five levels ("No Management", "Mowing", "Grazing", "Fertilization", "Ploughing").Estimated effects ±95% confidence intervals on NEE (in g C m − 2 ) based on half-hourly (30 min) and daily averages for each predictor are shown.Management effects describe the effects of the specific management on NEE compared to a reference level ("No Management").All effects were significantly different from zero (p-value < 0.001), except for the one indicated by "ns" (p > 0.05).The ΔAIC columns show the difference between AIC of the full model and the model with the respective predictor removed.The higher the ΔAIC, the stronger the explanatory power of the respective predictor.In the last two rows, the AIC and R 2 for the half-hourly and daily models are shown.*ΔAIC is the same for all factor levels since the whole factor was removed from the model.
Uncertainties in the annual C budgets (Table A1) were in a similar range as in earlier studies (Ammann et al., 2007;Aubinet et al., 2012;Rutledge et al., 2015).NEE contributed the largest uncertainty of the three components of the NBP budget (C exports, C imports, NEE), which was mostly due to the uncertainty introduced by accounting for a wide range of u * thresholds to filter the data rather than due to the random uncertainty of NEE, which was smaller (see Appendix A).However, using the median of the u * threshold distribution as done here is thought to give the most reasonable estimate of annual NEE and thus NBP.
Self-heating of the open path IRGA could furthermore contribute to NEE uncertainty but was not addressed for several reasons: (1) No universal correction is available for inclined mounted IRGAs as the selfheating correction by Burba et al. (2008) was developed for vertically mounted instruments.(2) For inclined IRGAs, only a fraction of the correction is required, whereby the fraction (scaling term) can be determined from parallel measurements using open and (en)closed path IRGAs (Kittler et al., 2017;Deventer et al., 2021).The fraction must be determined empirically and can range from 0 (no self-heating effect) to 1 (full self-heating correction).Since no concurrent measurements of two IRGA systems were available at the Chamau site, the fraction and its temporal dynamics (e.g., daytime-nighttime differences, see Kittler et al., 2017) remain unknown.It was shown previously that the self-heating effect can be negligible at grasslands (e.g., Haslwanter et al., 2009, see their Fig. 2).However, using the correction with fractions from other (grassland) sites is not recommended (Deventer et al., 2021).
(3) Applying the self-heating correction without these parallel measurements can even increase the flux uncertainty (Deventer et al., 2021) and might produce fluxes that are further away from the "true" flux than the uncorrected fluxes (Wohlfahrt et al., 2008b, see their Fig. 3).Since our annual NEE as well as our 16-year average NEE values are consistent with other studies, we are confident about their calculation which followed current state-of the art methodology (Aubinet et al., 2012;Wutzler et al., 2018;Pastorello et al., 2020).

Most important drivers of NEE and NBP
Modeling the combined effect of environmental drivers and management on NEE using multiple linear regression, we found PPFD being the most important predictor at the shorter time scales (half-hourly and daily; Table 3), supporting earlier studies (e.g., Wohlfahrt et al., 2008a).NEE is difficult to model because it is the difference between GPP and Reco, including different physiological processes (photosynthesis, heterotrophic and autotrophic respiration).GPP and Reco in grasslands have been shown to mainly depend on photosynthetically active radiation and temperature, respectively (Gilmanov et al., 2007), but also to be coupled with each other (e.g., with increasing GPP also Reco increases).
In addition, management temporarily but drastically alters NEE in cut or grazed grasslands, as seen in our driver analyses at the half-hourly and daily time scales (Table 3).After PPFD, management turned out to be the second important driver for NEE, particularly at the daily times scale, albeit with a much lower importance.This was probably due to the short-term influence of the defoliation events on NEE, which was most pronounced on a daily time scale when daytime-nighttime differences in NEE (captured at half-hourly scale), did not play a role anymore.Thus, it is recommended to consider the combined effects of environmental drivers and management when modeling NEE on shorter timescales (half hourly, daily).
On the annual scale, our model performed poorly for NEE, but very well for NBP.Management, i.e., the clover experiment with reduced fertilization C imports, had a significant effect on NBP (p = 0.006), while environmental variables such as Tair, VPD and annual PREC were less important.Although environmental and management effects were closely linked at shorter times scales (e.g., daily, weekly), as mowing usually occurs during periods of fair weather, such relationships were less relevant on annual time scales.Thus, our results on the annual time scale clearly showed that the positive trend observed in NBP in the recent years (i.e., Chamau grassland as a C source) was mostly due to reduced C imports rather than due to climatic conditions.

Effects of management and climate on NBP
The C budget (NBP) of this permanent grassland showed large variations during the 16 years investigated (Fig. 3b), comparable to earlier long-term studies (Ammann et al., 2020;Rogger et al., 2022).We observed a general relationship of annual NBP to C imports, with C sink Table 4 Model output predicting annual NEE and NBP based on meteorological and management variables.Estimated effects ±95% confidence intervals on annual NEE and NBP (both in g C m − 2 ) for each predictor are shown.The management factor had two levels, i.e., the clover experiment (years 2015-2020) and years with normal management (2005)(2006)(2007)(2008)(2009)(2010)(2011)(2012)(2013)(2014).None of the effects was significantly different from zero (p-value > 0.05, "ns"), except for management effects on NBP.The ΔAIC columns show the difference between AIC of the full model and the model with the respective predictor removed.Negative values of ΔAIC suggest a low explanatory power of the respective predictor, i.e., the model's AIC might improve (decrease) when removing these predictors from the model.The relative importance (Lindeman et al., 1980), i.e., how much each predictor contributes to the R 2 (values sum up to R 2 ) of the model, is also reported.In the last row, the AIC and R 2 for the two models are shown.A1).The triangle indicates the year of the sward renewal (2012).The black curve was fitted to all data points (except for the renewal year 2012), using a second-degree polynomial function.A negative sign convention for C and N imports was used (see Eq. ( 1)).
behavior (negative NBP) when C and N imports were high, and C loss behavior (positive NBP) when imports were low (Fig. 5).In contrast, even though NBP was calculated from C imports and C exports, we did not find a relationship of NBP and C exports (data not shown).Between 2005 and 2011, high C and N imports with organic fertilization made the site a stable C sink.However, during the years 2016 to 2020, characterized by low C imports during the clover experiment the site was a C source, despite a broad variety of environmental conditions (Table B4): a very wet year (2016), a very productive year (2017), two extremely hot and dry summers (2018 and 2019), and a warm year (2020).These results suggest that maintaining the C sink at this site will require high C imports from organic fertilization.However, high imports of particularly slurry, but also manure (i.e., C imports with different water contents) might be accompanied by high N 2 O emissions (Flechard et al., 2007), although some studies reported increased net CO 2 uptake (NEE) being able to compensate or even exceed these high N 2 O emissions (reviewed in Hörtnagl et al., 2018).But maintaining such a high net CO 2 uptake seems increasingly challenging, since frequency and intensity of hot and dry summers are projected to increase in the future (CH2018, 2018).Our measurement period already included four of the "Top 5″ summer drought and heatwave years (2018,2020,2015,2011) of the last 40 years on the Swiss Plateau (Scherrer et al., 2022).While the summer drought in 2018 actually led to an increase in CO 2 uptake compared to pre-drought years at two less intensively managed Swiss grasslands at higher elevations, a reduced productivity was observed at the low elevation Chamau site (Gharun et al., 2020;our Fig. 3).On the other hand, higher winter temperatures and less snow cover than currently might favor a longer growing season with increased net CO 2 uptake (Zeeman et al., 2017).Nevertheless, the long-term C budget as seen in the NBP trend over time (Fig. 3, Fig. 5) was more strongly affected by C imports than by NEE and C exports, suggesting that the effect of climate was less pronounced than the effect of management (as also shown in Table 4).
Our C budget (NBP) was estimated at the field scale and did not consider further on-and off-site C fluxes nor other GHGs at the farm scale (Soussana et al., 2010).However, the C imports as organic fertilization (slurry) to the grassland originated from farm animals, which were fed with the feed from the grassland under study, thus enabling a rather closed carbon and nutrient cycle at field level.In the absence of these C imports to the grassland, the long-term field scale NBP might even shift from a C sink to a C source (Fig. 5).Carbon sequestration in agriculture has been identified by the IPCC (2023) as a promising mitigation option against further climate change.Thus, implementing sustainable management regimes such as closed carbon and nutrient cycles at field and farm levels will be decisive to secure and potentially increase the C sinks in grassland also in the future.

Effect of the sward renewal
In the renewal year 2012, the grassland was a large C source and lost 122 g C m − 2 (NBP in 2012), which was more than compensated in the two following years (NBP in 2013: − 249 g C m − 2 yr − 1 , 2014: − 254 g C m − 2 yr − 1 ; Table A1).In contrast to Ammann et al. (2020), who found an intensively managed cut grassland to be a C source until the second year after sward renewal, the grassland in our study was a C sink already in the first year after renewal.Differences in the results might be related to different environmental conditions in the years after the renewal (2008( -2010( in Ammann et al., 2020( vs. 2012( -2014 in this study) in this study).These years were favorable for the Chamau site (annual precipitation close to average, warm winter in 2014; Table B4, Fig. B6), promoting a good establishment and growth of the sward.Highest CO 2 uptake was observed in the two years after the renewal (Fig. 2), also the yields (i.e., C exports) were above average.These results are consistent with Creighton et al. (2016) who found that sward renewal increased DM yield compared to an old permanent pasture.However, these beneficial effects on yield can be counteracted by the loss of nutrients (C and N) and might only persist on a short timescale after the renewal (Kayser et al., 2018).In our study, the sward renewal clearly showed a short-term beneficial effect on NEE and yields, and subsequently on NBP in the three years after the renewal (Fig. 3).

Soil c stock changes and NBP
Soil C stocks in the entire soil profile (0-0.7 m depth) showed a small, insignificant decrease of 0.358 kg C m − 2 over 13 years (from 12.603 kg C m − 2 to 12.245 kg C m − 2 ; 27.5 g C m − 2 yr − 1 on average), while NBP showed an increase of 1.422 kg C m − 2 (102 g C m − 2 yr − 1 ) sequestered by the grassland during the same time (Table A1, Fig. 3c).Often, the C sink/ source behavior measured by NBP shows a higher C sequestration than the one measured in soil C stock changes (Jones et al., 2017;Leifeld et al., 2011;Soussana et al., 2010;this study).The possible underestimation of C losses (e.g., due to leaching or grazing) not captured by the NBP estimate might be one of the reasons for these differences, although these are typically considered small (Kindler et al., 2011;Zeeman et al., 2010).Other discrepancies between both estimates (besides uncertainties in NEE and thus NBP mentioned in Section 4.1) could be due to (1) methodological differences of the soil sampling approach, (2) C stock modeling, (3) spatial heterogeneity, as well as (4) management.
(1) The 2005 soil sampling was very well replicated for the top 0.1 m, but only included two profiles to deeper depths (Roth, 2006).Roth (2006) took mixed samples from four soil cores per sampling location (0-0.1 m), while we took samples from one soil core per location but extended the depth coverage at the 29 locations.On the other hand, differences in BD due to different methods were negligible (1.061 vs. 1.058 g cm − 3 , 2005 vs. 2018, respectively, average from 29 locations).
(2) The equation used to estimate soil C stocks (Hilinski, 2001) is widely used in the CENTURY model and relies on common assumptions, namely an exponential decrease of C concentrations with depth, and on measured data of soil C concentrations at the top and the bottom of the soil profile modeled.While more soil data would clearly have been beneficial, our estimate of soil C stock changes is based on a well-tested and reliable model and can rather be considered a conservative estimate.
(3) Soils are inherently heterogenous, even in agricultural ecosystems (Maillard et al., 2017), similar to the variation in aboveground biomass (see Appendix A).Soil management as well as grazing contribute to this spatial heterogeneity.We accounted for this by taking samples across the footprint area (both in 2005 and 2018), with an emphasis in 2018 to better represent the deeper soil.Still, the area covered by the soil sampling was smaller than the footprint area NBP was integrating.(4) Management varied considerably between 2005 and 2018.For example, the sward was renewed in 2012, i.e., the soil was ploughed, and although changes in BD were negligible, soil C stocks in the upper 0.3 m, the typical ploughing depth, did not change significantly, suggesting the preservation of soil C stocks (Table 2).In addition, annual C imports decreased after the renewal (Fig. 5).Thus, these well-known uncertainties of soil C stock measurements limit our validation assessment (Maillard et al., 2017).
Nevertheless, the soil assessment showed a near C neutral grassland (small decrease of 27.5 g C m − 2 yr − 1 ), while the NBP estimate suggested a small C sink to near-neutral conditions (− 102±110 g C m − 2 yr − 1 from 2005 to 2018, − 70±106 g C m − 2 yr − 1 from 2005 to 2020).Considering that the grassland was established in 1998, the observed stronger C sink in the earlier years (NBP; Fig. 3b) could also have been a response to the previous land use change.According to the IPCC Tier 1 approach, the effect of land use change (e.g., conversion from cropland to grassland) on C stocks is expected to last about 20 years (default value, shorter periods can be used; IPCC, 2019; Section 6.3 Land converted to grassland).Poeplau et al. (2011) suggested a longer period (>100 years) during which grasslands in the temperate zone might be in a transition phase of being C sinks until a new equilibrium is reached (based on 24 studies), while Kämpf et al. (2016) detected an increase in soil C stocks only during the first 16 years after conversion from cropland to grassland in a global meta-study (for this particular analysis: > 50 studies).However, such a 20+ year transition phase should have been observed in the soil C stocks as well, not only in the NBP estimates, and thus cannot explain the apparent discrepancy between both methodologies.Moreover, Kämpf et al. (2016) reported that no significant increase in soil C stocks occurred after land use abandonment when the initial soil C stock was larger than 5 kg C m − 2 .Our soil C stocks in 2005 were 12.6 kg C m − 2 , thus well above this threshold from their meta-study (with 69 studies).The impact of such a transition period cannot be modeled reliably either since the starting point, i.e., the soil C stocks in 1998, is unknown.Assuming that the measured soil C stocks 12.6 kg C m − 2 in 2005 were a result of C sequestration due to land use change between the conversion in 1998 and our first soil C stock measurements in 2005, and applying maximum C sequestration rates of 0.1 kg C m − 2 (Kämpf et al., 2016), our Chamau site could have sequestered at the most 0.8 kg C m − 2 in these eight years, resulting in an initial soil C pool of 11.8 kg C m − 2 for the arable land use, again well above the 5 kg C m − 2 value above which no C sequestration was observed.Based on these considerations, spatial heterogeneity of the soilboth horizontally and verticallyseems to be the most reasonable explanation for the difference between both C sequestration estimates.
Relating our results to the 4‰ initiative (www.4p1000.org;Minasny et al., 2017), which aims for an annual C increase of 4‰ in the topsoil (0-0.3 or 0-0.4 m) to counteract anthropogenic CO 2 emissions, raises doubts on the feasibility of this initiative, even in temperate grassland.To meet the initiative's goal, an annual increase of 0.030 kg C m − 2 (4‰ of 7.414 kg C m − 2 ) in the topsoil (0-0.3 m) would have been necessary, in contrast to the observed (albeit insignificant) annual decrease (0.028 kg C m − 2 yr − 1 in the top 0.3 m) at the Chamau grassland site.Nevertheless, in order to reverse the observed C loss into a C gain in the topsoil as well as to ensure productivity, very large C inputs would be needed.This is clearly reflected in our NBP estimates, which strongly depended on C and N imports (Fig. 5, Table 4).Where these C imports originate from will determine whether field-and farm-scale grassland management acts as C source, can be C neutral, or even acts as a C sink.

Conclusions
Using two independent methodologies, the intensively managed, permanent grassland was a small C sink to C neutral (based on estimates of NBP) or C neutral (based on soil C stock changes in 0-0.7 m).Our study clearly demonstrated the importance of covering multiple years with different management events, when assessing the C sink/source strength of a site.Accurate recording of management data (dates, exact yield amount, fertilizer amount) as well as consistent assessment of soil C stock changes can reduce uncertainties in the long-term C budget.Even though our study period included a sward renewal with high C losses and a clover experiment, the soil C stocks did not significantly change over 13 years.We observed a trend towards a C source (NBP) in recent years, which was mainly attributed to management (decreasing C imports) rather than to extreme weather conditions affecting NEE (summer droughts and one extremely wet year).High C imports contributed and maintained the negative NBP (i.e., the C sink), relevant for climate change mitigation and the 4‰ initiative.However, also other GHGs need to be considered, such as CH 4 and particularly N 2 O, in order to develop climate change mitigation strategies for permanent grassland.Avoiding sward renewal, while keeping a productive sward, as well as additional C imports via organic fertilization seem to be critical components for such mitigation strategies.Maintaining an even small grassland C sink will remain a challenge, particularly under future climate conditions.a u * threshold from the 5th percentile of the u * threshold distribution (0.027 -0.058 m s − 1 over all seasons and years; NEE_U05) as well as using a u * threshold from the 95th percentile of the u * threshold distribution (0.047 -0.115 m s − 1 over all seasons and years; NEE_U95).Other studies (e.g., Pastorello et al., 2020) used the 84th and 16th percentile, which would reduce our NEE uncertainty.So, we believe that our uncertainty estimate is quite conservative and yields a rather high value compared to other studies.Uncertainty from using different u * thresholds was estimated by taking the difference between annual sums of gapfilled NEE_U95 and NEE_U05 divided by two (Rutledge et al., 2015;Pastorello et al., 2020).The difference of the NEE used in this study (NEE_U50: using the median value for u * from the u * threshold distribution) to NEE_U05 and NEE_U95 is usually asymmetric, but for simplification reasons, we used a symmetric range to report the annual uncertainties.
Assuming that the random uncertainty and the u * threshold uncertainty do not correlate, we estimated the annual NEE uncertainty to be where σ NEE is the combined uncertainty for NEE from the random uncertainty σ rand and the u * threshold uncertainty σ u * (also see Pastorello et al., 2020, their Eq. 5).σ NEE was in the range of 53-140 g C m − 2 yr − 1 .σ rand did not vary much among years (35-68 g C m − 2 yr − 1 ) and was in a similar range compared to other studies (Aubinet et al., 2012).For σ u * , the range was larger (22-122 g C m − 2 yr − 1 ).The cumulative NEE for the full 16-year period using different u * thresholds (5th, 50th and 95th percentile of u * threshold distribution) resulted in different values: − 5589, − 4542, − 3040 g C m − 2 for NEE_U05, NEE_U50, and NEE_U95, respectively.However, the sign of NEE never changed over time.Thus, CO 2 uptake exceeded CO 2 emission at the site over the 16 years, independent of the u * threshold used.
(3) Uncertainties for the management C fluxes (i.e., C exports and C imports) were estimated based on information from the farmer (2005-2014) as well as own measurements (2015)(2016)(2017)(2018)(2019)(2020).For 2005 to 2015, we assumed an uncertainty of 10% (after Ammann et al., 2009;Zeeman et al., 2010) for the C exports (σ Cexports ) for each harvest event, which we summed up for annual estimates and for the full 16-year period (by taking the square root of the sums of variances, similar to Eq. (A1), with one variance value per harvest event).For the years 2015 to 2020, manual yield measurements were available, for which we estimated the standard deviation of the C exports from 10 samples per parcel for each harvest event (representing the spatial variability).For each parcel, we calculated an annual uncertainty by adding the variances and taking the square root (similar to Eq. (A1)).We averaged these annual standard deviations from the two parcels by using the same weights from the footprint contributions as for the NBP calculation (Table B3).The C concentrations in the biomass samples varied between 41% and 45% from 2015 to 2020.Since the C concentration measurements were very accurate (±0.5% maximum deviation from replicated measurements; this study), this uncertainty seemed negligible.However, the spatial variability of the harvest amount (standard deviation of the 10 replicated samples) can be high (±29% of the C export per harvest on average; 2015-2020), and is thus considered the most important source of uncertainty of C exports.
For the C imports via fertilizer, we distinguished between liquid (organic) fertilizer inputs (via slurry), mineral fertilizer inputs, and solid organic inputs via manure.Following Zeeman et al. (2010), we assumed an uncertainty of 15% for the C imports via slurry per application.C concentration of the slurry was 40±3.5% (±SD) (in the dry matter) on average (2005-2020, all samples from both parcels).Since the application of mineral fertilizer and solid manure was very rare, their uncertainties were considered negligible.Annual uncertainty of C imports was calculated similar to σ NEE or σ Cexports , i.e., by summing up the squared standard deviations (15% of the slurry amount per slurry application, squared) over the respective period, taking the square root and averaging the uncertainties from the two parcels according to their footprint contributions.
The annual uncertainty for NBP was then calculated using Gaussian error propagation by combining the annual uncertainties of the C budget components according to: where σ Cexports is the annual uncertainty of the C exports and σ Cimports the annual uncertainty of the C imports.The 16-year uncertainties for the cumulative sums were calculated by summing up the 16 squared annual uncertainties of the respective C fluxes, and then taking the square root.Dividing this value by the square root of 16 (=4), an average uncertainty per year was determined (similar to Buysse et al., 2017).

Table A1
Carbon fluxes for the years 2005 to 2020 of the permanent grassland at Chamau.C exports, C imports as well as NEE and NBP are presented.NBP cum gives the cumulative NBP sum over the years; management data from both parcels were combined using their footprint contributions (Table B3).The years 2005 and 2018 are set in bold as soil samples were taken during these years.Average values (AVG) are also given.For detailed calculations of the uncertainties (±), see Appendix A.

Appendix B. Additional figures and tables
. Average annual temperature and average annual precipitation sum are listed in black at the top.Maximum measured temperature (from half-hourly values), mean daily maximum temperature of the warmest month (July), mean annual temperature amplitude, mean daily minimum temperature of the coldest month (January) and minimum measured temperature are listed as black numbers on the left.Petrol shaded areas on the x axis indicate the months where the mean daily minimum temperature is lower than 0 • C, and petrol hatched areas the months with absolute minimum temperature of 0 • C (frost possible).The number in the middle of the x axis states the mean duration of consecutive frost-free days (when temperature does not drop below 0).

Table B1
Management events per parcel for the years 2005 to 2020.The number of events per management activity is indicated in round brackets.For grazing, the grazing duration in days is given (e.g., 30 d per parcel and year).N and C amounts imported via fertilization and exported via harvests (only exports via mowing are considered, grazing is excluded) as well as total yields (as biomass per area) from mowing are given.All imports and exports are given in their standard unit (kg ha − 1 ) for easy comparison with other studies. .

Table B2
Detailed information about management events during the 16 years of measurements per parcel.The date of the management (year-month-day), the parcels managed, the management activity, the amount of the yield exported, fertilizer applied, number of animals and grazing duration (in days) as well as the amount of C and N in the biomass exported or the fertilizer imported are reported.All imports and exports are given in their standard unit (kg ha

Fig. 1 .
Fig. 1.Environmental conditions and net ecosystem CO 2 exchange at the permanent grassland site Chamau for the years 2005 to 2020.Monthly averages of (a) air temperature (solid line), soil temperature (0.04 m depth, dashed line), and number of days with snow per month calculated from albedo (snow days, gray bars, right axis), (b) monthly precipitation sums (PREC), (c) monthly averaged water vapor pressure deficit (VPD, solid line), and soil water content (SWC, 0.05 m depth, gray dashed line, right axis), as well as monthly sums of (d) gapfilled net ecosystem CO 2 exchange (NEE) over the study period 2005 to 2020 are given.Gray shaded lines in the background of (b) are the long term (16 years) average monthly precipitation sums; the black line in (d) shows the running mean over five monthly sums of NEE.
Fig. 2. Temporal variations in NEE and cumulative NEE at the permanent grassland site Chamau for the years 2005 to 2020.Half-hourly NEE for the year after the sward renewal but before the clover experiment (2013; a) and for a year with a very hot and dry summer season (2018; b) are presented.Dotted lines indicate dates when the grassland was mown (a, b).Cyan colors indicate a net uptake of CO 2 , brown colors a net emission of CO 2 .The x axis gives the annual cycle, while the y axis gives the time of the day (a, b).The cumulative sums of half-hourly NEE per year are also given (c).

Fig. 3 .
Fig. 3. Management events (a), annual C budgets (b), and cumulative NBP (c) at the permanent grassland site Chamau for the years 2005 to 2020.Management dates for both parcels (A and B) are indicated in the top panel (a).Sowing includes resowing and oversowing, soil cultivation includes harrowing and rolling, pesticide includes mostly herbicide applications and one molluscicide application in 2012.Starting in 2015, a field experiment (clover experiment) was carried out, in which the northern parcel B did not receive any fertilizer anymore and was oversown with clover.C exports (by harvests), C imports (by fertilization), net ecosystem CO 2 exchange (NEE), and net biome production (NBP) as annual sums over the 16 years investigated are given (b).Colored lines indicate the 16-year average.Positive values indicate C emission, while negative values indicate C uptake.Numbers for all C fluxes and uncertainties are listed in the Appendix (TableA1).The black line in (c) is based on cumulative sums of NEE (from half-hourly fluxes).Management C imports (with fertilization; purple) and C exports (by harvests; light green) from the two parcels combined (see text) were added to the NEE cumulative sums at the dates they took place.

Fig. 4 .
Fig. 4. Soil C stocks for different depths (0-0.7 m) and two years (2005 vs. 2018).Average C stocks (based on all datapoints; blue and black vertical lines for 2005 and 2018, respectively) and related uncertainties due to spatial variation (horizontal error bars; see Section 2.5) for 0.1 m depth intervals are shown.Values for measured C stock estimates from 29 profiles are shown with triangles, modeled estimates as dots (based on Eq. (4)).The data sampled in 2005(Roth, 2006) are displaced vertically for better visibility.

Fig. 5 .
Fig. 5. NBP in relation to C imports for the years 2005 to 2020.C imports varied over the years due to different management practices (for details, see Fig. 3 and TableA1).The triangle indicates the year of the sward renewal (2012).The black curve was fitted to all data points (except for the renewal year 2012), using a second-degree polynomial function.A negative sign convention for C and N imports was used (see Eq. (1)).

Fig. B1 .
Fig. B1.Climate chart afterWalter and Lieth (1967) for Chamau for the years 2005 to 2020.Mean monthly temperature (red line) and mean monthly precipitation sum (blue line) are shown.Blue vertically hatched area shows the humid period of the year, blue filled area indicates months with precipitation greater than 100 mm (axis scale changes > 100 mm).Average annual temperature and average annual precipitation sum are listed in black at the top.Maximum measured temperature (from half-hourly values), mean daily maximum temperature of the warmest month (July), mean annual temperature amplitude, mean daily minimum temperature of the coldest month (January) and minimum measured temperature are listed as black numbers on the left.Petrol shaded areas on the x axis indicate the months where the mean daily minimum temperature is lower than 0 • C, and petrol hatched areas the months with absolute minimum temperature of 0 • C (frost possible).The number in the middle of the x axis states the mean duration of consecutive frost-free days (when temperature does not drop below 0).

Fig. B2 .
Fig. B2.Location of the parcels (A and B) for three periods during 2005 to 2020.The red and blue shaded area indicate the area of the farmer's parcels, names are given in the middle of the parcels.During 2005 to 2013, both parcels were managed similarly (management events only some days apart); during 2014 to 2018, the parcel sizes and their management changed.The southern parcel (red) was managed as before, while the northern parcel (blue) had a higher legume fraction and received no fertilizer any longer (experiment started in 2015, parcel border already changed in 2014); in 2019, the southern parcel was reduced in size, while the management was kept as in 2015 to 2018.The white asterisk marks the location of the eddy covariance station.North arrow and scalebar (units in m) are given in the lower part of the plots.Orthophoto is from the Federal Office of Topography swisstopo.

Fig. B3 .
Fig. B3.Footprint climatology for three periods during 2005 to 2020.Contour lines represent the area within the footprint where 90 (outer), 80, 70, 50 and 10% of the signal measured at the station (white asterisk) originate, calculated with the Kljun et al. (2015) footprint model.Contributions to the signal measured at the station are shown as well (parcel A in red, parcel B in blue).The wind rose in the upper right corner of each panel gives the wind speed (in m s − 1 , shown in the right panel) and wind direction distribution for the respective time periods (low wind speed and low turbulence situations are underrepresented since the footprint calculation is only valid for u * > 0.1 m s − 1 ).Percentages given below the wind roses (in white) represent the occurrence of calms (wind speed < 0.3 m s − 1 ).Orthophoto is from the Federal Office of Topography swisstopo.

Fig
Fig. B4.Soil sampling locations (a), soil C stocks (0-0.7 m) in 2018 (b), and difference in soil C stocks (0-0.7 m) between 2005 and 2018 (c) at the permanent grassland site Chamau.The 29 sampling locations for the top 0.1 m of the soil are shown (all dots) (a).In 2005, only the two soil profile pits (C1; C2) (Roth, 2006) were sampled for depths > 0.1 m.C1 was sampled until 0.65 m, while C2 was sampled until a depth of 0.3 m in 2005.Black dots represent locations where samples were taken down to a depth of 0.3 m in 2018, while the four gray dots represent deeper (> 0.3 m) sampling in 2018.Soil carbon stocks (0-0.7 m) are presented (b).Positive differences in C stocks (c) indicate C loss between 2005 and 2018, while negative differences indicate C sequestration.The white asterisk marks the location of the eddy covariance station.Orthophoto is from the Federal Office of Topography swisstopo.

Fig. B5 .
Fig. B5.Gapfilled half-hourly NEE for the years 2005 to 2020.The x axis shows the annual cycle with the months indicated (tick at first day of each month), the y axis shows the diurnal cycle (in hours).Mowing dates are indicated by black dashed lines.Net CO 2 uptake is indicated by cyan colors (negative sign), net CO 2 emission is indicated by brown colors (positive sign).MDS gapfilling was used and gaps >2 months were filled using random forest models (see Section 2.3 for details).

Fig. B6 .
Fig. B6.Monthly air temperature and precipitation (a) as well as air temperature anomalies and precipitation anomalies (b) from 2005 to 2020.Monthly averages (sums) of air temperature (precipitation) for the years 2005 to 2020 (a).The monthly anomalies (b) were calculated by subtracting the long-term (16-year) monthly temperature (precipitation) average (sum) from the monthly averages (sums).White colors in (b) indicate months where the conditions were close to the 16year mean conditions.Negative values for temperature (precipitation) indicate colder (drier) than normal conditions, positive values indicate hotter (wetter) than normal conditions.Darker colors indicate higher deviations from the mean conditions, legends are shown to the right.
− 1 ) for easy comparison with other studies.Organic fertilizations other than slurry are indicated by (M) for manure.Mineral fertilization was with calcium ammonium nitrate.Grazing other than by sheep is indicated by (C) for cattle.

Table B3 Parcel characteristics over time
Kljun et al. (2015)contributions to the signal measured at the flux station are given.Parcel areas are derived from the polygons shown in Fig.B2.Footprints were calculated with theKljun et al. (2015)footprint model.

Table B4 Annual sums of precipitation (PREC; in mm) and annual averages of air temperature (Tair; in • C) for the years 2005 to 2020.
For the first half of 2005, no measurements were available, so the Chamau temperature and precipitation data were filled with data from the nearby MeteoSwiss station Cham. *