Introduction

The rate at which zooplankton graze phytoplankton regulates marine carbon cycling by directly and indirectly modifying the size of both populations and their associated rates of net primary production (NPP)1,2, gross secondary production (GSP), and export production (EP)3,4,5. Grazing thus mediates the transfer of carbon to both higher trophic levels, where it sustains fisheries production6, and to the deep ocean, where it can be sequestered for 100s–1000s of years7. Accurately simulating the role of grazing is essential to predicting the ocean’s ability to feed a growing human population and buffer a changing climate. Yet the way in which we simulate zooplankton grazing in marine biogeochemical (BGC) models remains a large gap in our representation of marine carbon cycling.

Zooplankton species exhibit considerable functional diversity, and their relative distribution varies widely8,9,10,11,12,13. However, BGC models must represent the integrated behavior of all zooplankton using a strictly limited number of zooplankton groups14, especially in computationally expensive IPCC coupled-climate models that typically include only 1–2 groups15,16. This leads to substantial uncertainty in how modelers implicitly represent complex communities and their impact on marine carbon cycling17,18,19,20,21,22,23,24,25,26.

Substantial observational work over the last two decades has helped describe the global distribution of zooplankton and their grazing dynamics, but remains limited as a means to constrain grazing in global models. In-situ measurements of zooplankton biomass27,28,29,30 are patchy and must reconcile uncertainties across disparate acoustic, optical, and net-based methodologies31,32. Progress has been made towards estimating zooplankton biomass with satellites, but current products are either limited to swarms of specific species33 or heavily reliant on statistical inference from chlorophyll fields34,35,36.

Grazing pressure, defined as the phytoplankton’s specific loss rate to grazing24, varies with both zooplankton biomass and the specific grazing rate of zooplankton. However, empirical estimates of the parameters that describe how specific grazing rates change with prey abundance span three orders of magnitude, varying with zooplankton size, species, and age37,38,39. This means that detailed information about community composition must accompany bulk estimates of biomass to estimate grazing pressure. Field-based dilution experiments can help average over this variability, providing estimates of community-averaged grazing pressure40,41,42,43, but these are highly localized in space and time and can be biased by nonlinearities in the functional response44, trophic cascades45,46, and the presence of mixotrophs47. Review studies have helped constrain the range of reasonable values, but are limited to broad basin-scale averages and often exclusively target microzooplankton48,49,50. Measuring the contribution of larger mesozooplankton in the field typically requires gut content analysis and various assumptions about metabolic rates40,51. Finally, trait-based bio-geographies9,11,12 give a good snapshot of which functional groups belong where, but do not provide time-evolving global data sets. Collectively, the observational records provide a strong baseline understanding of zooplankton grazing dynamics but cannot be used in the same way as high spatial and temporal resolution, satellite-derived, chlorophyll and NPP products to tune global models.

This leaves a large degree of uncertainty in how grazing dynamics should be represented, with models differing largely in their associated qualitative and quantitative assumptions15,18,21,39. This could contribute substantially to the persistent uncertainty surrounding global and regional estimates of NPP19,52, EP20,53, and zooplankton biomass21 across future climate projections. Marine BGC model are tuned to observations of NPP and EP but still remain notoriously under-constrained and over-parameterized54,55,56. This means they could get the same answer for different reasons, but that once perturbed, different mechanisms will lead to different outcomes. One of the most important mechanistic differences in IPCC-class marine BGC models is the representation of zooplankton grazing19,20,24,25,26,57,58,59,60,61,62.

Here, we investigate zooplankton grazing dynamics in 11 IPCC Earth system models from the most recent (CMIP6) inter-comparison project63. We first quantify variability in the magnitude of the prescribed grazing formulation using a new metric, the Prescribed Grazing Index (PGI). We then quantify variability in various facets of emergent marine carbon cycling, including grazing dynamics, from historical simulations of each model. Here we show that the largest source of inter-model uncertainty is grazing pressure and that differences in emergent grazing pressure can be largely explained by differences in the PGI. Finally, we show how other emergent properties of the marine carbon cycle, such as GSP and EP, vary with the magnitude of the PGI, both in CMIP6 models and in a controlled sensitivity study. We believe that better constraining the established uncertainty in grazing dynamics is likely to substantially improve model estimates of past, present, and future marine carbon cycling.

Results and discussion

Variability in the grazing formulation

The representation of grazing in CMIP6-class BGC models varies widely, differing in the number of plankton functional types (PFTs) included, the food web that transfers carbon between them, and the functional response curves that specify how grazing rates vary with available prey (Figs. 1, 2). At one end of the spectrum, iHAMOCC64, CMOC65, and WOMBAT66 include only one zooplankton functional type (zPFT) grazing exclusively on one phytoplankton functional type (pPFT). OECOv267 and MARBL68 introduce additional pPFTs, each grazed independently by the same zPFT. CanOE69 adds a second zPFT, but each graze independently, pair-wise, on a single pPFT. Finally, MEDUSA2.170, PISCESv271, BFM5.272, and COBALTv273 introduce non-phytoplankton prey and allow zooplankton to graze on multiple prey options. In MEDUSA2.1 and PISCESv2, zPFTs distribute grazing effort based on fixed prey presences, while BFM5.272 and COBALTv273 include density-dependent prey preferences, known as active switching. Of the 10 BGC models surveyed, only four (MARBL, PISCESv2, BFM5.2, and COBALTv2) include temperature-limited grazing. See “Methods” for further details.

Fig. 1: Variability in CMIP6 food webs and the Prescribed Grazing Index.
figure 1

a The marine food webs represented in 10 CMIP6-class BGC models are presented in clockwise order of increasing complexity. Next to the name of each BGC model is the number of parameters required to describe grazing. Grazing relationships (arrows) are solid for single-prey responses, dashed for multi-prey responses with fixed preferences, and dotted for multi-prey responses with active switching. Red arrows indicate temperature sensitivity. Green, red/orange, and purple color schemes refer to models with 1, 2, or actively switching zooplankton. Hue scales qualitatively with complexity. PFTs have been generalized into small (nano-, small, non-diatom, or nanoflagelate), large (large, diatoms), or diazotrophs for phytoplankton (P) and small (micro-, small), medium (meso-, medium), or large (macro-, large) for zooplankton (Z). Bacteria (B) and Detritus (D) are included when available as prey. Model specific nomenclature for PFTs is included in Fig. 2. b The Prescribed Grazing Index (PGI), an innate property of the grazing formulation, is plotted for all models below the PGI associated with empirical estimates of the functional response for 7 zPFTs reported in the review of ref. 37. We further include three theoretical model configurations: two with a single zPFT described based on the median empirical parameters for meso- (dark red) and microzooplankton (blue), and one with 2 zPFTs including both (see “Methods”). Overlaid is the PDF of PGI for all 107 survey experiments.

Fig. 2: Grazing formulation in CMIP6 BGC models.
figure 2

PFT notation has been generalized by size for convenience, but full names as described in each model (or advised per personal communication) are reported in column 4. Food webs as described in Fig. 1. A Parameters g\({}_{\max }\) and K1/2 are abbreviated as g and K. Parameters values for g, K, Pth, λ are reported in units of d−1, mmolC m−3, mmolC m−3, and m3 mmolC−1, respectively. Units are converted to carbon using the model’s stoichiometry. B Models that use disk parameters converted to mathematically equivalent Michaelis–Menten parameters39. C Prey threshold values (Pth) in MARBL vary with PFT, depth and temperature, ranging from 0 in sufficiently cold or deep waters to 0.02 for diatoms and 0.01 for other PFTs. The PGI was computed using surface (>80 m) prey threshold values in MARBL, a temperature of 15.3 C in all models with temperature sensitivities, the observed global median PFT biomass for all models (see “Methods”). Emergent grazing rates were computed using the explicit depth and temperature fields. D In PISCESv2, in addition to the Pth term, the FLim is set equal to 0.5 when total food is below 0.6 mmolC m−3 and declines as total food increases (see eq. 26a in ref. 71).

Variability in the structure of the food web and grazing formulation substantially impact ecosystem stability22,74, plankton diversity23,24,25,26, NPP19,24, EP20,58,62,75, regional23 and global18,24,60 phytoplankton distributions, and phytoplankton bloom phenology59,76,77. However, while this work has highlighted the influence of qualitative differences in the grazing formulation, less attention has been paid to large differences in the magnitude of grazing pressure assumed across models. This may be because differences in the structure of the food web, value of parameters, shape of the functional response, and determination of prey preferences collectively exert a multivariate, often competing, influence on emergent grazing dynamics, making it difficult to discern how fast zooplankton, in a general sense, are assumed to graze by the full set of equations. We thus introduce a new diagnostic metric, the Prescribed Grazing Index (PGI), to approximate the magnitude of grazing rates prescribed by the innate properties of a given grazing formulation.

Variability in the prescribed grazing index

The prescribed grazing index (PGI) is defined as the specific rate at which zooplankton would graze in a standardized plankton community (see “Methods”). It is computed by evaluating each model’s grazing formulation (Fig. 2) at the long-term mean sea surface temperature and median observed global concentration of phytoplankton and zooplankton PFTs (Table S1). The PGI in surveyed CMIP6 models ranges from 0.009 d−1 in MEDUSA2.1 to 0.637 d−1 in MARBL (Figs. 1b, 2). This means that zooplankton in MARBL would graze ~70 times faster on the observed median plankton community than those in MEDUSA2.1. Even when ignoring the two most extreme models on either side, the PGI range is still over an order of magnitude.

To put this in context, we considered the functional response curves measured empirically from 107 different laboratory dilution experiments on >60 species of zooplankton37 and computed the PGI for a theoretical model including one zPFT grazing on one pPFT as described by each set of empirical parameters (see “Methods”). We then sorted the empirical experiments into functional types and computed the PGI for 10 theoretical models, each including 1–2 zPFTs with grazing described by their median empirically measured parameter values (Table S2). Most models (7 out of 10) have a PGI larger than that of a theoretical model with two zPFTs parameterized using the median empirical meso- and microzooplankton measurements (0.06 d−1; Fig. 1b). This is consistent with the fact that the apparent half-saturation concentration (K1/2) for the mean state of a patchy grid cell should be lower than any individual zooplankton grazing in it39. However, it is inconsistent with the fact that laboratory experiments are highly idealized and consider grazing on exclusively optimal prey. It is thus unclear which bias should dominate. More concerning is the spread of the PGI in models relative to empirical estimates.

Ostensibly, inter-model differences should reflect uncertainty in the mean state of their selected functional groups, but not uncertainty in the much larger, full range of species-level diversity. Yet, the PGI range across all surveyed CMIP6 models, is roughly as large as the middle 80% of all surveyed zooplankton species (0.005–0.55 d−1). Only five models have a PGI within the middle 50% of over 60 species (see Table 3 of ref. 37 for a list of included species). On one hand, three models (MEDUSA, OECOv2, and WOMBAT) have a PGI equivalent to or lower than a model with only one zPFT parameterized using the median empirical mesozooplankton parameters. On the other, three models (BFM5.2, CMOC, and MARBL) have a PGI equivalent to or larger than a model with only one zPFT parameterized using the median empirical microzooplankton parameters (Fig. 1b and Table S2). While well-mixed, in vitro experiments do not exactly represent the dynamics required to recreate the mean state of a patchy ocean, it is concerning that some models imply something statistically similar to an ocean filled entirely with very slow-grazing meroplankton larvae (MEDUSA2.1, OECOv2) and others an ocean filled entirely with very rapidly-grazing ciliates (MARBL and CMOC).

Although decreasing the complexity of the food web should generally increase the strength of predator-prey interactions22 and thus increase the PGI (as appears true in MARBL and CMOC), OECO-v2, and WOMBAT have the second and third lowest PGI despite having the simplest food webs. This confirms that the influence of qualitative differences in food-web structure can, in some cases, be dominated by quantitative differences in parameter values18, which vary dramatically39. Nevertheless, the PGI reduces many modes of variability into a single variable. It is plausible, then, that the emergent dynamics in models do not reflect the large differences implied by their PGI. To test the hypothesis that variability in the PGI can either capture or dominate the influence of other qualitative aspects of the grazing formulation, such as trophic complexity, prey-switching preferences, non-maximal feeding, and/or prey refuge18,24,25,60, we assess the variability in emergent grazing dynamics and determine if it can be explained by variability in the PGI.

Global variability in emergent grazing dynamics

We computed the emergent zooplankton grazing and mortality rates (which are not typically saved) from 11 CMIP6 Earth system models64,66,67,69,70,72,78,79,80,81 using the zooplankton dynamics explicitly prescribed in each model (Fig. 2) evaluated with monthly mean biomass and temperature fields simulated in their historical (1850-1900) reconstructions82,83,84,85,86,87,88,89,90,91,92 (see “Methods”; Table S3).

The global, annually-averaged, biomass-weighted, zooplankton-specific grazing rate (\(\overline{g}\)) increases by almost an order of magnitude, from 0.09 d−1 in OECO-v2 to 0.82 d−1 in CMOC (Fig. 3a, b). Compared to observations estimated by combining field estimates of microzooplankton48,49 and mesozooplankton51 bulk grazing rates with MAREDAT28 estimates of microzooplankton93 and mesozooplankton94 biomass (see “Methods”), nearly half of the models (5 out of 11) fall outside the observed uncertainty window.

Fig. 3: Inter-model variability in grazing pressure is best explained by variability in specific grazing rates.
figure 3

Inter-model variability in global, biomass-weighted, annually-averaged a, b zooplankton-specific grazing rates and ce grazing pressure plotted as a function of a, c zooplankton-specific mortality rates, b, d biomass, and e specific grazing rates. Colors correspond to models in Table S3 and Figs. 1, 2. Regression statistics are provided above each plot, with the linear fit included in red if the relationship is significant above the 95% (solid) or 80% (dashed) confidence threshold. Vertical and horizontal error bars refer to the amplitude of the seasonal cycle, computed separately in the northern and southern hemisphere, then averaged. The black horizontal line corresponds to observed values, with uncertainty bounds overlaid in gray (see “Methods”).

In a globally-averaged sense, the large variability in \(\overline{g}\) is not compensated for by offsetting differences in zooplankton mortality, leading to similarly large variability in grazing pressure imposed on the phytoplankton population (\(\overline{{g}_{p}}=\overline{gZ/P}\)). While there is a positive, relationship between \(\overline{g}\) and mean zooplankton-specific mortality rates (\(\overline{m}\), Fig. 3a), suggesting that models are tuned to have more efficient grazers die faster, there is not a statistically significant relationship between \(\overline{g}\) and the mean globally-integrated zooplankton biomass (\(\overline{Z}\)) (Fig. 3b), further suggesting that the magnitude of this tuning is insufficient to constrain grazing pressure (\(\overline{{g}_{p}}\)). Instead, variability in \(\overline{{g}_{p}}\) appears to be driven by variability in \(\overline{g}\). That is, \(\overline{{g}_{p}}\) has no statistically significant relationship with \(\overline{m}\) (Fig. 3c) or \(\overline{Z}\) (Fig. 3d), but is correlated with \(\overline{g}\) (Fig. 3e). In turn, there remains a very large degree of uncertainty in the representation of \(\overline{{g}_{p}}\), which increases by over an order of magnitude (116%) from OECO-v2 to MARBL, largely mirroring the uncertainty in \(\overline{g}\) and drifting even further from observed estimates (Fig. 3c–e).

In fact, grazing pressure appears to be the largest source of uncertainty in simulated marine carbon cycling. To quantify sources of uncertainty across CMIP6 models, we compared the inter-model coefficient of variation (\(\sigma \ {\overline{x}}^{-1}\)) for 14 major carbon fluxes, pools and rates, computed using global, annual median values from each simulation (Fig. 4). By this metric, uncertainty in grazing pressure is 15% larger than the next largest source, specific grazing rates. Notably, uncertainty in specific grazing rates is 60% larger than zooplankton-specific mortality rates. Thus, it appears inter-model differences in the rate of zooplankton mortality are insufficient to balance much larger differences in the specific rate of grazing. In turn, uncertainty in grazing pressure is at least three times larger than that of EP, NPP, and export efficiency, and two times larger than that of phytoplankton and total plankton biomass. By contrast, uncertainty in phytoplankton-specific division rates is substantial, on par with zooplankton mortality, and appears necessary to offset large differences in grazing pressure in order to constrain NPP.

Fig. 4: Inter-model uncertainty in marine carbon cycling.
figure 4

a The coefficient of variation calculated across 11 CMIP6 models (standard deviation divided by the mean) is shown for the annual (orange), winter (blue), and summer (yellow) median of major biological marine carbon fluxes, stocks, and rates. Winter values are computed using December and June for the northern and southern hemispheres, respectively. Vice versa for summer. Medians are computed for the top 200 m of the water column and weighted by either (1) volume, (2) NPP, (3) phytoplankton biomass, or (4) zooplankton biomass to account for the distribution of where/when the relevant carbon cycling occurs. Note, qualitatively similar values computed using weighted means, instead of medians, and are reported in Fig. 6. bo Hovmöller diagrams (month vs. latitude) of the seasonal and zonal distribution of inter-model uncertainty are provided for each variable. Areas in which the standard deviation is greater than the mean are contoured in black.

Zonal and seasonal variability in emergent grazing dynamics

Uncertainty in grazing pressure (Figs. 4o, 5a–c) and specific grazing rates (Figs. 4n, 5d–f) is even larger at smaller spatial and temporal scales. During the winter, when relieved grazing pressure regulates the phytoplankton seed population required to initiate a spring bloom1,95, the median global grazing pressure is 43-times larger in MARBL than OECOv2 (Fig. 5a; left panel) and the inter-model standard deviation is 20% larger than the inter-model mean (Fig. 4a, o). During summer, when zooplankton biomass is higher (Fig. 5g–i) and slow-grazing meso- or macrozooplankton regulate standing stocks of phytoplankton, especially in the Southern Ocean60, the largest source of uncertainty in models is specific grazing rates (Fig. 4a, n).

Fig. 5: Zonal and seasonal variability in emergent zooplankton biomass and grazing.
figure 5

The distribution of median ac grazing pressure, df zooplankton-specific grazing rates, and gi depth-integrated zooplankton biomass. a, d, g The global median of each variable is plotted during the climatologic winter (December/June above/below the equator) and summer (vice versa). Medians are computed over the top 200 m of the water column and weighted by a phytoplankton, d zooplankton biomass or g volume. Model colors correspond to the traces below. b, e, h Zonal distributions are plotted for each model using a sliding 3 latitudinal band. Diamonds correspond to the median observed estimate with error bars for the 25th and 75th percentiles (see “Methods”). c, f, i On the border, the inter-model correlation with the PGI is plotted for each zonal band and season, with statistically insignificant relationships x'd out. Lighter x’s indicate a casual, but statistically insignificant, relationship with . 2 > p > 0.05%.

Zonally, the inter-model uncertainty in grazing pressure is largest in the tropics, regardless of season, with a roughly 20-times difference between MARBL and the three models with the lowest median grazing pressure (Fig. 5b). This is partly driven by substantial enhancement in the specific grazing rates of models with temperature-limited grazing (MARBL, PISCESv2, BFM5.2, and COBALTv2) in the tropics (Fig. 5e). Variance in zooplankton biomass, however, is largest at the poles, where models vary from almost 0 (CMOC and CanOE) to over 400 (COBALTv2 and MEDUSA2.1) mmolC m−2 in the Southern Ocean summer for example (Fig. 5h).

Variability in grazing pressure across models is not well constrained by observations (see “Methods"). Qualitatively, observations of grazing pressure agree with the zonal distribution simulated in models, peaking at the equator, then tapering off towards the poles48,49,51, but the models simulate much lower values and exhibit a much larger spread. In no region do more than 65% of models fall with the middle 50% of zonally-binned, basin-scale observational estimates (Fig. 5b; black diamonds). While this underscores how poorly constrained grazing pressure is in models, it is important to remember that field measurements are sparse and subject to substantial sampling biases48,51. Thus, a small uncertainty window is probably a consequence of small sample size, rather than low natural variability (see “Methods”), suggesting that the solution is not simply nudging models toward current observations but also further constraining observations.

Perhaps a more useful comparison is a more complex BGC model. While no CMIP6 model included more than three zooplankton or phytoplankton types, we show in Fig. 5 (black traces) the DARWIN model of ref. 96, which includes 31 pPFTs and 16 zPFTs. Only BFM5.2, COBALTv2, and MARBL come close (r2 > 0.6) to matching the zonal variability in grazing pressure simulated in DARWIN during both winter and summer. MARBL, however, exhibits by far the largest magnitude bias, while BFM5.2 has the second smallest. This suggests that active prey switching, only included in COBALTv2 and BFM5.2, is an essential ingredient in implicitly simulating the more diverse ecosystem explicitly resolved in DARWIN22,23,24,97. Interestingly though, grazing pressure in DARWIN is achieved via much lower specific grazing rates (Fig. 5e) and higher zooplankton biomass (Fig. 5h) than in BFM5.2 or COBALTv2. This is likely due to a decline in the direct interaction strength between predators and prey in a more complex food web18 combined with a reduction in non-linear mortality to explicit and implicit predation, which will be smaller in aggregate when biomass is distributed across more discrete biomass pools.

Sensitivity of marine carbon cycling to the PGI in CMIP6 Models

Large differences in emergent grazing dynamics at the global, regional, and seasonal scales appear to be driven by large differences in the magnitude of the grazing formulation, as quantified by the PGI. Here we investigate the relationship between the PGI and emergent properties of broader marine carbon cycling. Qualitative differences in the functional response are explored in “Supplemental Discussion 1”.

Despite reducing many modes of variability to a single value, the PGI is a strong predictor of not only global annually-averaged specific grazing rates (Fig. 6b; r2 = 0.82), but also grazing pressure (Fig. 6a; r2 = 0.78). In fact, emergent grazing pressure is better correlated with the PGI at the global scale than emergent zooplankton mortality (\(\overline{m}\)), biomass (\(\overline{Z}\)), or even specific grazing rates (\(\overline{g}\)) (Fig. 3c–e). Moreover, despite substantial spatial-temporal variability in zooplankton biomass (Fig. 5h), the PGI remains significantly, positively correlated with specific grazing rates (Fig. 5f) and grazing pressure (Fig. 5c) at most latitudes, during both winter and summer. In turn, even though most NPP and EP disproportionately occur during blooms, when the biomass is much higher than the global median, it appears that understanding how fast models assume zooplankton will graze at lower concentrations (as quantified per the PGI) is an excellent predictor of emergent grazing dynamics across all prey concentrations, likely due to feedbacks on the size of the prey population.

Fig. 6: Sensitivity of marine carbon cycling to the PGI in CMIP6 models.
figure 6

an The relationship between global, annually-averaged marine carbon cycling and the PGI. Vertical error bars reflect seasonal variability, as per Fig. 3. Horizontal error bars reflect the precision of the PGI and are computed by evaluating the PGI with ±25% total phytoplankton biomass. The black horizontal line corresponds to our best guess of the observed values, with uncertainty bounds overlaid in gray (see “Methods”). Linear regressions are included in red if the relationship is significant above the 95% (solid) or 80% (dashed) confidence threshold. o Statistical properties of the observations, model spread, and relationship between the inter-model variability and the PGI are reported.

Given the close link between the PGI and emergent grazing dynamics, we examine if large differences in the PGI can explain other modes of uncertainty in the marine carbon cycle (Fig. 6). Notably, there is no significant relationship between the PGI and NPP, EP or export efficiency at global (Fig. 6i, j, m), regional or seasonal scales (Fig. S3). Despite the importance of top-down controls19,20,24,58,62,75, this is unsurprising because CMIP6 models are highly tuned towards observed estimates of NPP and EP. However, because BGC models are otherwise largely undetermined98, large uncertainty in grazing pressure must be compensated for elsewhere. While the PGI is correlated with faster zooplankton-specific mortality rates (Fig. 6d and S4c), these differences are already accounted for within grazing pressure and are insufficient to constrain its variance. In turn, models with a larger PGI transfer more carbon more efficiently from primary producers to higher trophic levels (Figs. 6k, l, n and S5) and must balance larger grazing losses and lower phytoplankton biomass (Figs. 6e and S4i) with faster phytoplankton-specific division rates (Figs. 6c and S4f). Thus, NPP appears to be tuned via faster turnover of a smaller phytoplankton population. However, because the large variance in specific grazing rates is not adequately balanced by zooplankton mortality, there is not necessarily a higher turnover in the zooplankton population. In turn, zooplankton biomass exhibits nearly twice as much inter-model uncertainty as phytoplankton biomass (Fig. 4a, e, k) and a weaker, statistically not significant relationship with PGI (Fig. 6f).

In short, large differences in the PGI, and thus grazing pressure, appear to be tuned out in terms of historical NPP via faster nutrient uptake, but models with similar NPP can have very different standing stocks of phytoplankton (Figs. 6e and S4g–i), zooplankton (Figs. 6f and 4g–i), total biomass (Figs. 6g and S6g–i) and zooplankton to phytoplankton ratios (Figs. 6h and S6d–f).

Influence of the PGI in a controlled sensitivity study

To better control for non-grazing differences in CMIP6 models16 and directly investigate the mechanisms by which the PGI can regulate marine carbon cycling, we ran our own suite of global simulations (Table S4), all forced with identical physics (see “Methods”). Simulations were run in WOMBAT66, which uses a simple, single-prey grazing formulation (Fig. S7) to isolate the influence of uncertainty in the PGI from the influence of other qualitative differences in the grazing formulation.

We first ramped up the PGI without tuning any other parameters (Figs. 7 and S8). Unsurprisingly, globally-integrated \(\overline{NPP}\) (Fig. 7a) and \(\overline{{{{{{{{\rm{EP}}}}}}}}}\) (Fig. 7c) collapse from complete nutrient utilization to nearly nothing as the PGI increases. However, surprisingly this collapse occurs across roughly the same range of PGI used in CMIP6 models, further highlighting the degree to which the PGI is under-constrained. For example, running WOMBAT66 with a PGI as large as MARBL68 causes \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) to drop by nearly 40 PgC yr−1, all else being equal. Worse, while the range of PGI used in CMIP6 models spans nearly two orders of magnitude (70-fold), just doubling the PGI at intermediate values (from 0.0125 to 0.25 d−1) can drive a roughly 8 PgC yr−1 drop in \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\), roughly 15% of observed estimates (Fig. 6o; see “Methods”).

Fig. 7: Sensitivity of marine productivity to the PGI in a controlled sensitivity experiment.
figure 7

Global, annually-averaged a NPP, b GSP, and c EP from each simulation of WOMBAT (see “Methods”) is plotted against the PGI of the corresponding model. Horizontal error bars refer to the precision of the PGI, per Fig. 6. The vertical uncertainty window is proportional to the amplitude of the seasonal cycle, computed as ±25% of the difference between the annual maximum and mean. Vertical dashed lines refer to the PGI of CMIP6 models, color-coated per Fig. 2. Regional, annually-averaged d NPP, e GSP, and f EP are plotted separately for eutrophic (left axis) and oligotrophic (right axis) regions. The boundary between “oligotrophic” and “eutrophic” regions is contoured in black on the maps in Fig. 8. Vertical dashed lines denote which runs are plotted in the corresponding panels of Fig. 8.

Moreover, export efficiency in WOMBAT monotonically declines with PGI (Fig. S8n). This is driven by a non-linear reduction in particle formation, which scales with the square of each biomass pool (Fig. S7), such that less biomass exports less efficiently. Notably though, export efficiency declines even as total biomass initially increases (Fig. S8f). This cannot be explained by a decrease in particle formation through sloppy feeding or a decrease in PFT-specific aggregation rates as \(\overline{{{{{{{{\rm{GSP}}}}}}}}}\) and the ratio of zooplankton (which aggregate more efficiently in WOMBAT; Fig. S7) to phytoplankton both increase. Instead, this decline is dominated by a reduction in the skewness of the biomass distribution as the Z to P ratio approaches 1. This reduces the sum of the squares of the two biomass pools (Fig. S8i) and, in turn, community-integrated particle formation. Thus, so long as particle formation occurs non-linearly99, trophic controls on the distribution of biomass can regulate export efficiency despite competing assumptions about the differential rate of particle formation in PFTs with implicitly different size and composition20. This is consistent with the reduction in export efficiency associated with spreading out biomass across longer food chains62 and means that a shift to more efficient exporters may not necessarily increase export efficiency if accompanied by a decrease in the skewness of the biomass distribution.

While grazing pressure (Fig. S8l) and grazing efficiency (Fig. S8m) monotonically increase with the PGI, zooplankton-specific grazing rates (Fig. S8k), zooplankton biomass (Fig. S8e) and \(\overline{{{{{{{{\rm{GSP}}}}}}}}}\) (Fig. 7b) all peak at intermediate PGI then decline. This means a larger PGI can actually lead to slower emergent grazing rates and less secondary production if overgrazing stifles phytoplankton biomass before it has a chance to bloom. It also demonstrates how \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) is not necessarily correlated with the bulk transfer of mass and energy up the food web. Thus, while future increases in temperature-limited grazing can reduce \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\), even if accompanied by improving phytoplankton growth conditions19, it is not clear how this will translate to higher trophic levels. Likely, different BGC models will yield qualitatively different projections of secondary production based on their initial PGI and on what side of the overgrazing inflection point they begin. This is particularly problematic for ecosystem100 and fisheries models101, which are typically driven with a one-way forcing from NPP and assume NPP and GSP have an exclusively positive relationship101,102, further stressing the need for two-way coupling with primary producers to forecast fisheries stocks accurately103.

Regionally, the redistribution of nutrients can lead to large deviations from the global mean, a shift in the global distribution of productivity, and, at times, opposite responses at different latitudes (Figs. 7d–f, 8). As phytoplankton are grazed down in more eutrophic waters, nutrients are left unutilized and recirculated into more oligotrophic regions, allowing regional productivity to increase despite greater grazing pressure. Thus, misrepresenting the PGI could lead to inaccurately simulating the distribution of nutrients and productivity and thus inaccurately identifying regions that could be impacted by future changes in ocean productivity and, in turn, fisheries production101,104.

Fig. 8: Regional sensitivity of marine carbon cycling to the PGI.
figure 8

Maps of the distribution of mean annual ac NPP, df GSP, and gi EP from three simulations from the “Slow Turnover” suite of WOMBAT experiments are plotted (see “Methods”). Specific simulations noted in Fig. 7. b, e, h show the raw values from the run with roughly peak \(\overline{{{{{{{{\rm{GSP}}}}}}}}}\) and an intermediate PGI. To the left and right are the percent deviation that occur when a, d, g a 3.4-times smaller and c, f, i 2.9-times larger PGI is used. The boundary between the “oligotrophic” and “eutrophic” regions plotted in Fig. 7 is contoured in black on each map and defined by the region where GSP is reduced by more than 100% when using the smaller, relative to intermediate, PGI (i.e., in panel d).

Finally, we re-ran each simulation but allowed for faster nutrient uptake (see “Methods”; Table S4). We then considered two ensembles, each tuned to the same \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\), but one with a roughly 3-times faster phytoplankton saturation division rate and a three times faster PGI (Fig. 9). Critically, this tuning only reflects less than 5% of the full range of PGI used across CMIP6 models. Still, the ensemble with a marginally larger mean PGI and faster phytoplankton turnover led to 58% more efficient carbon transfer up the food web (Fig. 9m) and 35% more efficient carbon sequestration (Fig. 9o). From a fisheries perspective, an extra 5 PgC yr−1 was transferred to zooplankton (Fig. 9b), leading to a 0.15 PgC increase in the zooplankton population (Fig. 9e), roughly equivalent to the inter-model variability across historical CMIP6 simulations and half the size of the globally-observed population (Fig. 6f, o). From a climate perspective, this translates to an extra 2 PgC yr−1 sequestered (Fig. 9c), or double the maximum theoretical potential of Southern Ocean Iron Fertilization105,106 and roughly 65% of the inter-model variability in historical CMIP553 and CMIP6 simulations (Fig. 6j, o).

Importantly, though, the primary driver of elevated \(\overline{{{{{{{{\rm{EP}}}}}}}}}\) was a larger zooplankton population (Fig. 9e) and zooplankton mortality was not tuned in this experiment. Despite nearly identical total biomass in each ensemble (Fig. 9f) accommodated by a 38% reduction in phytoplankton biomass (Fig. 9d), the ratio of zooplankton to phytoplankton grew by 151% (Fig. 9h). This led to a dramatic skewing of the biomass distribution, an increase in the sum of squares of the biomass pools (Fig. 9i), and an increase in the efficiency of particle formation. Conceivably, zooplankton mortality could be tuned up (alongside a larger PGI) to rein in the size of the zooplankton population and constrain \(\overline{{{{{{{{\rm{EP}}}}}}}}}\) without modifying grazing pressure or \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\). However, such tuning: (a) yields large differences in the size and distribution of biomass (Fig. S9) and (b) is difficult because zooplankton predation is typically resolved implicitly107. This latter point is evidenced by the inadequate tuning of zooplankton mortality across CMIP6 models (Section “Global variability in emergent grazing dynamics”) and the established benefit of explicitly adding higher trophic levels60.

Fig. 9: Sensitivity of marine carbon cycling to model tuning.
figure 9

ao Various aspects of global, annually-averaged marine carbon cycling are plotted against \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) for two suites of experiments. The “Slow Turnover” WOMBAT experiments (solid lines) are the same simulations shown in Figs. 7 and S8. The “High Turnover” WOMBAT experiments (dashed line) were parameterized with faster phytoplankton growth rates but are otherwise identical (see “Methods”; Table S4). Dashed vertical line bracket the ensemble of runs used to quantify the difference between “Slow Turnover” and “High Turnover” (adjoining horizontal lines) simulations. The x-axis is flipped to plot increasing PGI from left to right, consistent with Figs. 7 and S8.

Moving forward

Within the grazing formulation, qualitative differences in the structure of the food web18,22,24, degree of trophic complexity59,60, sensitivity to temperature19, and capacity for prey refuge23,24,25,108 all exert a strong top-down control on NPP and subsequent carbon cycling. Ideally, models that leverage these distinctions to increase biodiversity23,24,25,26, prevent competitive exclusion97, improve ecosystem stability22,74, or replicate species succession24 and bloom phenology59,76, ought also tune their parameters to converge on a relatively constrained degree of bulk grazing pressure. However, this does not appear to be happening. Large variability in emergent grazing dynamics is well explained by the PGI at global, regional, and seasonal scales, despite the PGI reducing many modes of qualitative variability into a single value. In turn, models with substantial qualitative differences in their grazing formulation and food-web structure but similar PGIs can yield similar zonal distributions of emergent grazing rates (i.e., MEDUSA2.1 vs OECO-v2; Fig. 5e) and/or grazing pressure (i.e., CanOE vs PISCESv2; Fig. 5b). Together, this highlights how far CMIP6 models remain from converging on a first-order magnitude of zooplankton-specific grazing rates, leading to very large uncertainty in the representation of grazing pressure throughout the global ocean.

In turn, even if models are tuned to the same NPP and EP, they must do so with very different-sized biomass pools and turnover rates to accommodate such large differences in grazing pressure (Fig. S9). However, to constrain EP with different-sized biomass pools, models must further differ in their particle aggregation, ballasting, and/or remineralization rates20. Yet each of these terms have different temperature sensitivities, meaning they will respond differently to a warming ocean. This helps explain the large uncertainty in projections of EP53 despite fairly well-constrained historical reconstructions (Figs. 4, 6j). Therefore, constraining uncertainty in the PGI, and in turn, emergent grazing pressure and the distribution and abundance of biomass, appears to be a necessary and urgent next step towards constraining predictions of future marine carbon sequestration and transfer to higher trophic levels.

Adequately constraining the PGI may require a new generation of observational products that leverage higher resolution, time-evolving, remote sensing and profiling float data sets. While these platforms cannot directly measure grazing, both satellites and float products have been used to infer net community production from biomass109,110 and oxygen budgets111,112. Combined with observationally-based estimates of phytoplankton biomass113, NPP113,114, and the assumption that phytoplankton losses are dominated by grazing115,116, grazing pressure could then be estimated at much finer space and time scales than shipboard dilution experiments48,49. Further, pairing these with rough estimates of zooplankton biomass, either from statistical models21,36 or emerging remote sensing products33,34,35, could yield specific grazing rate terms, enabling modelers to better tune zooplankton turnover (i.e., the balance of growth vs. mortality).

Together, such products will not only help constrain the mean state of grazing at global and regional scales, but could also complement existing bio-geographical resources to constrain how regional grazing dynamics vary with community composition. That is, by plotting the evolution of observed grazing rates against phytoplankton biomass, one can empirically define the community-averaged apparent functional response at regional and even seasonal scales. Spatio-temporal variability in this curve will improve our understanding of the distribution9,11,12, seasonal succession117,118, and migration119,120 of zooplankton species, which can then be compared to the associated variability in the emergent, community-averaged apparent functional response simulated in models (see “Supplemental Discussion 1”). Such comparisons will enable modelers to determine if explicit competition in their assumed food webs is reasonably approximating natural variability in community-averaged grazing dynamics, or if increasing trophic complexity by including, for example, macrozooplankton60, salps121, larvaceans, euphausiids, chaetognaths, jellyfish36,122, pteropods123, heterotrophic protists124, or Rhizarians5,125 might be required.

Moreover, as stratification under climate change likely leads to greater light availability but fewer nutrients, the phytoplankton community may shift toward smaller phytoplankton with faster nutrient uptake but higher light requirements126,127. This may increasingly favor rapidly-grazing microzooplankton or gelatinous filter feeders, which can ingest smaller phytoplankton size classes122,128, over crustacean mesozooplankton, which generally graze slower but can consume much larger prey12,129. Considering these groups serve critically different roles in trophic transfer through prey preference125 and carbon export through diel-vertical migration and130,131,132,133 the formation of rapidly-sinking fecal pellets and carcasses121,134, it is essential that models include enough trophic diversity to capture present-day and emergent changes to community composition and associated community-averaged traits122.

As it stands, grazing dynamics appear to be the largest source of uncertainty in contemporary simulations of the marine carbon cycle. This is understandable given the considerable challenges associated with observing and subsequently modeling zooplankton32,135,136; however, moving forward, improving zooplankton grazing will be critical. This is especially true given that BGC models will be at the forefront of informing or deterring the possible implementation of biogeochemically driven ocean-based negative emissions technologies105.

Conclusions

We surveyed the marine BGC component of 11 state-of-the-art Earth system models and explicitly computed their emergent zooplankton dynamics. We found that the largest source of uncertainty in their representation of marine carbon cycling was grazing pressure. Large variability in grazing pressure is tied to large differences in zooplankton-specific grazing rates, which are not sufficiently compensated by the tuning of zooplankton mortality. Moreover, the variability in both emergent grazing pressure and specific grazing rates is well explained at global, regional, and seasonal scales by large differences in the PGI - even though it collapses many modes of variability into a single dimension. This suggests that very large differences in the magnitude of grazing assumed in models may dominate important qualitative differences in the grazing formulation and food web structure. To constrain NPP against such large top-down variability, models appear to tune the turnover of the phytoplankton population. In a suite of control simulations, we found that small changes in the PGI, compensated for with faster phytoplankton division rates to tune NPP, can yield large differences in the amount of carbon subsequently transferred to higher trophic levels and exported to depth. These differences were largely driven by a skewing of the biomass distribution. Moving forward, we believe that further constraining our observed understanding of global grazing dynamics and improving the representation of zooplankton grazing in marine biogeochemical models will lead to more robust estimates of the marine carbon cycle.

Methods

Inter-model variability of the grazing formulation

The representation of grazing and food webs in IPCC CMIP6 BGC models varies widely (Figs. 1, 2; see also ref. 137). At one end of the spectrum, iHAMOCC64, CMOC65, and WOMBAT66 include only one zooplankton and one phytoplankton type. In each, the zooplankton-specific grazing rate, g (d−1), is assumed to increase monotonically with the phytoplankton concentration, P (mmolC m−3), before eventually saturating74. This relationship, g(P), is known as the single-prey functional response and is typically represented with a type II or III response curve39,138. Both curves can be described with the same two parameters. The saturation grazing rate, g\({}_{\max }\) (d−1), describes the food-replete specific grazing rate, whereas the half-saturation concentration, K1/2 (mmolC m−3), specifies roughly where the curve begins to saturate (Fig. S1). All else equal, both increasing g\({}_{\max }\) or decreasing K1/2 increase zooplankton-specific grazing rates.

iHAMOCC uses a type II response, which linearly increases with prey density below K1/2, when prey is scarce (rectangular hyperbola; Fig. S1). A type II response is typically what has been measured in empirical laboratory experiments37,38 and is thus likely to be representative of individual zooplankton grazing in a well-mixed environment. However, the type II response can contribute to dynamically unstable solutions74, requiring iHAMOCC to include a small prey threshold (below which grazing stops) to prevent phytoplankton from going extinct. WOMBAT and CMOC use a type III response, which is concave upward below K1/2 (sigmoidal; Fig. S1), thereby offering prey refuge and population stability at low prey concentrations74,139 and increasing co-existence in the phytoplankton community108. Ecologically, the type III response can be justified as the implicit representation of more complex behavioral changes such as active prey switching24,138, which has been observed in multiple zooplankton species13,140,141,142, or as the mean state of a spatially patchy ocean39,143, which relatively coarse global models must implicitly average across. Despite qualitatively identical grazing formulations, differences in the parameter selection in CMOC and WOMBAT (particularly K1/2; Fig. 2) mean that zooplankton-specific grazing rates on phytoplankton concentrations <1 mmolC m−3 are nearly 30-times faster in CMOC than WOMBAT.

OECO-v267 and MARBL68 use a slightly more-complex ecosystem structure, resolving multiple types of phytoplankton but just one generic type of zooplankton that grazes independently on all phytoplankton types. Both models still employ a single-prey response, in that the presence of one prey option does not influence the zooplankton-specific rate at which the other is grazed, g (d−1). However, because the biomass of the generic zooplankton group, Z (mmolC m−3), increases with grazing on any single phytoplankton type, so does the grazing pressure, gp = gZ/P (d−1) on all phytoplankton type, largely eliminating the possibility of refuge for relatively scarce prey18. In turn, this structure can obtain an effective food web that is functionally similar to including only one phytoplankton type18, in which strong top-down controls work to prevent species co-existence and stifle large blooms59. This is particularly relevant in MARBL, which uses a type II response, a very small K1/2 value, and requires relatively large prey thresholds to prevent extinction (Fig. 2), whereas OECO-v2 is able to offer prey refuge in the form of a type III response with a relatively large K1/2 value.

CanOE69 is slightly more complex, with two phytoplankton and two zooplankton types. Here, each zooplankton type grazes independently, pair-wise, on one unique phytoplankton type, with the large zooplankton type additionally grazing on the small zooplankton type. Unlike MARBL and CanOE, in which the indirect interaction strength144 between phytoplankton types is always negative, trophic cascades can now lead to positive interaction strength between phytoplankton types18,60. That is, large phytoplankton growth can relieve grazing pressure on small phytoplankton by increasing the biomass of large zooplankton, which in turn graze down the biomass of small zooplankton, the sole predator of small phytoplankton.

Increasing in complexity, MEDUSA2.170, PISCESv271, BFM5.272, and COBALTv273 all include at least two zooplankton types, each with multiple prey. In MEDUSA2.1 and PISCESv2, two types of zooplankton split their grazing effort between two types of phytoplankton in addition to detritus and cannibalism from large to small zooplankton, but both do so with fixed prey preferences. This means that the abundance of one prey option does not influence the desirability of another, such that when prey preferences are equal, the grazing effort is proportional to the relative distribution of prey25. Using fixed prey preferences, known as passive-prey switching, ensures maximal feeding (i.e., adding prey biomass, all else equal, always leads to an increase in total ingestion) but does not provide prey refuge for relatively less abundant prey24. In turn, to help prevent extinction and provide some form of prey refuge, MEDUSA2.1 uses a type III (sigmoidal) response and PISCESv2 includes a relatively strong prey threshold which decreases the apparent prey concentration by half when the total prey concentration <0.6 mmolC m−3.

The most complex models considered, BFM5.2 and COBALTv2, both include density-dependent prey preferences. This is known as active prey switching and allows zooplankton to redirect grazing effort to relatively more abundant prey, even if it is otherwise less desirable145. Without invoking behavioral changes, active switching can be understood as a community response, in which one zooplankton type explicitly grazing on n phytoplankton types implicitly functions as n fully-specialized zooplankton types growing in abundance in proportion to their preferred prey25. In BFM5.2, only microzooplankton (but not mesozooplankton) have density-dependent prey preferences, while in COBALTv2, the most complex BGC model considered, three zooplankton groups all graze with density-dependent preferences. Density-dependent, active prey switching can improve ecosystem stability139, provide prey refuge145, overcome competitive exclusion97, increase species diversity22,23,24,25, and better replicate species succession during phytoplankton blooms24. However, active switching also results in the ecologically unlikely phenomena of non-maximal feeding, in which total ingestion can decrease as total prey abundance increases if prey simultaneously becomes more evenly distributed25,74.25 have outlined a formulation for active switching in which the K1/2 value for total ingestion decreases as prey become more evenly distributed, such that maximal feeding with active prey switching can be achieved; however, this formulation has not been included in any of the CMIP6 models surveyed here.

Many state-of-the-art BGC and ecosystem models include richer functional diversity60,62,123, trophic complexity36,122,129, and allometrically scaled-size spectra23,62,96,146,147,148, with some simulations including hundreds of PFTs26. Increasing the size and complexity of the food web improves ecosystem stability by increasing resistance to perturbations22, permits realistic self-regulation of food-chain length62,129, replicates the observed bio-geography of PFTs and size classes36,147, and can even play a larger role than iron limitation in shaping high nitrate low chlorophyll conditions in the Southern Ocean60. However, increasing complexity can also increase predictive uncertainty if the additional parameter values cannot be constrained by observations. Even relatively simple BGC models, with just four plankton types, can be heavily under-constrained55. Moreover, more complex models are much more computationally expensive, and generally not yet tractable in IPCC simulations, which require many years, ensemble members, and scenarios to be run at increasingly high, mesoscale, spatial resolution. For this reason, we focus on the current generation of state-of-the-art IPCC models, but have included in our analysis, for comparison, the model of96, which is part of the DARWIN Project, run on the MITgcm at 1x1 degree resolution, and includes 47 size-structured PFTs.

The prescribed grazing index

The prescribed grazing index (PGI) is computed as the zooplankton-specific grazing rate, integrated across all phytoplankton prey options, prescribed by a given grazing formulation (Fig. 2) when evaluated at the observed global median phytoplankton and zooplankton plankton distribution and long-term mean sea surface temperature. In models with multiple zPFTs, zooplankton are assumed to be partitioned based on the observed global median distribution (Table S1) and the PGI is their biomass-weighted mean specific grazing rate on all phytoplankton prey options. When non-phytoplankton prey is included in the food web (i.e. bacteria, detritus, or other zooplankton), it is accounted for in so much as it offers refuge for phytoplankton by diverting grazing effort. Mathematically, this is equivalent to evaluating Equation (3), which is used to compute the offline emergent dynamics (see Section “Model output and offline diagnostics”), except here, a standardized prey field is used instead of the emergent monthly mean biomass fields. By using a standardized prey field, we can obviate myriad other inter-model differences that drive the emergent distribution and abundance of prey and thus quantify a reduced approximation of the magnitude of grazing rates prescribed by the innate grazing formulation in each model. Accordingly, the PGI does not necessarily reflect the emergent grazing dynamics of a given simulation, which further depend on the formulation of bottom-up controls, physics, zooplankton mortality rates, and other aspects of the grazing formulation that cannot be captured with a single value.

The standardized prey field used to evaluate the PGI is based on the global median picophytoplankton, diatom, Phaeocystis, diazotroph, microzooplankton, mesozooplankton, macrozooplankton concentrations28, and the global mean long-term average sea surface temperature from the NOAA ERSSTv5 product149. The observed global median particulate organic carbon concentration150 and bacteria concentration from the upper 200 m151 are included when available as prey. See “Observational components of the PGI” below. To account for differences in the structure of the food webs, some assumptions must be made about how biomass is partitioned in models with different numbers of PFTs. However, total phytoplankton and zooplankton biomass is kept constant. In models with just one phytoplankton or zooplankton group, total phytoplankton and zooplankton biomass is used. In models without diazotrophs or macrozooplankton, those pools are added to the ‘large phytoplankton’ and ‘medium zooplankton’ pools, respectively. Phaeocystis, which no surveyed model includes but compete with diatoms for nutrients, particularly in the Southern Ocean57, are included in the ‘large phytoplankton’ pool.

Note that while these values are deliberately agnostic to the emergent dynamics of any specific model run, thereby providing a standardized field to obviate other inter-model differences and isolate innate properties of the grazing formulation, they compare well to the inter-model mean of the global median simulated biomass fields (Table S1). In both models and observations, the concentration of biomass increases from total zooplankton to bacteria to total phytoplankton to detritus, and the relative distributions are generally consistent. Finally, the zooplankton biomass-weighted temperature from the models (i.e., the temperature the median simulated zooplankton experiences) is close to the standardized observed surface temperature used (15.30 and 15.61 C, respectively). Together, this confirms that the standardized observed values are representative of the median values simulated across models.

Limitations and sensitivity of the PGI

One limitation of the PGI is that it is not well suited for comparing models with much more trophic complexity than those studied here. That is because evaluating the PGI for a model with many more PFTs or an allometrically scaled-size spectrum within PFTs would require many further assumptions about how the standardized prey field is partitioned across plankton. While some of these assumptions could potentially be made using global estimates of size-spectra129, the PGI becomes a less interesting metric for models that explicitly resolve greater trophic complexity, as it is designed to reflect the difference in how simpler models assume that complexity is averaged over and parameterized out.

A second, and larger limitation of the PGI is that it collapses many modes of variability into a single dimension. While this is intentional and useful to quantitatively compare the integrated influence of multiple models’ disparate grazing formulations along a single axis, it means that two grazing formulations could yield a different hierarchy of PGIs if evaluated using a different standardized prey field. For example, even in the simplest case—a single zPFT grazing on a single pPFT—using a type II versus type III response function with the same parameter values would yield a higher PGI if evaluated below K1/2 and but a lower PGI if evaluated above K1/2 (Fig. S1).

This underscores the importance of selecting an appropriate standardized prey field. Using global median values rather than high phytoplankton concentrations that characterize large blooms (during which global NPP and EP disproportionately occur) is important for two reasons. First, almost all modern grazing formulations, and all of those surveyed here, saturate at high phytoplankton concentrations to simulate the satiation of predators once they become limited by prey consumption rates, rather than capture rates74. In turn, if the goal is to capture as much variability as possible across different formulations, then they should be evaluated before they have begun to significantly saturate. Second, and more importantly, the dynamical system is more sensitive to the values of grazing at low phytoplankton concentrations than high ones39. Intuitively, this makes sense as all high phytoplankton concentrations must first pass through lower ones and thus cannot emerge if grazing rates are fast enough at low phytoplankton concentrations to stifle a bloom before it happens.

However, to test the sensitivity of the PGI to the selection of the standardized prey field, we evaluated it using a total phytoplankton concentration ranging from 0.1–5.0 mmolC m−3, compared to the MAREDAT global median value of 0.79 mmolC m−3 (Fig. S2). In all evaluations, the relative distribution of biomass across groups was kept the same. Generally, models with a larger PGI also have more uncertainty when evaluated across a range of prey abundance. This is because they typically have lower K1/2 or higher g\({}_{\max }\) values, leading to larger direct interaction strengths at low to moderate prey concentrations, and thus, larger changes to grazing with respect to changes in prey biomass18. In our result, we express this uncertainty by including error bars representing the PGI evaluated at ±25% of the global median phytoplankton concentration, illustrating how PGI generally becomes less precise as it increases (i.e., Figs. 3, 6, 7).

Still, when looking across the full distribution of PGI values (Fig. S2a), the hierarchy of median values (horizontal black lines) is very similar to that obtained when solely using the standardized MAREDAT field (red stars). Moreover, while the difference in PGI between models decreases when evaluated at larger prey concentrations, there is still a 680% difference between the model with the fastest and slowest PGI when evaluated at a much larger total phytoplankton concentration of 5.0 mmolC m−3 (Fig. S2b, blue trace). Even ignoring the two fastest and two slowest grazing models, the difference in PGI is over threefold at 5.0 mmolC m−3, and over tenfold when evaluated below 0.5 mmolC m−3 (Fig. S2b, black trace). By comparison, this ratio ranges from only nine- to tenfold when calculated using the empirically measured functional response curves for micro- and mesozooplankton (see “Empirical measurements of the functional response” below).

Combined with the fact that the PGI is well correlated with simulated specific grazing rates and grazing pressure globally (Fig. 6) and at most latitudes, during summer and winter (Fig. 5), it appears that it is a useful tool to compare large-scale differences in the rate zooplankton are assumed to graze between models. That is, while small differences in PGI are not particularly interesting, and sensitive to the definition of the standardized prey field, large differences are. A 70-fold difference in the PGI means zooplankton in CESM2 would graze 7000% faster than in MEDUSAv2 if subject to the same temperature and prey. This implies something fundamentally different about the model dynamics and likely requires a substantially different parameterization of bottom-up controls to prevent complete nutrient limitation and or ecosystem collapse.

Empirical measurements of the functional response

The single-prey functional response curve describing how zooplankton-specific grazing rates increase with prey abundance can be estimated empirically via laboratory dilution experiments. In these studies, specific grazing rates are measured at different prey concentrations and then typically fit to a type II response function. The empirical estimations of the PGI in Fig. 1b are computed by evaluating empirically-estimated functional response curves as if they described the grazing formulation of a theoretical model with 1 pPFT and 1 zPFT. We consider 107 empirically-estimated functional response curves from over 60 species of zooplankton compiled in a review of laboratory work by ref. 37. They include seven zPFTs: nanoflagellates, dinoflagellates, ciliates, rotifers, meroplankton larvae, cladoecerans, and copepods (Table S2). To control for prey quality, 37 applied a selection criterion to exclude all studies in which dilution experiments were conducted with prey of sub-optimal size, such that the empirical parameters they compare are all assumed to constitute grazing on optimal prey, as is the implicit assumption for models with a single pPFT and zPFT. Temperature, predator volume, and empirically derived g\({}_{\max }\) and K1/2 values from each experiment are reported in Table 3 of ref. 37. Here, we convert g\({}_{\max }\) values to units of d−1 and control for temperature using a Q10 value of 2.8, the mean value reported by ref. 37. To be consistent with model estimates of the PGI (Table S1), all g\({}_{\max }\) are converted to a temperature of 15.3 C using the relationship

$${{{{{{{{\rm{g}}}}}}}}}_{\max ,15.3}={{{{{{{{\rm{g}}}}}}}}}_{\max }{{{{{{{{\rm{Q}}}}}}}}}_{10}^{(15.3-T)/10},$$
(1)

where g\({}_{\max ,15.3}\) is the temperature-dependent saturation grazing rate at 15.3 C and g\({}_{\max }\) is the value measured at an experimental temperature of T. K1/2 values are converted from biovolume to mmolC m−3 using the reported carbon density of 0.12 gC cm−3, which is consistent with the range of carbon densities in phytoplankton152. All empirical parameters were fit to a type II response. Empirical estimates of the PGI are then derived by evaluating the measured single-prey functional response curve for each species at the global median phytoplankton concentration (0.79 mmolC m−3; Table S2), as described above. Note, as per the model evaluated PGI, this is an estimate of how fast each zooplankton species would graze on its optimal prey at the global median total phytoplankton concentration. It is not an estimate of how fast each species actually grazes in the open ocean, but rather a quantification of the innate properties of their empirically defined functional response curves.

The probability density function for the PGI of all 107 experiments is overlaid in Fig. 1b. The PGI reported for each zPFT (Fig. 1b and Table S2) is computed using the median g\({}_{\max }\) and K1/2 values for all included species. zPFTs are further grouped into microzooplankton (<1e4 μm3) and mesozooplankton (>1e4 μm3) with their PGI computed identically. These coarser groupings do not account for differences between the sample size of each species and their actual global distribution, but do give a general sense of the difference in the PGI for broadly categorized micro- and mesozooplankton. We also consider a theoretical model with 2 zPFTs and 2 pPFT, with zooplankton grazing prescribed separately based on the empirically described micro- and mesozooplankton parameters (Table S2). The PGI here does take into account the actual distribution of micro- and mesozooplankton (Table S1).

Finally, note that this is not inclusive of all important zooplankton species or functional groups. For example, large photo-symbiotic Rhizaria may constitute 50% or all mesozooplankton biomass125 and help explain why the oligotrophic food web may be shorter than predicted by classical ecology129. Moreover, neither macrozooplankton nor gelatinous filter feeders are included. Macrozooplankton typically graze slowly on smaller zooplankton but can regulate grazing pressure on diatoms and other pPFTs through carnivorous trophic cascades60. On the other hand, larvaceans and salps are extremely abundant mesozooplankton that can filter bacteria and picophytoplankton, unavailable to crustaceans. They generally grow an order of magnitude faster than crustacean zooplankton153 and have been shown to play an important role in carbon export via the production of rapidly-sinking carcasses and fecal pellets121,134. Never the less, the empirical values included here are meant to provide context for the two zPFTs most typically included CMIP6-class BGC models, not provide an exhaustive list of empirical parameters for all known zooplankton species. Note, salps121,134 and macrozooplankton26,60 have been routinely included in targeted global simulations, but their inclusion in high-resolution climate-ensembles is not yet standard.

Observational components of the PGI

The global median standardized prey field used to evaluate the PGI is primarily based on field observations collected by the MARine Ecosystem biomass DATa (MAREDAT) initiative. MAREDAT is a compilation of global, gridded biomass data sets for a range of PFTs and is described in detail in28. Here, we use the reported global median epipelagic (0–200 m) concentration of microzooplankton93, mesozooplankton94, macrozooplankton154, picophytoplankton155, diatoms156, diazatroph157 and Phaeocystis158 to describe the phytoplankton and zooplankton distribution and in models in which bacteria is available as prey, we use the global median picoheterotroph concentration (which also includes Archaea)151. Briefly, all MAREDAT data is gridded onto a 1 by 1 grid with 33 vertical layers and a climatologic monthly time step. Throughout, data has been quality controlled by contributing researchers and Chauvenet’s criteria was used to omit anomalously high values. Note, calcifying PFTs (coccolithophores, foraminifers, and pteropods) are reported but were not included as no model explicitly represented them and their median global values are small relative to other PFTs.

Uncertainty in these global values is fairly large due to patchiness and sampling bias, as discussed in detail in ref. 28. However, estimates of total phytoplankton (0.79 mmolC m−3) are generally consistent with our independent estimates of the median phytoplankton biomass (0.97 mmolC m−3) calculated from 18 years (July 2002–April 2021) of remote sensing data from the Carbon-based Productivity Model113. Note, in high-latitude regions with summer remote sensing coverage but without winter coverage, winter-time concentrations were set to 0, assuming that any overwintering biomass persisting through the darkness was trivial to the global median.

Temperature and detritus are also needed to compute the PGI for some models, but are not included in the MAREDAT database. In models in which grazing is temperature-limited, rates are evaluated at the global (-60S–60N) mean long-term average (1854–2022) sea surface temperature from the NOAA ERSSTv5 product149. In models in which detritus is available as prey, detritus is estimated as the global median area-weighted particulate organic carbon (1–200 m) field observations reported in Fig. 2 of ref. 150 less the global median living biomass concentrations from Table S1.

Observations of global BGC rates and carbon stocks

The observed mean annual global carbon stocks and fluxes reported in Fig. 6 are compiled and computed from a variety of independent data sets, and thus are not necessarily internally consistent. Where possible, uncertainty was estimated using the 25th and 75th percentiles of observed distributions; however, this was not always possible and is discussed below in detail for each estimate.

Mean annual, globally-integrated biomass estimates (Fig. 6e–h) were computed entirely from the MAREDAT dataset28. First, for each phytoplankton and zooplankton PFT considered in the standardized prey field described above93,94,154,155,156,157,158 we computed the mean annual carbon biomass inventory by constructing global, monthly biomass profiles (0–200 m), then integrating in depth, averaging in time, and multiplying by the global ocean area between 60S–60N. Note, very high latitudes were excluded here and for model output to avoid regions where models tend to perform poorly and observations are sparse. Monthly biomass profiles were created by computing the median biomass concentration at each depth horizon, with uncertainty bounds approximated using the 25th and 75th percentiles. We use the median ± 25% rather than the mean ± 1 standard deviation to estimate the global carbon inventory because field experiments, particularly those specifically targeting blooms, are typically designed to sample where plankton are more abundant, likely creating a larger positive bias in the mean. Uncertainty bounds reflect the middle 50% of the MAREDAT data to be consistent with the uncertainty estimation of grazing pressure by ref. 48 (see below). Mean annual phytoplankton biomass (0.59 PgC (0.37, 1.25); Fig. 6e), zooplankton biomass (0.33 PgC (0.20, 0.78); Fig. 6f), total phytoplankton and zooplankton biomass (0.92 PgC (0.66, 1.7); Fig. 6g), and the ratio of zooplankton to phytoplankton biomass (0.57 (0.25, 1.6); Fig. 6h) are then computed by operating on the mean annual global PFT inventories and propagating uncertainty bounds accordingly.

The observed mean global grazing pressure (0.34 d−1 (0.24, 0.62); Fig. 6a) was computed from field measurements of microzooplankton48,49,50 and mesozooplankton51 phytoplankton-specific grazing mortality rates (i.e., grazing pressure) estimated primarily using shipboard dilution experiments and gut content analysis, respectively. Field measurements for microzooplankton from 110 studies were compiled by ref. 48, extending the work of ref. 49, and grouped into bio-geographic provinces as defined by ref. 159. Ref. 48 report the median, 25th and 75th percentile for microzooplankton grazing pressure (referred to as the grazing mortality rate) in each province. We estimate the global mean grazing pressure from microzooplankton using an area-weighted average across all open ocean provinces, with uncertainty propagated using the 25th and 75th percentiles for each province. Field measurements for mesozooplankton from 27 studies were compiled and used to fit a log-linear relationship between NPP and total autotrophic ingestion by mesozooplankton51. We use this relationship along with the median NPP in each province reported in Table 2 of ref. 48 to further estimate the associated total mesozooplankton autotrophic ingestion rate in each province, with uncertainty estimated using the standard error of the regression parameters reported by ref. 51. We then compute the area-weighted global mean mesozooplankton bulk grazing rate and divide by the global phytoplankton inventory (see above) to estimate the global mesozooplankton grazing pressure, add it to microzooplankton grazing pressure to reach total grazing pressure, and propagate uncertainty accordingly. Note, grazing pressure from mesozooplankton accounts for roughly 10% of total grazing pressure.

Schmoker et al. 48 also report the total autotrophic losses to microzooplankton grazing (PgC yr−1), allowing us to estimate the globally-integrated bulk phytoplankton grazing mortality rate (28.0 PgC yr−1 (23.2, 54.4); Fig. 6k) in a similar fashion, again using the relationship from49 for the mesozooplankton contribution. Dividing by globally-integrated zooplankton biomass from the MAREDAT dataset gives us zooplankton-specific grazing rates (0.23 d−1 (0.12, 0.61); Fig. 6b). Dividing by globally-integrated NPP gives us an estimate of global grazing efficiency (56% (46, 108); Fig. 6n). Note, a grazing efficiency over 100% indicates that uncertainties in the observations exceed the theoretical bounds. Additionally, we can multiply the globally-integrated bulk phytoplankton grazing mortality rate by an estimate of assimilation efficiency to reach GSP (7.9 PgC yr−1 (19.6, 38.4); Fig. 6l). We assume assimilation efficiency to be 0.7, as typically assumed by the ecological modeling community13,160,161,162,163, with an uncertainty range of 0.3–0.913,161,162,163.

Globally-integrated NPP (53 PgC yr−1 (46, 60); Fig. 6i) is reported per ref. 164, derived from ARGO-float measured diel-cycles in oxygen drawdown. This range is consistent with observationally-constrained model estimates (~51165 and 58 ± 7 PgC yr−1166) and the climatologic (2012–2018) mean Carbon-based Productivity Model (CbPM) remote sensing product (~51.1 PgC yr−1113) when integrated between the 60S–60N (as for the models). However, the vertically generalized productivity model (VGPM)167 estimates a lower climatologic mean (2012–2018; 60S–60N) of 42.7 PgC yr−1, and some observationally-constrained model estimates reach as low as ~35 PgC yr−1165. Note, integrating either remote sensing product from 90S-90N only increases the NPP by 1–2 PgC yr−1 and is thus reasonable to compare to full global estimates from refs. 164,165,166. Globally-integrated EP (6 PgC yr−1 (5, 12); Fig. 6j) is reported per ref. 168 but estimates are generally poorly constrained, ranging from 5–12 PgC yr−1168,169,170. Annually-averaged global export efficiency (11.3% (8.9, 22.7); Fig. 6m) is estimated from the globally-integrated NPP and EP estimates described above, with uncertainty propagated accordingly.

Globally-averaged phytoplankton-specific division rates (0.25 d−1 (0.10, 0.54); Fig. 6c) are derived by dividing NPP estimates from the literature by phytoplankton biomass from MAREDAT (see above) and propagating uncertainty. Note, using the mean annual climatologic CbPM-derived surface carbon inventory instead (0.46 PgC assuming a mean MLD of 100 m), yields a similar estimate (0.30 d−1); however, using field data reported in Table 2 of ref. 48 and computing the global average as for grazing pressure yields much higher estimates (0.52 d−1 (0.41, 0.81)), likely due to sampling bias. Here, we use the lower estimates to better account for large swaths and seasons of un-sampled, likely low-productivity water but note that grazing observations are still subject to sampling bias and likely overestimated.

Globally-averaged zooplankton-specific mortality rates (0.13 d−1 (0.10, 0.15); Fig. 6d) are approximated from the species mean mortality rates of sac-spawning copepods at the long-term global mean SST (15.3 C), with uncertainty calculated using the standard error of the regression fit between temperature and mortality reported in Table 2 of ref. 171. Note, these values were inferred from field estimates of fecundity and development using a steady-state assumption, not direct observations of mortality, and do not take into account differences in microzooplankton mortality rates171. We take these estimates to implicitly include respiration as a cost in the growth-side steady-state estimates, but note this is likely an underestimation as global epipelagic (0–200 m) respiration rates for zooplankton have been estimated at 0.11 ± 0.04 d−1172 and predation is estimated to account for 66–75% of the rates estimated by ref. 171.

Finally, it is important to note that given the sparsity of global observations and large sampling biases, smaller uncertainty windows often reflect less observations from which to calculate uncertainty than a true reflection of natural variability. That is, it is difficult to quantify the uncertainty in the uncertainty. Thus, while observational bounds provide useful context, they should be interpreted with caution.

Zonal rates and carbon stocks

Zonally resolved observational estimates of grazing pressure (Fig. 5b), zooplankton-specific grazing rates (Fig. 5e), zooplankton biomass (Fig. 5h), phytoplankton-specific division rates (Fig. S4e), phytoplankton biomass (Fig. S4H), bulk phytoplankton grazing mortality (Fig. S4b), total plankton biomass (Fig. S6b), and the ratio of zooplankton to phytoplankton (Fig. S6e) were computed in a similar manner to the global estimates described above. Following48, zonal bins were partitioned into the northern high latitudes (>60N), northern westerlies (30–60N), tropics (30S–30N), southern westerlies (30–60S), and southern high latitudes (<60S). No explicit seasonal delineations were made; however, the vast majority of field studies included in48 and28 were conducted outside of the winter months, so observational estimates are thus considered more representative of summer, and plotted accordingly.

First, zonal distributions of biomass for each PFT (i.e., Fig. 4 in ref. 28) were downloaded from the MAREDAT Pangea database93,94,154,155,156,157,158 and binned as described above. Similar to the global estimates, we found the median biomass for each PFT in each depth layer and zonal band, then integrated these profiles across depth and functional groups to yield the median phytoplankton, zooplankton, and total plankton inventory. Uncertainty bounds were computed using the same method but for the 25th and 75th percentiles. The ratio of zooplankton to phytoplankton biomass is then computed for each zonal band with uncertainty propagated accordingly.

Next, zonal estimates and uncertainty for bulk grazing mortality, grazing pressure, and specific grazing rates are computed identically to the global values described above, except now sorting the provinces of ref. 48 into the zonal bands described above and pairing them with the corresponding MAREDAT biomass zonal bins. Note, 48 do not distinguish the northern from southern westerlies, so grazing data is identical for each. However, we use zonally resolved biomass data to compute specific rates where required. Finally, the zonal estimate of phytoplankton-specific division rates are now computed using the zonal estimates of ref. 48.

Even more so than the global estimates, when interpreting uncertainty bounds, it is important to remember that these field measurements are sparse (~100–500 experiments per zonal band), patchy (see Fig. 1 in ref. 48), and likely subject to substantial sampling biases, as most studies are carried out during the spring or summer, deliberately in proximity to the spring bloom.

Model output and offline diagnostics

We analysed the output from historical simulations of 11 CMIP6 models (Table S3). Model output82,83,84,85,86,87,88,89,90,91,92 has been made publicly available by the Earth System Grid Federation (ESGF) and was downloaded from the Lawrence Livermore National Laboratory data node (https://aims2.llnl.gov/search). Note, three variables from ACCESS-ESM1.5 (phytoplankton biomass, zooplankton biomass, NPP) were not available on the ESGF-LNLL node and were downloaded directly from the Australian National Computing Infrastructure. For each model, we considered monthly climatologies created from the historic period from 1850 to 1900. Considering years up to 1900 provides a long enough time series to average out ENSO variability but stops before there is a substantial anthropogenic forcing, as we are interested in the non-transient dynamics stemming from structural differences in the grazing formulation between models. We used the pre-1990 historical simulation instead of the pre-industrial control because more variables of interest were available across all models. For each model, we downloaded monthly averaged column integrated NPP, exported through 100 m, and all depth-resolved phytoplankton and zooplankton plankton fields. Depth-resolved detritus, bacteria, temperature, and oxygen fields were additionally downloaded when required for offline rate computations. Any exceptions are discussed below.

We then calculated diagnostic rate terms offline, pixel-by-pixel, using a given model’s specified equations and parameters (Figs. 2 and S13) evaluated with the depth-resolved predator and prey biomass fields in addition to depth-resolved temperature and oxygen fields where required. The bulk phytoplankton grazing mortality rate (G, mmolC m−3 d−1), defined as the bulk rate at which all phytoplankton biomass is lost to zooplankton grazing, was calculated by integrating grazing losses from all phytoplankton pools to all zooplankton pools.

$$G=\mathop{\sum }\limits_{i=1}^{{n}_{z}}\mathop{\sum }\limits_{j=1}^{{n}_{p}}{g}_{{Z}_{i}}^{{P}_{j}}({{{{{{{\rm{Prey}}}}}}}}\,{{{{{{{\rm{Field,\; Temperature}}}}}}}}){Z}_{i},$$
(2)

where nz and np are the number of zooplankton and phytoplankton types, Zi is the biomass of zooplankton group i, and \({g}_{{Z}_{i}}^{{P}_{j}}\) is the zooplankton-specific grazing rate of Zi on Pj as described in Fig. 2, evaluated at the given prey field and temperature of each grid cell. Note, the presence of non-phytoplankton prey (i.e., zooplankton, bacteria, detritus) in the prey field does influence \({g}_{{Z}_{i}}^{{P}_{j}}\) in so much as it redirects grazing effort and provides refuge for phytoplankton; however, losses from non-phytoplankton pools are not included in the integration of G. Zooplankton-specific grazing rates (g, d−1) and grazing pressure (gp, d−1) are calculated by dividing by G by the total zooplankton and phytoplankton biomass pools.

$$g=\frac{G}{\mathop{\sum }\nolimits_{i = 1}^{{n}_{z}}{Z}_{i}}$$
(3)
$${g}_{p}=\frac{G}{\mathop{\sum }\nolimits_{j = 1}^{{n}_{p}}{P}_{j}}$$
(4)

Gross secondary production (GSP, mmolC m−3 d−1) is defined as the gross assimilation rate of phytoplankton biomass into zooplankton biomass, such that

$$GSP=\mathop{\sum }\limits_{i=1}^{{n}_{z}}\mathop{\sum }\limits_{j=1}^{{n}_{p}}{\gamma }_{{Z}_{i}}^{{P}_{j}}{g}_{{Z}_{i}}^{{P}_{j}}({{{{{{{\rm{Prey}}}}}}}}\,{{{{{{{\rm{Field,\; Temperature}}}}}}}}){Z}_{i},$$
(5)

where γ is the assimilation efficiency of Zi grazing on Pi. Here, we define γ to include all terms that scale G when computing the zooplankton biomass tendency but note that depending on the model, this term can implicitly or explicitly include sloppy feeding prior to ingestion, egestion as fecal production, and/or respiration to account for the energetic cost of assimilation. Thus, GSP does not mean exactly the same thing across models but consistently represents the gross assimilation of zooplankton biomass before accounting for losses to basal metabolic respiration, linear mortality (i.e., disease and senescence), and quadratic mortality (i.e., implicit predation). For a more direct comparison to ref. 48, grazing efficiency is defined here as the percent of NPP that is grazed (i.e., \(\frac{G}{{{{{{{{\rm{NPP}}}}}}}}}\)), regardless of how much is assimilated into zooplankton biomass (i.e., \(\ne \frac{{{{{{{{\rm{GSP}}}}}}}}}{{{{{{{{\rm{NPP}}}}}}}}}\)). Note, in some cases (i.e., CMOC), grazing efficiency exceeds 1. This is an artifact of the offline averaging over nonlinearities in the functional response, which will overestimate grazing in all models. Nonetheless, the high grazing efficiency in CMOC reflects very high grazing rates, a very low K1/2 value, and the fact that essentially all NPP is grazed.

Bulk zooplankton mortality rates (M, mmolC m−3 d−1) were calculated as the sum of all zooplankton biomass loss terms, including basal metabolic losses to respiration, less the biomass that is assimilated back into other zooplankton pools via carnivorous grazing, such that

$$M=\mathop{\sum }\limits_{i=1}^{{n}_{z}}{M}_{i}+I,$$
(6)

where Mi is the mortality rate for a given zooplankton class excluding losses to other carnivorous zPFTs and I accounts for all biomass lost via assimilation inefficiencies during carnivorous grazing. Both terms are detailed in Fig. S13 for all models. Although metabolic losses to respiration are not mortality, per se, they are included because they are generally small compared to other losses, and it allows for consistent comparison across models which do not consistently attribute their linear loss terms to the same processes. Zooplankton-specific mortality rates were reached by dividing by the total zooplankton biomass pool,

$$m=\frac{M}{\mathop{\sum }\nolimits_{i = 1}^{{n}_{z}}{Z}_{i}}.$$
(7)

Finally, phytoplankton-specific division rates, μΣ, (d−1), were computed in a depth-averaged sense as depth-integrated NPP divided by total depth-integrated phytoplankton biomass.

In a few cases, assumptions had to be made to fill in data that was not available. OECO-v2 includes a diazotroph class, but only a single phytoplankton variable was uploaded. We assume the partitioning between generic phytoplankton and diazotrophs was 2%, equal to the global median averaged across the other two models that resolved (and publically hosted) diazotrophs (CESM2 and COBALTv2). COBALTv2 only provided surface fields for bacteria (a prey option for small zooplankton) and included medium and large zooplankton together in a single output field. Here, depth-resolved bacteria fields were assumed proportional to the profile of total phytoplankton. Based on personal communication (see Table S3), medium and large zooplankton were assumed to be split 40/60 in highly productive waters, 60/40 in mesotrophic waters, and 90/10 in oligotrophic waters. Productivity biomes were delineated based on the 33rd and 66th percentiles of annually-averaged, depth-integrated NPP. Note, COBALTv2 provided explicit grazing rates, which would obviate both assumptions but introduce uncertainty in comparisons with the other models, which had to be computed offline. Here we report the offline diagnostics for consistency, but note that the median bias between offline and explicit zooplankton grazing rates in the top 200 m of COBALTv2 is −14%. Moreover, using the explicit rate terms would not change our primary results regarding inter-model differences in grazing pressure, which differed by over 4000% between the most extreme models during the winter. In CanOE, the temperature-limited respiration rate for zooplankton (β1TLimZ; Fig. S13) does not result in the catabolism of existing biomass when excess carbon is available in ingested prey (which have variable stoichiometry) relative to the fixed zooplankton (Redfield) stoichiometry. Nutrient pools for individual phytoplankton types were not available, so we determined excess carbon based on the stoichiometry of the total phytoplankton population. In PISCESv2, the assimilation efficiencies depend on variable prey stoichiometry and are approximated as their maximum efficiency and nutrient-limited excretion terms were omitted, meaning GSP may be slightly overestimated and mortality rates may be slightly underestimated.

Finally, in addition to the 11 CMIP6 models, in Figs. 5 and S4S6 we included the model of ref. 96, which is part of the DARWIN Project, run on the MITgcm at 1 × 1 resolution, and includes 31 and 16 size-structured phytoplankton and zooplankton groups, respectively. Here grazing rates were saved explicitly at monthly resolution and integrated across phytoplankton and zooplankton groups offline. To account for the inclusion of mixotrophs, phytoplankton biomass is considered anything that is a primary producer (i.e., phytoplankton and mixotrophs) while zooplankton biomass is considered anything that consumes primary producers (i.e., herbivorous zooplankton, omnivorous zooplankton, and mixotrophs). The bulk phytoplankton grazing mortality rate, G, thus includes phytoplankton and mixotroph losses to grazing by both zooplankton and mixotrophs. Note, Darwin also includes zooplankton carnivory and 7 types of bacteria. These are not included explicitly in our diagnostics, but since they dynamically influence zooplankton biomass and grazing effort, their inclusion, and the broader structure of the food web, is reflected in our diagnostics.

Lastly, note that when computing mean or median values, all variables are weighted to appropriately reflect differences in the area of grid cells and distribution of biomass. For instance, globally-averaged zooplankton-specific grazing rates are weighted by the zooplankton biomass inventory of each grid cell, such that the grazing rate of a very small concentration of zooplankton in a small grid cell counts proportionally less than that of a large, densely populated one. Similarly, grazing pressure is weighted by phytoplankton biomass to reflect the grazing pressure felt by the mean (or median) phytoplankton in the model.

WOMBAT configuration

We tested a range of different grazing formulations (see “Sensitivity experiments” below) in a 3D global BGC model. Given the established sensitivity of BGC models to the underlying physics173,174, each experiment was embedded in an identical general circulation model and subject to the same atmospheric forcing. All non-grazing parameters were kept identical across runs, unless otherwise noted.

The BGC model used, the Whole Ocean Model of Biogeochemistry and Trophic-dynamics (WOMBAT)66, is the BGC component of the current Australian Earth Systems Model (ACCESS-ESM1.5)175 and has been used extensively in previous studies176,177,178,179 including the CMIP6 project referenced here66. WOMBAT is a standard NPZD model that routes nutrients (and implicitly carbon) between one phytoplankton pool, one zooplankton pool, one dissolved inorganic pool, and one detritus pool (i.e., particulate organic carbon)179. Briefly, a single phytoplankton pool grows based on a nutrient, light, and temperature-dependent photosynthetic growth rate and is grazed by a single zooplankton pool. Autotrophic cellular respiration is considered implicitly in the growth term, meaning phytoplankton accumulation via photosynthesis represents net primary production, while heterotrophic respiration and excretion is considered explicitly, meaning zooplankton accumulation via grazing represents gross secondary production. Microbial decomposition implicitly returns a phytoplankton mortality flux and detritus remineralization flux to the inorganic nutrient pool. The fraction of NPP that is eventually transferred to export depends on the size of the detritus pool, which is fed by a quadratic phytoplankton aggregation term, quadratic zooplankton mortality term, and a small fraction of phytoplankton lost to grazing that directly enters detritus instead of zooplankton biomass due to sloppy feeding. The zooplankton mortality/aggregation rate is 20% larger than the phytoplankton aggregation rate. A schematic of the standard WOMBAT configuration and parameterization is provided in Fig. S7.

The ocean model used for each experiment is based on the ocean component of the ACCESS global climate model that is described in detail in ref. 180. In brief, the model here is a global configuration of Modular Ocean Model version 5181 that is nominally 1-degree resolution, with extra latitudinal resolution near the equator and in the Southern Ocean. There are 50 vertical levels, with 10 m resolution in the top 200 and 300 m resolution in the abyss.

Each run was initialized from the same initial state and forced with the same surface flux and freshwater runoff from the Japanese 55-year atmospheric reanalysis surface dataset, JRA55-do182. After initialization, each run was spun up for 5 years to a quasi-steady state, at which point the model had equilibrated with the changes made to its grazing formulation. Model output is reported from the 5th year of the simulation and can be considered climatological, as it is embedded in a repeat-climatological physical ocean.

Sensitivity experiments

In the standard “Slow Turnover” suite of experiments (Figs. 7, 9 and S8; solid lines), we ran 19 simulations, each with different grazing formulations but an otherwise identical model configuration. We test 18 parameter combinations of g\({}_{\max }\) (0.5, 1.0, 2.0 d−1) and K1/2 (0.5, 1.0, 2.0, 4.0, 8.0, 16.0 mmolC m−3), which cover a representative range of parameter values used in models and estimated empirically39. Additionally, one control run was simulated using the most common WOMBAT configuration (Type III; K1/2 = 6.68 mmolC m−3; g\({}_{\max }\) = 1.58 d−1)66. All simulations in the “Slow Turnover” experiment suite use a type III response, as in the control configuration of WOMBAT, and all other parameter values are as described in Fig. S7. We then repeated each simulation in two additional experiment suites (for a total of 57 unique runs), one using a faster phytoplankton growth rate parameter and one using a type II response curve (Table S4). Otherwise, the grazing formulation and all other parameter values were identical across experiment suites.

We ran the “High Turnover” experiment suite to determine how carbon cycling is modified by models that are tuned to the same global, mean-annual NPP (\(\overline{{{{{{{{\rm{NPP}}}}}}}}}\)) by pairing either faster or slower zooplankton and phytoplankton growth rates. Here, we allowed for more efficient nutrient utilization by parameterizing a higher nutrient and light-replete phytoplankton saturation growth rate (μsat). The value of μsat in the “High Turnover” experiment suite was chosen to match the value used in MARBL68, the BGC model which has the largest PGI (Figs. 1, 2) and thus needs the strongest compensation from other parameters to be tuned to observed \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\). Accordingly, the μsat value used in MARBL and our “High Turnover” experiment suite (2.38 d−1) is 3.2-times faster than the WOMBAT value used in the “Slow Turnover” experiment suite (0.75 d−1) at 16 C. Note, μsat is temperature-dependent and temperature limitation is treated with slightly different equations in MARBL (μsat = 5 × 1.7(T−30)/10) and WOMBAT (μsat = μsat. × 1.066T). We chose \({\mu }_{\max }\) such that \({\mu }_{\max }.* 1.06{6}^{T}=5* 1.{7}^{(T-30)/10}\) at T = 16 C, roughly the ACCESS global mean SST. Although \({\mu }_{\max }\) in our “High Turnover” experiment suite at T ≠ 16 C will slightly differ from that of MARBL it remains roughly proportionately elevated relative to our other experiment suites and thus is not relevant to our results.

To quantify the divergence of the ‘High Turnover’ experiment suite (Fig. 9; dashed lines) from the “Slow Turnover” experiment suite (Fig. 9; solid lines) we consider two ensembles with, on average, identical \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) (see Fig. 9). To do so, we fit a curve to the relevant model output from each simulation plotted against their associated \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) (Fig. 9). We then averaged those curves across the same range of \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\), such that each ensemble must have identical \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\), but may vary across other dimensions of model output (Fig. 8; horizontal). We constrained the range of \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) to that where both experiment suites overlap, such that the curves we averaged across were never extrapolated beyond actual runs (see Fig. 9; dashed vertical lines). This corresponds to a range of roughly 11 to 44 PgC yr−1, or the minimum and maximum \(\overline{{{{{{{{\rm{NPP}}}}}}}}}\) computed from the “High Turnover” and “Slow Turnover” experiment suites, respectively. This range is bracketed in Fig. 9, along with the associated mean model output from each ensemble and their percent deviation.

Additionally, we ran a “Type II” experiment suite to determine how a qualitatively different response curve modifies the sensitivity to its parameter selection (Table S4). The sensitivity of marine carbon cycling to the PGI when using a type II response (dashed lines) instead of a type III response (solid lines) is shown in Fig. S8 and discussed in “Supplemental Discussion 1”.

Finally, note that WOMBAT quantifies biomass in nitrogen units and uses disk parameters to describe the functional response. Therefore, before prescribing parameters in WOMBAT, all parameter values reported here were first converted from carbon to nitrogen units using a C:N ratio 106:16 and then from Michaelis–Menten parameters into disk parameters (ϵ and \({{{{{{{{\rm{g}}}}}}}}}_{\max }\)) using \(\epsilon ={{{{{{{{\rm{g}}}}}}}}}_{\max }\) K\({}_{1/2}^{-2}\) for a type III response and \(\epsilon ={{{{{{{{\rm{g}}}}}}}}}_{\max }\) K\({}_{1/2}^{-1}\) for a type II response. For detailed information on the difference between the two mathematically equivalent parameter schemes, see ref. 39. This conversion has no effect on the prognostic integration of the model and parameter values were reported in their Michaelis–Menten form with carbon units for convenience, clarity, and comparison.