Distribution of mesozooplankton biomass in the global ocean

Mesozooplankton are cosmopolitan within the sunlit layers of the global ocean. They are important in the pelagic food web, having a significant feedback to primary production through their consumption of phytoplankton and microzooplankton. In many regions of the global ocean, they are also the primary contributors to vertical particle flux. Through both, they affect the biogeochemical cycling of carbon and other nutrients in the oceans. Little, however, is known about their global distribution and biomass. While global maps of mesozooplankton biomass do exist in the literature, they are usually hand-drawn maps for which the original data are not readily available. The dataset presented in this synthesis has been in development since the late 1990s, is an integral part of the Coastal and Oceanic Plankton Ecology, Production, and Observation Database (COPEPOD), and is now also part of a wider community effort to provide a global picture of carbon biomass data for key plankton functional types, in particular to support the development of marine ecosystem models. A total of 153 163 biomass values were collected, from a variety of sources, for mesozooplankton. Of those, 2 % were originally recorded as dry mass, 26 % as wet mass, 5 % as settled volume, and 68 % as displacement volume. Using a variety of non-linear biomass conversions from the literature, the data have been converted from their original units to carbon biomass. Depth-integrated values were then used to calculate an estimate of mesozooplankton global biomass. Global epipelagic mesozooplankton biomass, to a depth of 200 m, had a mean of 5.9 μg C L−1, a median of 2.7 μg C L−1 and a standard deviation of 10.6 μg C L−1. The global annual average estimate of mesozooplankton in the top 200 m, based on the median value, was 0.19 Pg C. Biomass was highest in the Northern Hemisphere, and there were slight decreases from polar oceans (40–90°) to more temperate regions (15–40°) in both hemispheres. Values in the tropics (15° N–15° S) were intermediate between those at the northern and southern temperate latitudes. Datasets available at doi:10.1594/PANGAEA.785501.


Introduction
Mesozooplankton are found throughout the world's oceans. They are defined as zooplankton ranging from 200 µm to 2 cm (Sieburth et al., 1978), consisting primarily of crustacean plankton (copepods), meroplanktonic larvae and smaller individual gelatinous zooplankton. Mesozooplankton are traditionally sampled by towed nets with mesh sizes ranging from 200 to 333 µm (Harris et al., 2000). They feed directly on phytoplankton, microzooplankton, other mesozooplankton and detritus, and have a significant feedback to primary production (Buitenhuis et al., 2006). In the global ocean they are one of the primary contributors to vertical particle flux. Thus they are important in both the pelagic food web and export production, affecting the biogeochemical cycling of carbon and other nutrients in the oceans.
Published by Copernicus Publications.
While global maps of mesozooplankton biomass exist in the literature (Bogorov et al., 1968; Reid Jr., 1962), they exist only in the form of hand-drawn maps, and the original data compiled for creating these maps are not widely available, if at all. Volume 5 of the World Ocean Atlas (WOA) 2001 (O'Brien et al., 2002) was one of the first freely available, global data compilations of zooplankton biomass created. Since then, this dataset has been expanded upon in method and data content at fairly regular intervals (O'Brien, 2005, 2007, 2010). For this synthesis, data from O'Brien (2010), along with additional new data, have been processed through the new and hybrid techniques outlined in this document.
Mesozooplankton are an important group within the plankton community. While mesozooplankton and microzooplankton collection methods and biogeochemical contribution differ greatly, a distinction is not always made between the two groups in biogeochemical models that represent all zooplankton as one box, e.g., nutrient-phytoplankton-detritus-zooplankton (NPDZ) models. NPDZ models have been shown to underestimate the interannual variability of chlorophyll a, which suggests these models also underestimate decadal- and century-scale sensitivity of climate variability (Buitenhuis et al., 2006). Models that more closely represent our current understanding of the marine ecosystem are being built in an effort to address this issue. Mesozooplankton communities have been shown to exhibit decadal-scale variability with climate (Beaugrand et al., 2003), and as they have an effect on both primary production and carbon export they need to be explicitly represented in biogeochemical models (Le Quéré et al., 2005). Including mesozooplankton sensitivity to climate variability on a decadal scale, in models that capture important marine ecosystem processes, should bring us closer to modeling the response and feedbacks between marine ecosystems and climate variability, which are largely unknown at present. There is a pressing need for observations that allow the development and validation of these models, and mesozooplankton constitute a group of significant importance in this regard.
The data presented in this paper are part of a wider community effort known as MARine Ecosystem DATa (MAREDAT). MAREDAT is a collection of global biomass datasets. It contains data on the global distribution of a variety of the major plankton functional types (PFTs) currently represented in marine ecosystem models. These include picophytoplankton, diazotrophs, coccolithophores, Phaeocystis, diatoms, picoheterotrophs, microzooplankton, mesozooplankton, pteropods and macrozooplankton. MAREDAT is part of the MARine Ecosystem Model Inter-comparison Project (MAREMIP) that led to this compilation of observation-based global biomass datasets. The biomass data that populate MAREDAT are freely available for use in model evaluation and development, and to the scientific community as a whole.
The original mesozooplankton biomass data extracted from COPEPOD were run through standard COPEPOD translation and standardization routines (Sect. 2.1), converted to common biomass units, sampling mesh sizes, and depth intervals (Sect. 2.2), and run through standard COPEPOD quality control routines and secondary quality control measures (Sect. 2.3). The results of the quality control routines and the gridded mesozooplankton carbon biomass data are examined and discussed in Sect. 3.

Origin of data
Mesozooplankton biomass data were extracted from the Coastal and Oceanic Plankton Ecology, Production, and Observation Database (COPEPOD, http://www.st.nmfs.noaa.gov/copepod), a global plankton database project of the US National Marine Fisheries Service (NMFS). COPEPOD's data content comes from ongoing and historical NMFS ecosystem surveys and monitoring projects, from data rescued by COPEPOD's Historical Plankton Data Search and Rescue project (COPEPOD-SAR), from international institutional and project-based sampling programs, and from individual investigators (e.g., thesis data, individual cruises).
COPEPOD's data, including mesozooplankton data, come from a wide variety of sources and in a wide variety of formats. A two-phase process allows data to be translated faithfully from the original file format and variables to the COPEPOD variable definition set and data structure. In the first phase, the original methods and metadata documentation are reviewed to ensure accurate representation during translation.
Once the original values are available in standard COPEPOD electronic format, there are two issues: (1) original units are not always comparable, and (2) taxonomic resolution is not always uniform. During the second phase, common base unit values are transformed into standard units and all taxonomic data are standardized and classified into groupings. For the purposes of this synthesis all mesozooplankton biomass values have also been converted to µg C L−1. For more information in relation to the treatment and standardization of data in COPEPOD, see O'Brien (2010) (http://www.st.nmfs.noaa.gov/copepod/2010).
A total of 110 datasets were used in the global mesozooplankton biomass compilation. Table 1 lists the first 30 of these datasets, ranked in order of their spatial contribution, which represent 80 % of the spatial data coverage and 80 % of the total observations. The remaining 80 datasets individually contribute less than 1 % each to the spatial coverage. The datasets in Table 1 were ranked and sorted by the number of monthly 1 × 1 degree grid cells (Mcells) of spatial coverage, which they contributed to the global gridded fields, as opposed to ranking by number of observations. This method gives a higher ranking to the most spatially visible members in the global grid, such as the International Indian Ocean Expedition (IIOE), which is the most visible and dominant

Biomass conversion
There are four different types of biomass within the COPEPOD mesozooplankton dataset: wet mass, dry mass, displacement volume, and settled volume (see Fig. 1). The determination of total sample biomass or biovolume, as compared to microscope-based full sample identification and enumeration, is relatively fast and simple and is therefore the most prevalent zooplankton measurement type and method found in both historical and ongoing mesozooplankton monitoring and survey programs (O'Brien, 2010; O'Brien et al., 2011). Of the largest data contributors to the database, the ongoing NMFS survey projects EcoFOCI, CalCOFI, SEAMAP, and EcoMon/MARMAP exclusively use displacement volume; Japanese survey programs almost exclusively use wet mass, and historical sampling by Russian/former Soviet Union (FSU) surveys uses a mixture of wet mass, displacement volume, and settled volume. Dry mass data are rare, coming primarily from the most recent sampling programs (e.g., JGOFS, GLOBEC, Norwegian Sea Survey). Published equations allow these four biomass types to be converted to carbon biomass. Total carbon mass was selected as the common zooplankton biomass proxy because of its fundamental use in food chain and energy flow applications (Harris et al., 2000; Wiebe et al., 1975) and the abundance of published conversion equations to this biomass type (e.g., Cushing et al., 1958; Balvay, 1987; Wiebe, 1988; Bode et al., 1998; Harris, 2000). The non-linear biomass conversion equations of Balvay (1987), Wiebe (1988) and Bode et al. (1998) are used (Table 2).
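As a rough illustration of what a non-linear conversion to carbon can look like, the sketch below assumes a log-log (power-law) form, log10(C) = intercept + slope × log10(X). This form is an assumption made here for illustration. The actual equations and coefficients are those of Table 2, which are not reproduced in this text, so the function takes the coefficients as arguments and the name `to_carbon` is hypothetical.

```python
import math


def to_carbon(value, intercept, slope):
    """Convert a biomass or biovolume value to carbon with a log-log relation.

    Assumed form (illustrative only, not taken from Table 2):
        log10(carbon) = intercept + slope * log10(value)
    The caller must supply the coefficients for the relevant biomass type.
    """
    if value <= 0:
        raise ValueError("biomass value must be positive for a log transform")
    return 10 ** (intercept + slope * math.log10(value))
```

Keeping the coefficients outside the function makes it harder to mix up the equations for wet mass, dry mass, displacement volume and settled volume.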

Sampling mesh sizes
The mesozooplankton size fraction was extracted from COPEPOD by selecting only data from mesh sizes 150 to 650 µm. Three general mesh groups occur, centered on 200 µm, 333 µm, and 505 µm (Fig. 2a). Historically, the most common mesh size was 333 µm (Fig. 2b), used by large, and often continuous, monitoring programs carried out by the US and Japan and by historical multi-national projects such as IIOE and NORWESTLANT. Recently sampled data, as well as the historical Russian/FSU data, more often used a 200 µm mesh (Fig. 2c). Finally, large areas of the eastern Pacific used 505 µm mesh nets for their ichthyoplankton-focused surveys (Fig. 2d). A mesh category, mCAT, was assigned to each of these groupings, labeled m200, m333, m505. The original values and the assigned mCAT values are both documented in the original mesozooplankton dataset.
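The mesh grouping described above can be sketched as follows. The 150 to 650 µm selection window and the three group centers come from the text. The boundaries between groups are not given, so assigning each mesh size to the nearest center is an assumption, and `assign_mcat` is a hypothetical helper name.

```python
# Group centers in micrometres, as named in the text.
MESH_CENTERS_UM = {"m200": 200, "m333": 333, "m505": 505}


def assign_mcat(mesh_um):
    """Return the mCAT label for a mesh size, or None if outside 150-650 um.

    Assumption: a mesh size belongs to the group whose center is nearest.
    The actual group boundaries used in COPEPOD are not stated in the text.
    """
    if not 150 <= mesh_um <= 650:
        return None
    return min(MESH_CENTERS_UM, key=lambda cat: abs(MESH_CENTERS_UM[cat] - mesh_um))
```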
Mesh size affects what components of the zooplankton population are actually caught (Landry et al., 2001; Hernroth, 1987; DeVries and Stein, 1991; Colton et al., 1980), with smaller mesh nets generally collecting more biomass than larger mesh nets due to their better capture of smaller taxa and smaller life stages. As each mesh size does not offer complete geographic coverage (333 µm is absent in the mid-Atlantic and Southern Ocean, and 200 µm is absent in the equatorial and eastern Pacific), the mesh conversion equations used in O'Brien (2005) (http://www.st.nmfs.noaa.gov/copepod/2005) were calculated using the updated mesozooplankton biomass data presented here in Table 3. As 333 µm was the most numerically abundant data type, and co-sampled 333 and 505 µm data were more prevalent than 200 and 500 µm co-sampled data, all values were converted to their 333 µm equivalents. In general, smaller mesh nets capture a larger portion of the smaller species and smaller life stages, while larger mesh nets capture less of the smaller species and life stages (Harris et al., 2000). The equations in Table 3 reduce the biomass values from 200 µm mesh nets, and increase the biomass values from 505 µm nets, to make them reasonably equivalent to data sampled with a 333 µm mesh net.
Earth Syst. Sci. Data, 5, 45-55, 2013, www.earth-syst-sci-data.net/5/45/2013/

Depth intervals
Zooplankton and mesozooplankton alike are unevenly distributed with depth. Unlike the discrete depths of bottle-sampled plankton (e.g., 10 m, 25 m), over 95 % of the available mesozooplankton data in COPEPOD were sampled with a single net towed over a single depth interval that generally runs from a target depth to the surface, e.g., 0-50 m, 0-100 m, with 0-150 m and 0-200 m being the most common (Fig. 3). Zooplankton data sampled from these depth intervals can be used to describe the average population throughout that interval, but they cannot be used to discuss data at an individual depth level, e.g., 20 m. A small handful of data were sampled at multiple depth intervals using a multiple net sampler.
To fit the grid used in the MAREMIP database, the mesozooplankton data were stored at the WOA depth level representing the midpoint of the tow interval. For example, the 0-40 m interval (zCAT i040) was stored as 20 m (WOA level 2) while the 0-200 m interval (zCAT i200) was stored as 100 m (WOA level 7) (see Table 4). The data were gridded using the original entries for latitude, longitude and month from all datasets. Mesozooplankton concentrations in µg C L−1 were binned on the 4-dimensional WOA grid. This is a monthly grid with horizontal resolution of 1 × 1 degree and 33 vertical depth levels, with the first ten levels representing depths 0, 10, 20, 30, 50, 75, 100, 125, 150, and 200 m. Depth intervals were assigned to represent WOA levels, as described above. Only data that were gridded in the top 200 m of the ocean were used for calculation of global epipelagic mesozooplankton annual average biomass.
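The binning step described above can be sketched as follows. The ten depth values and the midpoint rule come from the text. Snapping a midpoint to the nearest listed depth and averaging the values that fall in one cell are assumptions made here, and the names `woa_depth_for_tow` and `grid_mean` are hypothetical. Table 4 holds the actual interval-to-level mapping.

```python
import math
from collections import defaultdict

# Depths (m) of the first ten WOA levels, as listed in the text.
WOA_DEPTHS_M = [0, 10, 20, 30, 50, 75, 100, 125, 150, 200]


def woa_depth_for_tow(upper_m, lower_m):
    """Depth (m) of the WOA level nearest the midpoint of a tow interval."""
    midpoint = (upper_m + lower_m) / 2
    # Assumption: ties resolve to the shallower level (first match in the list).
    return min(WOA_DEPTHS_M, key=lambda d: abs(d - midpoint))


def grid_mean(samples):
    """Average concentrations per (lat cell, lon cell, month, depth) bin.

    Each sample is (lat, lon, month, upper_m, lower_m, conc_ug_c_per_l).
    Cells are identified by the floor of latitude and longitude (1 x 1 degree).
    """
    bins = defaultdict(list)
    for lat, lon, month, upper_m, lower_m, conc in samples:
        key = (math.floor(lat), math.floor(lon), month,
               woa_depth_for_tow(upper_m, lower_m))
        bins[key].append(conc)
    return {key: sum(vals) / len(vals) for key, vals in bins.items()}
```

With this rule a 0-40 m tow lands at 20 m and a 0-200 m tow at 100 m, matching the two examples given in the text.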

Quality control
Numerical range-based quality control of zooplankton data is complicated because of differences in sampling method, mesh size, seasonality and diurnal vertical migration (O'Brien, 2007). The mesozooplankton data acquired from COPEPOD already have quality control flags assigned to each value by COPEPOD. The COPEPOD quality control method (O'Brien, 2007, http://www.st.nmfs.noaa.gov/copepod/2007) for zooplankton biomass data divides the world into 15 major geographic basins, six mesh size categories, 12 months, four seasons and four biomass types.
Table 4 notes (continued): Min zdiff allowed is important if a tow is shorter than (found within) the min and max depths; this makes sure it has at least an 80 % coverage of the interval. (This is to prevent a "0-500 m" tow from being comprised of a 400-500 m-only depth fragment.) Supplementary note: in any given tow interval within the COPEPOD dataset, a bottom depth correction flag (BDCF) will be set if the bottom depth at the sampling location is less than the lower target range for a given zCAT. This means that a 0-100 m tow in a 110 m bottom depth area would qualify as an i100, i150, i200, i250, i300, i400, and i500 value. Except for the i100, the other depths would include a "BDCF" marked in the data file. This allows a user to use all data from a single depth category, i.e., i200, or to combine multiple depth categories, i100, i150, i200, by excluding any BDCF flags to remove duplicated data between the multiple depth files.
The COPEPOD 2007 quality control system has three different types of outlier warning flags that are assigned based on three n-dependent ranging tiers. If a data value falls outside of 99 %, 99.9 % or 99.99 % of all other available same-category data present within the COPEPOD database, it is flagged. Using the COPEPOD quality control system, an individual mesozooplankton wet mass value collected with a 333 µm mesh size in the North Pacific is compared to (1) the full numeric range of all wet mass data present within COPEPOD, e.g., all other wet mass data sampled in any oceanic region in any month (F1); (2) the basin-specific annual range, e.g., wet mass data sampled only in the North Pacific in any month (F2); and (3) the seasonal range, e.g., wet mass data sampled only in the North Pacific in June, July or August (F3). For the purposes of compiling the mesozooplankton biomass data, COPEPOD mesozooplankton biomass values were excluded if their flagging indicated that they fell outside of 99.9 % of same-category data from any region and any month (F1), fell outside of 99 % of same-category data from the same region regardless of season (F2) and during the same season (F3). The stricter criterion used for the F1 flag's range checking is intended to detect extreme outliers at the global (and any season) level without excluding reasonable differences due to geographic sub-regions and/or season (which are tested by the F2 and F3 range flagging).
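The three-tier range check (F1, F2, F3) can be sketched as below. This is not the COPEPOD implementation. It assumes that "outside of 99 %" means outside the central 99 % of the reference values, that the records passed in already share one biomass type and mesh category, and that any single flag is enough to exclude a value (one possible reading of the exclusion rule). All function names and record fields are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile of pre-sorted values, q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def outside(value, reference, coverage):
    """True if value lies outside the central `coverage` fraction of reference."""
    ordered = sorted(reference)
    tail = (1 - coverage) / 2
    return value < percentile(ordered, tail) or value > percentile(ordered, 1 - tail)


def should_exclude(rec, records):
    """Apply the F1/F2/F3 checks to one record (dict: value, basin, season)."""
    all_vals = [r["value"] for r in records]
    basin_vals = [r["value"] for r in records if r["basin"] == rec["basin"]]
    season_vals = [r["value"] for r in records
                   if r["basin"] == rec["basin"] and r["season"] == rec["season"]]
    f1 = outside(rec["value"], all_vals, 0.999)    # global range, any month
    f2 = outside(rec["value"], basin_vals, 0.99)   # basin-specific annual range
    f3 = outside(rec["value"], season_vals, 0.99)  # basin-specific seasonal range
    # Assumption: any one flag triggers exclusion.
    return f1 or f2 or f3
```

For simplicity the reference sets here include the value being tested, whereas the text compares each value against all other same-category data.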
The suggested minimum quality control for the MAREDAT datasets was to apply Chauvenet's criterion for data rejection (Glover et al., 2011; Buitenhuis et al., 2012). Chauvenet's criterion was applied only to the log-transformed mesozooplankton biomass data, which are normally distributed. The mean x̄ and the standard deviation σ of the log-transformed data were calculated and used to calculate the critical value x_c. One half of 1/(2n) was used as Chauvenet's criterion in a two-tailed test; however, only data on one tail, the high one (above x̄ + x_c), were rejected.

Results and discussions
No values were rejected using Chauvenet's criterion, all values being lower than the critical value of the mean + 4.6534 × standard deviation. Sampling protocols, handling, preservation and measurement techniques were not considered when removing outliers. These variables are assumed reasonably consistent within COPEPOD, but are most likely not uniform across datasets and projects. Issues related to sampling such as the inherent variability of field populations (Landry et al., 2001), mesh size, type of net, gear avoidance, seasonal/diel vertical migrations, sample handling, e.g., sample splitting, size fractionation and sample analysis, all sources of random sampling error, were considered to have a greater effect than the sampling bias issues found across projects/datasets.
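The one-sided application of Chauvenet's criterion described in the quality control section can be sketched with the Python standard library. The tail probability of one half of 1/(2n), i.e. 1/(4n), follows the text. The function names are hypothetical and the choice of log base is an assumption that does not affect which values are rejected, since z-scores are unchanged by a change of base.

```python
import math
from statistics import NormalDist, mean, stdev


def chauvenet_z(n):
    """Number of standard deviations above the mean at which data are rejected."""
    tail_probability = 1.0 / (4.0 * n)  # one half of 1/(2n)
    return NormalDist().inv_cdf(1.0 - tail_probability)


def reject_high_tail(biomass):
    """Return the biomass values kept after rejecting high-tail outliers.

    The criterion is applied to log10-transformed values; only values above
    mean + z * standard deviation are dropped, the low tail is left untouched.
    """
    logs = [math.log10(v) for v in biomass]
    cutoff = mean(logs) + chauvenet_z(len(logs)) * stdev(logs)
    return [v for v, lv in zip(biomass, logs) if lv <= cutoff]
```

For n = 153 163, `chauvenet_z` returns about 4.65, which is consistent with the factor of 4.6534 quoted in the text.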

Biomass description
The mesozooplankton biomass database contains 153 163 data points. Data from a number of stations that have been sampled repeatedly over many years, or programs where measurements have been made on a fine-resolution grid, have been included. Therefore, after gridding, we obtained 42 245 data points on the WOA grid (1° × 1° × 12 months × 33 depths), representing coverage of annually averaged biomass for 20 % of the ocean surface. To limit the overrepresentation of well-sampled locations, we present results of the gridded data.
The gridded data were split between regions as follows: 46 % of the data were found in the Pacific Ocean, 16 % in the Atlantic Ocean, 16 % in the Indian Ocean and 14 % in the polar oceans. The tropics, including the equatorial Atlantic, equatorial Pacific and Indian Ocean, which represent 43 % of the ocean surface, accounted for 39 % of the data. In contrast, 14 % of the data came from the polar oceans, which represent 5 % of the ocean surface. Only 22 % of the data were found in the Southern Hemisphere (Fig. 4). There is some sampling bias towards the local summer season (Fig. 5e and f), with peak cells found in summer months in both hemispheres.

Global estimates
Global estimates of mesozooplankton biomass were calculated from the gridded data in the top 200 m of the global ocean (see Table 5). Global mesozooplankton biomass had a mean of 5.9 µg C L−1, a median of 2.7 µg C L−1 and a standard deviation of 10.6 µg C L−1. Biomass was highest in the Northern Hemisphere, and there were slight decreases from polar oceans (40-90°) to more temperate regions (15-40°) in both hemispheres. Values in the tropics (15° N-15° S) were intermediate between those at the northern and southern temperate latitudes. The standard deviation within the latitude bands was high, so the differences in the mean were not significant. The global total of mesozooplankton carbon biomass in the top 200 m of the ocean was estimated at 0.19 Pg C. This total was based on the median value.
An overview of all 11 PFT groups currently included in the MAREDAT project is given in the Introduction to the MAREDAT ESSD Special Issue (see Buitenhuis et al., 2012). A comparison of all PFT biomasses, e.g., picophytoplankton, diazotrophs, coccolithophores, Phaeocystis, diatoms, picoheterotrophs, microzooplankton, mesozooplankton, pteropods and macrozooplankton, is also presented. It is important to be aware that the majority of the other plankton groups only have a small fraction of the data coverage seen in the mesozooplankton data of this paper. For these groups, the spatial and temporal coverage were limited such that only a basic comparison of "latitudinal ranges" and "annual averages" was possible.
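The 0.19 Pg C total can be checked with a back-of-envelope calculation from the median concentration. The sketch below is a consistency check only, not the calculation used for Table 5. It assumes an ocean surface area of about 3.6 × 10^14 m² (a value not stated in the text) and a uniform 200 m layer, ignoring areas shallower than 200 m.

```python
OCEAN_AREA_M2 = 3.6e14   # assumption: approximate global ocean surface area
LAYER_DEPTH_M = 200.0    # epipelagic layer considered in the text


def global_biomass_pg_c(conc_ug_c_per_l, area_m2=OCEAN_AREA_M2, depth_m=LAYER_DEPTH_M):
    """Scale a concentration (ug C per litre) to a global stock in Pg C."""
    volume_l = area_m2 * depth_m * 1000.0      # 1 m^3 = 1000 L
    grams_c = conc_ug_c_per_l * 1e-6 * volume_l  # ug -> g
    return grams_c / 1e15                        # 1 Pg = 1e15 g


estimate = global_biomass_pg_c(2.7)  # median concentration from the text
```

Under these assumptions the median of 2.7 µg C L−1 scales to roughly 0.19 Pg C, in line with the reported total.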

Conclusions and recommendations
A coherent map of mesozooplankton global distribution and biomass is presented. Global mesozooplankton biomass was estimated from the median biomass value of 2.7 µg C L−1 (= 0.19 Pg C annual average mesozooplankton biomass in the top 200 m) and a standard deviation of 10.6 µg C L−1. The global, latitudinal and depth estimates of biomass concentrations will be useful for understanding ocean biogeochemistry, and for evaluating global models that include mesozooplankton. Although less developed versions of the mesozooplankton data have been published before as part of the regular COPEPOD database report series (O'Brien, 2005, 2007, 2010), this is the first time individual mesh categories (mCATs) and depth intervals (zCATs) have been distributed. This is also the first time these data have been collected together as a whole for publication in a journal together with the publication of the associated dataset. The dataset description and methods should act as a guide to those interested in using this dataset. When using a dataset such as this, it is important that the associated caveats are understood and considered when drawing conclusions based on these data. The data compiled for this effort represent over 50 years of sampling effort made by institutions and scientists from around the world. While combined depth, mesh, and method maps such as Figs. 4 and 6a show a nearly global distribution of data, the extent of this coverage disappears quickly when one looks only at data from a specific month. Detailed maps of the monthly data distribution by depth, mesh, and original biomass type are available online at the COPEPOD project website (http://www.st.nmfs.noaa.gov/copepod). Mesozooplankton investigators and policy makers are encouraged to view these maps, as they show clearly that not all regions of the ocean are adequately sampled and the maps may provide guidance for future mesozooplankton monitoring or process studies to fill in these gaps.
Communication between biogeochemical modelers, data managers and experimentalists is at an all-time high. There is an increasing interest in combining expertise from the modeling and experimental communities to produce and share the data products necessary to parameterize and validate marine ecosystem models. COPEPOD regularly interacts with scientific projects such as MAREMIP and international working groups such as the ICES Working Group on Zooplankton Ecology (WGZE). Through collaboration with the scientists and user community, COPEPOD strives to constantly improve its data content and to ensure data products, such as the biomass fields in this paper, are available and useful to the scientific community.

Figure 1. Distribution of the different types of biovolume and biomass samples: (a) settled volume, (b) displacement volume, (c) wet mass and (d) dry mass.
For example, the Russian Juday multi-net frequently samples at depths 0-10 m, 10-25 m, 25-50 m, 50-100 m and 100-200 m. By adding these pieces together, it was possible to build standard depths, e.g., 0-25 m from 0-10 m and 10-25 m, or 0-200 m from 0-10 m, 10-25 m, 25-50 m, 50-100 m and 100-200 m. The mesozooplankton biomass data presented here have been organized into 11 depth categories, which allow the data to be selected at a variety of different depths (see
COPEPOD zCAT is the COPEPOD four-character token used to represent each depth interval, e.g., i010 = 0-10 m, i100 = 0-100 m, i200 = 0-200 m. Upper z and Lower z-target are the ideal depth intervals desired by this (zCAT) category. Max upper z is the maximum non-surface interval allowed by this zCAT. (This really applies more to deeper depth intervals and multi-net tows, i.e., 0-25 m, 25-50 m.) Min and Max lower z-allowed are the allowed range above and below the lower z-target. (They keep the individual COPEPOD zCATs from overlapping with each other.)

Figure 4. Global distribution of all mesozooplankton biomass data (converted to carbon and a common 333 µm equivalent mesh size). Each point represents a station where mesozooplankton were recorded.

Table 1. Sources for COPEPOD mesozooplankton biomass and biovolume data.
data source in the Indian Ocean. While this dataset is spatially ranked 3rd, it would be ranked 15th based only on observations. In contrast, the long-running EcoMon/MARMAP dataset ranks 2nd in observations but ranks 13th spatially, because those 35 years of repeat sampling in the same 1 × 1 degree grid cell contribute only 12 monthly means (12 Mcells) each to the global grids created by this synthesis.
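The difference between ranking by observations and ranking by Mcells can be illustrated with a small sketch. It counts, for each dataset, the distinct combinations of 1 × 1 degree cell and month that it occupies. The function names and the tuple layout of the input are hypothetical, and identifying a cell by the floor of latitude and longitude is an assumption.

```python
import math
from collections import defaultdict


def mcells_by_dataset(observations):
    """Count distinct monthly 1 x 1 degree cells (Mcells) per dataset.

    Each observation is (dataset_name, lat, lon, month).
    """
    cells = defaultdict(set)
    for dataset, lat, lon, month in observations:
        cells[dataset].add((math.floor(lat), math.floor(lon), month))
    return {name: len(occupied) for name, occupied in cells.items()}


def rank_by_mcells(observations):
    """Dataset names sorted from largest to smallest spatial contribution."""
    counts = mcells_by_dataset(observations)
    return sorted(counts, key=lambda name: counts[name], reverse=True)
```

A station sampled hundreds of times in one cell can contribute at most 12 Mcells, so a short survey spread over many cells can outrank it spatially even with far fewer observations.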

Table 4. Description of COPEPOD depth interval criteria and World Ocean Atlas equivalents.

Table 5. Global and latitudinal band values for the gridded mesozooplankton biomass data.