Decadal reanalysis of biogeochemical indicators and fluxes in the North West European shelf-sea ecosystem

This paper presents the ﬁrst decadal reanalysis simulation of the biogeochemistry of the North West European shelf, along with a full evaluation of its skill, conﬁdence, and value. An error-characterized satellite product for chlorophyll was assimilated into a physical-biogeochemical model of the North East Atlantic, applying a localized Ensemble Kalman ﬁlter. The results showed that the reanalysis improved the model simulation of assimilated chlorophyll in 60% of the study region. Model validation metrics showed that the reanalysis had skill in matching a large data set of in situ observations for 10 ecosystem variables. Spearman rank correlations were signiﬁcant and higher than 0.7 for physical-chemical variables (temperature, salinity, and oxygen), (cid:2) 0.6 for chlorophyll and nutrients (phosphate, nitrate, and silicate), and signiﬁcant, though lower in value, for partial pressure of dissolved carbon dioxide ( (cid:2) 0.4). The reanalysis captured the magnitude of pH and ammonia observations, but not their variability. The value of the reanalysis for assessing environmental status and variability has been exempliﬁed in two case studies. The ﬁrst shows that between 325,000 and 365,000 km 2 of shelf bottom waters were vulnerable to oxygen deﬁciency potentially threatening bottom ﬁshes and benthos. The second application conﬁrmed that the shelf is a net sink of atmospheric carbon dioxide, but the total amount of uptake varies between 36 and 46 Tg C yr 2 1 at a 90% conﬁdence level. These results indicate that the reanalysis output data set can inform the management of the North West European shelf ecosystem, in relation to eutrophication, ﬁshery, and variability of the carbon cycle.


Introduction
Trends and patterns of biogeochemical variables that are relevant for marine policy and ecosystem understanding can be evaluated by merging numerical models and ocean color in extended ''biogeochemical reanalysis,'' using a consistent data assimilation algorithm [Lahoz and Schneider, 2014;Gehlen et al., 2015]. Such algorithm corrects the model estimates toward the observations, taking account of the errors in the model and in the observations [Kalman, 1960]. The resulting estimates of biogeochemical variables are expected to be more realistic than the estimates obtained separately from modeling and monitoring efforts, as is well established in environmental disciplines such as atmospheric science [Bengtsson and Shukla, 1988;Trenberth and Olson, 1988] and ocean physics modeling [Stockdale et al., 1998].
In ocean biogeochemical modeling, the first (quasi)decadal biogeochemical reanalysis estimated the interannual variability of global primary production in years 1998-2004 by assimilating chlorophyll from SeaWiFS (Sea-viewing Wide Field-of-view Sensor) into the NASA Ocean Biogeochemical Model (OBM) [Nerger and Gregg, 2007]. A comparable variability of primary production was obtained in the reanalysis by Gregg [2008], who in addition described the spatial patterns of chlorophyll in the global oceans. The reanalysis by Fontana et al. [2013] evaluated spatial-temporal patterns of chlorophyll and nitrate in the North Atlantic Ocean in years 1998-2006, by assimilating SeaWiFS chlorophyll into a coupled physical-biogeochemical model. Reanalyses for years 1998-2012, using chlorophyll observations from SeaWiFS and MODIS and the NASA OBM, evaluated significant declining trends of chlorophyll in the Northern Hemisphere and Indian Oceans [Gregg and Rousseaux, 2014], and estimated declining trends of phytoplankton functional groups in part of the global oceans [Rousseaux and Gregg, 2015].
catches [Pauly et al., 2002]. These processes and services are impacted by interannual climate variability and changes in anthropic pressures, implying trends in coastal eutrophication [Cloern, 2001], fluctuations of shelf uptake of CO 2 [Borges, 2011], and expansion of poorly oxygenated shelf floor areas threatening fishes and benthic communities [Diaz and Rosenberg, 2008;Gilbert et al., 2010;Rabalais et al., 2014]. Marine policy and research are cooperating in monitoring and modeling biogeochemical variables that are indicators of the status of shelf ecosystems and that can characterize its long-term variability, such as chlorophyll concentration, dissolved oxygen, partial pressure of CO 2 , and nutrient concentrations [OSPAR, 2013]. Such indicators have been estimated successfully in previous works by assimilating ocean color into shelf-sea models; however, such simulations were short-termed (i.e., 1 year or shorter) focusing on the skill of daily to weekly operational predictions [e.g., Teruzzi et al., 2014;Shulman et al., 2013] or on the seasonal cycle of the ecosystems [e.g., Triantafyllou et al., 2007;Fontana et al., 2010;Ciavatta et al., 2011Ciavatta et al., , 2014Mattern et al., 2013;Hu et al., 2012;Xiao and Friedrichs, 2014a], leaving the reanalysis of the interannual variability of shelf-sea biogeochemistry unexplored.
The overall aim of this work was to provide the first decadal reanalysis of the biogeochemistry of the North West European shelf-sea. The specific objectives of this paper are: (i) to evaluate the skill and confidence of the reanalysis and (ii) to exemplify the value of the reanalysis data set to assess the status and interannual changes of the shelf ecosystem. With this last broad objective in mind, we present two case studies assessing: (a) the vulnerability of the bottom waters of the shelf to oxygen deficiency and b) the interannual variability of the uptake of atmospheric CO 2 by the shelf-sea ecosystem.
To achieve these aim and objectives, we assimilated an error-characterized ocean color product for chlorophyll [Brewin et al., 2015] into an ecosystem model of the North East Atlantic , upgraded to the state-of-the-art version of the European Regional Seas Ecosystem Model (ERSEM) [Butensch€ on et al., 2015] and integrated into the Ensemble Kalman filter [Evensen, 1994;Ciavatta et al., 2011]. This assimilation system was applied in the reanalysis of the biogeochemistry of the North West European (NWE) shelf in the years 1998-2009. The reanalysis output data set was first skill-evaluated using ocean color data and in situ observations of 10 physical and biogeochemical variables. The data set was then postprocessed to extract information relevant to the case studies, including the confidence level of the reanalysis estimates. Estimate of confidence is a major gap in most of the current modeling applications for ecosystem assessment [Hyder et al., 2015;Piroddi et al., 2015], thus we suggest that it represents an added value of our reanalysis data set for its possible application in marine policy.
The paper is structured as follows. Section 2 describes the ecosystem model, the setup of the assimilation algorithm, the data, and the metrics applied for skill evaluation. In section 3, the results are presented and discussed. The skill of the reanalysis data set is first evaluated with respect to the assimilated ocean color data (section 3.1), and then using an in situ data set which was not part of the assimilation (section 3.2). The two case studies are presented in sections 3.3 and 3.4 and concluding remarks and future applications are pointed out in section 4. are treated as prognostic variables. The model includes: an advection scheme with stability and conservation properties [James, 1996]; a vertical turbulence model (GOTM) [Burchard et al., 1999]; and calculation of horizontal pressure gradients. 2.1.2. The Pelagic Biogeochemical Submodel: ERSEM The biogeochemical dynamics are described by the European Regional Seas Ecosystem Model (ERSEM) [Baretta et al., 1995], using its state-of-the-art version presented in Butensch€ on et al. [2015], and applying a configuration with 51 pelagic variables. ERSEM uses a functional type approach to model the dynamics of the low trophic levels of the ecosystem. Primary producers are split into four phytoplankton functional types (PFTs), including three size-based types (picophytoplankton, nanophytoplankon, and microphytoplankton), plus diatoms as silicate users. Each of these PFTs is defined in terms of its content of chlorophyll, carbon, nitrogen, phosphate, and (for diatoms only) silicate. Three functional types of zooplankton (mesozooplankton, microzooplankton, and heterotrophic nanoflagellates) prey on the PFTs, bacteria, and particulate organic matter as a function of their size. One bacterial functional type drives the microbial loop, the production and recycling of dissolved organic matter in labile, semilabile, and recalcitrant forms, and it drives the regeneration of inorganic nutrients in the water column [Polimene et al., 2006;Hansell, 2013]. In the functional types, the stoichiometric ratios of nutrients-to-carbon and chlorophyll-to-carbon (in the PFTs) vary dynamically [Geider et al., 1997;Baretta-Bekker et al., 1997]. The model includes the dynamics of five inorganic dissolved nutrients (carbon, nitrate, ammonia, phosphate, and silicate) and dissolved oxygen. The model configuration applied here includes a carbonate system module, which regulates air-sea flux of carbon dioxide, as well as the description of calcite, including its deposition at the seafloor Butensch€ on et al., 2015].
Numerous works demonstrate the skill of ERSEM in representing marine ecosystem processes and reproducing ocean observations. Model validations have used univariate and multivariate analysis [e.g., Allen and Somerfield, 2009;Saux-Picart et al., 2012;de Mora et al., 2013de Mora et al., , 2016 in model applications ranging from zerodimensional process studies [e.g., Pinna et al., 2015] to global simulations [Kwiatkowski et al., 2014;de Mora et al., 2016]. In particular, the state-of-art version applied in this work was flexible and skilled in simulating multiannual time series of nutrients, chlorophyll, oxygen, particulate organic and dissolved inorganic carbon, as well as reproducing emerging properties (phytoplankton stoichiometry and average community structure) observed in three contrasting sites in coastal, shelf, and open ocean [Butensch€ on et al., 2015;de Mora et al., 2016]. However, the model can have low skill in representing observed phytoplankton successions (in particular blooms of dinoflagellates) in large-scale shelf-sea applications, due to limitations in the parameterization of the PFTs [Ciavatta et al., 2011].

The ERSEM Benthic Submodel
The benthic submodel is the ERSEM benthic model [Blackford, 1997], as described in Butensch€ on et al. [2015]. In the configuration applied here, the submodel includes 35 biogeochemical variables, subdivided into 7 living functional groups (including zoobenthos, aerobic, and anaerobic bacteria), along with particulate matter and dissolved organic and inorganic nutrients. The fluxes at the sediment-water interface are determined by sedimentation and diffusion of inorganic material across the seabed.

Boundary Conditions and Atmospheric Forcing
The oceanic conditions at the open boundaries of the ecosystem model (temperature, salinity, currents, and sea surface elevation) were extracted for the years 1998-2009 from the GLORYS reanalysis product provided within the EC FP7 project MyOcean [Ferry et al., 2012]. The corresponding conditions for dissolved nutrients and oxygen were extracted from the 2005 World Ocean Atlas climatology [Garcia et al., 2006a[Garcia et al., , 2006b, and for dissolved inorganic carbon (DIC) from the database GLODAP [Key et al., 2004].
The model was forced by daily climatological discharges of freshwater and dissolved nutrients from 250 rivers. Data of water discharge were taken from the Global River Discharge Data Base [V€ or€ osmarty et al., 1996], and from data prepared by the UK Centre for Ecology and Hydrology. River nutrient loadings matched those used by Lenhart et al. [2010], with raw data for the UK, Northern Ireland, Ireland, France, Norway, Denmark, and the Baltic processed by the UK Centre for Environment Fisheries and Aquaculture Science, and raw data for Germany and the Netherlands derived from P€ atsch and Lenhart [2004]. In addition, Baltic inflow was represented as river-inflow . Atmospheric input of nutrients was derived from the European Monitoring and Evaluation Programme [Tørseth et al., 2012].
The atmospheric forcing (three-hourly solar irradiance, air temperature, wind velocity, precipitation, humidity, pressure, and cloud cover) was obtained from a regional climate hindcast (years 1989-2009, spatial resolution of 12 km) performed by the Danish Climate Centre, using the regional Climate model HIRHAM5 [Christensen et al., 2006], driven by ERA-interim global reanalysis [Dee et al., 2011].

The Assimilation System
The assimilation framework uses the system described in full in Ciavatta et al. [2011Ciavatta et al. [ , 2014, where it was developed to assimilate ocean color in a similar POLCOMS-ERSEM model configured for the English Channel. The system uses the Ensemble Kalman filter (EnKF) [Evensen, 1994]. This is a sequential assimilation method, which starts by randomly sampling an ensemble of N state vectors x aðlÞ 0 (l 5 1, 2, . . ., N) from an initial probability density function for the model variables. Each ensemble member, i.e., state vector, is propagated in time using the nonlinear model equations during the ''forecast step,'' that provides the EnKF ''forecasts'' x  condition used to simulate a new ensemble forecast for time i 1 1, in a sequential procedure that estimates the evolution of the model variables over the time window spanned by the assimilated observations.
Our assimilation system uses the Evensen [2003] version of the EnKF, which includes localization of the analysis and perturbation of the assimilated observations [see also Natvik and Evensen, 2003;Hu et al., 2012;Storto et al., 2013]. Observations and model states are log-transformed prior to the analysis, to guarantee positivity of the solutions [Janjić et al., 2014], as in the applications by Torres et al., [2006], Nerger and Gregg [2008], and Ciavatta et al. [2011Ciavatta et al. [ , 2014.
Importantly, the ensemble method can provide estimates of the uncertainty of the reanalysis product, derived from the dispersion of the ensemble members around their central value (i.e., the median). In particular, we used ranked values of the 100 ensemble members (minimum, 5th, 95th, and maximum ensemble value) to define the confidence levels of the reanalysis estimates (1%, 5%, 95%, and 100% confidence, respectively).

Setup of the Assimilation System
Following Ciavatta et al.
[2011], we used the EnKF with an ensemble size of N 5 100 members. To keep the analysis affordable computationally, the analyzed state vector had to include a maximum of 44 out of the 51 biogeochemical state variables. The remaining seven variables were updated through the model equation during the simulation runtime (''forecast'' step) and were selected among those more likely to create instabilities in the long-term reanalysis on the base of previous findings (silicate in dissolved and medium and large particulate forms [Ciavatta et al., 2011]), and of assimilation tests within this study (semilabile dissolved organic carbon, dissolved oxygen, alkalinity, and calcite). The radius of the localized analysis was set spatially variable as a function of the bathymetry [Ciavatta et al., 2011]. In particular, we increased the ''resolution'' of the analysis from oceanic toward coastal waters, by setting a radius of 100 km for grid points where the bathymetry is deeper than 2000 m (i.e., in 35% of the cells of the model grid), 50 km for bathymetry between 50 and 2000 m (51% of the grid), and 25 km for bathymetry shallower than 50 m (14% of the grid).
Model error is accounted for by random perturbations of the model forcing, namely the surface solar irradiance, thus inducing fluctuations in the underwater light field that drives photosynthesis [Torres et al., 2006] (see Natvik and Evensen [2003] and Simon and Bertino [2009] for comparable approaches). A Gaussian perturbation with standard deviation equal to 20% of the irradiance value is added during the model forecast step. Furthermore, at the first assimilation step of each year, model error is added to all the variables undergoing the analysis, as white noise drawn from a distribution of pseudorandom fields with error equal to 10% of the value of the variables. The error is lowered to 1% for those variables that have relatively high average values (DIC, ammonia, and small particulate matter), to avoid divergences in the concentrations of the largest pool in the model [Ciavatta et al., 2011].
The ensemble was initialized by perturbing the output of a hindcast model simulation that started in January 1991 after a 5 year spin-up. The hindcast states for September 1997 were perturbed by using Gaussian pseudorandom fields with error equal to 30% of the value of the variables. These perturbed states were used to start the assimilation from the first data available in the ocean color time series. Results of the reanalysis and of the simulation without assimilation, namely the ''reference'' run, are presented for January 1998 to December 2009.
The reanalysis simulation was run on the UK national supercomputing facility ''ARCHER'', using 7200 computing processors and $9 Mega Allocation Units (MAUs).

Data
The data of remotely sensed concentration of surface chlorophyll used in the assimilation ( Figure 3) were provided by the Ocean Colour-Climate Change Initiative project of the European Space Agency (ESA's OC_CCI product, Version 2.0) [Brewin et al., 2015;Grant et al., 2015;Sathyendranath et al., Creating an ocean-colour time series for use in climate studies: the experience of the ocean-colour climate change initiative, submitted to Remote Sensing of Environment, 2016]. This product was created by merging satellite data from sensors MERIS, MODIS, and SeaWiFS, after shifting the wavelength bands and correcting the bias between the sensors. It consists of a global daily level 3 binned data set provided on a sinusoidal grid at 4 km resolution. It was downloaded via FTP from http://www.oceancolor.org. As described in Appendix A, here the data set was projected onto the $12 km model grid, and daily values were merged into 5 day composites centered on the last day of each month in the years 1998-2009. Importantly, in the assimilation system we made use of per-pixel error statistics Journal of Geophysical Research: Oceans 10.1002/2015JC011496 estimated by OC_CCI through the analysis of match-ups between in situ data and ocean color . In particular, we computed and assimilated unbiased values of chlorophyll observations, and we defined the variance of their pseudorandom Gaussian perturbations (section 2.2) by processing perpixel root-mean-square deviations provided with the OC_CCI data set (see details in Appendix A, equations (A8) and (A9)).
The in situ data used to evaluate the reanalysis skill were measured in the North East Atlantic in the years 1998-2009 and were extracted from the Ecosystem Data Online Warehouse of the International Council for the Exploration of the Sea (http://geo.ices.dk/ [ICES, 2009]) for the following variables: temperature, salinity, dissolved oxygen, chlorophyll, nitrate, ammonia, phosphate, silicate, and pH. Data of partial pressure of carbon dioxide (pCO 2 ) were derived from the Surface Ocean CO 2 Atlas (synthesis product, version 2; http:// www.socat.info) [Bakker et al., 2014].

Skill Metrics
The skill of the reference and reanalysis output (y) in matching the assimilated composites of chlorophyll concentrations from ocean color (y 0 c , equation (A10) in Appendix A) was evaluated by computing and comparing maps of the root-mean-square deviation (RMSD) and of the Pearson correlation (q) between the time series of y and y 0 c at each surface grid point of the model domain. Time series of RMSD between the spatial distributions of y and y 0 c at each month of the reanalysis was also computed. The skill of the reanalysis was evaluated for the output of both the forecast and analysis steps of the assimilation algorithm (see section 2.2 for the definition of these steps).  Taylor and Target diagrams that show [Taylor, 2001;Jolliff et al., 2009]: Pearson correlation coefficient (q), standard deviation of the output (r) normalized by the standard deviation of the observations (r o ), bias of the output, bias 5 mean(y-o), normalized by r o , unbiased root-mean-square deviation (RMSD') normalized by r o , and taken with the algebraic sign of the differences between the standard deviation of the output and the observations, sign(r2r o ). In addition, we computed ''robust'' skill metrics that are sounder than parametric metrics when the distribution of the variables is non-Gaussian, because robust metrics are based on the percentiles and ranks of the distributions and thus they are less affected by outliers [e.g., Daszykowski et al., 2007]. Robust metrics were presented in a Target diagram showing [Butensch€ on et al., 2015]: the bias computed as the median value of the reanalysis-to-observation mismatch, bias* 5 median (y-o), normalized by the interquartile range of the observations (IQR o ); the unbiased median absolute error, MAE' 5 median{abs[y-o-bias*]}, normalized by IQR o, and taken with the algebraic sign of the differences between the interquartile range of the output and the observations, sign(IQR-IQR o ); the Spearman rank correlation coefficient, q s . The latter was used also in the case studies, to compute cross correlation among time series of variables from the reanalysis; we computed also the significance level p that such correlation is different from zero, at a confidence level of 99%, that is, p < 0.01.

Skill in Matching the Ocean Color Data
In the years 1998-2009, the distribution of chlorophyll observed in the North West European shelf was characterized by sharp gradients from the coastal areas toward the oceanic waters (Figure 3a), and the reanalysis matched this pattern quite closely (Figure 4).
Simulated concentrations were lower than satellite observations in coastal areas, however, the satellite observations were more uncertain in these regions compared to the open shelf (Figure 3c), due to resuspended sediments and colored dissolved organic matter discharged by rivers [Sathyendranath, 2000]. In oceanic waters, the reanalysis overestimated the observed chlorophyll concentrations because the model was overestimating nutrients in the boundary ocean regions (see section 3.2). In addition, in the northern oceanic waters, the skill of the reanalysis was constrained by the relatively low number of data items available for assimilation (Figure 3d). The reanalysis had skill in matching the assimilated satellite observations ( Figure 5). The seasonal cycles of the observations were captured by the simulation, as demonstrated by the large areas where the correlation coefficient is higher than 0.6. Some low, or even negative, correlations were computed in the northern basin where observations were sparse, and in coastal areas where observations were more uncertain, limiting the ability of assimilation to correct the model (Figures 3c and 3d). The RMSD between reanalysis and data is comparable to the chlorophyll concentrations in large parts of the domain, when averaged over the whole period 1998-2009 (Figure 4a and 5a). Temporal mismatches between simulation and observations (e.g., misrepresented phytoplankton blooms) contributed to the high RMSD in the coastal areas.
The reanalysis product has a higher skill in matching the ocean color data than the output of the model without data assimilation, i.e., the model reference run. Importantly, this holds for the 1 month ''forecasts'' of the assimilation run (i.e., the output before the assimilation step), as well as for the ''analysis'' (i.e., the output after the assimilation step) ( Figure 6). Both the analyses and forecasts decreased the RMSD of the reference run by at least 1% in $60% of the basin (Figures 6a and 6c). The already high reference correlations were not changed markedly by the forecasts and analyses, since in both cases changes were smaller than 0.01 in $60% of the basin (Figures 6b and 6d). Improvements of the reference simulation were higher in magnitude for the analysis than for the forecasts, and the analysis decreased the reference RMSD up to 20% (Figure 6a). In general, improvements were less evident in the coastal areas and in the northern basin, but here the assimilated data had higher errors and were less numerous, respectively, as mentioned above.
Considering the skill over time, both the assimilative forecast and analysis had lower errors than the reference simulation throughout most of the reanalysis period (Figure 7). Reduction of the RMSD by assimilation was in general more frequent in the spring and summer seasons (see for example, summer 2005), while in winter, changes were often negligible (for example, winter 2005/2006). Noticeable reductions of the RMSD were obtained in December 2000December , 2001December , and 2008, but these results were less robust, because in December, typically only a small amount of data was available for assimilation and skill evaluation, due to cloud cover and to the low solar zenith angle at the latitudes of the study region in winter. In general, reinitialization at the analysis step improved the subsequent forecast. However, in some instances, the forecasts were worse than the reference, and the analysis could only mitigate the deterioration of the simulation (e.g., in springsummer 1998). Similar temporal patterns of skill improvement and deterioration were found in the time series of reference, analysis, and forecast correlations with the ocean color data (not shown).
Improved estimates of the assimilated data were expected from the analysis, and essentially this achievement indicates that the data assimilation algorithm was implemented correctly [Gregg et al., 2009]. However, the improved skill from the 1 month forecasts shown in Figures 6 and 7 was not obvious, since forecasting the not-yet-assimilated data is a challenging task even for state-of-the-art operational systems [Ford et al., 2012;Teruzzi et al., 2014]. In principle, the reinitialization of the assimilated variable closer to the data should improve also the forecast of the next available data, with respect to the reference run. However, reinitialized biogeochemical fields often tend to be ''forgotten'' and to converge back to the reference simulation because of the effect of hydrodynamics, forcing, boundaries values, and biogeochemical processes [Allen et al., 2003;Friedrichs et al., 2006;Teruzzi et al., 2014]. In addition, multivariate analysis can produce values that are not consistent with the simulated model dynamics, e.g., outlier nutrient values, thus developing simulation instabilities that can lead the forecast to deteriorate both the assimilated and unassimilated variables [Gregg et al., 2009;Ciavatta et al., 2011]. These potential shortcomings of assimilation may explain the limited areas of skill deterioration pointed out in Figure 6, such as in the complex coastal zones.  (Figure 8). The reanalysis-to-observation match-ups are well aligned along the bisector line of the plots for temperature, salinity, and dissolved oxygen, indicating a skilled representation of both the magnitude and variability of the observations. The match-ups for phosphate, silicate, and nitrate indicate a general overestimation of nutrients, while the reanalysis underestimated (overestimated) low (high) pCO 2 data. Interestingly, in the plots of nutrients, two areas of elevated data density are distinguishable at low and high concentrations, representing the summer and winter conditions, respectively, which in turn are related to the seasonal cycle of primary production and stratification. This pattern of nutrients is captured by the reanalysis, though the winter concentrations are overestimated. Match-ups for in situ chlorophyll and ammonia are scattered in the plot, indicating that the high variability of these data was not described by the reanalysis. Finally, the reanalysis was able to represent the magnitude of the pH data, but not their fluctuations.
The quantitative metrics confirm the high skill of the reanalysis in estimating temperature, salinity, and dissolved oxygen (e.g., both the Pearson and Spearman correlations were higher than 0.7) (Figure 9). The skill for the majority of other variables was relatively high when robust metrics were used (Figure 9c), rather than the metrics based on the Gaussian assumption (Figures 9a and 9b). A clear example is in situ chlorophyll, which is much closer to the center in the robust Target diagram (Figure 9c), rather than in the standard Target ( Figure  9b). Chlorophyll, phosphate, nitrate, and silicate all reached correlations 0.6 or higher when the Spearman rank correlation was computed (Figure 9c). Robust metrics make the comparison of the variability of observations and estimates sounder by using percentiles of the distributions (interquartile and median), which reduce the impact of outliers. For example, outlier data of nutrients imply that the standard deviations are higher for the observations than for the reanalysis, leading these variables to stay below the unit radius in the Taylor diagram in Figure 9a, and on the left side of the Target diagram in Figure 9b; however, nutrients shifted to the right side of the robust Target diagram in Figure 9c, since the estimates fluctuated more than the data when less weight is given to the outlier observations. Considering the robust skill metrics, the pCO 2 correlation with data is not negligible ($0.4) and the overestimated variability and the bias are within the range of the errors for the other variables. The same holds for pH, though the low Spearman correlation confirms that the model captures the magnitude of this variable, but not its variability.
The skill of the reanalysis for the unassimilated variables in Figure 9 was not significantly different from the skill of the reference run (not shown). This means, on the one hand, that the model itself performs satisfactorily in estimating in situ data available in the North East Atlantic. On the other hand, it means that improved ocean color estimates did not come at the cost of worsening the estimates of the other model variables. This is not always the case, since ocean color assimilation can cause unrealistic changes in biogeochemical variables which are not assimilated, reducing the model skill and creating feed-back effects that eventually can blow the simulations [Fontana et al., 2013;Gregg et al., 2009;Ciavatta et al., 2011;Ford et al., 2012;Terruzzi et al., 2014]. In our application, two factors can have contributed to the low impact of assimilation on the skill metrics for Journal of Geophysical Research: Oceans 10.1002/2015JC011496 unassimilated variables, namely the location of the in situ sampling points and the frequency of the assimilation of satellite data. Most of the match-ups with in situ data occur in coastal case II waters, where ocean color error is higher (Figure 3c) and therefore assimilative corrections are smaller than in the open ocean. This is suggested also by the sensitivity analysis of an analogous assimilation system applied in a subdomain of the study region [Ciavatta et al., 2011]. The relatively low assimilation frequency imposed by the high computational cost of the multivariate ensemble methodi.e., monthly assimilation, compared, for example, with daily assimilation allowed by the univariate relaxation method in Rousseaux and Gregg [2015]-also constrained our reanalysis to impact more strongly on the skill for unassimilated variables.

Case Study I: Assessment of Oxygen Deficiency in Shelf Bottom Waters
Dissolved oxygen concentration is an essential climate variable [Bojinski et al., 2014], a threat to aquatic life at low concentrations [Vaquer-Sunyer and Duarte, 2008], and an indicator of eutrophication regulated by international legislation [OSPAR, 2013]. The first case study demonstrates that the reanalysis can provide an error-characterized assessment of this indicator. It shows that the bottom of the North West European shelf has large areas vulnerable to oxygen deficiency (Figure 10), namely the south North Sea, Celtic Sea, Armorican shelf, coastal zones in Scotland, West Ireland, and English Channel, but we did not identify anoxic situation in any of these cases. In all the above regions, the reanalysis decadal data set includes at least one daily value of dissolved oxygen below the concentration of 6 mg/L, but still above 2 mg/L; these are the thresholds of oxygen deficiency and anoxia, respectively, defined by the OSPAR Commission for safeguarding the ecosystem of North East Atlantic [OSPAR, 2013].
The extension of the vulnerable area is noticeably larger if we apply a conservative criteria of at least 1% confidence on oxygen deficiency (Figure 10b, red plus yellow area, $365,000 km 2 ), rather than a less strict 100% confidence (red area, $325,000 km 2 ). A 1% confidence means that just 1 of the 100 ensemble members estimates daily oxygen below the threshold of 6 mg/L in the bottom layer, while 100% confidence means that all 100 estimate oxygen below 6 mg/L. The more conservative 1% criterion extends the borders of the vulnerable regions (see e.g., in the Celtic Sea), but it includes also areas otherwise neglected by the assessment, i.e., in the Northern North Sea. Overall, the 1% ensemble criterion extends the area of vulnerability by $ 40,000 km 2 , i.e., an area comparable to the surface of Switzerland.
The simulated absolute minimum values of oxygen at each day, at any point within the risk area, have a clear seasonal pattern and no evident trend in the years 1998-2009 ( Figure 11). The lowest values (3.5-4 mg/L) are typically reached in August/September, they increase sharply in autumn and return to not-deficient values in spring. The bottom minima occurred with higher frequency in the Armorican shelf near the Gironde and Loire estuaries, and in the German Bight. The concentrations never descended below the hypoxia threshold; however, they reached persistent low values that were found lethal for some benthic species, e.g., 4.6 mg/L were found lethal for some fishes and mollusks in the review by Vaquer-Sunyer and Duarte [2008]. This low value was reached at both the 1% and 100% confidence levels ( Figure 11).
The location of vulnerable areas identified in Figure 10 matches the global map of hypoxia and eutrophication areas in Rabalais et al. [2014] (http://www.wri.org/our-work/project/eutrophication-and-hypoxia/interactive-mapeutrophication-hypoxia) and compares reasonably well with regional studies for the Celtic Sea [O'Boyle and  [Greenwood et al., 2010], and Armorican shelf [Charria et al., 2014]. The latter work presents a continuous, 2 year long time series of oxygen data that matches well the seasonal pattern shown in Figure 11. The authors showed that oxygen solubility, seasonal stratification of the water column, and bacterial remineralization of organic matter are the potential triggers of low bottom oxygen concentrations in the region, particularly in summer [e.g., O'Boyle and Nolan, 2010;Greenwood et al., 2010]. Our simulation extend these findings to the scale of the whole shelf, since we found highly significant anticorrelations between the oxygen series in Figure 11 and the daily series of water temperature, bacteria biomass, and particulate organic carbon simulated at the same bottom locations (Spearman rank correlations q s 5 20.75, 20.59, and 20.63, respectively, p < 0.01). Furthermore, Figure 11 suggests that oxygen deficiency may occur also in winter months at some shelf locations (e.g., near estuaries). This could not be directly confirmed by measurements collected in the ICES oxygen data base, where the coverage for bottom water in winter is far too low to permit the identification of this phenomenon (http://geo.ices.dk/). Therefore, these findings stimulates increasing the extension, frequency, and seasonal coverage of European bottom water monitoring for better understanding and predicting oxygen dynamics.  Figure 10b, yellow color represents deficient areas at 1% confidence level (i.e., at least 1 member of the ensemble signals oxygen deficiency), red represents 100% confidence (all the 100 members signal deficiency), and blue the areas of the shelf with concentration higher than 6 mg L 21 at 100% confidence. Figure 11. Time series of absolute minimum concentration of dissolved oxygen simulated within the vulnerable area shown in Figure 10; yellow and red lines represent the minimum values at 1% and 100% confidence level, respectively.

10.1002/2015JC011496
The soundness of the confidence levels shown in Figures 10 and 11 depends on the proved reliability of the model description of oxygen observations (section 3.2), but also on our arbitrary choices in the setup of the ensemble simulation, for example, in the initial ensemble conditions for oxygen (section 2.2.1). However, dissolved oxygen was neither analyzed nor perturbed systematically in the reanalysis nor were temperature and salinity, which are the physical drivers of the oxygen solubility in the water column. Thus, the oxygen spread in the ensemble was propagated by biological processes only, which were perturbed through the analysis and perturbation of the other model state variables, as well as through the perturbation of the surface irradiance (section 2.2.1). On the one hand, such propagation of the spread implies that the assimilation system for surface chlorophyll was capable of conveying the assimilated information and the model uncertainty across the simulated trophic structure, and down to the bottom of the water column to affect the simulation of oxygen at depth. On the other hand, it implies also that the range of the confidence level (i.e., the spread of oxygen) would be underestimated if the errors we assumed for the other model variables and irradiance forcing (i.e., the standard deviations of their perturbations) were underestimated in the first instance.
Besides dissolved oxygen, the reanalysis output also contains data characterizing the spatial-temporal variability and confidence levels of the other 10 variables linked to biogeochemical indicators listed in European legislation [OSPAR, 2013], including chlorophyll, nutrients, and pH, which are skill assessed in sections 3.1 and 3.2 (see Appendix B for a complete list of the reanalysis output). 3.4. Case Study II: Assessment of Atmospheric CO 2 Uptake by the Shelf Ecosystem The reanalysis data set can be applied to estimate the interannual variability of the shelf uptake of atmospheric CO 2 , and to evaluate the confidence levels for such estimates, as shown in this case study. The North East Atlantic is a net sink of atmospheric CO 2 at a high confidence level ( Figure 12). The ocean uptake increases from south to the northern colder waters (from $5 to 15 mol C m 22 yr 21 ) and from the coast toward the open ocean. On the shelf, the uptake is typically lower than 5 mol C m 22 yr 21 (region delimited by the 200 m isobath in Figure 12). Weak sources of CO 2 to the atmosphere (<1 mol C m 22 yr 21 ) were found in the English Channel, Irish Sea, and near estuaries. The interannual variability of the fluxes was more homogeneous and smaller than the interannual means, in general ( Figure 12b). However, variability and means were comparable in the Irish Sea and English Channel, indicating that these areas can switch from being weak sinks [Kitidis et al., 2012] to weak sources of CO 2 in some years, as a consequence of interannual fluctuations of the ecosystem dynamics (e.g., primary production) and forcing (e.g., water temperature) [Marrec et al., 2015;Borges and Frankignoulle, 2003]. The uncertainty in the fluxes was in general lower than their average values, indicating that the reanalysis is suitable for assessing flux directions, i.e., defining sink or source zones, at a 90% confidence level. However, in some zones, uncertainty and fluxes were comparable low, for example, in the English Channel, south North Sea, and Norwegian coast (Figure 12d). These areas should be considered flux-neutral at 90% confidence, like the English Channel that was already classified a ''not-significant-sink'' by Borges and Frankignoulle [2003].
The overall annual uptake of carbon dioxide in the shelf region was 41 Tg C yr 21 on average in the period 1998-2009, but this value has an uncertainty of 65 Tg C yr 21 (i.e., $25% of the average), at 90% confidence level, and with an interannual variability of $20% ( Figure 13 and Table 1). These estimates are coherent with previous literature findings (Table 1). An estimate of the average obtained with a comparable model, but referred to the years 1989-2004, lies within the range found in this study . Our average value was higher than the ones provided for the North Sea only [Thomas et al., 2005] and for the European shelf-seas altogether [Borges et al., 2006], but they overlap with the uncertainty range in the Gulf of Biscay [Chen and Borges, 2009].
The interannual variability of the yearly uptake of CO 2 ranged between 35 and 44 Tg C yr 21 (Table 1 and Figure 13), and we found it was related to the interannual variability of the gross primary production (Spearman rank correlation q s 5 0.72, p < 0.01), rather than to the interannual fluctuations of sea surface temperature (not-significant rank correlation). These results agree with Wakelin et al. [2012], who suggested that biological processes exert a stronger effect than temperature on the air-sea flux of CO 2 in the study region.
Our estimate of the uncertainty of the total flux (65 Tg C yr 21 , i.e., 25% of the average) appears sound, considering that it is coherent with the error assessed for pCO 2 observations (percentage RMSD 5 $20%, section 3.2), and it is comparable in percentage to the range of uncertainty estimated by Thomas et al., 2005 ($22%, Table 1). The estimated uncertainty of the flux is arguably linked to the uncertainty of primary production (see above rank correlation), which is however constrained rather directly by the corrections of ocean color assimilation. In addition, the estimated uncertainty of the flux was larger than the arbitrary perturbations imposed on dissolved organic carbon (DIC) in the assimilative initialization (1%, section 2.2.1), suggesting that the subjective initial perturbation of DIC did not strongly constrain the estimated uncertainty of its flux at the atmosphere interface. Our estimate of the uncertainty of the CO 2 flux is limited by not accounting for the error in the unperturbed temperature and salinity, which however are simulated skillfully by the model system (section 3.2). The provision of sound estimates of the uncertainty of carbon fluxes based on assimilative ensemble simulations is an added value of the reanalysis with respect to the reference simulation. The relatively large range of uncertainty estimated here calls for the development and assimilation of ocean color products with higher accuracy for type-2 shelf-sea waters, so that the reanalyzed air-sea fluxes of carbon dioxide can be constrained more strongly. The assimilation of optical data from ocean color could help, because such data have a lower error than chlorophyll in shelf-seas, and they can constrain directly a larger number of variables that are optically active and contribute to carbon fluxes, such as particulate and colored dissolved organic matter [Ciavatta et al., 2014]. Further promising options include assimilating size-class chlorophyll to better constrain variables and fluxes linked to different PFTs (see the 1dimensional simulations by Xiao and Friedrichs [2014b]), as well as assimilating pCO 2 data in shelf-sea ecosystem models. The latter approach arguably improved the estimation of the air-sea flux of CO 2 in an annual simulation of the global ocean biogeochemistry in the work by While et al. [2012].
The reanalysis data set contains values and confidence ranges of a large number of biogeochemical fluxes which are useful to investigate nutrient cycles and ecosystem processes in the North West European shelfsea (see the list in Appendix B).

Conclusions
The reanalysis of the North East Atlantic biogeochemistry provided a unique decadal data set that has considerable skill in approximating ocean observations, and that can enhance the understanding and management of the North West European shelf ecosystem, in relation to eutrophication and fluctuations of the carbon cycle.
Importantly, the reanalysis comes with confidence levels that quantify the uncertainty of the biogeochemical estimates. The crucial implications of this supplementary information were evident in two case studies, where we assessed that: 1. An area as large as 325,000 km 2 was vulnerable to oxygen deficiency at the bottom of the North West European shelf, but additional 40,000 km 2 are included when using a strict 1% confidence criteria. 2. The North West European shelf is a net sink of atmospheric CO 2 , but our simulated uptake can range between 36 and 46 Tg C yr 21 , when applying a 90% confidence level for the estimates.
The confidence levels provided here are an added value of the reanalysis with respect to the model output alone, because estimates of reliability are much needed for model applications in marine policy [Hyder et al., 2015]. For example, provision of percentile confidence level is required for eutrophication indicators inferred from monitoring programs [OSPAR, 2013], but quantification of uncertainty is a crucial gap when such indicators are estimated through model simulations [Piroddi et al., 2015]. The ensemblebased reanalysis presented here can help with tackling this gap in our knowledge of the North West European shelf, and the same methodological approach can be applied with other shelf-sea models running on adequate high-performance computing facilities. Further insights into the confidence in simulated ecosystem indicators and biogeochemical fluxes-including the contribution of uncertainty in hydrodynamics not accounted for here-can be achieved using an ensemble of different biogeochemical models  North Sea only [Thomas et al., 2005] a The total fluxes were computed for the shelf region with bathymetry shallower than 200 m represented in Figure 12  Range between the maximum and minimum annual values. c Average value of the annual ranges between the 95th and 5th percentiles.
Journal of Geophysical Research: Oceans 10.1002/2015JC011496 ensemble, as well as the assimilation of ocean color size-class chlorophyll for operational reanalyses of shelf ecosystems, is subject of our ongoing work within the Marine Environment Monitoring Service of the European Copernicus Program.
Finally, to our knowledge, this is the first reanalysis that is taking advantage of the by-pixel estimates of the errors of the assimilated ocean color product, decreasing the level of subjectivity often applied in biogeochemical data assimilation [Ciavatta et al., 2014]. However, the product we used was derived primarily for case-I waters. We expect that further advantages for biogeochemical reanalysis in shelf-seas will derive from the availability of long-term, integrated products for case-I and II waters, e.g., from the current efforts of the Ocean Colour Climate Change Initiative of the European Space Agency.
The reanalysis product presented in this paper is available for download and applications at the data portal http://portal.ecosystem-modelling.pml.ac.uk. r 10;c 5 ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi jD 2 10;c 2d 2 10;c j q (A7) We changed the base of the mean and variance of the distributions to natural logarithm by using mathematical properties [Campbell, 1995]: l 0 e;c 5l 0 10;c Á log e 10 ð Þ (A8) r 2 e;c 5r 2 10;c Á log e 10 ð Þ ½ 2 (A9) These parameters were used in the analysis step of assimilation to compute pseudorandom Gaussian distributions of the observations (section 2.2). In addition, the mean value (y 0 c ) and standard deviation (s c ) of the unbiased lognormal distribution of chlorophyll, in concentration units, were obtained from the logarithmic mean and variance in equations (A8) and (A9), by using mathematical properties of lognormal distributions [Mood et al., 1974, equation 39]: These are the parameters used in the presentation of the assimilated data in Figure 3.

Appendix B: The Reanalysis Data Set
The reanalysis data set is available in digital files produced in Network Common Data Form (NetCDF) version 4 (http://www.unidata.ucar.edu), following the standard convention ''Climate and Forecast'' meta-  data CF-1.6 (http://cfconventions. org/). Separated files contain different statistics of the reanalysis ensemble (median, mean, 5th percentile, 95th percentile, minimum, and maximum), for daily and monthly means of pelagic and benthic variables and fluxes listed in Tables B1-B4. An extensive description of such model variables and fluxes was provided by Butensch€ on et al. [2015]. The full reanalysis data set has a size of $12 Tb. A subset of the regridded data set can be visualized, processed, and downloaded at the data portal http://portal. ecosystem-modelling.pml.ac.uk, while the full data set is available on request to the corresponding author.