1 Introduction

In recent years, the emerging impacts of climate change on the Northwest European Shelf (NWES) circulation system and marine ecosystem have caused serious concern (e.g. OSPAR 2000; BACC II Author Team 2015; Quante and Colijn 2016). The NWES is a biologically rich and productive region, naturally inhabited by diverse species of all trophic levels. Its marine ecosystem is characterized by a complex network of interactions between biota and the physical and chemical environment. However, the NWES is also one of the most frequently traversed sea areas in the world. The coasts are densely populated by highly industrialized adjoining countries which conduct major fishing activities including mariculture, exploit oil and gas reservoirs, and install offshore wind farms and tidal power plants for renewable energy generation. Understanding the effects that climate change might have on the NWES dynamical system is key to accounting for both the economic interests and the integrity of ecosystem functioning.

Several studies indicate that biological production on the NWES, including the recruitment of fish larvae and harvests from associated fisheries, responds sensitively to climatic changes. Increasing water temperatures induce changes in the distribution and abundance of planktonic and benthic species (e.g. Beaugrand et al. 2009; Weinert et al. 2016). Increased stratification in the upper ocean and a weakening of winter convection lower the nutrient concentrations of inflowing Atlantic water masses, reducing the NWES primary production and CO2 uptake (e.g. Steinacher et al. 2010; Gröger et al. 2013).

The conventional approach to address regional climate change impacts in the ocean is the dynamical downscaling of a global climate projection by a regional ocean model. In various studies, different downscaling strategies have been devised, motivated by the availability of forcing data and the technical features implemented in the regional model at hand. Atmospheric forcing fields are often taken directly from the output of the parent global model or from a dynamical downscaling with a high-resolution regional atmosphere model. Using output from global models, though, has proved problematic for the prescription of open lateral boundary conditions. Global climate models typically run without ocean tides (none of the CMIP5 models runs with tides). The conditions near the shelf edge, where the open lateral boundaries are usually situated, are therefore inconsistent with a state including tides, since e.g. mixing by internal tides is missing. Besides, global models generally show large biases in the Subpolar Gyre (SPG) region and deficiencies in representing proper water mass exchange between the NWES and the Atlantic Ocean. As a consequence, elaborate bias corrections have to be applied to both the oceanic and atmospheric forcings to introduce realistic gradients, such as in frontal zones or vertically stratified regions (e.g. Katzfey et al. 2009; Holt et al. 2012; Mathis et al. 2013). Moreover, most regional ocean models are not interactively coupled with an atmosphere model. The importance of accounting for coupled air–sea interaction has been emphasized in many studies on regional ocean modeling and dynamical downscaling (e.g. Schrum et al. 2003; Rummukainen 2010; Tian et al. 2013; Van Pham et al. 2014; Gröger et al. 2015; Wang et al. 2015). A high sensitivity in the response of physical and biogeochemical conditions in the ocean to changes in the atmospheric wind and thermal forcings is characteristic of shelf and marginal seas but cannot be accounted for by uncoupled models because they are driven with bulk formulae which do not allow the atmosphere to adjust to the downscaled ocean conditions (e.g. Ådlandsvik 2008; Holt et al. 2010; Olbert et al. 2012; Gröger et al. 2013; Mathis and Pohlmann 2014; Tian et al. 2016; Tinker et al. 2016). Whereas the consensus that coupled regional model systems are the ultimate tool for downscaling climate change projections seems to evolve slowly with respective model development, the additional costs in personnel and computing time still make applications rare (e.g. Kjellström et al. 2005; Bülow et al. 2014). Nevertheless, comparatively simple downscaling techniques are advantageous in dealing with other major sources of uncertainty, e.g. running ensemble simulations with forcings from different global climate models and different realizations.

What is missing so far is a consistent test of the various downscaling strategies, ranging from a downscaling with a fully coupled regional climate system model, through forcing a regional ocean model with data from a dynamical atmosphere downscaling, to directly using forcing fields from the global model output. Other uncertainties to be addressed are the effect of prescribing lateral boundary conditions in the ocean and simplifications in dealing with river runoff.

The aim of this paper is to provide insight into the impacts that common downscaling strategies potentially have on the simulated climate change signal and hence to explore uncertainties in regional climate projection studies rooted in the design of the numerical experiment. We conduct a set of dynamical downscalings for the NWES of one global climate projection with our regional climate system model or subsystems thereof. The considered strategies range from an interactively coupled ocean–atmosphere–land downscaling to an ocean-only downscaling driven by unprocessed atmospheric forcing from the global climate model. The resulting climate change signals are analyzed and compared to the coupled reference downscaling, implicitly assuming that climate signals are most consistently simulated by a fully coupled climate system model. The main advantage of coupled simulations lies in the physically more realistic representation of climatic processes by including coupled air–sea–ice energy, mass and momentum exchange. An evaluation of simulated present-day conditions, nevertheless, is a prerequisite for any dynamical downscaling to ensure reasonable model performance. By contrasting the coupled reference with several uncoupled downscalings, our experiments aim at unveiling both conceptual potentials and limitations in gaining added value for regionalized change signals. As the ocean model component is global with a zoom on the NWES, the effect of domain size can also be tested. This approach allows us to estimate the impact of various simplifications in the experimental setup on the simulated climate change signal as well as to assess the relevance of ocean–atmosphere feedback mechanisms.

2 Methods

Our model system consists of the global ocean model MPIOM, the ocean biogeochemistry model HAMOCC, the hydrological discharge model HD, and the regional atmosphere model REMO (Mikolajewicz et al. 2005; Elizalde Arellano 2011; Sein et al. 2015). In all experiments, MPIOM was run on a stretched grid configuration with non-diametrical poles located over Central Europe and North America (Fig. 1). The use of a global ocean model circumvents the need for prescribed open boundary conditions in the ocean, and the stretched grid configuration provides a higher horizontal resolution in the North Atlantic and NWES than conventional global setups. In the North Sea, the mesh size ranges from about 12 km along the continental coast to about 30 km around the Shetland Islands. The vertical is resolved by 30 levels with 8 levels in the upper 100 m. The same configuration has also been used at various resolutions by Gröger et al. (2013), Su et al. (2014), Sein et al. (2015) and Mathis et al. (2015) and its potential compared to regional North Sea models has been evaluated in a recent study by Pätsch et al. (2017). To account for ocean–atmosphere feedback mechanisms, REMO is interactively coupled with MPIOM over the European and northeast Atlantic areas. The REMO model domain corresponds to the EURO-CORDEX region (e.g. Kotlarski et al. 2014), developed for coordinated ensembles of high-resolution climate simulations of the European mainland and seas (PRUDENCE project, ENSEMBLES project). The REMO grid has a uniform horizontal resolution of about 50 km and 27 levels in the vertical. Lateral soil freshwater drainage and river runoff are simulated by the HD model.

Fig. 1

Stretched grid configuration of the global ocean model MPIOM (black lines), providing a higher horizontal grid resolution in the North Atlantic and NWES. Green lines represent the boundaries of the regional atmosphere model REMO, which corresponds to the standard EURO-CORDEX area

2.1 Model descriptions

MPIOM is the ocean-sea ice component of the global earth system model MPI-ESM (Maier-Reimer 1997; Marsland et al. 2003) of the Max-Planck-Institute for Meteorology in Hamburg. The primitive equations of oceanic motion are discretized on an Arakawa C-grid with z-coordinates and free surface, applying hydrostatic and Boussinesq approximations. Tracer and momentum advection follows a second order total variation diminishing scheme after Sweby (1984). Vertical mixing is performed after Pacanowski and Philander (1981) with an additional parameterization for the effect of ocean currents on wind stress (Jungclaus et al. 2006). In the model version described here, the full ephemeridic luni-solar tidal potential is included according to Thomas et al. (2001). A detailed model description and evaluation is given in Marsland et al. (2003) and Jungclaus et al. (2013).

Biogeochemical processes in the ocean are simulated by the Hamburg Ocean Carbon Cycle model HAMOCC (Maier-Reimer et al. 2005; Ilyina et al. 2013), which is online coupled to MPIOM. Marine biology dynamics is represented by nutrients, phytoplankton, zooplankton, detritus, and dissolved organic matter (Six and Maier-Reimer 1996). Biogeochemical cycles and trophic levels are connected by nutrient uptake and remineralization of organic matter, including colimitation by major macro and micro nutrients, such as phosphate, nitrate, and dissolved iron (Six and Maier-Reimer 1996; Kloster et al. 2006). Atmospheric dust deposition as a source for iron follows a present-day eolian climatology (Mahowald et al. 2006). A sediment module accounts for deposition processes of solid constituents as well as for the biogeochemistry of pore water tracers in the upper bioturbated sediments (Heinze et al. 1999). Resuspension of marine sediments in the model configuration described here is implemented after Wilcock (2004), based on the concept of incipient motion in dependency on the critical bed shear stress for mean upper sediment density and grain size. Tracer transport is simulated by MPIOM. In this way, HAMOCC describes cycles of biogeochemical elements in the ocean water column and sediment and calculates fluxes of tracers between these two compartments and at the air–sea interface.

The regional atmosphere model REMO (Jacob and Podzun 1997; Jacob et al. 2001) evolved from the Europa-Modell of the German Weather Service (Majewski 1991). The physical parameterization scheme has been adopted from the global atmosphere circulation model ECHAM versions 4 and 5 (Roeckner et al. 1996, 2003). Discretized primitive equations of atmospheric motion are solved making use of the hydrostatic assumption. The prognostic variables are surface pressure, horizontal wind velocity components, air temperature, water vapour, cloud liquid water, and cloud ice. The origin of the rotated spherical Arakawa C-grid is located in the geographical center of the model domain (see Fig. 1) to minimize cell distortion at higher latitudes. A hybrid coordinate system (Simmons and Burridge 1981) is used to resolve the vertical. The temporal integration follows a leapfrog scheme with semi-implicit correction and time filtering after Asselin (1972). In coupled mode (Mikolajewicz et al. 2005; Elizalde Arellano 2011; Sein et al. 2015), sea surface fluxes of heat, momentum, and freshwater are transferred to the ocean model MPIOM by the OASIS3 coupler (Valcke 2013), and in turn sea surface temperature and sea ice cover are passed from MPIOM to REMO. At the land–atmosphere interface a simple bucket scheme is applied for the treatment of soil hydrology. Freshwater drainage and runoff on land are calculated by the hydrological discharge model HD (Hagemann and Dümenil-Gates 2001).

2.2 Model experiments

In this section, the experimental setups of the different downscaling simulations are introduced. The parent global climate projection used for this downscaling exercise is the first realization of the low resolution version of the global coupled earth system model MPI-ESM, driven by the Intergovernmental Panel on Climate Change (IPCC) Representative Concentration Pathway RCP8.5 (van Vuuren et al. 2011; Giorgetta et al. 2013). All model experiments start with the simulation of the historical period 1920–2005 and continue with the RCP8.5 emission scenario until the year 2100. In the coupled reference setup Cref, MPIOM is coupled with REMO to account for air–sea interaction processes in the northeastern North Atlantic, including the NWES (Fig. 1). Outside the coupling domain, MPIOM and REMO are forced with output from the global atmosphere model ECHAM6, the atmosphere component of MPI-ESM (CMIP5 version). Remaining model inconsistencies between REMO and ECHAM6 are basically restricted to the representations of shortwave irradiance and clear sky albedo (Stevens et al. 2013). To avoid possible boundary effects nonetheless, REMO variables are gradually relaxed toward the ECHAM6 forcing data in the outer 20 boundary grid cells. Moreover, problematic air–sea–ice fluxes due to differences in sea ice cover can be ruled out, since sea ice conditions in all downscaling experiments are in good agreement with MPI-ESM and observations, including realistic sea ice in the Baltic. The freshwater cycle is closed by using the HD model. To prevent imbalance in the global freshwater budget, a sea surface salinity (SSS) restoring is applied in the global ocean domain, excluding the North Sea and Baltic Sea (Fig. 2), which accounts for MPI-ESM biases in climatological SSS (period 1920–1990) relative to the observational Polar Science Center Hydrographic Climatology (Steele et al. 2001). The reference simulation Cref thus represents a fully coupled regionalization of a transient global climate projection performed by MPI-ESM. The downscaling factors in the NWES region amount to about 3–8 for the ocean and 4 for the atmosphere. In the same way, the corresponding MPI-ESM control simulation under constant preindustrial atmospheric CO2 concentration has been regionalized to distinguish anthropogenic climate change signals in the model results from potential model drift. The coupling time step for Cref and the control simulation is 3 h. Table 1 provides an overview of key model simulations.

Fig. 2

Blue: North Sea domain used for the calculation of spatial averages. Red: global ocean domain excluding the North Sea and Baltic Sea, used for SSS restoring and downscaling experiments with three-dimensional restoring of water temperature, salinity and nutrient concentrations. Black dashed box: region used for the calculation of stratification changes. Black lateral sections: 1 West Norwegian Trench inflow, 2 East Shetland inflow, 3 Orkney Shetland inflow, 4 English Channel inflow, 5 Baltic Sea inflow, 6 Norwegian outflow. Annual mean flow directions are shown in Fig. 13

Table 1 Overview of key model simulations

In the following paragraphs, several uncoupled downscaling experiments are introduced, where each experiment mimics a common downscaling strategy of different complexity. The comparison of the model results allows us to assess the ability of the individual downscaling strategies to develop climate change signals independent of the parent global climate projection. The interactive ocean–atmosphere coupling, however, enables the reference simulation Cref to evolve with a higher number of degrees of freedom, compared to the uncoupled experiments. To account for internal variability in the coupled downscaling, Cref is simulated for three different initial conditions. To minimize spinup time, the first realization has been initialized for the year 1920 by equilibrium conditions from test simulations of about 200 years in total. The second and third realizations have been initialized by 1923 and 1925 conditions of the first realization, respectively. For the uncoupled experiments, by contrast, initial conditions are identical (the same as for the first Cref realization). Nevertheless, the simulated conditions after initialization differ among the uncoupled experiments at every model time step because of the different forcings, which pose a much stronger perturbation than a change in the initial conditions only. Internal variability of the uncoupled experiments can therefore be neglected for our purposes. Deviations of the various downscalings from the parent global simulation finally provide information about their added value. It will be shown that the differences or similarities in the downscaled change signals cannot be explained by internal model variability, i.e. they are characteristic of the downscaling strategies rather than mere noise. For simplicity, Cref therefore always refers to the first realization of the coupled downscaling, unless mentioned otherwise.

In the uncoupled experiment CF (coupled forcing), we check beforehand whether the projected change signals of the coupled reference simulation Cref can be reproduced when MPIOM is forced with REMO output from Cref, in order to rule out systematic errors between the coupled and uncoupled forcing techniques involved in this study.

The most straightforward downscaling approach is to drive the regional ocean model directly by atmospheric output from the original global simulation (e.g. Ådlandsvik 2008; Olbert et al. 2012; Gröger et al. 2013). In the second experiment EF (ECHAM forcing), we exclude REMO from the model system and force MPIOM entirely with the coarser ECHAM6 output (horizontal resolution 1.875°). ECHAM6 grid cells containing land fractions have been excluded from the interpolation procedure of the forcing fields. This experiment demonstrates the influence of lacking a regionalized atmospheric forcing with higher resolution compared to the parent global model. River discharge is still used from Cref, since it is not available from MPI-ESM in sufficient spatial and temporal resolutions. In addition, model output of the original global simulation is usually prescribed at the open lateral boundaries of the regional ocean domain (e.g. Ådlandsvik 2008; Holt et al. 2010; Mathis and Pohlmann 2014; Tian et al. 2016; Tinker et al. 2016). Thus, in the third experiment EFroc (ECHAM forcing regional ocean), EF is repeated with a strong three-dimensional time-dependent restoring of sea water temperature, salinity, and nutrient concentrations (total PO4, NO3 and Si) outside the North Sea (Fig. 2). The restoring targets represent anomalies of the MPI-ESM output relative to the historical mean of the coupled reference simulation Cref. With this experiment, we mimic a stand-alone regional ocean model that uses MPI-ESM output not only for the atmospheric forcing but also to prescribe oceanic conditions at the open lateral boundaries of its North Sea-Baltic Sea domain.

In uncoupled downscalings, two ways of horizontal interpolation are generally used for the preparation of atmospheric forcing fields. To reach spatial completeness, extrapolation of atmospheric data toward the high-resolution coast is applied if the land-sea mask of the original data provides enough information over the ocean. When the resolution of the original data is rather coarse, however, land grid points are often utilized to avoid intense extrapolation of sea conditions into areas of missing wet data. In the next experiment EFland (ECHAM forcing land), MPIOM is driven by the same atmospheric forcing used for experiments EF and EFroc, except that during the preparation of the forcing fields, ECHAM6 grid cells containing land fractions have not been excluded from the interpolation procedure. Differences between EF and EFland thus highlight implications of poorly resolved coastlines in the atmospheric forcing.

A common method to increase the resolution of the atmospheric forcing is to launch an atmospheric downscaling with a regional atmosphere model (e.g. Holt et al. 2010; Mathis and Pohlmann 2014; Tian et al. 2016; Tinker et al. 2016). In the fifth experiment RF (REMO forcing), MPIOM is forced with REMO output from an atmosphere-only downscaling of ECHAM6, which has been driven by atmospheric lateral boundary data, sea surface temperature (SST), sea ice conditions and sea surface velocity from the global MPI-ESM simulation. The internal model setup of REMO has been identical to the atmospheric component of the fully coupled regional model system. This experiment thus already includes higher horizontal resolution in the NWES for both the ocean and atmosphere, but unlike Cref, it does not account for air–sea feedback mechanisms.

In analogy to EFroc, RF is repeated in another experiment RFroc (REMO forcing regional ocean), applying the three-dimensional ocean restoring outside the North Sea and Baltic Sea. Hence, RFroc mimics a state-of-the-art uncoupled dynamical downscaling with a regional ocean model driven by downscaled high resolution atmospheric forcing and prescribed lateral boundary conditions in the ocean extracted from the parent global climate simulation.

A prominent added value of dynamical downscalings is the simulation of ocean tides and related processes such as energy dissipation due to bottom friction and vertical mixing. In particular, a realistic representation of shallow marginal areas like the NWES crucially relies on the incorporation of tidal dynamics. In the next experiment CFtid (coupled forcing tides), atmospheric forcing fields are taken from Cref while in MPIOM, the calculation of the ephemeridic tidal potential is switched off. With this experiment, we analyze effects on the downscaled change signals when a comparatively simple ocean model is used, lacking the incorporation of essential physical processes.

In all downscaling experiments so far, river discharge is taken from the coupled reference run Cref, which includes a closed freshwater cycle simulated by the HD model. However, freshwater runoff for individual rivers is often prescribed as climatological means when not provided by the original global model output (e.g. Mathis and Pohlmann 2014; Tian et al. 2016). To study the influence of neglected transient lateral freshwater drainage and runoff, in the last experiment CFriv (coupled forcing rivers), constant monthly climatological river runoff (calculated from Cref 1971–2000) is used in a simulation otherwise driven by atmospheric forcing fields from Cref.
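
As an illustration of how a constant monthly runoff climatology of the kind used in CFriv could be derived, the following minimal sketch averages a transient runoff record over a reference period. The function names, the (year, month, value) record layout and the synthetic data are assumptions made for this example only; they do not describe the preprocessing actually applied to the HD model output.

```python
def monthly_climatology(records, first_year=1971, last_year=2000):
    """Mean runoff per calendar month over a reference period.

    `records` is an iterable of (year, month, runoff) tuples, a
    hypothetical layout chosen for this sketch.
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for year, month, value in records:
        if first_year <= year <= last_year:
            sums[month - 1] += value
            counts[month - 1] += 1
    if 0 in counts:
        raise ValueError("reference period lacks data for some months")
    return [s / c for s, c in zip(sums, counts)]


def climatological_runoff(climatology, month):
    """Runoff prescribed for a given calendar month, identical every year."""
    return climatology[month - 1]


# Synthetic transient record: seasonal cycle plus a slow upward drift.
records = [
    (year, month, 100.0 * month + (year - 1985))
    for year in range(1960, 2011)
    for month in range(1, 13)
]
climatology = monthly_climatology(records)
```

Because the same twelve values are repeated every year, any transient trend or interannual variability present in the original runoff record is removed, which is the simplification this experiment is designed to test.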

Historical riverine nutrient concentrations are provided for the North Sea (Pätsch and Lenhart 2004), Baltic Sea (Savchuk et al. 2012), and Mediterranean Sea (Ludwig et al. 2009), while future concentrations follow a white noise distribution with mean and variance of the period 2001–2010 from the cited observations. The atmospheric forcing fields for the uncoupled experiments consist of 3-hourly means to be consistent with the coupling frequency of Cref.
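
A white noise series with prescribed mean and variance, as described above for future riverine nutrient concentrations, can be sketched as follows. The choice of a Gaussian distribution, the use of the sample standard deviation, the function name and the example values are assumptions for illustration; the source states only that mean and variance are taken from the period 2001–2010.

```python
import random
import statistics


def future_concentrations(observed, n_years, seed=0):
    """Draw uncorrelated annual values with the mean and variance of `observed`.

    Gaussian white noise is an assumption of this sketch; the source
    specifies only the first two moments.
    """
    mean = statistics.mean(observed)
    std = statistics.stdev(observed)  # sample standard deviation (assumed)
    rng = random.Random(seed)         # seeded for reproducibility
    return [rng.gauss(mean, std) for _ in range(n_years)]


# Hypothetical observed concentrations for 2001-2010 (arbitrary units).
observed = [2.0, 2.5, 1.8, 2.2, 2.4, 2.1, 1.9, 2.3, 2.0, 2.6]
future = future_concentrations(observed, n_years=90, seed=1)
```

By construction such a series carries no trend, so any projected change in the biogeochemical fields cannot be attributed to changing riverine nutrient loads.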

2.3 Control simulations

Model simulations that include regions of deep ocean are prone to model drift and therefore need appropriate spinup time. As mentioned in the previous section, our experiments continue from test simulations of about 200 years. To rule out remaining effects of model drift in the projected climate change signals, control simulations under constant preindustrial atmospheric CO2 concentration (285 ppm) have been performed for the global and regional coupled setups. Neither control simulation shows significant trends in SST, SSS, or current speeds in the NWES and northeastern North Atlantic. Time series of SST and SSS in the North Sea are exemplified in Fig. 3. The considered change signals thus do not need to be corrected in our analysis and can be regarded as anthropogenic impact.

Fig. 3

Time series of 10-year running mean North Sea SST (upper panel) and SSS (lower panel). Black: coupled downscaling of the preindustrial control simulation. Red: coupled downscaling Cref of the historical period and RCP8.5 scenario

2.4 Analysis

A global future projection by the coupled earth system model MPI-ESM has been regionalized for the NWES by the coupled downscaling simulation Cref. Climate change signals of this downscaling simulation are considered here as benchmark for the various uncoupled downscaling approaches.

In general, differences between the two time slices 1971–2000 (representing mean present-day conditions) and 2071–2100 (representing mean conditions at the end of the twenty-first century) are analyzed for the discussion of projected change signals. Statistical significance is derived from the variance of the control simulations. We calculated the standard deviation of all possible 30-year means and multiplied it by \(\sqrt 2\). The latter factor doubles the variance of the set of 30-year means, taking into account that the two particular 30-year means of the change signals (1971–2000 and 2071–2100) are themselves taken from a set with similar variance. Any change signal greater than 1.96 times this enlarged standard deviation is ascribed significance greater than 95%.
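
The significance criterion described above can be sketched in a few lines. The sketch assumes that "all possible 30-year means" refers to all overlapping consecutive 30-year windows of an annual control series and that the population standard deviation is used; both choices, as well as the function names and the synthetic control series, are assumptions of this example rather than details given in the text.

```python
import math
import random
import statistics


def running_means(series, window=30):
    """All consecutive `window`-year means of an annual series."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def significance_threshold(control_annual, window=30, z=1.96):
    """Smallest change signal regarded as significant at the 95% level."""
    sigma = statistics.pstdev(running_means(control_annual, window))
    # sqrt(2) doubles the variance, because the change signal is the
    # difference of two 30-year means with similar variance.
    return z * math.sqrt(2.0) * sigma


def is_significant(change_signal, control_annual, window=30):
    """True if the magnitude of the change exceeds the threshold."""
    return abs(change_signal) > significance_threshold(control_annual, window)


# Synthetic control series: 200 years of noise around 10 degC (illustrative).
rng = random.Random(0)
control = [10.0 + rng.gauss(0.0, 0.5) for _ in range(200)]
threshold = significance_threshold(control)
```

Note that overlapping windows are not statistically independent, so the spread of their means is only an estimate of the variability of independent 30-year averages.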

Temperature and salinity change signals in particular are significant compared to the variability of the control simulations. Time series of annual volume transports or current speeds, however, exhibit pronounced variability in the NWES region compared to centennial trends. Analyzing the differences between 1971–2000 and 2071–2100 can therefore lead to misleading change signals. Instead, linear trends of depth-averaged current velocities as well as volume transports through the main North Sea inflow and outflow sections (see Fig. 2) have been calculated for the analysis of projected changes in the general circulation.
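
A linear trend of an annual transport or velocity series can be estimated as sketched below. Ordinary least squares is assumed here, since the fitting method is not specified in the text; the function name and the synthetic transport series are hypothetical.

```python
def linear_trend(years, values):
    """Ordinary least-squares slope and intercept of `values` against `years`."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_t = sum(years) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in years)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(years, values))
    slope = sxy / sxx
    intercept = mean_v - slope * mean_t
    return slope, intercept


# Synthetic annual volume transport (Sv) declining by 0.002 Sv per year.
years = list(range(1971, 2101))
transport = [1.5 - 0.002 * (y - 1971) for y in years]
slope, intercept = linear_trend(years, transport)
```

Fitting a trend over the full period uses every year of the series, which makes the estimate less sensitive to strong interannual variability than the difference of two 30-year means.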

3 Results

3.1 Sea surface temperature

Climate change signals of annual mean SST are shown in Fig. 4 for the global MPI-ESM simulation and the coupled downscaling experiment Cref. The projected signal is stronger in Cref than in MPI-ESM in the entire North Sea and Baltic Sea as well as in the northeastern North Atlantic. This tendency is also reflected by the change in vertically integrated heat content (not shown).

Fig. 4

Climate change signal of annual mean SST (RCP8.5 2071–2100 minus Historical 1971–2000) for MPI-ESM, the coupled downscaling Cref and uncoupled downscalings given in Table 1. Dotted areas mark changes less than 95% significance

The present-day warming trend of the North Sea results from an imbalance of its heat budget, given as net heat gain in Table 2. Through the annual cycle, more heat stored in North Atlantic water masses is transported into the North Sea (via advection and diffusion) than is released to the atmosphere (via surface heat and freshwater fluxes, including river runoff) or exported to the Norwegian Sea (via the Norwegian Coastal Current). The projected warming trends in the scenario simulations MPI-ESM and Cref are approximately linear, which means that this imbalance remains approximately constant. However, the contributions from the ocean and atmosphere change.

Table 2 Mean North Sea heat budget and SST for 1971–2000 and projected changes for 2071–2100 from MPI-ESM and Cref

During months of net surface heat input (April to August, referred to as summer in this section), the general circulation in the North Sea and hence heat import from the North Atlantic weakens in the scenario simulations due to lower surface wind stress and a higher frequency of northerly wind components, while the heat input through the sea surface does not change significantly. Opposite trends are indicated during months of net surface heat loss (September to March, referred to as winter in this section). The general circulation does not change significantly, while the heat loss to the atmosphere weakens due to weaker sensible heat flux (lower air–sea temperature difference) and long-wave radiation (higher cloud cover and air temperature). In other words, the future decrease in total summer heat input is approximately equal to the decrease in total winter heat loss, maintaining the imbalance of the North Sea heat budget. Nevertheless, the decreases in both summer heat input and winter heat loss give rise to a stronger warming trend of the North Sea water temperature in winter than in summer. Furthermore, the stronger seasonal trends in MPI-ESM imply a larger difference between winter and summer warming (0.3 °C) than in Cref (0.1 °C).

While in Cref, the weakening of the North Sea circulation in summer is rather general, in MPI-ESM it affects particularly the southern North Sea and the warm English Channel flow, causing a stronger reduction of annual ocean heat inflow in MPI-ESM than in Cref.

The different trends of surface heat loss between MPI-ESM and Cref result from different trends in the northeastern North Atlantic. For the two simulations, barotropic stream functions of North Atlantic volume transport are shown for present-day conditions (1971–2000) in Fig. 5. In MPI-ESM, the flow direction of the North Atlantic Current (NAC) is rather zonal, a common deficiency of general circulation models with limited resolution, causing a cold temperature bias in the eastern North Atlantic (e.g. Molinari et al. 2008; Keeley et al. 2012; Flato et al. 2013), whereas in Cref, the NAC follows a realistic northeastward direction. Corresponding stream functions at the end of the scenario period (2071–2100), also shown in Fig. 5, indicate a weakening of the NAC and an intensification of the SPG for MPI-ESM. In coupled ocean–atmosphere models, a weakening of the Atlantic Meridional Overturning Circulation (AMOC) is typically associated with a strengthening and eastward extension of the North Atlantic storm track (e.g. Woollings et al. 2012; Nissen et al. 2014). Hence, in MPI-ESM the northward ocean heat transport weakens and the upper 100–150 m water column in the northeastern North Atlantic, including the North Sea, freshens substantially (decrease in salinity 0.8 g kg−1). The induced vertical stratification limits winter heat loss of the upper water column to the atmosphere, reducing the air–sea temperature difference and therefore the turbulent heat flux by a further cooling of the upper ocean. The weaker ocean heat supply in the northeastern North Atlantic is indicated by a negative change in the convergence of vertically integrated ocean heat transport, shown in Fig. 6. In Cref, by contrast, the intensification of the SPG involves an eastward shift of the NAC, bringing anomalous heat into the northeastern North Atlantic. Accordingly, the change in the convergence of ocean heat transport, also shown in Fig. 6, is positive and the freshening of the upper water column is weaker than in MPI-ESM (decrease in salinity 0.4 g kg−1). The additional heat in the northeastern North Atlantic locally enhances winter heat loss to the atmosphere, increasing near-surface air temperature and humidity. The warmer and wetter air masses are advected into the NWES region by westerly winds and reduce the sensible heat flux in the North Sea, but leave the latent heat flux unchanged.

Fig. 5

Barotropic volume stream function at the end of the historical period (1971–2000) and at the end of the RCP8.5 scenario (2071–2100) for MPI-ESM and Cref. Arrows sketch the simulated pathway of the NAC

Fig. 6

Climate change signal of annual mean convergence of ocean heat transport (RCP8.5 2071–2100 minus Historical 1971–2000) for MPI-ESM and Cref

The resulting reductions in annual net ocean heat inflow into the North Sea and surface heat loss to the atmosphere (Table 2) are thus weaker in Cref than in MPI-ESM. Nevertheless, they still balance largely in both scenario simulations, leaving the imbalances of the North Sea heat budgets and hence the annual warming trends constant.

Before presenting the results of various uncoupled model systems, we verify that simulation CF, for which REMO output from Cref is used to force MPIOM uncoupled, reproduces the coupled climate change signals well (compare Fig. 7 with Figs. 4 and 9). In the North Sea, root mean squared differences (RMSD) between CF and Cref change signals amount to 0.02 °C for SST, 0.03 g kg−1 for SSS, and 0.06 cm s−1 for depth-averaged current speed. The corresponding climate change signals are at least two orders of magnitude larger, demonstrating that no systematic bias arises from using an exchange of surface fluxes (coupled experiments) rather than bulk formulae to calculate the fluxes (uncoupled experiments). Such uncoupled repetitions are therefore suitable for further sensitivity experiments, e.g. in simulating biogeochemical effects for different scenarios of riverine nutrient loads.
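The RMSD used for these comparisons can be illustrated with a minimal sketch. The function name, the optional region mask and the optional cell-area weights are assumptions made for illustration; the text does not state whether the RMSD is area weighted, and the arrays below are synthetic rather than model output.

```python
import numpy as np

def rmsd(signal_a, signal_b, mask=None, weights=None):
    """Root mean squared difference between two change-signal fields.

    mask    : optional boolean array, True inside the region of interest
              (e.g. a hypothetical North Sea mask).
    weights : optional non-negative weights such as grid-cell areas
              (an assumption; uniform weights are used if omitted).
    """
    a = np.asarray(signal_a, dtype=float)
    b = np.asarray(signal_b, dtype=float)
    w = np.ones_like(a) if weights is None else np.asarray(weights, dtype=float)
    if mask is not None:
        # Cells outside the region get zero weight
        w = np.where(np.asarray(mask, dtype=bool), w, 0.0)
    sq_diff = (a - b) ** 2
    return float(np.sqrt(np.sum(w * sq_diff) / np.sum(w)))

# Synthetic SST change signals in degC, for illustration only
cref_signal = np.array([[2.2, 2.3], [2.2, 2.3]])
cf_signal = cref_signal + 0.02
print(rmsd(cref_signal, cf_signal))
```

Comparing the resulting RMSD with the magnitude of the change signal itself gives the kind of ratio quoted in the paragraph above.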

Fig. 7

Climate change signals (RCP8.5 2071–2100 minus Historical 1971–2000) of annual mean SST (left panel) and SSS (right panel) for experiment CF, driven by REMO output from Cref

Annual SST change signals downscaled by the uncoupled experiments EF, EFroc, RF and RFroc are shown in Fig. 4. Owing to the higher grid resolution, the spatial structure of the downscaled change signals is more detailed in all these simulations, but the large-scale pattern and in particular the magnitude fairly reproduce the change signal of the parent global simulation. Since SST and near-surface air temperature are closely related to each other, the SST change signal of MPI-ESM is imposed on the uncoupled experiments by the atmospheric forcing. In EF and EFroc, the atmospheric forcing is directly taken from MPI-ESM output, while in RF and RFroc, the intermediate atmospheric downscaling by REMO was driven by MPI-ESM output, including SST. The strong influence of the atmospheric forcing on SST is reflected both in the shallow NWES and the deeper northeastern North Atlantic, leading to overall lower SST change signals than in the coupled downscaling Cref. Outside the North Sea the similarity to MPI-ESM is largest in EFroc and RFroc due to the involved three-dimensional restoring.

The downscaled change signals of the three Cref realizations branching from different initial conditions range just between 2.2 and 2.3 °C in the North Sea. Likewise, the spatial patterns and gradients are very similar. RMSDs between the change signal of MPI-ESM and the Cref realizations range from 0.26 to 0.34 °C, while the results from the uncoupled downscalings are significantly closer to MPI-ESM with RMSDs ranging from 0.10 to 0.13 °C. The distinctness of the coupled downscaling is thus not a random artifact due to internal model variability.

Effects on the downscaled SST change signal in experiment EFland are shown in Fig. 8. The influence of land fractions on the interpolated forcing fields introduces an additional error in downscaled coastal areas where the global model poorly represents the land-sea distribution. In particular in the German Bight, along the Danish coasts, and in the Skagerrak and Kattegat regions the projected SST change signal increases in EFland by about 0.3 °C compared to EF and EFroc, locally exceeding even the coupled reference run Cref. A similar SST sensitivity to coarse atmospheric forcing along coastlines was found for the Danish Straits by Tian et al. (2013). Besides, in our simulations, crucial changes in the atmospheric forcing along the east coast of North America sensitively affect the water mass characteristics of the Gulf Stream and NAC, yielding a weakening of the SST change signal in the northeastern North Atlantic by 0.1–0.2 °C. Hence, if a fully coupled ocean–atmosphere downscaling cannot be applied, it is important to restrict the global source data to ocean-only grid cells. Otherwise, the difference in the heat storage capacities of sea water and terrestrial ground introduces a spurious effect into the interpolated forcing fields.
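The recommendation to restrict the source data to ocean-only grid cells can be sketched as follows. This is not the interpolation scheme used in the experiments, which is not described here; it is a simple inverse-distance interpolation chosen for illustration, with planar distances in degrees as a simplifying assumption. All names and values are hypothetical.

```python
import numpy as np

def interpolate_ocean_only(src_lon, src_lat, src_val, src_is_ocean,
                           tgt_lon, tgt_lat, power=2.0):
    """Inverse-distance interpolation that ignores source cells with land.

    src_is_ocean : boolean array, True for source cells without any land fraction.
    Distances are planar in degrees (a simplification for this sketch).
    """
    ocean = np.asarray(src_is_ocean, dtype=bool)
    if not ocean.any():
        raise ValueError("no ocean source cells available")
    lon = np.asarray(src_lon, dtype=float)[ocean]
    lat = np.asarray(src_lat, dtype=float)[ocean]
    val = np.asarray(src_val, dtype=float)[ocean]
    tgt_lon = np.atleast_1d(np.asarray(tgt_lon, dtype=float))
    tgt_lat = np.atleast_1d(np.asarray(tgt_lat, dtype=float))
    out = np.empty(tgt_lon.shape)
    for i, (x, y) in enumerate(zip(tgt_lon, tgt_lat)):
        dist = np.hypot(lon - x, lat - y)
        if np.any(dist == 0.0):
            # Target coincides with a source cell: take its value directly
            out[i] = val[np.argmin(dist)]
        else:
            w = dist ** (-power)
            out[i] = np.sum(w * val) / np.sum(w)
    return out

# Synthetic example: two ocean cells and one warmer land-contaminated cell
lon = [6.0, 7.0, 8.0]
lat = [54.0, 54.0, 54.0]
t2m = [10.0, 10.0, 20.0]
print(interpolate_ocean_only(lon, lat, t2m, [True, True, False], 7.5, 54.0))
print(interpolate_ocean_only(lon, lat, t2m, [True, True, True], 7.5, 54.0))
```

In the second call the land-contaminated cell is admitted, and the interpolated coastal value is pulled toward the land value, which is the kind of spurious effect described for EFland.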

Fig. 8

Difference in the climate change signals of annual mean SST (RCP8.5 2071–2100 minus Historical 1971–2000) between experiment EF and experiment EFland, including land grid cells in the forcing interpolation

In experiment CFtid, the incorporation of the ephemeridic tidal potential has been switched off. Nevertheless, even in such a drastic misrepresentation of shelf sea dynamics, the SST change signals for both seasonal and annual means (not shown) are well reproduced because of their tight dependency on the atmospheric forcing. The RMSD between CFtid and Cref change signals amounts to just 0.05 °C.

Experiment CFriv is not analyzed in this section because the influence of river runoff on the considered SST projection is negligibly small.

3.2 Sea surface salinity

Climate change signals of annual mean SSS are shown in Fig. 9. As explained in Sect. 3.1, the projected changes in NAC and SPG conditions induce a stronger salinity drop in the northeastern North Atlantic in MPI-ESM than in Cref. The salinity of the North Sea is modulated by inflow of salty North Atlantic water masses and riverine freshwater discharge along the continental coast and into the Baltic Sea. The change signal of the northeastern North Atlantic water masses is advected into the North Sea, causing a stronger salinity drop in MPI-ESM than in Cref in the northern and central North Sea as well as in the English Channel. Mean North Sea SSS change signals amount to −0.85 g kg−1 for MPI-ESM and −0.79 g kg−1 for Cref. Among the three Cref realizations the signals range from −0.76 to −0.79 g kg−1, while the discrepancy between MPI-ESM and Cref is more evident in the spatial distribution shown in Fig. 9.
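The change signals quoted throughout are differences between two 30-year means (RCP8.5 2071–2100 minus Historical 1971–2000), followed by a regional average. A minimal sketch is given below, assuming annual-mean fields on a (time, lat, lon) grid and an area-weighted regional mean; the function names, the mask and the synthetic data are hypothetical.

```python
import numpy as np

def change_signal(field, years, hist=(1971, 2000), scen=(2071, 2100)):
    """Difference of time-slice means: scenario period minus historical period.

    field : array (time, lat, lon) of annual means
    years : array (time,) with the calendar year of each time step
    """
    field = np.asarray(field, dtype=float)
    years = np.asarray(years)
    in_hist = (years >= hist[0]) & (years <= hist[1])
    in_scen = (years >= scen[0]) & (years <= scen[1])
    return field[in_scen].mean(axis=0) - field[in_hist].mean(axis=0)

def area_mean(signal, cell_area, mask):
    """Area-weighted mean of a 2-D signal over the cells where mask is True."""
    mask = np.asarray(mask, dtype=bool)
    w = np.where(mask, cell_area, 0.0)
    s = np.where(mask, signal, 0.0)  # avoids NaNs outside the region
    return float(np.sum(s * w) / np.sum(w))

# Synthetic salinity field with a uniform linear freshening, illustration only
years = np.arange(1971, 2101)
sss = (35.0 - 0.008 * (years - 1971))[:, None, None] * np.ones((1, 2, 2))
signal = change_signal(sss, years)
print(area_mean(signal, np.ones((2, 2)), np.ones((2, 2), dtype=bool)))
```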

Fig. 9

Climate change signal of annual mean SSS (RCP8.5 2071–2100 minus Historical 1971–2000) for MPI-ESM, the coupled downscaling Cref and uncoupled downscalings given in Table 1. Dotted areas mark changes less than 95% significance. Experiment CFriv is shown in Fig. 10

Since the differences in the projected changes in the northern North Atlantic circulation and salinity are largely determined by the grid resolution rather than by air–sea interaction, the SSS change signals in EF and RF are indeed quite similar to Cref. A slightly better approximation can be recognized for RF because the same high-resolution atmosphere model (REMO) has been used in both RF and Cref simulations. River runoff is taken from Cref in all downscaling experiments shown in Fig. 9, dominating the salinity change along the continental coast and in the Baltic Sea. Large differences from the coupled downscaling Cref, however, arise in experiments EFroc and RFroc. The three-dimensional salinity restoring outside the North Sea and Baltic Sea imposes the stronger freshening trend of MPI-ESM on the inflowing North Atlantic water masses, leading to an amplification of the negative SSS change signal in the North Sea by about 0.2 g kg−1. Moreover, the freshening of the Norwegian Coastal Current in MPI-ESM penetrates westward into EFroc and RFroc inflow conditions, acting as a further artificial freshening of Atlantic inflow.

The adequacy of the EF and RF strategies is also reflected by the RMSDs of their North Sea salinity signals relative to the three Cref realizations, which range from 0.13 to 0.14 g kg−1 and from 0.05 to 0.10 g kg−1, respectively, while for EFroc and RFroc they range from 0.27 to 0.29 g kg−1 and from 0.26 to 0.30 g kg−1.

The influence of neglected transient river runoff is examined in experiment CFriv. In general, a warming global climate is associated with an increase in atmospheric moisture transport from the tropics towards higher latitudes, which causes an intensification of the hydrological cycle. The resulting increase in European river runoff is reflected in Cref by maximum salinity decreases along the continental coast and in the Baltic Sea. Consequently, the downscaled SSS change in CFriv, shown in Fig. 10, is substantially underestimated due to the use of monthly climatological present-day river runoff.

Fig. 10

Climate change signal of annual mean SSS (RCP8.5 2071–2100 minus Historical 1971–2000) for experiment CFriv, driven by constant monthly climatological river runoff

Similar to SST, the incorporation of land grid cells in the interpolation of the meteorological forcing fields introduces additional errors to downscaled SSS changes. Compared to EF, the results of experiment EFland shown in Fig. 11 indicate a weakening of the projected SSS trends along the North Sea coasts by up to 0.5 g kg−1 in the German Bight and Skagerrak. In particular, the change signals of precipitation and evaporation are sensitive to the land-sea mask, altering in EFland the interpolated surface fresh water supply of coastal areas. In northern Europe, there is typically less annual precipitation over land than over the ocean and air masses have higher annual mean temperatures but lower relative humidity, with projected future changes showing the same qualitative differences.

Fig. 11

Difference in the climate change signals of annual mean SSS (RCP8.5 2071–2100 minus Historical 1971–2000) between experiment EF and experiment EFland, including land grid cells in the forcing interpolation

3.3 Vertical stratification

Changes in the maximum vertical density gradients are shown in Fig. 12. In the seasonally stratified thermocline region of the North Sea, the maximum vertical density gradients increase in spring and summer by up to 2 g m−4 for MPI-ESM and 6 g m−4 for Cref. In both simulations, the change signals in summer are dominated by the projected temperature rise. Even though the vertical temperature gradient becomes weaker due to the stronger warming trend in winter than in summer, as mentioned in Sect. 3.1, the nonlinear dependency of water density on temperature overcompensates, yielding an intensifying contribution to the pycnocline intensity. In Cref, the temperature effect amounts to about 60% of the given change signal, while 40% result from the freshening of the surface layers. In MPI-ESM, the temperature effect is lower because of the weaker projected temperature increase but amounts to about 150% of the stratification strengthening shown in Fig. 12, whereas the projected vertical salinity distribution is more homogeneous than today, weakening the density gradient by 50%. The uncoupled downscalings follow Cref in spring, governed by similar vertical salinity distributions of the high-resolution simulations, but the stratification changes become moderated in summer by about 20–30% because of the weaker temperature signal inherited from the global model atmospheric forcing.
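The stratification measure used here, the maximum vertical density gradient in g m−4, can be computed from a single density profile as in the following sketch. The finite-difference form between adjacent model levels is an assumption, and the profiles are synthetic values chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def max_density_gradient(depth_m, density_kg_m3):
    """Maximum vertical density gradient of one profile, in g m-4.

    depth_m        : level depths in metres, positive downward, strictly increasing
    density_kg_m3  : potential density at those levels in kg m-3
    """
    z = np.asarray(depth_m, dtype=float)
    rho = np.asarray(density_kg_m3, dtype=float)
    if np.any(np.diff(z) <= 0.0):
        raise ValueError("depths must be strictly increasing")
    grad = np.diff(rho) / np.diff(z)       # kg m-4 between adjacent levels
    return float(np.max(grad) * 1000.0)    # convert kg to g

# Synthetic summer profiles for a present-day and a future period
depth = [5.0, 15.0, 25.0, 35.0]
rho_present = [1025.0, 1025.1, 1025.6, 1025.7]
rho_future = [1024.9, 1025.0, 1025.56, 1025.66]
change = max_density_gradient(depth, rho_future) - max_density_gradient(depth, rho_present)
print(change)  # change signal of the pycnocline strength in g m-4
```

The temperature and salinity contributions quoted above sum to the total change in each model: 60% plus 40% in Cref, and 150% minus 50% in MPI-ESM.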

Fig. 12

Climate change signal of monthly mean maximum vertical (potential) density gradients (RCP8.5 2071–2100 minus Historical 1971–2000) for MPI-ESM, the coupled downscaling Cref and uncoupled downscalings given in Table 1. Values are averaged over a box region covering the central North Sea (see Fig. 2)

Contributions due to a future increase in surface wind stress by 30% in spring and summer are similar for all experiments, including MPI-ESM, imposing a weakening on the pycnocline intensification. Both mean present-day (3.6 × 10−5 N m−2) and future (4.7 × 10−5 N m−2) wind stresses vary among the experiments by less than 2%. In Cref, the counteracting effects of a pycnocline strengthening but stronger winds lead to a minor pycnocline shallowing by 0.2 m. In the uncoupled downscalings, the weaker strengthening either balances the wind effect for EF and EFroc or leads to a slight deepening by 0.3 m for RF and RFroc.

Along the continental coast, the stratification intensifies in MPI-ESM by up to 20 g m−4 due to the absence of tidal mixing. The downscaled change signals do not show this artifact owing to the incorporation of tides in the regional MPIOM setup. In experiment CFtid, the stratification increase in the southern North Sea reaches even 30 g m−4. Nevertheless, as mentioned in Sect. 3.1, the SST change signals are still well reproduced because of their close dependency on the atmospheric forcing.

3.4 General circulation

As mentioned in Sect. 2.4, linear trends of depth-averaged current velocities as well as volume transports through the main North Sea inflow and outflow sections have been calculated for the analysis of projected changes in the general circulation. In Fig. 13, the annual mean circulation for the year 1985, as being the middle of the period 1971–2000, is constructed from the regression analysis for MPI-ESM and Cref. Volume transports are given in Table 3.
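A linear regression of this kind yields both a change signal per century and a representative value for a chosen year such as 1985. The sketch below uses an ordinary least-squares fit on a synthetic time series; the fitting period, the function names and the data are assumptions for illustration, since the regression setup itself is described in Sect. 2.4.

```python
import numpy as np

def fit_linear_trend(years, values):
    """Least-squares line through an annual time series.

    Returns (slope per year, mean year, fitted value at the mean year).
    Years are centred before fitting to keep the fit well conditioned.
    """
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    year0 = years.mean()
    slope, value0 = np.polyfit(years - year0, values, deg=1)
    return float(slope), float(year0), float(value0)

def fitted_value(year, slope, year0, value0):
    """Value of the regression line in a given year (e.g. 1985)."""
    return value0 + slope * (year - year0)

def change_per_century(slope):
    """100-year change signal implied by the linear trend."""
    return 100.0 * slope

# Synthetic volume transport in Sv with a small linear increase
years = np.arange(1971, 2101)
transport = 1.0 + 0.002 * (years - 1971)
slope, year0, value0 = fit_linear_trend(years, transport)
print(fitted_value(1985, slope, year0, value0), change_per_century(slope))
```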

Fig. 13

Annual mean depth-averaged current velocities for MPI-ESM and the coupled downscaling Cref, representative for the period 1971–2000

Table 3 Annual mean volume transports (representative for 1971–2000) through various lateral sections shown in Fig. 2

The added value of dynamical downscalings is well demonstrated by the representation of the North Sea circulation system. The main Atlantic inflow branches via the Fair Isle Current and along the western side of the Norwegian Trench as well as the cyclonic recirculation in the Skagerrak follow realistic pathways in Cref but cannot be resolved adequately by the coarse resolution of MPI-ESM. The comparison with the uncoupled experiments (not shown) indicates that a higher grid resolution is essential also for downscaling projected circulation changes. Corresponding trends of volume transports are given in Table 4. Nevertheless, the uncoupled downscalings EF and RF tend to underestimate the changes in Atlantic inflow downscaled by Cref, except for the Norwegian Trench. Restoring temperature and salinity distributions towards MPI-ESM anomalies in experiments EFroc and RFroc generally intensifies the change signals due to the increase of the halosteric effect on the sea level, induced by the stronger salinity drop of the northeastern Atlantic water masses. Using climatological river runoff in experiment CFriv as well as including land grid cells in the forcing interpolation in EFland do not show significant impact on the downscaled circulation changes, compared to their benchmark simulations Cref and EF, respectively.

Table 4 100-year change signals (from linear regression analysis) of annual mean volume transports through various lateral sections shown in Fig. 2

3.5 Primary production

North Atlantic water masses have been identified as the main nutrient supplier for the NWES ecosystem, complemented by river loads and atmospheric input (e.g. Pätsch and Kühn 2008; Kühn et al. 2010; Thomas et al. 2010). Winter nutrient concentrations in the northeastern North Atlantic upper ocean are largely determined by (1) the ratio between nutrient-rich subarctic water and nutrient-depleted subtropical water, modulated by the size and strength of the SPG (Wade et al. 1997; Hátún et al. 2005; Johnson et al. 2013; Häkkinen et al. 2013), and (2) the maximum depth of vertical convection which upwells nutrient-rich subsurface water (Holliday 2003; Williams et al. 2011). In the global simulation MPI-ESM, inorganic nutrient concentrations in the SPG region and northeastern North Atlantic show a distinct decline under the RCP8.5 scenario. The strengthening of the vertical stratification explained in Sect. 3.1 further intensifies this general trend, yielding a reduction of biological primary production in the entire northern North Atlantic and NWES (Steinacher et al. 2010; Bopp et al. 2013). In the North Sea, the extent of the change in primary production depends on the total effect of increasing water temperature, increasing stratification, decreasing solar radiation due to higher cloud cover, decreasing nutrient import from the North Atlantic, increasing riverine nutrient loads, changes in the sediment composition and sedimentation/resuspension processes, and changing atmospheric nutrient input and gas exchange.

Time series of North Sea primary production as well as the spatial distribution of primary production trends are shown in Figs. 14 and 15, respectively. The downscaled changes in primary production mirror the negative trend projected by the global simulation MPI-ESM. The nutrient decline in the SPG and northeastern North Atlantic, however, is stronger in the regionalized simulations Cref, EF and RF, leading to stronger decreases in both nutrient import from the North Atlantic into the North Sea and North Sea primary production. Moreover, changes in river nutrient loads are not taken into account in MPI-ESM but follow historical records in the downscaling experiments. The exceptionally high nutrient input during the 1970s, 1980s and early 1990s is reflected in the change signal by a stronger primary production decline in the southern North Sea. Its relative impact is shown in Fig. 16, indicating that the stronger decline in the southern North Sea is dominated by the influence of high historical river nutrient loads. In MPI-ESM, the effects of reduced nutrient import via the English Channel and higher water temperature on the primary production are even largely balanced in the southern North Sea. Along the Shelf Edge Current, by contrast, the strengthening of the stratification and hence the nutrient decline in the upper ocean are maximum, leading to a primary production decrease in the northwestern North Sea. In the downscaling experiments, the temperature effect in the English Channel locally overcompensates the nutrient effect and induces positive primary production trends. These temperature-driven regions are also recognized in Fig. 16, showing reduced impact of river nutrient loads. As mentioned in Sect. 2.2, prescribed future nutrient concentrations in rivers do not undergo any long-term trend. The projected increase in river runoff, however, induces a proportional increase in North Sea riverine nutrient loads by about 15%.

Fig. 14

Time series of annual accumulated depth-integrated primary production in the North Sea (10-year running mean) for MPI-ESM, the coupled downscaling Cref and uncoupled downscalings given in Table 1. Grey lines refer to the preindustrial control simulations by MPI-ESM (dashed) and its coupled downscaling (solid)

Fig. 15

Climate change signal of annual accumulated depth-integrated primary production (RCP8.5 2071–2100 minus Historical 1971–2000) for MPI-ESM, the coupled downscaling Cref and uncoupled downscalings given in Table 1. Dotted areas mark changes less than 95% significance. Experiment CFriv is shown in Fig. 17

Fig. 16

Relative impact of historical river nutrient loads on the climate change signal of depth-integrated primary production in experiment Cref, represented by RCP8.5 2071–2100 minus Historical 1971–2000. Values are inferred from a simulation with Cref atmospheric forcing and river runoff but constant river nutrient concentrations over the whole period 1920–2100

In EF, the decrease in North Sea primary production is generally stronger because of a weaker warming trend and a stronger nutrient decline in the northeastern North Atlantic, while the downscaled trend in RF is quite similar to the coupled reference simulation Cref. When water temperature, salinity and nutrient concentrations are restored in EFroc and RFroc towards MPI-ESM anomalies, the less intense SPG nutrient decline in MPI-ESM weakens the downscaled negative trends in North Sea primary production and enhances the positive trends, compared to EF and RF, respectively.

Simulated changes in primary production are reflected by corresponding changes in the efficiency of the shelf carbon pump (Tsunogai et al. 1999; Thomas et al. 2005). Trends in annual atmospheric CO2 uptake averaged over the North Sea region (Fig. 2) are given in Table 5. The projected weakening of the shelf carbon pump amounts to 13% in MPI-ESM, whereas for the different downscaling strategies it ranges from 22 to 27%. The central and northern North Sea are characterized by net CO2 uptake, while in the southern North Sea strong CO2 outgassing in summer and autumn leads to an annual net CO2 release (Kühn et al. 2010). In MPI-ESM, the insignificant changes in southern North Sea primary production indicate that the binding of inorganic carbon during plankton growth does not change either. The outgassing, however, increases because of lower CO2 solubility at higher temperatures, which further weakens the net atmospheric CO2 uptake of the North Sea compared to the downscaling experiments. In EFroc and RFroc, it is the weaker negative trends and stronger positive trends in southern North Sea primary production and outgassing compared to EF and RF that weaken the shelf carbon pump.

In all downscaling experiments, sediment concentrations of organic matter mirror the changes in primary production, showing a decrease in the deposition areas of the central and northern North Sea and Norwegian Trench. In experiment CFtid, the unrealistic absence of tidal currents enables sedimentation in the southern North Sea, leading to locally lower nutrient concentrations in the water column. Both mean present-day primary production in the southern North Sea and the future change signal are weaker in CFtid than in Cref by about 30%.

Effects of climatological river runoff and nutrient loads in CFriv are comparatively small, which can be attributed to the fact that the projected changes in North Sea nutrient loads (defined as the product of volume transport and nutrient concentration) are two orders of magnitude larger for inflowing North Atlantic water masses than for river discharge. In Fig. 17, experiment CFriv is compared with CF to highlight these subordinate differences between simulations with transient and climatological river runoff. The negative values along the continental coast and in particular in the Baltic Sea indicate a stronger reduction of the primary production in CFriv than in CF by about 0.3 mol C m−2. The slight increase in riverine nutrient loads in CF due to the projected increase in river runoff locally weakens the dominating nutrient decline of the inflowing Atlantic water masses. In turn, the higher riverine fresh water supply to the northern North Atlantic strengthens the vertical stratification and hence reduces nutrient concentrations in the upper ocean. Consequently, the primary production drop in CF is stronger in the central and northern North Sea than in CFriv. Changes shown in Fig. 17 are significant in the North Sea and Atlantic insofar as they are larger than the maximum spread among the three Cref realizations (Table 5).
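The load definition given above (volume transport times nutrient concentration) implies that, with river nutrient concentrations held constant, the riverine load scales in proportion to runoff. A small worked sketch with hypothetical numbers follows; the function names, units and values are assumptions for illustration and are not taken from the simulations.

```python
def nutrient_load(volume_transport_m3_s, concentration_mmol_m3):
    """Nutrient load in mmol s-1 as the product of volume transport and concentration."""
    return volume_transport_m3_s * concentration_mmol_m3

def relative_change(present, future):
    """Relative change of a quantity between two periods."""
    return (future - present) / present

# Hypothetical river: runoff increases by 15% while the concentration stays fixed
runoff_present, runoff_future = 1000.0, 1150.0   # m3 s-1
concentration = 50.0                             # mmol m-3, constant in time

load_present = nutrient_load(runoff_present, concentration)
load_future = nutrient_load(runoff_future, concentration)
print(relative_change(load_present, load_future))  # equals the relative runoff change
```

The same product applied to the Atlantic inflow sections involves far larger volume transports, which is consistent with the statement that load changes there dominate over those of river discharge.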

Fig. 17

Difference of the projected change in annual accumulated depth-integrated primary production (RCP8.5 2071–2100 minus Historical 1971–2000) between experiments with climatological and transient river runoff and nutrient loads

Table 5 Projected changes in annual net atmospheric CO2 uptake (RCP8.5 2071–2100 minus Historical 1971–2000), averaged over the North Sea region shown in Fig. 2

4 Discussion

4.1 Sea surface temperature

The coupled downscaling Cref gains added value compared to MPI-ESM from its higher grid resolution both in the ocean and atmosphere. The resulting better representation of the ocean circulation, such as the pathway of the NAC, is a known feature of increased resolution and reduced dissipation (e.g. Bryan et al. 2007; Zhang et al. 2011; Talandier et al. 2014). In MPI-ESM, the zonal bias of the NAC induces anomalous heat flux to the atmosphere in the northeastern North Atlantic, causing a positive SST bias in the North Sea of up to 1.4 °C (compared to mean 1971–2000 observational data used for ERA40 reanalysis), which is reduced to about 0.3 °C in Cref. However, although Cref indeed generates a more realistic present-day climate, we focus here on differences in the projected change signals rather than on a comprehensive model evaluation. As explained in Sect. 3.1, complex future changes in the NAC and SPG are projected differently by MPI-ESM and Cref, leading to an intensified North Sea SST change signal in Cref. The comparison of three Cref realizations branching from different initial conditions rules out that these mechanisms emerge from internal model variability.

It has been demonstrated in Fig. 4 that uncoupled downscalings using forcing fields from the parent global simulation are not able to develop independent, model-specific climate change signals, even though in our study two atmosphere models are applied. The thermal forcing largely preserves the imprint of the global model change signal and constrains the downscaled ocean response. In a subtle test simulation, we have run an uncoupled downscaling by using atmospheric thermal forcing from Cref but surface wind stress from MPI-ESM, driving an inconsistent circulation comparable to experiment EF. However, the SST change signal of Cref is still well reproduced, with an RMSD in the North Sea of just 0.03 °C. These results affirm the conclusion by Schrum et al. (2016) and Tinker et al. (2016) that the projected changes from ensemble studies performed with one uncoupled regional ocean model but different global forcings (from different global models or perturbed physics, respectively) crucially depend on the global forcing. Moreover, uncoupled downscalings based on the same global forcing tend to agree better, while coupled downscalings allow for the development of more independent results and hence show large inter-model deviations. The uncertainty due to the downscaling strategy derived from our experiments turns out similar to the uncertainties due to the parent global model (e.g. Wakelin et al. 2012) as well as the downscaling regional model (e.g. Bülow et al. 2014).

For consistency reasons, our uncoupled experiments have been driven by surface heat fluxes derived from downward radiative forcing fields. In stand-alone ocean models, however, the radiative forcing is often estimated from other parameters such as shortwave insolation and cloudiness. For such methods, the SST change signal is still tightly constrained by the atmospheric forcing but can nevertheless deviate more strongly from the global model result.

The mechanisms mediating the differences in the North Sea SST change signal between MPI-ESM and Cref crucially require the coupled downscaling both to provide a sufficiently high grid resolution in the northern North Atlantic and to include the northeastern North Atlantic in the coupling domain. Effects of smaller coupling domains were also identified by Bülow et al. (2014), albeit in their study different ocean models were compared.

The sensitivity of SST to the downscaling strategy was also found for hindcast and present-day simulations by Schrum et al. (2003), Gröger et al. (2015) and Wang et al. (2015) and for a future time slice experiment in the Baltic Sea by Kjellström et al. (2005). Experiment RF has shown that for the open ocean a downscaling of the atmospheric forcing does not yield independent results unless air–sea interactions are coupled interactively. In near-coastal areas, however, experiment EFland revealed that in the interpolation procedure of the meteorological forcing fields, grid cells containing land fractions need to be excluded from the source data to avoid spurious influences of land–atmosphere interaction. Atmospheric downscaling naturally mitigates this problem due to the higher spatial resolution of the land–ocean transition zone in regional atmosphere models. The experiments with three-dimensional restoring in the ocean EFroc and RFroc have further confirmed that the oceanic forcing at the open lateral boundaries of stand-alone regional ocean models is of secondary importance for downscaling SST change signals in the North Sea.

Simulated mean present-day SST indeed differs in each downscaling experiment, varying in the North Sea by about 0.5 °C. Yet, the downscaled change signals are characterized by systematic similarities: All uncoupled downscalings reproduce the signal of the forcing global model. This dependency allows us to conclude that the downscaling strategy in fact has a stronger impact on the projected change signal than the model skill in simulating accurate present-day conditions, even for the considered high emission scenario RCP8.5.

4.2 Sea surface salinity

The higher grid resolution in Cref also influences the projected salinity change in the northeastern North Atlantic, which leads to different freshening trends of the water masses entering the NWES in MPI-ESM and Cref. The change signals downscaled by the uncoupled experiments EF and RF already represent a reasonable approximation of Cref, owing to the advantages of the stretched grid configuration in the ocean. These results are in agreement with observational studies (e.g. New and Smythe-Wright 2001; Bersch 2002; Holliday 2003; Pollard et al. 2004; Ullgren and White 2010), providing evidence that variations in the northeastern North Atlantic upper ocean salinity are governed by changes in the regional circulation rather than local air–sea interaction. Thus, the usage of atmospheric forcing fields from either the parent global simulation or an atmosphere-only downscaling proves rather unproblematic for uncoupled downscalings. However, experiment EFland revealed once more that land–atmosphere interaction interpolated into the meteorological forcing fields leads to substantial biases along the North Sea coasts.

Experiments EFroc and RFroc have shown that downscalings by regional ocean models forced with salinity anomalies of a global simulation suffer from inconsistencies between the change signals of the global simulation and a fully coupled downscaling. For instance, at the northern open boundary off Norway, salinity distributions of the global model do not fit with inflow and outflow conditions of the high-resolution circulation. Such inconsistencies introduce large errors at the domain margins and hence influence the downscaled change signal. A larger domain size including the northeastern North Atlantic might reduce the influence of the global forcing data, as also indicated by Schrum et al. (2016) where different forcing global models yielded a similar spread in downscaled North Sea salinity changes as obtained in our study from different downscaling strategies. Efforts in enlarging the model domain westwards into the North Atlantic have been made e.g. by Wakelin et al. (2009) for present-day simulations and Holt et al. (2010) and Tinker et al. (2016) for future projections.

Furthermore, the incorporation of transient river runoff proves essential to account for the intensified hydrological cycle that comes along with a warming global climate. Projected salinity changes along the continental coast and in the Baltic Sea are distinctly underestimated when climatological river runoff is applied.

As concluded for SST, simulated mean present-day SSS in the North Sea varies by about 0.2 g kg−1 between Cref and its fair approximations EF and RF, ascribing more importance to the downscaling strategy than to the model skill in simulating accurate present-day conditions.

4.3 Vertical stratification

Dependencies on the downscaling strategy are also identified for changes in the intensity of the seasonal North Sea stratification. Although the uncoupled experiments benefit from the higher grid resolution in the early thermocline period during spring, the change of the pycnocline strength in summer is limited by the influence of the atmospheric thermal forcing inherited from the parent global model. The warming trend is weaker in MPI-ESM than in Cref, therefore weakening the downscaled changes in the vertical density gradient due to the nonlinear relation between water density and temperature. The large uncertainty in the downscaled stratification changes by Holt et al. (2010), Gröger et al. (2013) and Mathis and Pohlmann (2014) potentially results from the diversity of applied downscaling strategies.

In our study, simulated present-day conditions of the North Sea stratification vary by the same order of magnitude as the downscaled change signals. However, experiments yielding minimum and maximum present-day stratifications, EFroc and RF respectively, do not yield extreme change signals but are systematically biased toward the global model result. Hence, also for ocean stratification, the change signal is more sensitive to the downscaling strategy than to the mean state of the North Sea system.

4.4 General circulation

Projected trends in the North Sea circulation represent the combined response of changes in the surface wind stress and the barotropic and baroclinic pressure gradients due to changes in the wind conditions, local steric sea level, as well as temperature and salinity distributions. In general, such features are resolved in greater detail by simulations using the regional ocean grid configuration than by the global simulation MPI-ESM. A similar spatial pattern of North Sea circulation changes is identified by Mathis and Pohlmann (2014) and Tinker et al. (2016), who presented uncoupled downscalings of the IPCC AR4 scenario A1B. Nevertheless, none of the uncoupled experiments considered in our study succeeds in reproducing Cref changes quantitatively. The largest errors occur in the simulations with prescribed open boundary conditions, while deviations are lowest when an unbounded global ocean is used. Consistent with these results, weak influences of open boundary conditions were found by Wakelin et al. (2009) for a model domain having the Atlantic margins situated in the deep ocean off the shelf break.

4.5 Primary production

The discussed changes in water temperature, volume transports, and nutrient concentrations sensitively affect the North Sea primary production. Influences of the global forcing are recognized in the uncoupled experiments, in particular when oceanic boundary conditions are involved. In selected areas, such as the Southern Bight or Norwegian Trench, the downscaling strategy can even be decisive for the sign of the downscaled trends. Moreover, simulated present-day primary production in the North Sea varies over all downscaling experiments by about 2.2 mol C m−2, which is even greater than the magnitudes of the projected change signals.

In Wakelin et al. (2012), magnitudes and signs of primary production trends from an uncoupled regional ocean model with domain size limited to the North Sea-Baltic Sea showed a strong dependency on the global forcing data, similar to our findings. Furthermore, in Holt et al. (2016) and Schrum et al. (2016), cross-shelf exchange of nutrients has been proposed as the most likely candidate to explain discrepancies in projected North Sea primary production downscaled by uncoupled regional ocean models of different domain size. Our experiments confirm this suggestion by showing maximum discrepancies between simulations utilizing the global ocean domain and their North Sea-limited counterparts with prescribed oceanic boundary conditions. In the latter simulations, exchange processes between the North Atlantic and NWES are determined by the performance of the parent global model and handed down to the regional model at its domain margins. Hence, as concluded for salinity, the spatial coverage of the regional model domain is of key importance for downscaling changes in North Sea primary production.

5 Conclusions

From the presented model experiments we conclude that for the NWES, the incorporation of coupled air–sea interaction in dynamical downscalings of global climate projections is essential for the regional model system to develop independent results. Comprehensive understanding of both the processes mediating between climatic variations in the atmosphere and regional responses in the ocean and the related feedback mechanisms can only be attained by investigating coupled ocean–atmosphere simulations. Nevertheless, future projections of selected parameters (e.g. salinity) can also be regionalized adequately by uncoupled ocean models. When regional ocean models are used, however, the prescribed oceanic boundary conditions can introduce large errors to the downscaled change signals, even in coupled simulations.

Projected changes in North Sea water temperature are found to be governed mainly by atmospheric conditions formed in the northeastern North Atlantic. We therefore recommend that the coupling domain of the downscaling model system include the northeastern North Atlantic region in order to allow for local ocean–atmosphere responses related to changes in the NAC pathway and transport rates. Moreover, the grid resolution in the North Atlantic ocean should be sufficiently high; otherwise, it is important that a realistic representation of the NAC is already captured by the parent global simulation.

If a regional coupled ocean–atmosphere model system cannot be applied as the downscaling strategy, grid cells containing land fractions should be excluded from the interpolation of the meteorological forcing fields to avoid distorting influences of land–atmosphere interaction.
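A minimal sketch of such a land-excluding interpolation is given below. It is not the procedure used in this study; the function name `interpolate_sea_only`, the inverse-distance weighting over the `k` nearest cells, the threshold `max_land_frac`, and the toy grid values are all assumptions chosen for illustration.

```python
import numpy as np


def interpolate_sea_only(src_lon, src_lat, src_val, land_frac,
                         tgt_lon, tgt_lat, max_land_frac=0.0, k=4):
    """Inverse-distance interpolation that ignores source cells with land.

    Hypothetical helper: only source cells whose land fraction does not
    exceed `max_land_frac` contribute to the interpolated values.
    """
    src_lon = np.asarray(src_lon, dtype=float).ravel()
    src_lat = np.asarray(src_lat, dtype=float).ravel()
    src_val = np.asarray(src_val, dtype=float).ravel()
    land_frac = np.asarray(land_frac, dtype=float).ravel()
    tgt_lon = np.atleast_1d(np.asarray(tgt_lon, dtype=float)).ravel()
    tgt_lat = np.atleast_1d(np.asarray(tgt_lat, dtype=float)).ravel()

    # Keep only source cells that pass the land-fraction threshold.
    sea = land_frac <= max_land_frac
    if not sea.any():
        raise ValueError("no source cells pass the land-fraction threshold")
    lon_s, lat_s, val_s = src_lon[sea], src_lat[sea], src_val[sea]
    k = min(k, val_s.size)

    out = np.empty(tgt_lon.size)
    for i, (x, y) in enumerate(zip(tgt_lon, tgt_lat)):
        # Crude planar distance in degrees, longitude scaled by cos(latitude).
        dx = (lon_s - x) * np.cos(np.deg2rad(y))
        dy = lat_s - y
        dist = np.hypot(dx, dy)
        nearest = np.argsort(dist)[:k]
        weights = 1.0 / (dist[nearest] + 1e-12)
        out[i] = np.sum(weights * val_s[nearest]) / np.sum(weights)
    return out


# Toy example with made-up values: the eastern column is mostly land and warmer.
lon, lat = np.meshgrid([2.0, 4.0, 6.0], [52.0, 54.0, 56.0])
air_temp = np.full(lon.shape, 10.0)
land = np.zeros(lon.shape)
air_temp[:, 2] = 30.0
land[:, 2] = 0.8

sea_only = interpolate_sea_only(lon, lat, air_temp, land, [5.5], [54.0])
with_land = interpolate_sea_only(lon, lat, air_temp, land, [5.5], [54.0],
                                 max_land_frac=1.0)
```

In the toy example, the target point lies close to the land cells. With the default threshold only ocean cells contribute, whereas admitting land cells (`max_land_frac=1.0`) pulls the interpolated value toward the warmer land signal. A production setup would more likely rely on conservative or bilinear remapping with a proper land-sea mask and great-circle distances.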

The sensitivity of projected temperature changes to the downscaling strategy also affects the seasonal stratification in the North Sea. In the uncoupled downscalings, the change signals of the pycnocline depth and intensity are influenced by the atmospheric thermal forcing taken from the global model.

All downscalings of the North Sea general circulation considered here clearly benefit from the higher grid resolution of the regional model systems and the resulting more accurate representation of the bottom topography. The patterns of projected circulation changes in the uncoupled downscalings are consistent with the coupled reference downscaling and show the same level of detail. Nevertheless, quantitative changes in transport rates still depend on the downscaling strategy.

The approximation of riverine freshwater supply using climatological runoff has a strong impact on the downscaled salinity trends but yields reasonable results for changes in water temperature and ocean circulation.

Projected changes in North Sea primary production are largely governed by changes in the physical conditions of the North Atlantic climate system. The dependency of changes in water temperature, salinity, and the ocean circulation on the downscaling strategy is thus transferred to changes in primary production and the effectiveness of the shelf carbon pump. Locally, the downscaling strategy can even be decisive for the sign of the change signal.

The downscaling strategies compared here cover a wide range of complexity, decreasing gradually from a fully coupled ocean–atmosphere climate system model to a stand-alone ocean model forced with boundary conditions extracted from other general and regional circulation models of different grid resolutions. Remarkably, the resulting change signals in the North Sea prove to depend more strongly on the individual downscaling strategy than on the fidelity of simulated present-day conditions. This holds most clearly for SST: even an uncoupled experiment with a drastic misrepresentation of shelf sea dynamics due to the exclusion of tidal waves does not yield significant deviations in the SST change signal, underlining its close dependency on the atmospheric forcing. SST change signals of uncoupled downscalings rely heavily on the atmospheric conditions prescribed at the air–sea interface. In terms of processes, the projected temperature rise in the NWES can be traced bottom-up to changes in the North Atlantic heat budget in the coupled experiment, whereas in the uncoupled experiments it is imposed top-down by changes in the atmosphere.

Our results generally affirm common practice in experiment design: the choice of the most appropriate downscaling strategy has to be based on the variable of greatest interest, with due consideration of available model systems and computational costs. Uncoupled regionalizations of SST change signals are primarily dependent on the meteorological forcing, whereas changes in salinity, transport rates and primary production show strong dependencies on the open boundary conditions and hence on the domain size. Best results are obtained when the regional high-resolution domain includes not only the NWES but also the northeastern North Atlantic. In our experiments, the stretched grid configuration of the formally global ocean component elegantly provides a higher grid resolution in the entire northern North Atlantic. For conventional NWES model systems with open lateral boundaries, it is therefore recommended to enlarge the limited ocean domain into the North Atlantic.

Inter-model comparisons of climate projections regionalized for the NWES show large uncertainties in the downscaled change signals. Our experiments with a consistent regionally coupled climate system model indicate that the uncertainties due to different downscaling strategies are of the same order of magnitude as the uncertainties due to different forcing global models and downscaling regional models.