Interactive comment on “Evaluation of the carbon cycle components in the Norwegian Earth System Model (NorESM)” by J. F. Tjiputra

In this manuscript, Tjiputra et al. focus on evaluating the land and ocean carbon components of the newly developed Norwegian Earth System Model (NorESM). This model has been used to run several simulations (including future scenarios) contributing to the Coupled Model Intercomparison Project phase 5 (CMIP5), and is (or will be) used in many inter-model comparison studies. It is thus timely to propose such an evaluation.


Introduction
In addition to the atmospheric radiative properties, global climate dynamics also depend on the complex simultaneous interactions between the atmosphere, ocean, and land. These interactions are not only non-linear, but also introduce feedbacks of different magnitudes and signs to the climate system. In order to understand the sophisticated interplay between the different components, Earth system models have been developed by the geoscientific community in recent years. The last Intergovernmental Panel on Climate Change Assessment Report (IPCC-AR4) stated that such models are required in order to produce a reliable future climate projection (Denman et al., 2007).
An Earth system model typically consists of a global physical climate model coupled with land and ocean biogeochemical models (Bretherton, 1985), but can be extended to include further processes and reservoirs (e.g. anthropogenic interactions). As an integrated global model system, such a model not only simulates the change in physical climate variability due to anthropogenic drivers, but also includes climate feedbacks associated with the global carbon cycle. These feedback processes include changes in terrestrial and oceanic carbon uptake due to anthropogenic CO₂ emissions, perturbed surface temperature, precipitation, ocean circulation, sea ice extent, biological productivity, etc. A new Norwegian Earth System Model (NorESM) was recently developed. The NorESM is among the many models used worldwide to project future climate change and is used for the upcoming IPCC-AR5. The ocean carbon cycle model in NorESM is unique compared to most other models due to its coupling with an isopycnic ocean general circulation model. One of the advantages of such a coupling is a more accurate representation of the transport and mixing of biogeochemical tracers along the isopycnals in the interior ocean. The isopycnic model also avoids the physically inappropriate splitting of transport and diffusion processes into horizontal and vertical components, as is done in the more common z-coordinate models (Bleck, 1998; Haidvogel and Beckmann, 1999). Through the vertically adaptive grid, areas of high horizontal and vertical density gradients can be simulated well by the model. An earlier study by Assmann et al. (2010) also shows that higher spatial gradients in tracer distributions can be achieved. On the other hand, depending on the number of density surfaces, an isopycnic coordinate model may or may not represent the buoyancy-driven circulation well.
In areas of low density gradients, the model cannot simulate velocity shear. Surface processes are also more difficult to simulate in outcropping layers, a difficulty that is avoided through the introduction of a non-isopycnic surface mixed layer.
When compared to the previous generation Bergen Earth system model (BCM-C, Tjiputra et al., 2010a), the ocean model resolution and several physical parameterizations have been considerably improved, as discussed in this study. In addition, the NorESM adopts new atmospheric, land, and sea ice models, which are based on the Community Climate System Model (CCSM4, Gent et al., 2011) or Community Earth System Model (CESM1, Lindsay et al., 2013). We note that the terrestrial component (Community Land Model version 4, CLM4) is developed and maintained by the National Center for Atmospheric Research in collaboration with the university community and Department of Energy Labs in the United States. Here, only limited analysis of the terrestrial carbon cycle model will be presented since no significant differences in the CLM4 carbon cycle are found when simulated as part of the NorESM or CESM1 frameworks (e.g., Arora et al., 2013; Jones et al., 2013). For more in-depth reviews and evaluation of the CLM4, the readers are referred to recent publications by CLM4 developers (e.g., Lawrence et al., 2011; Gent et al., 2011).
In this manuscript, we focus on reviewing the basic performance of the ocean and land carbon cycle components of the NorESM. In order to assess the quality of model projections, it is necessary to evaluate the respective model simulations against the available present-day climate and biogeochemical states. The biogeochemical states simulated by an Earth system model strongly depend on the quality of the physical fields in the model. Therefore, we will first analyze statistically how well the model simulates the climatological states of sea surface temperature and salinity. Next, the model-simulated mean states of ocean biogeochemical tracers, such as oxygen and phosphate, and of air-sea CO₂ gas exchange are compared with the observations from the World Ocean Atlas (WOA) and other observationally based estimates. Finally, we compare the land carbon pools, vegetation productivity, and respiration simulated by NorESM with the observationally based values (e.g. from the FLUXNET network of eddy covariance towers, among others). Note that the physical parameters affecting the terrestrial biogeochemistry, such as mean surface sensible and latent heat flux from land, surface air temperature over land, spatial distribution of cloud fraction, and mean precipitation are separately discussed in an accompanying manuscript by Bentsen et al. (2012).
The model description is presented in Sect. 2. Section 3 describes the model experiment setup. The results of the experiment are discussed in Sect. 4. Finally, conclusions are summarized in Sect. 5.

Model description
The Norwegian Earth System Model (NorESM) is partly based on the recently released Community Climate System Model (CCSM4, Gent et al., 2011), which is maintained by the National Center for Atmospheric Research and is developed in partnership with collaborators funded primarily by the US National Science Foundation and the Department of Energy. The NorESM adopts the original coupler (CPL7), as well as terrestrial (CLM4, Lawrence et al., 2011), and sea ice (CICE4, Holland et al., 2012) components from CCSM4. The chemistry package in the atmospheric model (CAM4, Neale et al., 2013) is improved following Seland et al. (2008). In this section, we briefly describe the atmospheric, ocean, and land components of the NorESM. Since the physical components are documented in more detail by Bentsen et al. (2012) and Iversen et al. (2012), here, major emphasis is placed on the carbon cycle components. The atmospheric component in NorESM (CAM4-Oslo) is a modified version of the NCAR Community Atmospheric Model. The reader is referred to a manuscript by Neale et al. (2013) for the original CAM4 model description. Here, the main difference from the CAM4 model is the improvement in the aerosol and aerosol-cloud interactions as discussed in Seland et al. (2008) and Kirkevåg et al. (2008). For example, CAM4-Oslo includes tropospheric oxidants (e.g. OH, O₃, and H₂O₂) and a replenishment time which increases with the cloud volume fraction. The ratio of organic matter to organic carbon aerosols related to the biomass burning primary organic matter emissions has been updated following Formenti et al. (2003). The prescribed AeroCom (Aerosol Comparisons project) sea salt emissions are replaced by prognostic (wind and temperature dependent) emissions (Struthers et al., 2011). The relative humidity threshold for formation of low clouds is reduced to 90 %, and the critical droplet volume radius for onset of auto-conversion is increased to 14 µm.
The new aerosol module in CAM4-Oslo reduces the model bias with respect to near-surface mass concentration and aerosol optical depth. More detailed description and improvements of CAM4-Oslo model are available in Kirkevåg et al. (2013).

Ocean general circulation model
The ocean physical component of NorESM originates from the Miami Isopycnic Coordinate Ocean Model (MICOM; Bleck and Smith, 1990; Bleck et al., 1992) but with modified numerics and physics as described in Bentsen et al. (2012). The main benefits of an isopycnal model are accurate mixing and transport along isopycnic surfaces and good control of the diapycnal mixing, which facilitates preservation of water masses during long model integrations. The vertical coordinate is potential density with reference pressure at 2000 dbar, which provides reasonable neutrality of model layers in large regions of the ocean (McDougall and Jackett, 2005). The incremental remapping algorithm (Dukowicz and Baumgardner, 2000) is used for transport of layer thickness and tracers. The robust, accurate and efficient handling of numerous biogeochemical tracers was an important reason for selecting this transport algorithm. The analysis of biogeochemical tracers of the Hamburg Oceanic Carbon Cycle (HAMOCC) model in an earlier version of this ocean model (Assmann et al., 2010) contributed to revealing issues in the representation of the Southern Ocean. Several of the later developments of the dynamical core and physical parameterizations were targeted at resolving some of these deficiencies.
Originally, MICOM provided two options of turbulent kinetic energy (TKE) balance equations for the parameterization of mixed layer depth, based on Kraus and Turner (1967) and Gaspar (1988). Both formulations overestimated the mixed layer depth at high latitudes. We achieved reduced mixed layer depth biases by using a TKE model based on Oberhuber (1993) extended with a parameterization of mixed layer restratification by eddies (Fox-Kemper et al., 2008). To improve the representation of water masses in the weakly stratified high-latitude halocline, the static stability of the uppermost layers is measured by in-situ density jumps across layer interfaces, thus allowing layers that are unstable with respect to potential density referenced at 2000 dbar to exist. To maintain the warm layer beneath the Southern Ocean halocline, we found it important to increase the thickness and isopycnal eddy diffusivity associated with the Antarctic Circumpolar Current. This was achieved by parameterizing the diffusivity according to the diagnostic version of the eddy closure of Eden and Greatbatch (2008) as implemented by Eden et al. (2009). Further, to reduce sea surface salinity and stratification biases at high latitudes, salt released during freezing of sea ice can be distributed below the mixed layer.
The model is configured on a grid with 1.125° horizontal resolution along the equator, with grid singularities over Antarctica and Greenland. The model uses potential density as the vertical coordinate. It adopts a total of 53 isopycnic layers with potential densities ranging from 28.202 to 37.800 kg m⁻³. The topmost two layers are located in the surface mixed layer.

Ocean carbon cycle model
The NorESM employs the Hamburg Oceanic Carbon Cycle (HAMOCC5) model, which is based on the original work of Maier-Reimer (1993) and subsequent refinements (Maier-Reimer et al., 2005). It was recently coupled with the isopycnic MICOM model by Assmann et al. (2010). The HAMOCC5 model is embedded into the MICOM as a module, and hence has the same spatial resolution. Different from the earlier version (Tjiputra et al., 2010a), the mixed layer is divided into two layers, the uppermost of approximately 10 m depth, followed by a second layer representing the remainder of the mixed layer. The single 10 m layer improves the simulation of the surface ocean response to the atmospheric forcing (e.g. with respect to air-sea heat flux) and consequently, the process representations such as those of sea ice formation.
The current version of the HAMOCC5 model includes a revised inorganic seawater carbon chemistry following the Ocean Carbon-cycle Model Intercomparison Project (OCMIP) protocols. The oceanic partial pressure of CO₂ (pCO₂) in the model is prognostically computed as a function of temperature, salinity, dissolved inorganic carbon (DIC), total alkalinity (TALK), and pressure. The model also includes a 12-layer sediment model (Heinze et al., 1999), which is primarily relevant for long-term simulations (> 1000 yr). Nevertheless, the sediment model was activated in all submitted CMIP5 simulations. The model does not include any weathering fluxes.
HAMOCC5 employs an NPZD-type ecosystem model, extended to include dissolved organic carbon (DOC). The ecosystem model was initially implemented by Six and Maier-Reimer (1996). The nutrient compartment is represented by three macronutrients (phosphate, nitrate, and silicate), and one micronutrient (dissolved iron). The phytoplankton growth rate is formulated as a function of temperature and light availability according to Smith (1936) and Eppley (1972). The available light is formulated based on the prognostic incoming solar radiation from the atmospheric model reaching the ocean surface. Light penetration decreases with depth according to an exponential function with a gradual extinction factor formulated as a function of water depth and chlorophyll concentration (a constant phytoplankton-to-chlorophyll ratio is used) (Maier-Reimer et al., 2005). In addition, phytoplankton growth is co-limited by the availability of phosphate, nitrate, and dissolved iron. Climatological monthly aerial iron deposition based on Mahowald et al. (2005) is applied in all model simulations. A fraction of the iron deposition (3.5 %) is assumed to be immediately dissolved, where part of it is immediately available for biological production. In nitrate-limited oligotrophic regions, the model assumes nitrogen fixation by cyanobacteria, which is parameterized as the relaxation of the nitrate concentration in the surface layer to the available phosphate concentration, through the Redfield ratio. Nitrogen fixation only occurs in the uppermost surface layer, with a fixation rate of 0.5 % day⁻¹ of the difference between the phosphate and nitrate (Maier-Reimer et al., 2005). The constant Redfield ratio adopted in the model is P : C : N : O₂ = 1 : 122 : 16 : −172. Phytoplankton loss is modelled by specific mortality and exudation rates as well as zooplankton consumption.
The dissolved organic carbon (DOC) produced by phytoplankton and zooplankton (through constant exudation and excretion rates) is freely advected by the ocean circulation and is remineralized at a constant rate (whenever the required oxygen is available). The parameterizations of the growth, grazing, and remineralization rates in the ecosystem module adopt a constant Redfield ratio to regulate the flow of carbon, oxygen, and nutrients between the different compartments.
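The nitrogen fixation parameterization described above can be illustrated with a minimal sketch. It assumes a simple explicit time step and that the "difference between the phosphate and nitrate" is taken after scaling phosphate by the Redfield N : P ratio of 16; the function name and the time-stepping are hypothetical and not taken from the HAMOCC5 code.

```python
# Illustrative sketch only; not the HAMOCC5 implementation.
R_N_TO_P = 16.0            # Redfield N:P ratio quoted in the text
FIX_RATE_PER_DAY = 0.005   # 0.5 % per day, as quoted in the text


def nitrogen_fixation_step(nitrate, phosphate, dt_days):
    """Relax surface nitrate toward Redfield-scaled phosphate.

    nitrate, phosphate: surface-layer concentrations in the same molar units.
    dt_days: time step in days (assumed explicit Euler step).
    Returns the updated nitrate concentration.
    """
    deficit = R_N_TO_P * phosphate - nitrate
    if deficit <= 0.0:
        # Nitrate is not limiting relative to phosphate: no fixation.
        return nitrate
    return nitrate + FIX_RATE_PER_DAY * dt_days * deficit
```

With no nitrate and one unit of phosphate, a one-day step adds 0.5 % of the 16-unit deficit, i.e. 0.08 units of nitrate.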
The particles produced within the euphotic zone (i.e. the upper 100 m) are freely advected by the ocean circulation and exported with a prescribed vertical sinking speed. Particulate organic carbon (POC) associated with dead phytoplankton and zooplankton is transported vertically at 5 m day⁻¹. As POC sinks, it is remineralized at a constant rate of 0.02 day⁻¹, subject to oxygen availability. Particulate inorganic matter, i.e. calcium carbonate (PIC) and opal shells (biogenic silica), sinks at 30 and 60 m day⁻¹, respectively. In addition, the particulate tracers in HAMOCC are also advected by the ocean circulation. The distribution of calcium carbonate and biogenic silica export is formulated as a function of rain ratio and silicic acid concentration (Heinze, 2004). In general, when the silicic acid concentration is high, the export of biogenic silica increases and the export of calcium carbonate decreases. Once exported out of the euphotic layer, biogenic silica is decomposed at depth with a constant redissolution rate. The calcium carbonate shells only dissolve when the simulated carbonate ion concentration is less than the saturation concentration (i.e. [CO₃²⁻]sat), with a dissolvable maximum of 5 % of calcium carbonate per time step. The non-remineralized particulate materials reaching the sea floor sediment undergo chemical reactions with the sediment pore waters, bioturbation and vertical advection within the sediment. Note that the current version of the model does not take into account influx of carbon and nutrients from the continental rivers, though lateral inflows from rivers can be activated.
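The sinking and remineralization rates quoted above imply a characteristic attenuation depth for the POC flux. The sketch below assumes a steady state with no lateral advection and no oxygen limitation, in which case the flux below the euphotic zone decays exponentially with an e-folding length of sinking speed divided by remineralization rate (5 m day⁻¹ / 0.02 day⁻¹ = 250 m). The function name is hypothetical.

```python
import math

# Values quoted in the text.
POC_SINK_SPEED = 5.0     # m per day
POC_REMIN_RATE = 0.02    # per day
EUPHOTIC_DEPTH = 100.0   # m


def poc_flux_fraction(depth_m):
    """Fraction of the POC export flux at 100 m that survives to depth_m.

    Assumes steady state, constant sinking speed and remineralization rate,
    and neglects advection and oxygen limitation (simplifying assumptions).
    """
    if depth_m <= EUPHOTIC_DEPTH:
        return 1.0
    length_scale = POC_SINK_SPEED / POC_REMIN_RATE  # 250 m
    return math.exp(-(depth_m - EUPHOTIC_DEPTH) / length_scale)
```

Under these assumptions roughly 37 % of the exported POC remains one e-folding length (250 m) below the base of the euphotic zone.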
The exchange of oxygen and CO₂ between the atmosphere and the surface ocean is simulated according to the formulation of Wanninkhof (1992). In principle, the air-sea gas exchange is determined by three components: the gas solubility in seawater, the gas transfer rate, and the gradient of the gas partial pressure between the atmosphere and the ocean surface. The solubilities of O₂ and CO₂ gases in seawater are derived as functions of surface ocean temperature and salinity following Weiss (1970, 1974). The gas transfer rate is computed as a function of the Schmidt number and is proportional to the square of the surface wind speed. The model assumes that gas exchange occurs in ice-free regions only. The main biogeochemical processes simulated in the HAMOCC5 model are summarized in Table 1.
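The three ingredients of the air-sea exchange listed above can be combined in a short sketch. It uses the quadratic wind-speed relationship of Wanninkhof (1992), k = a u² (Sc/660)^(-1/2) with k in cm h⁻¹; the coefficient a = 0.31 (the value given in that paper for short-term winds) is an assumption here, since the text does not state which coefficient the model uses. Scaling the flux by the open-water fraction is likewise an assumed way of restricting exchange to ice-free water. Solubility and Schmidt number are taken as inputs instead of being computed.

```python
CM_PER_HR_TO_M_PER_DAY = 0.24  # 24 h per day / 100 cm per m


def gas_transfer_velocity(wind_speed, schmidt_number, coeff=0.31):
    """Gas transfer velocity in cm per hour (quadratic wind dependence).

    wind_speed: 10 m wind speed in m/s.
    coeff: assumed coefficient; not specified in the text.
    """
    return coeff * wind_speed ** 2 * (schmidt_number / 660.0) ** -0.5


def air_sea_co2_flux(wind_speed, schmidt_number, solubility,
                     pco2_atm, pco2_ocean, ice_fraction=0.0):
    """CO2 flux in mol m-2 day-1, positive into the ocean.

    solubility: mol m-3 uatm-1; pco2_*: uatm; ice_fraction: 0 to 1.
    """
    k_m_per_day = (gas_transfer_velocity(wind_speed, schmidt_number)
                   * CM_PER_HR_TO_M_PER_DAY)
    open_water = 1.0 - ice_fraction  # assumed treatment of sea ice
    return open_water * k_m_per_day * solubility * (pco2_atm - pco2_ocean)
```

The sign convention (positive for ocean uptake) is a choice made for this sketch only.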

Land model
The NorESM adopts the Community Land Model version 4 (CLM4), which is the latest offspring of the CLM family (Lawrence et al., 2012a). An extensive description of the model, including a summary of all simulated land carbon and nitrogen compartments, can be found at the website http://www.cesm.ucar.edu/models/ccsm4.0/clm/, as well as in the literature (Lawrence et al., 2011). Only a brief overview of the model functionalities will be given in this manuscript.
The CLM4 integrates ecosystem cycling on the continental surface of water, energy, chemical elements, and trace gases. Spatial land surface heterogeneity is represented in a sub-grid cell hierarchy of multiple land units, columns, and plant functional types (PFTs). The land unit captures large-scale patterns of the landscape in the form of glaciers, lakes, wetlands, cities, and vegetated areas. The column level is used to represent the potential variability in the soil and snow state variables within a single land unit. The exchanges between the land surface and the atmosphere are defined at the PFT level. The vegetation state variables as well as the treatment for bare ground are computed at the PFT level. Sub-grid entities (land unit, column, and PFT) are independent from each other and maintain their own prognostic variables. All sub-grid units within a grid cell experience the same atmospheric forcing. In each grid cell, sub-grid outputs are averaged and weighted by their fractional areas before they are transferred to the atmospheric model. A uniform soil type is maintained throughout one grid cell. Thermal and hydrologic properties of the soil depend on its texture (Clapp and Hornberger, 1978) and on its organic matter content (Lawrence et al., 2008). The soil profile is represented down to 50 m. The 10 upper layers are hydrologically active (0-3.8 m) while the five bedrock layers (below 3.8 m) act as a thermal reservoir.
Table 1 (excerpt): Heinze et al. (1999); prognostic variables: DIC, TALK, phosphate, nitrate, silicate, dissolved iron, oxygen, dinitrogen, phytoplankton (diatoms and calcifiers), PIC (calcium carbonate and opal shells), POC, DOC, zooplankton, net primary production, export production (organic and inorganic), dimethyl sulfide (DMS), and air-sea CO₂, N₂, O₂, and N₂O fluxes.
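The area-weighted aggregation of sub-grid outputs to the grid-cell value passed to the atmosphere, as described for CLM4 above, amounts to a weighted mean. The following is a minimal sketch with hypothetical names; the normalization by the summed fractions is an assumption that guards against fractions not adding up exactly to one.

```python
def grid_cell_mean(values, area_fractions):
    """Area-weighted mean of sub-grid values (e.g. a flux per PFT).

    values: one value per sub-grid unit.
    area_fractions: fractional area of each unit within the grid cell.
    """
    if len(values) != len(area_fractions):
        raise ValueError("values and area_fractions must have equal length")
    total = sum(area_fractions)
    if total <= 0.0:
        raise ValueError("area fractions must sum to a positive number")
    weighted = sum(v * a for v, a in zip(values, area_fractions))
    return weighted / total
```

For example, two PFTs with fluxes of 100 and 50 covering 25 % and 75 % of a cell give a grid-cell flux of 62.5.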
Biogeophysical processes simulated by CLM4 include solar and longwave radiation interactions with vegetation canopy and soil, momentum and turbulent fluxes from canopy and soil, heat transfer in soil and snow, hydrology of canopy, soil, and snow, and stomatal physiology as well as photosynthesis. The hydrology scheme in CLM4 includes the representation of water fluxes and reservoirs in snow layers, canopies, soils (including soil ice) and in an unconfined aquifer, as well as in glaciers, lakes, and rivers. The hydrology scheme uses the Richards equation following Zeng and Decker (2009). The soil water equations are solved for the top 10 layers of the profile. For each soil layer, the scheme simulates water transport taking into account infiltration, surface and sub-surface runoff, gradient diffusion, gravity, canopy transpiration through root extraction, and interactions with groundwater. An unconfined aquifer is added to the bottom of the soil column. Surface runoff in the model consists of overland flow due to saturation excess and infiltration excess. The saturated fraction of the soil column is a function of the water content, the fraction of surface layers being frozen (Niu and Yang, 2006), and the topography. The snow is represented by up to five snow layers. The snow parameterizations are primarily based on Dai and Zeng (1997). Snow evolution includes three types of processes: metamorphism, load compaction, and melting. The snow model in CLM4 includes new parameterizations for aerosol black carbon and dust deposition, grain-size dependent snow aging, vertically resolved snowpack heating (Flanner et al., 2007), snow cover fraction (Niu and Yang, 2006), and burial of short vegetation fraction (Wang and Zeng, 2009).
The carbon-nitrogen (CN) cycle model represents the biogeochemistry of carbon and nitrogen in vegetation, litter and soil organic matter (Thornton et al., 2007). The assimilated carbon is estimated from photosynthesis. The amount of nitrogen available for plants is the sum of the nitrogen uptake in the soil and the re-translocation of nitrogen from senescing tissues. The nitrogen limitation acts on the gross primary production (GPP). A potential GPP is calculated from the leaf photosynthetic rate without nitrogen constraint. The model diagnoses the nitrogen required to achieve this potential GPP and, accordingly, the actual GPP is decreased under nitrogen limitation. Inputs and losses of mineral nitrogen are taken into account in the form of atmospheric nitrogen deposition, biological nitrogen fixation, denitrification, leaching, and losses in fire. A prognostic phenology scheme controls transfers of stored carbon and nitrogen out of storage pools for new tissue growth and losses of plant tissues to litter pools. Leaf and stem area indices for each plant functional type are derived from satellite data following the Lawrence et al. (2011) methodology. The spatial distribution of PFTs is updated on an annual time step. Transient land cover and land use change data sets used in CLM4 (Lawrence et al., 2012b) are derived from a global historical transient land use and land cover change data set (LUHa.v1) covering the period 1850-2005 (Hurtt et al., 2006).
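The nitrogen down-regulation of GPP described above can be pictured with a simple sketch. It assumes that the actual GPP scales with the ratio of available to required nitrogen, capped at one; this proportional form and all names are illustrative assumptions, not the CLM4 code.

```python
def actual_gpp(potential_gpp, nitrogen_demand, nitrogen_available):
    """Reduce potential GPP when available nitrogen falls short of demand.

    potential_gpp: GPP computed without nitrogen constraint.
    nitrogen_demand: nitrogen needed to sustain the potential GPP.
    nitrogen_available: soil uptake plus re-translocated nitrogen.
    All quantities are per unit area and time, in consistent units.
    """
    if nitrogen_demand <= 0.0:
        # No nitrogen is required, so there is nothing to limit.
        return potential_gpp
    fraction = min(1.0, max(0.0, nitrogen_available / nitrogen_demand))
    return potential_gpp * fraction
```

When supply covers only half of the demand, the sketch halves the GPP; when supply meets or exceeds demand, the potential GPP is returned unchanged.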
The land model simulates both autotrophic and heterotrophic respiration. The autotrophic respiration is simulated as the sum of maintenance and growth respiration processes. In living biomass, maintenance respiration is formulated as a function of temperature and tissue N concentration (Thornton and Rosenbloom, 2005). Growth respiration is calculated as a constant factor of the carbon allocated to growth of new tissues. For computation of heterotrophic respiration, CLM-CN uses a converging cascade representation of soil organic matter dynamics (Thornton et al., 2002; Thornton and Rosenbloom, 2005). The model simulates three litter pools (labile, cellulose, and lignin) and a coarse woody debris pool together with four soil organic matter pools (fast, medium, slow, and very slow decomposition rates). The three litter pools differ in base decomposition rate, with turnover times ranging from 20 h to 71 days. The four soil organic matter pools differ in base decomposition rate (turnover time is 14 days to 27 yr) and C:N ratio (10-12). There is no distinction between surface and below-ground pools. The soil organic matter dynamics are conditioned by the soil-nitrogen cycle. In the case of nitrogen mineralization, the soil organic matter base decomposition rates are computed as functions of soil temperature (Lloyd and Taylor, 1994) and soil water potential (Orchard and Cook, 1983; Andrén and Paustian, 1987). In the case of nitrogen immobilization, the decomposition is limited by the nitrogen availability and by the plant demand for mineral nitrogen.
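How a turnover time and a temperature dependence combine in such a pool model can be sketched as follows. The temperature scalar uses the commonly quoted Lloyd and Taylor (1994) form referenced to 10 °C; the constants and the first-order decay are assumptions for illustration and are not necessarily those used in CLM4. The soil water scalar is omitted.

```python
import math


def lloyd_taylor_scalar(soil_temp_k):
    """Temperature rate scalar, equal to 1 at 283.15 K (10 degC).

    Valid only for soil temperatures above 227.13 K.
    """
    return math.exp(308.56 * (1.0 / 56.02 - 1.0 / (soil_temp_k - 227.13)))


def decay_pool(carbon, turnover_days, dt_days, rate_scalar=1.0):
    """First-order decay of a litter or soil organic matter pool.

    turnover_days: base turnover time (e.g. 14 days for the fastest
    soil organic matter pool quoted in the text).
    rate_scalar: environmental modifier of the base rate.
    Returns the carbon remaining after dt_days.
    """
    rate = rate_scalar / turnover_days
    return carbon * math.exp(-rate * dt_days)
```

With a scalar of one, a pool loses about 63 % of its carbon over one turnover time; warmer soil raises the scalar and shortens the effective turnover.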

Experiment design
Prior to any experiments, the NorESM model as a coupled system is spun up for 900 yr. During this spin-up we fixed the atmospheric CO₂ concentration at 284.7 ppm. For the spin-up, the oceanic tracer fields were initialized as follows: the initial fields of oxygen and nutrients are derived from the World Ocean Atlas (WOA) (Garcia et al., 2010a,b). The dissolved inorganic carbon (DIC) and alkalinity fields are taken from the Global Data Analysis Project (GLODAP) data set. We use the 1° × 1° gridded annual data of both data sets. Since the initialization is followed by a 900 yr spin-up, no special care was taken to conserve mass of the WOA and GLODAP fields. Rather, for each model grid cell, the closest data point is sought and a 10° × 10° average around this point is assigned to the respective model grid cell. If no data are available at the location of a model grid cell (e.g. GLODAP provides no data in the Arctic Ocean), a mean regional or a mean global profile is used there. The other biogeochemical variables in the water column (e.g. phytoplankton, zooplankton, dissolved organic carbon, etc.) and sediment compartments are initialized to zero or small but nonzero values. The spin-up is important, particularly for the oceanic carbon cycle tracers to reach distributions which are reasonably close to equilibrium states. Note that due to the limited time and computational power, it is not feasible to spin up the sediment compartment to reach a steady state. In the future, we plan to spin up the model sediment with an acceleration technique. Nevertheless, for the current CMIP5 experiment setup (i.e. integration times of a few hundred years), the sediment-water column interaction contributes little to the ocean tracer inventories (a reason for which most modeling groups do not consider the sediment at all).
For the terrestrial spin-up, the CLM4 component uses the land cover change data set (LUHa.v1, Hurtt et al., 2006) of the first simulation year, 1850, as initial condition. Other details for the CLM4 spin-up configuration are described in Thornton and Rosenbloom (2005). After approximately 500 model years, the simulated mean global surface air temperature reached an equilibrium mean state of approximately 13.6 °C.
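The initialization of the ocean tracer fields described above (closest data point, then a 10° × 10° average around it, with a fallback where no data exist) can be sketched for a single depth level. The distance measure, the `max_dist_deg` cut-off used to decide that "no data are available", and all names are hypothetical choices made for this sketch.

```python
def _lon_diff(a, b):
    """Smallest absolute difference between two longitudes in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def init_tracer(model_lat, model_lon, data, fallback,
                box_deg=10.0, max_dist_deg=5.0):
    """Initial tracer value for one model grid cell.

    data: list of (lat, lon, value) tuples from a gridded climatology,
          with value set to None where the product has no data.
    fallback: regional or global mean used when no data are nearby.
    """
    valid = [(lat, lon, val) for lat, lon, val in data if val is not None]
    if not valid:
        return fallback

    def dist2(point):
        # Simple squared lat-lon distance (an assumption of this sketch).
        return ((point[0] - model_lat) ** 2
                + _lon_diff(point[1], model_lon) ** 2)

    nearest = min(valid, key=dist2)
    if dist2(nearest) > max_dist_deg ** 2:
        return fallback
    half = box_deg / 2.0
    box = [val for lat, lon, val in valid
           if abs(lat - nearest[0]) <= half
           and _lon_diff(lon, nearest[1]) <= half]
    return sum(box) / len(box)
```

Because the box always contains the nearest point itself, the average is never taken over an empty set.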
Following the spin-up, we performed two branch simulations, a control (CTRL) and a historical (HIST). For the CTRL simulation, we essentially extended the spin-up for another 250 yr (1850-2100). Here the non-evolving, preindustrial atmospheric aerosols and CO₂ concentration following the CMIP5 protocols are prescribed in the simulation. In addition, there is no anthropogenic land-use change applied in the CTRL. For the HIST simulation, the model is run for 156 yr, representing the historical period from year 1850 to 2005. In the HIST simulation, observed changes in climate parameters are prescribed. These parameters include evolving atmospheric CO₂ concentration, anthropogenic aerosols and natural aerosols related to historical volcanic eruptions, as well as time-varying solar forcing. In addition, changes in land use due to human activity are included in the HIST simulation. Note that both the CTRL and HIST simulations are performed with prescribed atmospheric CO₂ concentrations, and not with prescribed CO₂ emissions. The above conditions are applied according to the CMIP5 experimental design, documented by Taylor et al. (2012).

Transient global temperature
The transient response of the global surface temperature simulated in the HIST period agrees reasonably well with observations. At the end of the HIST simulation (i.e. year 2006), the global mean 2-m temperature has increased by approximately 0.9 °C, whilst the SST has increased by 0.6 °C relative to the year 1850. Figure 1 shows the evolution of global mean surface temperature anomaly (relative to  period) simulated by the NorESM together with observation-based estimates from the Hadley Climate Research Unit (HadCRUT3, Brohan et al., 2006). The amplitude of the simulated multi-decadal variability throughout the historical period is in line with the observations. Following the 1991 Mount Pinatubo eruption, the model simulates stronger cooling followed by stronger warming toward the end of the simulation.
Comparison between the observed and simulated global mean surface temperature trend over the historical period and the contributions of different elements to the simulated temporal variability are discussed further in the accompanying manuscripts of Bentsen et al. (2012) and Iversen et al. (2012). Bentsen et al. (2012) show that the mean temporal trends of three historical NorESM ensemble members follow the observed trend closely. For example, both the observations and model simulations yield a 0.14 °C decade⁻¹ warming trend for the 1961-2010 period. In their study, Iversen et al. (2012) show that the simulated increase in warming trend since the 1970s is predominantly attributed to the combination of opposing radiative forcing of greenhouse gases and aerosols.
Fig. 1 (caption fragment): Plotted together is the observational estimate from the HadCRUT3 product (black line) with the respective uncertainty range in grey shades (Brohan et al., 2006).

Ocean physical and biogeochemical properties
Realistic simulation of the ocean biogeochemistry depends strongly on the background physical processes (Doney et al., 2004). Thus, in addition to different biogeochemical fields, we also assess the model's ability to simulate relevant physical fields, such as the temperature, salinity, and mixed layer depth. For observation-based climatological estimates, such as temperature, salinity, oxygen, or phosphate, we compare against the HIST simulation averaged over the 1980-1999 period. For other observations such as DIC, ALK, and air-sea CO₂ fluxes, which are available in larger amounts only in more recent times, we compare them with the model output averaged over 1996-2005 from the HIST simulation. Figure 2 shows the statistical summary of the simulated temperature and salinity as well as key biogeochemical tracer distributions when compared to the observations, in the form of a Taylor diagram (Taylor, 2001).
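The three statistics that a Taylor diagram (Taylor, 2001) summarizes can be computed as in the sketch below: the correlation coefficient, the model standard deviation normalized by the observed one, and the centred root-mean-square difference (here also normalized by the observed standard deviation). The sketch is unweighted, i.e. it applies no area weighting, and the function name and return layout are hypothetical.

```python
import math


def taylor_statistics(model, obs):
    """Unweighted Taylor-diagram statistics for paired samples.

    model, obs: equal-length sequences of co-located values.
    Returns a dict with the correlation, the normalized standard
    deviation, and the normalized centred RMS difference.
    """
    n = len(obs)
    if n == 0 or len(model) != n:
        raise ValueError("model and obs must be non-empty and equal length")
    m_mean = sum(model) / n
    o_mean = sum(obs) / n
    m_anom = [m - m_mean for m in model]
    o_anom = [o - o_mean for o in obs]
    m_std = math.sqrt(sum(a * a for a in m_anom) / n)
    o_std = math.sqrt(sum(a * a for a in o_anom) / n)
    if m_std == 0.0 or o_std == 0.0:
        raise ValueError("standard deviations must be non-zero")
    cov = sum(a * b for a, b in zip(m_anom, o_anom)) / n
    crmsd = math.sqrt(sum((a - b) ** 2
                          for a, b in zip(m_anom, o_anom)) / n)
    return {
        "correlation": cov / (m_std * o_std),
        "normalized_std": m_std / o_std,
        "normalized_crmsd": crmsd / o_std,
    }
```

The three quantities are linked by the law of cosines, E² = σ² + 1 − 2σR in normalized form, which is what allows them to be displayed as a single point on the diagram.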

Physical fields
Compared to the WOA estimates, the model realistically simulates the mean annual sea surface temperature, in terms of amplitude and spatial distribution, as shown in Fig. 3. The Taylor diagram (Fig. 2) shows a good model-data fit for surface temperature, with a correlation coefficient close to one. At depth, the vertical temperature structure in the Pacific is comparable with the observations. However, in the Atlantic section, the deep water temperature is noticeably warmer than the observations. The bias in the horizontal temperature distribution also increases from the surface to deeper layers, as shown in Fig. 2. The relatively high Atlantic deep water temperature is partly attributed to the anomalously strong Atlantic meridional overturning circulation (AMOC) in our present simulation, as also discussed further in Bentsen et al. (2012). Here, the NorESM yields a relatively strong mean AMOC strength of 32 Sv compared to the observed estimates of 15.75 ± 1.6 Sv (Ganachaud and Wunsch, 2000; Lumpkin and Speer, 2003). The NorESM simulates a steady AMOC strength over the historical and control simulation periods, as shown in the Supplement.
Fig. 2 (caption): Taylor diagram of the non-area-weighted statistical summary between the simulated and observed annually averaged (climatological) ocean temperature, salinity, phosphate, dissolved oxygen, silicate, dissolved inorganic carbon, and alkalinity. Shown here are comparisons at the surface (magenta), 1000 m (blue), and 3000 m (green) depths. Observations are based on the World Ocean Atlas (WOA) and GLODAP (see also text). The black circle represents the observations. All standard deviations are normalized to the respective observed standard deviation. For temperature, salinity, phosphate, silicate and oxygen, we compare the HIST simulation from the 1980-1999 period, whereas for DIC and ALK, we use the 1996-2005 simulation period.
The spatial distribution of the salinity field in NorESM broadly agrees with observations, with some noticeable differences, as shown in Fig. 4. At the surface, the model generally simulates lower salinity throughout most of the Southern Hemisphere subtropical gyres. In the Arctic, the model overestimates the surface salinity considerably, by as much as 3 psu. The model-data difference in the Atlantic meridional section indicates that the model's deep and bottom water masses are generally too saline. In the North Atlantic, this is consistent with the strong AMOC in the model, as salinity change dominates the sea water density increase at low temperatures (occurring in high latitude regions with vertical convection due to hydrostatic instability). Around 30° N, the model simulates anomalously high deep water salinities, which is attributed to a combination of too much outflow of saline water from the Mediterranean Sea and relatively strong near-surface mixing. This shortcoming is difficult to resolve with the current model horizontal resolution of approximately 1°, since the width of the Gibraltar Strait is roughly 30 km. The structure of Antarctic Intermediate Water (AAIW) and Sub-Antarctic Mode Water (SAMW) from the Southern Ocean is realistically simulated by the model, though the salinity in this feature is slightly too low. Note that additional analyses of the temperature and salinity fields compared to the observations are also available in Bentsen et al. (2012).
Accurate representation of the spatial and temporal variability of the mixed layer depth (MLD) is essential for many ocean biogeochemical processes. For example, winter mixing entrains DIC- and nutrient-rich deep water into the surface layer, which plays an important role in air-sea CO2 fluxes and spring bloom biological production. Maps of mean mixed layer depth for the boreal winter (DJF) and summer (JJA) periods are shown in Fig. 5 together with observation-based estimates using a 0.2 °C temperature criterion (de Boyer Montégut et al., 2004). Regions with strong mixing simulated by the model generally correspond well with those observed. While the model still overestimates the mean MLD for the winter season in both hemispheres, it is substantially improved compared to the previous generation model (Tjiputra et al., 2010a, see also Supplement). The improvement was achieved through the implementation of a new turbulent kinetic energy balance equation following Oberhuber (1993) and an updated parameterization of mixed layer restratification by eddies following Fox-Kemper et al. (2008) (see also Sect. 2.2.1). In the Southern Ocean, the improvement in mixed layer depth translates into an improved simulated seasonal variability of sea-air CO2 gas exchange and biological production (see below).
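A temperature-threshold MLD diagnostic of the kind used for the observation-based estimates can be sketched as below. The 0.2 °C threshold is taken from the text; the 10 m reference depth, the linear interpolation between levels, and the function name are assumptions made for illustration and may differ from the procedure actually applied to the model output.

```python
import numpy as np

def mixed_layer_depth(depth, temperature, delta_t=0.2, ref_depth=10.0):
    """Depth (m) where temperature first differs from its value at ref_depth
    by delta_t (deg C). Returns the deepest level if the threshold is never met.

    depth must be increasing and positive downward.
    """
    depth = np.asarray(depth, dtype=float)
    temperature = np.asarray(temperature, dtype=float)
    t_ref = np.interp(ref_depth, depth, temperature)
    # Search only below the reference depth, starting from the reference point.
    below = depth > ref_depth
    z = np.concatenate(([ref_depth], depth[below]))
    dt = np.abs(np.concatenate(([t_ref], temperature[below])) - t_ref)
    exceed = np.nonzero(dt >= delta_t)[0]
    if exceed.size == 0:
        return float(z[-1])
    k = exceed[0]  # k >= 1 because dt[0] is zero by construction
    # Linear interpolation between the two levels bracketing the threshold.
    frac = (delta_t - dt[k - 1]) / (dt[k] - dt[k - 1])
    return float(z[k - 1] + frac * (z[k] - z[k - 1]))

# Idealized profile: well mixed to 30 m, then cooling with depth.
profile_depth = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
profile_temp = [15.0, 15.0, 15.0, 15.0, 14.6, 14.0]
mld = mixed_layer_depth(profile_depth, profile_temp)
```

For this idealized profile the threshold is crossed halfway between the 30 m and 40 m levels.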

Biogeochemical tracers
In an ocean biogeochemical general circulation model, the dissolved inorganic nutrients are useful for assessing how well the model simulates marine productivity, respiration, and remineralization of organic matter, as well as the large-scale ocean circulation. The large-scale spatial variation of mean surface phosphate concentration simulated by the NorESM is strongly correlated with the WOA estimate (Garcia et al., 2010b), as shown in Figs. 2 and 6. Regions of strong mixing and upwelling (e.g. North Atlantic, North Pacific, and Southern Ocean) yield higher phosphate concentrations than the mid-latitude regions. At high latitudes, relatively high nutrient concentrations are associated with strong upwelling during wintertime mixing; because of the low light conditions, these nutrients cannot be depleted until late spring or summer. In the equatorial regions (Pacific and Atlantic), the upwelled nutrients are steadily consumed by biological production, which is not limited by light or temperature at any time of the year. At mid-latitudes, optimum growth conditions (i.e. year-long sufficient light and temperature) likewise contribute to steadily low surface nutrient concentrations. In the North Atlantic, the simulated surface phosphate is slightly higher than observed, but is much improved compared to the nearly depleted surface phosphate simulated in the previous model generation (Assmann et al., 2010; Tjiputra et al., 2010a). In the Southern Ocean, a similar improvement in surface phosphate can be seen. The updated model was modified considerably from the previous version, and this improvement in the surface concentration is likely due to the doubling of the phytoplankton nutrient uptake half-saturation constant from 0.1 to 0.2 µmol P L−1. A higher half-saturation constant reduces the nutrient uptake when the surface nutrient concentration is low, and hence increases the mean nutrient concentration near the surface.
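The effect of doubling the half-saturation constant can be illustrated with a short sketch. It assumes a Michaelis-Menten limitation term N / (K + N), which is the usual meaning of a half-saturation constant; the full growth formulation in the model (Sect. 2.2.2) includes further factors, and the phosphate values below are arbitrary illustrations rather than model output.

```python
def uptake_limitation(nutrient, half_saturation):
    """Michaelis-Menten limitation factor N / (K + N), between 0 and 1."""
    return nutrient / (half_saturation + nutrient)

K_OLD = 0.1  # µmol P L-1, previous model version (from the text)
K_NEW = 0.2  # µmol P L-1, doubled value (from the text)

for phosphate in (0.05, 0.2, 1.0):  # µmol P L-1, illustrative values only
    old = uptake_limitation(phosphate, K_OLD)
    new = uptake_limitation(phosphate, K_NEW)
    print(f"PO4={phosphate:4.2f}  old={old:.2f}  new={new:.2f}  new/old={new / old:.2f}")
```

The ratio new/old is smallest at the lowest concentration, so the change slows uptake most strongly where phosphate is already scarce, which is consistent with the higher near-surface concentrations described above.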
Note that similar sets of figures illustrating the performance of the previous model (Bergen Earth system model) compared to observations are provided and briefly discussed in the Supplement accompanying this manuscript.
Figure 6 shows that the phosphate concentrations in the Atlantic and Pacific bottom water masses are underestimated by the model. We believe this is largely attributable to the strong simulated overturning circulation, which results in a relatively young deep water mass with weak accumulation of remineralized nutrients in the deep Atlantic and Pacific Oceans. The parameterization of the biogeochemical processes, such as the particle sinking speed and the remineralization rates of dissolved and particulate organic matter, can also influence the nutrient and oxygen distributions at depth. For example, a high vertical sinking speed would translate to higher nutrient concentrations at depth, and a high remineralization rate would increase the nutrient and decrease the oxygen concentration of younger water masses. However, these controlling parameters were not modified considerably relative to the previous version. In addition, in the low-resolution version of the model (i.e. NorESM-L), where the simulated overturning circulation strength is much more reasonable at ∼18 Sv (Zhang et al., 2012), the phosphate concentration in the deep ocean is much more realistic. The NorESM-L also simulates an older ideal age tracer in the deep ocean than the medium-resolution version.
The distributions of the other macronutrients (nitrate, silicic acid) simulated by the NorESM compare with the corresponding field observations in much the same way as phosphate and, therefore, are not discussed here in further detail. Since there is no nutrient input from river runoff, the model simulates a small drift in the global budget of nutrients in the water column, mainly due to loss to the sediments. A river runoff parameterization has already been implemented, and will be switched on in a later version of NorESM.
In HAMOCC5, dissolved iron acts as a limiting micronutrient for marine biological production. The main source of iron at the surface is aerial dust deposition, with the dust transported from deserts over land (e.g. the Sahara). Since the model uses the same climatological iron (dust) deposition as the previous model version, the distribution of surface iron concentration is very similar to the one shown in Fig. 13 of Assmann et al. (2010) (see also Fig. 9). The maximum surface iron concentration is simulated in the Mediterranean Sea, with values slightly higher than 2 nmol Fe L−1. Several regions such as the North Atlantic, the northern part of the Indian Ocean, and parts of the Southern Ocean also have relatively high surface iron concentrations, ranging between 0.4 and 0.6 nmol Fe L−1. The Pacific Ocean is mostly depleted in iron. This feature is consistent with the limited observation-based estimates, as shown in Parekh et al. (2005).
Figure 7 shows the simulated surface and vertical structure of dissolved oxygen compared to the observations from the WOA (Garcia et al., 2010a). At the surface, the dissolved oxygen of the model agrees well with the observations, as indicated by the strong correlation and small model-data misfit in Fig. 2. The dissolved oxygen close to the surface is mostly determined by air-sea gas exchange and by oxygen release during phytoplankton growth. As oxygen has a higher solubility in colder water, maximum surface dissolved oxygen concentrations are simulated in the cold sub-polar and polar regions, whereas warm low latitude regions maintain lower oxygen concentrations. Below the surface layer and at depth, oxygen is utilized predominantly for remineralization of particulate and dissolved organic matter. Therefore, the oxygen structure of the model at depth is approximately the opposite of that for nutrients (e.g. phosphate).
Regions with the oldest water masses, such as the deep equatorial Pacific and Atlantic as well as the deep North Pacific, contain minimum oxygen concentrations. Regions of younger water masses along the North Atlantic Deep Water (NADW) and Antarctic Mode Water (AAMW) have relatively high oxygen concentrations. As mentioned above, since the model has a very strong overturning circulation, the deep water oxygen concentration in the model is expected to be somewhat overestimated with respect to measurements, which is the case in most of the bottom water masses (Figs. 7e and 7f). Over the 1850-2005 period of the HIST simulation, the oxygen inventory has a weak increasing trend of approximately 1 %, which can be seen in Supplement Fig. S7.
Figure 8 shows that the surface concentrations of DIC and ALK simulated by the model broadly agree with the observations in terms of spatial distribution. However, the absolute values are slightly higher (5-10 %) than their observational counterparts. Simulating the correct alkalinity distribution is known to be a problematic task in global carbon cycle models (e.g. Séférian et al., 2012). In this model, the alkalinity bias could potentially be attributed to one or a combination of the following factors. The first is bias in the GLODAP data used to initialize the model spin-up, which has two parts: a discretization problem (i.e. interpolation of sparse in situ measurements to a gridded global data product), and the fact that the GLODAP data include anthropogenic carbon, which, when combined with the prescribed preindustrial atmospheric CO2 in the model, could lead to more calcium carbonate dissolution, more carbonate ion, and hence higher alkalinity (note that it is yet to be explored whether the magnitude of this change is comparable with that seen in the current model). Bias in the salinity and the parameterization of calcium carbonate production in the model, among others, can also contribute to the alkalinity bias.
Although both DIC and ALK are higher than the observed values, the model still simulates a reasonable surface sea-air carbon exchange compared to the observations, as discussed below. This is because the carbon flux across the air-sea interface depends on, among other factors, the chemical buffering capacity of seawater with respect to gaseous CO2. The inverse of this buffer capacity is known as the Revelle Factor (Revelle and Suess, 1957). The seawater buffer capacity is linearly correlated with the carbonate ion concentration. Thus regions with high carbonate ion concentrations, such as the warm low latitudes, have a high buffer capacity (low Revelle Factor), while the low-carbonate, cold high latitude regions have a low buffer capacity. Overestimation of alkalinity alone (i.e. without overestimation of DIC) would give a higher carbonate ion concentration, and consequently increase the buffer capacity. Nevertheless, as DIC also contains carbonate ion, an approximation for the carbonate ion concentration can be written as [CO3^2−] ≈ ALK − DIC, following Sarmiento and Gruber (2006). Since both alkalinity and DIC in the NorESM are overestimated by a similar factor, the simulated buffer capacity is not altered considerably. Figure 8 also shows the ALK minus DIC values from both the model and the GLODAP data. Here, the model value compares fairly well with the observations in spatial variation as well as magnitude.
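The argument about compensating biases can be checked with a few lines of arithmetic. The sketch below uses the ALK minus DIC approximation for the carbonate ion concentration; the surface values of 2300 and 2000 µmol kg−1 are round illustrative numbers (an assumption, not model or GLODAP values), and the 5 % bias is the lower end of the range quoted in the text.

```python
def carbonate_ion_approx(alk, dic):
    """Approximate carbonate ion concentration as ALK - DIC (input units)."""
    return alk - dic

alk, dic = 2300.0, 2000.0  # µmol kg-1, illustrative surface values (assumed)
bias = 1.05                # 5 % overestimate

reference = carbonate_ion_approx(alk, dic)
alk_only = carbonate_ion_approx(alk * bias, dic)        # ALK biased, DIC correct
both = carbonate_ion_approx(alk * bias, dic * bias)     # both biased equally

for label, value in (("ALK only", alk_only), ("ALK and DIC", both)):
    change = 100.0 * (value - reference) / reference
    print(f"{label:12s} CO3 ~ {value:6.1f}  change = {change:+.0f} %")
```

With these numbers, a bias in alkalinity alone raises the approximate carbonate ion concentration by more than a third, whereas an equal relative bias in both quantities changes it by only the bias itself.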

Biological production
As described in Sect. 2.2.2, the model net primary productivity is limited by both the prognostic physical fields (temperature and light) and the nutrient fields. The nutrient usage for biological production is formulated as nutrient = min(PO4, NO3 · R_P:N, Fe · R_P:Fe), where the constant stoichiometric ratio P : N : Fe = 1 : 16 : 3.66 × 10−6 is applied. Figure 9 shows the distribution of annual mean surface phosphate, nitrate, and dissolved iron concentrations in a uniform unit (i.e., equivalent to µmol P L−1). Based on Fig. 9, the model simulates no region where phosphate is the limiting nutrient. In most of the Atlantic and Indian Oceans, nitrate is depleted, whereas dissolved iron is relatively abundant due to the close proximity to the main source of iron emissions (i.e., the Sahara). In most of the Pacific, except for the eastern equatorial Pacific, iron is the limiting nutrient. Finally, in the Southern Ocean, the surface nitrate concentration is slightly lower than that of iron. Note that the model also simulates nitrogen fixation (see also Sect. 2.2.2); thus, in regions where nitrate and iron are comparable, for example the Southern Ocean and equatorial Pacific, iron may ultimately be the limiting factor. In their study, Assmann et al. (2010) also showed that productivity in both the Northern and Southern Hemispheres is strongly limited by the light and temperature fields.
To evaluate how well the ecosystem dynamics in the surface layer are simulated by the model, we compare the simulated net primary production to remote sensing-based estimates from Behrenfeld and Falkowski (1997). Regions with large primary production are found in the coastal upwelling regions, the equatorial Pacific, and the high latitude oceans, as shown in Fig. 10. This is, in general, consistent with the surface nutrient distribution shown in Figs. 6 and 9.
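The minimum-based nutrient term given at the start of this subsection can be sketched as follows. The stoichiometric ratios are the ones quoted in the text; the function name and the input values are hypothetical, chosen only to exercise each branch, and are not representative of simulated concentrations. All three inputs must be supplied in the same molar unit.

```python
# Conversion factors to phosphorus equivalents, from P : N : Fe = 1 : 16 : 3.66e-6.
R_P_TO_N = 1.0 / 16.0       # mol P per mol N
R_P_TO_FE = 1.0 / 3.66e-6   # mol P per mol Fe

def limiting_nutrient(po4, no3, fe):
    """Return (nutrient available in P equivalents, name of the limiting nutrient)."""
    candidates = {
        "phosphate": po4,
        "nitrate": no3 * R_P_TO_N,
        "iron": fe * R_P_TO_FE,
    }
    name = min(candidates, key=candidates.get)
    return candidates[name], name

# Arbitrary inputs (same molar unit), one per limitation regime.
print(limiting_nutrient(0.5, 4.0, 1e-5))   # nitrate is the smallest term
print(limiting_nutrient(0.5, 16.0, 1e-6))  # iron is the smallest term
print(limiting_nutrient(0.1, 16.0, 1e-5))  # phosphate is the smallest term
```

Expressing all three nutrients in phosphorus equivalents is what makes a map such as Fig. 9 directly readable: the field with the lowest value at a location is the one that limits production there.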
In the high latitude Southern Ocean, the biological production in the model remains relatively low despite a high macronutrient supply (e.g. see Fig. 6). This region is well known as a High-Nutrient-Low-Chlorophyll (HNLC) region, associated with the limited dissolved iron concentration required for phytoplankton growth, as discussed above. The model-data deviation is largest in the eastern equatorial Pacific and parts of the Southern Ocean. In these regions, the model generally simulates higher NPP than observed. Carr et al. (2006) show that this shortcoming is common among many biogeochemical models, which may be associated with the peculiar characteristics of the HNLC regions, where globally tuned ecosystem parameterizations in models are likely to fail due to the lack of a full understanding of the steering processes.
To further analyze the relationship between net primary production and nutrients, we compute the mean phosphate concentration in different latitudinal bands and ocean basins, and plot it against the respective mean net primary production, as shown in Fig. 11. The analysis identifies three dominant productivity domains. The first is the low-nutrient, low-productivity region, which is confined to low latitudes. However, the equatorial Pacific is an exception, where the surface nutrient concentration is relatively low but the biological production is high. The second domain is the Northern Hemisphere at high latitudes (i.e. North Atlantic and North Pacific), characterized by high biological production with moderate nutrient concentration. These are also regions of strong export production, and hence a strong biological pump. The third domain is the Southern Ocean, with high surface nutrient concentration but relatively low biological production. As mentioned above, this is due to the limited aerial deposition of iron, which is an essential micronutrient for primary production.
Using the current model setup, the early period of the HIST simulation yields a global mean net primary production of 42.81 ± 0.86 Pg C yr−1. This value is well within the large range of estimates from both remote sensing and global biogeochemical models of 30 to 70 Pg C yr−1 (Carr et al., 2006). Even though there is a small negative drift in the nutrient budget associated with sediment burial, the simulated global net primary production remains stable over the 250 yr of the CTRL simulation. The NorESM simulates global particulate inorganic and organic carbon (PIC and POC) exports of 0.51 ± 0.01 and 8.41 ± 0.18 Pg C yr−1, respectively, for the 1850-1859 period. Thus the simulated PIC-to-POC ratio is approximately 0.06, well within the range of 0.06 ± 0.03 given by Sarmiento et al. (2002), but just outside the range of 0.07-0.10 given by Jin et al. (2006).
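The PIC-to-POC ratio and the range checks quoted above follow directly from the export numbers in the text; the short sketch below reproduces that arithmetic (the helper name is hypothetical).

```python
pic_export = 0.51   # Pg C yr-1, 1850-1859 mean from the text
poc_export = 8.41   # Pg C yr-1, 1850-1859 mean from the text
npp = 42.81         # Pg C yr-1, early HIST period from the text

def in_range(value, low, high):
    """True if value lies within the closed interval [low, high]."""
    return low <= value <= high

ratio = pic_export / poc_export
print(round(ratio, 3))                              # 0.061
print(in_range(ratio, 0.06 - 0.03, 0.06 + 0.03))    # True  (Sarmiento et al., 2002)
print(in_range(ratio, 0.07, 0.10))                  # False (Jin et al., 2006)
print(in_range(npp, 30.0, 70.0))                    # True  (Carr et al., 2006)
```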
In the earlier model configuration (Tjiputra et al., 2010a), the model simulates a roughly 20 % higher PIC export of 0.6 Pg C yr−1. The main reason for this discrepancy lies in the simulated surface silicate concentration, as shown in Fig. 12. The PIC export in the model is formulated as a function of silicate concentration such that high surface silicate yields low PIC export but high biogenic opal export. In contrast, low surface silicate translates into high PIC export but low opal export. In the earlier model, the simulated surface silicate concentration is considerably underestimated in the Southern Ocean, a bias that is reduced in the NorESM. Figure 12 shows that the NorESM simulates higher silicate concentrations in most regions of high biological productivity, such as the North Atlantic, North Pacific, equatorial Pacific, and vast areas of the Southern Ocean. Compared to the climatological estimates, the NorESM surface silicate concentration is better than that of the BCM-C in the Southern Ocean, but noticeably overestimated in the northern high latitudes.
Fig. 12. Comparison between mean surface silicate concentration (1980-1999) simulated by the NorESM (top) and the Bergen Earth system model (middle), and climatological estimates (bottom) from the WOA (Garcia et al., 2010b). Units are in µmol L−1.
The HIST simulation reveals that there are detectable changes in the globally integrated mean annual net primary production and organic carbon export (below 100 m) between the preindustrial (1850-1859) and contemporary (1996-2005) periods, as shown in Table 2. The NorESM simulates drops of 5 % for both NPP and organic carbon export. On the other hand, there are no detectable changes in the simulated annual calcium carbonate and silicate particulate exports.
A full annual time series of the simulated export production for 1850-2005 from both the CTRL and HIST simulations is available in the accompanying Supplement, Fig. S7.
Table 2. Globally integrated annual net primary production, carbon export production, calcium carbonate export, silicate export production, and sea-air CO2 flux (negative represents ocean uptake), computed for preindustrial and contemporary (centered at year 2000) periods.

Sea-air CO2 fluxes
Figure 13 shows the simulated (HIST) mean annual sea-air CO2 fluxes for the 1996-2005 period together with observation-based estimates by Takahashi et al. (2009) for a similar period. The model broadly agrees with the observations in terms of spatial variation, with the strongest carbon source to the atmosphere in the equatorial Pacific and the most intense carbon sink in the North Atlantic and Nordic Seas. In the equatorial Indian Ocean, the model outgassing is noticeably weaker than the data estimate. The model-data discrepancies are also pronounced in the polar Southern Ocean (south of 60° S), a region of increasing interest that remains poorly observed. Here, the model suggests a dominant carbon sink, whereas the data show a combination of weak source and sink regions. While observation-based studies (e.g. Le Quéré et al., 2007) indicate a weakening CO2 sink in the Southern Ocean, a model study by Tjiputra et al. (2010b) shows that, due to its efficient northward subduction of intermediate deep water, the Southern Ocean could continue as the dominant anthropogenic carbon sink in the future. The possibility of considerable Southern Ocean carbon uptake from the atmosphere has also been documented by anthropogenic carbon determinations (Vázquez-Rodríguez et al., 2009).
Compared to the preindustrial control (CTRL) simulation (not shown), the biggest difference occurs in the North Atlantic, where some regions of mean outgassing are completely replaced by carbon uptake. Under the preindustrial atmospheric CO2 boundary condition (i.e. 284.7 ppm), the NorESM also simulates more intense carbon outgassing in the equatorial Pacific upwelling as well as the Southern Ocean circumpolar upwelling zone. On longer time scales, a study with the Bergen Earth system model (Tjiputra et al., 2010b) reveals that, due to their water mass transport characteristics, the equatorial Pacific and the polar Southern Ocean could take up more CO2 under a business-as-usual future scenario. On the other hand, the CO2 uptake rate in the North Atlantic would stabilize toward the end of the 21st century, predominantly associated with the slowdown in the overturning circulation. In the mid-latitude regions, there are relatively small changes in the carbon fluxes.
In addition to the spatial distribution of air-sea CO2 flux shown in Fig. 13, it is also useful to analyze the simulated distribution of the anthropogenic carbon column inventory. However, an accurate representation of anthropogenic carbon concentration from the model requires another set of simulations (e.g. a simulation similar to HIST but with constant preindustrial atmospheric CO2 for the air-sea gas exchange). While we do not have such a simulation, we can still approximate the column inventory of anthropogenic carbon by computing the difference in the column inventory of dissolved inorganic carbon between the HIST and CTRL simulations for the same period. Here, we choose year 1994 to compare with the observation-based estimates for the same period (Sabine et al., 2004). Figure 14 shows that the maximum anthropogenic carbon inventory in the ocean is concentrated in the North Atlantic region. This feature is due to the surface branch of the large-scale global overturning circulation, which converges in the North Atlantic before being exported to depth. In addition, the mid-latitude Southern Ocean also stores a large portion of the global anthropogenic carbon, associated with the intermediate water formation that transfers recently absorbed carbon to greater depth for long-term storage. In general, the simulated spatial pattern is broadly consistent with the observation-based estimates. However, the model approximation is lower than observed in the equatorial regions, whilst in the North Atlantic the model estimate is higher. The strong AMOC could contribute to the higher anthropogenic carbon storage in the North Atlantic, as absorbed anthropogenic carbon in this region is transported faster to the deep ocean. It is less obvious, however, why the model underestimates the anthropogenic carbon in the equatorial oceans. A study by Matsumoto and Gruber (2005) has indicated that the C* method adopted in the Sabine et al. (2004) study has many limitations as well (e.g. they show that the method overestimates the anthropogenic carbon in the equatorial region). Over the 1850-1994 period, the model takes up a total of 106.7 Pg C, while the observations suggest a net uptake of 118 ± 2 Pg C over the 1800-1994 period.
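The approximation described above, the difference between the HIST and CTRL column inventories of DIC, can be sketched as below. The function names, array layout, and units are assumptions for illustration. Each run is integrated with its own layer thicknesses because, in an isopycnic model, the thickness of a given layer is not fixed and may differ between simulations.

```python
import numpy as np

def column_inventory(concentration, layer_thickness):
    """Vertical integral (mol C m-2) of a concentration (mol C m-3).

    Arrays have the vertical layer index as their first axis; empty or
    missing layers may be marked with NaN.
    """
    conc = np.asarray(concentration, dtype=float)
    dz = np.asarray(layer_thickness, dtype=float)
    return np.nansum(conc * dz, axis=0)

def anthropogenic_inventory(dic_hist, dz_hist, dic_ctrl, dz_ctrl):
    """Approximate anthropogenic carbon inventory as HIST minus CTRL."""
    return column_inventory(dic_hist, dz_hist) - column_inventory(dic_ctrl, dz_ctrl)

# Two-layer toy column with made-up values (not model output).
dz = np.array([100.0, 900.0])                      # m
hist = np.array([2.05, 2.20])                      # mol C m-3
ctrl = np.array([2.00, 2.19])                      # mol C m-3
delta = anthropogenic_inventory(hist, dz, ctrl, dz)  # mol C m-2
```

Because the difference also contains any divergence in natural carbon between the two runs, it remains an approximation, as noted in the text.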

Terrestrial biogeochemistry
Several studies have been dedicated to describing and evaluating the CLM4. For example, a full technical description of the physical and biogeochemical processes in CLM4 is available in . Lawrence et al. (2011) discuss the improved parameterizations introduced in CLM4 relative to the previous version, CLM3.5. Bonan and Levis (2010) discuss the influence of nitrogen biogeochemistry on the terrestrial carbon budget. Gent et al. (2011) give an overview of CLM4 performance within the latest Community Climate System Model (CCSM4) framework. The coupling of CLM4 to the NorESM does not, in general, introduce substantial changes in the overall characteristics of the land simulation. In this subsection, we discuss the basic features of the CLM4 when coupled to the NorESM framework.

Vegetation and soil carbon pools
The mean vegetation and soil carbon budgets simulated by NorESM over the 1982-2005 historical period are 551.3 and 537.4 Pg C, respectively (see also Table 3). Figure 15 shows the distribution of total vegetation and soil carbon contents as simulated by the NorESM. The ecosystem carbon content follows the precipitation and temperature distributions (see also Figs. 16 and 17). Note that a more detailed model-data evaluation of the surface temperature and precipitation simulated by NorESM is available in Bentsen et al. (2012). Large vegetation carbon mass can be seen in regions with both warm surface temperature and a high precipitation rate throughout the year, for example in the equatorial and eastern Asia regions. The simulated amount of carbon stored in vegetation biomass is within the range of observed values of 466-654 Pg C (WBGU, 1988; DeFries et al., 1999). However, the amount of carbon stored as organic matter in the soil is well below the global estimate by Jobbágy and Jackson (2000) of 1502 Pg C for the first meter of depth. Regionally, the NorESM simulates carbon stocks that are lower by a factor of 2 to 10 than the values proposed by Jobbágy and Jackson (2000). The mismatch is particularly substantial in the high latitudes, where NorESM simulates less than 2 kg C m−2 in tundra-covered regions compared to the observed values of 18 kg C m−2. The low soil carbon at high latitudes is likely attributable to the lack of representation of anoxic soil carbon decomposition and mixing properties. In addition, the litter decomposition is too fast and the soil organic carbon pools do not build up fast enough during the model's spin-up, and hence remain low over the simulation periods. Unrealistically low GPP across much of the Arctic also contributes to the bias in Arctic soil carbon stocks.
With regard to the litter carbon pool, Table 3 shows that the model also underestimates the observational estimates. Bonan et al. (2013) show that the litter decomposition rates are much too high compared to the observations. Consequently, too much carbon is returned to the atmosphere instead of being transferred to the soil. Further analysis of this issue is ongoing and is beyond the scope of this paper. Table 3 also shows that the coarse woody debris carbon pool is comparable in magnitude with the observations.
Table 3 (surviving entries): (Houghton, 2003); 1502 (Jobbágy and Jackson, 2000); fine litter 12.47 ± 0.29, 68 (Matthews, 1997).

Terrestrial primary production and respiration
Here, we also compare the gross primary productivity (GPP) and terrestrial ecosystem respiration (TER = autotrophic + heterotrophic respiration) simulated by NorESM with observationally derived values. While we can assess the capability of NorESM to fix and emit carbon on land, it is important to note that the fluxes due to changes in land use and management as well as fire are not taken into account in this analysis. [...] latitude south regions, the model overestimates the observed GPP by approximately 10 %, 10 % and 17 %, respectively, while at high latitudes, the model underestimates the observations by approximately 45 %. The regional differences between the simulated and observed TER resemble those for GPP, with model overestimation in all regions except the high latitudes, as shown in Table 4 and Fig. 19. Globally, the mean annual autotrophic and heterotrophic respiration simulated by NorESM are 83.2 and 23.4 Pg C yr−1, respectively. In total, the simulated TER is 106.6 Pg C yr−1, larger than the estimate by Jung et al. (2011) of 96.4 ± 6 Pg C yr−1. Nevertheless, the simulated net ecosystem exchange (NEE), which can be estimated by subtracting TER from GPP, is 23.2 Pg C yr−1 and remains within the range of values estimated by Jung et al. (2011). Table 4 shows that the NorESM overestimates TER fluxes by 14.3 % and 20.5 % in the northern and southern mid-latitudes, respectively, when compared to the measurements. In the tropics, simulated TER fluxes are 17.6 % higher than the FLUXNET-MTE estimates, whereas at high latitudes, NorESM underestimates the observed TER by 31 %.
Table 4. Regional and global annual mean gross primary production (GPP) and terrestrial ecosystem respiration (TER) as simulated by the NorESM and estimated from FLUXNET-MTE data. The FLUXNET-MTE uncertainties were estimated based on global mean uncertainties published by Jung et al. (2011). Units are in Pg C yr−1. Columns: Regions, NorESM-GPP, FLUXNET-GPP, NorESM-TER, FLUXNET-TER.
Figure 18 shows the distribution of mean annual GPP fields simulated by NorESM and as estimated from FLUXNET-MTE. In general, the NorESM land carbon model overestimates the annual GPP compared to the FLUXNET-MTE in the tropics and throughout the extratropics. NorESM simulates more than 4 kg C m−2 GPP throughout regions covered with tropical rain forest. The NorESM overestimates the latitudinal distribution of GPP in the tropics and in the mid-latitudes by approximately 15 %. Such a pattern has been shown by Beer et al. (2010) to be produced by process-based models, and more specifically by Bonan et al. (2011) for CLM4.0. The relatively large underestimation of GPP in the high latitudes might be due to excessive nitrogen limitation, predominantly during summer (see also the seasonal analysis below), and to issues with cold region hydrology, which are currently being addressed for the next version of CLM. Although this GPP discrepancy is locally quite strong, it represents only a small part of the total amount of carbon absorbed by land.
Figure 19 shows the spatial TER distribution from NorESM and observations. The latitudinal patterns of TER follow very closely those of GPP because the two variables are coupled in two ways: directly, in that GPP provides the substrate for autotrophic respiration, and more loosely, in that GPP indirectly regulates the amount of carbon returning to the soil, which in turn determines the heterotrophic respiration.
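The global respiration and net exchange figures quoted in this subsection follow from simple bookkeeping, reproduced in the sketch below. The respiration components, NEE, and the observational TER are taken from the text; the global GPP is not stated in the surviving text and is only implied here from NEE = GPP − TER, so it should be read as a derived value.

```python
def relative_bias_percent(model, reference):
    """Model-minus-reference difference as a percentage of the reference."""
    return 100.0 * (model - reference) / reference

autotrophic = 83.2     # Pg C yr-1, from the text
heterotrophic = 23.4   # Pg C yr-1, from the text
nee = 23.2             # Pg C yr-1, from the text (GPP minus TER)
ter_observed = 96.4    # Pg C yr-1, Jung et al. (2011) estimate quoted in the text

ter = autotrophic + heterotrophic           # total ecosystem respiration
gpp_implied = nee + ter                     # derived, not quoted in the text
ter_bias = relative_bias_percent(ter, ter_observed)

print(f"TER = {ter:.1f} Pg C/yr, implied GPP = {gpp_implied:.1f} Pg C/yr")
print(f"global TER bias = {ter_bias:+.1f} %")
```

The same helper applies to the regional percentages in Table 4 once the regional totals are available.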
Time series of monthly GPP from the model and observations are shown in Fig. 20. Generally, the seasonal cycle is correctly simulated by the NorESM, with large productivity during the respective hemisphere's summer season and low productivity in winter. In the Northern Hemisphere high latitudes, the simulated mean GPP is close to the observations, while the summer GPP is noticeably smaller than observed. In this region (i.e., north of 60° N), the simulated surface air temperature (at 2 m level) is lower by 1 to 5 K than the Climate Research Unit (CRU, New et al., 1999; Mitchell et al., 2005) as well as the National Centers for Environmental Prediction (NCEP, Saha et al., 2010) estimates, especially during the summer months (June-July-August). As temperature is a limiting factor for vegetation growth in these regions, lower temperatures may induce a shorter growing season, and hence an underestimation of productivity. However, a stand-alone CLM4 simulation forced with observed climate also simulates a similar high latitude GPP bias (Swenson et al., 2012). In their study, Swenson et al. (2012) suggest that other factors, such as excessive nitrogen limitation and limitation associated with cold region soil hydrology, may also play a role.
Fig. 19. Same as Fig. 18 for terrestrial ecosystem respiration (i.e., sum of autotrophic and heterotrophic). Units are in kg C m−2 yr−1.
In the mid-latitude regions of both hemispheres, the model simulates the amplitude and seasonal variability of GPP reasonably well. In the tropics, the seasonal variation of the model GPP is comparable with the observations, but the model mean is considerably larger, by approximately 0.6 Pg C month−1. With regard to the long-term regional change in GPP, both model and observations suggest a relatively small positive trend, except for the southern mid-latitude region, where the trend is not statistically different from zero. Globally, the model suggests an increasing trend of 1.74 Tg C month−2, more than three times larger than the 0.52 Tg C month−2 implied by the FLUXNET-MTE observations. We also note that there are uncertainties in the FLUXNET-MTE estimates associated with random and systematic errors from the upscaling methodology.
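A trend in units of Tg C month−2 is the slope of a monthly GPP series (Tg C month−1) regressed against time in months. The sketch below shows one way to obtain such a slope with an ordinary least-squares fit; it is an assumption that the trends in the text were computed this way, and the synthetic series is constructed only to demonstrate the function.

```python
import numpy as np

def linear_trend(monthly_values):
    """Least-squares slope of a monthly series, in (input unit) per month."""
    y = np.asarray(monthly_values, dtype=float)
    t = np.arange(y.size)  # time axis in months
    slope, _intercept = np.polyfit(t, y, 1)
    return slope

# Synthetic 24-year monthly series with a prescribed slope (not model output).
months = np.arange(24 * 12)
synthetic = 10000.0 + 1.74 * months
slope = linear_trend(synthetic)

# Ratio of the model and FLUXNET-MTE global trends quoted in the text.
trend_ratio = 1.74 / 0.52
print(f"slope = {slope:.2f}, model/observation trend ratio = {trend_ratio:.1f}")
```

For real data with a strong seasonal cycle, the fit would normally be applied to deseasonalized or annually averaged values, and the significance of the slope tested separately.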

Transient sea-air and land-air CO 2 fluxes
The time series of net oceanic carbon uptake simulated by the HIST and CTRL simulations is shown in Fig. 21. Over the 250 yr of the CTRL simulation, the ocean continues to take up CO 2 at 0.18 ± 0.08 Pg C yr −1 . In the HIST simulation, the model uptake rate is closely linked to the prescribed atmospheric CO 2 concentration. The sharp increase in atmospheric CO 2 after year 1950 leads to consistently more intense oceanic carbon uptake. Figure 21 shows that the model oceanic carbon uptake for the 1980s and 1990s agrees with the IPCC-AR4 estimates (Denman et al., 2007). For the present-day estimate (centered at year 2000), the model simulates a net ocean carbon sink of 2.41 ± 0.12 Pg C yr −1 (see also Table 2), well within the observation-based estimate of 2.0 ± 1.0 Pg C yr −1 (Takahashi et al., 2009).
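The comparison between the simulated present-day sink and the observation-based estimate can be made explicit with a simple interval check. The sketch below is only an illustration: the function name `within_range` is hypothetical, and treating each "±" value as a symmetric interval is an assumption made here for simplicity.

```python
def within_range(model_mean, model_err, obs_mean, obs_err):
    """Return True if the whole model interval lies inside the observed interval.

    Each estimate is treated as a symmetric interval mean +/- err
    (an assumption of this sketch, not a statement about the error model).
    """
    model_low, model_high = model_mean - model_err, model_mean + model_err
    obs_low, obs_high = obs_mean - obs_err, obs_mean + obs_err
    return obs_low <= model_low and model_high <= obs_high


# Values quoted in the text, in Pg C yr^-1.
sink_ok = within_range(2.41, 0.12, 2.0, 1.0)
print("model ocean sink within observational range:", sink_ok)
```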
The terrestrial carbon uptake simulated over the historical period is also shown in Fig. 21. Compared to the control simulation, the terrestrial carbon uptake steadily increases from year 1850 to 2006. However, the terrestrial carbon uptake, excluding land use change, remains lower than the IPCC-AR4 estimates (Denman et al., 2007) of the mean uptake for the 1980s and 1990s. This anomalously low terrestrial carbon uptake can be attributed to the strong nitrogen limitation on CO 2 uptake by plants, which reduces the CO 2 fertilization effect. In their study, Lindsay et al. (2013) also argue that this process is predominantly responsible for the bias in CLM4 land-atmosphere CO 2 fluxes.

Summary and conclusions
In this manuscript, we evaluate the carbon cycle components of the Norwegian Earth System Model (NorESM). The NorESM was developed based on several components of the Community Climate System Model (CCSM4). It keeps the original coupler (CPL7), terrestrial (CLM4), and sea ice (CICE4) components, while the chemistry processes in the atmospheric model (CAM4) are improved. The ocean general circulation and carbon cycle models are replaced with the Miami Isopycnic Coordinate Ocean Model (MICOM) and the Hamburg Oceanic Carbon Cycle (HAMOCC) model. In addition to the control and historical simulations discussed here, the NorESM has also been used for many other simulations to support the coming Fifth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC-AR5). The NorESM model output (referred to as "NorESM1-ME") is available for download at the CMIP5 (Coupled Model Intercomparison Project) website, http://cmip-pcmdi.llnl.gov/cmip5/.
The ocean carbon cycle model in NorESM is unique because of its coupling with an isopycnic ocean model. In general, the global distributions of temperature and salinity, as well as of biogeochemical tracers such as oxygen and nutrients, agree broadly with climatological estimates from the World Ocean Atlas (WOA). The model performs especially well in simulating the observed large-scale temperature amplitude and spatial variability. Surface distributions of oxygen and phosphate have been noticeably improved with respect to an earlier model version. This progress is attributed to the better tuned ecosystem parameters and improved mixing parameterization in the recent MICOM model version. The improvement in the surface nutrient distribution translates to a better representation of biological production, although a bias in the equatorial Pacific remains. A relatively strong AMOC of ∼ 32 Sv leads to model-data bias in tracer distributions, particularly in the North Atlantic Deep Water masses. The spatial pattern of sea-air CO 2 fluxes simulated by NorESM agrees well with climatological estimates, and the globally integrated net annual CO 2 flux for the contemporary period lies within the range of the observation-based estimates.
The land carbon cycle in NorESM is represented by the latest offspring of the CLM family, CLM4. With this land module, the NorESM reproduces the general pattern of the vegetation carbon content. However, CLM4 in NorESM considerably underestimates the soil carbon content, which appears to be due to poorly or incompletely represented biogeochemical and hydrologic processes in CLM4 rather than to biases in the coupled climate simulation. Compared to the FLUXNET-MTE estimates, the NorESM simulates the land-vegetation gross primary productivity reasonably well. Our analysis shows that the model simulates an amplitude and seasonal cycle consistent with observations in mid-latitudes, but considerable biases remain in the tropics and at high latitudes. The model-data disagreement in the tropics is due to excessive productivity, which has also been documented by Bonan et al. (2011). At high latitudes, excessively strong nitrogen limitation, particularly in the summer months, may be responsible for the model bias. Future development effort will be oriented toward a better parameterization of carbon absorption by vegetation as well as an improved and more process-based representation of ecosystem respiration. Much effort and methodological consideration will also be needed to improve the soil carbon content predictions.
The model will also be further developed to include land-ocean coupling by parameterizing the fluxes of carbon, nutrients, and dissolved oxygen into the continental margins through river runoff. The parameterization will be based on observational data and formulated as a function of weathering, temperature, and precipitation, similar to Bernard et al. (2011). We also plan to improve the nitrogen cycle in the ocean biogeochemistry model, focusing on the changes in marine N 2 O sources and sinks to the atmosphere under present and future climate change.