Increased Climate Response and Earth System Sensitivity From CCSM4 to CESM2 in Mid‐Pliocene Simulations

Three new equilibrium mid‐Pliocene (MP) simulations are implemented with the Community Climate System Model version 4 (CCSM4) and Community Earth System Model versions 1.2 (CESM1.2) and 2 (CESM2). All simulations are carried out with the same boundary and forcing conditions following the protocol of Pliocene Model Intercomparison Project Phase 2 (PlioMIP2). These simulations reveal amplified MP climate change relative to the preindustrial going from CCSM4 to CESM2, seen in global and polar averages of surface warming, sea ice reduction in both the Arctic and the Antarctic, and weakened Hadley circulation. The enhanced global mean warming arises from enhanced Earth system sensitivity (ESS) to not only CO2 change but also changes in boundary conditions primarily from vegetation and ice sheets. ESS is amplified by up to 70% in CCSM4 and up to 100% in CESM1.2 and CESM2 relative to the equilibrium climate sensitivity of respective models. Simulations disagree on several climate metrics. Different from CCSM4, both CESM1.2 and CESM2 show reduction of cloud cover, and weakened Walker circulation accompanied by an El Niño‐like mean state of the tropical Pacific in MP simulations relative to the preindustrial. This El Niño‐like mean state is consistent with paleo‐observational sea surface temperatures, suggesting an improvement upon CCSM4. The performances of MP simulations are assessed with a new compilation of observational MP sea surface temperature. The model‐data comparison suggests that CCSM4 is not sensitivity enough to the MP forcings, but CESM2 is likely too sensitive, especially in the tropics.


Introduction
The mid-Piacenzian (or mid-Pliocene, MP for short) warm period (at 3.205 millions of years ago, Ma) has been identified as one of the key targets of the Paleoclimate Model Intercomparison Project Kageyama et al., 2018;Masson-Delmotte et al., 2013). Despite good agreement in simulating modern and preindustrial (PI) climate, Earth system models (ESMs) diverge in predicting many fundamental aspects of future climate change, including the transient melting behavior of the Arctic sea ice (Stroeve et al., 2012), Arctic feedbacks (Pithan & Mauritsen, 2014), changes in equatorial Pacific sea surface temperature (SST) pattern (Coats & Karnauskas, 2017;Seager et al., 2019;Vecchi & Soden, 2007), and changes in subtropical precipitation (Collins et al., 2013), among many others. Paleo-observations can provide independent out-of-sample data to help constrain these uncertainties . To this end, Pliocene Model Intercomparison Project (PlioMIP) (Haywood et al., 2010(Haywood et al., , 2015 and Pliocene Research, Interpretation and Synoptic Mapping Project (PRISM) (Dowsett et al., , 2016 have been carried out with separate focuses on paleoclimate modeling and paleo-observational data synthesis. The simulations featured in the current study contribute to the second phase of PlioMIP (PlioMIP2). The Community Earth System Model version 2 (CESM2) simulation is the Coupled Model Intercomparison Project Phase 6 (CMIP6) Tier 1 experiment: midPliocene-eoi400.
PlioMIP2 targets at the time slice of 3.205 Ma during the mid-Piacenzian . The selection of 3.205 Ma allows the alignment of model simulations with the interglacial of Marine Isotope Stage KM5C (Prescott et al., 2014), which occurred with present-day orbits and 400 ppm CO 2 . The Greenland ice sheet is prescribed with results from the Pliocene Ice Sheet Modeling Project . This ice sheet configuration features only 25% modern coverage in Greenland and a deglaciated western Antarctic (Dowsett et al., 2016). A global soil map was generated for MP (Pound et al., 2014). A dynamic topography model was applied to estimate topographic changes due to changes in mantle convection and removal of ice sheets (Dowsett et al., 2016). Sea level and other sedimentary information were integrated with the simulation of dynamic topography to determine changes to ocean gateways and coastal shelves, resulting in closed Bering and Canadian Arctic Archipelago straits; exposed Sunda, Sahul, and Baltic Shelves; and an exposed Hudson Bay (Dowsett et al., 2016).
In the modeling realm, following the previous PlioMIP (PlioMIP1) and Climate Model Intercomparison Project 5 (CMIP5), many new versions of ESMs have been released, including CESM1 and CESM2. PlioMIP1 was carried out with the Community Climate System Model version 4 (CCSM4) . Different from CCSM4, CESM1 features extensive updates in representing atmospheric physics including a new radiation calculation, a boundary layer scheme, a shallow convection scheme, and cloud microphysics (Hurrell et al., 2013). The model also incorporates a modular aerosol model (Liu et al., 2012). CESM1 shows improved simulations of cloud climatology and radiative forcing compared to CCSM4  and substantial changes in cloud feedback to CO 2 warming with an overall less cloud cooling effect in response to surface warming (Gettelman et al., 2012). CESM1 also has a higher equilibrium climate sensitivity (ECS) of 4 K per doubling of CO 2 . ECS is 3.2 K in CCSM4 (Bitz et al., 2012). In published CESM1.2 PlioMIP1 simulations, the capability of simulating aerosol-cloud interactions and the removal of anthropogenic pollutants are shown to substantially amplify the Arctic warmth . The combined changes of atmospheric physics in CESM1.2 also produce an El Niño-like equatorial Pacific response in a PlioMIP1 simulation, which is not featured in CCSM4 (Tierney et al., 2019).
From CESM1 to CESM2, substantial updates were made to all model components (Danabasoglu, Lamarque, & Bacmeister, et al., 2019). In particular, the new land model features changes in both CO 2 and nitrogen fertilization (Fisher et al., 2019). Parameterizations of the planetary boundary layer and shallow convection are now unified with the Cloud Layers Unified by Bi-Normals Parameterization (CLUBB) scheme (Bogenschutz et al., 2013). A sea ice model has now included the mushy layer physics (Turner & Hunke, 2015). ECS has increased again to 5.3 K per doubling of CO 2 (Gettelman et al., 2019).
In this study, we document the implementation of PlioMIP2 boundary conditions in three generations of models from the same lineage, namely, CCSM4, CESM1.2, and CESM2, and quantify simulated large-scale MP climate changes relative to PI. Model agreements and disagreements are highlighted. A new compilation of paleo-observational SST (Foley & Dowsett, 2019) is used to evaluate performances of these three models.
Following the CMIP5 and CMIP6 simulations, the atmosphere and land components in our PlioMIP2 simulations are set to 0.9°latitude by 1.25°longitude horizontal resolution. Atmosphere components feature 26, 30, and 32 levels from~992.6 hPa at the bottom to~3.5 hPa at the top of CCSM4, CESM1.2, and CESM2, respectively. The ocean component features 60 vertical levels from 5 m depth to 5,375 m depth in all models. Both ocean and sea ice components use a horizontal grid mesh of 384 by 320 grids with grid sizes identical to published CMIP simulations north of~60°S. Due to the terrestrial presence of the western Antarctic ice sheet in PI and present-day boundary conditions, the default 384 by 320 ocean and sea ice grid mesh does not cover the western Antarctic. In order to simulate ocean circulation and sea ice of the western Antarctic, we add 10 meridional rows of ocean grids to the present-day grid mesh. Each row is a replica of the southmost row of the default 384 by 320 mesh. One may choose to generate a new ocean grid mesh with the same number of grid cells but different grid sizes, which is commonly done in simulations of pre-Quaternary climates. However, different grid sizes may introduce artificial differences when comparing with the PI simulation and, hence, are not used here.

Boundary Conditions
We implement the enhanced boundary conditions following the protocol of PlioMIP2 (Dowsett et al., 2016;Haywood et al., 2015). We first calculate anomalies of MP topography and bathymetry relative to PI at 0.5°r esolution (50 km). Both are derived from the ETOPO data set (a global relief data set developed by National Geophysical Data Center) by PRISM4 (Dowsett et al., 2016). Anomalies are then interpolated to the resolution of model components and added to the PI topography and bathymetry used by CCSM4 and CESMs. Due to the lack of information of high-resolution topographic roughness, modern roughness is mapped to paleotopography by proximity, with the exception of where the ice sheets were absent during the MP. Topographic roughness from the nearest unglaciated land is prescribed to cover the deglaciated area of the Greenland and Antarctic.
MP changes to topography are typically less than 500 m (Figures 1a and 1b). Bathymetric changes mainly occur in the newly opened western Antarctic coastal shelf, uplifted Mariana trench, and East Pacific Rise and around maritime continents due to mantle convection. Coastlines are adjusted to feature closed Bering and Canadian Arctic Archipelago straits and exposed Hudson Bay, Baltic Sea shelf, and Sunda and Sahul Shelves (Dowsett et al., 2016). Greenland and Antarctic ice sheets are reduced   (Figures 1a and 1b).
The MP paleo-biome distribution was generated by blending the fossil records with BIOME4 simulations (Salzmann et al., 2008). Although no change occurs in reconstructed biome types from PlioMIP1 to

Journal of Advances in Modeling Earth Systems
PlioMIP2 (Salzmann et al., 2008), a more general megabiome map is provided for PlioMIP2 (Dowsett et al., 2016). We also updated the mapping method to convert biome types to plant functional types (PFTs) used by CCSM4 and CESMs. PFTs are prescribed as boundary conditions. Yet productivity and plant phenology are prognostic and solved as part of the terrestrial carbon cycle in equilibrium with climate and prescribed CO 2 .
Our updated mapping method includes the following steps: (1) We group the modern PFTs into groups of modern megabiome types (PFT BIOME-present ) defined by the BIOME4 data set (BIOME-present). PFTs located in the same area as a megabiome type are indexed with that type.
(2) We quantify the latitudinal offset of a megabiome type between its MP (BIOME-MP) and present-day locations (BIOME-present) using the mean meridional Euclidian distance.
(3) Interim maps of PFTs (PFT BIOME-i ) and megabiome types (BIOMEi) are generated by shifting PFT BIOME-present and BIOME-present to their MP latitudes. (4) We map the PFTs from the PFT BIOME-i to the final MP locations based on spatial correspondence function G: PFT BIOME-MP ¼ G (PFT BIOME-i ). G is determined with BIOME-MP ¼ G (BIOME-i) for the same biome type by proximity and inverse distance weighting within a radius of 500 km. The second and third steps are not included in the previous mapping method ( Rosenbloom et al., 2013). Without these steps, MP boreal forest PFTs were extrapolated from the northern edge of the present-day boreal forest by proximity, resulting in muted changes in PFTs ( Figure 2).
Using the same approach, we apply the MP soil map (Pound et al., 2014) to generate the soil color and organics in our MP simulations. MP lakes are prescribed according to reconstructions (Pound et al., 2014). Yet, in CCSM4 and CESMs, lake water balance is maintained with the assumption of fixed lake depth (Oleson et al., 2010). Our test runs suggest that the atmospheric net water input cannot maintain the lake levels of the Northern African lakes. Water from the nearest estuary outlet, i.e., the Mediterranean, is extracted to maintain the lake level. To avoid this unphysical process, the Northern African lakes are replaced with the nearby PFTs in our simulations ( Figure 2).

Implementation
All simulations are carried out on the Cheyenne supercomputer (Computational and Information Systems Laboratory, 2017a, 2017b), which is maintained at the Computational Information Systems Lab and funded by the National Science Foundation. For CCSM4 and CESM1.2, we branch out two PI simulations from the long PI simulations (1,000 years) performed on the now retired Yellowstone supercomputer using CCSM4 and CESM1.0 (Gent et al., 2011;Hurrell et al., 2013). These two PI simulations are continued for an additional 180 and 300 years on Cheyenne with the same CCSM4 and a slightly updated model CESM1.2.
Changing supercomputer or small model updates creates no change in the net top of the atmosphere (TOA) radiation imbalance (~0.1 W/m 2 ) ( Figure 3a). Simulated global mean surface temperatures of PI are also consistent with published values (Figure 3b). These two PI simulations (CCSM4-PI and CESM1.2-PI) serve as the baseline for comparisons with MP simulations using respective models (CCSM4-MP and CESM1.2-MP). The CESM2-PI simulation is part of the CMIP6 Tier 1 experiment piControl.001 and has recently been completed on Cheyenne (Danabasoglu et al., 2020). This simulation serves as the baseline for comparisons with the CESM2-MP simulation (CESM2-MP). Model initialization, forcing, run length, and net TOA radiation imbalance are summarized in Table 1.
Details of initializing land and ocean components are discussed here. To reduce the spin-up time of carbon-nitrogen cycle on land, we carried out separate land-only simulations with arbitrary initial conditions and forcings generated by the coupled MP test runs using CESM1.2 (featuring Community Land Model version 4) and CESM2 (featuring Community Land Model version 5). Land initial conditions for these two coupled test runs were interpolated from the PI runs. These two test runs were continued for~200 years before producing the 30-year forcing data. This spin-up procedure is meant to reduce the runtime by producing a land initial state close to the MP.
Due to the high computational expenses of CESM1.2 and CESM2, we test both PI and warm initial conditions to determine a cost-effective approach for ocean initialization. Three-dimensional ocean temperature and salinity of equilibrium PI runs are mapped to the MP bathymetry based on proximity to create PI initial conditions. Warm initial conditions are created following the approach used by the CCSM4 PlioMIP1 simulation . Sea water temperature anomalies between MP reconstructions and PI observations are added to simulated PI SSTs to create the initial ocean temperature. Salinity is initialized from the published PlioMIP1 simulation .
When initialized with PI conditions, CCSM4-MP shows rapid decline of TOA radiation imbalance within the first 50 model years, yet both CESM1.2-MP and CESM2-MP show slow decline with high TOA radiation  (Gent et al., 2011;Hurrell et al., 2013). The CESM2 simulation has recently been completed on the same supercomputer and published (Danabasoglu et al., 2020) (only the last 100-year data of CESM2 is shown here).

Analysis of Results
MP climate change is quantified as the difference in climatology between the MP and PI simulations using the same model. Climatology is calculated as the average of the last 100 model years. We also calculate a series of climate metrics to quantify large-scale climate change. Definitions and references of these metrics are provided in Table 2.

Results
Metrics of simulated global mean surface condition, moist state, and changes of the atmosphere and ocean circulation are shown in Tables 3-5, respectively. Model agreement and disagreement are described in the following sections.

Global Mean Surface Climate
Among the three versions of the simulations, CESM2-MP features the greatest changes in surface climate, whereas CCSM4-MP features the smallest (Table 3). Surface warming (in both air temperature and SST) is strongly amplified toward high latitudes with greater amplification in the northern high latitudes than in the southern high latitudes. This warming pattern is consistent with previous findings of PlioMIP1  and paleo-observations (Dowsett et al., 2012). Moreover, our MP simulations show substantial reduction of annual sea ice in both the Arctic and Antarctic regions. In particular, the Arctic Ocean is sea ice free during the boreal summer (August to October) in CESM1.2 and CESM2. The Southern Ocean is nearly sea ice free during the austral summer (February to April) in CESM2 ( Figure 6) ( Table 3).
Despite enhanced Arctic warming from CCSM4 to CESM2, the relative strength of Arctic feedbacks to global mean decreases. Using a linear feedback framework (e.g., Goosse et al., 2018), the total feedback factor (γ), which measures the relative strength of all feedbacks to the strength of Planck's feedback, can be estimated as γ ¼ 1 − ΔT 0 ΔT s , where ΔT 0 and ΔT s are the surface temperature change due to Planck's feedback and the  Table 2. 10.1029/2019MS002033

Journal of Advances in Modeling Earth Systems
surface temperature change due to all feedbacks, respectively. γ scales positively with ΔT s and negatively with ΔT 0 . We assume that intermodal difference is small for global and Arctic ΔT 0 . This is generally the case in multimodel ensembles (Gettelman et al., 2012(Gettelman et al., , 2019Pithan & Mauritsen, 2014;Zelinka et al., 2020) and supported by the similar reference state of the global and Arctic mean climate in PI simulations. As such, in our MP simulations, the difference in feedback factor scales with the difference in surface temperature change relative to PI: γ~ΔT s .
We use the ratio of Arctic (ΔT sp ) to global mean ΔT s (ΔT sg ) to compare Arctic amplification of feedback strength between our simulations: ΔT sp ΔT sg is~3.0 in CCSM4 and CESM1.2 but only 2.4 in CESM2, suggesting weakened Arctic amplification of feedbacks. As a result, the enhanced Arctic surface warming in CESM1.2 and CESM2 is likely a result of warmer global mean climate instead of amplified Arctic feedback strength (Table 3).

Global Mean Precipitation and Cloud
Global mean annual precipitation increases in all MP simulations (Table 4). The rate of precipitation increase is around 2% per degree of surface warming, consistent with previous modeling studies (Held & Soden, 2006). This result suggests weakened convective mass overturning in response to surface warming (Held & Soden, 2006), which is also shown in the relative small increase in the convective precipitation compared to the increase of total precipitation (Table 4).
Cloud coverage increases in CCSM4-MP but decreases in both CESM1.2-MP and CESM2-MP. (Table 4). This intermodel disagreement is consistent with changes of relative humidity in the troposphere. CCSM4-MP features 11.6% increase in specific humidity per degree of tropospheric warming; this ratio is 8.7% and 7.5% in CESM1.2-MP and CESM2-MP, respectively. The increase in specific humidity of CCSM4-MP far exceeds the increase in saturation vapor pressure (7% per degree warming), suggesting increasing relative humidity of the troposphere, consistent with increasing condensation and cloud cover.

Tropical Circulation Change and Model Dependency
Hadley circulation (HC) weakens in all MP simulations (Table 5 and Figure 7). This weakening primarily occurs in the Northern Hemisphere with 10% to 20% reduction in overturning mass flux. The Southern Hemisphere mainly displays changes in the HC morphology (Table 5). Weakening of the HC is broadly consistent with the weakening of tropical convective mass overturning manifested in the slower increase of precipitation relative to the increase of lower tropospheric water vapor content (Held & Soden, 2006). Yet simulations of future climate tend to show more robust weakening of the Walker circulation than of the HC (Vecchi & Soden, 2007). This discrepancy between simulations of MP and the future may be attributable to the non-CO 2 changes in MP boundary conditions. Cloud coverage between 700 and 400 hPa High-level cloud Cloud coverage above 400 to 10 hPa Total Hadley circulation (HC) strength (I Had-tot ) Difference between the maximum and minimum of the zonal mean mass stream function between 30°N and 30°S (Oort & Yienger, 1996). Northern Hemisphere HC strength (I Had-N ) Maximum of the zonal mean mass stream function between 0°and 30°N (Oort & Yienger, 1996)  The hemispherically asymmetric weakening of the HC seemingly coincides with the hemispherically asymmetric changes in meridional SST structure. Mean SST contrast between the tropics (15°S to 15°N) and extratropics (40°N/S to 70°N/S) decreases by~3-4°C in the Northern Hemisphere but only by~1°C in the Southern Hemisphere (Table 1). The connection between changing meridional temperature structure and the strength or symmetry of the HC has been proposed based on previous experiments with prescribed SSTs (Brierley et al., 2009;Carrapa et al., 2019;Feng et al., 2016;Rind, 1998), radiation flux adjustments (Burls & Fedorov, 2017;Frierson & Hwang, 2012;Frierson et al., 2013), and coupled Earth system simulations (Chiang & Friedman, 2012;Friedman et al., 2013). The connection can be established through meridional heat transport: Heating from the extratropics in both hemispheres or in one hemisphere can alter the radiation imbalance of the atmosphere, leading to changes in the strength (Burls & Fedorov, 2017;Carrapa et al., 2019;Rind, 1998), or hemispheric symmetry of the atmospheric heat transport and HC (Chiang & Friedman, 2012;Friedman et al., 2013;Frierson & Hwang, 2012;Frierson et al., 2013).
This proposed connection cannot explain the intermodal difference in simulated HC strength. From CCSM4-MP to CESM2-MP, despite a greater weakening of the Northern Hemisphere HC strength, changes in the meridional atmospheric heat transport go from a strong reduction in CCSM4-MP to nearly constant in CESM2-MP in the northern extratropics ( Figure 8).
Model dependency is also seen in the simulated MP Walker circulation changes. CCSM4-MP simulates strengthened Walker circulation, whereas both CESM1.2-MP and CESM2-MP simulate small (−0.7%) to substantial (−9.0%) weakening of the Walker circulation (Table 3). This intermodel spread is well coupled to the intermodel spread of changes in the east-west SST contrast across the equatorial Pacific, which increases by 0.3°C in CCSM4-MP but decreases by 0.3°C and 1°C in CESM1.2-MP and CESM2-MP, respectively, relative to PI (Figure 9). Both simulations also feature weakened upwelling in the upper ocean of the eastern equatorial Pacific (Figure 9). Consequently, coupled atmosphere-ocean state of the tropical Pacific becomes El Niño-like in both CESM1.2-MP and CESM2-MP but remains unchanged in CCSM4-MP.
The El Niño-like warming pattern has been identified in instrumental observations of tropical SST during the 1980s and 1990s and attributed to the greater response of cloud cover and albedo to CO 2 increase in the eastern equatorial Pacific (Meehl & Washington, 1996;Xie, 2020) even without ocean dynamical

Journal of Advances in Modeling Earth Systems
FENG ET AL.
contributions (Vecchi & Soden, 2007;Xie, 2020). Yet the debate on whether or not this feature originates from model bias in simulating the equatorial Pacific cold tongue remains open (Seager et al., 2019). An El Niño-like warming pattern implies an overall positive cloud feedback in the eastern equatorial Pacific (Zhou et al., 2017) and a higher climate sensitivity to CO 2 forcing compared to a uniform warming pattern or an La Niña-like pattern (Clement et al., 1996;Xie, 2020). Most CMIP6 models now feature an El Niño-like warming pattern in response to CO 2 increase (Xie, 2020). An El Niño-like warming pattern is also seen in observed MP SSTs (Fedorov et al., 2013;Tierney et al., 2019), supporting the simulations by CESM1.2 and CESM2.

Atlantic Meridional Overturning Circulation
Different from the PlioMIP1 simulations by CCSM4 and CESM1.2 Rosenbloom et al., 2013), our experiments show strengthened AMOC in the upper ocean (above 1,000 m) compared to PI corresponding to increased salinity in the North Atlantic ( Figure 10). PlioMIP1 features open Bering and Canadian Arctic Archipelago straits (Haywood et al., 2010). Subsequent sensitivity studies find that closure of those Straits can enhance AMOC in several models (Brierley & Fedorov, 2016;Hill, 2015;Otto-Bliesner et al., 2017) through restricting the fresher water export from the Arctic towards the north Atlantic, which weakens the overall vertical stratification of the North Atlantic . The consistent response between previous sensitivity studies and ours highlights the importance of Arctic Ocean gateways in modulating the strength of AMOC.

Enhanced Earth System Sensitivity From CCSM4 to CESM2
Earth System Sensitivity (ESS) describes the long-term equilibrium surface temperature response to a doubling of CO 2 , which measures long-term feedbacks from changes in ocean circulation, vegetation distribution, and ice sheets in addition to short-term feedbacks but excludes feedbacks from carbon cycle such as marine productivity or weathering (Lunt et al., 2009). Estimates of ESS from MP simulations are known  Note. Metrics used to quantify circulation changes are shown in Table 2. Numbers in parentheses are the % changes relative to PI.

Journal of Advances in Modeling Earth Systems
to be greater than the ECS Lunt et al., 2009), which only measures the submillennial time scale feedbacks in the climate system.
In our simulations, the global mean warming solely from CO 2 radiative forcing can be estimated with the known model ECSs. Using the MP and PI CO 2 (ln(400/284.7)) and ECSs of individual models, we estimate CO 2 -induced global warming of 1.6°C, 2.0°C, and 2.6°C in CCSM4-MP, CESM1.2-MP, and CESM2-MP, respectively. Increasing ECS in CESM2 (Gettelman et al., 2019) and in many CMIP6 models has been attributed to the incorporation of aerosol-cloud interactions into the model and a stronger positive extratropical cloud feedback (Zelinka et al., 2020). Nonetheless, a greater ECS is insufficient to explain the increase in surface warming from CCSM4-MP to CESM2-MP (Table 2). Changes in boundary conditions amplify the simulated global mean warming by 70% of the ECS-estimated warming in CCSM4 but by 100% in both CESM1.2 and CESM2 (Table 2).
Despite ice sheets and PFTs being prescribed in our MP simulations, the terrestrial carbon cycle and plant phenology are prognostic and dependent on both model versions and simulated climates. The radiative forcing efficacy associated with changing vegetation and ice sheets may also depend on simulated climate states. Similarly, although prescribed emissions of aerosols and precursors are the same for MP and PI experiments, climatically important aerosols from the surface such as dust and sea salt are semiprognostic. Aerosol transport and removal are also prognostic. Moreover, CESM1 and CESM2 predict binned size distribution and hygroscopicity of aerosol particles (Liu et al., 2012(Liu et al., , 2016 and aerosol-cloud interactions (Gettelman et al., 2015;Morrison & Gettelman, 2008), which are not included in CCSM4. Consequently, simulated aerosol conditions and forcings are likely different between experiments. Changing aerosol

Journal of Advances in Modeling Earth Systems
conditions during the MP may have a substantial radiative effect Sagoo & Storelvmo, 2017;Unger & Yue, 2014). Future work will explore this effect together with the radiative effect due to changes in vegetation and ice sheets in order to pinpoint the source(s) of the amplified ESS relative to ECS.
Instead, changes in topography, soil, and lakes reflect plate tectonics and weathering and are independent of ESS. Topographic changes are generally small, ranging from a minimum of −280 m to a maximum of 529 m in all simulations. Global temperature responses to MP changes in soil and lake distribution (Pound et al., 2014) and closure of Bering and Canadian Arctic Archipelago straits Otto-Bliesner et al., 2017) and Indonesian Seaway (Brierley & Fedorov, 2016) are also found to be small.
The high ESS-to-ECS ratio highlights the long-term warming potential of the Earth system at the millennial time scale. From CCSM4 to CESMs, there is a clear amplification of ESS beyond the increase of ECS. Geological reconstructions of warm periods of Neogene (since 23 millions of years ago) may support a high ESS-to-ECS ratio. For example, global mean SST is estimated to be 5-6°C warmer than PI during the early

Can Paleo-Observations of MP SSTs Help Evaluate Skills of CCSM4-MP, CESM1.2-MP, and CESM2-MP?
An important goal of paleoclimate modeling is to sample model structural uncertainties with a set of boundary and forcing conditions outside the model calibration range. Comparisons between model simulations and paleo-observations can provide insights into model skill (e.g., Haywood et al., 2019). In practice, this exercise is often complicated by the uncertainty and sparsity of paleo-observations. With coordinated effort, two new MP SST data sets are now available through compiling data within a narrow age constraint of the targeted MP interval (Foley & Dowsett, 2019, FD hereafter) and also applying multiple calibrations (McClymont et al., 2020). These data sets minimize the uncertainty due to unresolved glacial-interglacial cycles in the records, which is known to complicate the MP model-data comparison Prescott et al., 2014;Salzmann et al., 2013). Here we used the FD data set to help rank the skill of three models in simulating MP climate. A detailed discussion on site-by-site comparison between the PlioMIP2 model ensemble and paleo-observational SSTs and limitations of paleo-observations can be found in Haywood et al. (2020) and McClymont et al. (2020).

Assessment of Global Mean Surface Temperature
Due to the sparse sampling, it is unclear whether the observational SSTs contain enough information to estimate the global mean surface warming of the MP. We address this question by examining the relationship between global mean surface warming (Δ{Ts} global-model ) and simulated mean SST warming averaged across paleo-observation sites (Δ{SST} sites-model ) ( Figure 11). Δ{Ts} global-model and Δ{SST} sites-model are calculated for each 30-year period of the last 900 model years of MP simulations. Surface temperature (ΔTs) and SST anomalies (ΔSST) are calculated relative to the last 100-year averages of the PI simulations. Treating each pair of Δ{Ts} global-model and Δ{SST} sites-model as one realization of climate mean state, we notice an excellent linear fit between Δ{Ts} global-model and Δ{SST} sites-model across the three simulations, suggesting strong predictive power of Δ{SST} sites-model for Δ{Ts} global-model ( Figure 11).
Observed SST warming (Δ{SST} sites-obs ) is then calculated by averaging the differences between observed MP SSTs and mean SSTs of 1871 to 1900 (using the Hadley Centre Sea Ice and Sea Surface Temperature data set; Rayner et al., 2003) across all paleo-observational sites. The one standard deviations (1σ) of individual paleo-observations are propagated to the 1σ of Δ{SST} site-obs assuming Gaussian distributions of paleo-SSTs and site independency. Δ{SST} sites-obs is 3.6°C ± 0.6°C, much higher than the equilibrium Δ{SST} sites-model of CCSM4-MP but closer to that of CESM1.2-MP and CESM2-MP. CESM1.2-MP underestimates Δ{Ts}, whereas CESM2-MP overestimates Δ{Ts}, suggesting underestimating and overestimating ESS by these two models (Figure 11).

Assessment of Meridional SST Structure
A key spatial signature of the MP SST warming is the northern high latitude amplification. SST changes are muted in the tropics and are relatively small in the southern high latitudes (Dowsett et al., 2012;Haywood, Hill, et al., 2013;McClymont et al., 2020). This meridional structure is known to be challenging for CCSM4 to capture Rosenbloom et al., 2013) (Figure 12).
We treat observed paleo-SST anomalies relative to the 1871-1990 averages (ΔSST obs ) as random samples of zonal mean ΔSST obs . A polynomial fit using sin(φ) as the basis (φ: latitude in radian) is used to estimate distribution of zonal mean ΔSST obs (Figure 12). The choice of sin(φ) is to account for the hemispherically asymmetric amplification of the SST warming. Statistically significant fit is achieved with the second-order polynomial. The P-value is 0.03 through the likelihood ratio test against an intercept-only model. The polynomial fit well captures the observed SST warming structure. Figure 11. Regression of simulated global mean surface temperature anomaly (Δ{T s } global-model ) against simulated SST anomaly averaged over the paleo-observational sites (Δ{SST} sites-model ). Shadings show 95% prediction interval. The mean and one standard deviation of observed SST anomaly are also shown (Δ{SST} sites-obs ). Anomalies are differences between MP and PI. Filled markers: 30-year model averages.
To account for the zonal heterogeneity in paleo-observations, we estimate one standard deviation (1σ) of predicted zonal mean ΔSST obs by polynomial fit (Figure 12, dash lines). Of all latitudes where paleo-observations are available, 27% of the simulated zonal mean ΔSSTs by CCSM4 are within 1σ of the predicted zonal mean ΔSST obs . This percentage is 78% for CESM1.2 and 43% for CESM2. CESM1.2 shows, overall, the best skill at capturing the meridional warming structure of the MP SST. Within the tropics (20°S to 20°N), simulated zonal mean ΔSST by CESM2 is the warmest among these three models and mostly above the 1σ of zonal mean ΔSST obs , consistent with CESM2's highest ECS and potential overestimate of ESS. Figure 13 shows the spatial distribution of ΔSST model and ΔSST obs . From CCSM4-MP to CESM2-MP, ΔSST model displays a closer match to ΔSST obs in the North Atlantic with the mean difference relative to ΔSST obs decreasing from −2.8°C to −0.3°C. CESM1.2-MP shows the best match (0°C mean difference) with ΔSST obs at tropical sites (<20°N/S), whereas CCSM4-MP underestimates (−1.0°C mean difference) and CESM2-MP overestimates (+1°C mean difference) tropical SSTs. Persistent model-observation mismatch remains high around the Benguela current, Gulf Stream, North Atlantic current, and Mediterranean. These mismatches are also noticed in MP multimodal ensembles (McClymont et al., 2020). These regional mismatches may come from high spatial heterogeneity of ocean currents and insufficient model resolution. High resolution is shown to improve the skills of CCSM4 and CESM1 in simulating present-day upwelling and current systems (Gent et al., 2009;Small et al., 2014). Along the upwelling zones and its surroundings, amplified warming may be generated by the weakening of upwelling-favorable wind events  and optimal combinations of ocean stratification, wind stress, and alongshore pressure gradient (Miller & Tziperman, 2017). These conditions may not be featured in the fully coupled simulations.  Comparison of meridional structure of simulated and observed SST differences between MP and PI (ΔSST). Black diamond ¼ ΔSST from paleo-observations. Solid lines ¼ second-order polynomial fit of observed ΔSST as a function of sine latitude. Dash lines ¼ one standard deviation of predicted zonal mean ΔSST by polynomial fit. Light magenta ¼ simulated ΔSSTs at each model grid across latitudes for which paleo-observations are available. Red ¼ simulated zonal mean ΔSSTs.

10.1029/2019MS002033
Journal of Advances in Modeling Earth Systems

Conclusions
Three versions of ESMs maintained at the National Center for Atmospheric Research are applied to simulate the mid-Piacenzian warm period, a target interval of Paleoclimate Model Intercomparison Project. All three simulations agree in the signs of MP changes of global mean and meridional structure of surface temperature, precipitation, sea ice cover, HC, and AMOCrelative to PI. Specifically, global mean surface air temperature increases by 2.7°C in CCSM4 and 5.2°C in CESM2. Surface warming is amplified in the polar region. Arctic amplification is 3 times the global mean in CCSM4 and CESM1.2, and 2.4 times that in CESM2. CESM2 features near-complete sea ice-free conditions during both Arctic and Antarctic summer. Summer sea ice remains present in both the Arctic and the Antarctic in CCSM4. Precipitation increases by 5% in CCSM4 and 11% in CESM2 relative to PI, slower than the expected increase of surface saturation vapor pressure, suggesting slowing down of convective mass overturning. HC weakens by 10% to 22% in the Northern Hemisphere. AMOC strengthens by 3 to 8 Sv in the upper 1,000 m of the ocean. All simulations demonstrate greater ESS compared to ECS. ESS is amplified by 70% in CCSM4 and 100% in CESM1.2 and CESM2 relative to ECS.
Simulations disagree in the signs of global mean cloud changes relative to PI: CCSM4 shows small increase in both low and high cloud covers; both CESM1.2 and CESM2 show reduction of cloud covers at all heights.
Simulations also disagree in changes of the Walker circulation and mean climate state of the tropical Pacific. CCSM4-MP shows strengthened Walker circulation and an unchanged mean climate state of the tropical Pacific relative to PI. Both CESM1.2-MP and CESM2-MP show weakened Walker circulation and an El Niño-like tropical Pacific state, an improvement toward matching observed SSTs of the MP.
Finally, we find that MP SST observations are sufficient to rank simulation skill between models. Based on those data, MP global mean surface warming is largely underestimated by CCSM4 but overestimated by CESM2. Both CESM1.2 and CESM2 are better at capturing the amplified northern high latitude warming than CCSM4 is. Yet CESM2 overestimates warming in the tropics. Overall, paleoclimate constraints from the MP suggest that CESM2 may be overly sensitive to forcings from CO 2 , vegetation, and ice sheet changes.

Data Availability Statement
Simulation data are available through the Climate Data Gateway maintained at NCAR. The CESM2 run is available through the Earth System Grid and is part of NCAR's contribution to CMIP6.