Data-constrained assessment of ocean circulation changes since the middle Miocene in an Earth system model

Since the middle Miocene, 15 Ma (million years ago), the Earth’s climate has undergone a long-term cooling trend, characterised by a reduction in sea surface temperatures by over 6 °C, with 4 to 6 °C cooling occurring in the deep ocean. The 10 causes of this cooling are primarily thought to be linked to changes in ocean circulation due to tectonic plate movements affecting ocean seaways, together with a drop in atmospheric greenhouse gas forcing (and attendant ice-sheet growth and feedback). In this study we assess the potential to constrain, using marine sediment proxy data, the evolving patterns of global ocean circulation and cooling of surface climate over the last 15 million years (Ma) in an Earth system model. We do this by compiling surface and benthic ocean temperature and benthic carbon-13 data in a series of seven time-slices spaced at 15 approximately 2.5 million year intervals. We pair this with a corresponding series of seven tectonic and surface climate boundary condition reconstructions in the cGENIE (‘muffin’ release) Earth system model. In the cGENIE model, we adjust atmospheric CO2 together with the magnitude of North Pacific to North Atlantic salinity flux adjustment in a series of 2D parameter ensembles in order to match global temperature and benthic C patterns in the model to the data. We identify that a relatively high CO2 equivalent forcing of 1120ppm is required at 15 Ma in cGENIE to reproduce proxy temperature estimates 20 in the model, noting that this CO2 forcing is dependent on cGENIEs climate sensitivity (which is as the present day) and that it incorporates the effects of all greenhouse gases. The required CO2 forcing progressively reduces throughout the subsequent six time slices delineating the observed long-term cooling trend. In order to match the evolving patterns of the proxy data, we require fundamental change in the mode of ocean circulation at 12.5 Ma with present-day-like benthic δC trends established by 10 Ma. We also find a general increasing strength of Atlantic overturning despite a reduction in salinity of the surface North 25 Atlantic over the cooling period, attributable to falling intensity of the hydrological cycle and polar cooling caused by CO2driven global cooling.

2011) together with the intensification of glacial-interglacial cycles in the Pleistocene. Fundamental changes have also occurred in ocean circulation patterns  during this interval of global cooling. Specificallythe Atlantic Meridional Overturning Circulation (AMOC), that today redistributes heat to the Northern hemisphere, became established in its current form sometime after the middle Miocene (Sepulchre et al., 2013;Bell et al., 2015). In turn, this is suspected to have been linked to the closing of the seaway between the Atlantic and the Pacific with the creation of the Isthmus of Panama (Lunt 35 et al., 2008;Montes et al., 2015;O'Dea et al., 2016;Jaramillo et al., 2017) that finally closed the Central American Seaway (CAS).
Other seaways have also been tectonically transformed since the Miocene, particularly: the disappearance of the Tethys Sea due to the northward movement of Africa (Hamon et al., 2013); the restriction of the Indonesian seaway with the northward movement of Australia (Srinivasan et al., 1998); the widening of the Drake Passage driven by the northward 40 movement of South America relative to Antarctica (Lagabrielle et al., 2009). Associated with these plate movements, the Miocene to Holocene was characterised by significant mountain building which may have played a direct role in the drawdown of atmospheric CO2 via weathering, and hence progressive cooling (Filippelli et al., 1997;Raymo et al., 1998).
Terrestrial proxies for temperature indicate that the Miocene was significantly warmer than the present day (Pound et al., 2012 and references therein). Marine data also indicate a significantly warmer-than-present Miocene climate with surface 45 ocean temperatures over 6 °C warmer than present (Stewart et al., 2004;Herbert et al., 2016), deep ocean temperature 4 to 6 °C warmer (Cramer et al., 2011), and atmospheric CO2 levels at 470 to 630ppm in the middle Miocene (Sosdian et al., 2018).
Sea-floor proxy data have been used extensively to understand patterns of ocean circulation and climate in the Miocene . Oxygen-18 data has been used as a proxy for temperature and global ice volume, often combined with Mg/Ca (as a further temperature proxy) (Cramer et al., 2011;Hauptvogel et al., 2012;Badger et al., 2013;Lear et al., 2015;Pierce et al., 2017). Carbon-13 data has principally been used as a circulation tracer (Lynch_Stieglitz 2003), with high-55 resolution data used to understand timing of changes in (local or regional) climate, and linking this to changes in orbital forcing (e.g. Holbourn et al., 2005). With the increasing availability of data, global distributions of δ 13 C and differences in ocean basin means, or regional δ 13 C has been used as further evidence of changes in circulation since the middle Miocene and of the respective dominance of certain water masses (e.g. Cramer et al., 2011;Butzin et al., 2011). These studies emphasise the importance of changing bathymetry and the closing or opening of ocean seaways, with the Drake Passage (affecting the 60 Antarctic Circumpolar Current, ACC) and the Central American Seaway (affecting the North Atlantic) likely the most important determinants in global circulation patterns since the middle Miocene.
In this paper we aim to explore to what extent proxy data can constrain changing global climate and ocean circulation patterns in an Earth system model. We do not provide an exhaustive sensitivity study of Miocene seaways and the carbon cycle https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License.
here, but instead aim to combine constraints of temperature and ocean circulation to create plausible and self-consistent paleo 65 realisations of climate and ocean circulation, for each of seven time-slices spanning the middle Miocene to present (late Holocene). Our focus in this paper is hence on the model-data methodology and the self-consistency and plausibility of the outcome. For an overview of Miocene circulation and the effects of seaways on that circulation, e.g. see Butzin et al. (2011) and Sijp et al. (2014).

Model-data Methodology 70
We employ foraminifera proxy data for: surface ocean temperature, benthic ocean temperature, and benthic ocean δ 13 C, and compile this for seven time-slices, in order to constrain the applied climate forcing and ocean circulation in the cGENIE.muffin Earth system model from the mid Miocene to the present (late Holocene) ( Table 1). The model climate forcing we employ is in the form of atmospheric CO2, that is to be understood as an equivalent CO2 forcing that encompasses all atmospheric greenhouse gases (esp. methane). Ocean circulation is a function of surface ocean boundary conditions (esp. wind stress) and 75 climate (and the atmospheric CO2 forcing), ocean bathymetry, and the existence and nature of seaways and gatewaysall of which we adjust at each time-slice. We further introduce a variable to the model that alters the salinity transfer between North Pacific and North Atlantic -a classic 'flux adjustment'. This represents the effect of atmospheric moisture transport and the relative precipitation that East vs. West draining watersheds receive on the north American continental land mass, none of which can be reproduced well in the simple 2D energy-moisture balance model based approach employed in the version of the 80 cGENIE model we use (Edwards and Marsh, 2005;Marsh et al., 2011). This 'flux correction' results in an Atlantic Meridional Overturning Circulation (AMOC) being induced (and/or strengthened) in cGENIE and indeed is required in the standard present-day configuration (Ridgwell et al., 2007;Marsh et al., 2011) in order to rectify the aforementioned simplification in modelled atmospheric moisture transport and dynamics.
Three different paleo/proxy datasets were compiled: surface temperature, benthic δ 18 O and benthic δ 13 C. The surface 85 and sea-floor temperature data allows us to evaluate the model skill in reproducing ocean heat distribution, and hence the models simulated global-scale pattern of ocean circulation. It also provides a means of determining the atmospheric forcing (in terms of CO2) that produces surface and deep ocean temperatures that best match the data.
The isotope of carbon, carbon-13, has been used as a tracer for paleo ocean circulation for many years (Lynch-Stieglitz 2003). It is a stable isotope, heavier than carbon-12 and accounts for about 1% of all carbon on Earth. The ratio of carbon-13 90 to carbon-12, designated as "δ 13 C"the divergence from a standard in units of parts-per-thousand (‰), can be estimated for paleo ocean waters by measuring the δ 13 C in shells of foraminifera formed in those paleo water masses. Different water masses have characteristic δ 13 C signatures, which depend on biological activity (in the ocean carbon pumps) and critically for this study, also on ocean circulation patterns. Shells of dead foraminifera gradually accumulate on the ocean floor, thus providing a record of changes in water column chemistry over time at that location in the form of ocean sediments. In general, changes in δ 13 C seen in these records are interpreted as changes of water masses dominant in that location, and therefore as a means of reconstructing paleo ocean circulation.

Datasets
Published surface temperature data selected are those using either alkenones or TEX86 for all seven slices (Figure 1 for surface 100 temperature data locations and Supplementary material [A] for data) noting that these proxies, like any proxy, are subject to uncertainties and limitations (Richey and Tierney 2016). For slices 2.5Ma to 15Ma, published benthic δ 13 C and δ 18 O data for Cibicidoides and Planulina foraminifera species were selected, with at least 15 data-points for each time-slice, covering at least the Atlantic and Pacific basins (figure 1 for benthic data locations, figure 2 for δ 13 C data plotted by paleo-latitude, Supplementary material [B] for data). These species were selected so that temperature could be calculated from δ 18 O using 105 Marchitto et al. (2014). Final benthic temperatures calculated from δ 18 O take account of the effect of benthic water salinity on δ 18 O which is affected by ocean circulation (the temperatures in Table S2 are uncorrected for salinity). The method for this is described in detail in Appendix A; we use the models benthic salinity field as a correction to the calculated temperature from Marchitto et al (2014). A further dataset for benthic δ 13 C for the Holocene from Peterson et al. (2014) was applied for the Holocene time-slice. 110 Paleo-locations for each data point were found using the reconstructions in www.paleolocation.org (Urban and Hardisty 2013) that provide Zanclean (4.466 Ma), Tortonian (9.427 Ma) and Burdigalian (18.2 Ma) paleo-locations using PLATES reconstructions (University of Texas). Thereafter, the paleo-locations for each data-point at our time-slices (Table 1) were interpolated using these three points together with the present day location in a cubic model regression. To account for age model uncertainties, a window of ±1 million years was inspected for each data-point to ensure the data value was not 115 unrepresentative of the general value at that site around the age point. All the data points and their paleo-locations, and the ±million year time-series plotted for each data location are available in Supplementary Material, Fig. S1.
These three individual components and their coupling are described in Marsh et al. (2011) (and references therein). The basic physics parameter calibration of the climate model component is as per Cao et al. (2009) unless described otherwise (below).
For the modern Atlantic basin configuration (and, as per the findings of this paper, also for the last ca. 10 Ma), the lack of 125 atmospheric dynamics and explicit topography on land in the model requires that a freshwater flux adjustment is applied. This https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License.
is implemented by transferring salinity to the North Atlantic region from the North Pacific, as described in Edwards and Marsh (2005).
Our implementation of the cGENIE model includes a relatively complete description of the cycling of carbon and oxygen in the ocean plus exchange with the atmosphere, as described in Ridgwell et al. (2007) and Cao et al. (2009). In 130 addition, the carbon isotopic (δ 13 C) composition of all the carbon pools plus associated fractionations are represented, as described in Ridgwell et al. (2007) with additional description and evaluation in Kirtland Turner and Ridgwell (2016). Given the paucity of constraints on the evolving patterns of aeolian fluxes to the ocean surface, we omit a marine iron cycle and only a single nutrient, phosphate, potentially limiting to biological productivity in the ocean is included in our configuration.

Model boundary conditions 135
For this paper we create a series of new continental and surface boundary condition configurations of the cGENIE modelfor seven time-slices spanning the late Holocene through to mid Miocene. In the original modern configuration of the GENIE (and later cGENIE.muffin) model, ocean bathymetry, land-sea mask, and required ocean circulation files (defining 'islands' and circulation paths around islands) were derived from global topographic observations (ETOPO5) and the topography subsequently filtered (Edwards and Marsh, 2005). Because the atmospheric EMBM component lacks clouds and dynamics 140 (e.g. winds), further fields must be provided as fixed annual average boundary conditions. Firstly, a zonally-averaged planetary albedo profile is applied (as a simple cosine function of latitude in Edwards and Marsh (2005)). Secondly, fields for: (a) wind stress on the ocean surface, (b) wind velocity in the atmosphere, and (c) short-term wind speed, are derived from modern observations and applied to (a) drive surface ocean circulation, (b) transport heat and moisture in the atmosphere and for calculation of heat and moisture exchange between ocean surface and atmosphere, and (c) in calculating air-sea gas exchange, 145 respectively.
In this paper and as per in previous (deeper time) paleo applications of the cGENIE.muffin model (e.g. Ridgwell and Schmidt, 2010), rather than observations, we derive the required boundary conditions from a representative fully coupled GCM experiment. The software suite we employ to do this is called 'muffingen' and is hosted on GitHub (https://github.com/derpycode/muffingen). This software takes a specific GCM experiment as input and does the following: 150 1. Creates input (from GCM) and output (here: 36×36 with 16 levels in the ocean) grids.
2. Using (1), derives a land-sea mask and also ocean bathymetry on the output grid. The land-sea mask is lightly filtered to prevent the occurrence of isolated in-land seas and single-cell width coastal embayment, while the ocean bathymetry is filtered to avoid single cell 'holes' occurring in the ocean floor.
3. Generates drainage basins determining where precipitation and hence in which direction runoff is directed towards 155 the ocean. The specific scheme used here is known as a 'roofing scheme' and operates to create a watershed approximately equidistant from the coast. 4. Derives island and ocean paths files required by the ocean circulation model. 5. Re-grids GCM wind stress and (10 m) wind velocity to the output grid (which for wind stress, means re-gridding to both u and v edge grids). Wind speed is calculated from the mean annual wind velocity components. 160 6. Re-grids the GCM planetary albedo.
The GCM simulations underlying our cGENIE model configurations were carried out using HadCM3LM2.1aE, which is described in detail in Valdes et al. (2017). The models are constrained with paleogeographies and solar constant appropriate for each geological stage in the Miocene, and a CO2 mixing ratio of 400 ppmv. The experimental design is described in detail in Farnsworth et al. (2019). 165 For all configurations we generally follow the default re-gridding algorithm of the muffingen software in order not to impose prior assumptions regarding the importance of specific ocean features, meaning that for most of the reconstructed continental configurations, a Mediterranean Sea is not present in cGENIE. Only in the 12.5 and 15.0 Ma time-slices is the remnant Tethys Ocean sufficiently expansive to re-grid as an ocean basin at the selected 36×36 (16 levels) cGENIE model resolution. We do, however, make the following manual interventions in the generation of the land-sea mask (but not in ocean 170 bathymetry): We chose and calculate a zonally average (GCM-derived) planetary albedo profile rather than a 2D re-gridded one, in order to retain closer back-compatibility with the original GENIE configuration in which an idealised zonal profile is applied (e.g. Edwards and Marsh, 2005). Different GCMs average and save wind speed differently (or only as velocity vectors), so, 185 the final re-gridded wind speed product can differ substantially between GCMs and with modern observations. As a result, airsea gas exchange is rescaled in the late Holocene configuration in order to give a mean global and annual average modern airsea coefficient value for CO2 of approximately 0.058 mol m -2 yr -1 atm -1 . This same air-sea gas exchange scaling is then applied to all older time-slices. As compared to Cao et al. (2009), we also forego the high southern latitude zone of reduced atmospheric diffusivity, previously used for present-day model configurations (described further in Marsh et al., 2011). Initial 190 mean salinity was reduced by 1 PSU to 33.9 PSU in all time-slice configurations for simplicity and consistency (although we recognise that in reality mean ocean salinity should progressively decrease from a modern value of 34.9 back in time with https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License.
progressively decreasing global land ice volume). The orbital configuration was kept at its modern settings throughout all timeslices (0.0167 eccentricity, 0.397789 for sine of obliquity, 102.92 for the longitude of perihelion (in degrees)). However, we did vary the following model parameter values and initial conditions as a function of (geological) time (time-slice age): 195  The solar constant is assumed to change with time and to follow Gough (1981) (and see : Feulner, 2012), resulting in a small increase between 15.0 and 0 Ma, from 1366.09 Wm -2 (a reduction of 0.14% compared to modern) at 15.0 Ma, to 1368.0 Wm -2 by the late Holocene.
 The mean Mg/Ca ratio of the ocean is also assumed to change with time, following Tyrrell and Zeebe (2004). The

Model experimental design
The premise of our experiment design and study is that the observed distribution of ocean temperature proxies and benthic δ 13 C can be explained by a combination of climate state and the specific pattern of global ocean circulation. For this study, we ignore the strength of the biological pump as an additional and independent control on δ 13 C (but not temperature). In a 'perfect' paleo model (which does not exist) and under the correct greenhouse gas boundary conditions (which are poorly constrained), 210 climate and ocean circulation would be a correct and emergent property of the model and match the data. In cGENIE, we consider CO2 as a primary uncertainty in the modelnot only in terms of uncertainty in the real past value, but additionally in the radiative forcing required in cGENIE to generate appropriate temperatures. Additionally, because there is no prior expectation that for a 'correct' surface climate the emergent state of ocean circulation is at-all correct in cGENIE, we consider the salinity flux adjustment (hereafter 'FwF') as an 'unknown' and moreover, as a means-to-an-end in adjusting basin (to 215 global) scale patterns and strength of circulation (and specifically Atlantic overturning circulation). The aim in this methodology being to determine a specific circulation pattern and climate (temperature) state that best reproduces the data. Sv (Edwards and Marsh, 2005). Furthermore, because the land-sea mask progressively changes across the 7 time-slices, we create a common mask for the purpose of salt transfer by identifying all the grid points in the North Pacific and North Atlantic that are ocean ('wet' points) across all configurations. The total FwF value is then divided evenly across all (equal area) grid points in the North Atlantic (positive salinity input) and North Pacific (negative salinity input) such that global ocean salinity is always conserved. 230 For each ensemble member (48) and each time-slice (7) (a total of 336 model simulations), cGENIE is spun-up for 10,000 years until it reaches steady-state. In the absence of appreciable inter-annual variability and unlike in fully coupled GCM experiments, multi-decadal averaging is not necessary in the cGENIE model and we take the last annual average (year 10,000) of the simulation in order to carry out the model-data comparison.
The one caveat and complication to how we conduct the model-data comparison is that in running the parameter 235 ensembles, we identified sustained oscillations in global ocean circulation in many of the simulations that affected both temperature (several degrees) and δ 13 C (several tenths of a per mil). These oscillations were of varying period and magnitude and occurred mainly in the mid-range time slices at mid-range CO2 levels and mid-range flux corrections. Whilst this is extremely interesting from an ocean circulation and dynamics point of view (and will be followed up upon subsequently), it is not the focus of this study and we need to identify a consistent 'representative' climate state in the model in order to make the 240 model-data comparison. To create climate conditions that are representative of each CO2-flux correction combination we therefore parsed the cGENIE model output, identifying ensemble members characterised by self-sustained oscillations in their output. For these experiments, we further automatically identified the period of the oscillation (typically 2-4 kyr) and averaged the model output over one full period, starting the average from the end of the 10,000 year spin up and working back (towards the start of the model experiment). We thereby created a 'mean' annual average output for the affected ensemble members. 245 Both unmodified run-end and reconstructed mean annual averages were then treated exactly the same in terms of carrying out model-data comparison.

Model-data comparison
The datasets collated are used as a constraint to determine which combination of CO2 and North Atlantic salinity flux correction (FwF) produces the best-fit climate to data indicators. This is performed quantitatively by statistically comparing local model 250 output with the data points for all three datasets, and combining these to produce a final "best-fit" model setting for each timeslice. The statistical methods applied are: 1) the difference between the mean of the dataset and the mean of the model output at the data-locations, producing an offset value or mean bias, 2) an overall measure of goodness of fit of the model to the dataset known as "M-score" (Watterson, 1996) that compensates for model-data bias (i.e. bias is not considered in M-score).
Where several data-points are located within one model grid square, the mean of the data values is used to compare with the 255 model value. For the benthic data, each data-point is assumed to be on the ocean floor, so the model's bathymetry determines the data depth, and the data value is compared with the model value in the deepest water in that location.

Individual constraints
The latitudinal trend for δ 13 C data (dotted lines) in Fig. 2 shows that the Atlantic switches trend between 12.5 Ma and 10 Ma. 260 Prior to 12.5 Ma the North Atlantic data indicates a generally more negative δ 13 C than in the South Atlantica situation that reverses from 10 Ma onwards when more positive δ 13 C tend to occur in the North Atlantic. In the Pacific Ocean, the δ 13 C gradient from south to north tends to increase towards the present, with the strongest gradient seen at 2.5 Ma. Overall, δ 13 C values increase spread from 15 Ma towards the present.
Our compiled surface ocean temperatures provide a first order constraint on atmospheric CO2 (in terms of that required 265 in the cGENIE model to generate appropriate warming). The cGENIE model has a climate sensitivity matching that of the present day for all timeslices, if climate sensitivity was different in the past then this would affect the required CO2 forcing needed to achieve a certain global temperature (e,g, Bradshaw et al., 2015). The M-score measures how well the model reproduces these surface ocean temperatures. For each time-slice, the model ensemble is plotted as a grid (Fig. 4), with flux correction (FwF) on the x-axis, CO2 concentration on the y-axis and M-score plotted as contours. The closer the M-score is to 270 1, the better the model does at reproducing the surface ocean temperature. According to this measure, a CO2 forcing of 280 ppm at the Holocene, rising to 1600 ppm by 10 Ma, is suggested by tracking the trajectory of maximum M-score values. In older time slices, the maximum M-score reduces significantly, from over 0.8 for the Holocene to less than 0. For almost all slices, an elbow-type shape is present in the modelled benthic temperatures. As CO2 reduces for a given FwF, benthic T is either stable or even increases (Fig 5). At CO2 levels lower than this elbow, the cooler surface ocean favours the sinking of waters in high latitudes, supporting the thermohaline circulation. From 12.5 Ma this elbow turning point CO2 value is around 800 ppm and this gradually reduces to about 400 ppm in the Holocene (for settings in which model mean deep ocean temperature agrees with data). At CO2 higher than this elbow, there is only a very weak (or shallow) AMOC. 295 The highest M-scores for the older time slices (Fig. 6) do not agree with the surface temperature constraint for CO2 forcing (Fig. 4), with a CO2 of 560 ppm at 15 Ma with a high FwF suggested from benthic temperatures compared to a requirement of over 1000 ppm according to surface temperatures. The calculated benthic temperatures are dependent upon the local salinity and on global ice volume. In general, a higher global ice volume (than estimated by Cramer et al., 2011, Eq. 7a therein), means lower benthic temperatures as calculated from δ 18 O (Marchitto et al 2014, Eq. 9 therein). 300

Combining the constraints
All data constraints together are shown in Fig. 7, with the two temperature constraints (plus independent global mean benthic temperature estimate) mapped onto the statistical fit surface (M-score) of model vs. observed δ 13 C. We mark the best-fit selection with a star and include a range of estimates delineated by error bars on the best fit CO2 and FwF values. Note that 305 the best-fit is only identified from those specific values that we ran the model ensemble members for (i.e. we do not interpolate between settings). The model fit to benthic data δ 13 C are discussed as part of the best-fit selection.
The primary constraint on atmospheric CO2 in the model in this study is the proxy reconstructed surface ocean temperature. This is much less sensitive to the FwF value than the other, benthic ocean constraints. This is because surface ocean circulation patterns are largely dictated by the wind stress forcing, that we do not vary within any single time-slice. 310 Given a surface ocean circulation pattern that is relatively immune to changes in the applied FwF, the ensemble member CO2 value and with it the surface climate state control the mean and pole-to-equator gradient of SST and hence model-data surface temperature fit. For all the following analyses of the model-data fit, we hence start with consideration of SSTs and the choice of CO2 for each time-slice in turn, then bring in additional constraints and consideration of the value of FwF.
1. For the Holocene slice, we have a priori knowledge that CO2 at this warm interglacial period was around 280ppm 315 (Indermühle et al., 1999)a value which in cGENIE agrees with the surface ocean temperature data. This gives us some confidence in the cGENIE Earth system model and the methodology, but caveated by the fact that although we use here a new and different GCM-derived (Farnsworth et al., 2019) modern continental configuration, cGENIE has already previously been calibrated against present day observed ocean temperatures at an atmospheric CO2 value of ca. 280 ppm (e.g. Price et al., 2009;Ridgwell et al., 2007), and so an acceptable fit to SSTs for a CO2 value 320 of 280 ppm is not necessarily unexpected. The global benthic temperature estimate from Cramer et al. (2011) represents a mean climate for both glacial and interglacials, which is around 2 °C cooler than the warmer Holocene. flux correction of 0.1 to 0.2 Sv is required to fit the data (in comparison, the present-day calibrated value in cGENIE is 0.32 Sv). However, the benthic δ 13 C data constraints (Fig. 7) tends towards a higher M-score for a higher flux 325 correction, so we chose a best fit value of FwF 0.2 Sv.
2. For the 2.5 Ma slice, surface and benthic temperatures both suggests a higher (and larger range for a fit to) CO2 than at Holocene, so 400 ppm is selected as the best-fit CO2 value. Similar as in the Holocene, a higher flux correction (FwF) at 2.5 Ma tends to show a better model-data benthic δ 13 C agreement, although this constraint is less strong than the benthic temperature constraint (maximum M-score for benthic temperature is ~0.4, and for δ 13 C at 0.5Sv is 330 ~0.2). We hence select 0.3 Sv as the FwF value, on the higher end of the benthic temperature (δ 18 O) dataset, and in agreement with global mean benthic temperature from Cramer et al. (2011).
3. At 4.5 Ma, the surface temperature data supports a higher again CO2 value in cGENIE, but here the δ 13 C constraint is stronger (as compared to that at 2.5 Ma) and instead supports a lower-CO2-higher-FwF, as the M-score increases in this direction of parameter space (Fig. 7). We hence place the best-fit in this directionon the higher end for the 335 benthic temperature statistical fit range, and with CO2 at 400 ppm and a FwF value of 0.5 Sv.
4. At 7.5 Ma, matching surface-temperature in cGENIE requires a significantly increased CO2 compared to 4.5Ma, with 800 ppm as being clearly the best-fit CO2 value. A similar tendency (to 4.5 Ma) to lower CO2 but higher FwF is also apparent in the δ 13 C constraint for this time-slice, although at 800pm this is less strong (with a low M-score for 800 ppm and high FwF). The flux correction is set at 0.4 Sv, near the centre of the benthic temperature 340 maximum M-score (shown as a dashed blue contour). 5. For 10 Ma and earlier time slices the model-data M-score for surface temperature declines, but with higher scores for higher CO2 in all three cases. At 10 Ma, the benthic δ 18 O temperature dataset suggests a CO2 of 800 ppm and FwF of 0.3 to 0.4 Sv. The δ 13 C fit surface tends towards higher M-score values for higher flux, so we set FwF at 0.4 Sv. 345 6. The 12.5 Ma slice seems to show a transition state for the trends in δ 13 C, with overall low M-score for all combinations of CO2 and δ 13 C. As δ 13 C provides a weak constraint, the CO2 and FwF values are selected as 1120 ppm and 0.2 Sv, respectively, as a compromise between the high CO2 requirement for surface temperature and the lower-CO2 higher-FwF for the benthic temperature. 7. At 15 Ma the ensemble surface of M-score for δ 13 C is inversed, showing a higher score for (generally) higher CO2 350 and combined with lower FwF. Although we have fewer surface temperature data-points, they suggest a high CO2, somewhat in disagreement with the benthic temperature dataset that favours a mid-range CO2 together with higher FwF. As the δ 13 C constraint tends towards a lower FwF (compared to time slices younger than 10 Ma), we select a CO2 of 1120ppm and a low flux correction of 0.1 Sv.
All these selected settings and the ranges are summarised in Fig. 8. In general, in the older time slices, cGENIE underestimates 355 the north Atlantic temperature and this results in a generally higher M-score for the highest CO2 settings for 15, 12.5 and 10 Ma slices. The warm North Atlantic in the data compared to the model may be due to insufficiently simulated increased heat https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License. transport from the lower latitudes. However, benthic temperature patterns do not suggest that a particularly high FwF is appropriate at 15 Ma and 12.5 Ma at the higher CO2 levels. At 10 Ma, the stronger FwF results in increased sinking of the warmer surface waters to the deep N. Atlantic (Fig. 10), although the model still underestimates surface N. Atlantic temperature 360 according to data. This warm N. Atlantic in older time slices (Figure 9b N. Atlantic locations shown as grey triangles) cannot be explained by a stronger AMOC caused by a more saline surface N. Atlantic in the model. In all older time slices, the model high latitude temperatures tend to be too low compared to datathis is an established characteristic of warm climates, where climate models tend to struggle to reproduce the flatter latitudinal temperature gradients seen in data (Goldner et al., 2014).
Taking account of the deep ocean salinity in the temperature calculation from δ 18 O data makes a significant difference 365 to deep ocean temperature patterns (Figure 10b). When flux corrections are higher (mainly from 10 Ma), the saltier North Atlantic waters result in higher calculated temperatures, and data-points in shallower waters are most strongly affected.
Accounting for water salinity increases some data-points temperature by more than 3°C.
The picture for δ 13 C is more complex than for temperature as it represents the combined effects of ocean circulation and of the action of marine biota and the ocean biological carbon pump (in which respect we do not make any attempt to adjust 370 the model towards the data here). Overall, δ 13 C data show a larger range than modelled δ 13 C of DIC from 12.5 Ma to 2.5 Ma; at 15 Ma and at the Holocene the ranges are comparable (Fig. 11). At higher flux corrections (10 Ma to 4.5 Ma) the deep ocean is relatively lighter than at lower flux corrections (Holocene, 12.5 Ma and 15 Ma) with respect to the atmospheric value (Fig.   11). For 15 Ma to 12.5 Ma this is due to low ventilation of deep ocean waters, and for the Holocene due to the dominance of AMOC delivering north Atlantic surface waters to the deep. For 15 Ma and 12.5 Ma we would therefore expect a higher 375 surface to deep δ 13 C gradient (except for the north Atlantic), a pattern generally reversed in later times slices.
Modelled surface ocean temperature has fallen more than benthic ocean temperature since 15 Ma, with simulated best-fit declines of ~6 °C and ~3 °C respectively (Fig. 12). From at least 10 Ma onwards, a more saline surface North Atlantic (with higher FwF imposed) in conjunction with the specific configurations of the continents allows the sinking of warm salty waters and the delivery of surface heat to the deep. However, the absolute salinity of the surface north Atlantic reduces 380 consistently from 15 Ma to present (despite global mean salinity in the model being initialised at 33.9 PSU across all timeslices) (Fig. 12). Only the 2.5 Ma slice shows a more saline N. Atlantic than the previous time slice. The benthic N. Atlantic salinity shows a pattern similar to the imposed flux correction, but is also clearly dependent on other boundary conditions; CO2 and flux correction does not change between 10 Ma and 7.5 Ma but benthic N. Atlantic salinity does. Sea ice cover appears from 10 Ma, but with significantly more sea ice present from 4.5 Ma onwards when best fit CO2 in the model has dropped to 385 400ppm, with sea ice both increasing in area and thickening when CO2 drops further to 280ppm at the Holocene (Fig. 12).
Every ensemble member, regardless of the specific atmospheric CO2 assumption, was driven with a δ 13 CO2 value of -6.5‰. Comparing the offset of simulated benthic δ 13 C values to the data can hence give an estimate of how atmospheric δ 13 CO2 has changed through time. This is shown in Fig. 13 and represents the atmospheric δ 13 CO2 value that is required to best reproduce the mean benthic δ 13 C seen in data measurements. This diagnosed history of atmospheric δ 13 CO2 can in turn be used 390 https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License. as a means of identifying changes in the global carbon cycle (e.g. Hilting et al., 2008) or as initial condition values for future model (and model-data) based studies.

Climate forcing and CO2 levels
Climate modelling efforts of Miocene time have generally found that warming is insufficient compared to proxy data 395 for an atmospheric CO2 concentration around 400pm (Goldner et al., 2014;Bradshaw et al 2012;Krapp and Jungclaus, 2011). Greenop et al. (2014) find that CO2 concentrations of 300ppm to 500ppm are inconsistent with their findings of Antarctic ice sheet stability during the Middle Miocene Climatic Optimum (MMCO, at 17-14 Ma), the warmest part of the middle Miocene.
Estimates of Miocene CO2 in more recent studies have trended higher (MMCO at 470-630 ppm, Sosdian et al., 2018) compared to the earlier estimates (e.g., ~220ppm at 15 Ma, Pagani et al., 1999). Estimates of surface ocean temperature have also tended 400 to trend higher in more recent studies (Pearson, 2012). In both cases (CO2 and temperature estimates), this trend towards a warmer and higher CO2 was due to improved scientific methods and understanding (and the identification of new proxies); whilst terrestrial data has fairly consistently suggested this warm Miocene (e.g. Wolfe, 1985). However, considering only the most recent highest CO2 estimates there is still a discrepancy between the surface ocean temperature constraint (that requires a high CO2 in this study) and estimated CO2 levels from Sosdian et al. (2018). As an example, forcing cGENIE with CO2 levels 405 identified by Sosdian et al. (2018) for 10 Ma (~400ppm) results in a surface ocean that is around 4 °C too cold compared to the surface ocean temperature data. At older time-slices the discrepancy (when assuming 400 ppm throughout) increases further.
At 12.5 Ma and 15 Ma, we find our largest differences in model to data surface temperature in the N. Atlantic (Fig   9), with modelled N. Atlantic temperature being lower than sea surface temperature (SST) indicators. Part of this modelled 410 N.Atlantic surface temperature underestimate may be due to a latitudinal temperature gradient being too high in the model (as also identified in previous modelling efforts of past warm climates), with heat transport to the poles being too low. Increased ocean heat transport was found to reduce the latitudinal temperature gradient and the location of Hadley and Ferrel cells in a modelling study by Knietzsch et al (2015), indicating the importance of atmospheric circulation, which in the cGENIE 2D energy-moisture balance atmosphere is simplified and invariant. The interaction between vegetation and atmosphere was found 415 to produce a warmer Miocene independent of CO2 by Knorr et al. (2011). The interaction between vegetation and paleogeography produced a higher climate sensitivity in the modelled Miocene in Bradshaw et al. (2015) and warming independent of CO2 by Henrot et al. (2010) due to vegetation and lower-elevation topography. Land surface cover and increased northern transient eddy heat transport were found to result in polar warming in a modelled late Miocene in a fully coupled GCM by Micheels et al. (2010). Chemistry-climate feedbacks linked to vegetation differences compared to the present 420 were found to be as strong as or more important than CO2 forcing in the Pliocene by Unger and Yue. (2013). The significant changes occurring in global vegetation distribution since the middle Miocene  then may be critical to fully https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License.
reproducing observed SST patterns. The absence of any representation of vegetation and associated feedbacks in the version of cGENIE we employ here may then account for some of the difference in the CO2 we apply (e.g. 1120ppm at 15 Ma) and that estimated by Sosdian et al. (2018) (e.g. 470-630 ppm at 17-14 Ma). 425 A portion of the aforementioned CO2 difference may alternatively (or in tandem) be explained as the contribution of other green gases to climate forcing during the Miocene; in cGENIE only CO2 is imposed as a greenhouse gas, so the imposed atmospheric CO2 concentration should be understood as an 'equivalent' climate forcing. If the CO2 concentration at 15 Ma is as per estimated by Sosdian et al (2018) -630ppm (their higher estimate), that leaves an equivalent CO2 forcing of ~500ppm to be explained (as compared to the 1120ppm required in this study for the best-fit model setting at 15 Ma). The 500ppm 430 forcing equates roughly to 1000 GtC as CO2. If this equivalent forcing was attributable to methane (at 25 times the warming potential of CO2 at the 100 year timescale), methane concentrations would need to have been 10 times higher (than present) in the mid to late Miocene if they are to explain all the CO2-forcing difference between this study and Sosdian et al. (2018).
Wetlands are a significant source of atmospheric methane in the present day, as well as representing large terrestrial carbon stores. Extensive wetlands existed during the middle Miocene (Eronen and Rossner, 2007;Hoorn et al., 2010;Morley and 435 Morley, 2013), with the generally warmer wetter climate and lower elevation topography being favourable to their creation and persistence. These conditions would be conducive to methane production (Dean et al., 2018).
A large Antarctic ice sheet has been identified for the period soon after the Miocene climatic optimum (Hautvogel et al., 2012;Badger et al., 2013;Pierce et al., 2017). A drop in sea level associated with ice sheet growth would reduce overlying pressure in relatively shallow seas and may have destabilised gas hydrates globally. Gas hydrates were found to have likely 440 destabilised during the sea-level lowering of the Miocene in the present-day Appenines in Italy (Argentino et al., 2019) and in the Black Sea Basin (Kitchka et al., 2016). A release of gas hydrates, if it reached the atmosphere rather than oxidising to CO2 in overlying waters, would cause a drop in δ 13 CO2 that would also be reflected in δ 13 C (ocean) data. However, gas hydrate release (as methane) reaching the atmosphere would normally contribute to warming, which seems incompatible with evidence of large ice sheet growth and falling CO2 soon after the mid-Miocene. Further than this, methane's short lifetime in the 445 atmosphere would suggest that this would need to be a sustained gas-hydrate release (to maintain high atmospheric methane levels) with steady duration of the order of 1 Myr to explain any long-term difference between Sosdian et al (2018) CO2 estimate and the SST requirement for a warmer climate. It would also require mechanisms promoting ice growth in another way (not greenhouse gas driven cooling), a strong CO2 drawdown after 15Ma together with the maintenance of a high atmospheric methane levels. Atmospheric chemistry and baseline methane levels does affect the lifetime of atmospheric 450 methane (Schmidt and Shindell, 2003) but it is more likely that destabilisation of gas hydrates would act on shorter timescales (rather than sustained levels) perhaps influencing short term transient climate changes.
Overall, the combination of a flatter latitudinal temperature gradient, the contribution of other (than CO2) greenhouse gases to climate forcing, and the effect of vegetation and its distribution (which may be seen as part of climate sensitivity), may explain the discrepancy between our modelled CO2 in this study and the most recent data estimates (Sosdian et al., 2018). 455 However, study of Miocene CO2 levels is still ongoing and the question remains to be definitively resolved.

Benthic Temperatures
The pattern of temperature in the deep ocean arises as a combination of the pattern of ocean surface temperature (and surface climate in general) and the large-scale circulation of the ocean. In this study, we have shown that net salinity transport into the N. Atlantic region can induce and strengthen a meridional overturning circulation that in turn drives a warming of the deep 460 ocean, independent of atmospheric CO2 forcing. This helps explain a seeming decoupling of surface and deep water total cooling seen in the data since the middle Miocene, where surface temperatures have cooled more than deep ocean temperature according to data. Further, the benthic temperatures that we derive from benthic δ 18 O data are dependent upon the global ice volume that we assume (Marchitto et al., 2014), these temperature estimates would be different for differing global ice volumes. For example, at 15 Ma, a lower global ice volume assumption than the one we apply would bring δ 18 O-based benthic 465 temperatures more in line with the Mg/Ca based temperature estimate from Cramer et al. (2011) (Fig.7). Finally, local salinity has a strong control on δ 18 O; an increase in measured benthic δ 18 O in the N. Atlantic during the late Miocene that may be interpreted as evidence of a strong cooling, may actually be attributable to the increased salinity of the deep sea water, where salinity (rather than temperature) dominates the δ 18 O signal recorded in the benthic foraminifera. With the onset of Atlantic overturning circulation during the Miocene, the salinity of deep N. Atlantic waters has a strong control on δ 18 O, and when 470 included in the temperature calculation results in increases of up to 3°C in some locations (compared to the temperature uncorrected for salinity) (Fig. 10).

Using carbon-13 to trace ocean circulation patterns
Changes in deep ocean circulation patterns and specifically the aging of water masses and progressive accumulation of isotopically light respired carbon, in theory should be identifiable in global patterns of δ 13 C data from benthic foraminifera. 475 However, the processes that control δ 13 C are complex, involving both circulation and ocean carbon pumps, which furthermore are not independentlarge-scale changes in circulation that affect nutrient return to the surface will hence also modulate the strength of the biological pump in the ocean. Changing ocean interior temperature patterns may further influence where carbon is respired via a temperature control on the rate of carbon respiration (John et al., 2014), further affecting δ 13 C. For instance, we simulate a flatter (smaller range of) deep water δ 13 C than data suggests for time-slices between 10 Ma and the 2.5 Ma, but 480 a comparable range (between model and data) for 15 Ma and the late Holocene (Fig. 11). The similar ranges at these two extremesone the warmest and the other the coolestsuggest that the issue in between is not primarily with the model representation of the carbon pumps, but with the Central American Seaway (CAS) that may be too deep in the intermediate time-slices. A deep CAS allows the flow of less recently ventilated Pacific waters into the Atlantic basin in the model and reduces the δ 13 C gradient between these two ocean basins. The time slice with the closed CAS (Holocene) shows a model 485 range and values of δ 13 C in general agreement with the data for the Pacific and the Atlantic basins (Fig. 11), and a closed or restricted CAS in the 10 Ma to 2.5 Ma slices may improve the models fit to data. This tends to support the hypothesis of an early restriction/closure of CAS, as the increasing N. Atlantic to Pacific δ 13 C gradient is evident from 10 Ma (Fig. 2). However, https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License.
it should be noted that the distribution of δ 13 C is also dependent on other factors (such as local water velocity fields and vertical stream functions, as well as effects of the biological pump that affect δ 13 C). 490

North Atlantic salinity and CAS
In our model set-up the CAS allows some mixing of Atlantic and Pacific waters at every time-slice except the Holocene. The flux correction in the N. Atlantic that we apply may be seen as a combination of compensating for a more restricted (or closed) CAS, and also for the precipitation gradient across N. America which would be affected by precipitation patterns and topography. With a warmer climate and higher CO2, the atmosphere is able to hold more water vapour and the result is a 495 globally wetter climate. However, the North American plains saw the rise of grasslands that favour a drier climate (Janis et al., 2002) through the Miocene, linked to the uplift of Mountain ranges in the west that created a rain shadow on the central plains.
Over the last 12 million years in the Sierra Nevada a rain shadow similar to present was identified by Mulch et al. (2008), indicating mountain uplift in this area occurred prior to 12 Ma. Further north in the Cascade Mountains high uplift and erosion rates were dated to late Miocene (12 to 6 Ma), and the Coast Mountains and British Colombia uplift from 10 Ma (Reiners et 500 al. 2002). We would expect therefore the proportion of a flux correction for rainfall to be higher from around 12.5 Ma to 6 Ma to take account of this mountain building.
A wash-house (warm and wet) climate was identified in Europe in certain periods of the Miocene, from 10.2 Ma to 9.8 Ma and from 9.0 Ma to 8.5 Ma, which appeared to be linked to deep N. Atlantic Ocean temperatures (Boehme et al., 2008).
This was in-turn linked to a possible temporary restriction of the CAS and greater northward heat transport. A middle Miocene, 505 even if temporary (Jaramillo et al., 2017), closure of the CAS was identified by Montes et al. (2015). We identify a transition between 12.5 Ma and 10 Ma, and a high flux correction (FwF = 0.4 Sv) for the 10 Ma time-slice which may be representative of a CAS restriction.
Our modelled global overturning circulation shows a gradual increase in dominance of northern hemisphere deep water, and the development of the AMOC (Fig. 14). Generally in cGENIE, the higher the CO2 forcing the weaker the AMOC, 510 with a strong and deep AMOC only present in the Holocene time-slice. The CAS remains open for all the time-slices except Holocene in this set-up and this may be preventing the establishment of a stronger AMOC in the model at (for example) 2.5 Ma, when the CAS is thought to have already been fully closed. A shallow (but not closed) CAS still allowed the formation of deep water in the North Atlantic in a GCM model study by Nisancioglu et al. (2003), and our CAS at ~1000m does not fully impede the establishment of a shallow AMOC (if the surface north Atlantic is sufficiently saline). An early Pliocene (4.7 to 515 4.2 Ma) shoaling of CAS was found to have no profound impact on climate evolution, as North Atlantic deep water formation was found to be already vigorous by 4.7 Ma by Bell et al. (2015). Vigorous North Atlantic deep water formation appears to have started by 4.5 Ma in our study (Fig. 14), when CO2 drops to 400ppm and we increase N. Atlantic salinity (flux correction FwF at 0.5Sv). A definitive closure of the CAS dates to the late Pliocene (~3Ma, O'Dea et al., 2017). This later final Pliocene closure of CAS had thought to be linked to setting up conditions for northern hemisphere glaciation, although Lunt et al. (2008Lunt et al. ( ) 520 https://doi.org/10.5194/cp-2019 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License.
found that an open or closed CAS made little difference to ice sheet size in their model study. Sea ice growth intensifies in our model when CO2 drops further to 280ppm in the Holocene.
Earlier studies placing Miocene CO2 levels fairly low, around pre-industrial levels or lower (Pagani et al., 1999;Pearson and Palmer 2000), led research on finding other explanations for the global growth of ice sheets and the onset of glacial cycles since the middle Miocene. With newer estimates for CO2, a more complete picture emerges, with the combined 525 effects of CO2 drawdown and ocean seaways influencing global climate. We find that an early CAS restriction probably fits δ 13 C data better, and that the draw-down of CO2 and cooling of polar regions has a more significant role in global circulation patterns and climate (than a late CAS restriction) from the Pliocene onwards.

Conclusions and summary
In this study, we used proxy data estimates for both surface and benthic temperature (δ 18 O) as well as benthic δ 13 C to constrain 530 the evolution of atmospheric CO2 and large scale ocean circulation in the cGENIE.muffin Earth system model. We created plausible climatic states in cGENIE for each of 7 time-slices spanning the mid Miocene to Holocene cooling. Constrained by changes in the absolute magnitude and pattern of benthic δ 13 C, we also diagnosed a plausible history of atmospheric δ 13 C over this time interval for use as boundary conditions in future modelling studies, or as a data target for assimilation in geochemical box-models. 535 In the cGENIE model, we diagnose a progressive reduction in atmospheric greenhouse gas forcing since the mid Miocene, driving global cooling. Simultaneously, we diagnose a gradual strengthening of overturning circulation in the Atlantic that transports heat to the deep Atlantic ocean over the cooling period and leading to a stronger cooling in surface (at ~6°C) than in deep waters (at ~3°C) occurring since the middle Miocene. This onset and strength of the AMOC in cGENIE is controlled by the combined effects of progressive restriction of the Central American Seaway together with our two free 540 parametersa salinity adjustment (representing mountain building in N. America and an increasing Atlantic-Pacific salinity gradient), and a declining atmospheric CO2. Declining CO2 drives cooling which helps to promote the sinking of salty waters in the N. Atlantic, despite surface salinity actually also declining over the cooling period in the model. The net result in the model is a strong and deep AMOC in the Holocene when the CAS is fully closed and atmospheric CO2 is low.

Appendix A. Correcting benthic Temperature calculated from δ 18 O for local salinity 545
The Cramer et al. (2011) δ 18 Osw estimate for sea water is a global mean value. Applying the Marchitto et al. (2014) paleotemperature calculation using one global mean δ 18 Osw therefore results in an uncertainty in temperature depending on location (Fig. A1) due to local differences in δ 18 Osw values (Fig. A2).
The seawater δ 18 Osw is determined by global ice volume (which we get from Cramer et al., 2011), local temperature and local salinity (Rohling, 2013). In the present day with an active AMOC, the North Atlantic benthos have a more positive 550 https://doi.org/10.5194/cp-2019-151 Preprint. Discussion started: 14 January 2020 c Author(s) 2020. CC BY 4.0 License. δ 18 Osw than, for example, the North Pacific. This is due to both temperature and salinity, with the salty North Atlantic waters transported to the deep by the AMOC. In our model ensembles the benthic salinity is affected by both CO2 and flux correction as these affect ocean circulation. Therefore, for each simulation we apply a δ 18 Osw driven correction to the paleo temperatures due to their local salinity. To do this, we use present day deep-water (2500m) δ 18 Osw (LeGrande and Schmidt, 2006) and salinity (WOA 2013, Zweng et al., 2013 to create a general linear model (Eq. A1), where S is salinity. 555 The North Atlantic is the region which is most affected by changes in salinity, and therefore the greatest temperature offset will be in this location. We adjusted the salinity model to get best-fit values for the north Atlantic, and a good fit for the difference in δ 18 Osw between the north Atlantic and the Pacific. All ocean locations with δ 18 Osw between -0.3 and 0.3‰ are shown in a crossplot of data and salinity-model derived δ 18 Osw (Fig. A3). The grouping of points clearly offset from the 1:1 560 line are the high southern latitudes (also clearly visible in Fig. A2). The accuracy of the salinity-derived model in finding δ 18 Osw is ±0.03‰ at one standard deviation, and ±0.06‰ at the 95% confidence level excluding latitudes higher than 70°.
The paleotemperature equation we apply is the linear model from Marchitto et al. (2014, Eq. 8 therein), using the global ice volume estimate from Cramer et al. (2011). The δ 18 Osw model from salinity is also linear so we apply a simple linear correction to the calculated temperature (that shown in Supplementary Table S2). To obtain the δ 18 Osw offset, we first find the 565 modelled global mean salinity (not including latitudes higher than 70°), and subtract this from the modelled benthic salinity, giving a ∆S field (an offset from the mean). We apply Eq. A2 to find ΔT, where 0.8 is the gradient of the linear δ 18 Osw model (Eq. A1) and 0.224 is the gradient of the linear paleotemperature model (Marchitto et al., 2014), and use this to correct T for salinity. The uncertainty for the temperature correction is ±0.13°C at one standard deviation and 0.23°C at the 95% confidence level (based on the δ 18 O model). The range of temperature correction corresponding to the benthic δ 18 Osw spread (of ~0.4‰) 570 in the present day is 1.8°C.

Model code availability
The specific version used of the cGENIE.muffin model used in this paper is tagged as release v0.9.8, and has been assigned a 575 DOI: 10.5281/zenodo.3575063 The code is hosted on GitHub and can be obtained by cloning: https://github.com/derpycode/cgenie.muffin and then changing the directory to cgenie.muffin and checking out the specific release: $ git checkout v0.9.8 580 Configuration files for the specific experiments presented in the paper can be found in the subdirectory: