Late Pliocene lakes and soils: a global data set for the analysis of climate feedbacks in a warmer world

The global distribution of late Pliocene soils and lakes has been reconstructed using a synthesis of geological data. These reconstructions are then used as boundary conditions for the Hadley Centre General Circulation Model (HadCM3) and the BIOME4 mechanistic vegetation model. By combining our novel soil and lake reconstructions with a fully coupled climate model we are able to explore the feedbacks of soils and lakes on the climate of the late Pliocene. Our experiments reveal regionally confined changes of local climate and vegetation in response to the new boundary conditions. The addition of late Pliocene soils has the largest influence on surface air temperatures, with notable increases in Australia, the southern part of northern Africa and in Asia. The inclusion of late Pliocene lakes increases precipitation in central Africa and at the locations of lakes in the Northern Hemisphere. When combined, the feedbacks on climate from late Pliocene lakes and soils improve the data to model fit in western North America and the southern part of northern Africa.


Background
The late Pliocene (Piacenzian: 3.6-2.6 Ma) is the most recent geological time period of considerable global warmth, before the onset of the glacial-interglacial cycles of the Pleistocene (Dowsett et al., 1992, 1994; Haywood et al., 2011b; Salzmann et al., 2011). As it is, geologically speaking, relatively recent, it represents a recognisable world (in terms of its geography, orography and bathymetry) in which aspects of the Earth's climate can be explored through proxies and modelling studies to better understand the feedbacks, processes and impacts of sustained global warmth (Dowsett et al., 1996; Salzmann et al., 2009). The focus on the late Pliocene palaeoclimates has been driven by the PRISM (Pliocene Research Interpretations and Synoptic Mapping) project of the US Geological Survey. PRISM3 (the third iteration of the PRISM project, which includes a three-dimensional ocean reconstruction) provides palaeoenvironmental reconstructions of the Pliocene world, from geological data, in a form suitable for use in climate modelling studies. With the availability of boundary conditions (aspects of the world required to initialise climate modelling experiments) from PRISM, it has been possible to undertake meaningful climate modelling studies to explore Pliocene climates (e.g. Chandler et al., 1994; Sloan et al., 1996; Haywood et al., 2009; Lunt et al., 2012). Building on single model studies of the Pliocene, PlioMIP (Pliocene Model Intercomparison Project) has brought together ten different climate modelling groups to simulate identical experiments and investigate not only the climate of the Pliocene but also inter-model variability and uncertainty (Haywood et al., 2011a; Dowsett et al., 2012, 2013; Salzmann et al., 2013). PlioMIP uses the palaeoenvironmental reconstructions from PRISM3, which include palaeogeography, orography, bathymetry, vegetation, ice sheet configuration and oceanic temperatures.
However, there is currently no information on late Pliocene global soils or lakes. In PlioMIP experiments 1 and 2, global soils were specified in a manner consistent with the vegetation, or they were kept as modern (Contoux et al., 2012). Lakes were specified as absent and not included in any of the PlioMIP experiments. In this paper we present global data sets and palaeoenvironmental reconstructions of global late Pliocene soils and lakes. These are intended to be incorporated into the PRISM4 Pliocene global reconstruction and future PlioMIP experimental design.

The importance of soils and lakes in palaeoclimate studies
Albedo-related soil and vegetation feedbacks are key uncertainties in the Earth System and climate models differ considerably in estimating their strength (e.g. Haywood and Valdes, 2006; Knorr and Schnitzler, 2006). For the terrestrial realm, large inland water bodies and wetlands have also been shown to significantly affect surface temperatures and energy balance in past and present climate systems (e.g. Sloan, 1994; Delire et al., 2002; Sepulchre et al., 2009; Burrough et al., 2009; Krinner et al., 2012). Studies of the African Humid Period in the Holocene have found that lakes and wetlands contribute to the "greening of the Sahara" by increasing regional precipitation (Krinner et al., 2012). A similar increase in regional precipitation was found for the late Pleistocene Lake Makgadikgadi in the middle Kalahari (Burrough et al., 2009). In deeper time palaeoclimate studies, Sloan (1994) found that, when simulating the early Eocene of North America, the addition of a lake had as much impact on the climate of the continental interior as the 1680 ppmv CO2 in the model's atmosphere. The addition of a modest lake deflected the winter freezing line north and improved the data-to-model fit for winter temperatures (Sloan, 1994). Soil albedo has also been shown to have a large impact on regional precipitation (Knorr and Schnitzler, 2006). A series of climate model experiments on the mid- to late Holocene showed that soil albedo in the Sahara had a larger effect on regional precipitation than orbital forcing and sea surface temperatures (Knorr and Schnitzler, 2006). Other experiments have shown that wetter and darker soils in the mid-Holocene Sahara would have facilitated the northward movement of the African monsoon, creating a positive feedback (Levis et al., 2004).
Current palaeoclimate modelling studies of the late Pliocene often struggle to generate sufficient precipitation, particularly in the semi-arid and arid tropical and subtropical regions, to match proxy data (Salzmann et al., 2008; Haywood et al., 2009; Pope et al., 2011). As some studies have shown that lakes and soils have had significant regional impacts on mid-Holocene precipitation (e.g. Knorr and Schnitzler, 2006; Krinner et al., 2012), it stands to reason that similar effects could be seen in the late Pliocene. Conversely, recent work focussing on the Megalake Chad region during the late Pliocene did not show significant increases in precipitation. However, Holocene palaeoclimate studies benefit from comprehensive published data sets of soils and lakes (e.g. Hoelzmann et al., 1998), and up to present no such data have been available for late Pliocene climate model studies. In this paper we present the first global data sets of late Pliocene soil and lake distributions, and these data sets have been transformed into climate model boundary conditions suitable for exploring the feedbacks of soils and lakes in a warmer world. The data sets provide previously missing boundary conditions for late Pliocene palaeoclimate modelling studies. We present the initial results of the first late Pliocene palaeoclimate model studies using the new realistic soil and lake boundary conditions. Finally, we compare BIOME4 (Kaplan, 2001) output, from our new simulations, to the global vegetation database of Salzmann et al. (2008, 2013) to qualitatively evaluate data-model similarity.
Fig. 1 (caption, partially recovered): ... Table S1), whilst lake data have the prefix L (Supplement Table S2).

Construction of the lake and soil database
Late Pliocene lake and soil data have been collected and synthesised into an internally consistent format using a Microsoft Access-ArcGIS database that is based on the vegetation database TEVIS (Salzmann et al., 2008). The soil and lake data have been compiled from published literature: soil data (Fig. 1; Supplement Table S1) are based upon palaeosol occurrences (e.g. Mack et al., 2006), whereas evidence for lakes (Fig. 1; Supplement Table S2) comes from sedimentology (e.g. Müller et al., 2001), dynamic elevation models and topographic studies (e.g. Drake et al., 2008), fauna (e.g. Otero et al., 2009) or a combination of these (e.g. Adam et al., 1990). Both the soil and lake data are recorded with a latitude-longitude (for lakes this represents the centre), a maximum and minimum age in millions of years (Ma) and the method used to date the deposit. The documented soil data also include a soil type, which is based upon the soil orders (Soil Survey Staff, 1999). The lake data also record an estimated surface area extent, the shape of the lake and, for lakes with a surface area greater than 1500 km², the latitude-longitude of its northern-, eastern-, southern- and westernmost points. In addition, any reported information on water chemistry, details of inflows and outflows, or whether the lake was ephemeral has also been recorded. Full details of the databasing methodology and the data sets are available in the Supplement.

Preparing the data for inclusion in a climate model
From the geological data recorded in the late Pliocene lakes and soil database we have produced three maps to allow the inclusion of lakes and soils in palaeoclimate modelling experiments. The three maps are a global soil map, which is accompanied by a table providing preferred soil characteristics (Fig. 2a; Table 1); a dry-lakes scenario (Fig. 2b); and a wet-lakes scenario (Fig. 2c). All maps use a grid cell size of 2.5° latitude × 3.75° longitude; this equates to a spatial resolution of 278 km × 417 km at the Equator.
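The conversion from grid spacing in degrees to distance at the Equator can be checked with a short calculation. This is a minimal sketch that assumes a spherical Earth with a mean radius of 6371 km; the function name is illustrative and not part of any model code.

```python
import math

# Assumption: spherical Earth with a mean radius of 6371 km.
EARTH_RADIUS_KM = 6371.0


def degrees_to_km_at_equator(degrees: float) -> float:
    """Arc length in km spanned by `degrees` along a great circle."""
    return math.radians(degrees) * EARTH_RADIUS_KM


lat_km = degrees_to_km_at_equator(2.5)   # north-south extent of a cell
lon_km = degrees_to_km_at_equator(3.75)  # east-west extent at the Equator
print(f"{lat_km:.0f} km x {lon_km:.0f} km")  # prints "278 km x 417 km"
```

The east-west extent shrinks with the cosine of latitude, so the 417 km figure applies only at the Equator.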
To develop a global late Pliocene soil map ( Fig. 2a) from the 54 palaeosol occurrences recorded in the database (Supplement Table S1), we combined the soil data with the Piacenzian biome reconstruction presented in Salzmann et al. (2008). This allowed a soil type to be assigned to each model grid cell even if late Pliocene palaeosol data had not been reported from that region. This technique of combining a realistic biome reconstruction with palaeosol data uses the knowledge that at a global scale the distribution of each soil order mirrors certain vegetation biomes (Soil Survey Staff, 1999). When the palaeosol data were combined with the vegetation reconstruction, there were no mismatches between a palaeosol occurrence and a biome that we did not expect to be associated with that soil order.
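The gap-filling logic described above can be sketched as a lookup from biome to expected soil order, with reported palaeosols taking precedence. The biome names, the dictionary contents and the function name below are illustrative assumptions covering only a small subset of associations; they are not the full scheme used for the reconstruction.

```python
# Illustrative subset only (assumption): biome -> soil orders expected with it.
# The first entry is used as the default when no palaeosol is reported.
EXPECTED_SOILS = {
    "tundra": ["Gelisol"],
    "boreal forest": ["Spodosol"],
    "temperate forest": ["Alfisol"],
    "grassland": ["Mollisol"],
    "savanna": ["Mollisol"],
}


def assign_soil_orders(biome_by_cell, palaeosol_by_cell):
    """Assign a soil order to every grid cell and list unexpected pairings."""
    soil_by_cell = {}
    mismatches = []
    for cell, biome in biome_by_cell.items():
        expected = EXPECTED_SOILS.get(biome, [])
        observed = palaeosol_by_cell.get(cell)
        if observed is not None:
            # Geological evidence takes precedence over the biome-based guess.
            soil_by_cell[cell] = observed
            if observed not in expected:
                mismatches.append(cell)
        else:
            soil_by_cell[cell] = expected[0] if expected else None
    return soil_by_cell, mismatches


# Hypothetical three-cell example keyed by (row, column).
biomes = {(0, 0): "tundra", (0, 1): "grassland", (1, 0): "temperate forest"}
palaeosols = {(1, 0): "Alfisol"}
soils, mismatches = assign_soil_orders(biomes, palaeosols)
```

An empty `mismatches` list corresponds to the situation reported here, where no palaeosol occurrence conflicted with the reconstructed biome.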
Both the late Pliocene dry-lake scenario and wet-lake scenario are based upon the estimated surface area of the palaeolakes, translated into a percentage of a model grid cell (Fig. 2b, c). A wet and a dry scenario have been generated to compensate for the uncertainty in the dating of many of these features, which often cover several orbital cycles. By producing a dry- and a wet-scenario map it is possible for climate modelling experiments to explore the impacts of late Pliocene lakes in a warm-wet climate period or a cold-dry climate; this follows the approach taken in previous vegetation work. Late Pliocene lake surface areas have been either taken from the published literature or calculated from published estimates of lake extent. These were then translated into percentages of grid cells by calculating how much of a grid cell would be occupied by each lake. Where megalakes occupied more than one grid cell, the geographic distribution of the lake was based upon the published shape and the distalmost latitude-longitude points of the reconstructed lake.
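The translation of a lake surface area into a percentage of a grid cell can be illustrated as follows. This is a minimal sketch assuming a spherical Earth and a lake that falls entirely within one 2.5° × 3.75° cell; the function names and the example latitude are hypothetical, and megalakes spanning several cells would additionally need the published lake shape.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: spherical Earth


def grid_cell_area_km2(lat_centre_deg, dlat_deg=2.5, dlon_deg=3.75):
    """Area of a latitude-longitude cell on a sphere."""
    lat_south = math.radians(lat_centre_deg - dlat_deg / 2)
    lat_north = math.radians(lat_centre_deg + dlat_deg / 2)
    dlon = math.radians(dlon_deg)
    return EARTH_RADIUS_KM ** 2 * dlon * (math.sin(lat_north) - math.sin(lat_south))


def lake_percentage_of_cell(lake_area_km2, lat_centre_deg):
    """Percentage of one grid cell covered by a lake, capped at 100 %."""
    cell_area = grid_cell_area_km2(lat_centre_deg)
    return min(100.0, 100.0 * lake_area_km2 / cell_area)


# Hypothetical example: a 4800 km² lake in a cell centred on 37.5° N.
example_percentage = lake_percentage_of_cell(4800.0, 37.5)
```

Because cell area decreases towards the poles, the same lake occupies a larger share of a high-latitude cell than of an equatorial one.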

Uncertainties in reconstructing soils and lakes from geological data
This study is the first to present realistic late Pliocene soil and lake maps derived from the synthesis of geological data (Fig. 2). Despite these maps being the current state of the art it is important to discuss the uncertainty involved in them. For the global soils reconstruction the greatest uncertainty comes from the limited geographic distribution of data ( Fig. 1) and the reliance on the global biome reconstruction (Salzmann et al., 2008) to fill in the gaps. However, there were no soil data points coinciding with a vegetation type they would not normally be associated with, so we can have confidence in our methodology to generate a global map. One discrepancy between the late Pliocene reconstruction and the modern global soil order map is that the late Pliocene does not have any regions of Inceptisols or Entisols (Fig. 2a). Today these two soil orders make up about 30 % of the land surface and represent undeveloped or moderate pedogenic development (Soil Survey Staff, 1999). Producing a global soil order map for the late Pliocene requires the incorporation of many distinct palaeosol occurrences, and most of these are preserved as a pedogenically developed soil order, rather than preserved as an undeveloped or moderately developed soil (e.g. Gürel and Kadir, 2006). Where inceptisol palaeosols are preserved they are commonly associated with a fully developed soil order (e.g. Mack et al., 2006). Further to their limited geological preservation, Inceptisols and Entisols are not commonly associated with particular vegetation types, being a product of limited soil development rather than ambient climate and biome (Soil Survey Staff, 1999). This made them near impossible to plot on a late Pliocene soil map with the available data and methodology. Inceptisols and Entisols were therefore omitted from the reconstruction, but it should be noted that they were not absent during the Pliocene (Sangode and Bloemendal, 2004;Mack et al., 2006). 
It is difficult to assess the likely impacts on albedo and texture that the addition of Inceptisols and Entisols may have had during the late Pliocene. As these soil orders have limited pedogenic development, they are more intimately tied to their parent material than other soil orders. This could mean that Inceptisols and Entisols could have had any combination of albedo and texture depending on late Pliocene surface geology. The late Pliocene biome reconstruction from Salzmann et al. (2008) is a hybrid map combining 240 palaeobotanical data sites with a "best fit to data" BIOME4 output (forced by HadAM3-predicted late Pliocene climate), which were merged using expert knowledge. Although this means that limited regions of the biome reconstruction do rely more on model predictions than real data, the overall product is primarily based on an exhaustive database of late Pliocene plant fossil localities. Developing a global late Pliocene soil map with only 54 palaeosol localities required either extensive interpolation (with all the possible errors that may have come with that) or the use of another data set (the hybrid biome reconstruction) and the knowledge that most soil orders (with the exception of Inceptisols and Entisols) are related to particular vegetation types.
The reconstructed late Pliocene lakes represent a synthesis of the published geological data. What is most obvious is the vast areas with no percentage of the grid cell covered by surface water (Fig. 2). This does not imply that these regions were without any surface water, only that there are no published records of lake sediments with an estimated lake extent. However, it is highly likely that during wetter climates in the past many presently arid regions were covered with river systems, as has been shown for the Sahara during the Eemian (MIS 5e) and Holocene (e.g. Hoelzmann et al., 1998; Coulthard et al., 2013). With continued research into late Pliocene lake faunas, floras and sediments, more evidence for surface water is likely to emerge from these regions. Of the lakes presented in the reconstructions of this study, the one with the greatest uncertainty is the substantial Megalake Zaire (Fig. 2). This megalake has long been postulated because all the major tributaries of the Congo River are orientated towards the centre of the basin (Summerfield, 1991; Goudie, 2005) and because a submarine canyon is present rather than a delta (Cahen, 1954; Peters and O'Brien, 2001; Goudie, 2005). However, it has not been ground-truthed with recent geological data and this should be a priority (Peters and O'Brien, 2001). For further discussion on the uncertainties surrounding the reconstructions of late Pliocene soils and lakes please see Supplement 1.

Modelling
The potential effects of the new lake and soil databases on the Pliocene climate were tested using modelling simulations with the UK Hadley Centre General Circulation Model (GCM), HadCM3. This is a coupled atmosphere-ocean GCM described by Gordon et al. (2000) and Pope et al. (2000), with horizontal resolution of 3.75° × 2.5° in the atmosphere and 1.25° × 1.25° in the ocean. The atmospheric component has 19 levels in the vertical and 30 min time steps, while the oceanic component has 20 levels in the vertical and 1 h time steps.
To investigate the impacts of realistic soil and lake distributions on the late Pliocene climate we analyse the results of five simulations: a control simulation of 850 yr (PRISM3 control), a simulation with late Pliocene lake levels from the wet-lakes scenario (PRISM3 + wet-lakes scenario) (Fig. 2c), a simulation with late Pliocene lake levels from the dry-lakes scenario (PRISM3 + dry-lakes scenario), a simulation with late Pliocene soils (PRISM3 + soils) and a simulation with soils and the wet-lakes scenario (PRISM3 + soils + wet-lakes scenario). The PRISM3 + wet-lakes scenario, PRISM3 + dry-lakes scenario, PRISM3 + soils, and PRISM3 + soils + wet-lakes scenario simulations were all started 500 yr into the PRISM3 control simulation and were run for a further 350 yr. This is sufficient to spin up all atmosphere and vegetation parameters of interest (Hughes et al., 2006). The control experiment comprises boundary conditions from PRISM3; these include a near-modern orography (except for areas of the Andes) and a reduced Greenland Ice Sheet (for full details please see Dowsett et al., 2010; Haywood et al., 2010). Although the preferred boundary conditions for PRISM3 are to remove the West Antarctic Ice Sheet, here we utilise the alternative PRISM3 boundary conditions, which remove all ice from the West Antarctic Ice Sheet and reduce the topography to sea level. The control experiment has a modern orbit and CO2 levels of 405 ppmv. The initial vegetation patterns for the control run were prescribed from PRISM3; however, the version of HadCM3 used here comprises the MOSES2.1 land surface scheme and the TRIFFID dynamic vegetation model (Cox et al., 1999; Cox, 2001) such that vegetation dynamically changes with the climate, and the relative proportions of different vegetation types adjust throughout a long simulation. TRIFFID was run in equilibrium mode for the first 50 yr of the control run; after this TRIFFID was run in dynamic mode throughout.
For the control simulation, soil parameters were set to be the same as modern and there were assumed to be no lakes. The lake and soil experiments were initialised from a state 500 yr into the PRISM3 control run; hence their initial vegetation patterns were predicted by HadCM3 + MOSES2/TRIFFID for the PRISM3 climate. Vegetation continued to respond dynamically in all experiments.
The PRISM3 + wet-lakes scenario simulation was identical to the control except that the high-level lakes were included in a very simple way. In the MOSES2.1/TRIFFID version of HadCM3, each grid box is assigned a fractional coverage of nine different surface types (broadleaf trees; needleleaf trees; shrubs; C4 grasses; C3 grasses; ice; urban, which is not used in the late Pliocene; bare soil; or water); lakes are included in a grid box by increasing the water surface type, while the fractions of all other surface types are reduced as appropriate. It is noted that, although trees, grasses and shrubs can dynamically change throughout the model simulation, the lake fraction of the grid box will remain constant and will be neither increased by precipitation nor decreased by evaporation (Essery et al., 2001). This means that the lakes in the simulation are static and do not depend on precipitation/evaporation/runoff patterns. The prescribed albedo of the lakes is 0.06, whilst roughness length is 0.0003 m.
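The way a lake is added to a grid box can be sketched as a rebalancing of the surface-type fractions. The text states only that the other fractions are "reduced as appropriate"; the proportional scaling used below is an assumption made for illustration, and the function name and example values are hypothetical rather than taken from the MOSES2.1 code.

```python
def add_lake_fraction(fractions, lake_fraction):
    """Raise the water fraction and shrink all other surface types.

    Assumption: the non-water types are scaled down proportionally so that
    the fractions of the grid box still sum to 1.
    """
    non_water = {k: v for k, v in fractions.items() if k != "water"}
    non_water_total = sum(non_water.values())
    new_water = min(1.0, fractions.get("water", 0.0) + lake_fraction)
    remaining = 1.0 - new_water
    scale = remaining / non_water_total if non_water_total > 0 else 0.0
    updated = {k: v * scale for k, v in non_water.items()}
    updated["water"] = new_water
    return updated


# Hypothetical grid box before a lake covering 25 % of the cell is added.
before = {"broadleaf trees": 0.5, "C4 grasses": 0.3, "bare soil": 0.2, "water": 0.0}
after = add_lake_fraction(before, 0.25)
```

In this sketch the water fraction is set once and never updated, which mirrors the static treatment of lakes described above.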
The PRISM3 + soils simulation required changes to several HadCM3 boundary conditions. The simplest of these is soil albedo, which was determined from the colour of the soil type shown in Table 1. Following Jones (2008) light soils were prescribed an albedo of 0.35, medium soils an albedo of 0.17 and dark soils an albedo of 0.11. These albedo values are based on the assumption that medium and dark soils have average wetness, while light soils are dry. It is noteworthy that, although there are differences in soil types between the late Pliocene and the modern, the simple way that these soil types are incorporated into HadCM3 means that their potential for changing the climate is limited. Table 1 shows that, of the nine soil types to be included in the PRISM4 database, five are intermediate colour, three are dark and one is light. This means that, even though a soil type may change between the Pliocene and the modern, it is only if the soil type changes to one of a different colour that the climate can be altered via albedo feedbacks (Fig. 3). Other soil parameters used in HadCM3 (Clapp-Hornberger exponent, saturated hydraulic conductivity, saturated soil water suction, volumetric soil moisture concentrations at critical, saturation and wilting points, dry soil volumetric heat capacity and dry soil thermal conductivity) are prescribed values depending on soil texture as suggested by Cox et al. (1999). Again, it is noteworthy that, even though a soil type may have changed between the modern and the late Pliocene, it is only where a soil changes to one of a different texture that it has the potential to impact the climate.
Although HadCM3 dynamically predicts vegetation patterns, this is limited to only five types of vegetation, and these are difficult to compare with data sets of late Pliocene vegetation such as PRISM3 (Salzmann et al., 2008). To facilitate a better comparison with palaeobotanical proxy data, the climate output from HadCM3 was used to drive the offline vegetation model BIOME4. The BIOME4 model (Kaplan, 2001) is a mechanistic global vegetation model which predicts the distribution of 28 global biomes based on the monthly means of temperature, precipitation, cloudiness and absolute minimum temperature. The model includes 12 plant functional types (PFTs) from cushion forb to tropical evergreen tree (Prentice et al., 1992). It is the bioclimatic tolerances of these that determine which is dominant in a grid cell and, from this, which biome is predicted. The BIOME4 model was run in the anomaly mode with a CO2 of 405 ppmv. The model was driven from the average annual climate data obtained from the last 100 yr of each HadCM3 experiment, to assess which PFTs were feasible in each grid box and to allocate an appropriate biome at each location.
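The colour-based albedo prescription can be written as a simple lookup, which makes it clear why only a change of colour class alters the radiative forcing. The albedo values are those quoted above; the function names are illustrative, and the absorbed fraction is taken as one minus albedo, ignoring all other terms in the surface energy balance.

```python
# Albedo per soil colour class, as quoted in the text (following Jones, 2008).
SOIL_ALBEDO = {"light": 0.35, "medium": 0.17, "dark": 0.11}


def albedo_change(modern_colour, pliocene_colour):
    """Albedo difference (late Pliocene minus modern) for one grid cell."""
    return SOIL_ALBEDO[pliocene_colour] - SOIL_ALBEDO[modern_colour]


def absorbed_shortwave_fraction(albedo):
    """Fraction of incoming shortwave radiation absorbed by the surface."""
    return 1.0 - albedo


# A change of soil type within the same colour class leaves albedo untouched.
same_class = albedo_change("medium", "medium")
# A light soil replaced by a dark one lowers albedo, so more energy is absorbed.
light_to_dark = albedo_change("light", "dark")
```

A negative change, as in the light-to-dark case, corresponds to a larger absorbed fraction and hence a tendency towards surface warming.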

Results
In this section we will first describe the geographic distribution of late Pliocene soils and lakes that have been reconstructed from geological data and then the results of including these reconstructions in a series of GCM simulations using the PRISM3 boundary conditions.

Late Pliocene soils
During the late Pliocene there were significant differences in the global distribution of soils (Fig. 2a). Overall, the distribution of soils reflects the warmer and wetter world seen in the vegetation reconstruction. Gelisols, associated with tundra-type vegetation, are restricted to very high latitude areas of North America, Greenland and Eurasia, as well as coastal regions of Antarctica (Fig. 2a). The more northern distribution of boreal and temperate forests is accompanied by extensive high-latitude Spodosols and Alfisols at higher than modern latitudes (Fig. 2a). There is evidence supporting Alfisols at 54° N from around Lake Baikal, where grey forest soils are preserved (Mats et al., 2004). The extensive grassland and savannas in the continental interiors of North America and Eurasia are translated into extensive Mollisols (Fig. 2a). South of the Alfisol, in North America, there were Ultisols along the west coast and in the southeast of the continent (Fig. 2a). The centre of North America contains a large region of Aridisols, and a mixture of Alfisols (Abbott, 1981) and Vertisols (Mack et al., 2006) along the southern margin. In Europe there is evidence for Alfisols (Icole, 1970; Günster and Skowronek, 2001), Histosols (Basilici, 1997; Bechtel et al., 2003) and extensive Ultisols (Gerasimenko, 1993). At the eastern end of the Mediterranean there was a region of mixed Alfisols (Quade et al., 1994), Oxisols (Kelepertsis, 2002), Ultisols (Paepe et al., 2004) and Vertisols (Graef et al., 1997) (Fig. 2a). The Indian subcontinent contained a mixture of Ultisols and Vertisols during the late Pliocene (Fig. 2a). There is also evidence for Alfisols close to the Himalayas (Sangode and Bloemendal, 2004). In southeast Asia the biome reconstruction translates into extensive Ultisols across this region (Fig. 2a).
In South America there is a large region of Oxisols as evidenced from palaeosols (Mabesoone and Lobo, 1980) and the biome reconstruction. In southern South America Ultisols, Mollisols and Alfisols dominated during the Piacenzian (Fig. 2a). The soils reconstruction for Africa relies heavily on the biome reconstruction, except for direct evidence of Vertisols in east Africa (Wynn, 2000; Campisano and Feibel, 2008), Oxisols in southern Africa (Helgren and Butzer, 1977) and Histosols in Madagascar (Lenoble, 1949). Combining these palaeosol occurrences with the biome reconstruction, the distribution of soils in Africa is shown to be dominated by Aridisols in northern Africa and Oxisols in central and southern Africa (Fig. 2a). Palaeosol evidence in Australia shows the presence of Aridisols in the middle of the continent (Hou et al., 2008) and Oxisols in the southeast (Firman, 1994; Hughes et al., 1999). The biome reconstruction of Salzmann et al. (2008) suggests the presence of Alfisols in the southwest and north of the continent, Vertisols in the east and Aridisols in the west and south (Fig. 2a).

Late Pliocene lakes
The global distribution of late Pliocene lakes is dominated by megalakes in Africa and Australia (Fig. 2b, c). In Africa the largest megalake, in both the wet and dry scenario, is Lake Zaire (Beadle, 1974; Peters and O'Brien, 2001; Goudie, 2005). This large water body is reconstructed in the wet scenario as occupying the majority of the modern river drainage basin (Fig. 2c), whereas it is reconstructed smaller in the dry scenario (Fig. 2b). To the east of Lake Zaire was Lake Sudd, which was large during wet phases of the Pliocene (Fig. 2c). However, it is reported to have been a very shallow lake (Salama, 1987) and is therefore considerably reduced in the dry scenario (Fig. 2b). In southern Africa there is evidence for surface water in the region of the modern Okavango Delta and the Makgadikgadi Pan (Ringrose et al., 2002, 2005), both of which have a reduced surface area in the dry scenario (Fig. 2b, c). In east Africa there is evidence for Lake Malawi (Dixey, 1927) and Lake Tanganyika (Cohen et al., 1997), both of which occupy multiple grid cells of the reconstruction (Fig. 2b, c). There were also smaller (sub-grid-cell) lakes associated with parts of the Rift Valley (Deino et al., 2006). Northern Africa was dominated by Lake Chad, which during the late Pliocene was considerably bigger than in modern times (Otero et al., 2010). The sedimentology of the Chad Basin shows that there were considerable shifts from lake to sub-aerial environments during the late Neogene, and we have represented this by using the reported sediment sections to define a wet-scenario (Fig. 2c) and dry-scenario (Fig. 2b) late Pliocene Megalake Chad. Further north there was Lake Fazzan and several smaller lakes in Libya, which were associated with an extensive river system (Drake et al., 2008). The lakes formed in topographic lows as volcanic eruptions restricted and blocked the flow of the river system (Drake et al., 2008).
In Australia there was a series of large lakes in the centre of the continent (Fig. 2b, c). The largest of these was Megalake Eyre (Simon-Coinçon et al., 1996; Alley, 1998; Martin, 2006). To the east of Lake Eyre was Lake Tarkarooloo, another large water body (Callen, 1977), whilst to the northwest was Lake Amadeus, which may have fed Megalake Eyre (Chen et al., 1993).
There is limited evidence for late Pliocene lakes in South America and those reported are modest in size and associated with the Andes Mountains (Fig. 1; Supplement Table S2). The Quillagua-Llamara Basin, Chile, records an ephemeral late Pliocene lake with evaporites present (Sáez et al., 1999). In Argentina a small saline, though permanent, lagoon is preserved at Llancanelo (Violante et al., 2010) and a small lake is reported from Bogota in Colombia (Wijninga and Kuhry, 1993).
In North America there is a swarm of small to modest-sized lakes associated with the valleys of the Rocky Mountains (Fig. 2b, c). The largest of these was Glenn's Ferry in Idaho, which has been reconstructed from sediments and the distribution of fossil fishes (Smith, 1981; Thompson, 1992). There are many small lakes across Eurasia, but only Lake Baikal and Lake Suerkuli covered multiple grid cells (Fig. 2b, c). Lake Baikal is reconstructed as having a similar size to the modern lake; however, there was a change in sedimentation related to tectonic activity at 3.15 Ma (Müller et al., 2001). Lake Suerkuli, located on the northern Tibetan Plateau, had an estimated surface area of 4800 km², but was destroyed by activity of the middle Altyn Fault (Chang et al., 2012).

Impact of soils and lakes on simulating late Pliocene climate and vegetation
Using the boundary conditions described in the previous section a series of modelling experiments was undertaken. The GCM simulations were designed to explore the impacts that realistic soil and lake boundary conditions can have on late Pliocene climate and vegetation. In this section we present the differences in temperature and precipitation between three experiments (PRISM3 + soils, PRISM3 + wet lakes and PRISM3 + soils + wet lakes) and the standard PRISM3 control. All results use average values from the final 100 yr of each simulation and only those results deemed to be significant at the 0.1 % confidence level following a Student's t test are presented. This means that in a statistical sense there is only a 0.1 % chance that the results shown are due to intrinsic model variability; any differences in climate that are less significant than this are not presented. However, it is noted that, while the results shown are very likely to be due to the soil and lake boundary conditions imposed, climate model output does not generally fulfil all the criteria for accurate significance testing (e.g. variables normally distributed, and independent). This means that some features may occur in the results that cannot be fully attributed to lakes and soils. Nonetheless, by only presenting results that are significant at the 0.1 % confidence level, noise can be removed from the results and the effects of lakes and soils on the climate can be more clearly seen. We also show how the inclusion of soils and lakes has affected biome distributions in Pliocene simulations. The PRISM3 + dry-lakes experiment showed results comparable to, though not as pronounced as, the PRISM3 + wet-lakes experiment. To avoid repetition we have not presented the results of the PRISM3 + dry-lakes experiment below, but have included it in Supplement 3.
Fig. 4. Mean annual and seasonal surface air temperature for the soil and lake experiments. Anomalies relative to the standard PRISM3 control run (with modern lake and soil distribution). Everything plotted is significant at the 0.1 % confidence level.
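The significance screening described in this section can be sketched for a single grid cell as follows. This is a minimal illustration using only the Python standard library: it computes a two-sample Student's t statistic with pooled variance and, as a stated assumption, uses the large-sample normal approximation for the critical value instead of the exact t distribution. The function names and the synthetic series are hypothetical and do not reproduce the analysis code used for the figures.

```python
import math
import statistics


def student_t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    n_a, n_b = len(a), len(b)
    pooled = ((n_a - 1) * statistics.variance(a)
              + (n_b - 1) * statistics.variance(b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def masked_anomaly(experiment, control, alpha=0.001):
    """Return the mean anomaly, or None if it is not significant at `alpha`.

    Assumption: with 100 annual means per run, the two-sided critical value
    is approximated by the standard normal quantile.
    """
    critical = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    if abs(student_t_statistic(experiment, control)) > critical:
        return statistics.mean(experiment) - statistics.mean(control)
    return None


# Synthetic 100 yr series of annual means (hypothetical values, in °C).
control = [15.0, 16.0] * 50
shifted = [value + 1.0 for value in control]

kept = masked_anomaly(shifted, control)      # clear 1 °C shift is retained
dropped = masked_anomaly(control, control)   # zero difference is masked out
```

Applying the same test independently at every grid cell produces the masked anomaly maps; the caveat noted above about normality and independence of model output applies equally to this sketch.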

PRISM3 + soils
The mean annual surface air temperature (SAT) in the PRISM3 + soils experiment shows a 1 • C cooling across northern Africa, a 1-2 • C cooling across southwest North America, a warming of 0.5-1 • C across northern South America, up to 2.5 • C warming in the southern part of northern Africa and in southern Africa, a warming of between 1 • C and 3 • C across east Asia, a 1 • C warming in central Australia and small changes around the Middle East (Fig. 4). With the exception of South America, all of these SAT changes can be attributed to soil albedo changes which determine the proportion of incoming shortwave radiation that can be absorbed (Fig. 3). During December-January-February (DJF) the differences in SAT with the PRISM3 control run are generally the same as in the annual ones, but they vary in magnitude (Fig. 4). This is also true during June-July-August (JJA), where we see an additional 0.5-1 • C warming in central northern North America, relating to a decrease in soil albedo (Figs. 3, 4), which is not visible in the annual mean.
In the experiment with late Pliocene soils there is a small increase in mean annual precipitation (MAP) in the southern part of northern Africa and in eastern Africa of around 10 mm month⁻¹ and a reduction in the MAP across the Amazon region of up to 30 mm month⁻¹ (Fig. 5). This reduction in MAP across the Amazon region is also associated with a reduction in evaporation and a reduction in soil moisture. This change in MAP across South America is robust and occurs throughout the simulation. The only boundary condition to have changed in the Amazon region between the soil experiment and the control run is the Clapp-Hornberger exponent (Supplement 2). This parameter can affect the availability of soil moisture for evaporation and in this experiment appears to be inhibiting moisture recycling over the Amazon. It is noted that the Amazon region has high precipitation with substantial internal model variability, and is a particularly sensitive region in HadCM3 (Good et al., 2013); however, the changes seen here are substantial, and the possibility that they result from changes to soil parameters at remote locations cannot be discounted. It will be interesting to see whether the results over South America are replicated with other GCMs that make use of the new soil database.
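To illustrate how the Clapp-Hornberger exponent can influence soil moisture availability, the sketch below evaluates the widely used power-law relations of Clapp and Hornberger (1978) for soil water suction and hydraulic conductivity. It is an illustration under stated assumptions only: the exact formulation in the HadCM3 land-surface scheme is not described in this section, the function name is hypothetical, and the parameter values are invented rather than taken from Supplement 2.

```python
def clapp_hornberger(saturation, b, psi_sat, k_sat):
    """Soil water suction and hydraulic conductivity (power-law form).

    saturation: relative soil moisture theta / theta_sat, in (0, 1].
    b: Clapp-Hornberger exponent (dimensionless).
    psi_sat: suction at saturation (m).
    k_sat: hydraulic conductivity at saturation (mm/day).
    """
    if not 0.0 < saturation <= 1.0:
        raise ValueError("saturation must be in (0, 1]")
    suction = psi_sat * saturation ** (-b)
    conductivity = k_sat * saturation ** (2.0 * b + 3.0)
    return suction, conductivity


# Hypothetical values for illustration only; not taken from Supplement 2.
for b in (4.0, 8.0, 12.0):
    suction, conductivity = clapp_hornberger(0.6, b, psi_sat=0.2, k_sat=100.0)
    print(f"b={b:4.1f}  suction={suction:8.2f} m  K={conductivity:10.6f} mm/day")
```

In these relations, a larger exponent means that a partly dried soil holds its water more tightly and conducts it more slowly. This is one way in which the parameter could limit the moisture available for evaporation.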
Seasonal changes of precipitation attributable to soils include a 10 mm month⁻¹ increase in rainfall over central Africa during DJF and an increase of around 8 mm month⁻¹ in a narrow band in the southern region of the Sahara during JJA (Fig. 5). These annual and seasonal climatological changes translate into very modest biome changes (Fig. 6). The main difference between this experiment and the PRISM3 control run is an increase of the tropical xerophytic shrubland biome in northern Africa and southern Africa (Fig. 6). In northern Africa this replaces desert and relates to the increased JJA rainfall (Fig. 5). In southern Africa tropical xerophytic shrubland replaces the warm-temperate forest biome and the temperate sclerophyll woodland biome (Fig. 6). These changes are probably the result of increased winter temperatures with no change in MAP, leading to higher evaporation and limiting the development of lusher biome types. BIOME4 also predicts an expansion of desert in coastal Brazil and Australia, based upon the climate simulated by HadCM3 (Fig. 6).

PRISM3 + wet-lakes scenario
Figure 4 suggests that the inclusion of Pliocene lakes leads to a cooling in mean annual SAT of up to 2 °C that is confined to the immediate vicinity of the lakes; however, there are seasonal variations. For example, the decreases in SAT (up to 2.5 °C) around the large lakes in northern and central Africa are most pronounced in DJF (Fig. 4). The exception to this is near Lake Fazzan (northern Africa), where a year-round warming is observed (Fig. 4). There are also regional decreases of up to 2.5 °C in SAT associated with lakes in the mid-latitudes of the Northern Hemisphere during JJA. The largest differences in SAT in the mid-latitudes occur in the warmest season. This implies that the cooling effects of using energy to evaporate lake water are larger than the warming that would be caused by the lake surface having a lower albedo than the vegetation it replaces.
The large seasonal changes in central Africa do not conform to the above hypothesis, as they occur in a uniformly warm regional climate. For this region the decrease in DJF SAT is related to the average relative humidity. In the PRISM3 control run the average relative humidity during DJF was ca. 20 %, whereas JJA had an average relative humidity of ca. 80 %. It therefore appears that it was only possible to evaporate more lake water during DJF, as the higher average relative humidity in JJA meant the air was already saturated with water vapour. For Lake Fazzan the change in surface albedo from bare soil to a lake (bare soil = 0.35, whereas surface water = 0.06) appears to dominate over the cooling influence of evaporation. It is not understood why the late Pliocene lakes of Australia had no impact on SAT.
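The albedo values quoted for Lake Fazzan (bare soil = 0.35, surface water = 0.06) can be turned into a rough estimate of the change in absorbed shortwave radiation. The sketch below is a back-of-the-envelope illustration only: the incoming shortwave flux is a hypothetical round number, and the calculation ignores evaporation, cloud changes and all other terms of the surface energy balance.

```python
def absorbed_fraction(albedo):
    """Fraction of incoming shortwave radiation absorbed by a surface."""
    if not 0.0 <= albedo <= 1.0:
        raise ValueError("albedo must be between 0 and 1")
    return 1.0 - albedo


bare_soil = absorbed_fraction(0.35)      # albedo quoted in the text
surface_water = absorbed_fraction(0.06)  # albedo quoted in the text

# Hypothetical incoming shortwave flux, chosen only for illustration.
incoming_sw = 250.0  # W m^-2
extra_absorbed = (surface_water - bare_soil) * incoming_sw
relative_increase = surface_water / bare_soil - 1.0

print(f"absorbed fraction, bare soil: {bare_soil:.2f}")
print(f"absorbed fraction, water:     {surface_water:.2f}")
print(f"extra absorbed flux:          {extra_absorbed:.1f} W m^-2")
print(f"relative increase:            {relative_increase:.0%}")
```

With these albedos a lake surface absorbs roughly 45 % more shortwave radiation than the bare soil it replaces. This is consistent with albedo dominating over evaporative cooling where the replaced surface is bright desert soil.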
Changes in MAP between the PRISM3 + wet-lakes experiment and the PRISM3 control run appear to be relatively widespread; however, we only consider those changes proximal to lakes to be true signals (Fig. 5). All MAP anomalies in areas away from the reconstructed late Pliocene lakes should be considered as the result of model variability. This means that the only significant change in MAP associated with our late Pliocene lakes reconstruction is an increase of up to 53 mm month⁻¹ associated with Megalake Zaire (Fig. 5). This we attribute to regional recycling of evaporated lake water. Seasonal changes associated with late Pliocene lakes in the Northern Hemisphere are mainly constrained to JJA, when there is enough energy to evaporate lake water and increase local precipitation (Fig. 5).
The biome predictions indicate an expansion of the temperate conifer forest biome and open conifer woodland biome in western North America; this replaces the temperate xerophytic shrubland biome (Fig. 6). Cool mixed forest is replaced with temperate deciduous forest in the region of Europe-Russia. In Africa, there is a small change in biomes along the Sahel-Sahara transition and an expansion of tropical forest biomes at the expense of savanna around Megalake Zaire (Fig. 6).

PRISM3 + soils + wet-lakes scenario
Since the changes in the PRISM3 + soils experiment (Sect. 3.3.1) and the PRISM3 + wet-lakes experiment (Sect. 3.3.2) generally occur in disparate regions, the effect of adding both soils and lakes to the boundary conditions of the model is essentially a linear combination of the effects of adding the soils and lakes separately. This can be seen in temperature (Fig. 4) and precipitation (Fig. 5) and is the case for all seasons.
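One simple way to test the linear-combination statement is to subtract the sum of the two separate anomaly fields from the combined anomaly field. The following sketch uses invented SAT anomalies for four grid cells. The function name and all values are hypothetical, and it does not reproduce the analysis carried out for Figs. 4 and 5.

```python
def additivity_residual(soils, lakes, combined):
    """Combined anomaly minus the sum of the separate anomalies, per cell.

    Each argument is a list of anomalies relative to the control run,
    one value per grid cell, with the same ordering in all three lists.
    """
    if not (len(soils) == len(lakes) == len(combined)):
        raise ValueError("all anomaly fields must have the same length")
    return [c - (s + lk) for s, lk, c in zip(soils, lakes, combined)]


# Hypothetical SAT anomalies (degrees C) for four grid cells.
soils_only = [2.0, 0.0, 1.0, 0.0]
lakes_only = [0.0, -2.0, 0.0, 0.0]
soils_and_lakes = [2.1, -1.9, 1.0, 0.0]

residual = additivity_residual(soils_only, lakes_only, soils_and_lakes)
largest = max(abs(r) for r in residual)
print(f"largest departure from additivity: {largest:.2f} degrees C")
```

Residuals that are small compared with the individual anomalies indicate that the two sets of boundary conditions act almost independently. Large residuals would point to interactions between the soil and lake feedbacks.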
As expected, the biome reconstruction from this experiment also shows a combination of the biome plots of the PRISM3 + soils and PRISM3 + wet-lakes experiments (Fig. 6). There is an increase in temperate conifer forest and woodland in western North America, which is associated with the lakes. Coastal Brazil shows the significant increase of the tropical xerophytic shrubland biome seen in the PRISM3 + soils experiment and the slight increase in desert seen in both experiments (Fig. 6). The Sahara is slightly reduced in extent, with the lost area changing into the tropical xerophytic shrubland biome (Fig. 6). An expansion of the xerophytic shrubland biome in southern Africa is at the expense of other biome types, except desert (Fig. 6). The tropical xerophytic shrubland biome in southeast Asia is replaced by the savanna biome, and the Australian desert shows a small reduction, which was also seen in the PRISM3 + wet-lakes scenario (Fig. 6).

Discussion
The global distribution of late Pliocene soils and lakes (Fig. 2) is significantly different from the present day (Soil Survey Staff, 1999; Lehner and Döll, 2004). Whereas soils reflect the global distribution of late Pliocene biomes (Salzmann et al., 2008), the distribution and size of late Pliocene lakes (Fig. 2) contribute to a land-surface covering dramatically different from the present day (Lehner and Döll, 2004). The increase in the number and size of lakes is a response to the generally wetter global climate of the late Pliocene (e.g. Salzmann, 2008) and in places different tectonic conditions (e.g. Chang et al., 2012). However, application of these boundary conditions in HadCM3, alongside the other PRISM3 boundary conditions, produced subtle results (Figs. 4, 5, 6). The changes in climate and vegetation attributable to Pliocene soils and lakes seen in this study are generally smaller than those seen in similar studies of the mid-Holocene. Palaeoclimate studies on the mid-Holocene have shown that it is possible to double regional precipitation with large lakes (Krinner et al., 2012) and move the African monsoon northwards by modifying soil albedo in the Sahara (Levis et al., 2004).
Although the results of our study are not as obvious as those from the Holocene, they do suggest that some of the processes previously identified in the Holocene may also operate in the Pliocene. Although the addition of late Pliocene lakes did not double regional precipitation, as was simulated for the Holocene (Krinner et al., 2012), our experiments do show a 50 % increase around Lake Chad and notable summer increases in western North America. We also show an austral summer increase in precipitation around Megalake Makgadikgadi similar to that which Burrough et al. (2009) produced in their late Quaternary experiments. These similarities suggest that similar forcings are operating in the Pliocene and the Holocene, with regard to large lakes. However, a study on Megalake Chad during the late Pliocene found that the presence of a large lake did not sufficiently influence late Pliocene climate to impact on regional vegetation.
The parameterisation of Alfisol along the southern margin of the late Pliocene Sahara led to a summer increase in precipitation (Figs. 2a, 4), which in turn changed the biome type from desert to tropical xerophytic shrubland (Fig. 6). The Alfisol is a darker soil type than the Aridisol (Table 1), and this result is comparable to, although more subtle than, those presented by Levis et al. (2004). When the soil and wet-lake boundary conditions are combined, this precipitation signal is increased, but the vegetation response remains the same (Figs. 5, 6). A northwards shift of the Sahel-Sahara boundary is consistent with the few vegetation data available for this region (Leroy and Dupont, 1994; Salzmann et al., 2008).
The climatological changes in western North America (Figs. 4, 5) have changed the regional vegetation from drier open biomes to one dominated by the temperate conifer forest biome and the conifer parkland biome, which is consistent with reconstructions from palaeobotanical data (Thompson, 1991; Fleming, 1994; Salzmann et al., 2008). The main driving force for these vegetation changes appears to be increased evaporation from the lakes, reducing summer temperature and increasing summer precipitation. When soils and lakes are combined there is a further increase in less seasonal biome types (Fig. 6). This suggests that, although the differences between the control experiment and the PRISM3 + soils experiment are localised, the combination of realistic late Pliocene soils and lakes has positive feedbacks that facilitate the expansion of less seasonal biome types. Despite the limited improvements in the modelled climates and biome distribution of the late Pliocene, there are positive changes between the experiments containing the soils and lake data and the control run. We therefore encourage the use of the late Pliocene soil and lake boundary conditions in future climate modelling studies, including the future PlioMIP experiments (Haywood et al., 2011a).

Conclusions
Through a synthesis of geological data we have reconstructed the global distribution of late Pliocene soils and lakes. Using these reconstructions we have conducted a suite of climate modelling experiments to test the impacts of realistic soils and lakes on the climate of the late Pliocene. The inclusion of soils and lakes does not significantly modify global climate, but does have important regional impacts. Some of these regions have previously been simulated as too dry (e.g. western North America) when compared to palaeobotanical data. We see improvements in the seasonal amounts of precipitation in the southern part of northern Africa and in western North America, which result in the model-predicted biomes comparing more favourably with vegetation proxy data. We strongly encourage the use of these newly developed boundary conditions in future late Pliocene climate research, and the boundary conditions will be made available on the PRISM4 website (http://geology.er.usgs.gov/eespteam/prism/index.html). These new boundary conditions improve regional data-model comparisons, and their feedbacks in a warmer world should be explored further in future palaeoclimate modelling studies.