Main

Earth’s climate was fundamentally different from today during the Last Glacial Maximum (LGM; 23,000–19,000 years ago). Atmospheric carbon dioxide concentration was much lower, at approximately 185 ppm (ref. 1), large ice sheets covered northern North America and Eurasia2 and ocean circulation was different3. Since anthropogenic climate change is projected to shift the climate system towards a different state4, it is essential to evaluate the performance of climate models under boundary conditions different from today5. Simulation of LGM climate has therefore been key to evaluate climate models6,7,8. Decades of data–model comparison have shown that the overall change in global climate between the LGM and the present is consistent among models and palaeoclimate reconstructions, but ambiguities remain in the spatial pattern of glacial cooling8,9,10,11. This is problematic because the spatial temperature pattern is of fundamental importance for climate dynamics and also governs the distribution of habitats and ecosystems on our planet.

Assessing climate model performance using palaeoclimate reconstructions is, however, intrinsically challenging because the indirect nature of the reconstructions renders it difficult to determine whether differences between reconstructions and simulations reflect poor model skill or ambiguity in the proxy data12,13. In seawater temperature reconstructions, ambiguity arises from uncertainty in the attribution of the temperature signal to the depth and season where it originated and from the effect of nuisance parameters confounding the otherwise dominant effect of temperature on the proxy14,15. Simulations, on the other hand, are also associated with ambiguity due to model set-up, experimental design and structural uncertainty. Thus, the attribution of the differences between reconstructions and simulations to either the proxy data or the models is challenging and calls for alternative ways to evaluate simulations of past climate.

The first global reconstruction of LGM temperatures used transfer functions that relate (marine) microfossil assemblage composition to (seawater) temperature16. Since then, new geochemical methods to reconstruct past temperature have become available, but the transfer-function approach still is an important tool in palaeoclimatology17,18. However, in theory, species assemblage data could also provide a direct way to evaluate climate models independently of the reconstructed temperatures. Differences in habitat preferences among species result in changes in assemblage composition along ecologically relevant environmental gradients19. Thus, the compositional similarity between assemblages decreases the further apart they are from each other in the environmental space. Such similarity decay is a fundamental concept in ecology and is observed in many different taxa and ecosystems20,21. Like in other marine organisms22, temperature has consistently been shown to be the main driver of community assembly in planktonic foraminifera on a global scale23,24 (Fig. 1). Indeed, modern planktonic foraminifera assemblages from core-top sediments show a strong monotonic decrease in compositional similarity with increasing temperature difference (Fig. 2a). Contrary to phytoplankton, planktonic foraminifera assemblages have changed consistently with recurrent glacial–interglacial cycles over the late Quaternary period25. Given low dispersal limitations in the marine realm compared with on land, this consistent glacial–interglacial change demonstrates that planktonic foraminifera species have tracked climate change, that is, that their thermal niches have remained stable26,27 rather than (evolutionarily) adapted to the changing climate. Because of this stability of the thermal niches of planktonic foraminifera, the similarity-decay pattern observed in the core-top data should also characterize past assemblages. Thus, comparison of the similarity-decay pattern obtained using fossil species assemblages and simulated temperatures against the reference pattern seen in the core-top data offers an alternative way to evaluate climate models that is firmly grounded on ecological principles. This approach allows a direct comparison of observations and simulations, and hence does not rely on space-for-time substitution that is used for the calibration of temperature proxies, because it is based on the relationship between spatial gradients in species composition and in temperature. Moreover, this approach makes full use of the data, as gradients are assessed in all directions rather than in a point-by-point comparison. Finally, seasonal and vertical habitat tracking by planktonic foraminifera species, which complicates climate reconstructions based on their geochemistry28, is implicitly accounted for in the method.

Fig. 1: Planktonic foraminifera biogeography today (based on core-top data) and during the LGM.
figure 1

The species assemblages separated into three clusters (Methods) show a clear latitudinal pattern due to the influence of temperature on planktonic foraminifera species distribution. During the LGM, polar assemblages expanded further towards the Equator than today. This change occurred at the expense of transitional assemblages, whereas the extent of tropical assemblages changed only marginally. This major reorganization of the foraminifera communities indicates a steepening meridional ecological gradient that was particularly pronounced in the North Atlantic Ocean. White rings around the dots in the right panel indicate sites added to previous work11; all data sources are provided in Methods. Basemap from Natural Earth (https://www.naturalearthdata.com/).

Fig. 2: Simulated LGM temperature gradients are inconsistent with foraminifera biogeography.
figure 2

a, Modern. b, LGM. c, Difference. The core-top data (a) indicate that the similarity between assemblages decreases with increasing temperature difference. The box-plots indicate the median and interquartile range of the similarity; the whiskers extend to 1.5 times this range. They are shown at 1 °C bins, and heatmaps in the background show the normalized number of site pairs. The modern pattern (a) also appears when the LGM assemblages are combined with simulated temperatures (b; ensemble mean). However, the simulations also show large temperature differences among some sites even though they have highly similar (>0.75) assemblages (b; and highlighted in red in c). Notably, these simulated temperature differences appear to lie outside the modern limits for assemblages as shown by the black step line in c, which depicts the 99th percentile of thermal distance as a function of similarity. All data sources are provided in Methods.

In this Article, we use a new dataset of 2,085 LGM planktonic foraminifera morphospecies assemblages from 647 unique sites (Fig. 1), which represents a 50% increase in coverage compared with previous work17 to apply this new macroecological approach. We evaluate near surface seawater temperature patterns from a suite of equilibrium simulations of LGM carried out using a common protocol with state-of-the-art climate models (Methods).

LGM plankton biodiversity patterns

Our new dataset shows clear differences in biogeography between modern and LGM planktonic foraminifera (Fig. 1). The largest change occurred in the North Atlantic Ocean, where cold-water assemblages expanded to the mid-latitudes at the expense of transitional fauna. Evidence that the changes in the assemblage composition reflect changes in the climate, rather than an adaptation of the ecological preferences of the foraminifera species, stems from the good compositional analogues of the LGM species assemblages for virtually all sites (98%; Methods). Adaptation—changes in the thermal niches—would result in different glacial species assemblages than seen today. Thus, the near absence of poor analogues provides further support for the assumption that the similarity-decay pattern observed in the core-top assemblages should also characterize those from the LGM. Hence, if the models simulate LGM cooling accurately, the similarity-decay pattern obtained using simulated temperatures should be indistinguishable from the core-top pattern.

To a first degree, the simulated temperatures indeed imply a similarity decay that is similar to that seen in the core-top data, suggesting that the global pattern of LGM seawater temperature is reasonably well simulated and that planktonic foraminifera changed their distribution consistent with their modern thermal preferences (Fig. 2). However, the LGM similarity-decay pattern obtained using simulated temperatures also shows clear differences from the core-top pattern (Fig. 2). The discrepancy is largest at high (>0.75) similarities, where in many cases the simulated temperature differences exceed the limits indicated by modern assemblages (Fig. 2). This pattern is a salient feature that characterizes all simulations (Extended Data Fig. 1) and is robust against chronological uncertainty (Methods and Extended Data Fig. 8). It arises mainly from the similarity among LGM plankton assemblages in the North Atlantic and the Nordic Seas (Extended Data Fig. 2), where the models maintain the large present-day temperature gradient. This anomalous LGM pattern does not reflect model bias, since the core-top similarity-decay pattern is reproduced with simulated temperatures from the pre-industrial control runs (Extended Data Fig. 3). Notably, the same pattern also appears when LGM similarities are plotted against present-day temperature differences (Extended Data Fig. 4). Thus, whereas the species assemblages indicate changes in the spatial temperature gradients in the LGM, the simulated cooling appears globally rather uniform, without marked changes in oceanic thermal gradients. The distribution of foraminifera during the LGM and the modelled LGM temperatures therefore appear incompatible.

There is no evidence for the emergence of new species or traits or for species extinctions since the LGM29, which is consistent with the good compositional analogues between the present and the LGM (Methods). Moreover, there is no indication for a widening of the thermal niche of cold-water species that expanded into the mid-latitude North Atlantic (Fig. 1). Such adaptation would allow those species to migrate into areas with temperatures warmer than their current thermal range but would result in geochemistry-based estimates of LGM temperature that are warmer than present. However, such positive temperature anomalies are very rare in the most recent compilation of LGM seawater temperature proxies30. We therefore rule out that the relationship between planktonic foraminifera ecology and temperature has changed on the studied timescale. The good compositional analogues also argue against the existence of presently unobserved oceanic conditions during the LGM. Thus, although there is, in theory, the possibility that the relationship between temperature and assemblage composition arises from collinearity with another environmental variable, it is unlikely that this collinearity has changed globally in a systematic way. We are therefore left with the observation that the simulated LGM temperature change cannot explain fundamental biogeographic patterns in planktonic foraminifera that inhabited the ocean at that time.

Pronounced cooling in the glacial North Atlantic Ocean

To further assess the mismatch between the simulations and the foraminifera data, we estimate LGM temperatures from the assemblages. We use a new approach to resolve ambiguities in the seasonal and vertical origin of the proxy signal that have hindered previous comparisons of this kind (Methods). We find that, globally, planktonic foraminifera assemblages best reflect annual seawater temperatures at 50 m water depth. Averaged across the sites, the resulting reconstructed temperatures of LGM seawater (Fig. 3) are 2.45 °C (2.33–2.57 °C; 95% confidence interval) lower than today. The simulations show a similar degree of cooling (−1.30 to −2.89 °C, ensemble mean −2.15 °C), indicating reasonable agreement on a global scale and testifying to the stability of the thermal niches of planktonic foraminifera since the LGM.

Fig. 3: Large model–data differences during the LGM in areas of marked species turnover.
figure 3

ad, The large temporal turnover in planktonic foraminifera assemblages in the North Atlantic (a,b) is consistent with a pronounced cooling (c,d; background in c shows gridded reconstruction; Methods). eh, The simulations, by contrast, show globally rather uniform cooling (e,f), and appear too warm in the North Atlantic (g,h). Simulated temperature and the model–data difference (e,g; ensemble mean) are shown only at the locations where data are available. b,d,f,h, The latitudinal pattern smoothed using a generalized additive model, with the 95% confidence interval in grey. Data sources are provided in Methods. Basemaps in a,c,e,g from Natural Earth (https://www.naturalearthdata.com/).

However, these averaged values of temperature change hide marked differences in the spatial pattern of LGM cooling. The replacement of transitional assemblages with cold-water assemblages in the North Atlantic (Figs. 2 and 3) is associated with large-scale cooling. Reconstructed subpolar North Atlantic (40°–60° N) temperatures are on average 7.3 °C lower than today, leading to drastically different meridional temperature gradients (Extended Data Fig. 5). By contrast, simulated temperature anomalies are spatially rather uniform, and the North Atlantic cooling is markedly weaker in all simulations we assessed (Fig. 3). Differences between the simulations and the reconstructions reach up to 4.9 °C averaged across the North Atlantic (40°–60° N; ensemble mean and up to 8.3 °C for individual models; Extended Data Fig. 5), far beyond the uncertainty of the reconstructions (Methods).

The data thus indicate a marked reorganization of the North Atlantic oceanography. The spatial pattern of the temporal species turnover and reconstructed temperature change (Fig. 3) resembles the fingerprint of the Atlantic meridional overturning circulation (AMOC). AMOC changes can create such a characteristic temperature anomaly through changes in the location of deep convection and the subpolar gyre circulation31. A weaker glacial AMOC, or reduction in its northward extent, is thus a conceivable explanation for the discrepancy. To test this hypothesis, we investigated additional simulations with one of the models under identical LGM boundary conditions but with a weaker AMOC than in the LGM equilibrium simulations (Methods). These experiments yield an ocean that is in closer agreement with the biogeography of planktonic foraminifera. The additional three simulations are characterized by a colder subpolar North Atlantic, which results in similarity-decay patterns that are in closer agreement with the core-top pattern (Fig. 4a,b). Differences between reconstructed and simulated temperatures are also reduced in the experiments with a weaker AMOC, especially in the mid-latitude North Atlantic, where the reduction reaches up to approximately 50% (Fig. 4c and Extended Data Fig. 6). Thus, the representation of the AMOC in the equilibrium simulations appears a likely source of the discrepancy between the observations and the simulations.

Fig. 4: Simulations with a colder North Atlantic improve data–model agreement.
figure 4

a,b, The LGM similarity-decay patterns based on temperatures from the simulation with freshwater input into the North Atlantic (a) are more similar to the modern pattern (b) as the number of site pairs with high similarity, but large temperature difference between them, is reduced compared with the equilibrium simulation (symbols in panel a as in Fig. 2; black lines in b delineate the areas with excess temperature differences between sites in the equilibrium simulation; Extended Data Fig. 1). c, The difference between the simulated and reconstructed temperatures (black) is also reduced compared to the equilibrium simulations (red). Percentages in c indicate reduction in difference in the subpolar North Atlantic. Lines show the meridional pattern, approximated using a generalized additive model smooth, with the 95% confidence interval indicated. The results are presented for the different experiments where the freshwater flux was applied to either offshore Newfoundland or the subpolar North Atlantic. In the extended experiment (Newfoundland ext.), the forcing was applied for 1,050 years instead of 150. Details and the data sources are provided in Methods.

In these experiments, the reduction in the AMOC was achieved by prescribing freshwater flux into the North Atlantic Ocean. In two experiments, a 150 yr, 0.2 Sv freshwater forcing was applied to either offshore Newfoundland or the central subpolar North Atlantic32, with the latter showing a larger reduction in the AMOC and stronger associated cooling. To assess the ocean response to forcing that is (more) commensurate with the duration of the LGM, we conducted a third experiment, in which we applied a stochastically varying freshwater flux (0.1 ± 0.05 Sv) near the coast of Newfoundland for 1,050 years, simulating persistent, but variable discharge from the Laurentide ice sheet. In all experiments, the AMOC shows a marked slowdown (coastal: 11.8 Sv; subpolar: 4.3 Sv; coastal long: 8.1 Sv), which appears in closer agreement with reconstructions of deep ocean circulation33,34 compared with the default LGM simulation (18.2 Sv).

Our experiments were designed to mimic freshwater input by ice-sheet melting and calving as we hypothesize that the large ice sheets during the LGM may have been in a dynamic equilibrium with local stochastic mass loss similar to continental ice sheets today35. However, cooling in the North Atlantic can in simulations also result from other processes such as changes in the ice-sheet height or in oceanic mixing36,37. Nevertheless, the additional experiments demonstrate that a reduced AMOC is a physically plausible mechanism to bring the reconstructions and simulations in closer agreement.

By combining insights from different disciplines, we have demonstrated the power of using ecological principles to evaluate simulations of past climate. Because similarity decay with temperature arises from the strong temperature dependence of planktonic foraminifera species distribution, the method used here can, in principle, also be applied to assemblages containing extinct species, providing new ways to assess simulations of climate in the distant past. Our analysis indicates distinct regional patterns in the LGM temperature change, with the largest cooling in the northern North Atlantic, probably related to a markedly reduced AMOC. Resolving the changes in the oceanic thermal gradients indicated by planktonic foraminifera assemblages is a clear target for simulations, including transient ones, of LGM climate. Finally, the spatial heterogeneity of climate change that is robustly implied by glacial plankton biogeography underscores the need for climate model evaluation beyond globally averaged statistics since the spatial heterogeneity of climate change is critical for ecosystems and our society.

Methods

Data

Planktonic foraminifera assemblages

To characterize the modern similarity-decay pattern and to calibrate the transfer-function models, we used core-top planktonic foraminifera assemblage data from the ForCenS compilation38. This compilation contains species relative abundances based on counts of at least 300 specimens in the size fraction >150 μm. For some sites, the ForCenS compilation contains replicate counts, and as in previous work39, we randomly preserved a single sample from those sites, resulting in a total of 3,916 samples from unique sites from across the global ocean.

For the LGM, we extended the multiproxy approach for the reconstruction of the glacial ocean surface (MARGO) compilation11 by over 50% using identical criteria for inclusion to ensure compatibility. We compiled data from literature40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78 and added new samples from sediment cores with age control based on radiocarbon dating79,80,81,82,83,84,85,86,87,88,89,90. Counts from the new samples were performed on aliquots containing at least 300 specimens in the size fraction >150 μm and following the ForCenS taxonomy38. Our new dataset contains 2,085 samples from 647 unique sites (Fig. 1). All new samples were assigned a chronological confidence level that is consistent with MARGO91. The highest confidence level is assigned to age control on the basis of layer counting or on at least two radiometric dates within the 19,000–23,000 years bp interval. Level 2 indicates age control based on two bracketing radiometric age-control points within the 12,000–30,000 years bp interval or by correlation to regional records with level 1 chronology. Level 3 chronologies are based on other stratigraphic constraints correlated to records that match level 2. Data for which age control could no longer be verified, but were previously assigned to the LGM interval, are indicated with level 4. The taxonomically harmonized dataset includes extensive metadata to facilitate reuse. We encourage future users of the data to also acknowledge the original sources of the assemblage and age-control data.

Climatology

To assess the similarity decay in the core-top datasets, develop the transfer-function model and derive LGM temperature anomalies, we used climatology data from a 100-km-radius circle around each data point because sediment samples integrate the export flux of foraminifera over a similar area92,93. We tested the applicability of transfer-function models for different seasons and depths (see the following) and deliberately used the World Ocean Atlas version 199894 to reduce the bias of the most recent global warming on our inferences17.

Simulations

We assessed the Palaeoclimate Modelling Intercomparison Project (PMIP) phase 3 and phase 4 simulations that were available at the time of analysis. In total, these were ten simulations from eight models that have LGM and pre-industrial runs: NCAR-CCSM495,96, CNRM-CM597,98, FGOALS-g299,100, GISS-E2-R101,102, MIROC-ESM103,104, MPI-ESM-P105,106, MRI-CGCM3107,108 and AWI-ESM32. These simulations were all run using fixed ice sheets and without freshwater hosing in the North Atlantic as described in the respective PMIP protocols109,110. From these, CCSM4 and GISS have two LGM physics ensemble members each. The AWI-ESM has a higher spatial horizontal resolution in the ocean of up to 30 km. We used the potential temperature (thetao) fields from the ocean models, and all data files have been remapped from their generic to a 1° × 1° longitude–latitude grid using bilinear remapping. To ensure completeness of the temperature profiles, we created a common mask of existing data in the LGM and pre-industrial simulations at 100 m water depth for each simulation. From this mask of existing data, the nearest (in geometrical distance) gridbox containing the site was extracted. If the 50 m depth level was not in the model, the value was linearly interpolated from the existing depth levels.

We also compared our reconstructions with three experiments run using the AWI-ESM using the same LGM boundary conditions but with differing freshwater fluxes applied to the North Atlantic. In two previously published experiments, a 0.2 Sv freshwater flux was applied 150 years32. These simulations differ in the location of the freshwater input: either between 50° and 70° N off the coast of Newfoundland close to the Labrador Sea (called Newfoundland in Fig. 4) or equally distributed across the same latitude band in the North Atlantic (called subpolar). A global conservation in salinity was enforced at each time step by subtracting the global mean changes in net water flux from the first ocean layer. In a third, new experiment, we applied a stochastically varying freshwater flux (0.1 ± 0.05 Sv) for 1,050 years in the same area off Newfoundland as in the other experiments (Newfoundland ext. in Fig. 4). For a more realistic representation of the freshwater impacts, we switched off the salinity conservation in this simulation and allowed the global mean salinity to decrease with the continuous input of freshwater. In all experiments, the reduction in the AMOC happens within 100 years. From the short experiments, we use the temperature fields averaged across the years 101–150 of the freshwater perturbation, and for the longer experiment we used values averaged over the last 250 years of the simulation.

Biodiversity patterns

We assessed biodiversity patterns using globally harmonized taxonomy in which both subspecies of Globigerinoides ruber and Globorotalia menardii and Globorotalia tumida were lumped because they were not always separated in our data. We first assessed whether the LGM assemblages are compositionally similar to the modern assemblages by comparing the dissimilarity (square chord distance) between the LGM assemblages and their most similar modern counterparts with a baseline established from the core-top assemblages. Following previous work111, we used the fifth percentile of the pairwise dissimilarities in the global core-top dataset as a cut-off for poor analogues. The median minimum square chord distance between the LGM and the core-top assemblages is 0.09, and at 98% of the sites, the assemblages have a distance below the poor analogue threshold. This high similarity between the core-top and LGM assemblages provides strong evidence against changes in the environmental niches of the individual foraminifera species as well as against the existence of presently unseen oceanographic conditions during the LGM. It also warrants merging of the two datasets to visualize changes in planktonic foraminifera biogeography. To this end, we performed k-mean cluster analysis to identify changes in the distribution of cold-water, transitional and warm-water assemblages in the merged dataset (Fig. 1).

To investigate the similarity decay, we used the Bray–Curtis dissimilarity112 because this metric is sensitive to small changes in assemblage composition and has been shown to yield robust characterizations of beta diversity113. The Bray–Curtis dissimilarity accounts for differences in the relative proportions of taxa, and we express similarity as 1 – dissimilarity. To plot assemblage similarity as a function of environmental (temperature) distance (similarity or distance-decay plot), we used annual-mean temperature data at 50 m water depth.

We averaged LGM assemblages from the same sites before calculating the dissimilarities to avoid skewing the decay pattern towards higher similarity at zero thermal distance induced by replicate samples. To determine the LGM distance decay, we derived LGM temperatures from the simulated temperature anomaly to minimize possible model-specific bias. For the calculation of temporal turnover (Fig. 3), we used only LGM samples that have a core top within a 100 km radius to reduce turnover that is due to spatial, rather than temporal, compositional change. This cut-off distance is warranted by the fact that sediment samples spatially integrate over considerable distances92.

To assess the effect of differential spatial sampling of the core-top and LGM datasets on the similarity-decay pattern, we subsampled the core-top dataset to sites within a 100 km radius from the LGM samples and to the sites nearest to the LGM samples. The decay patterns of the subsampled data are nearly identical to the pattern using all core-top samples (Extended Data Fig. 7). Hence, we conclude that our reference core-top similarity-decay pattern is not affected by differential sampling of the temperature gradient.

The anomalously high similarities among LGM assemblages at large simulated temperature differences do not arise from chronological uncertainty (Extended Data Fig. 8). They are present in all four subsets of the data, except in the subset with lowest chronological confidence (level 3) because of the spatial distribution of the samples. Notably, the thermal limits for differences between assemblages derived from the core-top data (black step(ped) line in Fig. 1) already incorporate noise due to chronological uncertainty in the data39 and hence represent conservative estimates of the position of these limits. The fact that some simulated temperature differences fall outside these limits thus underscores the chronological robustness of our findings.

To evaluate whether the anomalous LGM similarity-decay pattern reflects model-specific bias or arises from the LGM simulations, we compared the climatology-derived core-top similarity-decay pattern with the patterns obtained using the (pre-industrial) control runs of the climate models. The patterns obtained using simulated temperatures show only minor differences from those obtained using climatology and, importantly, do not show the excess similarities at large temperature differences (Extended Data Fig. 3). Thus, the odd pattern in the LGM similarity decay is a phenomenon related to the way glacial conditions are simulated. At the same time, these results suggest that the similarity-decay pattern in the core-top data is not due to post-industrial changes in temperature gradients, consistent with the globally uniform nature of anthropogenic warming114.

Finally, the difference between the modern similarity-decay pattern and the simulation-based LGM pattern could in theory also arise from some degree of ecological noise in the assemblage data. We note, however, that the cluster of site pairs with anomalously large temperature differences at high similarity lies outside the limits of the modern decay pattern (Fig. 2). Moreover, the difference in the decay patterns originates from the North Atlantic, where the strength of the modern similarity–temperature relationship is larger (pseudo r2: 0.71) than the global average (pseudo r2: 0.47). Here the pseudo r2 is calculated as \(1-\left(\frac{{model\; deviance}}{{null\; deviance}}\right)\) derived from a binomial generalized linear model with a log link to account for the nonlinear nature of the data. We therefore conclude that the difference between the modern decay pattern and the LGM decay pattern using simulated temperatures is robust.

Temperature estimates

Previous work has documented that temperature is the single best predictor of planktonic foraminifera species assemblage composition on a global scale23,115, and various transfer-function techniques have been successfully applied to obtain temperature estimates of the past ocean17. Here we use the modern analogue technique to derive temperature estimates from the species assemblages because this method is conceptually simple and allows for nonlinearity in the response of planktonic foraminifera assemblages. We develop transfer-function models for individual regions to minimize the effect of endemism of cryptic species, which has been shown to negatively impact transfer-function performance17. Regions were defined as in previous work17, and the taxonomy was harmonized to retain as many samples possible in each region. Specifically, this means that for the Mediterranean, the subspecies of Globigerinoides ruber were lumped and that Globorotalia menardii and Globorotalia tumida were combined in the Pacific. Temperature estimates were based on the similarity weighted mean of the ten best analogues, and analogue quality was assessed using the squared chord distance because this metric has been shown to be most effective in identifying analogues in microfossil datasets116.

Vertical and seasonal attribution

In contrast to previous work, we provide temperature reconstructions for a single season only. This is because seasonal seawater temperatures are highly correlated, and temperatures at different seasons have therefore little, if any, independent predictive power17,23,117. This issue has been touched upon before17, and here we make the conscious decision to break with convention and provide only a single independent estimate of past temperature. However, ambiguity persists as to which seasonal temperature the species assemblages reflect best14. Given the three-dimensional nature of the ocean, the same applies to the vertical dimension because planktonic foraminifera have not only variable seasonal abundances118 but also variable depth habitats in the global ocean119. We address this problem by estimating the proportion of variance in the species assemblages that is explained by temperature during different seasons and at different depths14,117. The temperature during the season and at the depth that explains globally most of the variance in the species assemblages is then used for the final transfer-function model that is applied to estimate temperature from the fossil samples.

We first performed this analysis using canonical correspondence analysis on the core-top assemblages data using temperatures from climatology as constraining variables. Then to test the significance of temperature estimates from transfer functions, we also evaluated to what degree the temperatures predicted from both the core-top and the LGM assemblages explain the variance in the assemblages themselves. In doing so, we extended the framework used to assess significance of transfer-function predictions in previous work14,117 from the temporal to the spatial domain. We derived the proportion of variance in the first principal component of the assemblage data that is explained by the predicted temperatures assuming that this first component captures the imprint of the environmental variability (in space) on the assemblage data.

We assessed four seasons and annual mean at 14 different depth levels in the upper 500 m for five regions; that is, we built 350 transfer-function models to derive temperature estimates for the modern and fossil dataset. Given the larger vertical than seasonal temperature gradients in the tropics28, we analysed tropical and extratropical regions separately. Irrespective of whether we assessed core-top or LGM assemblages, we found that seasonal temperatures explain nearly the same amount of variance in the species assemblages in most regions and that the choice of seasonal temperature for transfer-function development is not critical (Extended Data Fig. 9). A different picture emerges when considering depth. Especially in the tropics, there is a clear pattern in the depth at which temperature explains most of the variance; outside of the tropics, the dependence on depth is markedly smaller (Extended Data Fig. 9). Although there are regional differences, the tropics generally show a maximum in the explained variance that is not at, but slightly below, the sea surface. Globally, most of the variance in the modern as well as in the fossil planktonic foraminifera assemblages is explained by mean annual temperatures at 50 m water depth. We thus estimated LGM temperatures using the transfer-function model trained on annual-mean temperatures at 50 m water depth.

Negligible effect of poor analogues

A potential caveat with the use of transfer functions is the presence of fossil samples with low similarity to the samples in the modern training set, the so-called poor analogue problem. Warm LGM subtropical gyres, a consistent feature of planktonic foraminifera assemblage-based seawater temperature reconstructions16,17, have previously been attributed to poor analogues. To some degree, this problem can be related to the species assemblage variability captured in the training set. Compared with previous work, we used a much more extensive core-top dataset, and we thus expect the poor analogue problem to be smaller. Nevertheless, to assess the possible effect on the reconstructed temperature of LGM assemblages that lack close analogues in the core-top dataset, we filtered out samples with a minimum square chord distance to the core-top samples above the fifth percentile of the pairwise dissimilarities in the global core-top dataset111 (Extended Data Fig. 10). The mean and 95 percentile range of the reconstruction are −2.45 °C (−9.51 to 1.22 °C) without analogue filtering. Global analogue filtering reduces the number of samples and sites by 169 and 15, respectively, and yields a mean temperature change of −2.46 °C (−9.55 to 1.27 °C). When using a regionally defined threshold, the dataset is reduced by 252 samples and 29 sites, and the effect on the reconstructed temperature change is also very small (mean −2.46 °C (−9.56 to 1.21 °C)). Thus, this sensitivity test shows that the influence of poor analogues in the LGM dataset is negligible and that the subtropical warming is robust.

Uncertainty

Because of the strong spatial autocorrelation in ocean temperatures, the uncertainty of the temperature prediction based on planktonic foraminifera assemblages is also spatially correlated. We have assessed the effect of the reconstruction uncertainty using a Monte Carlo approach. For each region, we generated 1,000 random temperature fields with the same spatial autocorrelation structure as the residuals of the transfer-function model. We added these to the reconstruction itself to determine the uncertainty in the overall estimated mean temperature change. Whenever multiple estimates were available for the same site, we reduced the uncertainty by the number of observations. The reconstruction errors (2 s.d.) vary regionally, with a median value of 1.52 °C. This uncertainty is larger than previously reported17 because our estimate accounts for spatial autocorrelation120.

Anomalies

We present temperature anomalies with respect to climatology because not all LGM samples have a nearby core-top sample. Presenting anomalies with respect to temperature derived from the core-top assemblages would therefore result in reduced spatial coverage. We have nevertheless assessed whether the way the anomalies are calculated matters for the reconstructions by comparing the two different temperature anomalies for LGM sites with core-top samples within a 100 km radius. We find only a negligible and random difference (0.01 ± 1.27 °C; 1 s.d.) without any spatial pattern, validating the use of anomalies versus climatology.

Calculations

All calculations and visualizations were made in R121 using the packages tidyverse122, vegan123, geosphere124, patchwork125, rioja126, broom127, fields128, rgdal129, sp130,131 and gstat132,133. The reconstruction shown in Fig. 3 was gridded in an optimal way using the Data Interpolating Variational Analysis method134 as outlined in ref. 135, but based on annual-mean sea-ice extent reconstruction and transfer-function results.