Invited review
Review of probabilistic pollen-climate transfer methods
Highlights
► Classical pollen-climate transfer methods are linked to spatial multi-proxy BHM.
► A pollen-ratio model performs similarly to MAT for summer temperature reconstruction.
► Approximately 1000-yr varved sediment records from west-central Wisconsin, USA.
► Holocene pollen records from southern California, USA.
► Single-site reconstructions are coherent but underestimate cross-site uncertainty.
Introduction
Over the past century numerous approaches to paleoclimate reconstruction have been developed. Because of the different types of proxy data, the different reconstruction methods, and the spatially and temporally inhomogeneous distribution of paleo archives, obtaining a coherent picture of past climate variability remains a complicated task. In order to address this problem, we identify two common goals when analyzing paleoclimate data, which are to:
- combine proxies for a spatially complete picture of past climate in order to inform physical understanding of the climate system in terms of dynamics and response to forcings; and
- estimate the uncertainty in climate reconstructions, which is naturally large due to the approximation of a complex proxy-climate relation.
A significant advance in the attempt to provide a spatially complete picture of past climate has been the development of systematic climate field reconstruction methods, or CFRs. CFRs were first developed from eigenvector/singular value empirical orthogonal function (EOF) methods used in modern instrumental climatology. In paleoclimatology, several variants of EOF-based CFRs have been employed, primarily with annually resolved proxy data (of multiple kinds) over the past two millennia: all use the proxy information to estimate the temporal loadings of a truncated basis set of the EOFs of the instrumental climate field of interest to be reconstructed back in time. Once established through direct (e.g. Luterbacher et al., 2004) or inverse regression relationships (e.g. Mann et al., 1998, Mann et al., 1999, Wahl and Ammann, 2007), these estimated loadings are then applied to the retained EOF patterns (through the appropriate expansion formula for EOF derivation) during both the instrumental and reconstruction periods to form estimated values of the entire climate field, conditional on the proxy information. A variant of this method uses the regularized expectation maximization (RegEM) algorithm to estimate the temporal loadings from the proxy information, and then reconstructs the EOFs as described (e.g., Mann et al., 2009); RegEM can also be used to impute the reconstruction field directly (e.g., Rutherford et al., 2005, Smerdon et al., 2010). A closely-related method forms a truncated basis set from the joint proxy-instrumental cross-covariance for the same purpose in canonical correlation analysis (CCA) (cf. Cook et al., 1994, Smerdon et al., 2010). 
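The truncated-EOF procedure described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not the implementation used in any of the cited studies: the array sizes, noise levels, and variable names are all assumptions, and the regression step takes the "direct" direction (loadings regressed on proxies) with ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: calibration years, reconstruction years,
# grid cells, proxy series, and number of retained EOFs.
n_cal, n_rec, n_grid, n_proxy, k = 100, 50, 30, 12, 3
n_all = n_cal + n_rec

# Synthetic low-rank "climate field" plus noise (rows = years).
patterns_true = rng.normal(size=(k, n_grid))
loadings_true = rng.normal(size=(n_all, k))
field = loadings_true @ patterns_true + 0.1 * rng.normal(size=(n_all, n_grid))

# Synthetic proxies: noisy linear functions of the same loadings.
proxies = loadings_true @ rng.normal(size=(k, n_proxy))
proxies += 0.3 * rng.normal(size=(n_all, n_proxy))

# First n_rec rows = reconstruction period, remaining rows = calibration.
field_cal = field[n_rec:]
mean_cal = field_cal.mean(axis=0)

# Step 1: EOFs of the instrumental field via SVD, truncated to k modes.
u, s, vt = np.linalg.svd(field_cal - mean_cal, full_matrices=False)
eofs = vt[:k]                 # (k, n_grid) spatial patterns
pcs_cal = u[:, :k] * s[:k]    # (n_cal, k) temporal loadings

def design(p):
    """Add an intercept column to a proxy matrix."""
    return np.column_stack([np.ones(len(p)), p])

# Step 2: regress the temporal loadings on the proxies (calibration period).
coef, *_ = np.linalg.lstsq(design(proxies[n_rec:]), pcs_cal, rcond=None)

# Step 3: estimate loadings back in time and expand onto the EOF patterns.
pcs_rec = design(proxies[:n_rec]) @ coef
field_rec = pcs_rec @ eofs + mean_cal

# Skill check against the synthetic truth (only possible in a test bed).
corr = np.corrcoef(field_rec.ravel(), field[:n_rec].ravel())[0, 1]
```

The sketch returns a single best-estimate field; it carries no uncertainty estimate, which is the limitation the following paragraphs address.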
Although it is not a systematic field reconstruction method per se, the point-by-point regression (PPR) method developed by Cook et al. (2004, 2007, 2010) has been used successfully to reconstruct fields of the Palmer Drought Severity Index (PDSI), a key hydroclimate variable that integrates both temperature and precipitation in a soil moisture model, in both North America (2004, 2007) and Monsoon Asia (2010).
CFR methods are a large step forward in reconstructing past climate in a spatially complete manner; however, in some cases they have not been accompanied by systematic estimation of uncertainty at the field level, as opposed to index values of regional, hemispheric or global means (e.g., Luterbacher et al., 2004, Mann et al., 2008, Mann et al., 2009). Mann et al. (2009) used the unexplained variability in the validation period multiplied by the calibration period decadal variance at each grid cell as a measure of the squared decadal standard error of reconstruction. Gebhardt et al. (2008) present a different approach in which a probabilistic data-based method for local reconstructions is combined with a dynamic constraint on the reconstructed climate parameter, so that climatological fields are optimized with respect to both the proxy data and the prescribed dynamics in a statistically consistent way. Wahl and Smerdon (submitted) apply a simplified version of a technique for generating probabilistic ensemble draws of spatial mean temperature reconstructions (Li et al., 2007) to one of the truncated EOF CFR methods described above, thereby generating ensemble draws of the estimated temporal loadings associated with the retained EOFs and in turn yielding a probabilistic field reconstruction ensemble for the target domain of western North America. This method, while a valuable step forward, still underestimates the full uncertainty inherent in the reconstruction process because it does not include a statistical model for uncertainty in the proxy data. A more complete characterization of uncertainty would involve developing such models and incorporating them into the reconstruction process, to which we now turn.
At the forward edge of these methods, with special focus on the common goals of incorporating multiple proxies and quantifying uncertainties in a rigorous manner, is the work of Li et al. (2010). They developed a Bayesian hierarchical model (BHM) to reconstruct the Northern Hemisphere mean temperature from different sources, such as proxies with different temporal resolution as well as external forcings such as solar irradiance, volcanic aerosols, and greenhouse gas concentrations. Their method is applied to synthetic climate data taken from a global climate model and synthetic proxy data also derived from the simulation. Tingley and Huybers (2010) provide an important extension to this approach by reconstructing space–time temperature processes.
However, as mentioned, these methods have been applied to pseudo-proxies. On the one hand, this is a well-justified and natural approach to test the capability of a BHM for paleoclimate reconstructions, as emphasized in the comments on Li et al. (2010) by Cressie and Tingley. It allows the characteristics of a method with respect to spatio-temporal processes to be estimated before more complex climate-proxy relations are incorporated. Similar tests have been performed, e.g., by von Storch et al. (2009), who compare three different reconstruction methods (the inverse regression method of Mann et al. (1998), a direct principal components regression, and the composite plus scaling method) in the virtual reality of the ECHO-G climate simulation. The result, that the methods display different performance depending on the noise model and the size of the pseudo-proxy network, shows that test-bed studies are an important informative tool. This utility is confirmed in a growing body of reconstruction simulation experiments (e.g., Smerdon et al., 2010).
On the other hand, additional uncertainty arises from the question of how well the test-bed studies relate to the performance of the same methods using real data. The way pseudo-proxies are created calls for caution in assuming that test-bed results and real-data results, obtained with the same method and targeting the same spatial domain, will necessarily strongly parallel each other, even if the test-bed climate realistically represents the real-world climate. This is an issue that needs to be addressed in applying these approaches to fossil proxy data, including the main focus of this article, which is on pollen. Although the aforementioned BHMs account for different noise models, the underlying assumption is a linear relation between proxies and temperature. We discuss the aspect of linearization for a limited case in Section 3. However, a glance at the large number of approaches to pollen-based climate reconstruction, or so-called transfer functions (Section 2), makes it obvious that a linear link between pollen counts and environmental variables such as near surface temperature is not generally applicable for pollen as a paleo-proxy. From these observations come the idea and motivation for this study, which is to enhance the realism of spatial hierarchical models for use of pollen proxies in paleoclimate reconstruction.
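To make the pseudo-proxy construction discussed above concrete, the following sketch degrades a "true" simulated series with white or AR(1) red noise at a prescribed signal-to-noise ratio. The function name, the definition of SNR as a ratio of standard deviations, and the example series are assumptions chosen for illustration; the cited test-bed studies may define their noise models differently.

```python
import numpy as np

def make_pseudoproxy(signal, snr, rho=0.0, rng=None):
    """Return signal plus AR(1) noise (white if rho == 0).

    snr is taken here as std(signal) / std(noise); rho is the lag-1
    autocorrelation of the noise. Both conventions are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(signal)
    innov = rng.normal(size=n)
    noise = np.empty(n)
    noise[0] = innov[0]
    for t in range(1, n):
        # AR(1) recursion with innovations scaled to keep unit variance.
        noise[t] = rho * noise[t - 1] + np.sqrt(1.0 - rho**2) * innov[t]
    # Rescale so the sample noise standard deviation matches the target SNR.
    noise *= np.std(signal) / (snr * np.std(noise))
    return signal + noise

# Example: a smooth synthetic "temperature" series and a noisy pseudo-proxy.
rng = np.random.default_rng(0)
signal = np.sin(np.linspace(0.0, 20.0, 500))
pseudoproxy = make_pseudoproxy(signal, snr=0.5, rho=0.3, rng=rng)
achieved_snr = np.std(signal) / np.std(pseudoproxy - signal)
```

Note that the noise here is additive and the proxy response is linear, which is exactly the simplification that the text argues does not carry over to pollen.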
The general challenge in pollen-based paleoclimate reconstructions is that there is not a strong direct functional or mechanistic relationship between pollen spectra and climatic variables. This complexity arises because pollen production is affected by the interaction of a large number of nonlinear processes. Let us consider the ideal case of a single tree species and its dependence on one climate parameter, e.g. average near surface temperature. In a very simplified picture, one might think of a unimodal response of pollen counts to temperature, describing a maximum climatic range with increasing pollen production toward optimum conditions somewhere within. While this assumption appears to be appropriate for the abundances of whole organisms (such as chironomids, used in methods related to pollen-based paleoenvironmental reconstruction), it should be noted that pollen production is a sub-organism-level, secondary metabolic process, with each grain containing the male gamete component of plant reproductive biology. It includes irregularities like mast years, during which plants produce heavy seed crops, while in other years there is no, poor, or only moderate seed production. Furthermore, climate-pollen relations exhibit a second order effect beyond vegetation differences: in some situations pollen production under stress, i.e. at the edge of a plant’s climatic range, might even be higher than average. We do not address tertiary effects on local pollen assemblages such as wind-drift, but a key process that needs to be accounted for is plant competition, which makes it difficult to treat different taxa independently. In addition, the combination of taxa itself can be disturbed by cultivation of land, leading to so-called no-modern-analogue situations. A concrete example is given in Neumann et al. (2007) for the Eastern Mediterranean, which is heavily prone to cultivation during the late Holocene.
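As an illustration of the simplified unimodal picture mentioned above, one common parameterization in ecology is the Gaussian response curve. The symbols are assumptions introduced here, not notation from this article: $y$ is the expected pollen count of a single taxon, $T$ the temperature, $T_{\mathrm{opt}}$ the optimum, $t$ the tolerance (width of the climatic range), and $h$ the expected count at the optimum.

```latex
\mathbb{E}[y \mid T] = h \exp\!\left( -\frac{(T - T_{\mathrm{opt}})^{2}}{2\,t^{2}} \right)
```

Mast years, elevated production under stress, and competition between taxa are all departures that such a single-taxon curve does not capture.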
We will detail the basic concepts of pollen-based climate reconstructions from a more paleobotanical perspective in Section 2, but the brief considerations above indicate how much pollen-based climate reconstructions naturally involve multivariate, nonlinear processes and considerable uncertainties.
Uncertainties in paleoclimate transfer methods can also be understood from a more fundamental point of view. In climate science and risk management the terms epistemic and aleatory uncertainty are used (e.g. Troccoli et al., 2008). Epistemic uncertainty follows from limited process knowledge or modeling capabilities and relates to most of the problems addressed in the previous paragraph. Aleatory uncertainty is given by the stochastic nature of the climate system and its subsystems, atmosphere, ocean, cryosphere, biosphere, and lithosphere, and hence is not reducible. Using properties of one subsystem to deduce properties of another subsystem, as is done by botanical-climatological transfer methods for instance, naturally involves random processes. This motivates the recent trend from a deterministic point of view toward probabilistic methods where traditional transfer functions are understood as joint or conditional probability distributions (e.g. Kumke et al., 2004). A very natural framework to deal with this point of view, i.e. various levels of random effects, is a BHM, which leads to the objectives of this article. These new and upcoming approaches, as mentioned above, are typically derived from the perspective of spatio-temporal statistics, rather than from pollen-based research, and there is therefore a need to re-consider classical pollen-climate transfer functions (as far as this term still holds) in the development of this wider, probabilistic perspective.
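In symbols, the probabilistic reading of a transfer function can be written with Bayes' theorem. The notation is assumed for illustration and is not taken from the article: $C$ denotes the climate variable, $P$ the pollen data, and $\theta$ the parameters of the proxy and climate models. The first line treats the transfer function as a conditional distribution; the second is the generic three-level structure (data, process, parameters) of a hierarchical model.

```latex
p(C \mid P) \;\propto\; p(P \mid C)\, p(C)
\\[4pt]
p(C, \theta \mid P) \;\propto\; p(P \mid C, \theta)\, p(C \mid \theta)\, p(\theta)
```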
In Section 2 we discuss the background of pollen-climate transfer functions in the context of a Bayesian framework in order to synthesize the existing approaches. A case study is given in Section 3, where we develop a probabilistic version of the pollen-ratio method for potential inclusion in a spatial multiproxy BHM. Finally, we go beyond single-site reconstructions and analyze uncertainty in reconstructions from three nearby lake sediment records by means of ensemble post-processing (Section 4).
Section snippets
Bayesian framework
The history of pollen-climate transfer methods dates back at least 65 years, leading to a number of different approaches. A general synthesis is given in Bartlein et al. (2010) and Birks et al. (2010). Since most of the methods have been developed from a point of view focused on the paleo archives, they often employ mechanistic relations or expectation values. In order to put these into the general aims of this paper, in particular to account for the stochastic nature of both the climate and
Method
As mentioned in Section 2.4, we introduce here a simplified variation on the MAT that includes the concept of response surfaces, using a generalized linear model (GLM) with binomial response and logistic link function. This statistical formulation is well-expressed theoretically in that pollen count data for two groups of taxa of the form for the individual taxa j in group i are binomially distributed, conditional on additional covariates and the overall pollen sum across
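The kind of model named in this snippet, a GLM with binomial response and logistic link, can be sketched as follows. This is a generic illustration on synthetic data and not the authors' pollen-ratio model: the counts of one taxon group out of a two-group pollen sum, the single centered temperature covariate, the flat prior on a temperature grid, and all function names and numbers are assumptions.

```python
import numpy as np

def design(temp, center=20.0):
    """Intercept plus centered temperature (the centering value is arbitrary)."""
    return np.column_stack([np.ones_like(temp), temp - center])

def fit_binomial_logit(X, successes, trials, n_iter=25):
    """Fit a binomial GLM with logistic link by iteratively reweighted least squares."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = np.maximum(trials * p * (1.0 - p), 1e-12)   # IRLS weights
        z = eta + (successes - trials * p) / w          # working response
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * z))
    return beta

def temperature_posterior(beta, successes, trials, grid):
    """Discrete posterior over a temperature grid under a flat prior (assumed)."""
    eta = design(grid) @ beta
    # Binomial log-likelihood up to a constant: s*eta - n*log(1 + exp(eta)).
    log_lik = successes * eta - trials * np.log1p(np.exp(eta))
    post = np.exp(log_lik - log_lik.max())
    return post / post.sum()

# Synthetic "modern" calibration data: 200 sites, pollen sum of 300 each.
rng = np.random.default_rng(1)
temp_modern = rng.uniform(15.0, 25.0, size=200)
beta_true = np.array([0.2, 0.4])
trials = np.full(200, 300)
p_true = 1.0 / (1.0 + np.exp(-(design(temp_modern) @ beta_true)))
successes = rng.binomial(trials, p_true)

beta_hat = fit_binomial_logit(design(temp_modern), successes, trials)

# Hypothetical "fossil" sample: 219 grains of group 1 out of 300.
grid = np.linspace(10.0, 30.0, 401)
post = temperature_posterior(beta_hat, 219, 300, grid)
post_mean = float(np.sum(grid * post))
```

The output of the second step is a distribution over temperature rather than a point estimate, which is what allows a single-site model of this kind to be embedded in a larger hierarchical model.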
Discussion of site-specific uncertainty
An obvious next step in probabilistic reconstruction modeling is to go beyond single-site reconstructions. The reconstructions from the three nearby lakes in west-central Wisconsin, USA (Section 3) allow discussion of different sources of uncertainty and also to relate to the use of ensemble simulations in modern climatology and numerical weather prediction. Ensembles of climate simulations are presently the only way to sample uncertainties arising from initial conditions, parameter selection,
Conclusions
Compared to recent advances in modern climatology, reconstructing paleoclimate from proxy data is still a complicated task. The picture of past climate involves a patchwork of different proxy variables, environmental variables, and spatial and temporal scales using a large number of statistical methods. An important challenge to the paleoclimatological community going forward is to synthesize local reconstructions from different proxies for a spatially complete picture of past climate. This
Acknowledgements
Support for the research reported in Sections 2.4 (Modern analogues) and 3 (Case study of the pollen-ratio model) was provided by NOAA (Wisconsin summer temperature reconstructions). Additionally, this material is based upon work supported by the National Science Foundation under Grant No. 0724619 (Wisconsin summer temperature reconstructions) and Grant No. 9801449 (California July temperature and annual precipitation reconstructions). Additional acknowledgment is extended to Dr. Margaret Bryan Davis,
References (89)
- et al. Holocene climatic change in the northern Midwest: Pollen-derived estimates. Quaternary Research (1984)
- Quantitative estimates of temperature changes over the last 2700 years in Michigan based on pollen data. Quaternary Research (1981)
- et al. North American drought: reconstructions, causes, and consequences. Earth Science Reviews (2007)
- Late-Holocene climates of eastern North America estimated from pollen data. Quaternary Research (1988)
- et al. Vegetation and fire history from three lakes with varved sediments in northwestern Wisconsin (U.S.A.). Review of Palaeobotany and Palynology (1985)
- et al. A statistical approach to evaluating distance metrics and analog assignments for pollen records. Quaternary Research (2003)
- et al. Inverse vegetation modeling by Monte Carlo sampling to reconstruct paleoclimate under changed precipitation seasonality and CO2 conditions: application to glacial climate in Mediterranean region. Ecological Modelling (2000)
- et al. Calibrating pollen data in climatic terms: improving the methods. Quaternary Science Reviews (1983)
- et al. Eemian to early Würmian climate dynamics: history and pattern of changes in Central Europe. Palaeogeography, Palaeoclimatology, Palaeoecology (2004)
- et al. Holocene temperature changes in northern Fennoscandia reconstructed from chironomids using Bayesian modeling. Quaternary Science Reviews (2002)