Introduction

Biomes are conceptual constructs that categorise terrestrial ecosystems into structural and functional units and thereby help us organise our knowledge on how ecosystems work (Moncrieff and others 2016). Despite their importance, there is surprisingly little consensus on how to define biomes (Moncrieff and others 2016). Whittaker (1975) made the case that biomes are strongly dependent on climate, yet Bond (2005) illustrated how fire and herbivores can prevent vegetation from attaining its climatic potential vegetation state, and Walter (1973) drew attention to how soils and orographic factors allow vegetation to deviate from the climatic potential. The manifold impacts of climatic change make it extremely likely that these constructs we call biomes are changing in character and distribution (Parmesan and others 2022). Indeed, studies that have used satellite imagery detected changes in vegetation cover and ecosystem functioning that are large enough to qualify as biome shifts (Seddon and others 2016; Higgins and others 2016; Buitenwerf and others 2015; Song and others 2018; Zhu and others 2016).

Predictions of future biome shifts are primarily derived from dynamic global vegetation models (DGVMs). DGVMs are ambitious models that seek to simulate how resource assimilation, growth, competition and consumption (fire, herbivory) processes interact over ecological time scales (a few years to a few centuries) and thereby how biomes might shift in response to changes in the climate system (Prentice and others 2007). Recent studies have, however, suggested that DGVM models poorly represent how co-limitation processes involving temperature, soil moisture, nutrient availability and atmospheric CO2 may impact on vegetation dynamics (Smith and others 2016; Wang and others 2020). In this study, we therefore follow an alternative, data-driven approach that uses the information in species distribution databases and developments in process-based species distribution modelling to infer the risks of biome shifts.

Methods from species distribution modelling have previously been used to model biome shifts. For example, Rutherford and others (1999) used species distribution modelling to model the climate space occupied by the biomes of South Africa, essentially assuming that each biome had a geographic range analogous to that of a single species. This analysis forecast rather dramatic changes in the distribution of the biomes of South Africa. In this study, we apply a recently proposed workflow (Conradi and others 2020) for modelling biome-level shifts. Unlike previous work (for example, Rutherford and others 1999), this workflow does not model biomes as singular entities, rather it uses stacked species distribution models to characterise the climatic suitabilities of the plant types that characterise biomes. The suitability of a geo-location for plant types are then used to classify geo-locations into groups. To avoid confusion with existing biome definitions, we refer to these groups as phytoclimes, which we define as geographic regions where the climate favours particular combinations of plant types. In contrast, biomes can be defined as large scale units with an internally similar set of vegetation formations that form in response to a shared macro-climate and additional factors such as competition, facilitation, disturbance (for example, fire, herbivory, extreme weather events), dispersal, recruitment and time. That is, the phytoclime concept is related to yet distinct from biome concepts.

The data-driven approach used in this study has several advantages. First, by using the data on the distribution of hundreds of species to characterise the suitabilities of each plant type, the workflow uses model averaging, which so long as there is no systematic bias in the model projections, reduces uncertainty in the estimates of each plant type’s environmental suitability. Second, the workflow does not a priori define the phytoclimes, which in turn makes the phytoclime definition independent of expert knowledge on the floristic and climatic characteristics of vegetation formations that is often used to classify biomes. Third, projections of phytoclime shifts emerge as the cumulative response of all species belonging to each plant type to shifting climates. That is, phytoclimes are not assumed to be entities that shift, rather it is assumed that the suitability of locations for species will shift causing shifts in plant type suitability surfaces, which can be summarised as shifts in phytoclimes.

This study uses southern Africa as a case study to explore phytoclime shifts, where southern Africa is defined as Namibia, Botswana, Zimbabwe, South Africa, Lesotho, Eswatini and Mozambique south of the Zambezi River. South Africa in particular has a long history of biome research (Acocks 1953; Rutherford and Westfall 2006; Rutherford and others 2003), and the biome concept has shaped and structured environmental policy and the practice of ecological science in South Africa over recent decades. The South African biodiversity vulnerability assessment, which was part of the South African country study on climate change (Rutherford and others 1999), developed the case that changes in the climate forces that regulate the distribution of the biomes of South Africa would lead to large and dramatic re-organisation of the country’s vegetation. The severity of the impacts predicted in the Rutherford study served to stimulate an intensification of research on climate change impacts in the region. Perhaps more importantly, the report has had a sustained impact on South African climate change policy. More than two decades later, improved global circulation models, revised climate change scenarios and new methods for linking vegetation to climate forcing data are available. The aim of this study is therefore to revisit how climatic change may impact the distribution of the vegetation of southern Africa using the workflow proposed by Conradi and others (2020) and state-of-the-art climate projections (Karger and others 2021) derived from the latest ensemble of global circulation model (GCM) runs from the Coupled Model Intercomparison Project (CMIP6, Eyring and others 2016).

Methods

Species Distribution Data Sources

In this section, we describe the species distribution data sources used, how we cleaned the distribution data, resolved the species names and assigned growth form types to species.

Because the species distribution data are used to characterise the climatic suitabilities of the regions major plant types, we included species endemic to the study region and any species that had distribution records in the study region. We use the NVD (National Vegetation Database, Rutherford and others 2012), ACKDAT (Acocks’ Collection of South African Vegetation Data, Rutherford and others 2003) and BIEN (Botanical Information and Ecology Network) version 4.1 (http://bien.nceas.ucsb.edu/bien/) as sources of observational data on species distributions. NVD is a database of vegetation plot surveys conducted in South Africa, while ACKDAT is a database of extensive species inventories conducted by John Acocks throughout South Africa. BIEN is a global database of georeferenced plant observations derived mainly from herbarium collections, plot surveys and observations, and includes the South African PRECIS (Pretoria Computerised Information System) herbarium database (which is imported to BIEN from GBIF), but not ACKDAT or NVD. We downloaded the global distribution of all species with at least one record in the southern African study region from the non-public version of BIEN, but used only ’plot’ or ’specimen’ observation types. Observations that were flagged as ’literature’, ’checklist’, ’human observation’, ’trait occurrence’, ’cultivated’ and ’unknown’ were not used. We did use records flagged as ’non-native’. The non-public BIEN version includes presence records of taxa listed in the CITES (Convention on International Trade in Endangered Species of Wild Fauna and Flora) appendices I, II and III and taxa listed as ’EN’ or ’CR’ in the IUCN Red List of Threatened Species (Ver. 3.1, 2nd Ed.) that are not accessible from the public version of BIEN. The BIEN data were downloaded using the R package todoBIEN v.1.2.4, which has the same functionality of the BIEN R package (Maitner 2020), but accesses the non-public BIEN database version. References for BIEN data used are provided in Supplement S1.

Each database was filtered using data quality criteria. NVD plots with location confidence > 5 km were removed and only plots that surveyed ’all species’, ’full floristics’ or ’permanently recognisable species’ were included. Location confidence in ACKDAT is 1.5 km and the geographic resolution of all ACKDAT sites were therefore regarded as sufficient. We retained only ACKDAT ’SiteType A’ locations as these sites had complete inventories. Records in the BIEN databases have already been subject to quality filtering, including validations of coordinates and species status (for example, cultivation status). We further filtered the data by removing records with zero latitude or longitude and records within a radius of 10 km from country and province centroids, 5 km from country capitals, 200 m from biodiversity institutions (museums, herbaria, botanical gardens, universities) and one degree from the GBIF headquarters in Copenhagen, Denmark, using the R package CoordinateCleaner version 2.0-11 (Zizka and others 2019). Because country centroids may differ between spatial databases (CoordinateCleaner uses www.naturalearthdata.com), we additionally calculated country centroids based on the Database of Global Administrative Areas version 3.4 and removed records within a 20-km buffer to these centroids.

Taxonomic names in NVD and ACKDAT use different versions of the POSA (Plants of South Africa) taxonomy (Germishuizen and Meyer 2003) and were standardised using the Taxonomic Name Resolution Service (TNRS, Boyle and others 2013). The BIEN taxon names were also standardised with the TNRS (using the 'verbatim scientific name' provided by the original datasets) and then matched with the standardised ACKDAT and NVD names. The TNRS resolved taxon names based on queries to The Plant List (http://www.theplantlist.org), Tropicos (http://www.tropicos.org) and The PLANTS Database (http://plants.usda.gov). In total 289 (0.04%) taxon names in ACKDAT and 1370 (12%) taxon names in NVD could not be resolved. The unresolved NVD taxon names are mostly taxa that were identified to the genus level only. Records with unresolved taxon names were discarded from further analysis steps.

Presence and Pseudo-absence Sampling

In this section, we describe how the cleaned and filtered species distribution data are used to generate the occurrence (presence and pseudo-absence) data needed for the species distribution modelling.

The filtered species distribution data described in the previous section were used to generate a list of species for further analysis. For each species, we combined the geographic locations of the presence observations in ACKDAT, NVD and BIEN and then removed duplicate locations using the R command sp::remove.duplicates (Pebesma and Bivand 2005). We then created a list of unique locations where the species had not been observed in the ACKDAT and NVD surveys, thereby creating a list of absence observations. The resulting lists of presence and absence locations were then thinned using the R command spThin::thin (Aiello-Lammens and others 2015). This procedure uses a randomisation algorithm to sample point data sets so that the largest possible sample of points is retained with the constraint that the sampled points are more than a threshold distance from one another. We used 10 km as the threshold distance. Thinning served to both reduce geographical variation in sampling intensity and, by reducing sample size, reduce computational time.

We then used the Köppen-Geiger climate zones (hereafter simply Köppen zones, Kottek and others 2006) to stratify the sampling of records by climate zones. This stratified sampling preserved the relative proportion of records in the different Köppen zones. That is pj = Nprj/rtotal presence records are sampled in each zone. Here rj is the number of presence records in zone j, rtotal is the total number of presence records for the species and Np is the total number of presence records we wish to sample (Np  ≤  rtotal). We generated two types of pseudo-absence points. The first draws a sample from the locations where the species was not observed in the ACKDAT and NVD databases. ACKDAT and NVD are restricted to South Africa, therefore the second type of pseudo-absence points therefore consists of randomly selected locations outside of South Africa. The number of not-observed locations to be sampled in each zone is weighted by aj, where aj = (max(p) − pj) and p is the vector of presence records (p) across Köppen zones. We use this weighting factor to sample Np sites where the species was not observed in the NVD and ACKDAT records from each Köppen zone. To generate additional pseudo-absence points we used the same weighting factor applied to each Köppen zone to randomly select Np pseudo-absence points outside of South Africa. Overall the presence pseudo-absence sampling scheme results in twice as many absence as presence points and ensures that all Köppen zones are represented.

Plant Type Assignment

In this section, we describe how we assign species to plant types. This is done for all species for which we have adequate species distribution data to attempt species distribution model fits.

The species were grouped into ten plant type types: trees, shrubs, C3 grasses, C4 grasses, restioids, geophytes, annual forbs, perennial forbs, succulents and climbers. Species in the Restionaceae were classified as restioids. All species belonging to the families Poaceae, Cyperaceae, Juncaceae, Anarthriaceae, Centrolepidaceae, Ecdeiocoleaceae, Joinvilleaceae, Thurniaceae and Typhaceae were classified as grasses. The remaining Poales were assigned to other plant types. The database of Osborne and others (2014) was then used to identify grass species with the C4 photosynthetic pathway. Data limitations mean that we assumed that all other plants use the C3 pathway. Eggli (2001), Eggli (2002), Hartmann (2002a), Hartmann (2002b), Albers and Meve (2002), Eggli (2003), and Eggli and Nyffeler (2020) were used to assign succulents. Epiphytic succulents were assigned to the climber category. For the remaining species, we extracted information on whole plant type from POSA, when information was not available in POSA we used information from BIEN. Sources of the BIEN plant type data are in Supplement S1. If multiple entries were contradictory (for example, a species might have database entries 'shrub' and 'tree'), we used the most frequent entry. If species-level plant type information was not available, we used the most frequent entry for the genus in POSA, and if that was also not available, we used the most frequent entry for the family in POSA. Species classified as herb or fern were assigned to the forb category, and species classified as epiphyte, liana or vine were assigned to the climber category. Species classified as aquatic or hydrophytes were removed. We used the 'life form 2' category in the GIFT (the Global Inventory of Floras and Traits) database (Weigelt and others 2020) to assign annual forbs and geophytes. Preliminary analysis using information on whether trees and shrubs were evergreen or deciduous or whether species were fire associated were abandoned due to uncertainty in assigning species to these categories.

Species Distribution Modelling

In this section, we describe the species distribution modelling undertaken using the TTR-SDM (Thornley transport resistance species distribution model). The next paragraphs provide an overview of the species distribution model and the model variants used in this study. Supplementary materials 2 provides a description of the Thornley transport resistance model and how environmental forcing is added to the Thornley model to yield the TTR-SDM. Supplementary materials 3 provides a description of the Farquhar photosynthesis model that is used to describe carbon assimilation in some variants of the TTR-SDM. Figure S1 provides a conceptual overview of the model, while Table S1 provides an overview of the model’s state variables, constants and forcing variables.

The TTR-SDM is based on a model of plant growth (Thornley 1998) linked to monthly environmental forcing (Higgins and others 2012). The model simulates how the growth of a plant is co-limited by photosynthetically active radiation, atmospheric CO2, temperature, soil moisture and soil nitrogen. Each of these environmental factors can vary on a monthly basis and the parameters describing the environmental co-limitation are calibrated for each species using distribution data. We use four different variations of the TTR-SDM in this study.

The first variant, TTR-STD, is the model as described in Higgins and others (2012). The second, TTR-FQR replaces the carbon assimilation routines with a Farquhar-type (Farquhar and others 1980; von Caemmerer 2000) photosynthesis model; this model version was first used in Conradi and others (2020). The Farquhar model is widely used in dynamic vegetation models to describe how light, temperature and atmospheric CO2 influence photosynthetic rates. C3 and C4 variants of the the Farquhar model, as described by von Caemmerer (2000) are used in this study. The third, TTR-RED also uses the Farquhar photosynthesis model, but reformulates how the environment influences growth and assimilation. In particular, co-limitation processes are described as products of limiting factors and not as the minimum of limiting factors. In TTR-STD and TTR-FQR biomass growth is influenced by temperature, whereas in TTR-RED it is influenced by temperature and soil moisture. TTR-RED is used here for the first time. TTR-STD and TTR-FQR are fitted using a genetic algorithm, TTR-RED is fitted using a genetic algorithm (we refer to this as TTR-RED-DE) and with a Markov chain Monte Carlo method (MCMC, we refer to this as TTR-RED-LD). TTR-STD, TTR-FQR and TTR-RED describe the physiological niche of species with 24, 18 and 18 parameters, respectively.

Previous work has shown that the TTR-SDM produces parameterisations with high predictive accuracy and transferability to climatic conditions outside the training data domain (Magarey and others 2018; Higgins and others 2020, 2021), which is a requirement for characterising phytoclimes (Conradi and others 2020).

Climate Data

In this section, we describe the climate system data sources used to force the TTR-SDM.

The environmental dependencies described in the previous section and Supplements 2 and 3 are modelled in monthly time steps. We used monthly climate data at 1 km2 resolution from CHELSA 2.1 (Karger and others 2017, 2021). The ambient data in CHELSA are represented by climatology for the years 1981–2010. The future climate data for CHELSA is based on ISIMIP3b BA CMIP6 GCM simulations (Eyring and others 2016). Future projected climatologies from five GCM models (GFDL-ESM4, IPSL-CM6ALR, MPI-ESM1-2-HR, MRI-ESM2-0, UKESM1-0-LL) and for 3 SSPs (ssp126, ssp370, ssp585) were used for three future time slices (2011–2040, 2041–2070, 2071–2100). ssp126 represents the low end of the range of plausible future pathways, ssp370 represents a medium to high end of plausible future pathways, and ssp585 represents the high end of plausible future pathways.

For model variants that use the Farquhar photosynthesis model to describe carbon assimilation, we use atmospheric CO2 concentrations taken from ISIMIP (https://esg.pik-potsdam.de/projects/isimip/). Specifically, we used CO2 concentrations (ppm) for ssp126, ssp370 and ssp585 of 442,452 and 453 for the 2011–2040 climatology, 525,584 and 636 for the 2041–2070 climatology, and 592,761 and 963 for the 2071–2100 climatology. A CO2 concentration of 338 ppm was used for the ambient climatology.

Additional data sources used were soil field capacity and wilting point (Global-Soil-Data-Task-Group, 2000) (0.083°resolution) and monthly solar radiation (Trabucco and Zomer 2010) (1 km2 resolution). The rainfall, temperature, solar radiation and soil texture information are used in a Hargreaves-type evapotranspiration model similar to that used by Trabucco and Zomer (2010) to estimate monthly soil moisture content. Due to uncertainty in soil nitrogen data products, we do not use soil nitrogen data in this study. That is, we assume that soil nitrogen is constant and therefore that variation in nitrogen uptake is due to variations in temperature and moisture.

SDM Model Fitting

In this section, we describe how we estimated the parameters of the TTR-SDM from the species distribution data and climate system data described in the previous sections. For each species with sufficient data (more than 30 presence records after spatial thinning), we fitted each of the model variants. We set the sample size parameter, Np (Species distribution data sources section) to 200. The model fitting procedure involved using the model and the monthly forcing data to predict the biomass of a species at the sample locations. The natural log of this biomass is then used as the predictor variable in a logistic regression model that uses a complementary log–log link function. For the TTR-SDM, TTR-FQR and TTR-RED-DE model variants, we maximise the likelihood of this binomial model using the differential evolution genetic algorithm (Price and others 2006) as implemented in the R package DEoptim (Mullen and others 2011). For the TTR-RED-LD variant we additionally estimate the model using the twalk MCMC algorithm (Christen and Fox 2010) as implemented in the R package LaplacesDemon (Statisticat-LLC., 2020).

Plant Type Suitability and Phytoclime Classification

In this section, we describe how we use the spatial projections of the species distribution models fitted in the previous sections to produce plant type suitability surfaces, which are then classified to yield phytoclime maps. The workflow described in the next steps was developed and described in more detail in Conradi and others (2020).

The suitability of a location for each plant type was calculated as the average, location-specific suitability score (0–1) for all species belonging to that plant type. This procedure yielded 10 plant type suitability by site matrices (one for each plant type). We then used finite Gaussian mixture modelling, an unsupervised model-based classification method, to identify clusters in the plant type by site matrices using the R package mclust (Scrucca and others 2016). These clusters are used to project phytoclimes. We estimated the BIC (Bayesian information criterion) for a range of candidate models and number of clusters using the R command mclust::mclustBIC (Scrucca and others 2016). This revealed that model support increased as a saturating function of cluster number, suggesting that > 6 clusters are needed to describe the data and that BIC values continuously improve up to cluster numbers > 32. More than 32 clusters, however, become problematic to interpret and we therefore investigate a maximum of 32 clusters. Different variations of the mclust algorithm are available, and these preliminary analyses revealed that the option 'VVV' (ellipsoidal, varying volume, shape, and orientation models) yielded consistently better BIC values than the alternatives. Using the R command mclust::hc (Scrucca and others 2016), we then created ambient phytoclime maps with 6, 9, 12, 18, 24 and 32 phytoclimes using plant type by site matrices derived from the 4 variants of the TTR-SDM. Six clusters are the number one might expect in southern Africa if using a global biome map (for example, Olson and others 2001; Keith and others 2020). Rutherford and others (1999) used 7 biomes. The current South African biome map (Mucina and Rutherford 2006, Figure S2) uses 9.

Change Analysis

To analyse future changes in the phytoclime memberships of the grid cells, we used the clustering model estimated from the ambient plant type by site matrix to predict the phytoclime of each cell based on its predicted plant type suitability scores at each future time slice (climatologies centred on 2025, 2055, 2085). For each pixel, we then recorded for each time slice whether the pixel’s plant type suitability had changed enough to be assigned to a different phytoclime cluster. Using a Kaplan–Meier estimator (R command survival::survfit, Therneau 2020), we estimate the mean time to change, taking into consideration that the projected time to change estimates are right censored. For each pixel, we use time to phytoclime transition estimates from the four TTR-SDM variants and the five GCMs to estimate the Kaplan–Meier estimator; that is the estimates represent model averaged estimates, where we average over TTR-SDM variants and GCMs. The mean time to change estimates were calculated separately for each SSP. For these mean times, we use the year 2000 as the reference year defining ambient. These mean times to phytoclime change and their standard errors are plotted as maps allowing visualisation of the location and extent of phytoclime scale change as predicted by the models (Figures S10, S11, S12, S13, S14 and S15).

To gain additional insight, we estimated how the plant type suitability surfaces (Figure 2, Figures S4, S5, S6) change over time. The rate of change of in suitability was estimated using linear regression. The effects of TTR model variant, GCM and SPP on the rate of change are accounted for by including these factors as covariates in the regression model, allowing us to concentrate on spatial patterns in the rate of change. These analyses are plotted in Figure 7.

Results

Species Distribution Model Fits

The data filtering procedures yielded sufficient data to attempt model fits for 5327 species; this is 24% of the 21,817 species in the region (Germishuizen and Meyer 2003). For 5006 of these species, we could assign a plant type. The species were distributed as follows between the plant types: C3 grass: 294, C4 grass: 295, climber: 245, forb: 366, annual forb: 914, geophyte: 504, restiod: 67, shrub: 1499, succulent: 481, tree: 341.

For each species model, we calculated the AUC (area under the receiver operator characteristic plot) statistic, which is widely used measurement of the discriminatory capacity of classification models (Fiedling and Bell 1997). We used the AUC values of the fitted models to identify poor fits and to compare the ability of the models to describe the data. Models for 3, 6, 7 and 8 percent of the species were poor (AUC < 0.75) for the TTR-STD, TTR-FQR, TTR-RED-DE, TTR-RED-LD, respectively, and such models were excluded from subsequent analysis steps. Figure 1 compares the AUC values between the model variants, revealing that TTR-STD had systematically higher AUC values than other model variants. This is not surprising since the TTR-STD uses more free parameters. Differences between TTR-FQR and TTR-RED were smaller. Differences between TTR-RED-DE and TTR-RED-LD were even smaller: the regression line comparing these two model variants was close to the 1:1 line. Previous work has, however, indicated that good performance suggested by AUC values in a training data domain can be associated with over-fitting and that models with higher AUC often transferred less well into novel (for example, future climate or non-native region) data domains (Higgins and others 2021).

Figure 1
figure 1

Comparison of AUC values for the TTR-STD, TTR-FQR, TTR-RED-DE, TTR-RED-LD variants of the TTR model fitted to 5327 southern African plant species. The scatter plots compare the AUC values between models where each data point represents a species. The regression equations in the lower panels describe the fitted linear regressions (solid red lines). The dashed black lines represent the 1:1 line. The histograms show the distribution of AUC values for each model and \(\bar{x}\) indicates the mean AUC value.

Suitability Surfaces

The ambient suitability surfaces for each plant type (Figure 2 for TTR-RED-LD based calculations, Figures S4, S5, S6 for the other model variants), although broadly consistent between the different model variants, did show differences (compare geophytes and restioids across the different model variants), which in turn justifies including this source of model uncertainty in the phytoclime shift analyses.

Figure 2
figure 2

Ambient suitability surfaces for plant types derived from the TTR-RED-LD model. The suitability scores are the averaged suitability scores of the species belonging to each plant type, transformed to scale between 0 and 1.

The ambient suitability surfaces showed that the arid region associated with the Namib (for names of biomes and places referred to in the text see Figures S2 and S3) was generally unsuitable for all plant types and suitability gradually increases from west to east (Figure 2 and Figures S4, S5, S6). Moreover, the higher lying parts of Lesotho were generally unsuitable for most plant types. However, the suitability surfaces of the different plant types differed considerably. C3 grasses had higher suitability scores in the south coast and eastern coast of South Africa, and this enhanced suitability area extended over the Drakensberg escarpment and into the highveld areas including the Soutpansberg mountains and Nyanga mountains in Zimbabwe. C4 grasses by contrast had higher suitability scores in the northeastern part of the region. Both grass types had lower suitability scores in the arid central and western parts of the region, although this trend was stronger for C3 than for C4 grasses. Perennial forbs had higher suitability scores in the south and eastern coastal regions as well as the higher lying areas identified by C3 grasses. Annual forbs had suitability surfaces that were similar to those of perennial forbs, albeit with a broader climatic amplitude; in particular, they revealed less aversion to the arid western regions. Geophytes had high suitability scores in the fynbos region, the South African east coast and higher lying interior including the Nyanga mountains in Zimbabwe. Restioids had higher suitability scores in the fynbos regions and a distinct aversion to almost all other parts of the region. Succulents exhibited rather complex suitability surfaces, with higher suitability scores in the western and Eastern Cape as well as the lowveld regions of South Africa and the Tuli Block. Shrubs had higher suitability scores in the southern coastal regions and the central east of the region and they had lower suitability scores in the arid regions of Namibia and the Kalahari. Trees showed higher suitability scores in the northeastern part of the region and low suitability scores in an arid triangle that extends from the Cunene River mouth to the southern coast. Climbing plants had a similar suitability surface to trees, which is not surprising since many climbing plants need trees to climb on.

Phytoclimes and Change in Phytoclimes

Phytoclime maps for the TTR-RED-LD model variant are shown in Figure 3. Analogous figures using the other model variants are shown in Figures S7, S8, S9. The phytoclimes are to be interpreted as regions with similar climatic suitability for plant types. The six phytoclime map in Figure 3 shows broad units that are similar to what one finds on a global biome (Olson and others 2001) or ecosystem typology (Keith and others 2020) map, whereas the 32 phytoclime maps show more finely resolved units. Figures 4 and 5 use the case of a six phytoclime map as an illustrative example. Here, phytoclime 3 includes climates that favour sparse desert vegetation since all plant types have low suitability scores in this phytoclime. Phytoclime 2 includes climates that support arid savanna; it is also unsuitable for most plant types, but C4 grasses, annual forbs and trees have relatively high suitability scores here. Phytoclime 1 includes climates that support arid succulent vegetation formations, succulents are the most important plant type in this phytoclime followed by shrubs, C4 grasses and annual plants and to a lesser extent geophytes. Phytoclime 4 includes climates that support a type of savanna; the phytoclime is suitable for C4 grasses, trees and annual forbs and unsuitable for restioids, geophytes and C3 grasses. Phytoclime 5 supports another type of savanna, the phytoclime is, like phytoclime 4, suitable for C4 grasses, trees and annual forbs and unsuitable for restiods, but is overall more suitable for trees and C3 grasses and somewhat more suitable than phytoclime 4 for geophytes and C3 grasses. Phytoclime 6 includes climates that support both fynbos, grassland and types of savanna. It is climatically suitable for all plant types.

Figure 3
figure 3

Ambient phytoclime maps of southern Africa derived from the TTR-RED-LD model for 6, 9, 12, 18, 24 and 32 clusters. The phytoclimes were derived from a unsupervised classification of locations by their climatic suitability for the 10 plant types shown in Figure 2.

Figure 4
figure 4

Ambient and future (climatology centred on year 2085, SSP 585, GCM gfdl-esm4) phytoclime maps of southern Africa derived from the TTR-RED-LD model for a 6 phytoclime classification. The right-hand panel summarises the phytoclime transitions between the two maps.

Figure 5
figure 5

Plant type suitability for the 6 phytoclime map derived from the TTR-RED-LD model under ambient conditions and the change in the suitability scores in the ambient phytoclime regions forecast for the year 2085 under SSP 585 using the gfdl-esm4 GCM (as in Figure 4). The values in the left panel express the normalised average climatic suitability for the plant types in the phytoclimes and the values in the left panel show their change.

If we compare our phytoclimes to the 9-biome typology for the vegetation of South Africa, Lesotho and Swaziland developed by Mucina and Rutherford (2006) (Figure S2), we find broad congruence and interesting differences. In our six phytoclime map, phytoclime 6 largely subsumes Mucina’s Fynbos, Grassland, Albany Thicket and Indian Ocean Coastal Belt (IOCB); phytoclime 1 includes the Succulent Karoo biome and parts of the Nama Karoo biome; and the Savanna biome is split across phytoclimes 2 and 5. Increasing the number of phytoclimes allows us to discern more detail and phytoclimes that resemble currently recognized biomes, such as phytoclime 14 resembling the Fynbos biome in the 24 phytoclime classification, but also identify phytoclimatic units that do not correspond to the Mucina and Rutherford (2006) classification.

Using the same example of 6 phytoclimes projected by TTR-RED-LD we can explore how this map might change under one future scenario (for year 2085, assuming SSP 585 as forecast by the GFDELESM4 GCM, Figures 4 and 5). This is one of 480 future projected phytoclime maps made in our analysis (3 time slices, 3 SSPs, 5 GCMs, 4 SDMs, 6 phytoclime cluster numbers). In subsequent analyses we summarise across these projections. This single scenario illustrates that parts of phytoclime 6 (currently supporting grassland and fynbos) will lose area to phytoclime 5 (currently supporting savanna), particularly on the South African highveld and in the higher lying areas of Zimbabwe (Figure 4). Figure 5 reveals that the area currently occupied by phytoclime 6, which under ambient conditions is suitable for almost all plant types, becomes increasingly unsuitable for restiods and C3 grasses, geophytes and perennial forbs, but increasingly suitable for C4 grasses, trees, succulents and to a lesser extent for annual forbs, climbers and shrubs. The other major change observed is that phytoclime 4 will gain area from phytoclime 5 and phytoclime 2 (Figure 4). Because phytoclimes 2, 4 and 5 support different types of savannas, this analysis forecasts substantial changes in the climatic forcing within savanna vegetation formations.

To summarise when we expect a phytoclime change, we averaged the expected year of phytoclime change for each pixel across all TTR-SDM variants (n = 5), SSPs (n = 3), GCMs (n = 5) and for all phytoclime numbers (n = 6) (Figure 6). This indicates that the central interior (the Kalahari, northeastern Namibia, northern Botswana) is likely to change phytoclime earlier. If we analyse change using a 6 phytoclime analysis (Figure S10), we detect less change than with a 32 phytoclime analysis (Figure S15), because each phytoclime occupies a relatively large area in climate space and a large change in plant type suitability is needed to cross the boundary into a different phytoclime. A 6 phytoclime analysis reveals that phytoclime shifts occur mostly after more than 75 years, although there are areas in the centre of the region where change is expected within 25 and 50 years (Figure S10). The standard errors of these estimates varied between 0 to 10 years (for n = 20 used here a standard error of 10 is equivalent to a standard deviation of circa 44 years). The error was close to zero in areas where on average no change was predicted, such as the south coast and the Namib desert. The changes forecast by this 6 phytoclime analysis were remarkably consistent across SSP scenarios, revealing only an inflation of the areas projected to change earlier with more extreme scenarios (Figure S10). More sensitivity to climatic change was revealed with increasing phytoclime number. Figures S10, S11, S12, S13, S14 and S15 illustrate the change expected with 6–32 phytoclimes. Under the 32 phytoclime scenario (Figure S15) the majority of the interior of the region shifted phytoclime state within 25–50 years. The standard error associated with these areas of change is lower than in the 6 phytoclime analysis (mostly < 10 years). As is the case with the 6 phytoclime analysis, differences attributable to SSPs were quantitative and did not influence the qualitative pattern of change. Taken together, patterns in the timing of phytoclime change (Figure 6) are consistent with the continentality multiplier effect on temperature and the fact that the interior is arid under ambient climate and therefore closer to the physiological limits of many species. Botswana and adjacent areas are predicted to be areas of rapid change. Resilient regions, that is regions predicted not to be subject to phytoclime-level change, include the southern and western coastal regions and parts of the interior adjacent to South Africa’s west coast.

Figure 6
figure 6

The average time to phytoclime transition (using the year 2000 as the baseline) averaged across TTR-SDM model variants (n = 4), SSPs (n = 3), climate model (n = 5) and number of phytoclimes assumed (n = 6) using a Kaplan-Meir estimator. Figures S10—S15 plot the mean time to phytoclime transition for different phytoclime numbers and SSPs.

The analysis of the temporal change in plant type suitability surfaces (Figure 7) revealed very clear spatial trends in the estimated rate of change and these trends vary conspicuously with plant types. The most striking trend is that eastern South Africa experiences an increase in suitability for C4 grasses, trees and climbers, and to a lesser extent shrubs, succulents and annual forbs (Figure 7). Furthermore, Lesotho is forecast to become increasingly suitable for all plant types in the future, suggesting that higher altitude parts of the Drakensberg mountains could serve as a climate refuge for many species and plant types. A similar trend can be seen in the Cape Fold mountains of the fynbos region. C4 grasses will find this region more suitable in the future. These same areas, barring Lesotho, will become less suitable for C3 grasses. C4 grasses were also forecast to find the Fynbos, Karoo and Namib more favourable in the future. C3 grasses were predicted to find the south coast, an area for which they have high ambient suitability scores, less suitable in the future. Perennial forbs, which under ambient conditions had low suitability scores in the arid west, were predicted to find this area slightly more suitable in the future. Geophytes were largely predicted to face decreases in climate suitability except in Lesotho and surrounding areas. Restioids were predicted to face a dramatic decrease in climatic suitability in the areas in which they currently have a high climatic suitability (the south coast). Projected loss and gain patterns for climbers and trees resembled those for C4 grasses. Succulents were projected to face losses in their ambient high suitability areas such as the Succulent Karoo and the area around the intersection of the borders between South Africa, Botswana and Zimbabwe (the Tuli block). Shrubs were projected to find the Drakensberg and surrounding areas more suitable, but parts of the savanna regions of South Africa, Botswana and Zimbabwe as well as the Albany Thicket biome of the Eastern Cape less suitable.

Figure 7
figure 7

The mean rate of change in plant type suitability (units % change in suitability per 100 years) for the 10 plant types considered in this study. The rate of change is calculated assuming a linear change in suitability over time and using TTR-variant, GCM and SSP as covariates.

Discussion

The modelled plant type suitability surfaces (Figure 2) are the foundation upon which our analysis is based. It is therefore reassuring that the modelled surfaces are consistent with our ecological knowledge. For example, the surfaces indicate trees cannot tolerate extremely dry climates or high elevations. In addition, the differences in the suitability surfaces of C3 and C4 grasses reflect our ecological understanding of their temperature responses (Ehleringer and others 1997). When we use this information (Figure 2) to classify groups, we create phytoclimes, which are interpreted as regions where the climate promotes similar combinations of plant types. We emphasise that, although phytoclimes specify how climate constrains which types of plants can grow where, they do not specify which vegetation formation will occur. It follows that modelled phytoclime boundaries do not clearly delimit biomes of the region (for example, Mucina and Rutherford 2006, Figure S2). For example, the six phytoclime map (Figure 3) does not discriminate between Fynbos, Grassland, Albany Thicket and IOCB, lumping them all into one phytoclime with relatively high suitability for most plant types. This is because while Fynbos and Grasslands are dominated by shrubs and grasses, respectively; these biomes harbour a high diversity of species from other plant types, especially in areas protected from fire or frost. Similarly, the IOCB represents a complex mosaic of grassland, savanna and forest patches variously controlled by disturbance and hydrological conditions. There is also much uncertainty about the influence of climate on the IOCB’s plant type composition due to its long history of utilisation and transformation by humans (Skowno and others 2021). Interestingly, state-of-the art DGVMs, which consider effects of fire and competitive interactions between plant functional types, also lump Fynbos, Grassland, Albany Thicket and IOCB in one biome (Allen and others 2020).

Although the phytoclimes can be interpreted ecologically (compare Figure 5), the underlying classifications are uncertain. Using unsupervised classification to delimit phytoclime units, it is apparent that there is low agreement between phytoclime maps produced using slight variations of the protocol. Our results highlight the sensitivity to the number of phytoclime classes used and the species distribution model underlying the plant type suitability surfaces (Figures 3, S7, S8, S9). In preliminary analyses, we found that the model uncertainty in the classification remained high irrespective of the plant type classification and classification algorithm used. We speculate that this uncertainty is not due to details of the protocol used, but rather is expected when seeking boundaries on gradients of physiological suitability. We would not expect a physiological suitability gradient to have hard boundaries, nor would we expect a clustering of boundaries on gradients; both of these factors would increase uncertainty in projected boundary positions. It may be that in a global analysis with longer climatic gradients, that uncertainty in boundary positions would be lower. Nonetheless, since it is a priori not clear if hard boundaries exist, we should be cautious when forcing algorithms to classify phytoclimatic variance into units such as phytoclimes (Peuquet 2015). In addition, the patterns that emerge from the phytoclimatic classifications are influenced by the spatial resolution of our phytoclimatic surfaces and the lengths of the environmental gradients in the region (that is, the modifiable areal unit problem (MAUP) as defined by Openshaw 1984). This means that, at least for this study, the boundaries of the phytoclimes are not robust and should not be used as planning boundaries.

Our focus on interpreting the results is therefore not to use the phytoclime maps to redefine the biomes of southern Africa. This work has been comprehensively done using hybrid expert and data-driven approaches for South Africa (Mucina and Rutherford 2006) and globally (Keith and others 2020). It is for this reason that we use the term phytoclime to emphasise that our maps represent regions where the climate favours the growth of particular combinations of plant types. A Mucina and Rutherford (2006) type biome map shows a vegetation scientist’s classification of the vegetation that actually grows in a region. The plant type suitability surfaces that underlie phytoclimes are ecologically interpretable as climate attractors for a plant type (Figure 5); implicit is that dispersal, recruitment, competition, facilitation, herbivory, fire and time will conspire to ensure that the phytoclimes are seldom realised as vegetation formations.

An advantage of phytoclimes is that they are produced by a repeatable algorithm, making it trivial to project how phytoclimes may shift with changing climate. They thereby provide an transparent estimate of how climate forcing will impact on the relative physiological performance of the plant types that define the vegetation formations that constitute biomes. Another advantage is that we can explore this forcing at different resolutions. As we increase the number of units classified we by definition reduce the environmental space occupied by a phytoclime and will, all other things being equal, detect more phytoclime shifts. That is, only changes of the greatest magnitude were observable in the 6 phytoclime classification, while more subtle changes were observed in the 32 phytoclime classification. In the 32 phytoclime classification, most of the central interior of the subregion was predicted to shift state within 50 years. In reality, the change would be continuous through time and subtle at first. The difficulty is knowing what shifts or thresholds in the combination of plant type suitabilities are ecologically important for factors such as ecosystem function or supporting other biota. Our approach was objective, driven by the data and classification algorithm, but one could use the method to track particular combinations of plant type suitabilities or threshold changes of interest that were defined a priori.

Previous studies suggested rather dramatic biome shifts for South Africa. In particular Rutherford and others (1999) and DEA (2013) forecast that western South Africa would be replaced by arid biomes analogous to those found in Namibia and Botswana, that savanna would replace large parts of the grassland biome, and that the existing biomes would be compacted into the east and against the south coast. Our results do not completely contradict these earlier projections. We too see a compacting of phytoclimes and an expansion of phytoclime formations currently located north of South Africa, and we observe less change along the southern coast. We predict less change in the Namaqualand and fynbos regions than predicted by Rutherford and others (1999), but we do show that restioids, one of the key floristic indicators and dominant components of fynbos, will be severely impacted, suggesting significant change within this phytoclime.

The dominant trend in our phytoclime shift analyses is phytoclime turnover over the centre of the region. This trend is evident irrespective of the species distribution model, global circulation model or shared socio-economic pathway used. This is consistent with large (4–8 °C) temperature change expected over the central parts of the region (Engelbrecht and Engelbrecht 2016), which not only affects plant physiology directly, but also indirectly via increased evaporation. Using a regional climate model, Engelbrecht and Engelbrecht (2016) calculated that the area covered by hot desert climates south of 22° would increase from 33 to 60%. Similarly, the cold-steppe climates were projected to decrease in area from 13 to 1%. Our analysis is based on more recent climate projections (CMIP6) and their statistical downscaling (Karger and others 2021). The temperature shifts projected by CMIP6 are, however, larger than those that underpin the synthesis in Engelbrecht and Engelbrecht (2016) and rainfall is robustly projected to decrease over the western and northern parts of our study region, with only relatively small areas in the east (KwaZulu-Natal, southern Mozambique) predicted to experience increases in rainfall (Almazroui and others 2020). That is, overall the trend in the climate forcing we used is increasing temperature and aridity concentrated over the interior of the region.

There are many caveats associated with any modelling study. First, the phytoclimes produced using our protocol are best interpreted as climate attractors for different plant type combinations. These attractors move in geographical space as the climate changes, but the vegetation will lag behind them due to the legacies of persistent life history stages and priority effects associated with recruitment. Second, the roles of competition, facilitation, fire, herbivory, dispersal, recruitment and time are not explicitly accounted for and this is a weakness relative to DGVM model projections. In preliminary analyses we used the MODIS burnt area product (Giglio and others 2015) to identify species whose distributions were associated with fire activity with the aim of delimiting additional plant type classes (for example, pyrophytic versus pyrophobic shrubs). The MODIS record is, however, currently not long enough to adequately characterise the fire regime of the fynbos, and many fynbos species known to be pyrophytes were statistically not associated with fire activity as summarised in the MODIS record. This problem could be resolved in future studies that, for example, use statistical methods to interpolate fire regimes for the region. A further caveat is that the restioid plant type is forecast to be the most severely impacted plant type. However, restioid species inhabit topographically complex environments, which are not necessarily perfectly represented by the resolution of the distribution and climate forcing data used to characterise their phytoclimatic profiles. Future studies should therefore seek to improve the topographic resolution. In conclusion, we have applied a recently proposed workflow (Conradi and others 2020) for projecting how climate forcing may impact on the regional vegetation physiognomy of southern Africa. This new workflow largely confirms a more than 20-year-old study (Rutherford and others 1999) that suggested that the central interior of the region would be subject to significant change and that the southern and eastern parts of the region would be less impacted. Our usage of plant type suitability surfaces further allowed us to forecast that trees, shrubs, succulents and C4 grasses will expand onto the interior plateau of South Africa and that C4 grasses would expand along the southern coast (compare February and others 2021). This plant type analysis also revealed that Lesotho is a potential climate refuge for all plant types, but that large parts of Namibia, Botswana, Zimbabwe and Mozambique would in the future be less suitable for all plant types.

Future work should consider alternative plant type classification schemes. For instance, shade tolerance, fire resilience, deciduousness are highly relevant aspects of plant life histories that we did not consider here due to data limitations. Future studies should also focus on attribution, that is identifying the environmental forcing factors most responsible for historical (for example, Higgins and others 2023) and future trajectories of change.