Phylogenetic diversity of Amazonian tree communities

Eurídice N. Honorio Coronado*, Kyle G. Dexter, R. Toby Pennington, Jérôme Chave, Simon L. Lewis, Miguel N. Alexiades, Esteban Alvarez, Atila Alves de Oliveira, Iêda L. Amaral, Alejandro Araujo-Murakami, Eric J. M. M. Arets, Gerardo A. Aymard, Christopher Baraloto, Damien Bonal, Roel Brienen, Carlos Cerón, Fernando Cornejo Valverde, Anthony Di Fiore, William Farfan-Rios, Ted R. Feldpausch, Niro Higuchi, Isau Huamantupa-Chuquimaco, Susan G. Laurance, William F. Laurance, Gabriela López-Gonzalez, Beatriz S. Marimon, Ben Hur Marimon-Junior, Abel Monteagudo Mendoza, David Neill, Walter Palacios Cuenca, Maria Cristina Peñuela Mora, Nigel C. A. Pitman, Adriana Prieto, Carlos A. Quesada, Hirma Ramirez Angulo, Agustín Rudas, Ademir R. Ruschel, Norma Salinas Revilla, Rafael P. Salomão, Ana Segalin de Andrade, Miles R. Silman, Wilson Spironello, Hans ter Steege, John Terborgh, Marisol Toledo, Luis Valenzuela Gamarra, Ima C. G. Vieira, Emilio Vilanova Torre, Vincent Vos and Oliver L. Phillips


INTRODUCTION
A central task of biology is to quantify biodiversity and how it varies geographically (Myers et al., 2000). Elucidating and understanding the patterns of diversity is particularly important within the tropics, because of their high species richness and the pressing need to develop and apply effective conservation strategies in the face of massive habitat alteration. While the species diversity of specific areas can be measured using different indices (e.g. species richness, Fisher's alpha), these ecological metrics may fail to account for the evolutionary, or lineage, diversity of communities. As a result, some authors have advocated developing and implementing metrics, such as phylogenetic diversity, which quantify the lineage diversity of communities (Vane-Wright et al., 1991; Faith, 1992).
Phylogenetic diversity (PD) is generally estimated as the total branch length of a phylogeny representing the species in a community (PDss; Faith, 1992). Alternative metrics to represent the evolutionary diversity in communities are available, such as the mean phylogenetic distance between all species and the mean phylogenetic distance between each species and its closest relative (MPD and MNTD respectively; Webb et al., 2002; Helmus et al., 2007; Cadotte et al., 2010). All these metrics are often correlated with species richness (SR; the total number of species in a community), and thus SR can sometimes be used as a proxy for PD (Polasky et al., 2001; Rodrigues & Gaston, 2002). However, some areas contain significantly greater or less PD than expected given their SR (Sechrest et al., 2002; Forest et al., 2007), and null model approaches have been developed to estimate PD while controlling for variation in SR. These standardized metrics may add complementary information about the evolutionary history and conservation significance of sites (Winter et al., 2013). The availability of these recently developed PD metrics, in conjunction with the advent of standardized floristic sampling across Amazonia (Malhi et al., 2002; Phillips & Miller, 2002) and a robust angiosperm phylogeny (Bremer et al., 2009), now makes it possible to examine how PD varies at large spatial scales across the world's most species-rich forest (Gentry, 1988; ter Steege et al., 2013; see also Chave et al., 2007).
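The three raw metrics defined above can be illustrated on a very small tree. The following is a minimal sketch in Python, not the software used in this study: the four-species tree, its branch lengths and the function names are all hypothetical, and PDss is assumed here to include the branches leading back to the root.

```python
from itertools import combinations

# Hypothetical toy ultrametric tree: child -> (parent, branch length in Ma).
# Every tip is 50 Ma from the root.
TREE = {
    "A": ("n1", 10.0),
    "B": ("n1", 10.0),
    "n1": ("root", 40.0),
    "C": ("n2", 30.0),
    "D": ("n2", 30.0),
    "n2": ("root", 20.0),
}

def path_to_root(tip):
    """Return the branches (named by their child node) from a tip to the root."""
    branches = []
    node = tip
    while node in TREE:
        branches.append(node)
        node = TREE[node][0]
    return branches

def pd_ss(species):
    """PDss: total length of all branches connecting the species to the root."""
    used = set()
    for sp in species:
        used.update(path_to_root(sp))
    return sum(TREE[b][1] for b in used)

def distance(a, b):
    """Phylogenetic distance: branches found on only one of the two root paths."""
    only_one = set(path_to_root(a)) ^ set(path_to_root(b))
    return sum(TREE[x][1] for x in only_one)

def mpd(species):
    """MPD: mean phylogenetic distance over all pairs of species."""
    pairs = list(combinations(species, 2))
    return sum(distance(a, b) for a, b in pairs) / len(pairs)

def mntd(species):
    """MNTD: mean distance from each species to its closest relative."""
    nearest = [min(distance(a, b) for b in species if b != a) for a in species]
    return sum(nearest) / len(nearest)

community = ["A", "B", "C"]
print(pd_ss(community), mpd(community), mntd(community))
```

In this toy community the two close relatives A and B pull MNTD well below MPD, which is the sense in which MNTD reflects structure near the tips of the tree.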
Previous research has shown tree species diversity in 1 ha plots across the Amazon to be highest in its western and central regions and lowest in the east, on the Guianan and Brazilian shields (ter Steege et al., 2003). Because PD is correlated with SR, we would expect that PD is greatest in the western and central Amazon, but this has yet to be thoroughly tested (although see Chave et al., 2007). In addition, numerous factors may drive spatial variation in PD and whether communities show greater or less PD than expected given their SR. For example, based on variation in substrate age, one might hypothesize that tree communities on the Guiana and Brazilian Shields, which overlie substrates of ancient Pre-Cambrian origin (Quesada et al., 2011), might have higher PD than expected given their relatively low SR. This high PD would reflect accumulated lineage diversity over tens of millions of years, with many deep phylogenetic branches separating species from these older diversification events (Swenson, 2009). In contrast, tree communities of western Amazonia overlying Pliocene and Pleistocene sediments from the Andes (Hoorn et al., 2010; Quesada et al., 2011) might be expected to show lower PD than expected given their high SR because of the dominance of recent evolutionary radiations of certain clades within which phylogenetic branches are short (Richardson et al., 2001; Erkens et al., 2007).
Soil fertility and precipitation seasonality also vary across Amazonia. Overall, the relatively young soils of western Amazonia are fertile in comparison with the highly weathered soils of central and eastern Amazonia and the Guianan and Brazilian Shields, whereas the poorest soils are found beneath white-sand forests that occur sporadically in small to large patches throughout the northern part of the basin (Quesada et al., 2011). In addition, the dry season varies from being essentially absent in the north-west to lasting 5-6 months in the south-east and some northern areas (Sombroek, 2001), where moist forests give way to savannas and seasonally dry tropical forest (SDTF). Some of these environmental conditions may represent ecophysiological barriers that few lineages have been able to overcome (Anacker & Harrison, 2012; Miller et al., 2013). Thus, an additional hypothesis to the one above, based on substrate age, is that tree communities in areas of the Amazon with greater ecophysiological barriers to growth (i.e. potentially more stressful environments) will show the lowest phylogenetic diversity (Qian et al., 2013).
We used a network of 283 forest inventory plots (RAINFOR; Malhi et al., 2002) to quantify the PD of tree communities and examine its spatial and environmental variation across Amazonia. We rarefied all plots to the same number of individuals, and then calculated (1) the total phylogenetic branch length of all species occurring in each plot, PD sensu stricto (PDss; Faith, 1992), (2) the mean pairwise phylogenetic distance between species (MPD; Webb, 2000; Webb et al., 2002), and (3) the mean nearest taxon distance (MNTD; Webb, 2000). We also calculated standardized versions of these metrics that account for variation in SR. We then tested the hypothesis, based on substrate age, that tree communities in the Guiana and Brazilian Shields will show the greatest PD, whereas those in the western Amazon will show lower PD. And while our sample size outside of typical terra firme and floodplain moist forest is limited, we conducted a preliminary test of the hypothesis that tree communities in potentially more stressful environments, namely white sands, savannas, and SDTFs, will show the lowest PD. By examining the phylogenetic diversity of tree communities throughout Amazonia, we aim to provide insights into its biogeographical history and to inform the setting of conservation priorities.

Tree community plot data
In this study, we used a total of 283 inventory plots of the RAINFOR forest plot network curated at ForestPlots.net (see Table S1 in Supporting Information). Plots are generally one hectare in size (mean ± SD = 1.1 ± 0.6 ha), with all trees ≥ 10 cm diameter at breast height (DBH) sampled. We restricted analyses to old-growth forest plots and excluded plots with limited species identifications. Each plot was treated as a community and classified into three main biomes (Fig. 1): tropical moist forest, TMF (n = 265 plots), seasonally dry tropical forest, SDTF (n = 13), and savanna, S (n = 5). Fourteen plots were from the northern Andes (Colombia and Venezuela), outside the Amazon basin, but were included because of their close phytogeographical connection to Amazonia. SDTF plots are located from Bolivia to Venezuela, whereas savanna plots are only from Brazil and are separated by a maximum of 250 km.
The 265 tropical moist forest plots were further classified by the maximum age of the underlying geological formation. The Guiana and Brazilian Shields represent the oldest geological formations in Amazonia (TMF.o: > 500 Ma), followed by formations of central and eastern Amazonia (TMF.i: 20-100 Ma) located between the Shields, whereas areas near to the Andes (western Amazonia and northern Andes) are dominated by younger sediments (TMF.y: < 20 Ma; Quesada et al., 2011) deposited mainly during the Pliocene and the Pleistocene (Hoorn et al., 2010) (Fig. 1). All TMF plots were also classified by forest type: montane forest, flooded forest, terra firme forest, and white-sand forest. Terra firme and flooded forests were sampled for each substrate age category, whereas montane forests were only sampled in western Amazonia on young substrates and white-sand forests were not sampled on substrates of intermediate age (see Table S1).
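The substrate age classes above amount to a simple threshold rule. A minimal sketch in Python follows; the function name is hypothetical, and returning None for ages between 100 and 500 Ma is an assumption, since the text defines no class for that range.

```python
def classify_substrate(max_age_ma):
    """Assign a tropical moist forest plot to a substrate age class.

    max_age_ma is the maximum age of the underlying geological formation
    in millions of years (Ma). Thresholds follow the classes in the text.
    """
    if max_age_ma < 20:
        return "TMF.y"  # young sediments, mostly near the Andes
    if max_age_ma <= 100:
        return "TMF.i"  # intermediate, central and eastern Amazonia
    if max_age_ma > 500:
        return "TMF.o"  # old, Guiana and Brazilian Shields
    return None  # assumption: 100-500 Ma is not covered by any class

print(classify_substrate(5), classify_substrate(60), classify_substrate(1700))
```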
In total, the initial dataset included 183,908 individual trees sampled in Bolivia, Brazil, Colombia, Ecuador, French Guiana, Guyana, Peru, Surinam and Venezuela. To ensure a standardized nomenclature across plots based on the APG-III classification (Bremer et al., 2009), the Taxonomic Name Resolution Service version 3.0 was used (http://tnrs.iplantcollaborative.org; accessed on 01/03/2013). Tree ferns and gymnosperms only occur in significant numbers in montane plots, and they are exceedingly rare in lowland forest, which is the focus of this study. These very rare species represent 0.018% of all individual trees in our lowland plots and are essentially stochastically sampled in any given 1 ha plot (tree ferns and gymnosperms were found in a total of nine and two lowland plots respectively). Given this stochasticity and the strong effect of tree ferns and gymnosperms on phylogenetic diversity metrics (they are subtended by very long phylogenetic branches; Faith et al., 2004; Kembel & Hubbell, 2006; Chave et al., 2007), we excluded them from phylogenetic diversity calculations. We also excluded all individuals not identified to a named species (13.6% of individuals). To determine if unidentified individuals could be biasing results, we assessed the correlation between the PD metrics and the proportion of unidentified individuals in each plot. The final dataset contained a total of 157,340 individuals, belonging to 3868 species, 732 genera and 126 families of angiosperms.

Phylogenetic trees
A phylogenetic tree of the whole species pool (see Fig. S1) was generated using Phylomatic in PHYLOCOM version 4.2 (Webb et al., 2008). This tool provides a phylogenetic hypothesis for the relationships among taxa by matching the list of species with up-to-date family and genus names, and tip labels of a provided megatree (Webb & Donoghue, 2005). In this case, the topology of R20120829.new provided at http://phylodiversity.net/phylomatic/ was used. An ultrametric phylogeny including branch length in millions of years (Ma) was obtained using bladj in PHYLOCOM. This command fixes the root node (angiosperms, 179 Ma) and other nodes to specified ages based on Wikström et al. (2001). Inconsistencies in syntax between internal node labels of the phylogeny and the ages file were modified manually to ensure a better performance of the node calibration using bladj (Gastauer & Meira-Neto, 2013). To determine if PD metrics are affected by phylogenetic resolution, we compared our results generated using the PHYLOCOM phylogeny with those using a phylogeny of Amazonian tree genera generated from DNA sequences of rbcL and matK plastid genes (K. G. Dexter & J. Chave, unpublished data). Full details of the temporally calibrated, ultrametric phylogeny construction can be found in the Supporting Information.

Phylogenetic diversity metrics
We used the PHYLOCOM phylogeny, which includes all genera in our dataset, to calculate six metrics that evaluate the evolutionary history present in communities: (1) the total phylogenetic branch length of all species occurring in a given community, i.e. phylogenetic diversity sensu stricto (PDss; Faith, 1992); (2) mean pairwise phylogenetic distance between species in terms of branch length (MPD; Webb, 2000; Webb et al., 2002); (3) mean nearest taxon distance (MNTD; Webb, 2000; Webb et al., 2002) and (4, 5 & 6) their equivalents, standardized for species richness (ses.PDss, ses.MPD, and ses.MNTD). For each community, these standardizations were accomplished by randomly drawing the same number of species from the phylogeny as present in the community, repeating this 1000 times, calculating PDss, MPD and MNTD for each randomization, taking the difference between the observed value of PDss, MPD, and MNTD and the mean of the random values, and dividing these differences by the standard deviation across the randomizations. These derived metrics therefore represent standardized effect sizes (ses) and are designated as such. ses.MPD and ses.MNTD are equivalent to the inverse (that is, −1 times the value) of the NRI and NTI indices of Webb (2000). We consider the total phylogenetic branch length (PDss) in communities (Faith, 1992; Forest et al., 2007) and its deviation from expectation given species richness (ses.PDss) to be the most straightforward measures of evolutionary diversity in communities with respect to conservation prioritization. Lastly, we included the MPD, MNTD, ses.MPD, and ses.MNTD metrics of PD because of their history of use in the literature (e.g. Forest et al., 2007; Gonzalez et al., 2010; Fine & Kembel, 2011); MPD measures phylogenetic structure at deep nodes and MNTD at shallow nodes (Webb, 2000).
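The standardization described above can be sketched as a small null-model routine. This is an illustrative Python sketch, not the picante implementation used in the study: the five-species tree, the function names and the fixed random seed are hypothetical, and PDss is assumed to include the branches leading to the root.

```python
import random
import statistics

# Hypothetical toy ultrametric tree: child -> (parent, branch length in Ma).
TREE = {
    "A": ("n1", 10.0), "B": ("n1", 10.0), "n1": ("n3", 20.0),
    "C": ("n3", 30.0), "n3": ("root", 20.0),
    "D": ("n2", 5.0), "E": ("n2", 5.0), "n2": ("root", 45.0),
}
SPECIES_POOL = ["A", "B", "C", "D", "E"]

def pd_ss(species):
    """Total branch length connecting the species to the root."""
    used = set()
    for node in species:
        while node in TREE:
            used.add(node)
            node = TREE[node][0]
    return sum(TREE[b][1] for b in used)

def standardized_effect_size(observed, pool, metric, n_random=1000, seed=1):
    """(observed - mean of null values) / standard deviation of null values.

    Each null community holds the same number of species as the observed
    one, drawn at random without replacement from the species pool.
    """
    rng = random.Random(seed)
    k = len(observed)
    null_values = [metric(rng.sample(pool, k)) for _ in range(n_random)]
    null_mean = statistics.mean(null_values)
    null_sd = statistics.stdev(null_values)
    return (metric(observed) - null_mean) / null_sd

# Two close relatives versus two distant relatives, equal species richness.
print(standardized_effect_size(["A", "B"], SPECIES_POOL, pd_ss))
print(standardized_effect_size(["A", "D"], SPECIES_POOL, pd_ss))
```

Both toy communities contain two species, so the sign of the result depends only on how much branch length they hold relative to random draws of two species from the pool. Passing the metric as an argument lets the same routine standardize MPD or MNTD.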

Data assessment and analysis
To minimize the effects of variation in sampling effort (i.e. plot size) and tree density, we used a rarefaction procedure that standardized all plots to 249 individuals, which was the lowest observed number of individual trees (≥ 10 cm DBH) among all plots. Values for PDss, MPD, MNTD, ses.PDss, ses.MPD, ses.MNTD and SR (the total number of species) for each rarefied community were calculated using the package picante in the R statistical software version 2.15.1. PD metrics can also be sensitive to the most basal clades in a phylogeny (Swenson, 2009), so we classified taxa into one of the three major angiosperm clades (Magnoliids including Chloranthales, Monocots, and Eudicots), and the percentage of species in each clade was calculated. The mean across 100 rarefactions of the PD metrics, SR, and the proportion of major clades were used in subsequent analyses. The values of PDss, MPD, MNTD, ses.PDss, ses.MPD and ses.MNTD were compared among communities growing on substrates of different geologic ages and forest types using F-tests and Tukey tests. We additionally compared all communities in potentially more stressful environments (white-sand forests, savannas and SDTF) vs. all in potentially less stressful environments (terra firme and montane forests) using a t-test. Flooded forests were excluded from the analysis of stressful habitats because intensity and length of flooding is known to vary among plots, but we lack precise information on this. We also assessed the correlation of PD metrics with SR, the proportions of species in major clades, and the latitude and longitude of plots.
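The rarefaction step can be sketched as repeated subsampling of individuals without replacement. The Python sketch below is illustrative only and is not the R code used in the study: the function name, the abundance table and the random seed are hypothetical, and only species richness is averaged, although the same loop could average any of the PD metrics.

```python
import random

def rarefied_richness(abundances, n_individuals=249, n_reps=100, seed=0):
    """Mean species richness across repeated rarefactions of one plot.

    abundances maps a species name to its number of individuals in the plot.
    Each repetition draws n_individuals trees without replacement.
    """
    individuals = [sp for sp, count in abundances.items() for _ in range(count)]
    if len(individuals) < n_individuals:
        raise ValueError("plot has fewer individuals than the rarefaction depth")
    rng = random.Random(seed)
    richness = [
        len(set(rng.sample(individuals, n_individuals))) for _ in range(n_reps)
    ]
    return sum(richness) / n_reps

# Hypothetical plot: two common species and one singleton (301 trees).
plot = {"sp1": 200, "sp2": 100, "sp3": 1}
print(rarefied_richness(plot))
```

Because the singleton is missed in some draws, the mean rarefied richness of this toy plot falls between 2 and 3 rather than at the full count of 3 species.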
We assessed if there was any bias to the phylogenetic diversity metrics with respect to unidentified individuals by examining the correlation between percentage of unidentified individuals in plots and the various PD metrics. We also re-analysed a subset of the data (n = 117 plots each with >500 trees), rarefying the plots to 500 individuals per sampling unit, in order to test the effect of sample size in the rarefaction procedure on estimating phylogenetic diversity. Finally, we re-analysed a subset of the data (n = 257 plots), including plots that have more than 80% of species and individuals sampled in the sequence-based genus-level phylogeny, in order to test the effect of phylogenetic resolution on estimating phylogenetic diversity. The random resolution of species-level relationships within genera in the genus-level phylogeny was repeated for each set of rarefied communities.

Species richness and major angiosperm clades
Terra firme moist forests of intermediate and young geological formations have the highest species richness (SR), with an average of 88 and 72 species respectively (for 249 rarefied individuals; Table 1). Flooded moist forest communities in western and central Amazonia had greater SR than flooded and terra firme forests on the Guiana and Brazilian Shields, whereas the lowest SR was found in white-sand forests of the Guiana Shield and Andean montane forests (Table 1). SDTF shows the lowest percentage of Magnoliid and Monocot species, and the greatest of Eudicots, but the abundance of these clades in savannas is more similar to the values typical of the moist forest plots.

DISCUSSION
Our study has revealed a highly non-random spatial and environmental distribution of phylogenetic diversity (PD) across tree communities of Amazonia, by whichever metric it is evaluated, with some areas and environments holding significantly more, or less, phylogenetic diversity than others (Fig. 3). Phylogenetic diversity sensu stricto (PDss) and the mean nearest taxon distance (MNTD) in the Amazon correlate strongly with species richness (SR; Fig. 2a), and communities on young and intermediate-aged substrates (western and central Amazonia respectively) have the highest PDss and the lowest MNTD values. Once variation in SR is controlled for, we found that the youngest and oldest substrates (the latter on the Brazilian and Guiana Shields) have the highest ses.PDss and ses.MNTD. The lowest values of ses.PDss and ses.MNTD were found in potentially more stressful environments, in particular white-sand forest and SDTF.
We also found that the mean pairwise phylogenetic distance between species (MPD) and its standardized equivalent, ses.MPD, depend primarily on how evenly taxa are distributed among the three major angiosperm clades (Magnoliids, Monocots and Eudicots), which is shown by the strong positive correlation between their values and the proportion of taxa in plots that are Magnoliids and Monocots (the two rarer clades; Fig. 2b). Thus, communities in western Amazonia, which have many Magnoliids and Monocots present, have the greatest MPD and ses.MPD values. While it is important to have a measure of how evenly distributed taxa are across the major clades of a phylogeny, MPD and ses.MPD do not seem to reflect lineage diversity per se. Moreover, ses.PDss and ses.MNTD were strongly positively correlated, giving similar patterns across geological substrates and environments. We therefore focus below primarily on patterns with respect to PDss and ses.PDss.
Has the greatest phylogenetic diversity been accumulated in communities overlaying old geological formations?
Communities on old geological substrates in the Brazilian and Guianan Shields showed lower PDss than communities on young or intermediately aged geological substrates (Fig. 3e), which is unsurprising given their lower species richness. The communities on old geological substrates did show a higher median ses.PDss (Fig. 3f), but the distribution of ses.PDss values overlapped broadly with those for communities on the youngest substrate. The same pattern was found for ses.MNTD. Thus, our prediction that PD would be positively correlated with substrate age was falsified. However, we suggest that different processes may explain the high ses.PDss values observed in different communities across Amazonia. The high ses.PDss and ses.MNTD found in the Guiana and Brazilian Shields may very well be explained by their long-term geological history and the accumulation of lineages over many millions of years.
To understand the rejection of the hypothesis that geologically older substrates show the greatest PD, we need to consider why tree communities of western Amazonia show such high ses.PDss and ses.MNTD. That communities of western Amazonia show high PDss is unsurprising, as PDss is strongly correlated with SR, and SR is substantially higher in the western Amazon (ter Steege et al., 2003). However, much of this species diversity is due to recently radiated species-rich genera (Gentry, 1982) such as Inga (Richardson et al., 2001) and Guatteria (Erkens et al., 2007), and short phylogenetic branches such as those within these genera do not greatly increase PD (Swenson, 2009). Moreover, low MNTD would be explained by the presence of short phylogenetic branches separating the nearest taxa in these diverse communities. However, another exceptional aspect of western Amazonian tree communities is that they are occupied by lineages from the entirety of the angiosperm phylogeny, which leads these communities to have high ses.PDss, and apparently also high ses.MNTD. One explanation might be related to the potentially high phylogenetic diversity found in the adjacent Andes, which provides a proximate resource to 'invade' western Amazonia (see also Chave et al., 2007). Another explanation might be related to the particular environmental and ecological conditions (relatively fertile and aseasonal environments) in the west, which may be easier to invade by multiple lineages with diverse evolutionary backgrounds. Moreover, the ability of diverse lineages to establish in the western and southern Amazon may also be related to the high rates of disturbance and turnover in the region (Quesada et al., 2012; Marimon et al., 2013; Baker et al., 2014). Thus, in the same way that more fertile, dynamic, and disturbed tropical forests have more open nutrient-cycles on ecological time-scales (Vitousek & Sanford, 1986), they also appear to be more open to repeated establishment of plant lineages on evolutionary time-scales.
Do environments with more potential ecophysiological barriers to growth show the lowest PD in their tree communities?
We expected that environments with potentially more stressful ecological conditions, namely marked seasonality of precipitation and/or low soil fertility, would have the lowest phylogenetic diversity, because these may represent ecophysiological barriers that are difficult for many lineages to surmount evolutionarily (Anacker & Harrison, 2012; Miller et al., 2013; Qian et al., 2013). Both savannas and SDTF have a pronounced dry season, but they show contrasting patterns of PD. While PD metrics of savannas were similar to those of nearby communities in tropical moist forest, SDTF generally has low PD (Fig. 3e-h). Savannas and tropical moist forest communities may share similar lineages across the angiosperm phylogeny, a pattern which supports previous studies that suggested that Brazilian savannas are formed by the numerous independent colonizations of lineages from nearby biomes around 4-10 Ma (Simon et al., 2009; Simon & Pennington, 2012). Conversely, the low PD values shown for SDTF communities suggest that fewer clades have succeeded in colonizing SDTF, and that consequently, SDTF is occupied by closer relatives. However, our conclusions must be taken as preliminary given the low sample size and limited geographical extent of our savanna and SDTF plots.
Figure 3 Variation in phylogenetic diversity, as evaluated by several metrics, across Amazonia. The results for phylogenetic diversity sensu stricto (PDss), its equivalent standardized for variation in species richness (ses.PDss), and the standardized measures of mean pairwise phylogenetic distance between species (ses.MPD) and mean nearest taxon distance (ses.MNTD) are shown in different rows. (a-d) The maps show the spatial distribution of values for each metric, with the size of circles corresponding to their values. If there were multiple plots in a given one-degree grid, the mean value is shown. (e-h) The tropical moist forest biome is classified based on maximum age of geological formations (TMF.y: < 20 Ma; TMF.i: 20-100 Ma; TMF.o: > 500 Ma), whereas savanna and seasonally dry tropical forest are indicated as S and SDTF respectively. Letters in boxplots indicate significant difference among mean values (Tukey's HSD; P < 0.05).
Previous studies have indicated strong habitat specialization in white-sand communities as indicated by the high number of individuals that represent white-sand specialist species (Fine et al., 2010), and by the distinct ecophysiology and defences against herbivores that these species have evolved in order to live on such poor soils (Fine et al., 2004). Therefore, we also expected that white-sand forests would have a high frequency of closely related species and low phylogenetic diversity. But while our results showed that both white-sand communities of the Guiana Shield and the western Amazon have low PDss, only those communities in the Guiana Shield have low ses.PDss values compared to neighbouring terra firme or flooded forest. We found higher values of ses.PDss in the small patches of white-sand forests of western Amazonia than in the Guiana Shield, suggesting a greater influence of the regional pool (i.e. species present in the surrounding phylogenetically diverse terra firme forest entering white-sand patches) than in the larger, more contiguous white-sand patches of the Guiana Shield.

Conservation priorities
Conservation planning based upon species richness (SR) gives the same value to communities with equal SR regardless of the total phylogenetic diversity of the species that they contain (e.g. Forest et al., 2007). But if we are to preserve the full spectrum of lineage diversity and the evolutionary processes that led to the exceptional biodiversity of Amazonian communities, regional conservation planning must incorporate phylogenetic information.
In this study, we showed that while PDss is strongly correlated with SR (see also Forest et al., 2007; Cadotte et al., 2012), communities can vary greatly in their deviation from expected PD given SR, as measured by ses.PDss. While communities in the central and western Amazon have the greatest tree species richness in the basin (ter Steege et al., 2003), the central Amazon shows much lower phylogenetic diversity than expected given its species richness (ses.PDss) compared to the western Amazon (Fig. 3e), thus suggesting that the western Amazon basin may hold a higher value for conservation of lineage diversity.
In addition, we found that the mean pairwise phylogenetic distance between species (MPD) is not strongly correlated with species richness, which could suggest that it is a better metric of phylogenetic diversity than PDss. However, we found that MPD and its standardized equivalent (ses.MPD) are strongly dependent on how evenly divided the species in a tree community are among the three major angiosperm clades (Magnoliids, Monocots and Eudicots; Fig. 2b). While this division is certainly interesting from an ecological and evolutionary perspective, we suggest that MPD and ses.MPD may not be the most useful metrics of phylogenetic diversity for conservation prioritization. Meanwhile, MNTD shows a strong inverse relationship with SR, and ses.MNTD essentially conveys the same information as ses.PDss (i.e. they are strongly positively correlated). Thus, we suggest that PDss and ses.PDss may provide the most straightforward, interpretable means to evaluate lineage diversity in communities. While PDss is strongly correlated with SR and could perhaps be inferred from it, a phylogeny is clearly necessary to calculate ses.PDss and determine whether communities show more or less lineage diversity than expected given their species richness. An urgent priority for conservation should be to develop bigger community phylogenies that include all lineages, greater numbers of species within lineages, and greater phylogenetic resolution. Such phylogenies would allow evolutionary information to be properly incorporated into conservation decisions.

Table S1
Floristic tree inventories compiled from RAINFOR forest plot network.

Table S2
Fossil-based calibrations used in sequence-based genus-level phylogeny.

Figure S1
Phylogenetic tree for the whole species pool for 283 floristic inventories compiled from RAINFOR dataset.