Edaphic specialization onto bare, rocky outcrops as a factor in the evolution of desert angiosperms

Significance The environmentally stressful conditions found in desert regions have often been implicated as the main factor in the evolution of drought tolerance in desert plants. Yet, many iconic desert plant lineages evolved prior to the recent emergence of widespread arid climates, suggesting an important role for preadaptation (exaptation). We provide empirical support for this view by showing that life-history evolution associated with ecological specialization onto rock outcrops was a precursor to establishment and extensive diversification in North American deserts in the desert rock daisies (Perityleae). We caution against assuming the presence of ancient dry biomes based on time-calibrated phylogenies and we emphasize the potentially limited responses of organisms to increasing aridity caused by global climate change.


Sampling, sequencing, phylogenetic inference, and divergence time estimation (DTE).
Our study included near-comprehensive coverage of all minimum-rank taxa of tribe Perityleae except Pericome macrocephala B.L. Rob., Perityle aurea Rose, Laphamia grandifolia (Brandegee) I.H. Lichter-Marck, L. warnockii (A.M. Powell) I.H. Lichter-Marck, and the four taxa of genus Villanova Lag. in subtribe Galeaninae for which material was unattainable. Three additional taxa, Laphamia ciliata Dewey, L. quinqueflora Steyermark, and L. vitreomontana (Warnock) I.H. Lichter-Marck were excluded from this study due to low sequence yield. As outgroup taxa, we included publicly available sequence data from GenBank sequence read archive (SRA) for Helianthus annuus L. of tribe Heliantheae and Conoclinium coelestinum DC. and Eutrochium fistulosum (Barratt) E.E. Lamont of tribe Eupatorieae. We extracted DNA from silica dried or herbarium leaf tissue using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA) or a modified version of Doyle and Doyle's (1) CTAB protocol identical to the one used by Lichter-Marck et al. (2). We prepped libraries using the HyperPrep protocol (Kapa Biosystems Inc., Wilmington, MA) in quarter reactions. Libraries were pooled 8 per reaction and enriched for 1060 low-copy nuclear genes using the COS kit (3)(4)(5) according to the directions provided by myBaits (Arbor Biosciences, Ann Arbor, MI). Enriched libraries were sent to Genewiz (South Plainfield, NJ) for massively parallel paired-end 150 bp sequencing on a HiSeq 4000, yielding >91,000 MB of raw sequencing data. Raw sequencing yields were deposited in the GenBank SRA under bioproject PRJNA918694. See Table S1 for voucher and sample SRA information.
Maximum likelihood (ML) and Bayesian inference (BI) trees were generated in RAxML-NG (6) and ExaBayes (7) using a partitioned data matrix, with each locus independently assigned a GTR+Gamma+I model of molecular evolution. The best ML tree was selected after 1000 independent searches and 1000 bootstrap replicates. The BI analysis used two independent chains to sample the posterior distribution of trees over 100,000 generations before computing a maximum clade credibility (MCC) tree based on a 50% majority rule. Gene trees used in the ASTRAL analysis were inferred separately using RAxML-NG with a GTR+Gamma+I substitution model and nodes with less than 20% bootstrap support collapsed (8). Our MCMCtree analysis used a HKY model of molecular substitution with four gamma distributed rate categories and a relaxed molecular clock conditioned on a log normally distributed rate prior (9)(10). Posterior estimates of molecular substitution rates and divergence times were estimated during 200,000 generations of MCMC, and results inspected for stationarity in the program Tracer (11). To overcome computational limitations with genome scale data we subset 15 loci for use in our MCMCtree analysis using principal component axis 2 in the results of an analysis using gensortR (see ref. 12 for details).

Evolution of life history and edaphic endemism
All analyses were implemented in RevBayes (12) using the post burn-in MCC chronogram output of our MCMCtree analysis with outgroup taxa removed. Maximum a posteriori (MAP) estimates of ancestral states for both life history and edaphic endemism were based on a post burn-in posterior of 13,500 trees. We used reversible jump MCMC (rjMCMC) to assess statistical support for non-zero transition rates by alternatively setting each transition rate to zero or inferring nonzero transition rates under an exponentially distributed prior. Based on past sensitivity analyses (2), non-zero transition rates were drawn from an exponential distribution with a mean of 1 transition over the tree. Non-zero or zero rate models were inferred with equal prior probability and posterior model estimates recorded using a custom binary logical monitor. We calculated Bayes Factors as two times the natural log of the number of iterations in which a non-zero model was inferred divided by the total number of iterations (the posterior odds of model 0) divided by the number of iterations where a reversible model was inferred divided by the total number of iterations (the posterior odds of model 1), while disregarding the prior odds since we used a uniform prior. Given proven sensitivity of tests to prior information from root states (14), we based root state frequencies on ancestral state reconstructions of habitat and life-history derived from a previous study of helenioid Heliantheae (15), which showed strong prior support for tropical and perennial states at the root of Perityleae. Transition rates under all combinations of irreversible and reversible models were inferred during 15,000 iterations of an MCMC and inspected for convergence. To improve mixing, in the presence of support for non-zero transition rates, the analysis was rerun under the non-zero transition rate prior and (MAP) ancestral states compiled from a post burn-in posterior of 13,500 trees.
To test for associations between life history and endemism on rocky outcrops we used BayesTraits (16) to compare the fit (log-likelihood) of two continuous time Markov models of life history and edaphic evolution, a dependent and independent model (17). Both models aimed to infer transitions to or from the suffrutescent perennial life-history and bare rock edaphic affinity at any given moment over infinitesimally small units of time, but the dependent model included an additional four transition-rate parameters (eight total) because it assumes that the rate of change in one trait is dependent on the state of the other. Double transitions in both traits at the same time were disallowed in both models. Given the size of the data and complexity of the models, we used reversible jump MCMC (rjMCMC) to integrate parameter estimates over the model space, weighted by their probability. Models were compared using log Bayes Factors with the equation: Log BF = 2(log likelihood dependent model -log likelihood independent model) and a log BF greater than 10 was interpreted as strong support for a correlation. For each model rjMCMC was conditioned on an exponential distribution with a mean of 100 and a hyperprior of zero. Marginal log-likelihoods were calculated using a steppingstone sampler (18) with 100 stones successively heated for 10,000 iterations.

Historical biogeography with paleo-biome informed models
Discrete biogeographical ranges for each taxon were obtained from the Southwestern Environmental Information Network (SEInet) and from an extensive review of herbarium specimens as part of the first author's systematic research on Perityleae. We scored each tip for presence in four geographical regions: South America, Baja California, the Trans-Mexican Volcanic Belt, and the Basin and Range Province (Fig. 2, S4). In South America, Perityleae has a disjunct distribution, occurring in the Atacama Desert in southern Peru and northern Chile as well as the Desventuradas Islands, ~200 km off the coast of central Chile. Baja California was defined as including the entire peninsula and nearby islands, including Guadalupe Island, the Channel Islands of southwest California, and the Revillagigedo Archipelago, located 100 km south of San Jose del Cabo. The Trans-Mexican Volcanic Belt (19) described the area covered by the predominantly volcanic mountain ranges and plateaus of northwest and central Mexico, including the Sierra Madre Occidental. The Basin and Range Province covers a wide geographic area of western North America that is characterized by its distinctive topography, with thousands of discontinuous mountain ranges punctuated by low, arid valleys (19). The Basin and Range Province here included the Madrean sky island archipelago and surrounding Sonoran Desert, the eastern continental Chihuahuan Desert and sky islands of Texas and New Mexico in the United States, and Chihuahua, Coahuila, and Tamaulipas in Mexico, as well as the numerous northsouth oriented mountain ranges of the Great Basin region in Nevada and Utah, which have been called "caterpillars marching toward Mexico." (20) Our characterization of biomes was based on the level I eco-regions of North America published by the Intergovernmental Commission for Environmental Cooperation. Desert includes the Great Basin, Sonoran, and Chihuahuan subregions, which cover broad swaths of the arid southwest United States, Baja California, and northern mainland Mexico. This biome includes both hot and cold deserts with unpredictable precipitation averaging ~250 mm or less per year and arid plant communities dominated by shrubs, stem succulents, and seasonal annuals scattered in arid, mostly open landscapes without a dense canopy cover (19). The temperate biome included high elevations of the Sierra Madre Occidental in Mexico and the Mogollon Rim in Arizona, the only areas in our study region where precipitation is predictably high, yet temperatures fall seasonally below freezing, containing plant communities dominated by coniferous or deciduous trees (19). The subtropical biome includes mid-elevation parts of the arid southwest U.S. and several states in northern and central Mexico where periodic but not extreme drought combined with infrequent freezing temperatures produces a vegetation mosaic of grassland and shrubland with patches of forest (19). The tropical biome mostly refers to tropical deciduous forests (selva tropical subcaducifolio sensu 19), habitats where freezing temperatures do not regularly occur and seasonality in rainfall is intense, and which support a dense canopy of mostly drought deciduous trees, robust multi-stemmed shrubs, and stem succulents. In our study area, tropical deciduous forests cover the western Pacific plain of the Sierra Madre Occidental as well as the southern Cape Region and Sierra la Giganta of Baja California (19,(21)(22). For a majority of Perityleae, we were able to assign each taxon to only one biome. The exceptions were wide ranging taxa, which were coded as polymorphic for multiple biomes with equal probability. One species, Perityle emoryi, has a distribution that overlaps with the California Floristic Province, but is mostly found in desert habitats, so we assigned it to North American desert (23). Nesothamnus incanus, which is endemic to the environmentally heterogeneous Guadalupe Island off the coast of Baja California, Mexico is found in low, arid habitats within the island and was therefore assigned to desert (21).
Prior to running a fully parameterized biome shift model, we first estimated rates of dispersal and extirpation among geographic areas, as well as ancestral biogeographic ranges separately from biomes using a Bayesian implementation of the Dispersal Extinction Cladogenesis (DEC) model (24) in RevBayes (13). We based our analyses on the MAP chronogram output of our MCMCtree analysis. Given evidence that transitions into the null range can result in abnormal extirpation rates and range sizes, we conditioned our analysis on survival by setting the probability of entering the null range to zero (25). Our implementation of the DEC modeled anagenetic dispersal and local extinction (extirpation) rates and was based on a loguniform prior and considered only allopatry and subset sympatry as potential cladogenetic events (see 26 for rationale). Dispersal and extirpation rates, cladogenetic events, and ancestral biogeographic ranges were inferred during 30,000 iterations of an MCMC. Estimates of ancestral ranges at nodes and shoulders were compiled from 28,000 samples of post burn-in posterior trees and the results plotted using the R package RevGadgets (27, Fig. S4).
To explicitly test the hypothesis that shifts into the desert biome tracked the availability of desert habitats through time, we performed a sensitivity analysis using rjMCMC to model two alternative scenarios: a) total dependence of biome-shifts on paleo-biome structure through time and b) total independence of shifts on paleo-biome structure through time. To model independence we used the set.value() function in RevBayes to fix parameter w_b at 0 and parameters w_u and w_g at 0.5, and to model dependence w_b was assigned a value of 1 and w_u and w_g a value of 0. We then estimated transition rates among biomes and generated model averaged ancestral states across the four geological time intervals using 25,000 generations of rjMCMC, with either model weighted with equal prior probability. We checked for convergence in the program Tracer (11) and two times the Log Bayes Factor was calculated as the posterior probability of the independent model (the number of MCMC samples that visited the independent model divided by the total number of samples) divided by the posterior probability of the dependent model (calculated likewise). Equal prior probability was placed on the two models, therefore the prior ratio was not included in the computation of the BF. Following Kass and Raftery (51), 2logBFs were interpreted as: zero to two, equivocal for a non-zero transition rate; two to six, strong support for non-zero transition rate; greater than six, decisive support for a non-zero transition rate.

Taxonomic implications of a more densely sampled phylogeny of Perityleae
Resolution of a diverse early diverging lineage that tracked the separation of the Baja California peninsula reinforces the recently improved understanding of Perityleae that required revision of taxonomic concepts (28). Until recently, most of the species diversity in Perityleae was recognized within the "core" genus Perityle, but non-monophyly of Perityle and support for the type species, Perityle californica, as nested within an early diverging Baja California clade was the basis for several changes, including: a narrowed circumscription of Perityle, recognition of the long-branch Nesothamnus, and reinstatement of Galinsogeopsis and Laphamia with new, broader delimitations. Expanded sampling of the Baja California clade using target capture in the current study also supports the inclusion within Perityle of P. socorrensis Rose, P. emoryi Torr., Perityle tenuifolius (Phil.) Lichter-Marck, and the three taxa of genus Amauria Benth. (22,28). Perityle remains non-monophyletic as currently circumscribed, with Perityle rosei Greenm. and P. trichodonta S.F. Blake resolved as more closely related to Pericome and Eutetras than to other members of Perityle. The monotypic, Guadalupe Island endemic Nesothamnus incanus (A. Gray) Rydb. occupies a long branch that is the sister lineage to Galinsogeopsis.

Non-monophyly of densely sampled taxa
Inclusion of multiple conspecific samples in the present study showed support for monophyly of recognized species in most cases, but there were several species for which samples did not group together. Taxa  In part, these results reflect our choice to sequence samples representative of morphologically unique and geographically disjunct populations collected during the first author's field work and examination of herbarium collections. Perityleae tend to grow in unreachable parts of the most inaccessible cliffs and canyons in the North American deserts, where their isolated populations are arrayed across 2000 miles of dry and mostly roadless terrain punctuated by imposing rocky mountains (28). Comprehensive floristic studies of sky islands and parts of the Sierra Madre Occidental have led to the description of many new-to-science species of Perityleae since the tribe was last treated by Powell (e.g., 29), and our results suggest that the currently recognized diversity of Perityleae continues to be underestimated. Our fragmentary knowledge of the diversity in this group is a conservation concern because Perityleae habitats are imminently threatened by energy development, climate change, and mining. With >30% of the species in this group listed as vulnerable, imperiled, or critically endangered on NatureServe, Perityleae stands out as having one of the highest proportions of at-risk taxa of any taxon in the arid western North American flora.

Widespread occurrence of edaphic specialization on bare habitats among close relatives to predominantly desert plant and animal lineages
Familiar desert plant lineages are dominant elements in the flora of rocky outcrops in the tropical deciduous forests that are adjacent to, and considered a major source area for expansions into, the North American deserts (30) Sm.). Decomposed, but typically open edaphic substrates such as gypsum, greenstone, and limestone outcrops, alkaline sinks, and volcanic talus also harbor members or close relatives of predominantly desert plant lineages in adjacent more densely vegetated biomes, including Tiquilia (Ehretiaceae; e.g. T. gossypina clade), Encelia (Compositae; e.g. Enceliopsis spp.), ivesioids (Rosaceae; Potentilla spp.), jewelflowers (Brassicaceae; Thelypodieae), Nama (Namaceae; e.g. Nama rupicola Bonpl. ex Choisy), and Zeltnera (Gentianaceae; e.g. Zeltnera calycosa (Buckley) G. Mans.). Similar patterns can be found in groups of predominantly desert animals, such as rattlesnakes, lizards, geckos, and tortoises, which are often found closely associated with rocky outcrops in a tropical context, though arboreal lifestyles may have also played a role in pre-adapting these lineages for success in the desert ecosystem (31). Putative adaptations for growth on bare rock include stress tolerant leaf traits, such as dense hairs, small waxy leaves, or lack of leaves, a caespitose growth form, shifts towards self-compatibility, and heightened anti-herbivore defenses (see ref. 32 for a recent review). In groups that show edaphic specialization for bare or chemically harsh edaphic outcrops, such innovations in stress tolerance may provide the mechanism for shifts into new, taxing environments where there is less competition or additional pressures that facilitate innovation, and therefore create opportunity for rapid radiation (32).   . Time calibrated phylogeny of Perityleae based on a fixed topology derived from a pseudocoalescent analysis of 229 low-copy nuclear genes inferred with Astral-III (8). Divergence times were estimated using the software program MCMCtree and based on a subset of 15 loci selected using multivariate phylogenetic subsampling with the program genesortR in the R statistical environment (R Core Team 2019))(9-10,12; Table S2). Our analysis was conditioned on an uncorrelated log normal relaxed molecular clock and an HKY molecular substitution model with four gamma distributed rate categories. A secondary node calibration of 7.93-23.    Fig. S5. Maximum clade credibility phylogeny of Perityleae based on a partitioned data matrix of 229 low-copy nuclear loci inferred from 1,000,000 generations of MCMC in ExaBayes (7). Posterior probabilities equal to one were inferred at all nodes. The analysis used two independent chains to sample the posterior distribution before computing a maximum clade credibility tree based on a 50% majority rule. (A) Position of the widespread allopolyploid Perityle emoryi, a major topological incongruence with the pseudocoalescent (ASTRAL) analysis, which resolves P. emoryi and the sample tentatively determined as P. cf. californica as the sister lineage to Laphamia.    Table S2. A summary of descriptive statistics for low-copy nuclear loci obtained by nextgeneration sequencing (target enrichment) with the Compositae Conserved Orthologous Set (COS) and used in full and subset data matrices to infer phylogenies of the rock daisies (3)(4)(5). Individual loci were aligned using MAFFT (34) under default settings. Characteristics of gene trees were based on phylogenies inferred separately for each locus in RAxML-NG with a GTR+Gamma+I substitution model (6). Estimation of gene properties were performed in the R statistical environment (R Core Team 2019) with the genestortR pipeline (see ref. 12 for specific details on the analyses).