Phylogeny and circumscription of Sapindaceae revisited: molecular sequence data, morphology and biogeography support recognition of a new family, Xanthoceraceae

1 Department of Biodiversity and Conservation, Real Jardin Botanico, CSIC, Plaza de Murillo 2, ES-28014 Madrid, Spain
2 Missouri Botanical Garden, P.O. Box 299, St. Louis, MO 63166-0299, U.S.A.
3 Muséum National d’Histoire Naturelle, Case Postale 39, 57 rue Cuvier, FR-75231 05 CEDEX, Paris, France
4 Department of Ecology and Evolution, Biophore, University of Lausanne, CH-1015 Lausanne, Switzerland
5 Department of Botany, Bergius Foundation, SE-10691, Stockholm University, Stockholm, Sweden
6 Institute of Biology, University of Neuchâtel, Rue Emile-Argand 11, CH-2000 Neuchâtel, Switzerland
7 Conservatoire et Jardin botaniques de la ville de Genève, ch. de l’Impératrice 1, CH-1292 Chambésy, Switzerland
* Author for correspondence: pete.lowry@mobot.org


INTRODUCTION
The systematics of the family Sapindaceae has challenged taxonomists for more than a century since its first comprehensive treatment was published by Radlkofer (1890, 1933). Until the late 1980s, Sapindaceae were widely treated as distinct from two closely related families, Hippocastanaceae and Aceraceae, based primarily on morphology and biogeography (Takhtajan 1987, Cronquist 1988, Dahlgren 1989). Several recent studies using pollen morphology (Müller & Leenhouts 1976), phytochemistry (Umadevi & Daniel 1991) and molecular sequence data (Gadek et al. 1996, Savolainen et al. 2000, APG II 2003, APG III 2009, Harrington et al. 2005, Buerki et al. 2009) have, however, led to the adoption of a broader concept in an effort to ensure monophyly, uniting these entities into a single family, Sapindaceae s. lat.
Sapindaceae s. lat. as currently circumscribed by Harrington et al. (2005), Thorne & Reveal (2007) and Buerki et al. (2009, 2010) comprise c. 1900 species and 142 genera distributed among four subfamilies: Dodonaeoideae Burnett, Hippocastanoideae Burnett, Sapindoideae Burnett and Xanthoceroideae Thorne & Reveal. Recently, Buerki et al. (2009) demonstrated the para-/polyphyly of all tribes as defined by Radlkofer (1933), with a single exception, Paullinieae Kunth. Although they sketched an informal system that recognizes a dozen monophyletic groups, they did not propose new tribal limits within the four subfamilies as many potentially important genera of Sapindaceae were not included in their study due to the lack of sequenceable material.
Historically, Radlkofer (1933) recognized fourteen tribes within Sapindaceae s. str., five in Dodonaeoideae and nine in Sapindoideae (see table 1 in Buerki et al. 2009 for details). Within Dodonaeoideae, however, he encountered difficulty assigning nine genera to the four previously described tribes, ultimately deciding to place them in a new tribe, Harpullieae Radlk. Within this heterogeneous assemblage, he recognized two informal groups according to the presence (Delavaya Franchet, Ungnadia Endl. and Xanthoceras Bunge) or absence (Arfeuillea Pierre, Conchopetalum, Eurycorymbus Hand.-Mazz., Harpullia Roxb., Magonia A.St.-Hil. and Majidea J.Kirk ex Oliv.) of a terminal leaflet. While revising Radlkofer's infrafamilial system, largely on the basis of pollen and other morphological features, Müller & Leenhouts (1976) discussed the possible expansion of Harpullieae to include the three genera comprising Hippocastanaceae, viz. Aesculus L., Billia L. and Handeliodendron (these authors did not, however, comment on the taxonomic status of Aceraceae). In their revised classification, Müller & Leenhouts (1976) concluded that the connection between Hippocastanaceae and Harpullieae might involve two genera in particular, Handeliodendron, originally described in Sapindaceae (Rehder 1935), and Delavaya, which has always been placed in Sapindaceae. Müller & Leenhouts (1976) also regarded Harpullieae as a "heterogeneous assemblage", with several genera difficult to connect to the others. For example, they classified Harpullia pollen as both type-A and type-H and Magonia pollen as type-E, whereas other members of the tribe exclusively exhibit the more common type-A pollen (see Buerki et al. 2009 for more details on pollen morphology). Moreover, Harpullieae range from tropical (e.g. Conchopetalum, Delavaya, Magonia) to temperate (Xanthoceras) regions and include both evergreen and deciduous species (Radlkofer 1933, Müller & Leenhouts 1976). Based on wood anatomy, Klaassen (1999) noted a difference between the temperate and tropical genera in the tribe, and among the tropical ones he indicated that Delavaya and Ungnadia stood out because their wood is similar to that of members of tribe Cupanieae Reichenb. (Sapindoideae). Buerki et al. (2009) found Harpullieae to be polyphyletic, with Xanthoceras occupying a basal position within Sapindaceae s. lat., Arfeuillea, Eurycorymbus, Harpullia and Majidea placed in Dodonaeoideae, Delavaya occupying a basal position within Sapindoideae, and Conchopetalum resolved in the Macphersonia group (Sapindoideae; Buerki et al. 2009) closely related to the newly described endemic Malagasy genus Gereaua Buerki & Callm. (Buerki et al. 2010). A close relationship between Delavaya and Ungnadia was found in an earlier cladistic analysis based on morphology (Judd et al. 1994), which identified the presence of prolonged basal petal appendages and glabrous stamens as putative synapomorphies, again suggesting that Harpullieae were far from representing a natural assemblage.
In the present study we seek to (1) clarify the relationships of Xanthoceras within Sapindaceae s. lat. and in particular with respect to the other taxa traditionally and/or currently placed in Harpullieae, and (2) re-examine the appropriateness of maintaining the current broadly circumscribed but morphologically heterogeneous definition of Sapindaceae and explore the possible advantages of alternative family circumscriptions. Toward this end, we have significantly expanded the dataset of Buerki et al. (2009) to conduct a new set of phylogenetic analyses, comparing the results with information from morphology and biogeography.

Sampling, sequence data and phylogenetic analyses
Species names, voucher information, and GenBank accession numbers for all sequences are provided in the appendix. The dataset presented in Buerki et al. (2009) was expanded to include a total of 243 samples encompassing more than 70% of the generic diversity in Sapindaceae s. lat. (104 of the currently recognized 142 genera; half of the 38 genera not included in this analysis are monospecific), representing an increase of ninety ingroup samples and nineteen genera. To assess the phylogenetic relationships of the taxa placed in tribe Harpullieae and in the traditionally recognized families Aceraceae and Hippocastanaceae, we sampled at least one species from each genus currently assigned to these groups by adding the following genera: Magonia and Ungnadia from Harpullieae, plus Billia and Handeliodendron from Hippocastanaceae (Aesculus, the third member of this family, was included in the analysis of Buerki et al. 2009, as were both genera of Aceraceae, Acer and Dipteronia). The outgroup sampling included one taxon each from Anacardiaceae (Sorindeia sp., used as the most external outgroup), Meliaceae (Malleastrum sp.) and Simaroubaceae (Harrisonia abyssinica Oliv.).
The DNA extraction, amplification and sequencing protocols used for the nuclear and plastid regions are provided in Buerki et al. (2009). The nuclear sequences include the whole ITS region (ITS1, 5.8S and ITS2) and plastid markers include both coding (matK and rpoB) and non-coding regions (the trnL intron and the intergenic spacers trnD-trnT, trnK-matK, trnL-trnF and trnS-trnG).
Single-gene, total evidence analyses and their corresponding bootstrap analyses were performed using the maximum parsimony (MP) and maximum likelihood (ML) criteria following the same procedure as in Buerki et al. (2009). Parsimony ratchet (Nixon 1999) was performed for each partition and for the combined data set using PAUPRat (Sikes & Lewis 2001). Ten independent searches were performed with 200 iterations and 15% of the parsimony informative characters perturbed. A strict consensus tree was constructed based on the shortest equally parsimonious trees. To assess support at each node, non-parametric bootstrap analyses (Felsenstein 1985) were performed using PAUP* (Swofford 2002) following the same procedure as in Buerki et al. (2009). Model selection for each partition was assessed using Modeltest v. 3.7 (Posada & Crandall 1998). ML analyses were performed using RAxML v. 7.0.0 (Stamatakis 2006, Stamatakis et al. 2008) with 1000 rapid bootstrap analyses followed by a search for the best-scoring tree in one single run. These analyses were done using the facilities made available by the CIPRES portal in San Diego, USA (http://8ball.sdsc.edu:8888/cipres-web/home).
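As a quick check on the sampling figures quoted above, the following Python lines recompute the generic-level coverage from the two counts given in the text (104 sampled genera out of 142 recognized). The variable names are illustrative only.

```python
# Generic-level sampling coverage, using only the counts stated in the text.
sampled_genera = 104
recognized_genera = 142

coverage = sampled_genera / recognized_genera      # fraction of genera sampled
unsampled = recognized_genera - sampled_genera     # genera missing from the analysis

print(f"coverage = {coverage:.1%}; unsampled genera = {unsampled}")
```

The result is consistent with the statement that more than 70% of the genera are represented and that 38 genera remain unsampled.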
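The two resampling ideas named above, perturbation of a fraction of the characters in the parsimony ratchet and column resampling in the non-parametric bootstrap, can be illustrated with a minimal Python sketch. This is not the PAUPRat, PAUP* or RAxML implementation: the function names `ratchet_weights` and `bootstrap_alignment`, the choice of doubling the weight of perturbed characters, and the toy alignment are all assumptions made for illustration.

```python
import random

def ratchet_weights(n_informative, fraction=0.15, seed=None):
    """Return one weight per informative character for a single ratchet iteration.

    A random subset (15% by default, as in the text) is upweighted.
    Assumption: perturbed characters get weight 2, all others weight 1.
    """
    rng = random.Random(seed)
    n_perturbed = round(n_informative * fraction)
    chosen = set(rng.sample(range(n_informative), n_perturbed))
    return [2 if i in chosen else 1 for i in range(n_informative)]

def bootstrap_alignment(alignment, seed=None):
    """Build one bootstrap pseudoreplicate by sampling columns with replacement.

    `alignment` maps taxon name -> aligned sequence (all of equal length).
    """
    rng = random.Random(seed)
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {taxon: "".join(seq[c] for c in columns)
            for taxon, seq in alignment.items()}

# Toy data with hypothetical taxon labels (not sequences from the study).
toy = {"taxon_A": "ACGTACGT", "taxon_B": "ACGTTCGT", "taxon_C": "ACCTTCGA"}
weights = ratchet_weights(n_informative=200, seed=1)
replicate = bootstrap_alignment(toy, seed=1)
print(sum(w == 2 for w in weights), "characters upweighted")
```

In a real analysis, each set of weights or each pseudoreplicate would be passed to a tree search; the sketch covers only the resampling step.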
Topological differences between single-gene phylogenetic trees were compared by taking into account the level of resolution obtained by each marker and its bootstrap support.Topological differences with bootstrap support (BS) less than 75% were not considered.
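The 75% criterion can be expressed as a simple filter. The sketch below is a hedged illustration, not the procedure actually scripted by the authors: each gene tree is represented as a mapping from a clade (a set of taxa) to its bootstrap support, two clades are taken to conflict when they overlap without one containing the other, and a conflict is reported only when both clades reach the threshold (this last reading of the criterion is an assumption). The function names and taxon labels are hypothetical.

```python
def clades_conflict(a, b):
    """Two clades conflict if they share taxa but neither is nested in the other."""
    return bool(a & b) and not (a <= b or b <= a)

def supported_conflicts(tree1, tree2, threshold=75):
    """List conflicting clade pairs in which both clades have BS >= threshold.

    tree1, tree2: dict mapping frozenset of taxon names -> bootstrap support (%).
    Assumes both trees are rooted and built on the same set of taxa.
    """
    conflicts = []
    for clade1, bs1 in tree1.items():
        if bs1 < threshold:
            continue  # weakly supported clades are ignored
        for clade2, bs2 in tree2.items():
            if bs2 >= threshold and clades_conflict(clade1, clade2):
                conflicts.append((clade1, clade2))
    return conflicts

# Toy example with hypothetical taxa A-D.
gene1 = {frozenset("AB"): 90, frozenset("ABC"): 60}
gene2 = {frozenset("BC"): 80, frozenset("ABC"): 95}
gene3 = {frozenset("BC"): 60, frozenset("ABC"): 95}

print(len(supported_conflicts(gene1, gene2)))  # AB (90) vs BC (80) is a supported conflict
print(len(supported_conflicts(gene1, gene3)))  # BC has BS 60, so no supported conflict
```

Under this reading, partitions are combined only when the list of supported conflicts is empty for every pair of single-gene trees.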

Alignment and phylogenetic analyses
The number of samples and statistics for each partition and the combined data set are summarized in table 2. The best-fit model for all partitions was the general time reversible (GTR) with an alpha parameter for the shape of the gamma distribution to account for among-site rate heterogeneity (GTR+G). The only exception was the ITS region, in which a proportion of invariable sites was added (GTR+G+I). The MP and ML single-gene phylogenies provided different levels of resolution, but no differences with a bootstrap support greater than 75% were identified when compared, so we combined them in a total evidence approach. Statistics (number of most parsimonious trees; tree length; and consistency and retention indices) for each analysis are reported in table 1.
For the combined analyses under the MP criterion, nine of the ten independent PAUPRat searches converged on a best score of 11526 steps and produced a total of 949 most parsimonious trees, which were used to compile a strict consensus (not shown); this consensus tree comprised several polytomies, especially near the tips. Under the ML criterion, the best-fit model for the combined matrix was GTR+G+I. This model was used to perform the single ML run search (log likelihood = −79995.7), followed by rapid bootstrap analyses.
When compared, analyses compiled under the MP and ML criteria yielded very similar topologies. Moreover, as no moderately to strongly supported differences were observed between the two phylogenetic trees, only the ML topology will be presented and discussed hereafter (figs 1 & 2).
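To illustrate why a strict consensus of many equally parsimonious trees tends to contain polytomies, the following minimal Python sketch keeps only the clades found in every input tree. It is an illustration under simple assumptions (each tree given as a set of clades, each clade a set of taxa; hypothetical taxon labels), not the routine used in PAUP*.

```python
def strict_consensus(trees):
    """Return the clades shared by all trees.

    trees: iterable of trees, each a set of frozensets of taxon names.
    Any clade missing from at least one tree is dropped, which collapses
    the corresponding node into a polytomy in the consensus.
    """
    trees = [set(t) for t in trees]
    if not trees:
        return set()
    return set.intersection(*trees)

# Two toy trees on hypothetical taxa A-D that agree on (A,B) only.
tree_1 = {frozenset("AB"), frozenset("ABC")}
tree_2 = {frozenset("AB"), frozenset("ABD")}

consensus = strict_consensus([tree_1, tree_2])
print(sorted("".join(sorted(clade)) for clade in consensus))
```

Here only the clade (A,B) survives; the placement of C and D relative to it is unresolved, which is the kind of polytomy reported near the tips of the MP consensus.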

Phylogenetic relationships
With the addition of the ninety ingroup samples used in the present analysis, including representatives of several genera of Sapindaceae s. lat. that had not previously been sequenced, the phylogenetic relationships revealed here are highly congruent with the informal system proposed by Buerki et al. (2009). Based on sampling that includes at least one representative of all genera traditionally placed in Sapindaceae tribe Harpullieae, Aceraceae and Hippocastanaceae, our results further confirm that Xanthoceras sorbifolium Bunge (previously assigned to Harpullieae by Radlkofer, 1933) is resolved as sister to the remaining sampled members of Sapindaceae s. lat. (albeit with low BS; fig. 1A). Our results also indicate that the other genera of Harpullieae belong to three additional clades, one in subfam. Dodonaeoideae and two in subfam. Sapindoideae (figs 1 & 2), confirming the polyphyly of the tribe. Within Dodonaeoideae, five of the genera currently assigned to Harpullieae belong to the Dodonaea group, viz. Arfeuillea, Eurycorymbus, Harpullia, Magonia and Majidea (fig. 1C), and Harpullia itself appears to be polyphyletic, with the three species sampled occupying very different positions within the phylogeny (expanded sampling to include additional members of the genus is, however, needed to confirm this finding). Within Sapindoideae, two of the three remaining genera assigned to Harpullieae (Delavaya and Ungnadia) are placed in the Delavaya group, the basalmost lineage within the subfamily, and the third genus (Conchopetalum) belongs to the Macphersonia group (fig. 2).
The inclusion of Billia and Handeliodendron in our analysis, along with additional species of Acer and Aesculus, strengthens support for the monophyly of both Aceraceae and Hippocastanaceae and confirms their sister relationship (fig. 1A & B). Our results suggest the possible paraphyly of Acer (with respect to Dipteronia) and of Aesculus (with respect to Billia and Handeliodendron), although this finding should be tested further with additional sampling. Support for the clade comprising Sapindaceae s. str. (i.e. Dodonaeoideae plus Sapindoideae) is likewise stronger in the present analysis (BS = 88) than in that of Buerki et al. (2009; BS = 69; fig. 1). Moreover, Diplokeleba N.E.Br., long regarded as a member of Sapindoideae (tribe Cupanieae), is instead placed within Dodonaeoideae (fig. 1C).

Polyphyly of Harpullieae
The results presented above clearly show that the tribe Harpullieae (as well as all other sapindaceous tribes with the exception of Paullinieae), as defined initially by Radlkofer (1890, 1933) and modified by Müller & Leenhouts (1976), is highly polyphyletic, with members placed in no fewer than four clades scattered among various parts of Sapindaceae s. lat. Harrington et al. (2005) and Buerki et al. (2009) argued that additional sampling (especially of Harpullieae) was required before taking a definitive stand regarding the phylogenetic and taxonomic status of Xanthoceras. Although we have now analyzed more than 70% of the genera and included all those that are putatively related to Xanthoceras, its precise phylogenetic position within Sapindaceae is not strongly supported (BS < 50; fig. 1A). However, both the MP and ML analyses presented here clearly point toward Xanthoceras comprising a basal lineage within Sapindaceae s. lat. (fig. 1A). Moreover, a study comparing the performance of supertree methods based on an identical dataset (Buerki et al. in press) produced the same result, with both the Matrix Representation with Parsimony and MinFlip supertree methods placing Xanthoceras as the most basally branching lineage. This phylogenetic pattern might be explained either by a higher rate of extinction in the lineage that now comprises only Xanthoceras than in the other lineages, or alternatively by a rapid diversification or radiation of these other lineages resulting in a loss of phylogenetic signal (Judd & Olmstead 2004). In the case of Sapindaceae s. lat., the former hypothesis seems more likely based on preliminary divergence time estimations that place the origin of the clade in the Late Cretaceous (c. 110 My), with divergence among the four lineages occurring between 90 and 80 My (Buerki et al. in prep.).
The pattern observed here, in which resolution between lineages remains problematic even after sequencing a large number of markers from a broad sampling of taxa, has been observed in many other angiosperm groups, especially among the rosids (Bello et al. 2009 and references within), such as Fabales, where the relationships among the currently accepted families remain unsolved. In order to clarify the situation within Sapindaceae s. lat. and provide a practical classification that circumscribes easily recognizable groups, we suggest that other criteria should be considered in addition to monophyly. In addition to representing the most basal lineage within the family, Xanthoceras presents a unique and highly distinctive combination of morphological characters, including imparipinnate leaves (vs. paripinnate, evergreen leaves in most genera), large flowers with petals c. 1.5-2 cm long (vs. small flowers with petals < 1.5 cm long), 5 horn-like appendages protruding from the nectary disk (vs. no appendages protruding from the disk), 7-8 ovules per locule (vs. generally 1-2 ovules per locule) and > 15 seeds (vs. 1-3 seeds). Moreover, if Xanthoceras is included within Sapindaceae s. str., it stands out as the sole member with a north-temperate distribution, whereas all other genera occur in the tropics and/or subtropics.
The tropical Chinese genus Delavaya, traditionally assigned to Harpullieae, has been viewed by several authors (e.g. Müller & Leenhouts 1976, Cronquist 1988) as a "link" between Sapindaceae and Hippocastanaceae through the temperate genus Handeliodendron. The molecular analyses presented here failed to confirm this hypothesis (figs 1 & 2; see below). Instead, they indicate that Handeliodendron belongs to the Hippocastanaceae clade, a placement previously suggested by Forest et al. (2001) based on the presence of simple, opposite leaves, whereas Delavaya occupies a basal position within subfam. Sapindoideae along with Ungnadia from Texas and Florida, another genus originally assigned to Harpullieae (these two genera thus forming the Delavaya group; fig. 2). As indicated above, a close relationship between Delavaya and Ungnadia was previously suggested by Klaassen (1999) and Judd et al. (1994) based on wood anatomy and morphological cladistic analyses, respectively.
A majority (five out of nine) of the genera traditionally assigned to Harpullieae (viz., Arfeuillea, Eurycorymbus, Harpullia, Magonia and Majidea) belong to subfam. Dodonaeoideae, and in particular to the Dodonaea group, a finding that is consistent with Radlkofer's (1890, 1933) original placement of Harpullieae. The basalmost branch of the Dodonaea group includes the South American genus Magonia (fig. 1C), associated by Müller & Leenhouts (1976) with the temperate Asian Xanthoceras on the basis of their sharing seven or eight ovules per locule. The close relationship between Averrhoidium, Diplokeleba (previously assigned to Sapindoideae) and Magonia might be reflected in part by seed morphology; the first two genera are the only members of Sapindaceae to have winged seeds (Radlkofer 1933).
Finally, the results of the phylogenetic analyses presented in this study are in agreement with the findings of Buerki et al. (2009) with regard to the position of the last genus traditionally assigned to Harpullieae, the Malagasy endemic Conchopetalum, which is confirmed to belong to the Macphersonia group (Sapindoideae; fig. 2; see Buerki et al. in press for more details). Relationships between this taxon and other members of Sapindaceae are discussed in Buerki et al. (2009).
Harpullieae have traditionally been considered to represent a "link" between Aceraceae, Hippocastanaceae and Sapindaceae, but as mentioned above, the molecular analyses presented here fail to confirm this hypothesis (figs 1 & 2). Our analyses do not support the long-held view that Aceraceae and Sapindaceae are closely related (Radlkofer 1890, 1933, Müller & Leenhouts 1976, Umadevi & Daniel 1991). Instead, they show that the two genera currently placed in Aceraceae form a strongly supported group, and that they are more closely related to Hippocastanaceae than to the clade comprising Sapindaceae s. str. (fig. 1B), as earlier suggested by Harrington et al. (2005), Thorne & Reveal (2007) and Buerki et al. (2009).
The present analyses further confirm that (i) Sapindaceae s. lat. constitute a monophyletic entity that is supported by molecular (but not morphological) synapomorphies; (ii) the three traditionally recognized families Aceraceae, Hippocastanaceae and Sapindaceae, as circumscribed by Radlkofer (1933), are each monophyletic and moderately to strongly supported, provided that Xanthoceras is excluded from Sapindaceae; and (iii) Xanthoceras sorbifolium is sister to the clade comprising these three families (fig. 1A). The concept of a broadly defined Sapindaceae that includes Aceraceae, Hippocastanaceae and Xanthoceras, recently adopted by the Angiosperm Phylogeny Group (APG II 2003) and followed by Harrington et al. (2005), Buerki et al. (2009) and APG III (2009), is consistent with the phylogenetic relationships revealed in earlier studies and confirmed here. However, this broad circumscription of Sapindaceae presents several conceptual problems. First, no clear morphological synapomorphies have been identified for Sapindaceae s. lat. (Harrington et al. 2005, Thorne & Reveal 2007, Buerki et al. 2009) and the high level of heterogeneity that results from the inclusion of Xanthoceras and the taxa traditionally placed in Aceraceae and Hippocastanaceae makes it difficult to characterize the family. Second, treating Sapindaceae broadly reduces these easily identified and widely recognized families to synonymy, changing the long-established family assignment of several well known, emblematic and widely cultivated genera, most notably Acer and Aesculus.

Classification
Two alternative approaches are available to address the family level circumscription of the taxa currently placed in Sapindaceae s. lat.: (i) retain the broad definition recently proposed by the Angiosperm Phylogeny Group (APG II 2003, APG III 2009) or (ii) resurrect the temperate families Aceraceae and Hippocastanaceae, restrict Sapindaceae s. str. slightly by excluding Xanthoceras, and describe a new family to accommodate this genus. Both interpretations are consistent with the primary principle of classification as defined by Backlund & Bremer (1998), which requires the monophyly of taxonomic entities. However, the second approach is clearly preferable when two other principles proposed by these authors are taken into consideration, maximizing ease of identification and maintaining nomenclatural stability. While an argument could be made that it is preferable to avoid adding a new name at the family rank (Stevens 1997), in the present case we believe that this is significantly outweighed by the clear advantages of maintaining Aceraceae, Hippocastanaceae and Sapindaceae (excluding Xanthoceras) as the morphologically and biogeographically coherent entities that have been recognized for well over a century. In order to render Sapindaceae

Table 1 - Characteristics of partitions used in the phylogenetic analyses of Sapindaceae s. lat.
IGS = intergenic spacer; MP = maximum parsimony; PI = potentially parsimony informative; CI = consistency index; RI = retention index; 1 for No. of sequences, the total number of samples for the combined analyses is indicated between brackets; 2 for mean amount of phylogenetic information per sample: averaged by alignment size/variable sites number/PI sites number.