Ecological niche comparison and molecular phylogeny segregate the invasive moss species Campylopus introflexus (Leucobryaceae, Bryophyta) from its closest relatives

Abstract The delimitation of the invasive moss species Campylopus introflexus from its closest relative, Campylopus pilifer, has been long debated based on morphology. Previous molecular phylogenetic reconstructions based on the nuclear ribosomal internal transcribed spacers (ITS) 1 and 2 showed that C. pilifer is split into an Old World and a New World lineage, but remained partly inconclusive concerning the relationships between these two clades and C. introflexus. Analyses of an extended ITS dataset displayed statistically supported incongruence between ITS1 and ITS2. ITS1 separates the New World clade of C. pilifer from a clade comprising C. introflexus and the Old World C. pilifer. Ancestral state reconstruction showed that this topology is morphologically supported by differences in the height of the dorsal costal lamellae in leaf cross‐section (despite some overlap). ITS2, in contrast, supports the current morphological species concept, i.e., separating C. introflexus from C. pilifer, which is morphologically supported by the orientation of the hyaline hair point at leaf apex as well as costal lamellae height. Re‐analysis of published and newly generated plastid atpB‐rbcL spacer sequences supported the three ITS lineages. Ecological niche modeling proved a useful approach and showed that all three molecular lineages occupy distinct environmental spaces that are similar, but undoubtedly not equivalent. In line with the ITS1 topology, the C. pilifer lineage from the New World occupies the most distinct environmental niche, whereas the niches of Old World C. pilifer and C. introflexus are very similar. Taking the inferences from ecological niche comparisons, phylogenetics, and morphology together, we conclude that all three molecular lineages represent different taxa that should be recognized as independent species, viz. C. introflexus, C. pilifer (Old World clade), and the reinstated C. lamellatus Mont. (New World clade).


| INTRODUCTION
Accurate species identification is of great importance, for example, in biodiversity assessments, conservation, but also to monitor species with invasive potential. In bryophytes, a morphological species concept is still most commonly employed (Shaw, 2009), but species identification is frequently hampered by relatively few and often (highly) variable morphological characters. This is especially true for species of large and taxonomically complex genera such as Campylopus Brid.
Although extensive morphological revisions have reduced the number of Campylopus species from ca. 1,000 to about 150 (Frahm, 1999; and references therein; Frey & Stech, 2009), the circumscription and identification of many of these species remains difficult. This is especially pressing in delimiting the invasive species Campylopus introflexus (Hedw.) Brid. from its closest relatives.
Assessing morphological species delimitations in Campylopus using molecular data is challenging. Chloroplast markers (atpB-rbcL, trnT-F, atpI-atpH) provided little phylogenetic signal to delimit species (Stech, 2004), as also reported for other moss genera such as Dicranum Hedw. (Lang, Bocksberger, & Stech, 2015), and were partly difficult to sequence, especially at pB-rbcL, the slightly more variable of the cpDNA markers. In contrast, the nuclear ribosomal internal transcribed spacer (ITS1-5.8S-ITS2) region, the most widely used nuclear marker for plant phylogenetic inferences (Stech & Quandt, 2010), is highly variable in Campylopus. Two main types of ITS1 were found, one in C. introflexus, C. pilifer from the Old World and the sister genus Pilopogon, and the other in the New World samples of C. pilifer and other Campylopus species (discussed in detail by Stech & Dohrmann, 2004). However, the weakly resolved maximum parsimony analyses of ITS1 and ITS2 separately in Stech and Dohrmann (2004) did not allow to fully assess the impact of different phylogenetic signals in both spacers on species relationships in Campylopus.
Incongruence between 18S and the internal transcribed spacers was reported by Durand et al. (2002) in the invasive green algal species Cauler paracemosa (Forsskål) J. Agardh, and paralogous ITS copies were found in the angiosperm family Calycanthaceae (Li, Ledger, Ward, & del Tredici, 2004); however, incongruence between ITS1 and ITS2, as observed in Campylopus has, to the best of our knowledge, not yet been found in any other group of land plants. In contrast to several other widespread Campylopus species, C. introflexus was shown to be monophyletic and well delimited based on ITS sequences, but its relationships with the two molecular lineages of C. pilifer remained ambiguous (Gama et al., 2016;Stech & Dohrmann, 2004;Stech, Sim-Sim, & Kruijer, 2010;Stech & Wagner, 2005).
As Padial, Miralles, Dela Riva, and Vences (2010) have pointed out, consensus is emerging that species are separately evolving lineages of populations or metapopulations, and the decision that separate lineages should be recognized as distinct species should take into account a combination of different data and analysis methods (integrative taxonomy). Integrative taxonomic approaches combining molecular and morphological evidence have indeed shed new light on species delimitations in bryophytes (e.g., Caparrós, Lara, Draper, Mazimpaka, & Garilleti, 2016;Dirkse, Losada-Lima, & Stech, 2016;Draper et al., 2015;Medina, Lara, Goffinet, Garilleti, & Mazimpaka, 2012;Sim-Sim et al., 2017). In Campylopus, additional data sources are needed to assess both the molecular phylogenies and the morphological variability. In this study, we use ecological niche comparison techniques (Broennimann et al., 2012) to understand niche differences between the molecular lineages of C. pilifer and C. introflexus. We consider the niche as describing the set of biotic and abiotic conditions where a species can persist (Grinnell, 1917;Holt, 2009;Hutchinson, 1957), which is the niche concept important for understanding the large-scale geographic distribution of species (Wiens et al., 2010). According to the competitive exclusion principle, no two species can occupy exactly the same niche space (cannot have equivalent niches) over a long period of time (Gause, 1934). Even in the case of niche conservatism, i.e., when species diversification is not driven by ecological speciation, closely related species tend to be ecologically similar, but not identical (e.g., Kozak & Wiens, 2006;López-Alvarez et al., 2015). If the ecological niche evolves as part of the speciation process, patterns of ecological differentiation can be potentially useful for species delimitation (Martínez-Gordillo, Rojas-Soto, & Espinosa de losMonteros, 2010). In fact, there is growing evidence that ecological niche data can assist species delimitation in different groups of organisms, including vertebrates (e.g., Leaché et al., 2009;Martínez-Gordillo et al., 2010;Raxworthy, Ingram, Rabibisoa, & Pearson, 2007), invertebrates (e.g., Gurgel-Gonçalves, Ferreira, Rosa, Bar, & Galvao, 2011;Hawlitschek, Porch, Hendrich, & Balke, 2011), and, most recently, also land plants (e.g., Aguirre-Gutiérrez, Serna-Chavez, Villalobos-Arámbula, Pérez de la Rosa, & Raes, 2015;Shrestha & Zhang, 2015). However, no such study has yet been performed in bryophytes.
Based on the ecological data, molecular analysis of extended datasets (phylogenetic analysis of nuclear ITS and haplotype analysis of plastid atpB-rbcL sequences), and re-evaluation of diagnostic morphological characters using ancestral state reconstruction, we aim to conclude about species delimitations of C. introflexus and C. pilifer.

| Taxon sampling
For molecular phylogenetic analysis, ITS sequences of 89 specimens of Campylopus and two of Pilopogon Brid. as outgroup representatives, were compiled. Of these, 67 were taken from own previous analyses (Gama et al., 2016;Stech, 2004;Stech & Dohrmann, 2004), were renamed based on the molecular results, except C. catarractilis (see Section 4). AtpB-rbcL spacer sequences were compiled for 38 Campylopus specimens (20 from Stech, 2004;Stech & Dohrmann, 2004;Stech, 2006 andGama et al., 2016; and 18 newly sequenced). Voucher information and Genbank accession numbers of the newly sequenced specimens are provided in Table 1. For ecological niche comparisons, voucher information from 1,242 additional collections of C. pilifer from the New and Old World as well as C. introflexus were obtained from different herbaria (L, MO, NY, SP and UB) and from the Global Biodiversity Information Facility (GBIF). Specimens were selected in order to cover the total distribution ranges of the three lineages as far as possible. Data from GBIF collected prior to 1990 were excluded to minimize errors, especially identification problems of older collections, considering that the taxonomic relationships between C. pilifer and C. introflexus started to be investigated in more detail from that decade onwards (Frahm, 1991(Frahm, , 1999Frahm & Stech, 2006;Stech & Dohrmann, 2004).
Separate phylogenetic analyses of ITS1 and ITS2 were performed under maximum likelihood (ML) and Bayesian inference (BI). The GTR+Γ model, selected under the Akaike information criterion in jModelTest2 (Darriba, Taboada, Doallo, & Posada, 2012;Guindon & Gascuel, 2003), was applied. Maximum likelihood trees were calculated withRaxML version 8.0.26 (Stamatakis, 2014) using raxmlGUI version 1.3.1 (Silvestro & Michalak, 2012). Bootstrap support (BS) values were obtained with a thorough bootstrap algorithm and 10,000 pseudoreplicates. Bayesian analyses were run using MrBayes version 3.2.5 (Ronquist et al., 2012). Bayesian posterior probabilities (PP) were estimated by the Markov Chain Monte Carlo (MCMC) method. Four runs with four chains each (three heated and one cold) were run with 30 × 10 6 generations, with chains sampled every 1,000th generation and the respective trees written to a tree file. A threshold of <0.01 for the standard deviation of split frequencies was used to assess convergence of runs. Fifty percent majority rule consensus trees and posterior probabilities of clades were calculated combining the four runs using the trees sampled after convergence of the chains and the "burn-in" (25% of the trees) discarded.

| Ecological niche comparisons
We selected environmental data regarding eco-physiological constraints of the target taxa according to recent literature on habitat preferences (Frahm & Stech, 2006;Klinck, 2010;Spagnuolo et al., 2014;Sparrius & Kooijman, 2011;Sparrius, Sevink, & Kooijman, 2012). The selected 10 environmental variables were obtained from WorldClim (www.worldclim.org) and are of high spatial and temporal resolution. The variables are derived from interpolation of weather stations data as described by Hijmans, Cameron, Parra, Jones, and Jarvis (2005) and are listed in Table 2. We included several components of temperature variation and precipitation variables to account for the different biomes within the area of occurrence and its constraints on the survival of the target taxa. All selected variables presented Pearson's correlation ≤0.70 (Dormann et al., 2013) and had a spatial resolution of 10 × 10 km at the equator.
We calculated ecological niche characteristics in order to assess how much of the environmental niche space is shared between the molecular lineages of C. introflexus, C. pilifer from the New World and C. pilifer from the Old World. We used an ordination technique with kernel smoothers (Broennimann et al., 2012) to extract the ecological niche space that is occupied by each of the molecular lineages and to quantify niche overlap, equivalence and similarity. The number of occurrences used per molecular lineage may be biased and not representative for the total distribution of the taxa in the environmental space, possibly resulting in an incorrect estimation of their density. Therefore, a kernel density function was applied for smoothing the density of occurrences throughout each cell in the environmental space, leading to a better indication of the suitability of the environmental conditions per lineage. We performed the analyses using a principal component analysis calibrated on the whole environmental space of the study area (PCA-ent). All analyses were carried out in R (R Development Core Team, 2014).
We obtained the niche breadth of each lineage (amount of ecological niche space available to the different lineages) by using the Levins' inverse concentration metric (Levins, 1968). To quantify the niches shared by the Campylopus lineages, we computed the niche overlap under Schoener's D statistic from the ecological niche space (Schoener, 1968;Warren, Glor, & Turelli, 2008), under which the value of D ranges from 0 to 1 (0 meaning each two lineages have no overlap in the environmental space and 1 meaning they share the same environmental space).
The niche equivalence test was performed in order to assess whether the ecological niches of each pair of molecular lineages differed significantly from each other or were interchangeable. We T A B L E 1 Voucher information and GenBank accession numbers for the newly generated sequences of Campylopus species compared the niche overlap values (D) of the pairs of molecular lineages to a null distribution of 100 overlap values. In case the niche overlap value of the molecular lineages being compared was significantly lower than those acquired by the null distribution (p < .05), we assumed the ecological niches not to be equivalent.
Considering that the test for niche equivalency test only assesses whether two species are identical in their niche space by using their exact locations, but disregards the surrounding space, we also performed a niche similarity test. This test assesses whether the ecological niches of any pair of species (in this case, lineages) are more different than would be expected by chance, and considers the differences in the surrounding environmental conditions in the geographic areas where both species are distributed (Warren, Glor, & Turelli, 2010).
We investigated the main environmental variables that constrain the distributions of the lineages based on the loadings of the first two axes of the PCA-ent.

| Phylogenetic analysis
The ITS1 dataset comprised a total of 1,376 characters (alignment positions 1−1071, indels 1072−1376). The large number of alignment positions was mainly caused by the two main ITS1 types, which for large parts were separated in different blocks in the alignment. The Bayesian inference (BI) consensus tree from ITS1 is shown in Fig. 1 All included Old World C. pilifer specimens as well as four specimens from Réunion shared the same atpB-rbcL haplotype (Fig. 3).
Sequence divergence was higher within C. introflexus and the C. pilifer from the New World (three haplotypes each). In addition, C. introflexus displayed the A-type loop inversion in the middle part of the spacer, in contrast to the T-type in C. pilifer (cf. details in Stech, 2004), and all three lineages displayed different numbers of AT repeats in a microsatellite region in the spacer.

| Ecological niche comparisons
The distributions of the three molecular Campylopus lineages (C. introflexus, C. pilifer Old World, C. pilifer New World) are related to different responses to the environment and resulted in distinct distributions in niche space (Fig. 4). The analysis of ecological niche properties showed that the two first axes of the PCA-ent were able to explain 79.95% of the variance of the data. The first axis was determined mostly by temperature-related bioclimatic factors (namely isothermality, minimum temperature of coldest month, mean temperature of driest quarter, mean temperature of wettest quarter and maximum temperature of warmest month), accounting for 54% of the total variation in environmental conditions for the taxa in the study area (Fig. 4).
The highest axis loadings where observed for minimum temperature T A B L E 2 Environmental variables used for ecological nichemodeling. WorldClim (Hijmans et al., 2005) (Table 3).
The pairwise niche similarity comparison between the C. pilifer New and Old World lineages indicated that their niche overlap falls within the 95% confidence limits of the null distributions (p > .05), leading to non-rejection of the hypothesis of retained niche similarity (Table 3).
The niche similarity between the C. pilifer New World clade and C. introflexus clade was higher than expected by chance. The niche similarity between the C. pilifer Old World lineage and C. introflexus was higher than expected by chance in one direction only, indicating that the niche of the Old World clade was more similar than expected by chance to the one of C. introflexus, but not vice versa. In contrast, the niche equivalency was rejected for all pairwise comparisons, which indicates that the lineages underwent significant alteration of their distribution in environmental niche space along the process of colonization of the areas within their current distribution.

| Ancestral state reconstructions
Reflexed hyaline hairpoints were restricted to C. introflexus, whereas erect hairpoints occurred in both C. pilifer lineages and in C. catarractilis (Figs 5 and 6). Apart from the outgroup, the hairpoint was rarely scored as absent, only in three C. introflexus specimens as well as a single specimen of C. pilifer from the New World. The ITS1 topology suggests that both the reflexed and erect hairpoint states have arisen more than once in the evolutionary history (Fig. 5)

| DISCUSSION
The model-based analyses of the present ITS dataset resulted in wellresolved and supported phylogenetic reconstructions, which allow to assess the different phylogenetic signals of ITS1 and ITS2 in the studied Campylopus species more precisely than the maximum parsimony analyses in Stech and Dohrmann (2004 separates the New World clade of C. pilifer from a clade comprising C. introflexus and the Old World C. pilifer (Fig. 1), whereas ITS2 separates C. introflexus from C. pilifer s.l. (Fig. 2). Further incongruence is observed concerning the positions of C. catarractilis as well as four samples from Réunion Island. The ITS2 thus seems to coincide with the current morphological species concept of C. introflexus and C. pilifer s.l., which is supported by the distribution of reflexed versus straight hairpoints at leaf apex as well as costal lamellae 1−2 versus >2 cell rows high in the ancestral state reconstructions (Fig. 6). The ITS1 topology, however, seems to be supported by morphology as well, separating specimens with costal lamellae (4−)5−6 cell rows high (New World C. pilifer) from specimens with mostly lower lamellae, namely 1(−2) cell rows in C. introflexus and 3−4(−6) cell rows in Old World C. pilifer (Fig. 5).
The presence of conflicting ITS-based hypotheses and morphological support for each of the ITS lineages, but also the observed overlap in lamellae height at least in some specimens, indicate that it is important to consider other sources of biological information to delimit Campylopus introflexus and C. pilifer. Despite the smaller number of plastid atpB-rbcL sequences analyzed here and the low resolution of earlier phylogenetic trees based on plastid sequences, the haplotype network approach supports the existence of the three lineages, C. introflexus, New World C. pilifer, and Old World C. pilifer (Fig. 3), without mixing of haplotypes except for the same four samples from Réunion Island that also deviate with ITS. Furthermore, ecological niche comparison proved a useful approach and showed that all three molecular lineages occupy distinct environmental spaces that are similar, but undoubtedly not equivalent ( suggest that hybridization has possibly occurred between C. lamellatus and C. pilifer. This is supported by the atpB-rbcL spacer sequences of samples C013, C120, C121, and C122, which all belong to C. pilifer.
Since the plastid DNA is maternally inherited, C. pilifer seems to be the maternal ancestor of the putative hybrid specimen on Réunion.
However, further analyses based on a larger taxon and marker sampling is necessary. Gradstein and Sipman (1978) considered the type specimen of is morphologically closest to C. introflexus in its short lamellae, but differs by the erect hair point (Frahm & Stech, 2006), and C. pilifer subsp.
vaporarius, confined to volcanic fumaroles in Italy. The latter has dorsal lamellae 2−3 (rarely 4) cell rows high, and both erect and reflexed hyaline hair points, even on the same stem (Spagnuolo et al., 2014).
No molecular data are available yet from C. pilifer subsp. galapagensis Frahm, another narrow endemic described from volcanic rock in Galapagos, which has dorsal lamellae 2−3 cell rows high but differs by the presence of ventral substereids instead of hyalocysts in costa cross section, possible as adaptation to drier habitats (Frahm, 1991 The improved understanding of the delimitations of C. introflexus, C. lamellatus, and C. pilifer is expected to facilitate the identification of collected specimens. This will be particularly helpful to assess the native distribution area of C. introflexus and monitor its distribution in areas where it is invasive. With the exception of the C. catarractilis specimen, all analyzed specimens originally identified as other Campylopus species could be assigned to one of the three species. Similar percentages of misidentified specimens in Campylopus (15%, present study) and the Racomitrium canescens species complex (20%; Stech et al., 2013) indicate that a percentage of 15%−20% may be expected when analyzing morphologically identified specimens of closely related bryophyte species in an integrative approach.
The African species Campylopus catarractilis has not been considered closely related to either C. introflexus or C. pilifer (cf. Frahm, 1985b).
The incongruent position of C. catarractilis (sister to C. pilifer based on ITS1 and nested in C. introflexus based on ITS2) suggested that the sequenced specimen might be misidentified and of hybrid origin similar to the specimens from Réunion discussed above. However, the combination of morphological characters of an erect hairpoint, low lamellae, and the typical serrate leaf apex as diagnostic character for C. catarractilis (Frahm, 1985b) in the sequenced specimen, did not allow to unambiguously assign it to either C. introflexus or C. pilifer. Analysis of further C. catarractilis specimens is necessary to infer its taxonomic status.
Despite new insights, further morphological traits should be explored to find additional diagnostic characters that facilitate morphological identification of C. introflexus, C. lamellatus, and C. pilifer.
Two observations concerning the costa cross-section not yet mentioned in the literature were made during this study. Firstly, a group of stereid cells is found above each dorsal lamella intercalated with a larger sub-stereid cell, except in the center of the leaves where two stereid groups are fused without a sub-stereid cell between them. This causes the two central lamellae to fuse as well and grow in a V-shaped orientation. This V-shaped pattern is more conspicuous when the lamellae are higher and therefore easier to be seen in C. lamellatus and C. pilifer. Secondly, a marked difference in the delimitation of the costa was observed, which is gradual in C. pilifer, most C. lamellatus, and C. introflexus p.p., but abrupt in a well-supported clade within C. introflexus, a few C. lamellatus and in C. catarractilis. Despite the fact that these characters do not clearly delimit the molecular lineages, these observations indicate that the morphological and anatomical characters within Campylopus are not yet fully employed and that there is potential of novel characters to be found.
Global moss diversity analyses are still hampered by taxonomic and spatial distribution knowledge gaps, particularly in the tropics (Geffert, Frahm, Barthlott, & Mutke, 2013). The impact of misunderstood species delimitations, and misidentifications based on morphology, on species distribution patterns was recently demonstrated for Brazil, where it was understood by the bryological community that no C. introflexus occurred in the country, and that all piliferous Campylopus specimens belonged to C. pilifer. Gama et al. (2016), however, revealed that both species have an overlapping distribution in many places of South America, including Brazil. These findings have immediate impact on the checklist of bryophytes of Brazil, which has not recognized C. introflexus yet (Costa et al., 2011). A similar situation may occur in Australia and New Zealand, where currently only C. introflexus is reported. Considering the potential distribution of C. pilifer and a possible identification bias, it seems likely that C. pilifer will be found to occur in Australasia as well, but further investigation is necessary. In accordance with Silva, Vilela, De Marco, and Nemésio (2014), we have shown that ecological niche assessments can aid in the understanding of "data deficient" species. 1848.

ACKNOWLEDGMENTS
We thank CAPES (Brazilian Education agency) for financing the PhD project of the first author as well as the current study. We thank the curators of herbaria MO, NY, SP and UB for loan of specimens.