Integrative species delimitation in the common ophiuroid Ophiothrix angulata (Echinodermata: Ophiuroidea): insights from COI, ITS2, arm coloration, and geometric morphometrics

Ophiothrix angulata (Say, 1825) is one of the most common and well-known ophiuroids in the Western Atlantic, with a wide geographic and bathymetric range. The taxonomy of this species has been controversial for a century because of its high morphological variability. Here we integrate information from DNA sequence data, color patterns, and geometric morphometrics to assess species delimitation and geographic differentiation in O. angulata. We found three deeply divergent mtDNA-COI clades (K2P 17.0–27.9%). ITS2 nuclear gene and geometric morphometrics of dorsal and ventral arm plates differentiate one of these lineages, as do integrative species delineation analyses, making this a confirmed candidate species.

Traditional species delineation of ophiuroids has focused on macro-morphological characters that can show substantial variability and overlap among species, causing taxonomic uncertainty (Arlyza et al., 2013). An integrative approach that combines information from live appearance, microstructure, life history, ecology, ethology, physiology, distribution, and especially DNA sequence data provides information to resolve species in challenging groups like Ophiothrix (Bickford et al., 2007;Boissin, Féral & Chenuil, 2008;Padial et al., 2010;Pérez-Portela, Almada & Turon, 2013;O'Hara et al., 2014a;Richards, De Biasse & Shivji, 2015;Taboada & Pérez-Portela, 2016;Dos Santos Alitto et al., 2019;Newton et al., 2020). Over the past decade, the use of microstructural characters has emerged as a valuable tool in systematic studies of brittle stars, revealing their phylogenetic value (O'Hara et al., 2014b;Thuy & Stöhr, 2016). Among these characters, arm plates have proven to be particularly important in establishing a congruence with molecular data, enabling the inference of phylogenetic relationships even at the genus level (Thuy & Stöhr, 2016). Therefore, these characters offer a valuable approach to analyze species complexes and contribute to species delimitation.
Ophiothrix angulata is highly variable and has a broad latitudinal and bathymetric distribution that has attracted taxonomic attention (Tommasi, 1970;Clark, 1933;Hendler et al., 1995;Hendler et al., 1999;Hendler, 2005;Santana et al., 2017). Variation is especially notable in color (Lyman, 1865;Verrill, 1899;Clark, 1901), andClark (1918) named five varieties based on this. The species has also been noted to vary in disc shape, and arrangement of spinelets around the disc, but Hendler et al. (1995), concluded that this variation does not seem to sort into species-level units.
The goal of this study was to assess whether the great morphological diversity of O. angulata is the result of high intra-specific variation or differentiation among multiple cryptic or pseudo-cryptic species. We tested species boundaries using an integrative taxonomic approach, by combining mtDNA COI and nrDNA ITS2 sequence data, color patterns, and geometric morphometrics of dorsal and ventral arm plates. We combined results from genetic and morphological assessments for species delimitation using an integrated Bayesian phylogenetic and phylogeographic approach (iBPP) (Solís-Lemus, Knowles & Ané, 2015). We also analyzed the population diversity and demographic history of the clades discovered.  Figure 1 Sampling locations (see Table S1 for details), with the proportion of specimens per clade are given as pie charts. Clade 1 (green), Clade 2 (blue), and Clade 3 (pink

DNA extraction and sequence alignment
DNA was extracted from ethanol-fixed arm tissue using the Chelex protocol (Walsh, Metzger & Higuchi, 1991) or the Omega Bio-Tek E.Z.N.A. Mollusc DNA kit according to the manufacturer's instructions. The echinoderm barcoding primers COIceF and COIceR (Hoareau & Boissin, 2010) were used to amplify a 655-base pair (bp) region of COI as described by Michonneau & Paulay (2014). Electropherograms were checked, assembled into contigs, and manually edited using Sequencher 4.6 (Gene Code Corps, Ann Arbor, MI, USA). Consensus sequences were aligned using Muscle (Edgar, 2004), and alignment was verified by eye using PhyDE v.10.0 (Müller et al., 2010). The sequence alignment was converted to protein using Genius v8.1.7 (Kearse et al., 2012) to ensure a proper reading frame and to verify the absence of stop codons. A 527 pb section of the nuclear DNA internal transcribed spacer-2 (ITS2) was amplified using the primers OphITS2F and OphITS2R as described by Naughton et al. (2014). Sequences were deposited in GenBank (COI accession numbers: MT338285-MT338398, ON245084-ON245096; ITS2 accession numbers: OQ225473-OQ225482).

Phylogenetic analyses
Phylogenetic analyses using Bayesian Inference (BI) and Maximum Likelihood (ML) methods were performed separately for COI and ITS2 sequence datasets on the CIPRES Science Gateway portal (Miller, Pfeiffer & Schwartz, 2010). jModelTest2 (Darriba et al., 2012) was used on the CIPRES portal to select the best model of molecular evolution based on Akaike information criteria tests (AIC). ML analyses were done using RAxML-HPC2 on XSEDE (Stamatakis, 2006;Stamatakis, Hoover & Rougemont, 2008) with the GTR+GAMMA model of sequence evolution; nodal support values were assessed with 1,000 rapid bootstraps (Felsenstein, 1985). BI analyses were performed using MrBayes v.3.2.7a on XSEDE (Ronquist et al., 2012), using the GTR+I+G model. The MCMC search was based on two independent runs of four chains each and 6,000,000 generations (sampled every 1,000 generations) until the final average and standard deviation were close to 0.01. Twenty-five percent of the initial trees were discarded as Burn-in. Results were summarized in Tracer v.1.7.1  based on the Effective Sample Size-ESS for each parameter. The gene phylogenies were represented using FigTree v.1.4.4 (Rambaut, 2018) and annotated using Adobe Illustrator CC v.2017-22.0.1.

Geometric morphometric analyses
We analyzed dorsal arm plates (DAP) and ventral arm plates (VAP) from 46 specimens. Images of DAP and VAP were obtained through scanning electron microscopy (SEM) from intermediate-sized specimens with disc diameters (DD) of 2.5 to 5.5 mm to assess the size-independent variability. The integument was removed from the 4th-8th arm segments (counted from the first arm segment that contained a regular vertebra and lateral plates Stöhr, 2005), due to the adult proximal arm plates showing the highest degree of morphological differentiation, reflecting differences between species (Thuy & Stöhr, 2011). The integument removal was performed by submergence in 0.3% sodium hypochlorite solution for 2-8 h, washed with distilled water and 98% ethanol, air-dried, and mounted on aluminum stubs using carbon tape. The samples were then gold-coated and scanned using a Hitachi-SU1510 SEM at the LANABIO facility at the Instituto de Biología, UNAM. In order to organize the data for analyses, the file format of the plate images was imported into TPS file format using tpsUtil v.1.58. Landmarks and curves were digitized using tpsDig2 v.2.17 (Rohlf, 2015). Along the external margin of each DAP, five homologous type II landmarks and sixty-eight evenly-spaced semi-landmarks were digitized ( Fig. 2A). The same procedure was followed for VAP but using four landmarks type II and 66 semilandmarks evenly-spaced (Fig. 2B). Geometric morphometric (GM) analyses followed the outlined by Masonick & Weirauch (2019), using the R package geomorph v.3.2.1 R package (Adams & Otárola-Castillo, 2013). The GM analyses compared four clades and subclades defined by COI sequence data set (Table S2): Clade 1A, Clade 1B, Clade 2A, and Clade 3. Generalized Procrustes analysis was used to extract the shape data for comparison, removal, translation, scaling, and rotation of all selected landmarks. A proxy of size in GM is centroid size which is the square root of the sum of squared distances of an object's landmarks from their centroid or center of gravity. The semi-landmarks digitized for each arm plate were optimized to reduce bending energy using the function ProcD = False in geomorph, thus providing the best fit during optimization. Shape variation was analyzed through principal component analysis (PCA) using the covariance matrix of the individual, and a graphical scatterplot was performed using the two principal components, which accumulated the maximum variance of the data (PC1 vs. PC2; S3 Appendix). Thin-plate spline deformation grids were calculated from the mean shape variance along each PC axis with the shapes v.1.2.5 R package (Dryden, 2018), representing the overall shape modification. To enhance the visualization of data dispersion, two scatterplots in 3D were constructed based on the first three PCs using the R package car v.3.0-6 (Fox, Weisberg & Price, 2018;Masonick & Weirauch, 2019). A Procrustes analysis of variance (ANOVA) was performed on all analyses using the ''procD.lm'' function in geomorph. The resulting pairwise Procrustes distances were compared to assess the significance of differences in mean arm plate shape among the groups. The statistical significance of the observed variation was assessed through a permutation test of the randomized model residuals with 999 iterations at an α-value of 0.05.

Measurement error and allometry-test
To evaluate the impact of measurement error, the selected landmarks were digitized twice on each image (dorsal and ventral arm plates), on different days, by the same observer (Viscosi & Cardini, 2011). The error was calculated as percent measurement error (%ME) by comparing the variation among measurements based on a formula developed by (Bailey & Byrnes, 1990;Yezerinac, Lougheed & Handford, 1992). A Procrustes ANOVA on the residuals of Procrustes distances was used to compare within and among individual shape variance components. The allometry was evaluated by regression of the Procrustes shape coordinates on centroid size using a log 10 scale. Interaction between centroid size and ''clade'' (dorsal or ventral arm plates grouped by clade) factor was estimated with Procrustes ANOVA in geomorph v.3.2.1 (Klingenberg, 2016).

Dorsal arm color pattern analysis
Color patterns were examined from live-taken images associated with 46 sequenced specimens from the UF and COREPY invertebrate image collections (Table S1). Twentyfive color characters were selected from dorsal arm views (Figs. S1, S2, and S3). All characters were treated as discrete and unordered (Appendix S1). Disc color characters were not selected because they showed a great amount of individual variation. Ophiactis savignyi was used as outgroup species (Fig. S4).
The character matrix was edited in Mesquite v.3.51 (Maddison & Maddison, 2018), with inapplicable data scored with ''-'' (Appendix S2). Parsimony analysis was conducted using TNT v.1.5 (Goloboff & Catalano, 2016). Optimal trees were searched using random addition sequences of Wagner trees, followed by the TBR algorithm, using 500 replicates, and saving 10 trees per replicate. The resulting trees were used as starting points for a round of TBR branch swapping. Bootstrap support values for the strict consensus tree were determined through 1,000 iterations, with default settings. Visualization, interpretation, and annotation of the cladogram were performed with FigTree, TNT, and Illustrator, respectively.

Molecular species delimitation
Two sequence-based species delimitation methods were employed using mtDNA data only. Multi-rate Poisson tree processes (mPTP) is a non-coalescent, sequence-based, maximum likelihood method that does not require pre-determined taxonomic designations and uses statistical cutoffs to delineate taxa on a phylogenetic input tree (Zhang et al., 2013;Kapli et al., 2017). It identifies variation in the pace of branching events, modeling speciation based on the number of substitutions. The online version of mPTP (https://mptp.h-its.org/#/tree) was used to calculate species delimitation, using the Bayesian tree as input, mPTP model selected, with nine specimens as outgroup taxa. The resulting trees were visualized using FigTree v.1.4.4.
The Bayeasian program BPP v.4.1.4 (Yang, 2015) was used to infer phylogenetic and phylogeographic patterns under the multispecies coalescent model (MSC) and to calculate potential species' posterior probabilities delimitation (Yang & Rannala, 2010;Yang, 2015). BPP appears to be relatively robust to the influence of unequal population sizes, rates of population growth, unbalanced sampling, and mutation rate heterogeneity (Luo et al., 2018), and has proven effective in analyses of taxon evolution and divergence (Zhang et al., 2013;Moritz et al., 2018). A 151-taxon dataset and the Bayesian tree were used for all BPP analyses, because sequences with missing data were eliminated. The estimation of appropriate starting species divergence times (τ s) and population size parameters (θ s) was initially performed through A00 BPP analysis (Masonick & Weirauch, 2019). This estimation was based on the expected number of mutations per kilobase, as suggested by Yang (2015). The parameters for the MSC model were estimated using BPP v4.1.4 (following the A00 analysis from Yang, 2015). The joint species tree estimation and species delimitation analyses (A11) were carried out with inverse-gamma parameters of θ (5, 0.05) and τ (5, 0.02). The A11 analysis was run for 100,000 generations, sampling every two steps after discarding the initial 10,000 generations as burn-in. The analysis results were confirmed by conducting three independent runs for each analysis. Only lineages with a posterior probability (pp) of ≥ 0.95 were considered well-supported (Masonick & Weirauch, 2019).

Integrative species delimitation
Morphological and mtDNA data were analyzed in a common coalescent Bayesian framework using the program iBPP v.2.1.3 (Solís-Lemus, Knowles & Ané, 2015). This method has been shown to improve species delimitation accuracy by incorporating molecular and quantitative phenotypic data in the assessment of a priori species assignments using a guide tree. iBPP analyses were performed using the Bayesian topology as a guide tree. The same values for demographic parameters θ and τ as in the BPP analysis were used. The total evidence analyses described below utilized two datasets, as iBPP is capable of incorporating morphological data represented by quantitative, continuous traits: (a) multistate character matrix with the 25 arm color-characters as trait data scored through the color pattern analysis, and (b) PC1 + PC2 values for DAP and VAP as trait data obtained through the GM analysis. For both analyses, 46 specimens were used that included sequences and GM data, and the second included the specimens with sequences and the photographic record in vivo. To determine if different data types result in congruent delimitations, the following comparisons were made among data types: sequence data only, coloration data only, GM data only, sequence and coloration data (iBPP Seq+COL ), and sequence and GM data (iBPP Seq+GM ) (Masonick & Weirauch, 2019). Posterior probabilities (pp) at each node were averaged after performing each analysis three times. After a burn-in phase of 10,000 iterations, every second tree was sampled for a total of 100,000 trees. Well-supported delimitations were only considered for nodes on the guide tree that were recovered with pp values of ≥ 0.95 (Masonick & Weirauch, 2019).

Population diversity
Standard measures of genetic diversity (number of haplotypes, haplotype diversity h, and nucleotide diversity π) were calculated using Arlequin v.3.5.2.2 (Excoffier & Lischer, 2010). Unique haplotypes were identified using DnaSP v6.12 (Rozas et al., 2017). Geographical relationships of mtDNA haplotypes were summarized using the TCS algorithm (Clement et al., 2002) in the software PopART v.1.7 (Leigh & Bryant, 2015). To perform the homologous character comparison, missing data were excluded by trimming sequences to 632 bp. For comparison of the extent of divergence with other ophiuroid species, evolutionary distance values were generated in MEGA 11 (Tamura, Stecher & Kumar, 2021) using the Kimura 2-parameter model (Kimura, 1980), support values based on 1,000 bootstraps, including both transitions and transversions, the rate variation among sites was modeled with a gamma distribution (shape parameter = 1), codon positions included were 1st + 2nd + 3rd, and missing data were treated as pairwise deletion (Table 1).

Demographic history
To test for past population expansions, the neutrality Fu's Fs test (Fu, 1997) was implemented in Arlequin v.3.5.2.2, and significance was assessed with 1,000 permutations. In addition, the frequency of the distribution of mismatches was obtained in Arlequin and plotted with the R package ggplot2 (Wickham, 2016) to determine whether the populations exhibit evidence of spatial/demographic expansions or a stationary population history (Tajima, 1989). The Raggedness index and the sum of squared deviations (SSD) obtained in Arlequin were used to analyze the goodness of fit for the population expansion model, according to Harpending (1994).

Clade 1
Clade 1 (n = 76) was encountered only in the Gulf of Mexico and east Florida, including the Veracruz platform reefs in the Southern Gulf of Mexico and Northern Gulf of Mexico  (Fig. S5). Although it overlaps in distribution and depth with clades 2 and 3, the latter are widely distributed across the sampled areas (Fig. 1). Three sympatric subclades can be differentiated within Clade 1 (Figs. 3 and 5), that cooccur in the Northern Gulf of Mexico and the Florida Keys.

Geometric morphometrics
Procrustes ANOVA showed significant differentiation among clades for DAP and VAP (p-value < 0.05). Clades 1A, 1B, and 2A clustered together, while Clade 3 separated in the morpho-space and was significantly different based on pairwise comparisons among clades for both arm plates (Table S3). Corresponding thin-plate spline (TPS) deformation grids for PC1 in DAP analysis illustrate the extremes of variation with the extension/shortening of proximal and lateral edges, while PC2 shows the extension/shortening of the distal edge (Fig. 7). TPS deformation grids in VAP indicate that a considerable portion of the variance in PC1 is attributed to the extension/shortening of both, the proximal and distal edges, as well as and the lateral edges. In contrast, PC2 shows the extension/shortening of proximal and distal edges (Fig. 8)

Dorsal arm color pattern analysis
All but four (characters 16, 19, 22, and 25) of the 25 selected color-characters were parsimony informative, but none showed concordant differences with genetic clade assignment (Fig. S8). The consensus tree length was 170 steps, with CI = 0.201 and RI = 0.479. Specimens from all clades clustered together.

Integrative species delimitation.
iBPP analyses recovered Clades 1A, 1B, 2, and 3 as distinct for both iBPP Seq+GM and iBPP Seq , with high support in all runs. iBPP GM separated Clade 3 with high support, but low support was found between clades 1A, 1B, and 2. iBPP Seq+COL and iBPP COL did not show congruence in runs, so all clade specimens clustered together (Fig. S9).

Population diversity
Haplotype diversity (h) and nucleotide diversity (π) ranged from 0.743 to 1.000 and from 0.007 to 0.048, respectively, among the clades (Table 2). TCS network recovered the same three clades as the phylogenetic analysis of mtDNA. Nucleotide (π) and haplotype (h) diversity values in the TCS network indicated that Clade 3 was more genetically diverse, followed by Clade 1, while Clade 2 exhibited lower nucleotide and haplotype diversity (Table 2).

Demographic history
Results of Fu's Fs tests, Raggedness index, and SSD analysis are provided in Table 2. Fu's neutrality test gave significant negative values for all clades, and subclades 1A and 1B, suggesting past demographic expansions. The Raggedness index and SSD were low and non-significant for all clades and subclades 1A and 1B, respectively, suggesting an unimodal distribution of mismatches as expected for a demographic expansion. The distribution of mismatches for Clades 1 and 3 was bimodal (Fig. S10), which may indicate demographic balance. However, when clades 1A and 1B were analyzed separately, mismatches showed a unimodal distribution for 1A while continued bimodal for 1B (Fig. S11). The mismatch distribution of Clade 2 was unimodal (Fig. S10).

Species delimitation in Ophiothrix angulata
Analysis of COI sequence data revealed three deeply-divergent clades within Ophiothrix angulata in the tropical Western Atlantic, one of which (Clade 3) was also divergent in ITS2 and geometric morphometrics. Species delineation algorithms recovered the same three clades using genetic and combined genetic and morphometric data. The deep level of genetic differentiation with COI among these clades is also consistent with the three lineages representing separate species (Table 1). The consistency between morphological and genetic data on differentiating Clade 3, together with its co-occurrence with clades 1 and 2, demonstrates the lack of gene flow between Clade 3 and the others, and clearly establishes that it is a separate biological species; thus, it can be considered a confirmed candidate species (CCS) (Padial et al., 2010). All three clades co-occur in the Florida Keys, where they were collected on the same day, site, depth, and habitat (UF10247-1A, UF10248-2A, and UF10250-3 in Table S1), which further suggests reproductive isolation for Clade 3.
The status of clades 1 and 2 remain open, as they were separated only by COI sequences, and thus could be species or deep conspecific lineages (DCL) (Padial et al., 2010). While the distribution of these two clades overlaps and they are sympatric at several localities surveyed, Clade 1 is mostly known from around central and North Florida, while Clade 2 was encountered in South Florida, Southern Gulf of Mexico, and the Caribbean. This distribution largely reflects the differentiation of Gatunian and Callosahatchian faunas that have been established since the Miocene (Vermeij, 2005) and suggest allopatric differentiation along this boundary. Additional studies are needed to assess their status. Clustering sequences into three subclades in Clade 1 and two subclades in Clade 2 is more challenging to interpret. In Clade 2, the two subclades are allopatric, with one represented by two specimens from the Lesser Antilles, the other widespread, suggesting geographic differentiation.

Geometric morphometrics and Dorsal arm color pattern
The shape of both the dorsal and ventral arm plates proved to be a useful indicator for distinguishing Clade 3 from Clades 1-2. Geometric morphometrics has been employed in echinoderm studies to investigate different approaches to understanding the biology and classification of the different orders and families. For instance, Martínez-Melo, De Luna & Buitrón-Sánchez (2017) used GM methods to differentiate between genera in the Cassidulidae family based on the cryptic morphology of plate shapes in Echinoidea. De los Palos-Peña et al. (2021), combined scanning electron microscopy and ontogenetic studies of the odontophore in Luidia superba to understand patterns of size and shape variation. Similarly, Swisher (2021) studied ontogeny in fossil clypeasteroids and confirmed size and shape changes in the oral/aboral plates using GM methods. However, our study incorporates the concept of allometry, which considers the effect of size on shape variation due to ontogeny and other ecological factors influencing morphology (Benítez et al., 2013;Klingenberg, 2022).
Closely related species frequently differ in color pattern and color differences are often among the first visible morphological changes that appear among differentiating species (Benavides-Serrato & O'Hara, 2008;Hoareau et al., 2013). Ophiothrix angulata displays high polymorphism in color patterns (Figs. S1, S2, and S3), and color differences have been suggested to potentially reflect cryptic species differentiation in this species (Clark, 1918;Tommasi, 1970). The absence of correlation between color pattern and genetic clade   López-González, 2013). The differentiation among Clades 1, 2, and 3 in COI is thus in line with species-level differences in other ophiuroids.

Haplotype diversity and demographic history
Haplotype networks display contrasting topologies, with Clades 1 and 3 showing higher intra-specific diversity than Clade 2. Clade 1 shows a great diversity of haplotypes in the Northern Gulf of Mexico off Florida, where all three subclades were present. In contrast, specimens from Eastern Florida, the Florida Keys, and the Southern Gulf of Mexico each fell into single and different subclades. This pattern is suggestive of allopatric differentiation along the periphery of the Clade's range, but few samples were available from these areas making interpretation challenging. Historical demography results show evidence of recent expansion (Fu's, r, SSD, and mismatches) for subclade 1A.
Clade 2 showed a star-like network suggesting recent population expansion (Allcock & Strugnell, 2012). Most haplotypes were within 3 m-s of the common one, except for the two specimens from Guadeloupe that were 30 m-s distant. These results suggest that continental populations along North and Central America are isolated from the insular population sampled in the Lesser Antilles, and that the former, at least, underwent a recent expansion, potentially following the Last Glacial Maximum.
Clade 3 showed the highest haplotype diversity and did not display signs of recent expansion in all analyses but showed a high diversity of haplotypes. It is noteworthy that this is the only clade that was represented among deeper water samples, suggesting that it may have the most extensive depth range, and thus potentially greater physiological tolerance. The high gene diversity has resulted in a larger population size than the other clades, suggesting success in exploitation and colonization of habitats. Some haplotypes in this clade were shared between localities separated by more than 500 km.

CONCLUSIONS
This study provides a broad evaluation of the systematics of one of the most common ophiuroids in the Tropical Western Atlantic, Ophiothrix angulata. COI sequence data revealed three deeply divergent genetic lineages. The high color variability exhibited by this group did not correlate with lineages suggesting that it represents intra-specific polymorphism. Clade 3 was separated by mtDNA, nrDNA, molecular species delimitation, the shape of dorsal and ventral arm plates, and the integrative analysis with mtDNA and geometric morphometric data. Therefore, we consider it as a confirmed candidate species. Results demonstrate that a thorough arm morphology analysis can help differentiate clades within this species complex. Molecular analyses and in situ records show that all three clades co-occur in some areas. For Clades 1A and 2, Fu's, r, SSD, and mismatches showed evidence of recent expansion. Additional geographic sampling combined with physiological, reproductive, and ecological data incorporating phylogeographic analysis may further resolve this species complex.