Do cytochrome c oxidase 1 Gene Sequences Differentiate Species of Spirostreptid Millipedes (Diplopoda: Spirostreptida: Spirostreptidae)?

ABSTRACT The aim of this study was to use cytochrome c oxidase 1 (CO1) sequences to recover a phylogeny for seven morphologically described spirostreptid millipede taxa from southern Africa, and to evaluate the correspondence between morphological and molecular phylogenies. Genetic p-distance generally increased with taxonomic divergence: inter-specific mean 15.33 % (14.09 % –17.02 %), inter-generic mean 18.43 % (6.83 %–26.81 %) and inter-order mean 24.16 % (range 18.56 %–30.77 %). Congruent Bayesian, maximum parsimony and neighbour-joining analyses of 520 nucleotides of the CO1 gene resolved the orders Spirostreptida, Julida and Callipodida. Members of genera within the Spirostreptidae (Archispirostreptus, Bicoxidens, Cacuminostreptus, Doratogonus, Orthoporoides, Plagiotaphrus and Spirostreptus) formed a single clade within which a sample of Thyropygus (family Harpagophoridae) was paraphyletically nested. Phylogenetic analyses failed to recover support for the genera Doratogonus, Bicoxidens, Archispirostreptus and Spirostreptus, as representatives of these genera were not monophyletic. Samples morphologically identified as the same species (Bicoxidens flavicollis) were part of two different clades, one of which was well supported and otherwise contained members of Doratogonus. This high level of divergence (mean 12.64 %) between morphologically identified spirostreptid millipede sister species could indicate that changes in genital morphology occur rather slowly relative to CO1 sequence substitution, and may underestimate species diversity.

Although male genitalia are central in spirostreptid millipede taxonomy (see Hoffman 2008;Hamer 2009;Mwabvu et al. 2010) because they are divergent and species specific (Bond et al. 2003), morphology-based classifications may be too inclusive (Adams et al. 2009;Ács et al. 2010) and may hide considerable variation. Molecular data have demon strated in many arthropod groups that molecular divergence may not reflect divergence in morphology, thus resulting in an underestimation of diversity (Brewer et al. 2012). Derkarabetian et al. (2011) also reported a lack of divergence in reproductive morphology in Opiliones, although genetic divergence is significant. Other studies, for example, in Anadenobolus millipedes (Bond et al. 2003) and spiders (Huber et al. 2005), also suggest that genetic and genitalic divergence are decoupled. Hence, groups that were described on the basis of morphological characters are being re-evaluated using appropriately variable DNA sequence data.
The use of DNA sequences to distinguish taxa, identify new species and estimate phylo geny is becoming prominent in invertebrate systematics (Hebert et al. 2004), including arthropods (Brewer et al. 2013). For example, mitochondrial cytochrome c oxidase 1 (CO1) gene sequences have been used to differentiate species in pelagic sea snails (Hunt et al. 2010), crustaceans (Witt et al. 2006), lepidopterans of North America (Hebert et al. 2010), tropical butterflies (Elias et al. 2007;Escalante et al. 2010) and arachnids (Boyer et al. 2005). Furthermore, mitochondrial 16S rRNA gene sequences have also been used to discriminate members of the orders Polydesmida (Marek & Bond 2007), Spirobolida (Bond & Sierwald 2002), Glomerida (Wesener 2012) and Julida (Enghoff et al. 2011(Enghoff et al. , 2013. Moreover, most of the centipede and millipede faunas of Bavaria, southern Germany (120+ species representing all major orders and families) have been subjected to molecular analysis, including barcoding and genetic distance assessments (Spelda et al. 2011). However, inter-generic genetic variations and the thresholds for designating species remain only scarcely touched upon despite, according to Enghoff et al. (2011), molecular data demonstrating that they could resolve inter-generic relationships in millipedes.
DNA sequence data could clarify the validity of species, flag undescribed taxa and aug ment taxonomy based on male genitalia. Given the poor dispersal abilities of the Diplopoda and their preference for moist micro-habitats, which are patchy, they tend to be local endemics, i.e. geographically isolated in small areas (Hopkin & Read 1992). As such, we expect strong inter-generic and inter-specific genetic differentiation among millipedes.
The aim of this study was to carry out a phylogenetic analysis based on mitochondrial CO1 sequences for seven taxa, morphologically determined to be members of the family Spirostreptidae, and outgroups. The objectives were as follows: (1) to assess the CO1 genetic distance thresholds characterising a range of taxonomic levels (intra-specific to inter-order) within the order Spriostreptida, and between the orders Spirostreptida, Julida and Callipodida within the Helminthomorpha; (2) to assess whether morphologybased taxonomic groups within the Spirostreptidae, comprising members of the genera Archispirostreptus, Bicoxidens, Cacuminostreptus, Doratogonus, Orthoporoides, Plagiotaphrus and Spirostreptus, are supported by molecular data; and (3) to assess whether the family Spirostreptidae (represented by the above genera) is monophyletic.
Legs removed from the midbody rings of individual specimens were ground using a mortar and pestle. DNA was then extracted following the manufacturer's instructions using the Qiagen DNeasy Blood and Tissue Kit (Qiagen). Polymerase chain reactions targeting part of the mitochondrial CO1 gene were performed after optimisation using gradient PCR. PCR amplifications were performed in a total volume of 25 µl containing 12.5 µl EconoTaq plus green mastermix, 0.25 µl of each primer, 1 µl of DNA and 11 µl of H 2 0. The thermal cycling conditions were 94°C for 2 min, 35 cycles of 94°C for 30 s, annealing at 50°C for 30 s, extension at 72°C for 1 min and a final extension of 72°C for 10 min. Primers for the amplification were as used by Lavrov et al. (2002) and Bond and Sierwald (2002). As PCR yields were low, amplicons were cloned prior to sequencing on an ABI 3730 capillary sequencer at Inqaba Biotechnical Industries (Pty) Ltd (Hatfield, Pretoria, South Africa).
Raw sequences were edited in BioEdit version 7 (Hall 1999), and aligned using the Clustal W algorithm in BioEdit, and by inspection. The CO1 alignments were trimmed to a common length of 520 nucleotides. A saturation plot of transitions and transversions versus genetic distance, and Xia's test of substitution saturation (Xia et al. 2003;Xia & Lemey 2009) were carried out in DAMBE (Xia & Xie 2001) to evaluate the level of saturation and establish the usefulness of the dataset as an indicator of phylogeny.
Phylogenetic relationships among the taxa were analysed using Bayesian, maximum parsimony and neighbour-joining analyses. Modeltest (Posada & Crandall 1998) was used to select the model of nucleotide substitution which best fitted the sequence dataset. The general time-reversible (GTR) model was selected using the Akaike Information Criterion (AIC), and was used in Bayesian, neighbour-joining and data-saturation analyses. Neighbour-joining and maximum parsimony analyses were implemented in PAUP (Swofford 2002). Genetic distances were presented as distance matrices and as a neighbour-joining tree, which was bootstrapped using 1 000 pseudo-replicates. For parsimony analysis, the random additions sequence option (n = 100) for discrete, unordered characters was used. The shortest tree was obtained using the heuristic search option under the tree bisection-reconnection (TBR) branch-swapping option. The degree of support for each node of the resulting tree was estimated using bootstrap re-sampling analysis (1 000 pseudo-replicates; Felsenstein 1985). Bayesian analysis was implemented in Mr Bayes 3.0 (Huelsenbeck & Ronquist 2001) using flat priors. For all analyses, four Markov chains were run for 5 million generations each, until the standard deviation of the split frequencies was less than 0.01. The first 500 000 trees were discarded as burn-in. A preliminary run was carried out in order to determine that the burn-in value was sufficiently high to discard all less likely trees generated before the likelihoods had stabilised. Using the methods described above, we also created a tree based on amino-acid sequences, to see if deep node support was greater than in the tree based on nucleotide sequences.
Nucleotide sequence data were submitted to GenBank. The GenBank accession numbers are KF219542 (A.  (Table 1). Outgroup sequences of taxa belonging to the Orders Callipodida, Julida, and Spirostreptida (Harpagophoridae: Thyropygus sp.) were obtained from the National Center for Biotechnology Information (NCBI) GenBank database.

RESULTS
The results of Xia's test of substitution saturation (Xia et al. 2003;Xia & Lemey 2009) showed that ISS (0.3219) was significantly less than ISSC (0.7166) (p = 0.0000), consistent with little saturation in the data and indicating that it is useful for phylogenetics. The saturation plot (Fig. 2) showed no saturation in transversions and some saturation in transitions at GTR genetic distances greater than 17 %.
In order to enable comparisons to be made with genetic distances reported in the literature, we report genetic distances among different taxonomic levels as p-distances. Genetic p-distances within the Spirostredptidae were: inter-specific mean 15.33 % (range 14.09 % -17.02 %); and inter-generic mean 18.43 % (6.83 % -26.81 %) ( Table 2, Fig. 1). The mean genetic distance between the orders Spirostreptida, Julida and Calli- Bayesian inference (BI), maximum parsimony (MP) and neighbour-joining (NJ) analyses of the CO1 nucleotide alignment produced congruent trees. These trees were also congruent with the trees based on amino acid sequences, which are not presented here. The Bayesian tree is presented, with bootstrap values from all three analysis methods indicated at the nodes, in the order (BI posterior probability/ MP bootstrap % /NJ bootstrap %) (Fig. 3). Members of the three orders formed monophyletic clades; the Spirostreptida (Clade A; 1.00 / 80 % / 89 %) and Julida (Clade B; 1.00 / 88 % / 92 %) were strongly supported, and the Callipodida relatively weakly supported (Clade C; 0.96 / 81 % /52 %). Within clade A, duplicate representatives of D. cristulatus, A. tumuliporus and P. sulcifer formed very strongly supported clades (1.00 / 100 % / 100 %). A single member of the Harpagophoridae, a Thyropygus species, was nested within the strongly supported clade (A) containing all representatives of the Spirostreptidae and showed a very weakly supported (0.85 / 51 % / 68 %) relationship with Archispirostreptus gigas. Members of the genus Archispirostreptus (tumuliporus and gigas) did not form a monophyletic clade, as would be expected. Although all members of Doratogonus (cristulatus, uncinatus and sp. 1) were part of the same well-supported clade (A1), a representative of Bicoxidens flavicollis was also nested within this clade. Although there  was very weak support (0.61 / -/ 60 %) for an association between Bicoxidens aridis and a representative of B. flavicollis, not all representatives of this genus were part of the same clade. Further, although members of the genus Spirostreptus (sebae and krugeri) were part of the same unsupported clade, this also included a representative of Cacuminostreptus mazowensis.

DISCUSSION
The overall aim of this study was to use CO1 sequences to recover a phylogeny for seven spirostreptid millipede taxa from southern Africa whose taxonomic position had previously been determined by morphological analyses, and to compare the relationships revealed by morphological and molecular methods. The CO1 gene is generally regarded as sufficiently divergent to allow identification and discovery of new species in a variety . Numbers adjacent to taxon names are GenBank accession numbers, and indicate sequences that were downloaded from the NCBI Genbank. This tree was congruent in structure with maximum parsimony and neighbour-joining analyses of the same dataset. Nodal support values are indicated as (posterior probability / maximum parsimony bootstrap / neighbour-joining bootstrap).
of taxa, including beetles (Funk 1999), neotropical skipper butterflies (Hebert et al. 2004) and amphibians (Vences et al. 2005). As there is little phylogenetic information available for spirostreptid millipedes, we aimed to evaluate the threshold genetic distances distinguishing different taxonomic levels. We found that genetic distance between taxa generally increased with taxonomic divergence, as would be expected based on a molecular clock hypothesis (Zuckerkandl & Pauling 1962). The mean genetic distance between the orders Spirostreptida, Julida and Callipodida was 24.16 % (range 18.56 %-30.77 %). Taxa belonging to the same order clustered together in the CO1 tree, providing support for the validity of these orders according to the phylogenetic species concept (Cracraft 1987). The mean inter-specific distances recovered were very high at 15.33 % (14.09 %-17.02 %) relative to those observed in other types of organisms. For example, 74 % of avian sister species are separated by CO1 distances of less than 2.7 % (Moritz & Cicero 2004), and 99 % of all marine nematode species are separated by >5 % (Derycke et al. 2010). This high level of divergence between morphologically identified spirostreptid millipede species could indicate that changes in genital morphology occur rather slowly relative to CO1 sequence substitution, and may underestimate species diversity. Phylogenetic analyses failed to recover support for the genera Archispirostreptus, Bicoxidens, Doratogonus and Spirostreptus. Representatives of Archispirostreptus and Bicoxidens respectively did not form monophyletic clades, as would be expected of a genus under the phylogenetic species concept (Cracraft 1987). Although members of the genera Doratogonus and Spirostreptus were all included in the same clade, both the Doratogonus and Spirostreptus clades included a member of another genus, and were not therefore monophyletic for members of the genus. Further, samples identified as the same species (Bicoxidens flavicollis) were part of two different clades, one of which was well supported and otherwise contained members of Doratogonus.
Based on the selected genera (Archispirostreptus, Bicoxidens, Cacuminostreptus, Doratogonus, Orthoporoides, Plagiotaphrus, Spirostreptus), the CO1 sequences illustrated that representatives of the family Spirostreptidae are all members of the same clade, although the presence within this clade of a sample of Thyropygus, sourced by a Blast search of the Genbank for sequences most similar to the Spirostreptidae study samples, is puzzling. Our phylogenetic analyses yielded a weakly supported sister group relationship between Thyropygus sp. and Archispirostreptus gigas, although genital morphologybased taxonomy places the two genera in different families, Harpagophoridae and Spirostreptidae, respectively.
At this stage, it is premature to question relationships based on genital morphology because of poor taxon sampling and, according to Sole-Cava and Wörheide (2007), possible identification errors in the Genbank database, because sequences are not from holotypes. Archispirostreptus gigas and A. tumuliporus do not form a subclade within the Spirostreptidae clade, even though based on gonopod morphology they are congeneric (Mwabvu et al. 2010); this was unexpected. Whether A. gigas and A. tumuliporus belong in different genera requires further scrutiny; possibly these taxonomic assignments are incorrect. At present it is unclear whether genital morphologic variation is correlated with genetic change in millipedes. Hence, there is a need to increase taxon sampling and use more than one gene sequence to assess intra-generic genetic divergence. C values for Diplopoda range from 0.31 pg to 1.4 pg (www.genomesize.com, Animal Genome Size Database, accessed 29 January 2015). Based on the mean, the fraction of the genome represented by the CO1 alignment analysed in this study is 5.6 x 10 -5 %. Based on this, it would appear advisable to employ a multi-gene system in molecular taxonomy, although this would not significantly increase the proportion of the genome analysed. In addition, Meier et al. (2006) observed that 21 % of dipteran species that they studied did not have unique CO1 sequences, which resulted in low identification success. Thus, in order to establish the degree of genetic divergence in spirostreptid millipedes, wider taxon sampling and the use of more genes to discriminate species are necessary.
Although the genetic distances among species are high, we found little evidence of saturation in the dataset, confirming that these data are suitable for conducting phylogenetic analyses. It is possible that some of the paraphyletic associations observed may be due to long branch attraction artifacts; however, if this were the case, we would expect a discordance in the structure of trees created using the parsimony method, which is most prone to grouping long branches together, and using Bayesian and neighbour-joining analyses. A lack of discordance when different analytical methods are used can be taken as evidence that all are inferring the same true tree (Bergsten 2005). Although it has been suggested that the CO1 locus lacks a phylogenetic signal at deep nodes in millipedes (Brewer et al. 2013), we obtained good support for the deep nodes (representing different orders) in our analyses. Our analyses were based on amino-acid sequences, which are more conservative than nucleotide sequences owing to the redundancy in the genetic code, and they revealed no increased nodal support at any level, and produced trees with congruent structures in which the same nodes were supported.
In summary, we found that genetic distance between taxa generally increased with taxonomic divergence. We recovered varying levels of concordance between morphological and CO1 sequence data. The orders Spirostreptida, Julida and Callipodida were completely resolved. Members of the family Spirostreptidae were all part of the same clade, but were not monophyletic, as a member of a sister family, the Harpagophoridae, was nested within this clade. Most genera within the Spirostreptidae were not supported, and were part of paraphyletic associations. Further, samples identified as the same species (B. flavicollis) were part of two different clades. This demonstrates that in millipedes, genitalia may be reliable indices of taxon validity at higher taxonomic levels (Order), but that the correlation between morphological and molecular (CO1) data is not unequivocal at family, genus or species level. All species-level taxa were characterised by unique CO1 sequences separated by high genetic distances (mean inter-specific distance 14.1 %), probably because genital morphology underestimates true diversity. We further provide CO1 sequence data for seven species of millipedes within the Spirostreptidae.