Extensive gene rearrangements in the mitogenomes of congeneric annelid species and insights on the evolutionary history of the genus Ophryotrocha

Annelids are one the most speciose and ecologically diverse groups of metazoans. Although a significant effort has been recently invested in sequencing genomes of a wide array of metazoans, many orders and families within the phylum Annelida are still represented by a single specimen of a single species. The genus of interstitial annelids Ophryotrocha (Dorvilleidae, Errantia, Annelida) is among these neglected groups, despite its extensive use as model organism in numerous studies on the evolution of life history, physiological and ecological traits. To compensate for the paucity of genomic information in this genus, we here obtained novel complete mitochondrial genomes of six Ophryotrocha species using next generation sequencing. In addition, we investigated the evolution of the reproductive mode in the Ophryotrocha genus using a phylogeny based on two mitochondrial markers (COXI and 16S rDNA) and one nuclear fragment (Histone H3). Surprisingly, gene order was not conserved among the six Ophryotrocha species investigated, and varied greatly as compared to those found in other annelid species within the class Errantia. The mitogenome phylogeny for the six Ophryotrocha species displayed a separation of gonochoric and hermaphroditic species. However, this separation was not observed in the phylogeny based on the COX1, 16S rDNA, and H3 genes. Parsimony and Bayesian ancestral trait reconstruction indicated that gonochorism was the most parsimonious ancestral reproductive mode in Ophryotrocha spp. Our results highlight the remarkably high level of gene order variation among congeneric species, even in annelids. This encourages the need for additional mitogenome sequencing of annelid taxa in order to properly understand its mtDNA evolution, high biodiversity and phylogenetic relationships.


Background
Mitochondrial DNA (hereafter mtDNA) has been invaluable in the field of molecular evolution and phylogenetics, and is still a widely used marker today [1,2]. Maternal inheritance and near absence of recombination have popularised its use in many eukaryotes [3]. Most of metazoan mitochondrial genomes are circular molecules that typically include 13 protein coding genes (PCG), two ribosomal RNA genes, 22 transfer RNA genes and a control region ( [4], but see [5,6]). The mitochondrial gene content is almost invariant among species, but the gene order has been found to vary considerably across Metazoans (such as flatworms, molluscs and tunicates [7]), generating interest in using mitochondrial DNA (mtDNA) gene order for phylogenetic inference [2]. The advent of next generation sequencing has made it easier to obtain mitochondrial genomes even for classical nonmodel organisms. This enables the detection of gene rearrangements, as well as phylogenetic relationships among and within diverse phyla [8]. Gene order is known to vary extensively within the phyla Mollusca [9], Arthropoda [10] and Annelida [11,12].
Although over two thousand metazoan genomes have been sequenced to date [13], entire phyla such as Annelida have been widely neglected, with many orders or families often represented by a single specimen of a single species [14]. With more than 17,000 species described, annelids are among the most speciose and ecologically important groups of metazoans [15][16][17]. The extraordinary morphological and ecological diversity of annelids is only comparable to that of crustaceans and molluscs [18]. Furthermore, their exceptional plasticity and adaptability have enabled them to colonise all domains, from marine and freshwater to terrestrial habitats, evolving a wide variety of life history strategies, reproductive modes and feeding habits [19]. Marine annelids have been extensively used as model organisms for the investigation of central questions in ecotoxicology, ecology, physiology, development and evolution, owing to the fact that they are relatively easy to maintain and rear under laboratory conditions (reviewed in [20][21][22]). For example, the bristle worm Capitella teleta Blake, Grassle & Eckelbarger, 2009 (former Capitella sp. I) and the Dumeril's clam worm Platynereis dumerilii (Audouin & Milne Edwards, 1833) have been primary biological systems for developmental, evolutionary and neurobiological studies [23][24][25], and are both part of a long list of annelids used as indicators for biomonitoring and eco-toxicology tests (reviewed in [26,27]).
For similar reasons, the interstitial worms in the genus Ophryotrocha Claparède & Mecznikow, 1869 have been widely used to investigate the eco-evolution of functional traits (e.g. [28,29]) and reproductive strategies (e.g. [30][31][32]), and more recently, to investigate species' ability to tolerate, respond and adapt to global changes [33][34][35][36][37][38]. Although this group of annelids has been the focus of several descriptive and experimental studies over several decades (Additional file 1), our understanding of their genomics is scarce when compared to other annelids and marine invertebrate genera, thus limiting its potential as model system [22]. As new areas in the sea are explored, new Ophryotrocha species are regularly discovered and described [39][40][41][42][43][44][45][46], and with them the need to clarify the phylogenetic relationships within this genus. For example, recent molecular evidence is suggesting the presence of complexes or lineages in species that were originally considered as independent taxonomic units [42,45]. Moreover, based on mitochondrial fragments of 16S and cytochrome oxidase I (COXI) genes, and nuclear H3 genes, a clear separation between gonochoric and hermaphroditic species has been proposed by multiple studies, with the hermaphroditism considered as the plesiomophic state for this genus [40,41]. Providing additional genomic information on Ophryotrocha would certainly increase its usefulness as an emerging interdisciplinary model.
In order to improve our knowledge on the evolutionary history of the genus Ophryotrocha, we: (i) characterised for the first time the complete mitochondrial genomes of six compared the gene orders of the six species with those described for the main annelids' taxonomic groups, (iii) used these novel mitogenomes to investigate the phylogenetic relationships among the six corresponding Ophryotrocha species and, finally, (iv) updating the Ophyrotrocha phylogeny to portray how reproductive mode may be linked to evolutionary history in this genus.

Sequencing
The number of pair-end reads obtained varied from 4, 328,646 and 12,012,958 million (Table 1). We succeeded Several overlaps and small noncoding regions between tRNA and genes, or between two adjacent genes, were observed in all species and showed sizes that ranged from a few to hundreds of bases (Additional files 2, 3, 4, 5, 6, and 7).    8, 9, 10, 11, 12 and 13). In the remaining Ophryotrocha species, Valine and Glutamate were not found, while Aspartate was missing only in O. japonica. In O. labronica, tRNA-Pro was not completely recovered.

Gene rearrangement
The gene order varied widely among Ophryotrocha species (Fig. 2). The higher level of similarity between gene order was found between O. robusta and O puerilis,    Magelonidae was closest to the putative bilateral pattern differing by several reversals of NAD3, NAD4L, NAD4, NAD5, NAD6, CYTB, ribosomal genes and NAD1. It also differed from the Lophotrochozoa pattern by several transpositions of COX3, NAD6 and CYTB.

Phylogeny
Mitochondrial phylogeny of the six studied Ophryotrocha species The concatenation of the amino-acid sequences resulted in a fragment of 4098 residues. After removing the poorly aligned positions, 2242 residues were kept for phylogenetic analyses. The concatenation of the nucleotide sequences resulted in a fragment of 12,297 bp. After removing the poorly aligned positions, 6726 bp were kept for phylogenetic analyses. Bayesian and maximum likelihood amino-acid and nucleotide phylogenies were mostly congruent (Fig. 3)  Both Bayesian inference (BI) and Maximum likelihood (ML) trees were mostly congruent and two major clades were found, i.e. clade 1 and 2, which displayed high support (aLTR = 100, pp. = 0.99) (Fig. 4 and Fig. 5 In the phylogeny of the Ophryotrocha genus, we identified 25 gonochoric species, seven protandrous hermaphrodite species and 11 simultaneous hermaphrodite species. All ancestral reproductive mode analyses suggest gonochorism as the reproductive mode for the last common ancestor of Ophryotrocha spp. (Figs. 4, 5, additional file 17). However, for the protandrous group containing all the Ophryotrocha puerilis species and the gonochoric species Ophryotrocha eutrophila, the support of the ancestral state was low (pp < 0.7), suggesting that the reproductive mode for the ancestor is not fully supported.

Discussion
Our work is the first to report gene rearrangements involving protein coding genes (PCG) among annelid species belonging to the same genus. Also, in contrast to other known annelid mitochondrial genomes, we show that Ophryotrocha diadema possesses the ribosomal genes encoded on the minus strand. Finally, the   Values represent the aLTR support and the posterior probabilities, respectively. Only posterior probabilities greater than 0.7 are shown. * represents the species we used in this study. Colors of the branch represent the reproductive mode for all the species and their common ancestors: light blue (protandrous hermaphrodite), dark blue (simultaneous hermaphrodite), pink (gonochoric) and grey for unknown mode ancestral state reconstruction of the reproductive mode shows that gonochorism is the plesiomorphic condition in the Ophryotrocha genus, not hermaphroditism as previously hypothesized. Below we discuss in detail these major findings, and their implications for understanding the molecular evolution of the mitochondrial genome within the genus Ophryotrocha.

Genome organisation and features of the six Ophryotrocha species
We provide here the complete mitochondrial genome sequences for six Ophryotrocha species. The features of these mitogenomes did not differ from those reported from other annelid species. Their AT-content, AT and GC-skew, the length of the ribosomal genes and the initiative codons of the PCG are congruent with those reported for other annelids [18,[47][48][49][50][51][52][53][54] and metazoans [55]. GC content and GC-skew have an influence on the usage of codon and proteins [50,56,57]. According to their base compositions, codons ending with C or G are avoided in all Ophryotrocha species, which is a trend observed for other annelid species [52].
Surprisingly, the gene order of the mitochondrial genome was not conserved as the positions of PCG and tRNA differed among the six Ophryotrocha species. Mitochondrial gene order tends to be highly conserved among metazoans [58] except in some groups such as molluscs [59,60] and c [61]. Most frequent changes in gene order involve tRNA, while rearrangements between rRNAs or PCG are rarer [9]. Interestingly, we found rearrangements involving both tRNAs, PCG and ribosomal genes within congeneric species. Gene rearrangements have already been observed in other annelid families. Specimens from the Diurodrilidae [62], Chaetopteridae [12] and Syllidae [11] have shown different gene orders. However, previous findings suggest that these rearrangements are confined to basal lineages and should not be found in more recent lineages such as the Errantia and Sedentaria [12]. Our results do not provide support for this hypothesis, and are instead in line with findings by Ocerguera-Figueroa et al. [63], who reported gene rearrangement in more recent lineages. Several hypotheses have been proposed to explain these rearrangements: the role of tRNAs as mobile elements within the mitochondrial genome [55], intra-mitochondrial recombination [64], or the influence of the oxidative stress [65]. The latter hypothesis would explain why certain clades with species experiencing greater levels of oxidative stress are more prone to rearrangements compared to clades with species experiencing lower levels of oxidative stress [66]. As a consequence, it remains to be further tested whether annelid species with greater degrees of gene rearrangement are also those species that exhibit higher oxidative stress levels.

Ribosomal genes on the minus strand
In addition, we report the presence of ribosomal genes on the minus strand. Usually, all annelid mitochondrial genes are encoded on the same strand except in the tubeworm Owenia fusiformis, the magelonid Magelona mirabilis (Johnston, 1865), and the ragworm Laeonereis culveri (Webster, 1879), which have one or two tRNA located on the minus strand [12,67]. To explain this, it was hypothesised that, in the last common ancestor of annelids, all the genes were encoded on the same strand by chance wherein a ratchet effect took place, eliminating all the transcriptional elements and preventing the translocation of genes to the other strand [1,68]. Weigert et al. [12] proposed two hypotheses to explain the translocation of genes between strands in annelids. The first hypothesis implies that in the last common ancestor of annelids, a single strand first encoded all genes and then underwent an inversion-transposition of the tRNAs and their transcriptional elements. The second hypothesis implies that the last common ancestor of annelids still had transcription signals on both strands and that these signals were kept in the basal lineages and lost in the more recent ones. However, O. diadema and Laeonereis culveri [67], two species found in most recent lineages, possess genes encoded on both strands, suggesting either (1) that a more recent ancestor still possessed transcript signals on both strands or (2) that tRNAs were transposed with transcript elements.
Indeed, sequencing strategies and/or annotation methodologies can influence the results in terms of length of the mitogenome sequence recovered or in the gene length [69]. Some studies have documented the presence of dissimilarities in the assembly and annotation of mitochondrial genomes that are not always associated with low coverage [69,70]. In addition, not all the species are necessarily sensitive to this problem, which complicates the identification of a uniform approach to obtaining reliable mitochondrial genomes [69]. Since gene order is not affected by the methodology used, our results can be reasonably considered not to be an artefact.
Phylogenetic relationships within the genus Ophryotrocha using mitogenomes and traditional marker  [40,41]. Genomic features differed between these two groups. The hermaphroditic species harboured negative AT-skew and GC-skew values in most of the PCG and ribosomal genes, as reported for other mitogenomes of annelids [50]. Gonochoric species had positive GC-Skew and negative AT-skew for each PCG. Codon usage was also different between the two-sexual modes: to code glutamic acid, the gonochoric species use the GAG codon, whereas the hermaphrodites use the GAA. Similar differences in codon usage remain to be confirmed in additional hermaphroditic and gonochoric Ophryotrocha species.
Based on the phylogeny of the Ophryotrocha genus we obtained, two main clades were identified: clade 1 that contains most of gonochoric species and clade 2 mostly composed of hermaphrodite species, but also including species with gonochoric strategies. Lending a closer look within the clade, reveal that a separation between reproductive modes was not as clear as it was reported in previous phylogenies including only a few species [40]. In addition, several groups were identified as O. labronica, this suggesting the existence of issues related to taxonomic identification. In contrast to Dahlgren et al. [40] and Heggoy et al. [41], we observed another clade within the hermaphrodite group. The separation of this clade was not linked to the reproductive mode of the species, although most of them are lacking information on their reproductive mode. Wilkund et al. [42] suggested a separation according to their habitat. As we do not possess this information for all species, it was not possible to comprehensively test for this hypothesis.
In addition, it is worth noticing the presence of different lineages with the same name in our phylogenies. This observation confirm the presence of cryptic species within groups that were originally described as independent species, as hypothesized for O. labronica lineages and for the O. puerilis complex, this confirming what reported in recent studies (e.g. [42,43,45]). In some cases, some gene sequences with different names corresponded to the same species, as for O. obscura and O. sanya (previously O. vellae by Paxton et al. [71]). In certain cases, the wrong taxonomic identification of the species may have occurred, prompting the incorrect association to a given reproductive mode, such as for sequences labeled as O. diadema #3 and O. permanni #2. Finally, we confirmed that Iphitime species do cluster in the Ophryotrocha genus [39,41,43,45,72,73]. However, in our phylogeny, the Iphitime species are closely related to O. adherens, O. puerilis and O. socialis, whereas in the phylogeny from Heggoy et al. [40] the Iphitime specimen was closer to O. gracilis and O. hartmanni. Contrary to Taboada et al. [45], we found O. clava closely related to O. jiaolongi and the Iphitime species. This result was also observed by Zhang et al. [39].
As reported in previous studies, three different reproductive modes are present in the Ophryotrocha genus, i.e. gonochorism, simultaneous hermaphroditism and proterandrous hermaphroditism: the latter found in only one species, O. puerilis [31]. Contrary to other phyla, such as Mollusca and Arthrpoda, the Annelida phylum is known for its high diversity in reproductive modes, even within families [74]. The relative simplicity of their reproductive system and the relaxed morphophysiological constraints to the evolution of alternative reproductive strategies seem to have favoured this remarkable variation of reproductive modes even among congeneric species [75]. Our results suggest that transitions between reproductive modes, i.e. gonochorism to hermaphroditism and vice versa, seem to have occurred multiple times along the evolutionary history of this genus. Indeed, both gonochoric and hermaphroditic species of the Ophryotrocha genus show some degree of sexual lability in the population that can potentially favour the expression of alternative reproductive strategies [76]. In particular, the presence of sexual phenotypes in the gonochoric species (i.e. pure male, male with a few oocytes, pure female, and female with a few sperm) is considered a vestigial trait of an ancestral hermaphroditic state [77]. Interestingly, our study shows that the ancestral reproductive strategy of this genus is most likely gonochorism, not hermaphroditism as previously reported [40,41,78]. Moreover, while clade 1 is composed of only gonochoristic species, clade 2 contained all modes of reproduction. These findings are in line with a recent study suggesting that transitions from gonochorism to hermaphroditism are more common than the reverse in many animal taxa. Factors promoting hermaphroditism in gonochoristic animals, such as low densities and inbreeding depression, are in fact more widespread or create stronger selection pressures than the conditions promoting gonochorism in hermaphroditic animals, such as high density and reproductive assurance [79].
The evolution of hermaphroditic strategies in this genus has been generally explained as an adaptation to conditions of low density, stabilized by poor mate search efficiency and high costs of searching [32]. In the Mediterranean Sea, the gonochoristic species O. labronica and O. japonica are indeed present in greater densities compared to the other hermaphrodite species [80]. In addition, Prevedelli et al. [31] demonstrated that gonochoristic and hermaphroditic species differed in a number of life-history traits. These differences confer to the former higher fitness and demographic advantages over the latter in conditions of high mate-search efficiency. However, the lack of information on environmental population densities and on life history for the majority of the Ophryotrocha species comprising our phylogeny prevents this hypothesis from being formally tested.
Finally, we cannot completely discard the idea that the ancestral reproductive mode could have been somewhat an "intermediate" one between gonochorism and hermaphroditism, given the documented wide variety of sexual phenotypes found in some species of these genus expressing these two extant forms of reproduction [76,77]. Further research is therefore needed to better understand the phylogenetic relationship among Ophryotrocha species as emerging models for the investigation of evolutionary global change biology [37,38].

Conclusions
The descriptions of unique gene rearrangements within the Ophryotrocha genus are remarkable as they suggest that mitochondrial genomes in this taxonomic group are highly dynamic, signalling that gene rearrangements can occur more rapidly than previously thought. The withingenus PCG rearrangement refutes the idea that the gene order is conserved among the Errantia, although further studies are required to determine the mechanisms involved. The use of next generation sequencing techniques on Ophryotrocha has revealed the significant potential of these species as model organisms for studying evolutionary history within this genus. Moreover, this study displays the remarkable level of genetic diversification in annelids found even among closely related species. It highlights the need to increase the taxonomic representation in future phylogenetic studies for a more accurate understanding of this phylum's diversity. Finally, developing an in-depth genomic understanding of the Ophryotrocha genus will help further the investigation of both evolution of life-history traits and the emerging field of evolutionary global change biology [37,38,81,82].  [80]. For each species, three samples (each from a single breeding pair) of 30 individuals were used for mtDNA extraction. Genomic extractions were performed using QIAGEN's DNeasy Blood and Tissue Kit with RNAse according to the manufacturer's protocol. Quantification of DNA was done with Quant-iT™ PicoGreen® dsDNA Assay Kit (Invitrogen™, Canada following the manufacter's protocol. Genomic DNA (500 ng) was mechanically fragmented for 40 s. using a covaris M220 (Covaris, Woburn MA, USA) with default settings. Fragmented DNA was transferred to PCR tubes and library synthesis was performed with the NEB Next Ultra II (New England Biolabs) according to manufacturer's instructions. TruSeq HT adapters (Illumina, SanDiego, CA, USA) were used to barcode the samples. The libraries were quantified and pooled using an equimolar ratio and sequenced on an Illumina MiSeq 300 base-pair (bp) pairedend run (600 cycle, v3 kit) at the Plateforme d'Analyses Génomiques of the Institut de Biologie Intégrative et des Systèmes (Laval University, Quebec, Canada).

Assembly and annotations
The quality of the sequencing was assessed with Fastqc [83], and adapters were removed with Trimmomatic [84] available in usegalaxy.org [85]. Default parameters were used to retrieve mitochondrial genomes using the perl script Novoplasty2.7.2 [86]. Briefly, based on a seedand-extend algorithm, mitochondrial genome is retrieved from the whole genome sequencing data, using a related or distant single seed sequence [87]. For each species, we used a fragment of COX1 as seed: JQ310756. As no mitochondrial genome from close relatives of each species was available, no references were used for the assembly. Annotation was performed with MITOS2 Web Server [87], verified with ORF finder [88] and ARWEN [89], and visualized using GeSeq [90]. Determination of the A + T content of protein-coding genes, tRNA genes, rRNA genes and the RSCU was performed with DAMBE 6 [91].

Gene rearrangement
CREx [92] was used to examine the possible scenarios of rearrangement between pairs of complete genomes. Briefly, this method compares two genomes and determines the most parsimonious scenario that has led to the observed rearrangement accounting for duplication reversals, transpositions or other events. As the tRNA order is not conserved among Annelida species [12], we kept only the PCG order to infer the possible scenarios of rearrangement. We examined all the rearrangement scenarios among Ophryotrocha species and other gene orders known for annelids obtained from Lavrov and Lang [93], Mwinyi et al. [54] and Weigert et al. [12]. Results were visualized with the R-package genoPlotR [94].

Phylogeny reconstruction
Phylogenies were assessed using maximum likelihood and Bayesian inference. First, PCG from whole mitochondrial sequences were used to reconstruct phylogenies for the six Ophryotrocha species considered. Each PCG was separately aligned and then concatenated using seaview4 [95]. Poorly aligned regions were removed using Gblocks [96] with default parameters. This step was performed on both nucleotide and amino-acid sequences of PCG. Two species were used as outgroup in the phylogenies: Marphysa sanguinea (NC_023124.1) and O. fusiformis (NC_ 028712.1). Secondly, in order to investigate the relationship among Ophryotrocha species, we built a phylogeny based on two mitochondrial fragments, the COI gene and the 16S, and a nuclear fragment from Histone 3 (H3). First, we retrieved the Histone 3 fragment for each Ophryotrocha species used in this study using a BLASTn search available in Galaxy [85] with a H3 fragment from a close relative available in Genbank. Sequences of H3 have been deposited in Genbank (MT733538-MT733543). In order to determine if the three genes could be concatenated for phylogenetic analyses, we tested the congruence among the three distance matrices genes using the congruence among distance matrices approach (CADM) developed by [97,98] available in ape Rpackage [99]. Briefly, the CADM tests for the presence of incongruency among all the distance matrices. The significance of the test was performed with a 1000 permutations. All three distance matrices were obtained in MEGA5 [100]. No incongruence was detected among the matrices (W = 0.65, p = 0.001) and all the sequences were concatenated. A total of 71 sequences representing 46 species from Genbank (listed in Additional file 16) and the six-species used in this study were aligned for each gene with MAFFT version 7 [101] available at https://mafft.cbrc.jp/alignment/software . All the COXI aligned sequences were trimmed to 400 bp length and all the H3 sequences were trimmed to 298 bp length. All three genes were concatenated. We used a combination of close and distant species as outgroups: Eunice pennata, Eunice norvegica, Protodorvillea gracilis, Dorvillea erucaeformis. For each phylogeny, the most adapted evolutionary model was determined based on Bayesian information criterion as implemented in W-IQ TREE [102]. Maximum likelihood (ML) trees were subsequently generated using IQ-Tree [103] and branch support estimated using the Shimodaira-Hasegawa approximate likelihood ratio test (aLTR), as described in Anisimova and Gascuel [104]. For each dataset, Bayesian inferences (BI) were performed on two runs until convergence was reached under the appropriate evolutionary model. Tree sampling was done every 1000 generations with a burn-in of 25%. Bayesian reconstructions were performed using MrBayes 3.2.6 [105] available on the CIPRES gateway [106]. The posterior probabilities (pp) were obtained for the 50% majority-rule consensus tree. Strong support of branches was considered when pp. ≥ 0.95, whereas intermediate support was defined with pp. values between 0.85 and 0.94.

Ancestral state reconstruction of reproductive trait in Ophryotrocha
Information about the reproductive mode of each species (gonochoric, simultaneous hermaphrodite and protandrous hermaphrodite) was retrieved from the literature (Additional file 16) and all species coded accordingly. When no information on the reproductive mode was found, species were marked as unknown. Parsimony ancestral trait reconstruction was performed on both ML and BI phylogeny in order to find the ancestral states that minimize the number of changes according to the phylogeny. Analysis was conducted in Mesquite v3.4 [107].
In addition to the maximum parsimony reconstruction, we also used a method based on Bayesian MCMC sampling methods to reconstruct the ancestral reproductive mode of the genus Ophryotrocha as implemented in BEAST2 [108]. In particular, we used the Bayesian phylogeny previously obtained and estimate the posterior probability of the state for each ancestor for each node of the tree. We conducted 10,000,000 iterations and sampled parameters every 10,000 generations. The posterior distribution was first verified with Tracer1.7 [109]. We discarded the first 25% samples of states obtained as burn-in as implemented in TreeAnnotaor from the BEAST2 package.