Six complete mitochondrial genomes of mayflies from three genera of Ephemerellidae (Insecta: Ephemeroptera) with inversion and translocation of trnI rearrangement and their phylogenetic relationships

As a small order of Pterygota (Insecta), Ephemeroptera has almost 3,500 species around the world. Ephemerellidae is a widely distributed common group of Ephemeroptera. However, the relationship among Ephemerellidae, Vietnamellidae and Teloganellidae is still in dispute. In this study, we sequenced six complete mitogenomes of three genera from Ephemerellidae (Insecta: Ephemeroptera): Ephemerella sp. Yunnan-2018, Serratella zapekinae, Serratella sp. Yunnan-2018, Serratella sp. Liaoning-2019, Torleya grandipennis and T. tumiforceps. These mitogenomes were employed to reveal controversial phylogenetic relationships among the Ephemeroptera, with emphasis on the phylogenetic relationships among Ephemerellidae. The lengths of the six mayfly mitogenomes ranged from 15,134 bp to 15,703 bp. Four mitogenomes of Ephemerella sp. Yunnan-2018, Serratella zapekinae, Serratella sp. Yunnan-2018 and Serratella sp. Liaoning-2019 had 22 tRNAs including an inversion and translocation of trnI. By contrast, the mitogenomes of T. tumiforceps and T. grandipennis had 24 tRNAs due to an extra two copies of inversion and translocation of trnI. Within the family Ephemerellidae, disparate gene rearrangement occurred in the mitogenomes of different genera: one copy of inversion and translocation trnI in the genera Ephemerella and Serratella, and three repeat copies of inversion and translocation of trnI in the genus Torleya. A large non-coding region (≥200 bp) between trnS1 (AGN) and trnE was detected in T. grandipennis and T. tumiforceps. Among the phylogenetic relationship of the Ephemeroptera, the monophyly of almost all families except Siphlonuridae was supported by BI and ML analyses. The phylogenetic results indicated that Ephemerellidae was the sister clade to Vietnamellidae whereas Teloganellidae was not a sister clade of Ephemerellidae and Vietnamellidae.


INTRODUCTION
Ephemeroptera has about 3,500 species around the world, which is a small order of Pterygota (Salles et al., 2018). Having significant roles among freshwater fauna, they are found in many kinds of microhabitats and function in various trophic roles (Salles et al., 2018;Jacobus, Macadam & Sartori, 2019). Considerable effort has been devoted to discovering the phylogenetic relationships among the Ephemeroptera families based on morphology (McCafferty & Edmunds, 1979;McCafferty, 1991;Kluge, 2004), molecular evidence (Ogden & Whiting, 2005), and combined data (Ogden et al., 2009). Despite this, the higher-level phylogeny of mayflies is still a controversial issue (McCafferty & Edmunds Jr, 1979;McCafferty, 1991;Kluge, 2004;Ogden & Whiting, 2005;Ogden et al., 2009), particularly the phylogenetic relationships within the Ephemerellidae, Teloganellidae and Vietnamellidae. Data on morphological characteristics supported a close phylogenetic relationship between Teloganellidae and Vietnamellidae, with the two families forming a sister group to Ephemerellidae (McCafferty, 1991;Kluge, 2004). By contrast, Ephemerellidae was supported as a monophyletic group via phylogenetic trees based on sequences of 12S, 16S, 18S, 28S and H3 genes, but the relationships with Teloganellidae and Vietnamellidae remained problematic (Ogden et al., 2009). In addition, Ephemerellidae was supported as the sister clade to Vietnamellidae via phylogenetic analyses based on mitogenomes of Ephemeroptera, whereas Teloganellidae was not a sister clade of Ephemerellidae and Vietnamellidae (Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Cao et al., 2020;Xu et al., 2020). Hence, the systematics of Ephemerellidae needed to be clarified by further studies.
The typical mitochondrial genome (mitogenome) of an insect is a circular molecule of 14-20 kb in length including 37 genes (two ribosomal RNA genes, 13 protein-coding genes and 22 transfer RNA genes) along with the control region (CR) (Boore, 1999;Cameron, 2014a). Because of their features including fast evolution rates, small genome sizes, low sequence recombination and maternal inheritance, mitogenomes have been used extensively as molecular markers for phylogenetic analyses and comparative or evolutionary genomic research (Boore, 2006;Cameron, 2014a;Ma et al., 2015;Cheng et al., 2016). Currently, 29 complete (or nearly complete) mitogenomes of Ephemeroptera from thirteen families have been released on NCBI (Zhang et al., 2008;Li, Qin & Zhou, 2014;Tang et al., 2014;Zhou et al., 2016;Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Song et al., 2019;Cao et al., 2020;Xu et al., 2020), but there was only one partial mitogenome of Ephemerellidae. These small numbers of mitogenomes have restricted our understanding of the phylogenetic relationships and biogeography of the Ephemerellidae.
In order to discuss the characteristics of trnI inversion in the Family Ephemerellidae and explore the phylogenetic relationships of Ephemerellidae, we sequenced six complete mitogenomes belonging to three genera of Ephemerellidae. The compositional and structural features of these six mitogenomes are described and gene rearrangements were analyzed to explain the mechanism of mitochondrial gene rearrangements.

Sampling collection and DNA extraction
The voucher specimens of Ephemerella sp. Yunnan-2018, S. zapekinae, Serratella sp. Yunnan-2018, Serratella sp. Liaoning-2019, T. grandipennis and T. tumiforceps were separately captured from Ning'er Yunnan province, Xiuyan Liaoning province, Ning'er Yunnan province, Kuandian Liaoning province, Longquan Zhejiang province and Tonglu Zhejiang province, China, respectively. After morphological identification, the specimens were deposited at −40 • C in the Animal Specimen Museum, College of Life Sciences and Chemistry, Zhejiang Normal University, China. Total genomic DNA was extracted from tissues of each complete specimen by Ezup Column Animal Genomic DNA Purification Kit (Sangon Biotech Company, Shanghai, China). The Animal Research Ethics Committees of Zhejiang Normal University approved the experimental design.

PCR amplification and sequencing
Several partial fragments were amplified using common primers (Table S1), as described in Simon et al. (1994), Simon et al. (2006), Zhang et al. (2008) and Zhang et al. (2018b). Subsequently, we designed species-specific primers (Table S1) according to sequences previously obtained from universal primers using Primer Premier 5.0 (Lalitha, 2000). Both PCR (product length < 3,000 bp) and Long-PCR (product length >3,000 bp) methods were used separately with Takara Taq and Takara LA Taq DNA polymerase in a reaction volume of 50 µL. The amplifications for both normal PCR and Long-PCR were performed under the conditions as described in Zhang et al. (2018b). All the products were then sequenced bidirectionally using the primer-walking method and AB13730XL by Sangon Biotech Company (Shanghai, China).

Phylogenetic methods
Twenty-four formerly published mayfly mitogenomes along with the newly determined sequences (Table 1) were used in the phylogenetic analyses to discuss the relationships of Ephemeroptera (Zhang et al., 2008;Li, Qin & Zhou, 2014;Tang et al., 2014;Zhou et al., 2016;Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Cao et al., 2020;Xu et al., 2020). Because long-branch attraction existed in species of Baetidae, we deleted two species of Baetidae and rebuilt the BI and ML phylogenetic trees. The nucleotide sequences of the 13 protein-coding genes were used as the dataset to build the BI and ML phylogenetic trees as in Zhang et al. (2018b). In addition, Siphluriscus chinensis acted as the outgroup (Li, Qin & Zhou, 2014). We used Clustal W in the program Mega 7.0 to align each of the 13 protein-coding genes (Kumar, Stecher & Tamura, 2016). Also, we identified the conserved regions using the Gblock 0.91b program (Castresana, 2000). Concatenating the resulting alignments with Geneious 8.1.6 (Kearse et al., 2012), we inferred the optimal partitioning strategy using the program PartitionFinder 1.1.1 (Lanfear et al., 2012) and chose the best model based on the Bayesian Information Criterion (BIC). The best model is listed in Table S2. Each of three codon positions for the 13 protein-coding genes defined the data blocks. On the one hand, we implemented ML analysis in RAxML 8.2.0 with a GTRGAMMAI model. Also, we evaluated branch support for each node with 1,000 replicates (Stamatakis, 2014). On the other hand, we implemented BI analysis in MrBayes 3.2 with the best model estimated (Table S2), which were divided into four chains (three hot and one cold), with a run of 10 million generations in total length and sampling every 1,000 generations (Ronquist et al., 2012). The first 25% of generations were removed as burn-in. After the average standard deviation of split frequencies was below 0.01, we judged that BI analysis had reached sufficient convergence.

Mitogenome organization and composition
We obtained and characterized the complete mitogenomes of Ephemerella sp. Yunnan-2018, S. zapekinae, Serratella sp. Yunnan-2018, Serratella sp. Liaoning-2019, T. grandipennis and T. tumiforceps. These were deposited in the GenBank database (Table 1). The six new mitogenomes were double circular DNA molecules with lengths ranging from 15,134 bp to 15,703 bp (Fig. S1). The size variation of the six mitogenomes was large due to different intergenic nucleotides (IGNs), the size of the CR, and the presence of additional copies of trnI. The nucleotide compositions of the six mayfly mitogenomes had a high A+T bias between 61.1% and 66.1%. All six mitogenomes showed a negative AT-skew on the majority strand and negative GC-skew as well (Table 2), as occurs in most other Ephemeroptera mitogenomes. The AT-skew values on the majority strand for the six mayfly species ranged from −0.053 (S. zapekinae) to −0.017 (Serratella sp. Yunnan-2018) whereas the GC-skew for the majority strand ranged from −0.234 (Ephemerella sp. Yunnan-2018) to −0.159 (S. zapekinae). In addition, we analyzed the sizes and nucleotide compositions of the previously published mayfly mitogenomes. Differences in mitochondrial sequences of Ephemeroptera were mainly determined by different sizes of the CR. Among all sequenced Ephemeroptera mitogenomes (Table 1) (Zhang et al., 2008;Li, Qin & Zhou, 2014;Tang et al., 2014;Zhou et al., 2016;Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Cao et al., 2020;Xu et al., 2020),

Protein-coding genes (PCGs) and codon usages
The sizes of the protein-coding genes in the six mitogenomes were similar to each other. The locations and orientations of the 13 PCGs within the six mitogenomes were identical to those of most mayfly species (Tables S3, S4). The majority of the PCGs within the six mitogenomes began with conventional initiation codons, except TTG that was used for ND3  Incomplete terminal codons perform a significant function in polycistronic transcription cleavage and polyadenylation processes (Anderson et al., 1981;Ojala, Montoya & Attardi, 1981), which can be commonly observed in Ephemeroptera mitogenomes (Zhang et al., 2008;Li, Qin & Zhou, 2014;Tang et al., 2014;Zhou et al., 2016;Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Cao et al., 2020;Xu et al., 2020). The average A+T contents of the 13 PCGs within the six mitogenomes ranged from 60.4% to 65.5% (Table 2). The PCGs of both the majority and minority strands showed negative AT-skews, identical to the 29 other published Ephemeroptera mitogenomes (Zhang et al., 2008;Li, Qin & Zhou, 2014;Tang et al., 2014;Zhou et al., 2016;Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Song et al., 2019;Cao et al., 2020;Xu et al., 2020). Interestingly, a negative AT-skew existed in the PCGs on the majority strand of all mayfly mitogenomes, which contradicted the fact that the majority of hexapod species show a positive AT-skew (Perna & Kocher, 1995;Carapelli et al., 2007). This phenomenon indicates that a special phylogeny-related strand asymmetry reverse on the majority strand has appeared in mayfly species (Wei et al., 2010;Li, Qin & Zhou, 2014).
We calculated the relative synonymous codon usage (RSCU) of the six mayfly mitogenomes (Fig. 1, Table S5). The analysis showed a higher utilization of A or T nucleotides in the third codon position compared to other synonymous codons, generally regarded as the function of genome bias, optimum choice of tRNA usage or the benefit of DNA repair (Crozier & Crozier, 1993). According to comparative analyses, the most frequently utilized codons were highly similar within the six mayfly species. UUA (Leu), AUU (Ile) and UUU (Phe) were the most habitually used codons (>170) within the PCGs of the six mitogenomes, whereas UGC (Trp), CGC (Arg) and AGG (Ser) were the least used (<20). The strong AT mutation bias manifestly affected the codon usage, which proved that U and A were preferred in codons (Powell & Moriyama, 1997;Rao et al., 2011). Moreover, those AT-rich codons encoded the most plentiful amino acids, e.g., Leu (16.94%-17.47%), which suggested that the AT bias would also affect acid constituents of the proteins encoded by the mitochondrial genes (Foster, Jermiin & Hickey, 1997;Min & Hickey, 2007).

Transfer RNAs and ribosomal RNAs
Mitogenomes of Ephemerella sp. Yunnan-2018, S. zapekinae, Serratella sp. Yunnan-2018 and Serratella sp. Liaoning-2019 had 22 tRNA genes (Table 1). By contrast, the mitogenomes of T. grandipennis and T. tumiforceps had 24 tRNA genes (Table 1) including an extra two copies of trnI. Ephemerella sp. Yunnan-2018, S. zapekinae, Serratella sp. Yunnan-2018 and Serratella sp. Liaoning-2019 shared the translocation of trnI, which moved from between the CR and trnQ into a position between 12S rRNA and CR, along with the inversion of trnI (Fig. S1A). Furthermore, the mitogenomes of T. grandipennis and T. tumiforceps had three identical copies of the inversion of trnI, all of which were translocated to between 12S rRNA and CR (Fig. S1B). The lengths of these tRNA genes in the six mayfly mitogenomes ranged from 60 bp to 70 bp. The secondary structures of tRNA genes identified in these  mayfly mitogenomes are shown in Figs. S2-S7. All the predicted tRNAs showed the typical clover-leaf secondary structure of mitochondrial tRNAs except for trnS1 (AGN) in S. zapekinae (Fig. S3N). Its dihydrouridine (DHU) arm formed a simple loop, which has been observed in many insects (Zhang et al., 2018a;Wang et al., 2019). Quite a few mismatched pairs existed in terms of tRNA secondary structure within all six mayfly  mitogenomes, e.g., U-G in the DHU arm and amino acid acceptor arm of trnQ and U-G in the T ψC (T) arm of trnN and trnS1 (AGN). In general, these mismatches are fundamental units of tRNA secondary structure and are nearly isomorphic to normal base pairs, as commonly observed in mayflies or other insects (Varani & McClain, 2000;Zhang et al., 2008;Li, Qin & Zhou, 2014;Wang et al., 2019). It has been reported that mismatches may enhance aminoacylation and translation (McClain, 2006). In T. tumiforceps and T. grandipennis, three copies of trnI had the same structure due to the identical nucleotide sequences (Fig. 2). In general, the characteristics of tRNAs were conserved among the six mitogenomes, as is commonly observed in most insects, except for the inversion and translocation of trnI rearrangements.
In mitochondrial gene rearrangements of Ephemerella sp. Yunnan-2018, S. zapekinae, Serratella sp. Yunnan-2018 and Serratella sp. Liaoning-2019, we observed the inversion and translocation of trnI which is identical to Ephemerella sp. MT-2014(Tang et al., 2014. We suggest that these rearrangements are presumably caused by the Tandem Duplication and Random Loss (TDRL) model (Moritz, Dowling & Brown, 1987;Arndt & Smith, 1998) and Recombination model (Lunt & Hyman, 1997). A schematic illustration of rearrangement events for these mitogenomes is presented in Fig. 3. The inversion of trnI may result from the breaking of the mitogenomes at trnI along with recombination of trnI on the other strand. The translocation of trnI was probably caused by a tandem duplication of the gene block CR-trnI, resulting in a CR-trnI -CR-trnI arrangement, then the first copy of CR and the second copy of trnI were subsequently randomly lost. This left the trnI -CR arrangement. By contrast to the mentioned rearrangements, T. tumiforceps and T. grandipennis also had two extra copies of trnI, which were the same as the original trnI. The presence of three trnI copies may have been caused by additional gene duplication trnI -trnI -trnI. Nevertheless, since each rearrangement event was independent, the actual order of events has yet to be determined. It was reported by Boore (1999) that the flanks of the control region were the hot spots for gene rearrangements in Arthropoda, which is in accordance with the site of trnI rearrangement in mayflies. As Taylor et al. (1993) showed, tRNA order, as well as CR organization changed frequently in the evolution of insect mitogenomes.
Evidence for tRNA rearrangement has also been found in other Ephemeroptera families, such as Siphluriscidae [S. chinensis (Li, Qin & Zhou, 2014) ], Heptageniidae [P. youi (Zhang et al., 2008)  Proposed mechanism of gene arrangements in the six mayfly mitogenomes. Gene sizes are not drawn to scale. For each image, genes encoded by the minority strand are underlined and without underline are encoded by the majority strand. White boxes represent genes with the same relative position as in the ancestral insect arrangement pattern. Horizontal lines and cross symbols represent gene duplications and gene deletions, respectively. Red boxes represent gene inversions. Dark grey boxes represent non-coding regions. The remaining genes and gene orders were identical to the ancestral insect arrangement and are left out. The images show that firstly, trnI was inversed through recombination; secondly, CR-trnI was tandem duplicated; thirdly, the first CR and the second trnI were deleted according to the TDRL model; and fourthly, the inversion and translation of trnI was duplicated three times.
Full-size DOI: 10.7717/peerj.9740/ fig-3 observed in other insect orders such as Mantodea (Zhang et al., 2018a;Zhang et al., 2018b), Hymenoptera (Cameron et al., 2008) and Lepidoptera (Hu et al., 2010). The translocation of trnI to a position between 12S rRNA and CR also occurred in Anthonomus species (Coleoptera: Curculionidae), such as Anthonomus rectirostris, A. rubi, A. eugenii and A. pomorum (Van de Vossenberg et al., 2019). However, Anthonomus species did not show the inversion of trnI that occurs in mayflies. Remarkably, the present data for T. tumiforceps and T. grandipennis is the first time that three copies of trnI inversion and translocation have been reported in Insecta. The rearrangement of trnI may be a molecular marker for the family Ephemerellidae, whereas the three copies of trnI are probably a genus synapomorphy of the genus Torleya. These inferences will require more mitogenomes of Ephemerellidae and Torleya species to be fully proven. Like all other mitogenomes of Ephemeroptera, two rRNAs were detected in these Ephemerellidae species, the 16S rRNA and 12S rRNA. The 16S rRNA was located between trnL1 (CUN) and trnV. Unexpectedly, the 12S rRNA was located between trnV and trnI due to the translocation of trnI. The sizes of the 16S rRNA genes in these six mayfly species varied from 1,212 bp (T. grandipennis) to 1,230 bp (Serratella sp. Yunnan-2018), and the sizes of 12S rRNA ranged from 769 bp (T. grandipennis and T. tumiforceps) to 778 bp (Serratella sp. Yunnan-2018). Their length fit within the lengths detected in other Ephemeroptera mitogenomes (Tables S3, S4). The mitogenome of Serratella sp. Yunnan-2018 had the highest A+T content of the rRNA genes (68.8%) whereas the mitogenome of T. grandipennis had the lowest (63.1%). In the six mayfly mitogenomes, the AT bias of both 16S rRNA and 12S rRNA was slightly positive (≤0.075), whereas the GC bias was strongly positive (≥0.22), which revealed that the contents of A and G outnumbered T and C nucleotides, respectively (Table 2).

Non-coding control region
The A+T contents of the CRs of the six mayfly species ranged from 54.5% (Ephemerella sp. Yunnan-2018) to 68.5% (Serratella sp. Liaoning-2019) ( Table 2). The CRs all showed negative AT-skew values and negative GC-skew values. The various sizes of these mitogenomes were largely determined by the amounts and lengths of tandem repeats in the CR. Among insect mitogenomes, a large non-coding region was defined as the A+T region because of the high A+T content, which harbored the origin sites for transcription and replication (Wolstenholme, 1992;Taylor et al., 1993;Yukuhiro et al., 2002). We observed the existence of tandem repeats in the mitochondrial CR in Ephemerella sp. Yunnan-2018, S. zapekinae, and Serratella sp. Liaoning-2019 but not in the other three species analyzed in this study. These consisted of two tandem repeats of a 93 bp sequence and two tandem repeats of a 61 bp as well as one tandem repeat of a 16 bp in Ephemerella sp. Yunnan-2018; 2 tandem repeats of a 109 bp sequence as well as one tandem repeat of a 93 bp and three tandem repeats of a 72 bp sequence in Serratella sp. Liaoning-2019; and three tandem repeats of a 100 bp sequence, two tandem repeats of a 90 bp sequence and six tandem repeats of a 50 bp sequence in S. zapekinae (Fig. S8). It has been proposed that slipped-strand mispairing during mitogenome replication could lead to the appearance of tandem repeat units . The secondary structure of these repeat units, which could be predicted by Mfold (Zuker, 2003), are shown in Fig. S9. Stable secondary structures were detected in these repeat units except in Ephemerella sp. Yunnan-2018, which may make the slipped strand steady or block polymerase for the promotion of replication slippage (Savolainen, Arvestad & Lundeberg, 2000). These structural characteristics may pave the way for revealing the relationship between species evolution and duplicated mitochondrial sequences (Schultheis, Weigt & Hendricks, 2002;Tatarenkov & Avise, 2007;Sammler, Bleidorn & Tiedemann, 2011).
Interestingly, the A+T content of CRs in Ephemerella sp. Yunnan-2018, S. zapekinae and Serratella sp. Yunnan-2018 were obviously lower than the A+T content of corresponding whole mitogenomes, which was also observed in other mayflies (Paegniodes cupulatus, Parafronurus youi) (Table S6). This phenomenon was also observed in other insects, such as Orthoptera (Mecopoda niponensis) and Hemiptera (Lethocerus deyrollei, Aleurocanthus spiniferus, Corizus tetraspilus, Triatoma dimidiate, Hydaropsis longirostris) (Dotson & Beard, 2001;Hua et al., 2008;Yuan et al., 2015;Yang, Ren & Huang, 2016;Li et al., 2017). The abnormal situation of A+T content of CRs may be affected by A+T content of tandem repeat units. The presence of tandem repeats in the mitochondrial CR has also been detected in many Ephemeroptera species (Zhang et al., 2008;Li, Qin & Zhou, 2014;Zhou et al., 2016;Ye et al., 2018;Cao et al., 2020;Xu et al., 2020). In the CR of P. youi, the repeat region included 3 tandem repeats of a 94 bp sequence, which has a lower AT-content of 43.6% (Zhang et al., 2008). Also, the CR of Paegniodes cupulatus contained 11 tandem repeats of a 56 bp sequence (Zhou et al., 2016). Moreover, six replicates of a 140 bp sequence and seven partial 76 bp sequences existed in the CR of S. chinensis, an observation that is infrequent among the published mitogenomes of mayflies and other insects (Li, Qin & Zhou, 2014). However, the CRs in some species of different families among the Ephemeroptera had no tandem repeat, such as Caenidae [Caenis sp. YJ-2009 (GQ502451), Caenis sp. JYZ-2018(Cai et al., 2018 and Caenis sp. JYZ-2020 (Xu et al., 2020)

Intergenic Region
The mitogenomes of S. zapekinae, Serratella sp. Yunnan-2018 and Serratella sp. Liaoning-2019 contained 4, 6, and 6 non-coding intergenic spacer sequences, respectively, with total lengths of 22 bp, 32 bp and 29 bp, whereas Ephemerella sp. Yunnan-2018 had 7 non-coding intergenic spacer sequences of 92 bp in total length due to an intergenic spacer of 68 bp between ND4L and trnT. An intergenic spacer between trnS2 (UCN) and ND1 ranging from 17 bp to 19 bp was also detected in these four mitogenomes and is commonly found in Ephemeroptera. In terms of the intergenic spacer between ND4L and trnT in Ephemerella sp. Yunnan-2018, we also detected a similar sequence of 62 bp with a similarity of 72.58% in Ephemerella sp. MT-2014(Tang et al., 2014. Hence, this intergenic spacer is possibly specific to the genus Ephemerella. A discussion of the occurrence and function of this intergenic spacer will require mitogenome sequencing of more Ephemerella species. In terms of the mitogenomes of T. tumiforceps and T. grandipennis, these two mitogenomes both contained 7 intergenic spacers with total lengths of 565 bp and 237 bp. Except for small intergenic spacers (≤12 bp), we observed a large intergenic spacer between trnS1 (AGN) and trnE within the mitogenomes of T. tumiforceps and T. grandipennis, with lengths of 531 bp and 200 bp, respectively. The large intergenic region in T. tumiforceps and T. grandipennis had 52.17% and 52% A+T content, respectively, with a strongly positive GC-skew (0.31 and 0.13). The large intergenic region in T. tumiforceps and T. grandipennis had a similar starting sequence of 20 bp (GGGSCKAAAGGSCCTACCTA) and a conserved sequence of 19 bp (TTTTTAGCGAACTTAGGGG), respectively, with 4 tandem repeats of 87 bp and 3 tandem repeats of 42 bp along with 4 partial sequences, respectively. The repeat unit of T. tumiforceps could be folded into three stem-loop secondary structures, whereas T. grandipennis folded into two stem-loop secondary structures (Fig. S10). The latter two stem-loops of T. tumiforceps had a certain similarity with T. grandipennis, which may promote replication slippage and subsequently lead to an increase in duplicate copies , acting as a probable substitute replication origin for mtDNA (Crozier & Crozier, 1993;Dotson & Beard, 2001). Thus, it was suggested that this large intergenic region may be the result of the duplication and random loss of the conserved sequence (Fig. S11), similar to the large repetitive sequences between trnE and trnF in aphids and the intergenic regions in Parnassius bremeri reported by Zhang et al. (2014) and Kim et al. (2009), respectively. The large intergenic spacer between trnS1 (AGN) and trnE could be synapomorphic for the genus Torleya owing to the fact that sequences located between trnS1 (AGN) and trnE within mitogenomes of other Ephemeroptera were short (<13 bp) and the large intergenic spacer has only been found in Torleya. This is the first finding of a large intergenic region in the mitogenomes of mayflies. As a consequence of the non-coding particularity of the intergenic spacer sequence, these species may well have gone through further sequence divergence swiftly (Kim et al., 2009). With the accumulation of more sequence information from Torleya species, more instructive and clear conclusions could be given.
The mitogenomes of most insect species are seemingly economical, although certain species contain large intergenic regions (Boore, 1999;Kim et al., 2009;Zhang et al., 2014;Wang et al., 2019). An analysis of intergenic regions in the published mitogenomes of Ephemeroptera revealed the frequent occurrence of quite long intergenic spaces. For example, we observed the same intergenic region of 47 bp between ND4L and trnT in both Isonychiidae [Isonychia kiangsinensis (Ye et al., 2018) and I. ignota (HM143892)], whereas an intergenic spacer between trnQ and trnM ranging from 26 bp to 53 bp was found in Caenidae [Caenis sp. YJ-2009 (GQ502451), Caenis sp. JYZ-2018(Cai et al., 2018 and Caenis sp. JZY-2020 (Xu et al., 2020)], as well as in Siphlonuridae [S. immanis (FJ606783)] and Ephemeridae [E. orientalis (EU591678)]. In terms of the mitogenomes of Heptageniidae, we detected an intergenic spacer between trnA and trnR, ranging from 33 bp to 47 bp (Zhang et al., 2008;Gao et al., 2018b;Song et al., 2019). This intergenic spacer had a high similarity to the A+T region, ranging from 70% to 100%, with lengths ranging from 20 bp to 28 bp. In conclusion, based on intergenic spacers detected in different families and genera of Ephemeroptera, intergenic regions may be specific to family or genus. However, more mitochondrial sequences are needed to support a further discussion of intergenic regions within Ephemeroptera.
Our findings indicated that Ephemerellidae was the sister clade to Vietnamellidae, but Teloganellidae was the sister clade to Baetidae (Fig. 4), which may be caused by long-branch attraction (LBA). Considering that LBA existed in Baetidae species, we deleted the two species of Baetidae and rebuild the BI and ML phylogenetic trees (Fig. 5). Ephemerellidae was still the sister clade to Vietnamellidae, but Teloganellidae with LBA was a sister clade to Caenidae. Moreover, LBA existed in the family Baetidae, observed in both the BI and ML analysis (Fig. 4). It has been proposed that LBA is caused by a rapid evolutionary rate (Felsenstein, 1978;Brinkmann et al., 2005;Lartillot, Brinkmann & Philippe, 2007;Dabert et al., 2010). Under these circumstances, more slowly evolving species are necessarily needed to eliminate the LBA and reconstruct the phylogenetic relationship within the Ephemeroptera (Kim, 1996;Poe, 2003). The sister-group relationship between Ephemerellidae and Vietnamellidae was supported by current phylogenetic analyses using 13 PCGs of the mayfly mitogenomes (Figs. 4 and 5) as well as the results of many other studies (Cai et al., 2018;Gao et al., 2018b;Ye et al., 2018;Cao et al., 2020;Xu et al., 2020). There has been substantial controversy surrounding the phylogenetic relationships among Ephemerellidae, Vietnamellidae and Teloganellidae. Morphological characteristics have indicated that Teloganellidae and Vietnamellidae have a close phylogenetic relationship. The merged branch of Teloganellidae and Vietnamellidae formed a sister group to Ephemerellidae (McCafferty, 1991;Kluge, 2004). By contrast, via a phylogenetic tree based on the 12S, 16S, 18S, 28S and H3 genes, Ephemerellidae was supported as a monophyletic group, and their relationship with Teloganellidae and Vietnamellidae remains problematic (Ogden et al., 2009). In addition, Ephemerella, Serratella and Torleya formed a monophyletic clade within Ephemerellidae. Thus, this phylogenetic tree showed the main topology: ((((S. zapekinae + Serratella sp. Liaoning-2019) + Serratella sp. Yunnan-2018) + (T. tumiforceps + T. grandipennis)) + (Ephemerella sp. Yunnan-2018 + Ephemerella sp. MT-2014)). The inversion and translocation of one copy of trnI was found in Ephemerella and Serratella, whereas the inversion and translocation of three copies of trnI was found in Torleya (Fig. 4). According to the phylogenetic relationship of the three genera in Ephemerellidae ((Serratella + Torleya) + Ephemerella), we can deduce that the inversion and translocation of one copy of trnI is characteristic of the ancestral Ephemerellidae. Unfortunately, the presence of only one mitogenome of Teloganellidae restricted a discussion of its monophyly, which requires more species to be added.

CONCLUSION
We successfully determined the complete mitogenomes of Ephemerella sp. Yunnan-2018, S. zapekinae, Serratella sp. Yunnan-2018, Serratella sp. Liaoning-2019, T. grandipennis and T. tumiforceps. The mitogenomes of the genera Ephemerella and Serratella showed similar gene features to Ephemerella sp. MT-2014 (KM244691) and had the same inversion and translocation of trnI, whereas two extra copies of trnI were found in the genus Torleya. The translocation and extra copies of trnI could be explained by the tandem duplication/random loss model, whereas the inversion of trnI could be interpreted by the mitogenome recombination model. The evidence from this study suggests that these specific mitogenome rearrangements may be molecular markers of the family Ephemerellidae, especially the three copies of trnI as a synapomorphy for the genus Torleya.
The occurrences and mechanisms of the large intergenic region between trnS1 (AGN) and trnE, detected in T. tumiforceps and T. grandipennis, require more mitogenomes of Torleya species to be investigated. Phylogenetic analyses based on BI and ML trees both supported the monophyly of Ephemerellidae and Vietnamellidae. Moreover, Ephemerellidae acted as the sister clade to Vietnamellidae whereas Teloganellidae existed in the other clade. Considerable work may be needed to increase the number of mitogenomes to resolve the phylogenetic relationships of Ephemeroptera.
• Zi-Yi Zhang analyzed the data, prepared figures and/or tables, authored or reviewed drafts of the paper, and approved the final draft.
• Kenneth B. Storey analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.
• Dan-Na Yu and Jia-Yong Zhang conceived and designed the experiments, analyzed the data, authored or reviewed drafts of the paper, and approved the final draft.

Field Study Permissions
The following information was supplied relating to field study approvals (i.e., approving body and any reference numbers): Field experiments were approved by the Research Council of Zhejiang Normal University.

DNA Deposition
The following information was supplied regarding the deposition of DNA sequences: The six mitogenomes are available at GenBank: MT274127-MT274132.