Introduction

The insect order Embioptera, commonly known as webspinners, is a small, lesser-known order comprising a distinctive, monophyletic group with about 2,000 species inhabiting tropical and subtropical regions of the world1, 2. The distinctive synapomorphy of the webspinners is the presence of silk glands in the enlarged basal tarsomere of forelegs in nymphs and adults. Webspinners live subsocially in the silken galleries constructed with their silk glands and feed primarilyon plant debris. The unique morphological and biological features of webspinners make them a mysterious insect group that warrants further study. Phylogenetic relationships between the Embioptera and other insect orders remain unclear3. Recent studies almost unanimously support a sister group relationship between Phasmida and Embioptera4,5,6,7,8,9,10.

Mitochondrial genomes (mitogenomes, mtDNA) have been a hotspot for insect research with wide utility in phylogenetics, population genetics and evolutionary biology11. Insect mitogenomes are typically circular, double stranded DNA molecules, usually 14–20 kb in length, and contain 13 protein-coding genes (PCGs), 22 transfer RNA (tRNA) genes, two ribosomal RNA (rRNA) genes and a control region (CR)12, 13.

Kômoto et al. (2012) described the mitogenome of the webspinner Aposthonia japonica from Japan, which was the first published mitogenome for Embioptera14. However, the A. japonica mitogenome was incomplete and lacked the control region and several tRNA genes14. Tang et al. (2014) subsequently reported a partial mitochondrial sequence of Aposthonia borneensis, a webspinner species endemic to continental southeastern Asia2, 15. Unfortunately, the incomplete, fragmented mitogenome sequence data from the two Aposthonia spp. are insufficient for structural and evolutionary analyses. In this study, we report the complete mitogenome of A. borneensis and compare the organization and sequence composition with A. japonica. The evolutionary rate of PCGs and the mitogenome rearrangements in webspinners are discussed, and we offer speculations based on the results of phylogenetic analyses. This study is a good supplement for the limited molecular data available for Embiopterans.

Results and Discussion

Mitogenome organization and composition

The complete mitogenome of A. borneensis was annotated and deposited in the NCBI database (GenBank accession no. KX965988). The A. borneensis mitogenome is a 15,660 bp circular DNA molecule, which is smaller than A. japonica (18,305 bp) due to differences in the size of intergenic nucleotides (IGNs) and the CR (Fig. 1). The A + T content of the A. borneensis mitogenome (75.9%) is similar to that of A. japonica (74.8%) (Table 1). The A. borneensis mitogenome contains 34 genes (13 PCGs, 19 tRNAs and two rRNAs) and a putative control region (CR); 21 genes (nine PCGs and 12 tRNAs) are located on the J-strand and 13 genes (four PCGs, seven tRNAs, and two rRNAs) map to the N-strand (Table 2). When compared to the partial mitochondrial sequences reported by Tang et al. (2014)15, the A. borneensis mitogenome in this study obtained more genes (atp8, Cytb, nad6, rrnL) and the CR, which ensures more reliable structural and phylogenetic analyses. The A. japonica mitogenome contains 13 PCGs, 20 tRNAs and three rRNAs. Three tRNAs, trnIle, trnGln and trnMet, are absent in both the A. borneensis and A. japonica mitogenomes. Although trnCys and trnSer2 are present in the A. borneensis mtDNA, these genes were not reported in the A. japonica mitogenome, possibly because of incomplete sequence data14.

Figure 1
figure 1

Mitochondrial map of A. borneensis. Genes outside the map are transcribed in a clockwise direction, whereas those inside the map are transcribed counterclockwise. The second circle shows the GC content and the third shows the GC skew. GC content and GC skew are plotted as the deviation from the average value of the entire sequence.

Table 1 Nucleotide composition in conserved regions of webspinner mitogenomes.
Table 2 Gene structure of the mitogenome of Aposthonia borneensis.

The A. borneensis mtDNA contains 48 overlapping nucleotides that are 2–22 bp in length and located in six pairs of neighboring genes. The longest overlap (22 bp) is located between rrnL and trnVal. A total of 983 IGNs ranging from 1 to 217 bp are dispersed between 22 pairs of neighboring genes; the longest IGN is located between rrnS and nad2 (Table 2). The A. japonica mitogenome contains 3013 bp of IGNs, and the largest IGN (979 bp) is located between trnThr and rrnSer1. The extremely large number of IGNs reveals the relaxed structure of the two webspinner mitogenomes.

The nucleotide base composition of the A. borneensis mitogenome (A = 43.6%, T = 32.3%, C = 17.1%, G = 7.0%) is similar to that of A. japonica (A = 42.5%, T = 32.3%, C = 18.7%, G = 6.5%); both genomes are significantly biased for the A and T nucleotides. The overall length and A + T content of the PCGs are similar in the two mitogenomes (Table 1). Comparative analyses indicate that the length of individual genes and the A + T content of PCGs are almost identical in the two species; a slight variability was observed in the A + T content of tRNA genes (Fig. 2). The AT-skew and GC-skews of the A. borneensis mitogenome were calculated and showed a bias for A and C nucleotides (Table 3). Unlike the strand bias of most other insects (positive AT- and negative GC-skew for the J-strand), the second codon position for PCGs (J-strand) shows a negative skew in A. borneensis 16.

Figure 2
figure 2

Gene size and A + T content of genes in the two webspinner mitogenomes. A. japonica and A. borneensis are represented by blue and red columns, respectively. Gene names are shown on the X axis; gene sizes and A + T content are shown on Y axis.

Table 3 Nucleotide composition of the Aposthonia borneensis mitogenome.

Protein-coding genes

The length and A + T content of PCGs in the A. borneensis and A. japonica mitogenomes were nearly identical (Fig. 2). In both two mitogenomes, nad5 is the largest PCG whereas atp8 is the smallest; atp8 has the highest A + T content while cox1 has the lowest. Most PCGs in the two webspinner mitogenomes initiate with the standard start codon ATN (ATA, ATT, ATC, and ATG); an exception is nad4 in the A. japonica mitogenome, which initiates with GTG. Most of the PCGs have complete termination codons (e.g. TAA or TAG), whereas nad1 contains the incomplete termination codon T for the two mitogenomes (Table 2). Ojala et al. (1981)17 speculated that the incomplete termination codon T could be completed by post-transcriptional polyadenylation.

In the A. borneensis mitogenome, the AT-skew of the concatenated PCGs is positive whereas the GC-skew is negative, indicating that the concatenated PCGs contain a higher percentage of AT vs. GC nucleotides. AT-skews were positive at the first and third positions but negative at the second position of the combined PCGs (Table 3). The relative synonymous codon usage (RSCU) values of the two mitogenomes were compared and showed a biased use of A and T nucleotides in both webspinners (Fig. 3). The frequent use of anticodons NNA and NNU reflects the preference for A and T in the third codon position. Overall, the five most frequently used codons in A. borneensis and A. japonica are ATA (Met), ATT (Ile), AAA (Lys), AAT (Asn) and TTA (Leu), which further supports the strong bias for A and T nucleotides.

Figure 3
figure 3

Relative synonymous codon usage (RSCU) in the two webspinner mitogenomes. Codon families are indicated below the X axis.

The mean Ka/Ks ratios for individual PCG of the two webspinner mitogenomes are calculated to evaluate the evolutionary rate of each PCG (Fig. 4). The results indicate that atp6 has the highest evolutionary rate, followed by nad1, while cox1 has the lowest rate. The Ka/Ks ratios for atp6, nad1, nad2, nad4 and nad4l were all above 1, indicating that these five genes are evolving under positive selection. The Ka/Ks ratios for the remaining eight PCGs were all below 1, suggesting the existence of high evolutionary pressure.

Figure 4
figure 4

Evolutionary rates of PCGs in the two webspinner mitogenomes. The green columns indicate the average Ka/Ks for each gene.

Transfer RNA and ribosomal RNA genes

Unlike the typical set of 22 tRNA genes in metazoan mitogenomes, there are 19 and 20 tRNA genes predicted in the A. borneensis and A. japonica mtDNAs, respectively. The size of tRNA genes was conserved in the two species, but the A + T content varied in the tRNA genes of the two mitogenomes. In the A. borneensis mitogenome, 14 tRNA genes are predicted to fold into typical cloverleaf secondary structures composed of four arms; however, the remaining five tRNAs lacked dihydrouridine (DHU) arms (Supplementary Fig. S1). A total of 20 mismatched nucleotides (G-U pairs) were identified in the A. borneensis tRNAs. In general, the anticodons of the 19 tRNA genes in A. borneensis are essentially identical to those reported in other insects.

In the two webspinners, the size of the large ribosomal RNA (rrnL) genes ranged from 1305 (A. japonica) to 1376 bp (A. borneensis), and the A + T content of rrnL ranged from 76.6% (A. japonica) to 80.0% (A. borneensis) (Table 1). In both species, the two rrnL genes mapped between trnLeu2 and trnVal, which is common in other insect mitogenomes (Fig. 5).

Figure 5
figure 5

Mitogenome arrangements in Drosophila yakuba, A. borneensis, and A. japonica. Genes are transcribed from left to right except for the underscored genes, which are transcribed in the opposite direction. Rearrangements are indicated by the diagonal black lines.

The small ribosomal RNA (rrnS) gene of A. borneensis is 723 bp with an A + T content of 75.7%. In A. japonica two rrnS genes were identified, one gene is 756 bp and located adjacent to the rrnL gene, while the other copy is 676 bp and maps between trnThr and trnTrp. Kômoto et al. (2012)14 speculated that the 676 bp copy is a pseudogene due to its truncated 3′ end.

The control region

The non-coding CR was not identified in the A. japonica mitogenome, presumably due to incomplete sequencing14. In A. borneensis, the putative CR is 377 bp, has an A + T content of 89.9%, and is located between trnSer2 and nad1 (Table 2). The unusual location of the CR in A. borneensis is presumably due to mtDNA rearrangement. Structural elements in the CR of A. borneensis can be predicted beginning at the 5′ end: a 66-bp region with high A + T content (90.9%); a 44-bp region containing two complete and one incomplete tandem repeats; and six stem-loop structures (SL) (Fig. 6). The small size of tandem repeats may explain the relatively small CR in A. borneensis when compared to other insects18, 19. The tandem repeats and SL structures in the CR have potential regulatory effects on mitogenome replication and transcription20.

Figure 6
figure 6

Predicted structural elements in the control region of A. borneensis. The regions with high A + T content are indicated by red shaded rectangles. Stem-loop regions are demarcated by orange rectangles.

Gene rearrangements

The mtDNA map is generally conserved in insects and similar to the model insect, Drosophila yakuba 21; however, exceptions exist and some insect groups exhibit rearranged mitogenomes. Analysis of the mtDNA sequences of the two webspinners revealed gene rearrangements in both species. In A. borneensis, a translocation is present between trnCys and trnTrp, and the putative CR mapped between trnSer2 and nad1, which is also presumably due to translocation (Fig. 5). In general, the organization of the A. borneensis mitogenome is highly similar to D. yakuba. However, the mitogenome of A. japonica differs from D. yakuba and shows multiple rearrangements including an extra copy of rrnS and the translocation of trnTrp, trnThr, the trnAsp-nad5 gene cluster, and perhaps trnCys and trnSer1 (Fig. 5). Several hypotheses for these rearrangements were discussed by Kômoto et al. (2012)14, but the underlying mechanisms remain unclear.

The arrangement of mitochondrial genes is generally conserved among closely related species; consequently, the magnitude of structural variations in the two Aposthonia spp. is noteworthy. This phenomenon is very rare for animals but has been recorded for booklice in the genus Liposcelis 22 and has been found recently in a thrips genus23.

Several hypotheses have been proposed to explain the heightened mitogenome rearrangement rates in insect species. One hypothesis considers parasitism as a predisposing factor for mtDNA reorganization, but this has been gradually rejected24, 25. Another hypothesis is concerned with the duplication of the control region, which has been confirmed in the Thysanoptera and Psocoptera26,27,28, but does not apply to webspinners. Another theory is related to the evolution of haplodiploidy, which has been reported in Hymenoptera, Thysanoptera, Aleyrodidae, Sciaridae but does not apply to webspinners29. Although the underlying mechanisms for mtDNA rearrangement in insects are not clear, these rearrangements are very informative with respect to insect genesis and evolution30. More comprehensive analyses of insect mitogenomes in multiple insect taxa are needed to further understand the mechanistic basis of mtDNA rearrangements.

Speculations on phylogeny and evolution

In this study, phylogenetic analyses were conducted using the concatenated DNA sequence data of rrnL, 28 s, cox1 and H3 genes from 19 Oligotomidae spp., and two Teratembia spp. from Teratembiidae were included as outgroups (Supplementary Table S1). The Bayesian inference (BI) and Maximum likelihood (ML) analysis generated similar topological structures (Fig. 7 and 8, Supplementary Fig. S2). The non-monophyly of the genus Aposthonia is supported in this study, which is congruent with the results of Miller et al. (2012)9. Meanwhile, the divergence time for A. borneensis and A. japonica was estimated at ca. 103 Ma (mid-Cretaceous), when the Pangea was in process of breakup. According to Blakey (2006)44, the Japanese islands had already been separated from the continental southeastern Asia at ca. 105 Ma. Based on these results, we speculate that the most recent common ancestor (MRCA) of A. borneensis and A. japonica was divided into several geographic groups during the Pangea breakup. Geographic isolation between the Japanese islands and the continental southeastern Asia accelerated the speciation process of the MRCA and the divergent evolution of the two webspinners. The continuous geographic isolation resulted in the genetic diversity observed between the two webspinners. For example, A. borneensis retained the conserved ancestral mitogenome, while a high substitution rate and frequent genetic rearrangements occurred in A. japonica.

Figure 7
figure 7

Dated Bayesian tree of Oligotomidae. Numbers on branches are Bayesian posterior probabilities. The tree was rooted with two outgroups, T. geniculata and Teratembia sp. KBMEB7.

Figure 8
figure 8

Maximum likelihood (ML) tree of Oligotomidae. Numbers on branches indicate bootstrap values. The tree was rooted with two outgroups, T. geniculata and Teratembia sp. KBMEB7.

In the phylogenetic trees, Oligotomidae can be divided into four clades of species according to their general distributional patterns (Figs 7 and 8). Earlier diverged clades of species (clades III and IV) tend to have a limited distribution in Asia, while later-diverged clades of species (clades I and II) show a farther distribution away from Asia (Supplementary Table S1). This result should be a good evidence for our speculation that divergent evolution of webspinners accompanied the Pangea breakup, and the species of Oligotomidae might originate from Asia. Furthermore, the five PCGs with relatively high evolutionary rates (atp6, nad1, nad2, nad4 and nad4l) may also have potential influence on this divergent evolutionary process. Finally, based on phylogenetic analyses, we predict that other webspinners exist with highly rearranged or relatively conserved mitogenomes like A. japonica and A. borneensis, respectively. Additional mitogenome sequencing of Embioptera is needed to investigate the processes driving the structural variation and evolutionary history of webspinners.

Materials and Methods

Sampling and DNA extraction

Specimens of A. borneensis were collected from South China Agricultural University in Guangdong Province, China. Our research activities were not banned by any organization or individual and did not involve endangered or protected species. Specimens used in this study were preserved in 100% ethanol and stored at −20 °C. Total genomic DNA was extracted from adults using the Column mtDNAout kit (Tianda Beijing, China) and stored at −20 °C until used for PCR.

PCR amplification and sequencing

Five pairs of LA-PCR primers were used to amplify segments of the A. borneensis mitogenome (Supplementary Table S2). LA-PCR was performed with LA Taq DNA polymerase (Takara, Japan) in an ABI thermal cycler as follows: initial denaturation at 94 °C for 3 min, followed by 35 cycles at 94 °C for 30 s; annealing at 55 °C for 30 s; and elongation at 68 °C for 9 min, and final elongation at 68 °C for 15 min. LA-PCR products were purified with an Axygen DNA Gel Extraction Kit (Axygen Biotechnology, Hangzhou, China) after separation by electrophoresis in 1.0% agarose gels.

Ten additional gene-specific PCR primers were designed using Primer Premier 5.0 and used to amplify the remaining gaps (Supplementary Table S2). These reactions were performed using templates obtained from LA-PCR as follows: primary denaturation at 94 °C for 3 min, 35 cycles at 94 °C for 30 s, annealing at 40–55 °C for 30 s, elongation at 72 °C for 90 s, and final elongation at 72 °C for 10 min. PCR products were separated by 1.0% agarose gel electrophoresis, and all PCR fragments were sequenced after separation and purification.

Genome assembly and annotation

The software CodonCode Aligner (http://www.codoncode.com/aligner/) was used for sequence assembly and annotation. PCGs and rRNA genes were identified by MITOS (http://mitos.bioinf.uni-leipzig.de/index.py) and by comparison with the previously sequenced webspinner mitogenome14, 31. The graphical map of the complete mitogenome was drawn with the online software CGView Server (http://stothard.afns.ualberta.ca/cgview_server/index.html)32. tRNAs were identified by MITOS combined with tRNAscan-SE Search Server v. 1.21 (http://lowelab.ucsc.edu/tRNAscanSE/)33. The secondary structure of tRNA and rRNA genes was also obtained from MITOS. The A + T content and base composition were analyzed by MEGA v. 634. Composition skew analysis was performed using the AT-skew = [A–T]/[A + T] and GC-skew = [G–C]/[G + C] formulas35. The software package DnaSP v. 5.1036 was used to calculate the synonymous substitution rate (Ks) and the nonsynonymous substitution rate (Ka). Stem loop structures of the putative control region were predicted by DNAMAN, and the tandem repeats were analyzed with the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.advanced.submit.html)37. Sequence data were deposited into GenBank as accession number KX965988.

Phylogenetic Analysis and Divergence

To investigate the phylogenetic relationship between A. borneensis and A. japonica, nucleotide sequences of rrnL, 28s, cox1 and H3 genes for 19 Oligotomidae spp. were downloaded from GenBank (Supplementary Table S1). Two other webspinners from the family Teratembiidae, Teratembia geniculata and Teratembia sp. KBMEB7 were used as outgroups. The four genes were respectively aligned with Clustal X as implemented in MEGA v. 634. The best partitioning schemes and substitution models were determined with PartitionFinder v. 2.1.138 using the Bayesian Information Criterion (BIC) and a greedy search algorithm. Four partitions (rrnL, 28s, cox1 and H3 partitioned by genes) and the GTR + I + G model of nucleotide substitution were predetermined for analyses. Bayesian inferences (BI) were implemented in the program BEAST v. 1.8.4 on an.xml file produced in the program BEAUti v. 1.8.4 from the concatenated DNA sequence data, and the MRCA of Oligotomidae was calibrated with a lower limit of 94.3 Ma39. The BEAST analyses (uncorrelated lognormal relaxed clock, Yule prior) were performed for 100 million generations with sampling every 1000 generations and a burn-in of 25% trees. Stationarity of all runs was examined by Tracer v. 1.5 (Effective sample sizes over 200)40. Treeannotator v. 1.8.4 was used to obtain the maximum clade credibility (MCC) timetree before viewing in FigTree v. 1.4.241. Maximum likelihood (ML) analysis was performed using RAxML v. 8.2.9 via the online CIPRES Science gateway portal42, 43. The ML analyses performed 1000 bootstrap replicates with the GTRGAMMA substitution model used for all partitions.