Novel insight into lepidopteran phylogenetics from the mitochondrial genome of the apple fruit moth of the family Argyresthiidae

Abstract

Background

The order Lepidoptera has an abundance of species, including both agriculturally beneficial and detrimental insects. Molecular data has been used to investigate the phylogenetic relationships of major subdivisions in Lepidoptera, which has enhanced our understanding of the evolutionary relationships at the family and superfamily levels. However, the phylogenetic placement of many superfamilies and/or families in this order is still unknown. In this study, we determine the systematic status of the family Argyresthiidae within Lepidoptera and explore its phylogenetic affinities and implications for the evolution of the order. We describe the first mitochondrial (mt) genome from a member of Argyresthiidae, the apple fruit moth Argyresthia conjugella. The insect is an important pest on apples in Fennoscandia, as it switches hosts when the main host fails to produce crops.

Results

The mt genome of A. conjugella contains 16,044 bp and encodes all 37 genes commonly found in insect mt genomes (13 protein-coding genes (PCGs), two ribosomal RNAs and 22 transfer RNAs), together with a large control region (1101 bp). The nucleotide composition was extremely AT-rich (82%). All 13 PCGs began with an ATN codon (ATT in the case of cox1) and terminated with a TAA stop codon. All 22 tRNAs had cloverleaf secondary structures, except trnS1, in which the dihydrouridine (DHU) arm is missing, reflecting potential differences in gene expression. When compared to the mt genomes of 507 other Lepidoptera representing 18 superfamilies and 42 families, phylogenomic analyses found that A. conjugella was most closely related to the family Plutellidae (superfamily Yponomeutoidea). We also detected a sister relationship between Yponomeutoidea and the superfamily Tineoidea.

Conclusions

Our results underline the potential importance of mt genomes in comparative genomic analyses of Lepidoptera species and provide valuable evolutionary insight across the tree of Lepidoptera species.

Background

Lepidoptera is the second largest insect order, with > 160,000 species [1]. This order includes both butterflies and moths, many of which are important model organisms in ecology and evolutionary biology [2]. In Lepidoptera, mitochondrial (mt) genomes are widely used to study population genetics, phylogeography, phylogenetics and molecular taxonomy [3,4,5]. In particular, the mitogenome represents an ideal tool for the analysis of phylogenetic relationships due to its simple structure, maternal inheritance, low recombination, and high conservation over the course of evolution [6, 7]. Mitogenomes may also provide information to identify novel genes that may serve as targets in future research [8]. The mitogenome size of Lepidoptera ranges from 15,000 bp to above 16,000 bp [6, 7], mostly due to the variable length of noncoding regions, particularly the control region [8]. Moreover, lepidopteran mt genomes have a conserved adenine- and thymine-rich (A + T) region and usually contain 37 genes (13 conserved protein-coding genes (PCGs), 22 tRNAs and 2 rRNAs) and a noncoding control region [6, 7, 9]. Until recently, the bulk of Argyresthiidae phylogenetic analyses utilized a common set of 8–11 mitochondrial and nuclear genes [10] and a set of up to 27 protein-coding genes [11,12,13,14,15,16]. However, inadequate node support hindered research that attempted to unravel relationships among superfamilies, even with over 1500 genes [17,18,19,20]. Compositional bias and other model violations were identified as potential causes of the competing phylogenetic hypotheses [21].

Yponomeutoidea is a large superfamily of Lepidoptera with 11 families and 1800 species [16, 22]. Surprisingly, mt genomes of only seven species of this superfamily are available in databases [23]. For the Argyresthiidae family, which belongs to this superfamily and contains 157 species [16], no mt genomes exist. Thus, obtaining mt genomes of this family is warranted for further resolving patterns of genomic evolution and assessing phylogenetic relationships.

The apple fruit moth (Argyresthia conjugella, Zeller) has a wide circumpolar distribution [24, 25]. Its main host, rowan (Sorbus aucuparia), is a masting species with spatiotemporally synchronized crop output [26]. In heavy intermast years, the apple fruit moth can hatch to find no host material available and will therefore seek secondary hosts, causing serious damage to apple crops [27]. A. conjugella is known to have high genetic diversity and a wide distribution in Fennoscandia [28,29,30], but the lack of complete mitogenomes for A. conjugella and the family Argyresthiidae hampers further studies on systematics, population genetics, taxonomy and evolutionary biology.

Our primary aim was to characterize the first entire mitogenome of a species in the Argyresthiidae family using the apple fruit moth in Norway as our study organism. Second, we analysed genome structure, base composition, substitution, and evolutionary rates among superfamilies using previously published Lepidoptera mitogenomes to obtain a better understanding of the phylogeny of Lepidoptera. We hypothesized that our phylogenetic analysis would recover Argyresthiidae nested within Yponomeutoidea. Furthermore, we evaluated the phylogenetic hypothesis that Argyresthiidae shows a sister-group relationship with Lyonetiidae, i.e., the ‘AL’ clade (Argyresthiidae + Lyonetiidae) that Sohn et al. (2013) [16] obtained based on nuclear genes. Finally, we wanted to provide an up-to-date identification of source taxa of lepidopteran sequences lacking superfamily-, family-, and/or genus-level ID on GenBank using a phylogenetic systematics framework.

Results and discussion

Genome assembly

Except for the variable control region (CR) and the Norgal assemblies, we recovered the same gene order and content using both of our mitogenome assembly strategies. Norgal failed to assemble mitogenome sequences under the de novo assembly strategy: no mitogenomic features (PCGs, tRNAs, and rRNAs) were found in either assembly, whether produced with default parameters (assembly size = 24,871 bp) or adjusted parameters (assembly size = 29,520 bp, -m 500). However, when run under the baited de novo assembly strategy, Norgal recovered the same gene order and content as SPAdes and Geneious Prime®, with the exception of the large ribosomal RNA gene (rrnL), which differed in size by 1 bp and in sequence from that recovered by SPAdes and Geneious Prime® (pairwise p-distance = 0.01). We found that the variations in mitogenome sizes were associated with the properties of the control region (CR), which include variation in the copy number of tandemly repeated sequences and extensive length variation of a variable domain [31, 32]. When using SPAdes under the de novo assembly strategy, the nearly complete CR (1101 bp) was recovered. When the baited de novo assembly strategy was used, SPAdes recovered a partial CR of 380 bp in which the repetitive sequences could not be assembled. As a result, we present the complete mitogenome sequence of the apple fruit moth from the SPAdes de novo assembly, in which the mitogenome is a 16,044 bp closed circular molecule (GenBank accession: ON496993; Fig. 1). Interestingly, the mitogenome size of the apple fruit moth was similar to that of available Yponomeutoid mitogenomes [33,34,35], which are longer on average than those of other superfamilies of Lepidoptera (n = 4, 16,092 ± 353 bp, Table 1).

Fig. 1

Circular map of the complete mitogenome of Argyresthia conjugella depicting gene order. Labelling of tRNA genes was conducted in accordance with IUPAC-IUB single-letter amino acid codes

Table 1 General information and nucleotide composition for a subset of 51 representative mitochondrial genomes of the order Lepidoptera and 5 Trichopteran outgroups used in this study

Genome organization and base composition

The gene content of the apple fruit moth mitogenome is similar to that of other Ditrysian insects studied previously, with 22 tRNA genes, 13 PCGs, 2 rRNAs and a noncoding control region. The majority strand codes for 9 PCGs (cob, cox1, cox2, cox3, atp6, atp8, nad2, nad3 and nad6) and 14 tRNAs (trnM, trnI, trnW, trnL2, trnK, trnD, trnG, trnA, trnR, trnN, trnS1, trnE, trnT and trnS2), whereas the minority strand codes for 4 PCGs (nad1, nad4, nad4L and nad5), 8 tRNAs (trnC, trnF, trnH, trnL1, trnP, trnQ, trnV, trnY) and two mitochondrial rRNAs (rrnL and rrnS) (Fig. 1, Table 2). The lengths of the tRNA genes range from 64 to 75 bp (Table 2), which is well within the range of the corresponding tRNA genes of other lepidopterans: Plutella xylostella [34], Parnassius apollo [36], Leucoma salicis [7], Ephestia kuehniella [37] and Speiredonia retorta [6]. All 22 tRNAs had cloverleaf secondary structures, except trnS1, in which the dihydrouridine (DHU) arm is missing (Fig. 2). The loss of the DHU arm in tRNAs has been detected in various Lepidoptera species [6, 38, 39]. The lack of the DHU arm has been hypothesized to have evolved in response to recognition signals for seryl-tRNA synthetases, reflecting potential differences in gene expression [40, 41]. The location of rrnL is between trnV and trnL1, while rrnS is located between the control region and trnV. These are the same gene positions found in P. xylostella [34]. The lengths of rrnL and rrnS in A. conjugella are 1371 bp and 783 bp, while the lengths of these genes are 1371 bp & 783 bp, 1344 bp & 840 bp and 1413 bp & 781 bp in S. retorta, L. salicis and P. xylostella, respectively [6, 7, 34]. The rRNA genes were A + T rich (82%), falling within the range detected in other Lepidoptera species, including Agrotis segetum [42], Agrotis ipsilon [43], Spodoptera frugiperda [44], and Papilio machaon [45]. The rRNA AT and GC skewness values were found to be negative in most of the Lepidoptera mitogenomes analyzed in the study, including A. conjugella; however, in Tecia solanivora [46], Spilarctia subcarnea [47] and S. retorta [6], these values were positive. In A. conjugella, the cox1 gene starts with ATT, which is different from the start codon in the superfamily Yponomeutoidea members P. xylostella, Leucoptera malifoliella and Prays oleae, where the gene start codon is CGA. The start codon of the cox1 gene was found to be variable in other Lepidoptera species [48]. The size of this gene (1534 bp) in A. conjugella is 3 bp larger than that in these three species (P. xylostella, L. malifoliella and P. oleae) in the same superfamily. The cox2 gene (682 bp) is the same size as that of L. malifoliella but larger than that found in P. xylostella and P. oleae (679 bp), while the cox3 gene has the same size (789 bp) in all these species. The largest PCG found in the A. conjugella mitogenome is nad5 (1732 bp), and the smallest one is atp8 (162 bp). Similar results are widely reported in various insect mitogenomes [49, 50]. Alignment of the overlapping sequences of atp8 and atp6 in A. conjugella (Fig. 3) showed the conserved nucleotide sequence ATG ATA A, which is detected in most lepidopteran species [34, 51].

Table 2 The organization and characteristics of the complete mitochondrial genome of Argyresthia conjugella

Fig. 2

Predicted secondary structures of the 22 typical tRNA genes in the A. conjugella mitogenome

Fig. 3

Alignment of the atp8 and atp6 overlap in the selected lepidopteran species in the study, including A. conjugella. The green arrow shows the atp6 start codon, and the red arrow shows the atp8 stop codon

We found that the location of the trnM gene follows the ditrysian type trnM-trnI-trnQ [52], which differs from the arrangement in non-ditrysian groups of Lepidoptera and from the ancestral order trnI-trnQ-trnM, relative to which trnM is translocated [52,53,54]. The control region of A. conjugella is large (1101 bp), which is a common feature detected in the superfamily Yponomeutoidea [35]. In comparison, the CRs of the olive and diamondback moths were found to be ~ 1600 bp and ~ 1081 bp, respectively [34, 35]. We found that the CR comprises nonrepetitive sequences, including the motif ‘ATAGA’ followed by a 20 bp poly-T stretch, dinucleotide microsatellites (AT)18 and (AT)53, each flanked by ATTTA motifs, a (TAAA)4 adjacent to trnM instead of the 11 bp poly-A adjacent to tRNAs, and several imperfect repeat elements, indicating that the sequence in the present study may be partial. The nucleotide composition of the CR was highly AT-rich, with an estimated AT content of 94.3% (A: 47.6%, T: 46.7%, G: 1.8%, C: 3.9%), a positive AT skew (0.010) and a negative GC skew (− 0.368). Overall, the nucleotide composition of the apple fruit moth mitogenome was also highly AT-rich, with an estimated AT content of 82% (A: 40.8%, T: 41.2%, G: 7.4%, C: 10.6%) and negative AT and GC skews (− 0.005 and − 0.178, respectively; Table 1). These results are in agreement with results obtained in P. xylostella [34], L. salicis [7], E. kuehniella [37] and S. retorta [6].

The codon usage in A. conjugella was compared with twelve Lepidopteran species from different families (Fig. 4). The comparison showed that the pattern of codon usage in the PCGs of the A. conjugella mitogenome is very similar to the patterns in these Lepidopteran mitogenomes. Asn, Ile, Leu2, Met and Phe are the most commonly used codon families in all these species, while Cys codons are the rarest (Figs. 4 and 5). The relative synonymous codon usage (RSCU) was analysed for A. conjugella and compared with the same set of Lepidopteran insects (Fig. 6). CTG, CTC, AGG and ACG were completely absent in the A. conjugella mitogenome PCGs. Codons with high G and C contents are also rare or absent in the PCGs in other Lepidopteran mitogenomes. Moreover, TTA (Leu2), TCT (Ser2), CGT (Arg), GCT (Ala), and GGA (Gly) are the most frequently used codons and account for 36.41%. These five amino acids are also detected in other Lepidoptera species, such as Manduca sexta [55], Helicoverpa armigera [56], P. xylostella [34], T. solanivora [46], P. machaon [45], and Ostrinia nubilalis [57]. In particular, Leu2 was found to be the most frequently detected amino acid in all Lepidoptera species in the study, and this result is supported by results found in L. salicis [7] and S. retorta [6].
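As an illustration of how RSCU values of this kind are typically derived, the following minimal Python sketch counts codons in a coding sequence and divides each count by the mean count of its synonymous family. It assumes the invertebrate mitochondrial genetic code (NCBI translation table 5) and pools all codons of an amino acid into one family, whereas the text above treats Leu1/Leu2 and Ser1/Ser2 as separate families; the function name and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# NCBI translation table 5 (invertebrate mitochondrial); codons are ordered
# TTT, TTC, TTA, TTG, TCT, ... with the third position varying fastest.
TABLE5 = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), TABLE5)}


def rscu(cds: str) -> dict:
    """Relative synonymous codon usage for one coding sequence.

    RSCU = observed codon count / mean count over its synonymous family.
    Codons of amino acids that never occur in the sequence are omitted.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore an incomplete trailing codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Skip stop codons and codons containing ambiguous bases.
    counts = Counter(c for c in codons if CODON_TO_AA.get(c, "*") != "*")

    values = {}
    for aa in set(CODON_TO_AA.values()) - {"*"}:
        family = [c for c, a in CODON_TO_AA.items() if a == aa]
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence (hypothetical): three TTA codons and one CTG codon.
example = rscu("TTATTATTACTG")
print(example["TTA"], example["CTG"], example["CTC"])
```

A value above 1 indicates a codon used more often than expected under equal usage within its family, and a value of 0 corresponds to a codon that is absent, as reported above for CTG, CTC, AGG and ACG.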

Fig. 4

Comparison of codon usage of the 20 selected mitochondrial genomes of the Lepidoptera species in the study, including A. conjugella

Fig. 5

Relative synonymous codon usage (RSCU) of the 20 selected mitochondrial genomes of Lepidoptera in the study, including A. conjugella. Codons are plotted on the x-axis

Fig. 6

The distribution of codons among the selected lepidopteran species in the study. CDspT, codons per thousand codons

Phylogenetics

To obtain an overview of A. conjugella and its relationships with other Lepidoptera species, our study investigated 18 superfamilies representing 42 families and 507 Lepidoptera species (Tables S1, S2 and Figure S2). This is the first phylogenetic study (using the mt genome) of A. conjugella in the Argyresthiidae family, which belongs to the Yponomeutoidea superfamily. Various studies have tried to resolve the phylogenetic tree of Lepidoptera using mitochondrial genomes, nucleotide alignments, amino acid alignments, transcriptomes and target enrichment approaches [6, 7, 9, 17,18,19,20,21, 58]. However, inadequate node support hindered research that attempted to unravel relationships among superfamilies [17,18,19,20,21]. The challenge is not a lack of data but rather how the data are analysed, the quality of the data and the number of taxa investigated [18, 21]. We constructed a phylogenetic tree using 507 Lepidoptera species (Fig. S2) and a subset of 51 species (Fig. 7) to understand the position of A. conjugella in the Lepidoptera phylogenetic tree. Using the ML approach, analyses of the three datasets (specified in the materials & methods section) resulted in three topologies. Generally, our study agrees with the most recent study, by Rota et al. (2022) [21], which detected nine main clades of superfamilies in a butterfly and moth phylogeny using 331 genes for 200 taxa. Additionally, our phylogenetic analysis supports the previous morphological characterization of the Yponomeutoidea superfamily [16, 59, 60]. The analysis of the 507 Lepidoptera species showed that some families clustered together, such as Papilionidae & Pieridae, Pyralidae & Tortricidae, Geometridae & Sphingidae, Erebidae & Noctuidae and Gelechiidae & Sphingidae, while other families, such as Tortricidae and Crambidae, clustered alone and separately. Yponomeutoidea was recovered as a well-supported monophyletic group and as one of the earliest lepidopteran groups after Tineoidea and the basal Hepialoidea (Fig. 7, Figures S1 and S2). However, the paraphyly of Tineoidea to some extent destabilized the monophyly of Yponomeutoidea in Datasets 1 and 2 (Fig. 7, Figure S1), an issue that was fully resolved with dense taxon sampling (Figure S2). Wang et al. (2018) [61], Bao et al. (2019) [38], Jeong et al. (2022) [23] and Zhang et al. (2020) [62] all found similar results for the Yponomeutoidea and Tineoidea superfamilies. Furthermore, Bao et al. (2019) [38] and Jeong et al. (2022) [23] also found that Yponomeutoidea, Tineoidea and Gracillarioidea in Ditrysia have strong phylogenetic relationships. We also detected strong relationships between Yponomeutoidea, Zygaenidae and Tortricoidea, findings that are in line with results found by Liu et al. (2016) [48], Zhang et al. (2020) [62], Wang et al. (2018) [61], and Kim et al. (2014) [63]. Only a weak phylogenetic relationship was observed between the superfamilies Yponomeutoidea and Bombycoidea, a result that is supported by Liu et al. (2016) [64] and Liu et al. (2017) [65]. Nonetheless, we consistently recovered Argyresthiidae embedded in Yponomeutoidea with a sister-group relationship to Plutellidae (Dataset 1: SH-aLRT = 92, UFBoot2 = 100; Dataset 2: SH-aLRT = 88, UFBoot2 = 100; Dataset 3: SH-aLRT = 87, UFBoot2 = 99). Our phylogenetic tree hypothesis rejects the provisional ‘AL’ clade (Argyresthiidae + Lyonetiidae) recovered with nuclear gene datasets by Sohn et al. (2013) [16]. We found that the placement of Lyonetiidae was unstable, possibly due to its relatively long branch length. We recovered Lyonetiidae as basal to the Yponomeutoidea clade (Figure S1, Dataset 1: SH-aLRT = 99, UFBoot2 = 100), as a sister-group to Praydidae within Yponomeutoidea (Figure S2, Dataset 3: SH-aLRT = 84, UFBoot2 = 100), or as a sister-group to Gracillariidae of the superfamily Tineoidea, although with weak support (Fig. 7, Dataset 2: SH-aLRT = 43, UFBoot2 = 91). With increased taxon sampling, our phylogenetic tree hypotheses strongly supported the basal placement of Lyonetiidae within the Yponomeutoidea clade (Fig. 7, Figure S2, Dataset 2: SH-aLRT = 98, UFBoot2 = 99). Moreover, we consistently recovered the previously described pairing of Yponomeutoidea and Gracillariidae as internested subclades [16, 22]. At a higher level, our phylogenetic tree hypothesis recovers some fundamental and uncontroversial lepidopteran clades that agree with the majority of mitogenomic phylogenies as well as those that included both mitochondrial and/or nuclear markers. The analyses found that A. conjugella had the closest relationship with P. xylostella, L. malifoliella and P. oleae, which belong to the Plutellidae, Lyonetiidae and Praydidae families, respectively (Fig. 7, Figure S2). Wei et al. (2013) [34], Sohn et al. (2013) [16], Liu et al. (2016) [48], Yang et al. (2020) [66], Jeong et al. (2021) [67] and Jeong et al. (2022) [23] all found that P. xylostella, L. malifoliella and P. oleae are closely related.

Fig. 7

Phylogenetic tree including A. conjugella, constructed using the nearest neighbor interchange (NNI) approach to search for tree topology, with branch supports computed from 1000 replicates of the Shimodaira-Hasegawa approximate likelihood-ratio test (SH-aLRT) [68] and 1000 bootstrapped replicates of the ultrafast bootstrapping (UFBoot2) approach [69]. Phryganea cinerea, Phryganopsyche latipennis, Cheumatopsyche brevilineata, Limnephilus hyalinus, and Stenopsyche angustata were used as controls and outgroup species (Table 1)

In our study, the superfamily Tineoidea was represented by four species (Amorophaga japonica, Dahlica ochrostigma, Gibbovalva kobusi and Eudarcia gwangneungensis) with relatively high nodal support (Fig. 7, Figure S2). This superfamily is known to have high genetic diversity and has three different lineages [21]. The complexity of the relationships within the Tineoidea group and the disagreements concerning the superfamilies Gelechioidea, Carposinoidea, and Pterophoroidea remain unresolved issues. Both this study and that of Rota et al. (2022) [21] detected a sister relationship between Yponomeutoidea and the superfamily Tineoidea, with the subclades Gelechioidea, Tortricoidea and Zygaenoidea clustered together in the same clade. Rota et al. (2022) [21] found that Gelechioidea clustered at different positions when different analyses were performed with different datasets; this may be explained by a high amount of compositional heterogeneity or by the limited material used in that study (five species). In our study, by contrast, the 20 species from the Gelechioidea superfamily clearly clustered together using both datasets and two analyses (EME and NJ), but surprisingly, a single species (Periacma orthiodes) belonging to the superfamily Noctuoidea clustered together with this superfamily. This might reflect misidentification of the taxon of the mt genome deposited in GenBank. Our study showed that Gelechioidea grouped together with Pyralidae, in agreement with the results of [17]. Pyraloidea was also sister to Carposinoidea, and Calliduloidea, Pterophoroidea, Gelechioidea and Thyridoidea were recovered in the same part of the tree, but with Thyridoidea sister to Macroheterocera [21]. Previously, Pterophoroidea was reported as a sister group to a monophyletic Papilionoidea, which included Hedyloidea and Hesperioidea. In the same study, Choreutoidea and Immoidea were recovered as sister to Tortricoidea [21]. However, when 50 genes were removed, Choreutoidea was recovered as sister to Urodoidea and Pterophoroidea [21]. One phylogenetic study reported Pterophoroidea within the clade Obtectomera [19], but a more recent study showed results contrary to these findings [21]. The position of Pterophoroidea is highly dependent on the dataset. This superfamily is recovered in the same clade as Urodoidea regardless of the alignment analysed, whereas its recovery in the clade with Gelechioidea, Calliduloidea and Thyridoidea depends on which datasets are analysed [21]. Pterophoroidea can also be recovered as sister to Papilionoidea and Noctuoidea when different datasets are used [70]. It should also be noted that systematic errors and alignment issues can persist when detecting homologies, even when using software designed to assess alignment quality against a threshold of alignment scores [71].

Comprehensive analyses of insect mitogenomes provide important phylogenetic information to identify potentially novel genes that may serve as valuable targets in future research efforts. Further investigations of the whole genome of A. conjugella along with other genomes of Lepidoptera species will facilitate the understanding of the taxonomy and evolutionary process acting on the Ditrysia natural group.

Materials and methods

Specimen collection and DNA extraction

During August 2016, we collected a single female apple fruit moth larva from an infested rowan berry in the field in Skiftenes (N 6471746 and E 472502) in southern Norway. To confirm species identification of the larva, we employed both morphological [24, 65] and molecular methods [28] using microscopy and STR markers, respectively. We placed the apple fruit moth larva on rolls of corrugated cardboard until it entered pupal diapause, and then we stored it at -80 °C until DNA was extracted. DNA was extracted from the apple fruit moth pupal tissue using the DNeasy Blood and Tissue Kit (Qiagen, Tokyo) following a modified version of the manufacturer’s instructions [28].

Mitogenome sequencing and assembly

We outsourced the whole genome sequencing of the apple fruit moth to the Norwegian Sequencing Centre (Oslo, Norway), where the whole genome library was prepared (insert size = 350 bp) and sequenced on one lane of the Illumina HiSeq 4000 platform (Illumina, USA) with paired end (PE) sequencing (2 × 150 bp). A total of 820,368,162 raw reads (of which 820,365,390 were paired in sequencing, i.e., 410,182,695 PE read clusters) were generated. We evaluated the quality of the Illumina sequencing run using MultiQC v.2.31 [72]. Then, we used AdapterRemoval v.2.1.3 to search for and remove adapter sequences and to trim low-quality bases from the 3' end of reads following adapter removal [73, 74]. After quality control (QC), we used the cleaned PE reads for mitogenome assembly by means of two assembly strategies: (i) de novo and (ii) ‘baited’ de novo.

For de novo sequence assembly (i), we employed the programs Norgal v.1.0 [75] and SPAdes v.3.15.3 [76]. The Norgal assembler was executed with (1) default parameters and (2) a k-mer range of 21–255 with an interval of 28 and a contig length threshold of 500 bp. We executed SPAdes with k-mer sizes from 21 to 127 (-k 21,33,55,77,99,127) and with the --careful option to minimize the number of mismatches in the final contigs. For the baited de novo assembly strategy (ii), we first constructed the FM-index for the mitogenome reference sequence of the olive moth P. oleae (Bernard, 1788; Lepidoptera: Yponomeutoidea: Praydidae) (NCBI accession number NC_025948.1) [35] using the index command of the BWA v.0.7.17 aligner [77]. We also used the mitogenome reference sequence of the diamondback moth P. xylostella [34] (Linnaeus, 1758; Lepidoptera: Yponomeutoidea: Plutellidae) (NCBI accession number JF911819.1). We selected the mitogenomes of the olive and diamondback moths as references due to their completeness, their taxonomic and phylogenetic placement in Yponomeutoidea, and the reliability of the PCR-based amplification method used to sequence these mitogenomes (14 segments of 1.2–2.4 kb and nine segments of variable size, respectively). Second, we aligned the cleaned PE reads of the apple fruit moth separately to the indexed reference genome of each of the ‘model’ moths using the BWA-MEM algorithm of BWA, excluding reads with a quality score of < 30, and then used the SAMtools v.1.9 suite [78] to convert the SAM alignment file to BAM format. Third, we sorted and indexed the BAM alignment file using the sort and index commands, respectively, from SAMtools. Fourth, we obtained the QC statistics for the sorted and indexed alignment using BAMQC as implemented in Qualimap v.2.2.20 [79]. Fifth, we extracted reads that mapped properly as pairs using SAMtools. Finally, we used the mitochondrial filtered reads for de novo mitochondrial genome assembly using Norgal, SPAdes and Geneious Prime® v.2022.1.1 (Biomatters Ltd., Auckland, New Zealand) [80].
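The read-baiting steps described above can be sketched as a sequence of command lines. The Python snippet below only builds and prints the commands; it does not run them. It is a minimal sketch under stated assumptions: all file names and the output prefix are hypothetical, the quality filter is interpreted here as a mapping-quality cut-off (samtools view -q 30), properly paired reads are selected with the SAM flag filter -f 2, and the Qualimap step and the conversion of the baited reads back to FASTQ are omitted. The exact commands used in the study are not given in the text.

```python
import shlex


def baited_assembly_commands(ref, r1, r2, prefix="afm"):
    """Return argument lists for indexing, mapping and read extraction."""
    sam = f"{prefix}.sam"
    bam = f"{prefix}.bam"
    sorted_bam = f"{prefix}.sorted.bam"
    proper_bam = f"{prefix}.proper_pairs.bam"
    return [
        ["bwa", "index", ref],                                   # build the FM-index
        ["bwa", "mem", "-o", sam, ref, r1, r2],                  # BWA-MEM alignment
        ["samtools", "view", "-b", "-q", "30", "-o", bam, sam],  # SAM to BAM, MAPQ >= 30 (assumed)
        ["samtools", "sort", "-o", sorted_bam, bam],             # coordinate sort
        ["samtools", "index", sorted_bam],                       # index the sorted BAM
        ["samtools", "view", "-b", "-f", "2", "-o", proper_bam, sorted_bam],  # properly paired reads
    ]


def spades_command(r1, r2, outdir):
    """SPAdes call with the k-mer sizes and --careful option named in the text."""
    return ["spades.py", "-1", r1, "-2", r2,
            "-k", "21,33,55,77,99,127", "--careful", "-o", outdir]


# Hypothetical input files for illustration only.
steps = baited_assembly_commands("P_oleae_NC_025948.1.fasta",
                                 "reads_R1.fastq.gz", "reads_R2.fastq.gz")
for cmd in steps + [spades_command("baited_R1.fastq", "baited_R2.fastq", "spades_baited")]:
    print(shlex.join(cmd))
```

Keeping the commands as argument lists makes it straightforward to pass each one to subprocess.run once the paths have been checked.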

Mitogenome annotation and visualization

We conducted a preliminary annotation of the mitogenome assembly using the MITOS2 webserver (http://mitos2.bioinf.uni-leipzig.de/index.py) [81] to assess the location of protein-coding genes (PCGs), transfer RNA genes (tRNAs), and ribosomal RNA genes (rRNAs). Then, we confirmed gene boundaries for PCGs and rRNAs manually using BLASTn, SMART BLAST, BLASTp, and ORF Finder as implemented at the National Center for Biotechnology Information (NCBI) database [82]. Subsequently, we also validated that coding sequences were translated in the correct reading frame and confirmed the initiation and termination codons in Geneious Prime® using the published mitochondrial genome sequences of other moths as references, including the olive moth. We then used the program ARWEN [83] to detect the tRNA genes of the apple fruit moth and finally predicted the secondary structures of tRNAs using MITFI [84] as implemented in MITOS2 and tRNAscan-SE v.2.0 [85]. We also annotated the control (A + T-rich) region (CR) of the apple fruit moth by screening for structural elements characteristic of the region, which include (i) tandem repeats, identified using Tandem Repeats Finder v.4.10 [86] with default settings, and (ii) the motif ‘ATAGA’ and poly-T stretch. We produced the annotated circular map of the complete mitochondrial genome of the apple fruit moth using the beta version of the CGview server (http://cgview.ca) [87]. The secondary structure of tRNAs was predicted using tRNAscan-SE-2.0 [88].
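For illustration, the screening of the control region for the ‘ATAGA’ motif with a downstream poly-T stretch and for (AT)n microsatellites can be approximated with regular expressions. This is a simplified stand-in and not the Tandem Repeats Finder procedure used in the study; the function names, the minimum run lengths and the allowed gap are hypothetical choices, and the toy sequence is invented.

```python
import re


def find_ataga_poly_t(seq, min_t=10, max_gap=5):
    """Locate 'ATAGA' followed, within max_gap bases, by a run of >= min_t Ts.

    Returns (motif_start, poly_t_start, poly_t_length) tuples (0-based).
    """
    pattern = re.compile(r"ATAGA[ACGT]{0,%d}?(T{%d,})" % (max_gap, min_t))
    return [(m.start(), m.start(1), len(m.group(1)))
            for m in pattern.finditer(seq.upper())]


def find_at_microsatellites(seq, min_repeats=5):
    """Locate (AT)n runs with n >= min_repeats.

    Returns (start, end, repeat_count) tuples (0-based, end exclusive).
    """
    pattern = re.compile(r"(?:AT){%d,}" % min_repeats)
    return [(m.start(), m.end(), (m.end() - m.start()) // 2)
            for m in pattern.finditer(seq.upper())]


# Toy control-region fragment (hypothetical, not the A. conjugella sequence).
toy = "GG" + "ATAGA" + "T" * 20 + "CC" + "AT" * 6 + "GG"
print(find_ataga_poly_t(toy))        # [(2, 7, 20)]
print(find_at_microsatellites(toy))  # [(29, 41, 6)]
```

Exact-match patterns of this kind miss imperfect repeats, which is one reason a dedicated repeat finder is preferable for the full annotation.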

Comparative mitogenomics of Lepidoptera

We conducted a systematic and comprehensive search for complete mitochondrial genomes of Lepidopteran species published in the NCBI nucleotide database using the following keywords: (“lepidoptera”[Organism] OR “lepidoptera”[All Fields]) AND “complete mitochondrial genome”[All Fields] AND mitochondrion[filter] (10 May 2022: 842 hits). We downloaded and processed the full GenBank files in Geneious Prime® to (i) obtain taxonomy metadata, (ii) remove least recently modified duplicates, (iii) remove nonlepidopteran species, and (iv) remove mitogenomes with > 90% missing annotations (retained 507 species). To ensure that the taxonomic status of all species was the latest, we verified all the species names against 60 taxonomic databases, including the Catalogue of Life and the Integrated Taxonomic Information System (ITIS), using the R package taxize v.9.94.91 [89]. Then, we corrected any misspellings and used the classification function implemented in taxize to retrieve the taxonomic ranks of individual species. We included seven species of Trichoptera, representing five families and four superfamilies, to serve as outgroups.

We compared the assembled A. conjugella mitochondrial genome with the mitochondrial genomes of 507 other Lepidoptera obtained from GenBank, representing 18 superfamilies and 42 families (Supplementary Table S2). We included only one representative per valid species (longest mitogenome sequence) when more than one known sequence was available in GenBank. We calculated the overall composition of individual mitogenomes based on the proportion of A + T out of the total (%AT content) using MEGA v.11.0.11 [90]. To measure the base composition skewness of nucleotide sequences, we used the formulae of Perna and Kocher (1995) [91]: AT-skew = [A-T]/[A + T] and GC-skew = [G-C]/[G + C].
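The skew formulae above can be illustrated with a short Python sketch. The function names are hypothetical; the percentages passed to skew() are the whole-mitogenome and control-region base compositions reported in the Results, so the printed values can be compared directly with the skews given there.

```python
def skew(x: float, y: float) -> float:
    """Skew statistic (x - y) / (x + y), as in Perna and Kocher (1995)."""
    if x + y == 0:
        raise ValueError("both counts are zero; skew is undefined")
    return (x - y) / (x + y)


def base_composition(seq: str) -> dict:
    """Return %AT content, AT-skew and GC-skew for a nucleotide sequence."""
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    total = a + t + g + c  # ambiguous bases such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "at_percent": 100.0 * (a + t) / total,
        "at_skew": skew(a, t),
        "gc_skew": skew(g, c),
    }


# Whole mitogenome (A: 40.8%, T: 41.2%, G: 7.4%, C: 10.6%)
print(round(skew(40.8, 41.2), 3), round(skew(7.4, 10.6), 3))  # -0.005 -0.178
# Control region (A: 47.6%, T: 46.7%, G: 1.8%, C: 3.9%)
print(round(skew(47.6, 46.7), 3), round(skew(1.8, 3.9), 3))   # 0.01 -0.368
```

Because the skew is a ratio, it gives the same result whether raw base counts or percentages are supplied.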

Sequence alignment and phylogenetic reconstruction

We produced codon-aware multiple sequence alignments for each of the 13 PCGs using MACSE v.2.01 [92]. We inspected and manually trimmed each alignment in MEGA, and then removed any remaining ambiguously aligned sites using BMGE v.1.12.1 with a sliding window size of 3 and a maximum entropy of 0.5 [93]. We aligned the rRNA genes using the online version of MAFFT v.7.299 [94, 95] and removed ambiguously aligned sites using BMGE. Before phylogenetic analysis, we produced two concatenated mitogenomic datasets with the R package concatipede v1.0.1 [96]: (i) the aligned individual PCG datasets (Dataset 1: 13PCGs_NT dataset) and (ii) the 13 PCGs plus the large and small mitochondrial ribosomal RNA (rRNA) genes (rrnL and rrnS) (Dataset 2: 13PCGs_rRNAs_NT dataset). We derived the third mitogenomic dataset by translating the 13PCGs_NT dataset in MEGA (Dataset 3: 13PCGs_AA dataset). Furthermore, we used DAMBE v.7.2.141 [98] to conduct two-tailed tests of substitution saturation [97] for each codon position of the 13 PCGs, taking into account the proportion of invariant sites as recommended by Xia and Lemey (2009) [99]. According to the observed index of substitution saturation (ISS), all codon positions showed little saturation (ISS < ISScSym (assuming a symmetrical topology) and ISS < ISScAsym (assuming an asymmetrical topology); see Supplementary Table S2). Likewise, plotting transitions and transversions against Kimura two-parameter [100] distances in DAMBE for each codon position of the 13 PCGs showed little saturation at all codon positions. Therefore, no codon positions were excluded, and the 13-PCG nucleotide (Dataset 1) and protein (Dataset 3) datasets were initially partitioned by gene and codon position (39 partitions) and by gene (13 partitions), respectively. For Dataset 2, we designated two partitions for the rRNA genes (rrnS and rrnL, each treated as a single partition) and 39 partitions covering the three codon positions in each of the 13 protein-coding genes.
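The initial partitioning described above (three codon-position partitions per PCG, plus one partition per rRNA gene) can be sketched as follows. This is an illustrative sketch only: the helper names `codon_partitions` and `to_nexus` and the gene lengths in the example are hypothetical, and the NEXUS `charset` layout is one common way of describing partitions, not necessarily the file the authors used.

```python
def codon_partitions(gene_lengths, rrna_lengths=None):
    """Build (name, start, end, step) partitions with 1-based inclusive coordinates.

    gene_lengths: dict of PCG name -> aligned length (must be a multiple of 3),
                  in concatenation order.
    rrna_lengths: optional dict of rRNA gene name -> aligned length, appended
                  after the PCGs, each treated as a single partition.
    """
    parts = []
    offset = 0
    for gene, length in gene_lengths.items():
        if length % 3 != 0:
            raise ValueError(f"{gene}: codon alignment length must be a multiple of 3")
        for pos in (1, 2, 3):
            # every third site, starting at codon position `pos` of this gene
            parts.append((f"{gene}_pos{pos}", offset + pos, offset + length, 3))
        offset += length
    for gene, length in (rrna_lengths or {}).items():
        parts.append((gene, offset + 1, offset + length, 1))
        offset += length
    return parts


def to_nexus(parts):
    """Render the partitions as a NEXUS sets block with one charset per partition."""
    lines = ["#nexus", "begin sets;"]
    for name, start, end, step in parts:
        stride = f"\\{step}" if step > 1 else ""
        lines.append(f"    charset {name} = {start}-{end}{stride};")
    lines.append("end;")
    return "\n".join(lines)


# Hypothetical aligned lengths for two PCGs and one rRNA gene (not from the study)
example = codon_partitions({"cox1": 9, "cox2": 6}, {"rrnL": 10})
nexus_text = to_nexus(example)
```

With 13 PCGs this yields 39 codon-position partitions, and adding rrnS and rrnL gives the 41 starting partitions of Dataset 2.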

We used ModelFinder [101] to select the best-fitting partitioning scheme and models of evolution using the corrected Akaike Information Criterion (AICc) and the edge-linked proportional partition model [102], as implemented in IQ-TREE v.2.2.0.3 [103]. We applied the new model selection procedure (-m MF+MERGE), which additionally implements the FreeRate heterogeneity model, inferring site rates directly from the data instead of drawing them from a gamma distribution (-cmax 20) [104]. To reduce the computational burden, only the top 30% of partition merging schemes were inspected using the relaxed clustering algorithm (-rcluster 30), as described in [105].
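To make the selection criterion concrete, the sketch below computes AICc for competing partitioning schemes and picks the one with the lowest score. It is a simplified illustration, not ModelFinder itself: the function names, the use of alignment length as the sample size, and the log-likelihoods and parameter counts in the example are all assumptions.

```python
def aicc(log_likelihood: float, n_params: int, n_sites: int) -> float:
    """Corrected Akaike Information Criterion.

    AICc = 2k - 2lnL + 2k(k + 1)/(n - k - 1),
    with k free parameters and n taken here as the number of alignment sites.
    """
    if n_sites - n_params - 1 <= 0:
        raise ValueError("AICc is undefined when n_sites <= n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    return aic + (2 * n_params * (n_params + 1)) / (n_sites - n_params - 1)


def best_scheme(candidates: dict, n_sites: int) -> str:
    """Return the name of the scheme with the lowest AICc.

    candidates maps a scheme name to (log_likelihood, n_params).
    """
    return min(candidates, key=lambda name: aicc(*candidates[name], n_sites))


# Hypothetical values: merging partitions costs some likelihood but saves parameters
schemes = {"unmerged": (-1000.0, 50), "merged": (-1005.0, 20)}
chosen = best_scheme(schemes, n_sites=500)
```

Merging partitions is favoured whenever the reduction in parameters outweighs the loss in log-likelihood, which is the trade-off the partition-merging search explores.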

We reconstructed phylogenies under the maximum likelihood (ML) criterion in IQ-TREE, using the substitution models indicated by ModelFinder (Table 3). We used the nearest neighbor interchange (NNI) approach to search tree topologies and computed branch supports with 1000 replicates of the Shimodaira-Hasegawa approximate likelihood-ratio test (SH-aLRT) [68] and 1000 replicates of the ultrafast bootstrap (UFBoot2) approach [69]. We followed the recommendation that clades with UFBoot2 ≥ 95 and SH-aLRT ≥ 80 can be regarded as well supported [106].
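The support thresholds above can be applied to a tree file with a short script. The sketch below assumes internal node labels of the form SH-aLRT/UFBoot (for example `98.5/100`), which is how IQ-TREE labels branches when both supports are requested; the toy Newick string and taxon names are hypothetical, and the regular expression is a simplification rather than a full Newick parser.

```python
import re

# Matches a "SH-aLRT/UFBoot" label that directly follows a closing parenthesis
SUPPORT = re.compile(r"\)(\d+(?:\.\d+)?)/(\d+(?:\.\d+)?)")


def well_supported(newick: str, min_alrt: float = 80.0, min_ufboot: float = 95.0):
    """Return (sh_alrt, ufboot, is_well_supported) for each labelled internal node."""
    results = []
    for alrt_text, ufboot_text in SUPPORT.findall(newick):
        alrt, ufboot = float(alrt_text), float(ufboot_text)
        results.append((alrt, ufboot, alrt >= min_alrt and ufboot >= min_ufboot))
    return results


# Toy tree with hypothetical taxa A-E; not a result from the study
tree = "((A:0.1,B:0.2)98.5/100:0.05,(C:0.1,D:0.1)75.2/96:0.03,E:0.3);"
supports = well_supported(tree)
```

In this toy tree the first clade passes both thresholds, while the second fails because its SH-aLRT value is below 80 even though its UFBoot2 value is above 95.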

Table 3 Substitution models selected by ModelFinder [101] and used to reconstruct phylogenies under the maximum-likelihood (ML) criterion in IQ-TREE

Availability of data and materials

The A. conjugella mitochondrial genome has been deposited in GenBank under accession ON496993; Fig. 1 (https://github.com/Simo-N-Maduna/Mito-Phylogenomics/tree/main/Mitophylogenomics_PartII_Lepidoptera). The 507 mitogenomes used in the study were downloaded from GenBank; their accession numbers and references are listed in Table S1. Other supporting results are included within the article and its additional files.

References

  1. van Nieukerken E. Order Lepidoptera Linnaeus, 1758. Zootaxa 3148. Magnolia Press; 2011.

  2. Regier JC, Zwick A, Cummings MP, Kawahara AY, Cho S, Weller S, Roe A, Baixeras J, Brown JW, Parr C. Toward reconstructing the evolution of advanced moths and butterflies (Lepidoptera: Ditrysia): an initial molecular study. BMC Evol Biol. 2009;9(1):1–21.

  3. Xu C, Pan Z-Q, Nie L, Hao J-S. The complete mitochondrial genome of Issoria eugenia (Lepidoptera: Nymphalidae: Heliconiinae). Mitochondrial DNA Part B. 2019;4(1):1662–3.

  4. Chen D-B, Zhang R-S, Jin X-D, Yang J, Li P, Liu Y-Q. First complete mitochondrial genome of Rhodinia species (Lepidoptera: Saturniidae): genome description and phylogenetic implication. Bull Entomol Res. 2022;112(2):243–52.

  5. Liu X, Qi M, Xu H, Wu Z, Hu L, Yang M, Li H. Nine Mitochondrial Genomes of the Pyraloidea and Their Phylogenetic Implications (Lepidoptera). Insects. 2021;12(11):1039.

  6. Sun Y, Huang H, Liu YD, Liu SS, Xia J, Zhang K, Geng J. Organization and phylogenetic relationships of the mitochondrial genomes of Speiredonia retorta and other lepidopteran insects. Sci Rep. 2021;11(1):2957.

  7. Sun YX, Wang L, Wei GQ, Qian C, Dai LS, Sun Y, Abbas MN, Zhu BJ, Liu CL. Characterization of the complete mitochondrial genome of Leucoma salicis (Lepidoptera: Lymantriidae) and comparison with other lepidopteran insects. Sci Rep. 2016;6:39153.

  8. Singh D, Kabiraj D, Sharma P, Chetia H, Mosahari PV, Neog K, Bora U. The mitochondrial genome of Muga silkworm (Antheraea assamensis) and its comparative analysis with other lepidopteran insects. PLoS One. 2017;12(11):e0188077.

  9. Wei ZX, Sun G, Shiu JY, Fang Y, Shi QH. The complete mitochondrial genome sequence of Dodona eugenes (Lepidoptera: Riodinidae). Mitochondrial DNA B Resour. 2021;6(3):816–8.

  10. Wahlberg N, Wheat CW. Genomic outposts serve the phylogenomic pioneers: designing novel nuclear markers for genomic DNA extractions of Lepidoptera. Syst Biol. 2008;57(2):231–42.

  11. Cho S, Zwick A, Regier JC, Mitter C, Cummings MP, Yao J, Du Z, Zhao H, Kawahara AY, Weller S. Can deliberately incomplete gene sample augmentation improve a phylogeny estimate for the advanced moths and butterflies (Hexapoda: Lepidoptera)? Syst Biol. 2011;60(6):782–96.

  12. Kawahara AY, Ohshima I, Kawakita A, Regier JC, Mitter C, Cummings MP, Davis DR, Wagner DL, De Prins J, Lopez-Vaamonde C. Increased gene sampling strengthens support for higher-level groups within leaf-mining moths and relatives (Lepidoptera: Gracillariidae). BMC Evol Biol. 2011;11(1):1–14.

  13. Zwick A, Regier JC, Mitter C, Cummings MP. Increased gene sampling yields robust support for higher-level clades within Bombycoidea (Lepidoptera). Syst Entomol. 2011;36(1):31–43.

  14. Regier JC, Mitter C, Zwick A, Bazinet AL, Cummings MP, Kawahara AY, Sohn J-C, Zwickl DJ, Cho S, Davis DR. A large-scale, higher-level, molecular phylogenetic study of the insect order Lepidoptera (moths and butterflies). PLoS One. 2013;8(3):e58568.

  15. Regier JC, Mitter C, Davis DR, Harrison TL, Sohn JC, Cummings MP, Zwick A, Mitter KT. A molecular phylogeny and revised classification for the oldest ditrysian moth lineages (Lepidoptera: Tineoidea), with implications for ancestral feeding habits of the mega-diverse Ditrysia. Syst Entomol. 2015;40(2):409–32.

  16. Sohn JC, Regier JC, Mitter C, Davis D, Landry JF, Zwick A, Cummings MP. A molecular phylogeny for Yponomeutoidea (Insecta, Lepidoptera, Ditrysia) and its implications for classification, biogeography and the evolution of host plant use. Plos One. 2013;8(1):e55066.

  17. Kawahara AY, Breinholt JW. Phylogenomics provides strong evidence for relationships of butterflies and moths. Proc Royal Society B: Biol Sci. 2014;281(1788):20140970.

  18. Breinholt JW, Earl C, Lemmon AR, Lemmon EM, Xiao L, Kawahara AY. Resolving relationships among the megadiverse butterflies and moths with a novel pipeline for anchored phylogenomics. Syst Biol. 2018;67(1):78–93.

  19. Kawahara AY, Plotkin D, Espeland M, Meusemann K, Toussaint EF, Donath A, Gimnich F, Frandsen PB, Zwick A, Dos Reis M. Phylogenomics reveals the evolutionary timing and pattern of butterflies and moths. Proc Natl Acad Sci. 2019;116(45):22657–63.

  20. Mayer C, Dietz L, Call E, Kukowka S, Martin S, Espeland M. Adding leaves to the Lepidoptera tree: capturing hundreds of nuclear genes from old museum specimens. Syst Entomol. 2021;46(3):649–71.

  21. Rota J, Twort V, Chiocchio A, Peña C, Wheat CW, Kaila L, Wahlberg N. The unresolved phylogenomic tree of butterflies and moths (Lepidoptera): Assessing the potential causes and consequences. Syst Entomol. 2022;47(4):531–50.

  22. Mitter C, Davis DR, Cummings MP. Phylogeny and evolution of Lepidoptera. Annu Rev Entomol. 2017;62:265–83.

  23. Jeong JS, Park JS, Sohn J-C, Kim MJ, Oh HK, Kim I. The first complete mitochondrial genome in the family Attevidae (Atteva aurea) of the order Lepidoptera. Biodiversity Data J. 2022;10:e89982.

  24. Ahlberg O. Ronnbarsmalen, Argyresthia conjugella Zell. En redogorelse for undersokningar aren 1921–1926. Meddel Nr 324 fran Centralanstalten for forsoksvasendet pa jordbruksomradet 1927.

  25. Agassiz D. British Argyresthiinae and Yponomeutinae. In: Proceedings and Transactions of the British Entomological and Natural History Society. 1987. p. 1987.

  26. Kobro S, Søreide L, Djønne E, Rafoss T, Jaastad G, Witzgall P. Masting of rowan Sorbus aucuparia L. and consequences for the apple fruit moth Argyresthia conjugella Zeller. Population Ecology. 2003;45(1):25–30.

  27. Bengtsson M, Jaastad G, Knudsen G, Kobro S, Bäckman AC, Pettersson E, Witzgall P. Plant volatiles mediate attraction to host and non-host plant in apple fruit moth, Argyresthia conjugella. Entomologia Experimentalis et Applicata. 2006;118(1):77–85.

  28. Elameen A, Eiken HG, Floystad I, Knudsen G, Hagen SB. Monitoring of the Apple Fruit Moth: Detection of Genetic Variation and Structure Applying a Novel Multiplex Set of 19 STR Markers. Molecules. 2018;23(4):14.

  29. Elameen A, Eiken HG, Knudsen GK. Genetic Diversity in Apple Fruit Moth Indicate Different Clusters in the Two Most Important Apple Growing Regions of Norway. Diversity-Basel. 2016;8(2):12.

  30. Elameen A, Klutsch CFC, Floystad I, Knudsen GK, Tasin M, Hagen SB, Eiken HG. Large-scale genetic admixture suggests high dispersal in an insect pest, the apple fruit moth. PLoS ONE. 2020;15(8):24.

  31. Zhang D-X, Szymura JM, Hewitt GM. Evolution and structural conservation of the control region of insect mitochondrial DNA. J Mol Evol. 1995;40:382–91.

  32. Zhang D-X, Hewitt GM. Insect mitochondrial control region: a review of its structure, evolution and usefulness in evolutionary studies. Biochem Syst Ecol. 1997;25(2):99–120.

  33. Wu YP, Zhao JL, Su TJ, Li J, Yu F, Chesters D, Fan RJ, Chen MC, Wu CS, Zhu CD. The Complete Mitochondrial Genome of Leucoptera malifoliella Costa (Lepidoptera: Lyonetiidae). DNA Cell Biol. 2012;31(10):1508–22.

  34. Wei SJ, Shi BC, Gong YJ, Li Q, Chen XX. Characterization of the Mitochondrial Genome of the Diamondback Moth Plutella xylostella (Lepidoptera: Plutellidae) and Phylogenetic Analysis of Advanced Moths and Butterflies. DNA Cell Biol. 2013;32(4):173–87.

  35. van Asch B, Blibech I, Pereira-Castro I, Rei FT, da Costa LT. The mitochondrial genome of Prays oleae (Insecta: Lepidoptera: Praydidae). Mitochondrial DNA Part A. 2016;27(3):2108–9.

  36. Chen YH, Huang DY, Wang YL, Zhu CD, Hao JS. The complete mitochondrial genome of the endangered Apollo butterfly, Parnassius apollo (Lepidoptera: Papilionidae) and its comparison to other Papilionidae species. Journal of Asia-Pacific Entomology. 2014;17(4):663–71.

  37. Lammermann K, Vogel H, Traut W. The mitochondrial genome of the Mediterranean flour moth, Ephestia kuehniella (Lepidoptera: Pyralidae), and identification of invading mitochondrial sequences (numts) in the W chromosome. European Journal of Entomology. 2016;113:482–8.

  38. Bao L, Zhang YH, Gu X, Gao YF, Yu YB. The complete mitochondrial genome of Eterusia aedea (Lepidoptera, Zygaenidae) and comparison with other zygaenid moths. Genomics. 2019;111(5):1043–52.

  39. Park B, Hwang UW. The complete mitochondrial genome of the woodwasp Euxiphydria potanini (Hymenoptera, Xiphydrioidea) and phylogenetic implications for symphytans. Sci Rep. 2022;12(1):17677.

  40. Bessa MH, de Re FC, de Moura RD, Loreto EL, Robe LJ. Comparative mitogenomics of Drosophilidae and the evolution of the Zygothrica genus group (Diptera, Drosophilidae). Genetica. 2021;149(5–6):267–81.

  41. Watanabe Y, Suematsu T, Ohtsuki T. Losing the stem-loop structure from metazoan mitochondrial tRNAs and co-evolution of interacting factors. Front Genet. 2014;5:109.

  42. Wu QL, Cui WX, Du BZ, Gu Y, Wei SJ. The complete mitogenome of the turnip moth Agrotis segetum (Lepidoptera: Noctuidae). Mitochondrial DNA. 2014;25(5):345–7.

  43. Wu QL, Cui WX, Wei SJ. Characterization of the complete mitochondrial genome of the black cutworm Agrotis ipsilon (Lepidoptera: Noctuidae). Mitochondrial DNA. 2015;26(1):139–40.

  44. Liu TT, Li ZF. Phylogenetic and taxonomic study of the complete mitochondrial genome of Spodoptera frugiperda. Mitochondrial DNA Part B-Resources. 2019;4(2):2759–61.

  45. Pan ZQ, Xu C, Nie L, Hao JS. The complete mitochondrial genome of the Papilio machaon annae Gistel (Lepidoptera: Papilionidae: Papilioninae). Mitochondrial DNA Part B-Resources. 2019;4(1):1945–6.

  46. Ramírez-Ríos V, Franco-Sierra ND, Alvarez JC, Saldamando-Benjumea CI, Villanueva-Mejía DF. Mitochondrial genome characterization of Tecia solanivora (Lepidoptera: Gelechiidae) and its phylogenetic relationship with other lepidopteran insects. Gene. 2016;581(2):107–16.

  47. Xin Z-Z, Liu Y, Zhang D-Z, Wang Z-F, Tang B-P, Zhang H-B, Zhou C-L, Chai X-Y, Liu Q-N. Comparative mitochondrial genome analysis of Spilarctia subcarnea and other noctuid insects. Int J Biol Macromol. 2018;107:121–8.

  48. Liu QN, Chai XY, Bian DD, Zhou CL, Tang BP. The complete mitochondrial genome of Plodia interpunctella (Lepidoptera: Pyralidae) and comparison with other Pyraloidea insects. Genome. 2016;59(1):37–49.

  49. Kabiraj D, Chetia H, Nath A, Sharma P, Mosahari PV, Singh D, Dutta P, Neog K, Bora U. Mitogenome-wise codon usage pattern from comparative analysis of the first mitogenome of Blepharipa sp. (Muga uzifly) with other Oestroid flies. Sci Rep. 2022;12(1):7028.

  50. Yi JQ, Wu H, Liu JB, Li JH, Lu YL, Zhang YF, Cheng YJ, Guo Y, Li DS, An YX. Novel gene rearrangement in the mitochondrial genome of Anastatus fulloi (Hymenoptera Chalcidoidea) and phylogenetic implications for Chalcidoidea. Sci Rep. 2022;12(1):1351.

  51. Chen SC, Zhao FH, Jiang HY, Hu X, Wang XQ. The complete mitochondrial genome of the bagworm from a tea plantation in China, Eumeta variegata (Lepidoptera: Psychidae). Mitochondrial DNA Part B-Resources. 2021;6(3):875–7.

  52. Boore JL, Daehler LL, Brown WM. Complete sequence, gene arrangement, and genetic code of mitochondrial DNA of the cephalochordate Branchiostoma floridae (Amphioxus). Mol Biol Evol. 1999;16(3):410–8.

  53. Wang YL, Peng CM, Yao QL, Shi QH, Hao JS. The complete mitochondrial genome of Gonepteryx rhamni (Lepidoptera: Pieridae: Coliadinae). Mitochondrial DNA. 2015;26(5):791–2.

  54. Cao YQ, Ma CA, Chen JY, Yang DR. The complete mitochondrial genomes of two ghost moths, Thitarodes renzhiensis and Thitarodes yunnanensis: the ancestral gene arrangement in Lepidoptera. BMC Genomics. 2012;13:276.

  55. Cameron SL, Whiting MF. The complete mitochondrial genome of the tobacco hornworm, Manduca sexta, (Insecta : Lepidoptera : Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths. Gene. 2008;408(1–2):112–23.

  56. Yin JA, Hong GY, Wang AM, Cao YZ, Wei ZJ. Mitochondrial genome of the cotton bollworm Helicoverpa armigera (Lepidoptera: Noctuidae) and comparison with other Lepidopterans. Mitochondrial DNA. 2010;21(5):160–9.

  57. Zhou N, Dong YL, Qiao PP, Yang ZF. Complete mitogenomic structure and phylogenetic implications of the genus Ostrinia (Lepidoptera: Crambidae). Insects. 2020;11(4):232.

  58. Liao C-Q, Yagi S, Chen L, Chen Q, Hirowatari T, Wang X, Wang M, Huang G-H. Higher-level phylogeny and evolutionary history of nonditrysians (Lepidoptera) inferred from mitochondrial genome sequences. Zool J Linn Soc. 2023;198(2):476–93.

  59. Kyrki J. The Yponomeutoidea: a reassessment of the superfamily and its suprageneric groups (Lepidoptera). Insect Syst Evol. 1984;15(1):71–84.

  60. Kyrki J. Tentative reclassification of holarctic Yponomeutoidea (Lepidoptera). Nota lepidopterologica. 1990;13(1):28–42.

  61. Wang ZH, Yao S, Zhu XY, Hao JS. The complete mitochondrial genome of Pidorus atratus (Lepidoptera: Zygaenoidea: Zygaenidae). Mitochondrial DNA Part B-Res. 2018;3(1):448-+.

  62. Zhang XY, Tang L, Chen J, You P. The complete mitochondrial genome of Amesia sanguiflua (Lepidoptera, Zygaenidae). Mitochondrial DNA Part B-Res. 2020;5(1):988–9.

  63. Kim MJ, Wang AR, Park JS, Kim I. Complete mitochondrial genomes of five skippers (Lepidoptera: Hesperiidae) and phylogenetic reconstruction of Lepidoptera. Gene. 2014;549(1):97–112.

  64. Liu GQ, Bi GQ, Du QW, Zhao EZ, Yang JQ, Zhang Z, Shang EL. Complete mitochondrial genome of Plodia interpunctella (Lepidoptera: Pyralidae). Mitochondrial DNA Part A. 2016;27(6):4538–9.

  65. Liu T, Wang S, Li H. Review of the genus Argyresthia Hübner, [1825](Lepidoptera: Yponomeutoidea: Argyresthiidae) from China, with descriptions of forty-three new species. Zootaxa. 2017;4292(1):1–135.

  66. Yang LL, Dai JJ, Gao QP, Yuan GZ, Liu J, Sun Y, Sun YX, Wang L, Qian C, Zhu BJ, et al. Characterization of the complete mitochondrial genome of Orthaga olivacea Warre (Lepidoptera Pyralidae) and comparison with other Lepidopteran insects. Plos One. 2020;15(3):e0227831.

  67. Jeong SY, Park JS, Kim MJ, Kim S-S, Kim I. The complete mitochondrial genome of Monopis longella Walker, 1863 (Lepidoptera: Tineidae). Mitochondrial DNA Part B. 2021;6(8):2159–61.

  68. Anisimova M, Gascuel O. Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst Biol. 2006;55(4):539–52.

  69. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: improving the ultrafast bootstrap approximation. Mol Biol Evol. 2018;35(2):518–22.

  70. Regier JC, Mitter C, Zwick A, et al. A large-scale, higher-level, molecular phylogenetic study of the insect order Lepidoptera (moths and butterflies). PLoS One. 2013;8(3):e58568.

  71. Naser-Khdour S, Minh BQ, Zhang W, Stone EA, Lanfear R. The Prevalence and Impact of Model Violations in Phylogenetic Analysis. Genome Biol Evol. 2019;11(12):3341–52.

  72. Ewels P, Magnusson M, Lundin S, Käller M. MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics. 2016;32(19):3047–8.

  73. Lindgreen S. AdapterRemoval: easy cleaning of next-generation sequencing reads. BMC Res Notes. 2012;5(1):1–7.

  74. Schubert M, Lindgreen S, Orlando L. AdapterRemoval v2: rapid adapter trimming, identification, and read merging. BMC Res Notes. 2016;9(1):1–7.

  75. Al-Nakeeb K, Petersen TN, Sicheritz-Pontén T. Norgal: extraction and de novo assembly of mitochondrial DNA from whole-genome sequencing data. BMC Bioinformatics. 2017;18:1–7.

  76. Prjibelski A, Antipov D, Meleshko D, Lapidus A, Korobeynikov A. Using SPAdes de novo assembler. Curr Protoc Bioinformatics. 2020;70(1):e102.

  77. Li H, Durbin R. Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics. 2010;26(5):589–95.

  78. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Subgroup GPDP. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25(16):2078–9.

  79. Okonechnikov K, Conesa A, García-Alcalde F. Qualimap 2: advanced multi-sample quality control for high-throughput sequencing data. Bioinformatics. 2016;32(2):292–4.

  80. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, Buxton S, Cooper A, Markowitz S, Duran C. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–9.

  81. Donath A, Jühling F, Al-Arab M, Bernhart SH, Reinhardt F, Stadler PF, Middendorf M, Bernt M. Improved annotation of protein-coding genes boundaries in metazoan mitochondrial genomes. Nucleic Acids Res. 2019;47(20):10543–52.

  82. Benson D, Karsch-Mizrachi I, Lipman D, Ostell J, Wheeler D. GenBank. Nucleic Acids Res. 2005;1:33.

  83. Laslett D, Canback B. ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics. 2008;24(2):172–5.

  84. Juhling F, Putz J, Bernt M, Donath A, Middendorf M, Florentz C, Stadler PF. Improved systematic tRNA gene annotation allows new insights into the evolution of mitochondrial tRNA structures and into the mechanisms of mitochondrial genome rearrangements. Nucleic Acids Res. 2012;40(7):2833–45.

  85. Lowe TM, Chan PP. tRNAscan-SE On-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 2016;44(W1):W54–7.

  86. Benson G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999;27(2):573–80.

  87. Grant JR, Stothard P. The CGView Server: a comparative genomics tool for circular genomes. Nucleic Acids Res. 2008;36:W181–4.

  88. Chan PP, Lin BY, Mak AJ, Lowe TM. tRNAscan-SE 2.0: improved detection and functional classification of transfer RNA genes. Nucleic Acids Res. 2021;49(16):9077–96.

  89. Chamberlain S, Szöcs E, Foster Z. Taxize-taxonomic search and retrieval in R. F1000Research. 2013;2:191.

  90. Tamura K, Stecher G, Kumar S. MEGA11: molecular evolutionary genetics analysis version 11. Mol Biol Evol. 2021;38(7):3022–7.

  91. Perna NT, Kocher TD. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J Mol Evol. 1995;41:353–8.

  92. Ranwez V, Douzery EJP, Cambon C, Chantret N, Delsuc F. MACSE v2: toolkit for the alignment of coding sequences accounting for frameshifts and stop codons. Mol Biol Evol. 2018;35(10):2582–4.

  93. Criscuolo A, Gribaldo S. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol. 2010;10:210.

  94. Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.

  95. Katoh K, Rozewicki J, Yamada KD. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief Bioinform. 2019;20(4):1160–6.

  96. Vecchi M, Bruneaux M. Concatipede: an R package to concatenate fasta sequences easily. 2021. https://doi.org/10.5281/zenodo.5130603

  97. Xia X, Xie Z, Salemi M, Chen L, Wang Y. An index of substitution saturation and its application. Mol Phylogenet Evol. 2003;26:1–7.

  98. Xia X. DAMBE7: New and Improved Tools for Data Analysis in Molecular Biology and Evolution. Mol Biol Evol. 2018;35(6):1550–2.

  99. Xia X, Lemey P. Assessing substitution saturation with DAMBE. The phylogenetic handbook: a practical approach to DNA and protein phylogeny. 2009;2:615–30.

  100. Kimura M. A simple method for estimating evolutionary rates of base substitutions through comparative studies of nucleotide-sequences. J Mol Evol. 1980;16(2):111–20.

  101. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14(6):587–9.

  102. Chernomor O, von Haeseler A, Minh BQ. Terrace aware data structure for phylogenomic inference from supermatrices. Syst Biol. 2016;65(6):997–1008.

  103. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, Lanfear R. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol Biol Evol. 2020;37(5):1530–4.

  104. Soubrier J, Steel M, Lee MSY, Sarkissian CD, Guindon S, Ho SYW, Cooper A. The influence of rate heterogeneity among sites on the time dependence of molecular rates. Mol Biol Evol. 2012;29(11):3345–58.

  105. Lanfear R, Calcott B, Kainer D, Mayer C, Stamatakis A. Selecting optimal partitioning schemes for phylogenomic datasets. BMC Evol Biol. 2014;14:82.

  106. Minh BQ, Trifinopoulos J, Schrempf D, Schmidt H, Lanfear R. IQTREE version 2.0: tutorials and manual phylogenomic software by maximum likelihood. 2019. http://www.iqtree.org

Acknowledgements

The Norwegian Institute for Bioeconomy Research (NIBIO) financed this work. The authors would like to thank Dr. Arne Hermansen for his interest in and support for the project. The authors would also like to thank Toril Sagen Eklo for collecting the materials of A. conjugella.

Funding

This project was funded by The Norwegian Institute for Bioeconomy Research (NIBIO).

Author information

Contributions

AE, SNM and HGE conceived and designed the study. GK collected the sample, and SNM and AVE analyzed the data. AE wrote the first draft of the manuscript, and AE and SNM wrote the draft manuscript with input from HGE, SBH, AVE, MHM, and GK. All authors have read, revised and approved the final manuscript.

Corresponding author

Correspondence to Abdelhameed Elameen.

Ethics declarations

Ethics approval and consent to participate

Our study was conducted as a component of a forecasting program for apple fruit moth attacks in Norway, which is being coordinated by the Norwegian Institute of Bioeconomy (NIBIO). NIBIO is authorized to collect apple fruit moth specimens.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Table S1. List of mitochondrial genomes of Lepidoptera (including the A. conjugella mitochondrial genome) investigated in the study, representing 18 superfamilies and 42 families (Supplementary Figure S2), including outgroup species (Phryganea cinerea, Phryganopsyche latipennis, Cheumatopsyche brevilineata, Limnephilus hyalinus, and Stenopsyche angustata).

Additional file 2:

Table S2. Results of two-tailed tests of substitution saturation for each codon position of the 13 PCGs. The following abbreviations were used: index of substitution saturation (ISS), critical value of ISS supposing symmetrical cladogenesis (ISS.CSym.), critical value of ISS supposing asymmetrical cladogenesis (ISS.CAsym.).

Additional file 3:

Figure S1. Maximum likelihood phylogenetic tree based on 13 PCGs of 56 mitogenomes, including outgroup species (Phryganea cinerea, Phryganopsyche latipennis, Cheumatopsyche brevilineata, Limnephilus hyalinus, and Stenopsyche angustata).

Additional file 4:

Figure S2. Maximum likelihood phylogenetic tree based on 13 PCGs + 2 rRNAs, comparing the A. conjugella mitochondrial genome with the mitochondrial genomes of 507 Lepidoptera obtained from GenBank, representing 18 superfamilies and 42 families (Supplementary Table S1), including outgroup species (Phryganea cinerea, Phryganopsyche latipennis, Cheumatopsyche brevilineata, Limnephilus hyalinus, and Stenopsyche angustata).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Elameen, A., Maduna, S.N., Mageroy, M.H. et al. Novel insight into lepidopteran phylogenetics from the mitochondrial genome of the apple fruit moth of the family Argyresthiidae. BMC Genomics 25, 21 (2024). https://doi.org/10.1186/s12864-023-09905-1

  • DOI: https://doi.org/10.1186/s12864-023-09905-1

Keywords