Complete mitochondrial genomes of Bactrocera ( Bulladacus ) cinnabaria and B. ( Bactrocera ) propinqua (Diptera: Tephritidae) and their phylogenetic relationships with other congeners

(


Introduction
True fruit flies of the genus Bactrocera Macquart are members of the family Tephritidae, subfamily Dacinae, tribe Dacini.Based on the classification with Zeugodacus Hendel as a genus, the genus Bactrocera consists of 461 species worldwide, with 451 species in the Asia-Pacific and 13 species in Africa (Doorenweerd et al. 2018).The actual number of species is much higher as many new species are being described, and there are unnamed species awaiting to be described (Leblanc et al. 2021a;Drew and Romig 2022;Singh et al. 2022;Starkie et al. 2022).
Most of the studies on the molecular phylogeny of the genus Bactrocera (and other tephritid fruit flies) are based on partial sequences of single or multiple mitochondrial and nuclear genes (Zhang et al. 2010;San Jose et al. 2018;Leblanc et al. 2021a;Starkie et al. 2022).In contrast, there are relatively few studies based on the complete mitochondrial genomes (Yong et al. 2021;Zhang et al. 2022).As of February 2023, 32 species of the genus Bactrocera (excluding the three taxa of Bactrocera dorsalis complex considered by some as conspecific and others as valid species) are available in the NCBI GenBank.Of these, 27 species belong to the subgenus Bactrocera.
In view of the potential application of mitochondrial genomes (mitogenomes) in studies regarding phylogeny and evolution (Cameron 2014), the present study reports the mitogenomes of Bactrocera (Bulladacus) cinnabaria Drew & Romig and Bactrocera (Bactrocera) propinqua (Hardy & Adachi) and their phylogenetic relationships with other congeners.This is the first report on the mitogenome for the subgenus Bulladacus.
Bactrocera cinnabaria is found in Andaman and Nicobar Islands (David and Ramani 2011), Peninsular Malaysia (Yong 1994) and Singapore (Hardy and Adachi 1954).When first discovered in these countries, it was named as Bactrocera mcgregori (Bezzi), a taxon found in the Philippines, but was subsequently described as a new species (Drew and Romig 2013).Its larva host plant is Gnetum gnemon (Hardy and Adachi 1954;Yong 1994).The male flies are not attracted to methyl eugenol and cue-lure/ raspberry ketone.
The larvae of B. cinnabaria and B. propinqua feed on a variety of fruits, including both cultivated and wild species.They are economically important pests, posing significant challenges to agriculture and fruit production.

Mitogenomes from GenBank, library preparation and genome sequencing
The complete mitogenomes of the genus Bactrocera (n = 32 species) available from the GenBank (Table S1) were used for phylogenetic comparison.Three other tephritid mitogenomes (Ceratitis capitata NC_000857, Ceratitis fas civentris NC_035497, and Ceratitis rosa NC_053847) available from the GenBank were used as the outgroup taxa.
Sample and library preparation (using Nextera DNA Sample Preparation Kit) and genome sequencing using the Illumina MiSeq Desktop Sequencer (150 bp paired-end reads) (Illumina, USA) were as described in Song et al. (2018).The mitogenome sequences have been deposited in the GenBank, under the accession numbers OR085849 (B.cinnabaria) and OR085850 (B.propinqua).

Analysis of mitogenome
A contig identified and established as mitogenome was annotated with MITOS (Donath et al. 2019) on the Galaxy platform (https://usegalaxy.eu).The circular map of the mitogenome was created using Blast Ring Image Generator (BRIG) (Alikhan et al. 2011).Transfer RNA (tRNA) genes were identified by MITOS (Donath et al. 2019).
MEGA X was used to determine the nucleotide composition, amino acid frequency and relative synonymous codon usage (RSCU) (Kumar et al. 2018).The ratios of non-synonymous substitutions (Ka) and synonymous (Ks) substitutions for the PCGs were estimated by DnaSP 6 (Rozas et al. 2017).The AT and GC skewness were determined according to Perna and Kocher (1995).Palindromes (inverted repeats) in the control region were checked with Tandem Repeats Finder (Benson 1999).

Phylogenetic analysis
Alignment of nucleotide sequences and reconstruction of phylograms based on 13 concatenated PCGs and 15 mt-genes (13 PCGs and 2 rRNA genes) followed that described in Song et al. (2018) and Yong et al. (2015Yong et al. ( , 2016a,b),b).Briefly, the gene sequences were aligned by MAFFT version 7 (Katoh and Standley 2013), using the Q-INS-I algorithm and subsequently edited and trimmed using BioEdit v.7.0.5.3 (Hall 1999).Kakusan v.3 (Tanabe 2007) was used to determine the best-fit nucleotide substitution models for maximum likelihood (ML) ana lysis selected using the corrected Akaike Information Criterion (Akaike 1973).Bayesian analysis was conducted using the Markov chain Monte Carlo (MCMC) method via Mr. Bayes v.3.1.2(Huelsenbeck and Ronquist 2001), with two independent runs of 2×10 6 generations with four chains, and with trees sampled every 200 th generation.Convergence and burn-in of likelihood values for all post-analysis trees and parameters were evaluated using the "sump" command in MrBayes and the computer program Tracer v.1.5(http://tree.bio.ed.ac.uk/software/tracer).The first 200 trees from each run were discarded as burn-in (where the likelihood values were stabilized prior to the burn-in), and the remaining trees were used for the construction of a 50% majority-rule consensus tree.Phylograms of the mtgenes were constructed using TreeFinder.Phylogenetic trees were viewed and edited by FigTree v.1.4(Rambaut 2012).Uncorrected pairwise ('p') genetic distances were estimated using PAUPb10 software (Swofford 2002).

Mitogenome features
The mitogenomes of B. cinnabaria and B. propinqua had similar gene order and contained 37 genes (13 protein-coding genes -PCGs, 2 rRNA genes, and 22 tRNA genes) and a non-coding region (A + T-rich control region) (Table 1; Fig. 1).There were 28 intergenic regions with spacing sequence totalling 150 bp in B. cinnabaria, and 222 bp in B. propinqua (Table 1).The region between trnQ and trnM genes had the longest sequence of 69 bp in B. propinqua; it was 11 bp in B. cinnabaria.The longest intergenic sequence in B. cinnabaria was 29 bp between
The frequency of individual amino acid was quite similar between B. cinnabaria and B. propinqua (Fig. 2).The predominant amino acids (with frequency above 200) in the two mitogenomes were glycine, isoleucine, leucine2, phenylalanine, serine2, threonine, and valine (Table S3); in addition, leucine1 had a frequency of 106 in B. cinnabaria, and methionine had a frequency of 203 in B. propinqua.Cysteine had the lowest frequency of 45 in B. cinnabaria and 43 in B. propinqua.
Analysis of the relative synonymous codon usage (RSCU) revealed that there was no biased usage of A/T than G/C at the third codon position (Table S4; Fig. 2).The frequency of each codon varied between the two Bactrocera mitogenomes.The most commonly used codon was UUA encoding for leucine2, and the least commonly used codon was CUG encoding for leucine1 ( Table S4; Fig. 2).
The Ka/Ks ratio (an indicator of selective pressure on a PCG) was less than 1 for all the 13 PCGs in the two Bactrocera mitogenomes, indicating purifying selection (Table S5; Fig. 3).The cox3 gene had the lowest ratio (Ka/Ks = 0.010) followed by cox1 gene (Ka/Ks = 0.012); the atp8 gene had the highest ratio of 0.221.

Ribosomal RNA genes and transfer RNA genes
The cloverleaf structure for some tRNAs was dissimilar in B. cinnabaria and B. propinqua (Fig. 4).Asparagine (trnA) lacked the TѰC-loop in both species.Arginine

Control region
The control region was flanked by rrnS and trnI genes respectively, with 358 bp in B. cinnabaria and 947 bp in B. propinqua.In the control region of B. propinqua, a long poly-A stretch of 12 bp was present in the anterior region, and a 24 bp poly-A stretch was present in the posterior region; a long poly-T stretch of 24 bp was present in the middle region.Long poly-A and poly-T stretches were not present in B. cinnabaria control region.Simple tandem repeats and palindromes were present in the control region of B. cinnabaria and B. propinqua (Table 3).Some were common in the two mitogenomes, while some were present only in B. cinnabaria or B. propinqua.There were more tandem repeats and palindromes in the control region of B. propinqua.Some palindromes (ATAATA, TATTAT, ATTAATTA, and TAAAATTA AA-AT) are also tandem repeats.
The 358-bp control region of B. cinnabaria mitogenome is exceptionally short for tephritid fruit flies.It aligns with the anterior portion of the long control region of other Bactrocera species.It is, however, not the shortest control region for Bactrocera species.Bactrocera rubigina (NC_046521) has a 235-bp control region (Wang et al. 2020b); it has, however, also been reported to have a long control region (MT121270 with 954 bp; Zhang et al. 2023) (Zhang et al. 2023).More studies are needed to clarify the occurrence of both long and short control regions in the same species.
In the present study, the subgenera of genus Bactrocera, particularly the subgenus Bactrocera represented by a large number of taxa, are monophyletic.Apart from the subgenus Bactrocera, a broader taxon sampling is needed to confirm the monophyletic status of the subgenera Bulladacus, Daculus and Tetradacus.A recent study, however, indicates that the subgenus Bactrocera based on current taxonomic classification is not monophyletic (Starkie et al. 2022).Some studies also indicate that the Bactrocera group and Melanodacus group of subgenera within the genus Bactrocera are not monophyletic (San Jose et al. 2018;Satrkie et al. 2022).
An earlier study based on partial COXI and 16S sequences shows that the subgenus Tetradacus is a sister group to the subgenus Paratridacus of the Melanodacus group (Zhang et al. 2010).In the present study, the subgenus Bulladacus of the Bactrocera group of subgenera (Hancock and Drew 2018) forms a clade with the subgenera Daculus and Tetradacus of the Melanodacus group, concurring with the clustering of the subgenera [(Bulladacus -Parazeugodacus) -(Notodacus -Bactrocera -Tetradacus)] based on partial sequences of six genes (COXI, COXII, 16S, DDOSTs2, RPA2 and EIF3L) (Starkie et al. 2022).The basal position of the subgenus Daculus to the subgenus Tetradacus is congruent with that of Zhang et al. (2010).However, the findings of San Jose et al. (2018) show B. (Tetradacus) tsuneonis to be basal to other Bactrocera taxa, including B. (Dacu lus) oleae which is closer related to Parazeugodacus, and the study of Starkie et al. (2022) shows the subgenus Hemizeugodacus (Melanodacus group) to be basal to the other subgenera.In an earlier study with limited taxon sampling, Daculus forms a lineage with Gymnodacus of Melanodacus group, which is distinct from the subgenus Bactrocera (Virgilio et al. 2015).
The sister lineage of B. oleae and B. biguttula (Fig. 5; da Costa et al. 2019;Zhang et al. 2023) is congruent with the proposal to name the subgenus Afrodacus Bezzi of the African taxa as a synonym of the subgenus Daculus Speiser; the Asian taxa of Afrodacus are members of the subgenus Bactrocera (Copeland et al. 2004).Based on the DNA sequences of three mitochondrial genes (NADH dehydrogenase -nad1, cytochrome c oxidase subunit I -cox1, and 16S rRNA), B. biguttula is basal to the sister lineage of B. oleae and B. munroi (Bon et al. 2016).In the present study, the Daculus lineage is basal to the lineage consisting of the subgenera Bulladacus and Tetradacus (and unassigned B. sp.'yunnanensis') (Fig. 5).
The very small genetic difference (lack of genetic differentiation, 'p' = 0.03% based on 15 mt-genes) between  (Drew and Romig 2013).This synonymy was also indicated by the small COXI genetic distance of 0.00% to 1.18% between B. ruiliensis and B. thailandica (Jiang et al. 2014).The genetic distance of the near complete COXI gene in the present study is 'p' = 0.00%.
In the present study, a small genetic difference is observed between B. melastomatos and B. rubigina with 'p' = 0.08% and 0.40%, and intra B. rubigina genetic distance of 0.37% (Table S6; Fig. 5).A recent study based on a single specimen of B. rubigina also shows its sister lineage with B. melastomatos (Zhang et al. 2023).Further studies are needed to determine the taxonomic relationship of these closely related taxa.
Likewise, B. tryoni and B. neohumeralis (each with one specimen) are genetically very similar with 'p' = 0.69% (Table S6; Fig. 5).The phylogenetic analysis of Starkie et al. (2022) shows two subclades of the B. tryoni complex.The two B. tryoni specimens do not form a sister lineage but are members of the lineage containing a B. neohumeralis specimen; two other B. neohumeralis specimens are members of another lineage.
The present phylogenetic analysis based on 15 mtgenes concurs with the finding of Wang et al. (2020a) that B. cheni and B. tsuneonis are clearly two different species; B. cheni is a member of the subgenus Bactrocera while B. tsuneonis is a member of the subgenus Tetradacus (Fig. 5), with 'p' = 19.33%(Table S6).The synonymy of B. cheni with B. lombokensis (Drew and Romig 2013) is also not supported by a genetic distance of 9.79% based on the partial COXI sequence of B. lombokensis (KT594922) and that of B. cheni (MN883026).In the present study, B. cheni forms a lineage with B. tuberculata (Fig. 5), with a small genetic distance of 'p' = 0.45% based on 15 mt-genes (Table S6).The taxonomic status of B. cheni therefore remains to be resolved.
Likewise, the present finding of 'p' = 1.70-1.74%(Table S6) based on 15 mt-genes does not support the synonymy of B. albistrigata with B. frauenfeldi (Doorenweerd et al. 2023), as opined by Drew and Hancock (2022) and evident in the phylogenetic trees of Starkie et al. (2022) and Zhang et al. (2023).
A recent mitogenomic study on specimens of the B. dorsalis complex from various geographic regions indicates that they do not group together, and is therefore paraphyletic (Zhang et al. 2023).In an earlier study, one B. invadens mitogenome sequence forms a lineage with B. dorsalis while another sequence forms a lineage with B. papayae containing also B. philippinensis; both the B. invadens specimens are from Kenya (Drosopoulou et al. 2019).Based on the structure of the male genitalia (aedeagus) and the relationship of the structure to mating, Drew and Hancock (2022)  The present phylogenetic analysis supports the high genetic similarity of B. papayae and B. philippinensis which form a sister lineage with 'p' = 0.85%; B. papayae has a genetic distance of 1.00% with B. dorsalis and 1.18% with B. invadens (Table S6; Fig. 5).B. dorsalis and B. invadens form a sister lineage with 'p' = 0.65%, indicating high genetic identity and therefore these two taxa/specimens may be conspecific.It is evident that an integrative study on broad representative samples from various geographic regions is needed to resolve the taxonomy of this and other species complexes.
In The phylogenomics will serve as a useful dataset for studying the genetics, systematics (including species differentiation) and phylogenetic relationships of the many species/species complexes and subgenera of the genus Bactrocera in particular, and tephritid fruit flies in general.

Figure 1 .
Figure 1.Complete mitogenomes of Bactrocera cinnabaria and B. propinqua with BRIG visualization showing the protein coding genes, rRNAs, tRNAs and non-coding region.GC skew is shown on the outer surface of the ring whereas GC content is shown on the inner surface.The anticodon of each tRNA gene is shown in parentheses.
B. cinnabaria was basal to the sister lineage of B. minax and B. tsuneonis, indicating that the subgenus Bulladacus was closer related to subgenus Tetradacus than to subgenus Daculus.B. propinqua was basal to the sister lineage of B. ritsemai and B. limbifera, forming a subclade with B. umbrosa, B. curvifera and B. moluccensis (Fig. 5).The sister lineage of B. latifrons and B. bryoniae was basal to the other subgenus Bactrocera lineages.

Figure 2 .
Figure 2. Amino acid frequency (A) and relative synonymous codon usage (B) of protein-coding genes in Bactrocera cinnabaria and B. propinqua mitogenomes.

Figure 4 .
Figure 4. Cloverleaf structure of the 22 inferred tRNAs in the mitogenomes of Bactrocera cinnabaria and B. propinqua.

Figure 5 .
Figure 5. Phylogenetic trees (ML/BI) of (a) 15 mt-genes (13 PCGs + 2 rRNA genes), and (b) 13 PCGs of the whole mitogenomes of Bactrocera fruit flies with Ceratitis capitata, C. fasciventris, and C. rosa as the outgroup taxa.Numeric values at the nodes are ML bootstrap and Bayesian posterior probabilities.Support values labelled with a "*" have 100% bootstrap support or 1.0 posterior probability.
conclude that B. papayae and B. philippinensis are conspecific, and B. papayae and B. invadens are good species separate from B. dorsalis.On the other hand, an integrative molecular and morphological study of B. invadens and B. dorsalis from across a wide geographic distribution supports the hypothesis that they represent a single biological species (Schultze et al. 2015).
summary, we have successfully sequenced the whole mitochondrial genomes of B. (Bulladacus) cinnabaria (the first report for the subgenus Bulladacus) and B. (Bactrocera) propinqua from Peninsular Malaysia by next generation sequencing.The genome features are similar to other Bactrocera fruit flies, excepting the short control region (358 bp) in B. cinnabaria.Phylogenetic analysis based on the mt-genes reveals two major clades of the Bactrocera taxa: (A) subgenus Bactrocera, and (B) subgenera Bulladacus, Daculus, Tetradacus and unassigned Bactrocera sp.'yunnanensis'.The subgenera represented by two or more species are monophyletic.A broad taxon sampling, including taxa of all the subgenera, will help to clarify their phylogeny.The present study supports the synonymy of B. ruiliensis with B. thailandica.It also shows a high genetic similarity between (a) B. melastomatos and B. rubigina, (b) B. papayae and B. philippinensis, (c) B. dorsalis and B. invadens, (d) B. tryoni and B. neohumeralis, and (e) B. cheni and B. tuberculata; and B. cheni is distinct from and not a synonym of B. tsuneonis or B. lombokensis.

Table 1 .
Gene order and organization of the mitochondrial genomes of Bactrocera cinnabaria (Bc) and B. propinqua (Bp).*Minus sign indicates overlap.J (+) or N (-) indicates gene directions.and 65.1% for cox3 to 80.1% for nad4L in B. propinqua (Table S2).Most of the PCGs had negative AT and GC skewness values (Table S2); cox2 and nad6 had positive AT skewness values in B. cinnabaria, nad4L had positive GC skewness value in B. cinnabaria, and nad1, nad4 and nad5 had positive GC skewness values in both B. cinnabaria and B. propinqua.B. cinnabaria and B. propinqua shared an identical start codon for their respective PCGs, except atp8 with ATC for B. cinnabaria and GTG for B. propinqua (Table .2% for B. cinnabaria and 74.1% for B. propinqua, with positive AT and negative GC skewness values (Table2).Like-wise, the A + T content was higher in B. propinqua than propinqua.Unlike the whole mitogenome, the 1 st codon position of PCGs, tRNA genes and rRNA genes had positive GC skewness values for both B. cinnabaria and B. propinqua, and the N strand had positive GC skewness value for B. cinnabaria (Table2); the other regions had negative GC skewness values.For the individual PCGs, the A + T content ranged from 62.9% for cox1 to 76.3% for nad4L in B. cinna-
era clade (Clade A) consisted of many more taxa than Clade B.

Table 3 .
Number of different repetitive sequences in the control regions of Bactrocera cinnabaria and Bactrocera propinqua mitogenomes.B. ruiliensis and B. thailandica supports the synonymy of B. ruiliensis with B. thailandica