The first report of complete mitogenomes of two endangered species of genus Propomacrus (Coleoptera: Scarabaeidae: Euchirinae) and phylogenetic implications

To understand the mitochondrial genome structure of two endangered and long-armed scarab beetles, Propomacrus davidi and Propomacrus bimucronatus, their complete mitogenomes were sequenced for the first time in this study. The complete mitogenomes of P. davidi and P. bimucronatus were 18, 042 bp and 18, 104 bp in length, respectively. The gene orders of their mitogenomes were highly consistent with other Coleopteran species, and the typical ATN was used as the start codon in most protein coding genes. The incomplete stop codon T was used in cox1, cox2, and nad5, and TAN was used as a complete stop codon in most protein coding genes. All predicted tRNAs could form a typical cloverleaf secondary structure, except that trnS1 lacked the dihydrouridine arm. Based on the maximum likelihood and the Bayesian inference methods, phylogenetic trees of 50 species were reconstructed. The results showed that P. davidi, P. bimucronatus, Cheirotonus jansoni and Cheirotonus gestroi clustered in the same branch, and were the most closely related. The results supported that subfamily Euchirinae is a monophyletic group of Scarabaeidae, which was consistent with the morphological classification. These molecular data enriched the complete mitogenome database of Euchirinae, and improved our understanding of the phylogenetic relationship and evolutionary characteristics of these two endangered species.


Introduction
The genus Propomacrus Newman, 1837 belongs to the subfamily Euchirinae in the order Coleoptera.It only includes three species, Propomacrus bimucronatus (Pallas, 1781), P. davidi Deyrolle, 1874 and P. muramotoae Fujioka, 2007.Propomacrus cypriacus Alexis & Makris, 2002 has been a subspecies of P. bimucronatus by genetic evidence [1].The species of Propomacrus are giant beetles, and mainly found in the alpine regions of Asia, the Near East and southeastern Europe [1,2], with high habitat requirements, sensitivity to environmental changes, poor adaptability, weak reproductive capacity, and long larval stage [3][4][5].In recent years, due to the intensification of human activity intervention and serious damage to forests, the distribution areas of Propomacrus have declined sharply and some species have become endangered.P. davidi is endemic to China and only distributed in Jiangxi and Fujian provinces.It includes two subspecies, Propomacrus davidi davidi Deyrolle, 1874 and Propomacrus davidi fujianensis Wu, 2008 [6-9].The beetle listed as one of the Grade II Key Protected Wild Animals based on the endangered status in China (https://www.forestry.gov.cn/main/5461/20210205/122418860831352.html).P. bimucronatus shows an intermittent distribution in Cyprus, Greece, Iran, Israel, Syria, Turkey, and the southern and central Balkans [1,3,10].It was listed as Near Threatened (NT) in the European Red List of Saproxylic Beetles [11].The adults of P. davidi and P. bimucronatus feed on the sap of trees and rotten fruits.The larvae feed on rotten wood and humus soil, which can accelerate the decomposition of dead wood and litter in forests [1].They play an important role in the material cycle of forest ecosystems and have considerable ecological value.Meanwhile, the two species are famous ornamental insects because of their peculiar appearance and good ornamental value [12,13].Recent studies on the two beetles P. davidi and P. bimucronatus has mainly focused on morphological classification, biology, and ecology [2,3,9,10,14], only a few studies involved molecular biology [2,8], and only cox1 gene and 16S of the two species were sequenced [1].
The family Scarabaeidae, within the order Coleoptera, comprises over 30,000 described species, with over 21 subfamilies [15].These subfamilies exhibit distinct divisions in their primary feeding ecologies, broadly categorized into saprophagous and phytophagous lineages [15,16].The former mainly feed on decaying matter, animal dung, etc., including the subfamilies Scarabaeinae and Aphodiinae, while the latter typically consume plant fruits, flowers, pollen, tree sap, wood (including deadwood), leaves, etc.This group includes the four main subfamilies Melolonthinae, Rutelinae, Cetoniinae, and Dynastinae, as well as some less numerous subfamilies such as Euchirinae.
Euchirinae, a subfamily of Scarabaeidae, is an under-studied group with uncertain taxonomic status [3,10,17,18], which includes three genera and 15 species and seven subspecies worldwide, namely 10 species and one subspecies of Cheirotonus Hope, 1841, two species and one subspecies of Euchirus Burmeister and Schaum, 1840, and three species and two subspecies of Propomacrus Newman, 1837.The phylogenetic study of Euchirinae has been conducted poorly, and it has been mainly based on morphological characteristics, rarely using molecular characteristics.Although its monophyly has been confirmed, the classification at the subfamily level remains uncertain [3,19,20].Some scholars believe that Euchirinae is a subfamily of the family Scarabaeidae [10,21,22], but the evidence is needs to be more substantial [18,23].The subfamily is included in the subfamily Melolonthinae as tribe Euchirini sometimes [3] or in a clade composed of several representatives of Melolonthinae, Rutelinae, Dynastinae, as well as Cetoniinae [6,23], or an independent family Euchiridae [1,20].Therefore, it is very important to develop reliable markers to reconstruct the phylogenetic relationship of the subfamily Euchirinae.
As a complete organelle genome, the mitochondrial genome (mitogenome) has the characteristics of maternal inheritance, small molecular weight, fast evolution rate and low recombination level [24].Thus, it is a good marker for studying the phylogenetic relationships between species and genera.Therefore, complete mitogenomes have been used as a common molecular marker in insect systematics studies [25,26].The mitogenome of insect is a closed circular and double-stranded DNA molecule of 14-20 kb in length [27,28].As of May 2024, the number of mitochondrial genomes of Scarabaeidae insects stored in GenBank has exceeded 100, primarily from seven subfamilies: Scarabaeinae, Melolonthinae, Rutelinae, Cetoniinae, Aphodiinae, Dynastinae, and Euchirinae.Mitochondrial gene characteristics of over 30 Scarabaeidae species have been described and used for phylogenetic analysis [6,29].Just as mitochondrial genomes of coleopteran insects tend to be conservative [30], with the exception of some species in the subfamilies Dynastinae and Scarabaeinae that show gene rearrangements, the mitochondrial genome structure, base composition, codon usage, and gene arrangement of other Scarabaeidae species exhibit high similarity to other coleopterans [31,32].However, only three species of Euchirinae, Euchirus longimanus, Cheirotonus gestroi, and C. jansoni, are recorded in GenBank [6,20,29].More complete mitogenome data of the Euchirinae species are needed for phylogenetic analysis.
In this study, the complete mitochondrial genomes of two species of genus Propomacrus (Coleoptera: Scarabaeidae: Euchirinae) were sequenced and analyzed for the first time.Combining these data with other available mitogenomes from GenBank, the phylogenetic relationships of Scarabaeoidea were reconstructed, which provides new insights into the relationships among Euchirinae and the other major subfamilies or families.These findings will be useful for better conserving endangered groups in the future.

Specimen materials
The adult specimens of P. davidi davidi and P. bimucronatus were obtained from beetle breeding enthusiast in Kunming, Yunnan Province, China.Their ancestors were roughly mapped to the Longhushan Geopark in Jiangxi, China and Izmir, Turkey, respectively.They were reared from one instar in October 2019 and emerged in August 2020.Adult samples were stored in absolute ethanol and kept in the herbarium of Southwest Forestry University.

Mitochondrial genome sequencing
The total DNA of the samples was extracted from the muscles of the adult's thorax using a Magnetic Animal Tissue Genomic DNA Kit (DP341) (TIANGEN, Beijing, China).After the quality detection by 1.0% agarose gel electrophoresis, the DNA was sent to Baiai Gene Information Technology Co., Ltd.(Xi'an, China) for sequencing.Whole Genome Shotgun (WGS) was used to construct the library, and next generation sequencing was used to perform Pairedend (PE) sequencing on the Illumina sequencing platform.

Sequence splicing and annotation
High quality next-generation sequencing data were assembled using SPAdes v3.9.0 [33] and contig and scaffold sequences were constructed.After comparing the sequence with the NT library on NCBI BLAST (BLAST v2.2.31+), the mitochondrial sequence of the splicing results was picked out.Combined with the reference sequences C. jansoni (GenBank ID: NC_023246) [34] and C. gestroi (NC_046890) [20], the mitochondrial sequences of splicing results were integrated.Collinearity analysis was performed using mummer v3.1 [35] to determine the positional relationship between contigs and fill the gap between contigs.The results were corrected using Pilon v1.18 software [36] to obtain the final mitochondrial sequence.The complete mitogenome sequence was uploaded to the MITOS website (http://mitos2.bioinf.unileipzig.de/index.py)for functional annotation [37].The secondary structure of tRNA was predicted by tRNA Scan-SE [38] and MITOS web [37], and it was manually corrected by the SVG online editor.The complete mitogenome map was drawn using OGDRAW v1.3.1 [39].

Phylogenetic analyses
Based on the mitogenome sequence of 48 species from four families of Scarabaeoidea, published in the NCBI database, two species from Staphylinidae and Byturidae were selected as outgroups (Table 1).The species selected from the superfamily Scarabaeoidea include 25 species from six subfamilies of the family Scarabaeidae, including Cetoniinae, Dynastinae, Rutelinae, Melolonthinae, Euchirinae, and Scarabaeinae, as well as two species from the subfamilies Hybosorinae and Ceratocanthinae of the family Hybosoridae, two species from the subfamily Passalinae of the family Passalidae, and 19 species from the subfamily Lucaninae of the family Lucanidae.The selected species were used to construct a phylogenetic tree with the two species and outgroups.PhyloSuite v1.2.2 aligned the complete mitochondrial genome sequence [40].IQ-TREE v1.6.12 [41] and MrBayes v3.2.7a [42] were used to construct phylogenetic trees of 13 PCGs based on the maximum likelihood (ML) and Bayesian inference (BI) methods.

Mitogenome composition
The complete mitogenome lengths of P. davidi and P. bimucronatus were 18,042 bp and 18,104 bp, respectively, which were longer than the 16, 899 bp of C. gestroi in the same subfamily [20].Most metazoan mitogenomes are approximately 16 kb in size [59,60].Longer mitochondrial genes may be characteristic of Propomacrus species, which is closely related to the length of control region sequence.The control regions of P. davidi and P. bimucronatus are 765-1,202 bp longer than in C. gestroi and C. jansoni in same subfamily.The A + T content was 69.49% for P. davidi, and 67.91% for P. bimucronatus, with high AT-skew.The gene composition and sequence were consistent with those of ancestral insects, including 13 protein coding genes (PCGs), 22 tRNAs, two rRNAs, and a control region (also known as A + T-rich region or D-loop) [33,61].The J-strand generally contains 23 genes (nine PCGs, 14 tRNAs), and the N-strand includes 14 genes (four PCGs, eight tRNAs, and two rRNAs).There was an intergenic spacer or gene overlap between adjacent genes of the two mitogenomes, and no gene rearrangement was found.The number of intergenic spacers in the whole mitogenomes of P. davidi and P. bimucronatus was seven overlapping regions.The sum of their lengths was 34 bp and 36 bp, respectively.The intergenic spacers seem to mid-length in Coleoptera, for example, nearly twice that of the Tenebrionidae species [62] but shorter than Coenochilus striatus [45].However, the longest intergenic spacers were both 19 bp and located between trnS2 and nad1.There were 15 regions without intervals and overlaps.The number of overlapping region was 15, the sum of length was 40 bp, and the longest overlapping region was 8 bp, which was located between trnC and trnW (Fig 1 and Table 2), consistent with the results of some other Coleoptera species with only slight differences in length, such as the region of 9 bp and 8 bp of Chrysochroa fulgidissima and Coomaniella copipes, respectively [63,64].The gene arrangement, nucleotide composition, and codon usage may be used as a piece of very helpful information to assist the classification and infer evolutionary relationships.

Mitochondrial genome PCGs
The total length of the 13 PCGs was 11,137 bp for both P. davidi and P. bimucronatus, and the A + T content was 68.38% and 66.74%, respectively, these two species showed a negative ATskew (−0.146, −0.163, respectively) and a negative GC-skew (−0.049 and −0.019, respectively) in PCGs (Table 3).The longest and shortest PCGs were nad5 (1,714 bp) and atp8 (156 bp).The combined 13 PCGs of P. davidi or P. bimucronatus mitogenomes encoded a total of 3,702 codons (excluding stop codons); the most frequently used codon was UUA (Leu2), and the  The start codon of cox1 was not detected in the two species, which may be related to the use of the non-canonical start codons of this gene [61,65]; nad6 genes were ATT and ATC, respectively; nad2, atp8, nad3, nad5, and nad1 were typical ATT; cox2 gene was ATA; atp6, cox3, nad4, nad4L, and cytb were typical ATG.The cox1, cox2, and nad5 used the incomplete stop codon T, the complete stop codon of cytb was TAG, and the complete stop codons for other genes were the typical TAA (Table 2).
The start codons of P. davidi and P. bimucronatus differ from those of other Coleoptera species.In Coleoptera, except for the cox1 gene, most protein-coding genes use a typical ATN as the start codon.However, incomplete stop codons (T or TA) are common in the mitochondrial genome of metazoans [66].

The mitochondrial tRNA and rRNA genes
The total length of 22 tRNA genes in the mitochondrial genomes of P. davidi and P. bimucronatus was 1,478 bp and 1,450 bp, respectively.Most tRNAs exhibit the typical cloverleaf  structures, except trnS1 was missing the DHU arm and the anticodon was changed from GCU to UCU.Although most arthropods use a GCU anticodon in tRNA-Ser(AGN), almost all beetle mitogenomes published so far have the UCU anticodon for this tRNA [61].From the known complete mitogenome of Coleoptera insects, Polyphage uses UCU as an anticodon, and the other suborders of Archostemata, Myxophaga, and Adiphaga are GCU [66].This suggests that the anticodon may be a molecular synapomorphy of Coleoptera [61,66].In the folding process of the cloverleaf structure, there were 26 pairs of G-U mismatches in P. davidi (Fig 4), of which nine pairs were located in the amino acid acceptor arms of trnQ, trnC (3), trnY, trnA, trnF, trnH and trnV; seven pairs were located in the DHU arms of trnQ (2), trnC, trnK, trnG, trnH and trnP, and seven pairs were located in the TCC arms of trnQ, trnC, trnS1, trnH, trnP (2), and trnV; three pairs of anticodon arms were located at trnI, trnD, and trnH.There were 24 pairs of G-U mismatches in P. bimucronatus (Fig 5), of which eight pairs were located on the amino acid acceptor arm of trnC (3), trnY, trnA, trnH, and trnV (2); seven pairs were located on the DHU arm of trnQ (2), trnC, trnK, trnG, trnH, and trnP; eight pairs were located on the TCC arm of trnQ, trnY, trnN, trnS1, trnP (3), and trnV, and one pair was located on the anticodon arm of trnH.The unpaired base A appeared in the anticodon arm of trnS1 in the two species.
rrnL genes of the two beetles were located between the trnL1 and trnV, and rrnS genes were located between trnV and control region, as in other insects [31,34,61], and rrnL had greater variation than rrnS.The sequence lengths of the rrnL (Fig 6A ) and rrnS (Fig 6B ) of P. davidi were 1,294 bp and 785 bp, respectively, and the A + T contents were 76.43% and 72.99%, respectively.The sequence lengths of the rrnL (Fig 7A ) and rrnS (Fig 7B ) of P. bimucronatus were 1,293 bp and 785 bp, respectively, and the A + T contents were 76.1% and 71.34%.The secondary structures of rrnL in the two species were predicted to have 42 helical structures and five domains (I, II, IV, V, VI).The domain III deletion of rrnL is a typical feature of arthropods [31,34,[67][68][69].The secondary structures of rrnS have 26 helical structures and three domains.

Control region
The mitogenome control region is a non-coding region of the sequence in its genome, also known as the A+T-rich region, which might be the key to characterizing typical CRs in mitogenomes [70,71].The control region is thought to regulate the replication and transcription of the mitogenome [72,73].The total lengths of the A+T-rich region of P. davidi and P. bimucronatus mitogenomes were 3,354 bp and 3,443 bp, respectively, both located between rrnS and trnI.However, the difference between these two species in CRs was not limited by these features and varied considerably.The T content in the control region is always higher than that in the coding region, which is a well-documented pattern in the insect mitogenomes [74].The A + T contents were 67.74% for P. davidi and 65.15% for P. bimucronatus; these two species showed a positive AT-skew (0.186 and 0.090, respectively) and a negative GC-skew (−0.429 and −0.224, respectively).There were two pairs of repeat units in the 15,475-16,362 bp region in the mitogenome of P. davidi.The interval between the two pairs of repeating units was 80

Phylogenetic analysis
Based on the 13 PCG datasets of 50 species, ML and BI trees with almost the identical topology were generated (Fig 9).The difference between the two trees is only the position of Although it is different from the phylogenetic relationship of the families of Scarabaeoidea based on studies from nuclear protein-coding genes [77,78], which indicated that the Lucanidae did not form a sister-group relationship with the Scarabaeidae and Hybosoridae, the results support the monophyly of the four families and are consistent with results of morphological characteristics studies [17,18].

Conclusion
In this study, the complete mitogenome sequences of P. davidi and P. bimucronatus were sequenced and analyzed; the total lengths of the two species were 18,042 bp and 18,104 bp, respectively, which are longer than those of most metazoans, and the genomes showed an obvious AT-skew.Each gene composition and sequence were consistent with those of ancestral insects, including 13 PCGs, 22 tRNAs, two rRNAs, a control region, and no gene rearrangements or deletions.The mitogenome of the two species uses UCU as an anticodon.The secondary structure models of rrnL and rrnS were similar to those of other insect genes, and rrnL had greater variation than rrnS.The domain III deletion of rrnL is a typical feature of arthropods.The T content in the control region is always higher than in the coding region, a well-documented pattern in the insect mitogenome.The total lengths of the A+T-rich region The analysis based on the ML and BI method of 13 PCGs produced a well-resolved framework supporting the conclusion that Euchirinae is a monophyletic subfamily of Scarabaeidae.Mitogenome analysis is useful for comparing genetic relationships at classificatory levels.More research is needed to reconstruct comprehensive phylogenies to better resolve the phylogenetic relationships of Euchirinae.

Fig 1 .
Fig 1. Circular maps of the mitogenome of Propomacrus davidi and P. bimucronatus.The direction of gene transcription is shown as an arrow.Dark blue represents transfer RNA (tRNA), yellow represents nicotinamide adenine dinucleotide (nadh), purple represents cytochrome c oxidase (cox) and cytochrome b (cytb), green represents adenosine triphosphate ATP synthase, and red represents ribosomal RNA (rRNA).https://doi.org/10.1371/journal.pone.0310559.g001 bp, and the repeat sequence length for P. davidi and P. bimucronatus was 164 bp and 210 bp, respectively.The interval between the 164 bp repeat sequences was 53 bp, and the interval between the 210 bp repeat sequences was 7 bp.In the P. bimucronatus mitogenome, there was a pair of repeat units with a length of 446 bp at 15, 504-16,149 bp region, and the length of the overlapping region between the repeat sequences was 246 bp (Fig8).The control region of the two species also has one poly-T with a length of 20 bp.There were many microsatellite structures in the control region.P. davidi had a total of 12 (TA) n structures, including one (TA) 8 , three (TA) 5 , one (TA) 7 , four (TA) 3 , two (TA) 4 and one (TA) 10 .P. bimucronatus has nine (TA) n structures, one (TA) 8 , one (TA) 7 , and seven (TA) 3 .

Table 2 . Organization of the mitogenomes of Propomacrus davidi and P. bimucronatus.
Note: N and J indicate that the gene was located in the minor (N) and major (J) strand.The '/' indicated that these from left to right were P. davidi and P. bimucronatus.