Complete mitogenome of Olidiana ritcheriina (Hemiptera: Cicadellidae) and phylogeny of Cicadellidae

Background. Coelidiinae, a relatively large subfamily within the family Cicadellidae, includes 129 genera and ∼1,300 species distributed worldwide. However, the mitogenomes of only two species (Olidiana sp. and Taharana fasciana) in the subfamily Coelidiinae have been assembled. Here, we report the first complete mitogenome assembly of the genus Olidiana.
Methods. Specimens were collected from Wenxian County (Gansu Province, China) and identified on the basis of their morphology. Mitogenomes were sequenced by next-generation sequencing (NGS); an NGS template was then generated and confirmed using polymerase chain reaction and Sanger sequencing. Phylogenetic trees were constructed using maximum likelihood and Bayesian analyses.
Results. The mitogenome of O. ritcheriina was 15,166 bp long, with an A + T content of 78.0%. Compared with the mitogenomes of other Cicadellidae species, the gene order, gene content, gene size, base composition, and codon usage of protein-coding genes (PCGs) in O. ritcheriina were highly conserved. The standard start codon of all PCGs was ATN and the stop codon was TAA or TAG; COII, COIII, and ND4L ended with a single T. All tRNA genes showed the typical cloverleaf secondary structure, except for trnSer, which did not have the dihydrouridine arm. Furthermore, the secondary structures of rRNAs (rrnL and rrnS) in O. ritcheriina were predicted. Overall, five domains and 42 helices were predicted for rrnL (domain III is absent in arthropods), and three structural domains and 27 helices were predicted for rrnS. Maximum likelihood and Bayesian analyses indicated that O. ritcheriina and other Coelidiinae members were clustered into a clade, indicating the relationships among their subfamilies; the main topology was as follows: (Deltocephalinae + ((Coelidiinae + Iassinae) + ((Typhlocybinae + Cicadellinae) + (Idiocerinae + (Treehopper + Megophthalminae))))). The phylogenetic relationships indicated that the molecular taxonomy of O. ritcheriina is consistent with the current morphological classification.


INTRODUCTION
Coelidiinae is a relatively large subfamily within the family Cicadellidae, and it includes 129 genera and approximately 1,300 species (Nielson, 2015), including some species that serve as vectors of pathogens causing economically important plant diseases (Du et al., 2017; Frazier, 1975; Li & Fan, 2017; Maramorosch, Harris & Futuyma, 1981; Zhang, 1990). However, the morphology-based taxonomic status of some species remains controversial, and the phylogenetic relationships among major lineages of Membracoidea remain poorly understood (Dietrich et al., 2017). Moreover, knowledge regarding the taxonomic status of Olidiana within Cicadellidae and its phylogenetic relationship with other leafhopper genera is limited.
Complete mitogenomes provide large and diverse datasets for species delineation, and such mitogenomes have been used extensively for evolutionary studies of insects, particularly members of the orders Lepidoptera, Diptera, and Hemiptera (Salvato et al., 2008; Wang et al., 2011; Du et al., 2017; Su & Liang, 2018; Wang et al., 2018; Li et al., 2017). To date, the mitogenomes of approximately 35 Cicadellidae species (26 complete and nine nearly complete) are available in GenBank. However, the mitogenomes of only two species [Olidiana sp. (partial genome, KY039119.1) and Taharana fasciana (NC_036015.1)] have previously been published for Coelidiinae, the largest subfamily of Cicadellidae.
Olidiana McKamey is the largest leafhopper genus in the tribe Coelidiini and it comprises 91 species. Among these, 54 species have been reported from China. However, to date, no complete mitogenome has been characterized for any Olidiana species; this lack of information restricts our understanding of the evolution of Coelidiinae at the genomic level. Therefore, new mitogenomic data will provide insights for determining the phylogenetic relationships and evolution of Cicadellidae in the future.

Sample collection and identification
The use of the specimens collected for this study was approved. The specimens were collected from Wenxian County, Gansu Province, China (32°95′N, 104°68′E) on October 17, 2018, and identified on the basis of their morphological characteristics, as described by Zhang (1990) and Li & Fan (2017). Fresh specimens were preserved in absolute ethanol and stored at −20 °C until DNA extraction.

DNA extraction
Genomic DNA was extracted from the whole body of adult males (after removing the abdomen and wings) using the DNeasy© Tissue Kit (Qiagen, Hilden, Germany). The samples were incubated at 56 °C for 6 h to completely lyse the cells, and total genomic DNA was eluted in 100 µL of double-distilled water; the remaining steps were performed according to the manufacturer's instructions. The extracted genomic DNA was stored at −20 °C until further use. Voucher specimens with male genitalia and DNA samples have been deposited at the Institute of Entomology, Guizhou University, Guiyang, China.

Polymerase chain reaction (PCR) amplification and sequencing
Mitogenomes were sequenced using next-generation sequencing (Illumina HiSeq 4000; 2 Gb of raw data; Berry Genomic, Beijing, China), and two sequence fragments were reconfirmed by PCR amplification using primers (Table S1). Following this, an NGS template was generated and further confirmed using PCR and Sanger sequencing. PCR amplification of overlapping sequence fragments was performed using universal primers (Table S1). Two pairs of species-specific primers were designed using Primer Premier 6.0 (Premier Biosoft, Palo Alto, CA, USA) to amplify the control region (Table S1). PCR was performed using a PCR master mix (Sangon Biotech Co. Ltd., Shanghai, China), according to the manufacturer's instructions.

Sequence analysis
Next-generation sequences were assembled using Geneious R9 (Kearse et al., 2012). The assembled mitochondrial gene sequences were compared with the homologous sequences of Olidiana sp. (KY039119) and T. fasciana (KY886913) retrieved from GenBank and identified through BLAST searches in NCBI to confirm sequence accuracy. The sequences obtained by PCR amplification and TA cloning were assembled using SeqMan in the DNAStar software package (DNASTAR, Inc., Madison, WI, USA). The mitogenomes were annotated using the MITOS webserver (Bernt et al., 2013). Base composition and relative synonymous codon usage (RSCU) were analyzed using MEGA 6.06 (Tamura et al., 2013), and the boundaries and secondary structures of 22 tRNA genes were determined using tRNAscan-SE version 1.21 (Schattner, Brooks & Lowe, 2005) and ARWEN version 1.2 (Laslett & Canbäck, 2008). rRNA genes were identified on the basis of the locations of adjacent tRNA genes and comparisons with sequences of other hemipterans. The secondary structures of rRNAs were inferred on the basis of models proposed for other Hemiptera (Wang, Li & Dai, 2017; Su et al., 2018). Helices were numbered according to the convention established by the Comparative RNA Web Site (Cannone et al., 2002). Strand asymmetry was calculated using the following formulas: AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) (Perna & Kocher, 1995). Intergenic spacers and overlapping regions between genes were counted manually.
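The strand-asymmetry formulas of Perna & Kocher (1995) can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this illustration; the example values are the whole-genome base frequencies reported for O. ritcheriina (44.6% A, 33.4% T, 8.5% G, and 13.5% C). This is not the authors' analysis pipeline.

```python
from collections import Counter


def skew_from_counts(a: float, t: float, g: float, c: float) -> tuple[float, float]:
    """Return (AT skew, GC skew) from base counts or percentages.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    """
    if a + t == 0 or g + c == 0:
        raise ValueError("A + T and G + C must both be non-zero")
    return (a - t) / (a + t), (g - c) / (g + c)


def sequence_skews(seq: str) -> tuple[float, float]:
    """Count A, T, G, and C in a nucleotide string and return both skews."""
    counts = Counter(seq.upper())
    return skew_from_counts(counts["A"], counts["T"], counts["G"], counts["C"])


# Whole-genome composition reported for O. ritcheriina (percentages).
at_skew, gc_skew = skew_from_counts(44.6, 33.4, 8.5, 13.5)
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

With these frequencies the AT skew is positive (about 0.144) and the GC skew is negative (about −0.227), that is, the reported composition has more A than T and more C than G.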
The following five datasets were concatenated for phylogenetic analysis: (1) P123: all codon positions of 13 PCGs (10,116 bp); (2) P12: first and second codon positions of 13  et al., 2014) with the best model for each partition selected under the corrected Akaike Information Criterion (AIC) using PartitionFinder2 (Table S2) (Miller, Pfeiffer & Schwartz, 2010) and evaluated using the ultrafast bootstrap approximation approach for 10,000 replicates. Bayesian (BI) analysis was performed using MrBayes 3.2 (Ronquist et al., 2012). Two independent runs with four simultaneous Markov chains (one cold and three incrementally heated at T = 0.2) were run for 50,000,000 generations, sampling every 100 generations under the GTR+I+G model. The best models were then selected on the basis of the corrected AIC (Nylander et al., 2004). The phylogenetic trees were visualized using FigTree 1.4.2.
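The relative synonymous codon usage (RSCU) mentioned in this section is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below shows the calculation under stated assumptions: codons are grouped by amino acid using NCBI translation table 5 (invertebrate mitochondrial code), any lineage-specific reassignment of AGG is not modelled, and the function name `rscu` is hypothetical. MEGA may group codon families differently, so this is an illustration rather than a reproduction of the published values.

```python
from collections import Counter, defaultdict

# NCBI translation table 5 (invertebrate mitochondrial), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODE = {
    b1 + b2 + b3: aa
    for (b1, b2, b3), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO_ACIDS
    )
}


def rscu(cds: str, code: dict[str, str] = CODE) -> dict[str, float]:
    """Return RSCU per codon for an in-frame coding sequence.

    RSCU = observed count * (number of synonymous codons) / (family total).
    Stop codons and codons with ambiguous bases are ignored; families that
    never occur in the sequence are omitted from the result.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if code.get(c, "*") != "*")

    families = defaultdict(list)
    for codon, aa in code.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total:
            for codon in family:
                values[codon] = counts[codon] * len(family) / total
    return values


# Toy input: two TTT codons and one TTC codon (both encode Phe).
print(rscu("TTTTTTTTC"))
```

An RSCU above 1 means a codon is used more often than expected under equal usage, so A/T-ending codons with values above 1 reflect the third-position A + T bias described for the PCGs.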

General features of the O. ritcheriina mitogenome
The complete mitogenome of O. ritcheriina (MK738125) was 15,166 bp long, which is within the range of the complete mitogenomes of other Cicadellidae species (Nephotettix cincticeps, 14,805 bp and Idioscopus laurifoliae, 16,811 bp) (Table 1). The mitogenome comprised 37 genes (13 PCGs, 22 tRNAs, and two rRNAs) and a large A + T-rich D-loop control region (Fig. 1). The majority strand (J strand) harbored most of the genes (nine PCGs and 14 tRNAs), whereas the minority strand (N strand) harbored the remaining genes (four PCGs, two rRNAs, and eight tRNAs) (Fig. 1; Table 2). Moreover, the mitogenome of O. ritcheriina comprised intergenic spacers of 1 to 12 bp at nine different loci. A total of 12 gene pairs overlapped with one another, with overlap lengths ranging from 1 to 13 bp. In addition, 16 gene pairs, including rrnL-trnV and trnV-rrnS (Table 2), were directly adjacent to one another. With a multicopy of trnI (AAT) located between the control region and trnI-trnQ-trnM, the mitogenome of O. ritcheriina exhibited a strong A + T bias. The A + T content of the whole genome was 78.0% (44.6% A, 33.4% T, 8.5% G, and 13.5% C) (Table 3); this percentage was between the A + T content of Yanocephalus yanonis (74.6%) and Trocnadella arisana (80.7%) (Table 1). The segment with the highest A + T content was the control region (83.8%); the A + T content of this segment was higher than that of the other segments (2 rRNAs, 81.1%; 22 tRNAs, 78.6%; whole genome, 78.0%; and 13 PCGs, 77.2%) (Table 3).
Comparative analysis of the base composition of every component of the mitogenomes of Coelidiinae indicated that the control regions showed the highest A + T content (Table 3).
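The intergenic spacers, overlaps, and directly adjacent gene pairs described in this section follow from the annotated start and end coordinates of neighbouring genes. The Python sketch below shows one way to derive them; the function name and the gene coordinates in the example are hypothetical placeholders, not values from Table 2, and the wrap-around junction of the circular genome is not handled.

```python
def junctions(genes: list[tuple[str, int, int]]) -> list[tuple[str, str, int]]:
    """Classify the boundary between each pair of neighbouring genes.

    `genes` holds (name, start, end) tuples with 1-based inclusive
    coordinates, sorted by start position. For each neighbouring pair the
    returned value is positive for an intergenic spacer (bp), zero for
    directly adjacent genes, and negative for an overlap (bp).
    """
    result = []
    for (name1, _, end1), (name2, start2, _) in zip(genes, genes[1:]):
        result.append((name1, name2, start2 - end1 - 1))
    return result


# Hypothetical coordinates, for illustration only.
example = [
    ("geneA", 1, 63),
    ("geneB", 64, 130),   # starts right after geneA: adjacent
    ("geneC", 128, 200),  # starts 3 bp before geneB ends: overlap
    ("geneD", 206, 300),  # starts 5 bp after geneC ends: spacer
]
for left, right, gap in junctions(example):
    kind = "spacer" if gap > 0 else "overlap" if gap < 0 else "adjacent"
    print(f"{left}-{right}: {kind} ({abs(gap)} bp)")
```

Counting the positive, zero, and negative values in such a list gives the numbers of spacers, adjacent pairs, and overlapping pairs for an annotated genome.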

PCGs and codon usage
The concatenated length of the 13 PCGs of O. ritcheriina was 10,116 bp. Similar to the mitogenomes of other Cicadellidae species, ND5 was the largest gene (1,671 bp) and ATP8 was the smallest gene (150 bp). Only four PCGs (ND4, ND4L, ND5, and ND1) were coded by the minority strand (N strand), whereas the other nine PCGs (COI, COII, COIII, ATP8, ATP6, ND2, ND3, ND6, and CYTB) were coded by the majority strand (J strand). Most PCGs exhibited the typical start codon ATN (ATA/ATT/ATG/ATC) and stop codon TAA or TAG, but COII, COIII, and ND4L showed an incomplete stop codon T. Analysis of the PCG codon families revealed extremely similar codon usage among the mitogenomes of Cicadellidae, with TTA-Leu, ATA-Met, ATT-Ile, and TTT-Phe being the four most frequently used codons (Fig. 2A). Furthermore, the RSCU of O. ritcheriina indicated that degenerate codons were biased toward A/T over G/C at the third codon position (Fig. 2B). Similarly, the biased usage of A + T nucleotides was reflected in the codon frequencies.
using tRNAscan-SE (Schattner, Brooks & Lowe, 2005) and ARWEN (Laslett & Canbäck, 2008). Among these, 14 were located on the J strand and eight on the N strand. All tRNAs exhibited the typical cloverleaf secondary structure, with the exception of trnS1 (AGN), in which the dihydrouridine arm formed a loop (Fig. 3). Abascal et al. (2006) and Abascal, Posada & Zardoya (2012) have shown that the invertebrate mitochondrial genetic code varies even within Hemiptera, with Triatoma (Cimicomorpha), Homalodisca (Cicadellidae), and Philaenus (Cercopoidea) translating the AGG codon as Lys instead of Ser; accordingly, our tRNA analysis shows that the AGG codon in O. ritcheriina was translated as Lys instead of Ser.
The two rRNA genes (rrnL and rrnS) in the mitogenomes of Cicadellidae were highly conserved. The putative rrnL of O. ritcheriina was 1,180 bp long and located between trnL2 and trnV, and the putative rrnS was 731 bp long and located between trnV and the control region (Tables 2 and 3). In the mitogenomes of Coelidiinae, the length of rrnL ranged from 1,178 bp (Olidiana sp.) to 1,192 bp (T. fasciana) and that of rrnS ranged from 729 bp (Olidiana sp.) to 775 bp (T. fasciana). The secondary structure of rrnL in O. ritcheriina comprised five domains (I, II, IV, V, and VI; domain III is absent in arthropods) and 42 helices (Fig. 4). The multiple alignment of Coelidiinae rrnL extended over 1,180 positions and comprised 1,016 conserved (86.10%) and 164 variable (13.90%) sites. Domains IV and V were structurally more conserved than the other domains.
The secondary structure of rrnS comprised three structural domains and 27 helices (Fig. 5). The multiple alignment of Coelidiinae rrnS extended over 730 positions and comprised 586 conserved (80.27%) and 144 variable (19.73%) sites. Domain III was structurally more conserved than domains I and II.
These rRNA secondary structures can be useful for the precise alignment of sequences for phylogenetic studies (Rijk & Wachter, 1997). Nevertheless, additional details regarding such rRNA structures should be accumulated in future studies.
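The conserved and variable site counts reported for the rRNA alignments can be obtained by scanning alignment columns. The Python sketch below illustrates the idea under stated assumptions: a column counts as conserved only if every sequence carries the same character, so a column containing a gap is treated as variable, and the function name `site_summary` is hypothetical. The toy alignment is made up for illustration; the final line only re-derives the rrnL percentage from the counts given in the text.

```python
def site_summary(aligned: list[str]) -> tuple[int, int, float, float]:
    """Return (conserved, variable, % conserved, % variable) for an alignment.

    All sequences must have the same aligned length. A column is conserved
    when all sequences share one character; anything else, including a
    column with a gap in some sequences, counts as variable (an assumption).
    """
    length = len(aligned[0])
    if length == 0 or any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be non-empty and equally long")
    conserved = sum(1 for column in zip(*aligned) if len(set(column)) == 1)
    variable = length - conserved
    return conserved, variable, 100 * conserved / length, 100 * variable / length


# Toy alignment (not real data): four conserved columns, one variable column.
print(site_summary(["ATGCA", "ATGTA", "ATG-A"]))

# Re-deriving the rrnL percentage from the reported counts (1,016 of 1,180).
print(f"{100 * 1016 / 1180:.2f}% conserved")
```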

Phylogenetic relationship
Phylogenetic trees were constructed on the basis of five concatenated nucleotide sequence datasets from 40 available mitogenomes of Membracoidea, with two species considered outgroups [Cicadoidea (T. auropilosa) and Cercopoidea (C. bispecularis)]. Saturation analysis tests whether some positions or partitions of a dataset are saturated and whether these sites can be used for further phylogenetic analysis. The saturation plots showed uncorrected pairwise divergences in transitions (s) and transversions (v) against divergences calculated with the GTR model, and none of the four candidate nucleotide sequence datasets (  Table 4; Fig. S1), thereby suggesting that the concatenated data are suitable for phylogenetic analysis. All 10 trees are presented in Fig. 7 and Fig. S2A-F. Almost all nodes received high support (posterior probability, PP > 0.88) in BI analyses, whereas a few nodes received only moderate or low support in ML analyses of some datasets (bootstrap support, BS < 75). Monophyly at the subfamily level within Membracoidea was strongly supported in all the trees. Membracidae as a sister group to Cicadellidae was well supported by all the results (PP > 0.94, BS = 100). Within Cicadellidae, the 37 species sampled in this study represent seven subfamilies and the main topology was as follows: (Deltocephalinae + ((Coelidiinae + Iassinae) + ((Typhlocybinae + Cicadellinae) + (Idiocerinae + (Treehopper + Megophthalminae))))) (Fig. 7). The BI and ML analyses generated results that are consistent with those of previous phylogenetic studies based on combined morphological and molecular data (Dietrich et al., 2001; Dietrich et al., 2017; Cryan et al., 2000; Cryan & Urban, 2012; Krishnankutty, 2013; Wang, Dietrich & Zhang, 2017).
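The bracketed topology above maps directly onto a Newick string if each " + " is replaced with a comma. The Python sketch below makes that conversion and reads sister-group relationships from the result. The parser is a deliberately minimal, hypothetical helper that handles only names, commas, and parentheses (no branch lengths or support values); it is not a general Newick reader.

```python
TOPOLOGY = (
    "(Deltocephalinae + ((Coelidiinae + Iassinae) + ((Typhlocybinae + "
    "Cicadellinae) + (Idiocerinae + (Treehopper + Megophthalminae)))))"
)
NEWICK = TOPOLOGY.replace(" + ", ",") + ";"


def parse(newick: str):
    """Parse a label-only Newick string into nested tuples of taxon names."""
    pos = 0

    def node():
        nonlocal pos
        if newick[pos] == "(":
            pos += 1  # skip '('
            children = [node()]
            while newick[pos] == ",":
                pos += 1  # skip ','
                children.append(node())
            pos += 1  # skip ')'
            return tuple(children)
        start = pos
        while newick[pos] not in ",();":
            pos += 1
        return newick[start:pos]

    return node()


def leaves(tree) -> list[str]:
    """Return all tip labels in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    return [name for child in tree for name in leaves(child)]


def sister(tree, taxon: str):
    """Return the sibling subtree(s) of a tip, or None if it is absent."""
    if isinstance(tree, str):
        return None
    if taxon in tree:
        return tuple(child for child in tree if child != taxon)
    for child in tree:
        found = sister(child, taxon)
        if found is not None:
            return found
    return None


tree = parse(NEWICK)
print(NEWICK)
print(sister(tree, "Coelidiinae"))
```

In this representation the sister of Coelidiinae is Iassinae, and the tree has eight tips: the seven leafhopper subfamilies plus the treehopper lineage.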

CONCLUSIONS
We sequenced the mitogenome of O. ritcheriina from Coelidiinae and presented its structure and sequence characteristics. Consistent with previous observations related to Membracoidea, the mitogenome of O. ritcheriina was highly conserved in terms of gene content, gene size, gene order, base composition, PCG codon usage, as well as tRNA and rRNA secondary structures.
Furthermore, the phylogeny of Membracoidea was inferred with all 40 complete mitogenomes, namely, 35 from Cicadellidae and five from treehoppers. The overall phylogenetic structure of Membracoidea is consistent with that reported in previous studies. Coelidiinae was grouped with Iassinae. The mitogenomic information of O. ritcheriina can be useful for future studies aimed at exploring the mitogenomic diversity of insects and the evolution of related insect lineages.
The lack of complete mitogenomes of Coelidiinae species has restricted the understanding of the evolution of this group at the genome level. Therefore, further studies are required to elucidate the phylogenetic status of species belonging to this group and their relationships. In this context, the addition of more taxa and genes to the leafhopper mitogenomic dataset may help resolve the relationships among major leafhopper lineages.

ADDITIONAL INFORMATION AND DECLARATIONS

Funding
This study was supported by the National Natural Science Foundation of China (No. 31672342). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author Contributions
• Xian-Yi Wang and Ren-Huai Dai conceived and designed the experiments, performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
• Jia-Jia Wang conceived and designed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, authored or reviewed drafts of the paper, approved the final draft.
• Zhi-Hua Fan performed the experiments, analyzed the data, contributed reagents/materials/analysis tools, prepared figures and/or tables, approved the final draft.

Field Study Permissions
The following information was supplied relating to field study approvals (i.e., approving body and any reference numbers): The Institute of Entomology of Guizhou University approved field collection.

DNA Deposition
The following information was supplied regarding the deposition of DNA sequences: The species sequences are available at GenBank: MK738125.