Mitochondrial genome datasets for the sweetpotato weevil, Cylas formicarius elegantulus (Coleoptera: Brentidae), collected in the United States

Abstract
The sweetpotato weevil, Cylas formicarius elegantulus (Summers) (Coleoptera: Brentidae), is one of the most destructive pests of sweetpotato worldwide. Genomic analyses of sweetpotato weevils can provide insights into their genetic diversity, population structure, and dispersal as well as provide information to support management strategies. Adult sweetpotato weevils were collected by various methods from Ipomoea batatas L. (sweetpotato) or I. coccinea L. (red morning glory) in the U.S. states of Georgia, Hawaii, South Carolina, and Texas. Genomic DNA was extracted from individual weevil specimens and sequenced using Illumina NovaSeq. A total of 181 GB of 150 base pair (bp) paired-end reads were generated for 40 specimens. Mitochondrial genomes were assembled for each specimen via reference mapping and annotated using Geneious Prime. Full mitochondrial genome sequences range from 17,141 to 17,152 bp with an average GC content of 21.8% and average coverage of 3307×. A maximum likelihood phylogenetic analysis considering the mitochondrial protein coding genes is provided. Mitochondrial genomes and assembled reads are deposited in NCBI GenBank, providing 40 mitogenomes of C. formicarius elegantulus collected in the U.S.
Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/)

Value of the Data
• These data are useful for analysis of intraspecific divergence within the mitochondrial genomes of sweetpotato weevils.
• These mitochondrial genome sequences will be a useful resource for entomologists and pest management professionals seeking to analyze and determine the genetic differences among sweetpotato weevil populations.
• The data can be used to identify SNPs and other genetic markers to discriminate weevil populations, to study phylogenetic relationships among weevil populations, to determine region of origin, and to develop sequence-based weevil identification tools.

Objective
Adult Cylas formicarius elegantulus were collected for genome sequencing and analysis of genetic diversity and population structure of weevils captured across a wide swath of the U.S. sweetpotato production areas. Mitochondrial genomes were assembled for the purpose of providing the first mitochondrial genomes for sweetpotato weevils collected in the U.S., for determining intraspecific sequence divergence within mitochondrial genes, and for inferring phylogenetic relationships among sweetpotato weevil populations.

Data Description
Skim sequencing was performed on C. formicarius elegantulus (Fig. 1) genomic DNA to generate genomic sequences for population genetics studies. Weevil specimen and collection details are summarized in Table 1. Mitochondrial genome sequences were assembled in Geneious Prime and are available in NCBI GenBank under accessions OQ763214-OQ763253. Assembly results are summarized in Table 2. Corresponding mapped reads are available as SRA datasets under BioProject PRJNA945076: Sweetpotato weevil genomics [1]. Mitogenomes were annotated using Geneious Prime. A maximum likelihood (ML) phylogenetic analysis was performed on the mitochondrial protein coding genes (Fig. 2).
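For readers who wish to retrieve the 40 mitogenome records programmatically, the accession range given above can be expanded into individual identifiers. The following Python sketch is illustrative only: the helper name expand_accessions is hypothetical, and the sketch assumes that the range is contiguous and that both endpoints share the same letter prefix.

```python
from typing import List


def expand_accessions(first: str, last: str) -> List[str]:
    """Expand an inclusive accession range such as OQ763214-OQ763253."""
    # Split each endpoint into its letter prefix and its numeric part.
    prefix = first.rstrip("0123456789")
    if last.rstrip("0123456789") != prefix:
        raise ValueError("both accessions must share the same prefix")
    width = len(first) - len(prefix)
    start, stop = int(first[len(prefix):]), int(last[len(prefix):])
    if stop < start:
        raise ValueError("range end precedes range start")
    # Keep the zero-padded width of the original numeric part.
    return [f"{prefix}{number:0{width}d}" for number in range(start, stop + 1)]


accessions = expand_accessions("OQ763214", "OQ763253")
print(len(accessions), accessions[0], accessions[-1])
```

The resulting list can be passed to any GenBank download tool; the count of 40 identifiers matches the number of specimens reported here.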

Specimen Collection, DNA Extraction, and Sequencing
Adult C. formicarius elegantulus were collected from locations in the U.S. states of Georgia, Hawaii, South Carolina, and Texas by various methods detailed in Table 1. Genomic DNA was extracted from individual whole-body male and female sweetpotato weevils with the DNeasy Plant Mini Kit (Qiagen, Venlo, Netherlands) with modifications [2]. DNA quantity was measured on a NanoDrop 2000 spectrophotometer (ThermoFisher Scientific, Waltham, MA, USA). Whole-genome skim sequencing was performed on an Illumina NovaSeq 6000 at Novogene (www.novogene.com).

Mitochondrial Genome Assembly and Annotation
Mitochondrial genomes of 40 sweetpotato weevils were assembled in Geneious Prime version 2022.0.2 using the Map to Reference tool with the C. formicarius complete mitochondrial genome assembly from China as a reference (NCBI Reference Sequence: NC_046580.1; [3]). The paired-end FASTQ datasets were assembled with the Geneious mapper set to medium sensitivity and five iterations. Assemblies were circularized by trimming overlapping ends. Complete mitogenomes and assembled read datasets were deposited in GenBank (BioProject PRJNA945076). Mitogenome sequences were annotated with the 'Annotate from Database' feature in Geneious Prime, using NC_046580.1 as a reference. Annotations are available in GenBank.
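The circularization step can be illustrated with a minimal Python sketch. This is not the Geneious Prime procedure, whose internal details are not described here; it only demonstrates the general idea of removing the redundant copy when the end of a linear assembly repeats its start. The function name trim_circular_overlap, the min_overlap threshold, and the toy sequence are all hypothetical, and the sketch requires an exact match between the two ends, which real data with sequencing errors may not satisfy.

```python
def trim_circular_overlap(seq: str, min_overlap: int = 20) -> str:
    """Drop the trailing copy of a terminal repeat from a linear contig.

    Searches for the longest suffix (at least min_overlap bases, at most
    half the contig) that exactly equals the prefix of the same length.
    """
    seq = seq.upper()
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return seq[:-k]  # keep a single copy of the overlap
    return seq  # no qualifying overlap: return the contig unchanged


# Toy example: a 26 bp "genome" whose first 8 bases are repeated at the end.
toy_genome = "ACGTTGCAAGGCTTAACCGGTATCGA"
linear_contig = toy_genome + toy_genome[:8]
circularized = trim_circular_overlap(linear_contig, min_overlap=5)
print(len(linear_contig), len(circularized))
```

In practice the overlap threshold would be chosen well above the length of repeats expected by chance, and a mismatch-tolerant comparison may be needed.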

Phylogenetic Analysis
Phylogenetic analyses were conducted under an ML framework in IQ-TREE (v.2.1.3) [4]. The nucleotide sequences for each of the 13 mitochondrial protein coding genes in 43 taxa were aligned with MAFFT (v.7.249) [5], and the best nucleotide substitution model for each gene was selected with ModelFinder [6]. Branch support was estimated with 1000 ultrafast bootstrap replicates [7]. Ten independent tree searches were performed, and the tree with the greatest log-likelihood score was taken as the ML tree (Fig. 2).
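The workflow described above can be sketched in Python as a set of command-line argument lists plus the rule for choosing the final tree. The exact commands used in this study are not reported, so everything here is an assumption made for illustration: the file names, the MAFFT --auto setting, the use of an edge-linked partition file (-p), per-run seeds and prefixes, and the example log-likelihood values are all hypothetical. The sketch only builds the commands and does not execute them.

```python
from typing import Dict, List


def mafft_command(gene_fasta: str) -> List[str]:
    # MAFFT writes the alignment to stdout; "--auto" is an assumed setting.
    return ["mafft", "--auto", gene_fasta]


def iqtree_command(alignment: str, partitions: str, prefix: str, seed: int) -> List[str]:
    # Assumed layout: one concatenated alignment plus a partition file that
    # defines the 13 protein coding genes.
    return [
        "iqtree2",
        "-s", alignment,     # input alignment
        "-p", partitions,    # partition file (assumed edge-linked model)
        "-m", "MFP",         # ModelFinder selects a model per partition
        "-B", "1000",        # 1000 ultrafast bootstrap replicates
        "--seed", str(seed),
        "--prefix", prefix,  # separate output prefix per search
    ]


def pick_best_run(log_likelihoods: Dict[str, float]) -> str:
    # The ML tree is the search result with the greatest log-likelihood.
    return max(log_likelihoods, key=lambda run: log_likelihoods[run])


# Ten independent searches, each with its own seed and output prefix.
commands = [
    iqtree_command("pcg_concat.fasta", "pcg_partitions.nex", f"run{i:02d}", seed=i)
    for i in range(1, 11)
]
print(" ".join(commands[0]))

# Hypothetical scores, as would be read from each run's report file.
example_scores = {"run01": -100.5, "run02": -98.2, "run03": -99.0}
print(pick_best_run(example_scores))
```

Because log-likelihoods are negative, the greatest score is the one closest to zero, which is why max rather than min is used for the selection.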

Ethics Statements
The work involving insect invertebrates detailed herein complied with ARRIVE guidelines and the National Institutes of Health guide for the care and use of laboratory animals (NIH Publications No. 8023, revised 1978).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.