Population genetic structure and intraspecific genetic distance of Periplaneta americana (Blattodea: Blattidae) based on mitochondrial and nuclear DNA markers

Abstract The American cockroach (Periplaneta americana) is a globally invasive pest that can cause significant economic loss and threaten human health. Although it is abundant and lives in close proximity to humans, few studies have investigated the genetic diversity of P. americana. Our study analyzed 1,053 P. americana and other Periplaneta species' samples from different locations in China and the United States. A traditional tree‐based method using 17 unique mitochondrial COI haplotypes of P. americana and 20 haplotypes of the other Periplaneta species accurately identified P. americana with a barcoding threshold of 5.1%. To identify the population genetic structure of P. americana, we investigated wingless gene and pooled them with obtained mtDNA data for a combined analysis. Although the genetic diversity of the USA group was relatively higher than the China group, the number of haplotypes and alleles of both groups was small. The analysis of molecular variance (AMOVA), intraspecific phylogeny, and haplotype networks indicated that P. americana had very little global genetic differentiation. The weak geographic genetic structure might reflect the human‐mediated dispersal of P. americana. Despite no apparent phylogeographic assignment of mtDNA and nuclear lineages was observed in both BI trees, the integrated COI sequence data identified four distinct P. americana haplotype groups, showing four ancient maternal lineages of P. americana in China and the United States.

P. americana is the most abundant and widely distributed species within Periplaneta (Roth & Willis, 1960).
Accurate taxonomic identification is the cornerstone of developing management strategies for invasive species. However, traditional methods to identify P. americana and other Periplaneta based on morphological characteristics have been problematic due to highly similar external morphology (Evangelista, Buss, & Ware, 2013), high degree of polymorphism between adults and juveniles (Evangelista, Bourne, & Ware, 2014), and sexual dimorphism (Che, Gui, Lo, Ritchie, & Wang, 2017;Evangelista et al., 2014). Therefore, it would be indispensable to apply a rapid and effective molecular identification method, such as using the mitogenome, to complement the morphological taxonomy of P. americana. The mitogenome is characterized by its maternal inheritance, nonintrons, and rapid evolution (Cameron, 2014). The application of a short standardized mitochondrial cytochrome c oxidase I (COI) 5′ region has been highly successful in a wide range of insect taxa (Che et al., 2017;Talavera, Muñoz-Muñoz, Verdún, & Pagès, 2017;Versteirt et al., 2015).
Several different methods using molecular makers have been put forward to identify distinct species. Traditional DNA barcoding calculates intra-/interspecific genetic distances and constructs neighbor-joining (NJ) tree for species delimitation. The clustered clades in a phylogenetic tree and the existence of the barcoding gap are interpreted as distinct species (Ni, Li, Kong, Huang, & Li, 2012). In addition to traditional barcoding analysis (NJ analysis), generalized mixed Yule-Coalescent (GMYC) is also a popular approach for species identification based on single-locus data, which estimates species boundaries from branching rates in a phylogenic tree (Fujisawa & Barraclough, 2013). Another method, automatic barcode gap discovery (ABGD), automatically sorts sequences into hypothetical species based on the barcode gap (Puillandre, Lambert, Brouillet, & Achaz, 2012). A combination of methods can be helpful to evaluate the efficiency of DNA barcoding for P. americana identification.
Intraspecific diversity studies, using both mitochondrial and nuclear markers, enable us to evaluate the population genetic structure and genetic diversity of a species (Ferronato et al., 2019;Johnson, Morton, Schemerhorn, & Shukle, 2011;Roman, 2006;Wang et al., 2009). Although P. americana is an urban pest worldwide, there are limited studies on the genetic variety and population structure of P. americana. One study, based on multiple samples from eastern United States, suggested that "P. americana individuals from three or more historically isolated geographic populations are now effectively merged into a single global gene pool" (von Beeren, Stoeckle, Xia, Burke, & Kronauer, 2015). Understanding the population structure F I G U R E 1 Distribution and sampling localities of cockroaches analyzed in this work. Numbers for sampling localities are as indicated in Table S1 and genetic variety of invasive species across different continents may help to comprehend the possible pathways and invasion history of P. americana, but to our knowledge, there has been no examination of the genetic variation of P. americana groups in China.
There are two objectives that are pivotal to this study. Firstly, the COI barcode region of a broader geographic range of P. americana and other Periplaneta species collected in China and the United States was analyzed using traditional tree-based, ABGD, and GMYC species delimitation methods. Our intention was to observe which of these methods best corresponds to morphological species concepts in Periplaneta and the levels of intraspecific rate of genetic variation that exists within P. americana. Our second objective was to assess the genetic diversity and genetic structure of P. americana specimens from 18 sites in China using both wingless and COI markers. We then integrated the previously registered P. americana sequences (von Beeren et al., 2015) into our genetic data and reanalyzed the dataset in order to compare the phylogenetic structure of P. americana in China and the Americas.
Morphological species identification was carried out by using the taxonomic keys for the cockroaches (Liu, Zhu, Da, & Wang, 2017;Robinson, 2005). Specimens of nymphs were excluded for lack of discernible morphological characters.

| DNA extraction, amplification, and sequencing
Total DNA was extracted from muscle tissue using the TsingKe Genomic DNA kit (TsingKe). The partial sequences of mitochondrial COI gene (658 bp) and nuclear wingless gene (378 bp) were amplified by PCR with the primers LCO1490/HCO2198 (Folmer, Black, Hoeh, Lutz, & Vrijenhoek, 1994) and wg550F/wgcockR (von Beeren et al., 2015). Not all the COI genes of specimens were successfully sequenced by Folmer's universal primers because PCR always co-amplified a large number of pseudogenes, which can be due to bimodal sequencing (Song, Moulton, & Whiting, 2014). We designed a set of primers to eliminate nontarget DNA sequencing. The primers QF/ QR, HF/HR, and AUSF/AUSR were designed specifically for P. americana, P. fuliginosa, and P. australasiae, respectively (Table 1). Our specific primers were effective in preventing failed amplifications.

| Marker summary statistics and intrapopulation genetic diversity
In total, mitochondrial COI sequences of 853 cockroach specimens (including 563 P. americana specimens, 235 P. fuliginosa specimens, 52 P. australasiae specimens, and 3 P. brunnea specimens) were successfully amplified. was used to assemble sequences. Sequences were aligned using the program CLUSTAL W in MEGA 5.0 (Tamura et al., 2011). DNA sequences were checked visually and translated to DNA codons to avoid pseudogenes (Zhang & Hewitt, 1996). The number of conserved, variable, parsimony-informative sites and singletons were assessed in MEGA 5.0. DnaSP 5.10.01 (Librado & Rozas, 2009)

| Neighbor-joining clustering and species delimitation approaches
The neighbor-joining (NJ) tree was performed with bootstrap analy- for inferring putative species. The pairwise genetic distances were ranked from smallest to largest to detect the barcoding gap. ABGD uses the first significant gap beyond one-side confidence limit to partition the data and then recursively applies inference of the limit and gap detection to obtain finer partitions until no further partition can be detected. This method was implemented online (http:// wwwabi.snv.jussi eu.fr/publi c/abgd/) with default parameters (P min = 0.001, P max = 0.1, Steps = 10, number of bins = 20, distance method = Kimura).

| Population genetic structure
Calculation of Fixation Index (F ST ) and analysis of molecular variance (AMOVA) were performed using ARLEQUIN v3.5 (Excoffier & Lischer, 2010). The N m values were calculated to measure population contact as migrating reproductive individuals per generation and the equation used was N m = (1 − F ST )/4F ST . These analyses were conducted to assess genetic variation according to geographic distribution. The spanning network of COI haplotypes was constructed using TCS 1.21 at 95% confidence level (Clement, Posada, & Crandall, 2000) to study the relationships between haplotypes and their geographic distribution.
Phylogenetic relationships between P. americana groups based on nuclear and mitochondrial DNA were estimated using a Bayesian approach. The best-fitting model for BI analysis was calculated using Modeltest ver. 3.7 (Posada & Crandall, 1998) under Akaike information criterion (AIC; Akaike, 1974). The best-fit substitution model selected was GRT + I+G (N ST = 6, Rates = gamma) for COI sequences and HKY (N ST = 2, Rates = equal) for wingless sequences.

| Variations in nucleotide sequences
In total, we analyzed the genetic variability of P. americana mito- The average nucleotide composition of those sequences was 33.1% T, 18.8% C, 32.0% A, and 16.0% G. A + T (65.1%) was present in a much higher proportion than G + C (34.8%), as it is usual for insects (Simon, Buckley, Frati, Stewart, & Beckenbach, 2006
However, both ABGD and GMYC methods generated incongruent genetic lineages when compared with the traditional barcoding analysis (Figure 2a). The 20 recursive steps in the ABGD analysis resulted in ten different sequence partitions. The recursive partition produced one and 11 groups (=species), while four, nine, and seven groups in the initial partition ( Figure 2c). Too high or too low prior intraspecific divergence would underestimate or overestimate the number of species (Puillandre et al., 2012). Therefore, we decided to report only primary partitions in the output of ABGD with p value between 0.77% and 5.99% (no group was predicted by recursive partitions with p value between 0.77% and 5.99%).
P. brunnea, P. fuliginosa, P. australasiae, P. japonica, and P. sp can be distinguished with prior intraspecific divergence between 0.77% and 1.29%. However, P. americana had two genetic groups with prior genetic distance thresholds between 0.77% and 1.29%. Using the values between 2.15% and 5.99%, the species were partitioned into four groups (Figure 2a). P. fuliginosa, P. australasiae, and P. sp. were classified within the same OTU. GMYC model analysis yielded the same number of seven species as with the ABGD model (primary partitions with p value between 0.77% and 1.29%; Figure 2a; Figure   S1). P. americana was split into two species, while the other five Periplaneta species were represented as a single species.

| Phylogenetic and network analyses
We performed phylogenetic analyses of P. americana using the BI method based on COI and wingless genes to explore the geographic  Allele frequencies  55  61  85  124  138  151  184  298  331  355  376 Allele

| Population genetic structure
Using the dataset of mtDNA, genetic divergence (F ST ) and per-gen-

| Suitable analysis method for P. americana identification
It is difficult to identify cockroaches by morphology due to several factors including phenotypic plasticity, developmental stochasticity, and sexual dimorphism. The molecular approach provides useful information that can be used for both identifying and defining the boundaries of species (Evangelista et al., 2013;Ruiz-Lopez et al., 2012). OTUs obtained by applying the three methods (traditional tree-based, ABGD, and GMYC) in our barcode data were compared. Traditional tree-based delimitation approach with a 5.1% barcode gap was found to be more reliable and consistent in the identification of morphospecies when compared to the GMYC model and ABGD approach. GMYC analysis appeared to wrongly separate the P. americana morphospecies into two groupings ( Figure 2a). A recent study (Camelier, Menezes, Costa-Silva, & Oliveira, 2018) showed that GMYC typically generated a high number of OTUs than other methods. Errors in the ultrametric tree that underpins the analysis will lead to erroneous species identification (Zou et al., 2016).
Similar to the GMYC method, ABGD also oversplit P. americana into two candidate species with prior genetic distance thresholds between 0.77% and 1.29%. ABGD is a delimitation method based on genetic distances, intraspecific genetic variation and interspecific genetic divergence to congeners and would influence its delimitation accuracy (Pinto et al., 2015). P. americana samples in this study represent a mix of individuals from different locations within China and the United States. The maximum intraspecific COI sequence divergence within P. americana (5.1%) led to an overestimation of species diversity by the ABGD method (Hamilton, Hendrixson, Brewer, & Bond, 2014). When the p value ranged between 2.15% and 5.99%, ABGD placed P. fuliginosa, P. australasiae, and P. sp into a single candidate species and this was incongruent with the morphological evidence. Thus, the species delimited from ABGD analysis might be incorrect for the genetic distance threshold and can be produced by grouping closely related species into a single cluster, or even by separating relatively deep divergent
TA B L E 3 General barcode information and genetic variation (%) of COI barcodes haplotypes within (intra) and between (inter) Periplaneta species included in this study populations into multiple clusters (Yu, Rao, Matsui, & Yang, 2017).
ABGD is indeed influenced by the mix of shallow and deep genetic divergences.
We found that traditional tree-based approach with a 5.1% barcode gap was a much better analytical method for P. americana identification. However, many researchers argued that DNA barcoding approaches are imperfect, and they cannot be used in species discovery and identification (Meyer & Paulay, 2005;Will & Rubinoff, 2004). No approach is the panacea to this problem. The shortcomings of DNA barcoding often mirror those approaches that rely strictly on morphological characteristics (Hamilton et al., 2014). The 5.1% intraspecific genetic variation within P. americana could accurately and effectively distinguish P. americana from other Periplaneta species in this study. This cutoff value could be applied to later research when diverse species are included, and as a supplement to traditional taxonomic techniques.

| Low sequence variability and no genetic structure of P. americana populations
Population genetic structure and genetic diversity can provide important biological information for the study of invasive species (Wongsa, Duangphakdee, & Rattanawannee, 2017). However, there are limited studies on the population genetic structure of cockroaches (Cloarec, Rivault, & Cariou, 1999;Jaramillo-Ramirez, Cárdenas-Henao, González-Obando, & Rosero-Galindo, 2010;Vargo et al., 2014). Intraspecific phylogeny and haplotype networks all indicated the presence of very little genetic structure in P. americana (Figures 3 and 4). This result was in agreement with what was expected in the German cockroach (Blattella germanica; Vargo et al., 2014) and suggested that P. americana should be considered a global panmictic population (Troast, Suhling, Jinguji, Sahlén, & Ware, 2016). As one of the most widespread invasive insects, lack F I G U R E 3 The Bayesian tree of Periplaneta americana from China (18 sampling sites) and the USA groups samples based on analyses of the mt COI (a) and wingless genes (b). The posterior probabilities exceeding 50% are shown above the nodes. The labels in COI tree include group ID and haplotype codes. Colors depict different COI haplogroups. For each individual, its two nuclear alleles are coded using different numbers. Colors on the wingless tree clades give the individuals belong to the same mitochondrial haplogroups. The square and the circle represent samples from the United States and China, respectively of population genetic structure of P. americana was likely due to a high reproductive rate and the human-mediated range expansion of P. americana (Gonçalves et al., 2019;Vargo et al., 2014). AMOVA results for COI sequences in P. americana showed that most (>72%) of the genetic variation occurred within sampling sites and countries ( This high population admixture would lead to a wide spread of alleles and promote an increase in genetic diversity of invasive species (Gonçalves et al., 2019).
Unexpectedly, the relatively low genetic diversity within P. americana populations was observed in both mtDNA and nDNA markers.
The analysis of COI and wingless fragments defined only 17 haplotypes from 741 individuals and 7 alleles from 113 individuals, respectively. Low population variation is common in other insects (Mazur et al., 2016;Žitko, Kovaćić, Desdevises, & Puizina, 2011). Several aspects can explain such a pattern of genetic variation: the small size of the founding populations (Žitko et al., 2011); the low genetic diversity in the original source populations (Vargo et al., 2014); and extensive insect control measures involving insecticides and source reduction (Prijović et al., 2014). Distinguishing between those possibilities will require the genetic characterization of one or more populations of P. americana native to Africa. However, the successful distribution of P. americana around the globe shows that invasive species with low genetic diversity can also be widely distributed and spread explosively (Wang, Li, & Wang, 2005).

| Four main COI haplogroups of P. americana
Although no apparent phylogeographic assignment of mtDNA and nuclear lineages was observed in both BI trees, phylogenetic analyses based on P. americana COI haplotypes showed four divergent COI haplogroups (Figure 3). Mitochondrial DNA is inherited only through the maternal cytoplasm. Therefore, these four branches provide a record of the ancient maternal lineage of P. americana.
In contrast to mtDNA, there is recombination in autosomal DNA.
Recombination would distort the information on evolutionary history carried by the DNA sequence (Zhang & Hewitt, 2003), causing the discordance between the COI and wingless phylogenetic trees (Sota & Sasabe, 2006). The difference in rates of evolution between the mtDNA and nDNA may also be the reason of mito-nuclear discordance. In insects, the evolution rates of mtDNA are estimated to be 2 ~ 9 times faster than nuclear protein-coding regions (Lin & Danforth, 2004), and it may be insufficient to indicate phylogeographical patterns using nDNA with relatively low variation when compared to mtDNA (Hickerson & Cunningham, 2005 (Inoue et al., 2013;Prijović et al., 2014;Qin et al., 2016). Haplotypes of the USA group were divided into three branches A, B, and D, while the Chinese group was divided into B, C, and D clades (Figure 3a). It appeared that groups of P. americana in China and the United States were each from three genetically divergent source groups, with a total of four clades.
However, the rate of divergence for COI of Blattodea taxa was unknown, and a "universal" clock rate would cause error estimation in divergence time (Pfeiler, Bitler, Ramsey, Palacios-Cardiel, & Markow, 2006). In future studies, a molecular clock should be applied to P. americana to estimate the ages of population genetic divergences in this species. Additionally, haplogroups B and D were shared between the USA and Chinese groups, as phylogenetic analysis did not exhibit obvious association between the haplotype phylogeny and geographic distribution (Figure 3). This may indicate a combination of historical admixture between groups (von Beeren et al., 2015), which is consistent with the centurylong global migration of this species (Schal, 2011). Frequent global trade and human-mediated events likely presented advantageous conditions for the long-distance dispersal of P. americana, which is considered to be ongoing.

| CON CLUS ION
This is the first study based on the COI gene to analyze the genetic these nongeographic groupings, our study showed that traditional tree-based methods could accurately identify P. americana with a barcoding threshold of 5.1%. We believe that the mitochondrial COI gene can be effectively used in studying intra-and interspecific divergences of cockroaches.

ACK N OWLED G M ENTS
We sincerely appreciate Wenbo Zhang, Dr. Chuang Zhou, Shilin He, Dr. Ting Huang, and Dr. Wujiao Li at Sichuan University for the sample collection. We also thank Professor Timothy Moermond, Dr. Megan Price, Dr. Chao Du, Dr. Jake George James, and Dr.
Ting Huang for editing the manuscript and thank the anonymous reviewers for insightful comments. The research was funded by the Department of Science and Technology of Sichuan Province (2017SZ0019).

CO N FLI C T O F I NTE R E S T
None declared.

AUTH O R CO NTR I B UTI O N S
Zhang X conceived the project. Ma J and Liu J collected samples.
Ma J, Liu J, and Shen Y performed the experiments and analyzed the data. Ma J, Fan Z, Zhang X, and Yue B wrote the manuscript with help from all of the authors.

DATA AVA I L A B I L I T Y S TAT E M E N T
All COI and wingless sequences used in this study were depos-