Dispersal route of the Asian house rat (Rattus tanezumi) on mainland China: insights from microsatellite and mitochondrial DNA

Rattus tanezumi is a common commensal rat and an important host animal of bubonic plague in South China and Southeast Asia. The northward dispersal of this species in mainland China has been reported in recent decades, along with more recent intercontinental expansion. Population genetics of R. tanezumi in mainland China were studied to explain the relationship between dispersal history and the ancient and modern transportation networks of China. In total, 502 individuals belonging to 18 populations were collected from 13 provincial areas. Nine microsatellite loci and two mtDNA sequences were analyzed. The results indicate that R. tanezumi populations from Yunnan have highest genetic diversity and populations from Tibet with lowest genetic diversity. 18 populations can be divided into four clusters, the first cluster including populations from southwest Yunnan, the second including two populations of Tibet, the third for populations from middle and east of mainland China, and the forth for two populations from north Yunnan. Both microsatellite and mtDNA data reveal that the populations from coastal areas are closely related to populations from Yunnan, whereas populations from Tibet are closely related with populations from Sichuan. The results suggest that early dispersal of R. tanezumi in mainland China depended on shipping transportation, with subsequent expansion from coastal areas into Central China occurring along the Yangzi River. Further, the linkages between populations in Tibet and Sichuan point to a modern era introduction via the Chuan-Zang highway, rather than along the Tea Horse Ancient Road.


Background
The Asian house rat, Rattus tanezumi (Temminck, 1844) (Rodentia: Muridae), is a common commensal rat in East and Southeast Asia. From South China to Southeast Asia, R. tanezumi can be found in indoor and outdoor habitats. In outdoor habitats, it is the dominant pest rodent in rice fields, causing crop damage before harvest [1,2]. After the harvest season, R. tanezumi will migrate to residential areas and even indoor areas for food [3], causing household damage [4]. Because of its wide range of habitats and these seasonal migrations, R. tanezumi can carry a number of different pathogens including zoonotic agents of public health concern. Thus, R. tanezumi is regarded not only as a pest related to agriculture but also as a host of several zoonoses, including plague (Yersinia pestis), haemorrhagic fever with renal syndrome (Hantaan virus), and leptospirosis (Leptospira) [5][6][7].
The range expansion of R. tanezumi was not noted until recently after Huang et al. reported the northern border of the distribution area of R. tanezumi in China as the Yellow River [8]. From the 1990s on, R. tanezumi was found in provinces north of the Yellow River, including Hebei [9] and Shanxi [10], and in 2012 Ma et al. reported the first record of R. tanezumi northeast of Qinghai, a northwestern province of China [11]. The colonization of R. tanezumi in north China was easily recognized because this species can be distinguished from another commensal rat, R. norvegicus, with morphological characters. However, the expansion of R. tanezumi out of Asia was not recognized until genetic methods were used in congeneric species identification [12] because of the morphological similarities among members of the R. rattus species group. With molecular genetic methods, Bastos et al reported the presence of R. tanezumi in 13 locations in South Africa [13]. Mitochondrial haplotypes of R. tanezumi were also found in California in the United States and in Madagascar [14]. Based on these and additional data, Aplin et al inferred the prehistory expansion and the global distribution of R. tanezumi [15].
The extension of the geographic range of commensal rats is considered a result of human activity. The phylogeography of R. rattus in the western Indian Ocean is congruent with human colonization events in this area [16]. The distribution of R. exulans is also a good example and has been used to imply the history of human colonization on Pacific islands [17,18]. Rattus norvegicus was not reported at Xinjiang before the 1970s because of the Gobi Desert between Xinjiang and Central China. In the 1970s, colonies of R. norvegicus were first found in railway stations of Xinjiang, after the railway was built [19]. From then on, R. norvegicus spread to other cities of Xinjiang with railway and highway traffic. Because the dispersion of R. tanezumi was reported only in recent decades, the dispersal routes of this species and how these relate to human activity is not known.
In ancient China, because of the complex landscape of Southwest China, commercial transportation in this area (including Sichuan, Yunnan, Shaanxi and Tibet) usually depended on caravans. Starting approximately 1200 years ago (Tang Dynasty), a famous caravan road system for tea, salt and horse trading developed in Southwest China, which is now called the Tea Horse Ancient Road. This road system linked Sichuan, Yunnan and Tibet (Fig. 1a) and became a unique route for trade between Tibet and Central China [20]. Another important transportation system in ancient China was shipping via sea, river and canal. Shipping transportation has been used on the Yangzi River and Pearl River for more than two thousand years. During the twentieth century, a modern highway system was gradually constructed in Southwest China, and modern transportation based on the automobile and highway system became a primary method of commodity trade. An important highway linking Tibet and Central China is the Chuan-Zang Highway, which starts from Chengdu, the capital city of Sichuan, and ends at Lhasa in Tibet (Fig. 1a). Whether and how these transportation systems affected the migration of R. tanezumi on mainland China, especially from Yunnan to Tibet, could be explained by population genetics analysis of rats.
In this study, we collected samples of R. tanezumi from mainland China and examined the genetic structure of different populations with microsatellite and mitochondrial markers. Based on the results of these analyses, we aimed to explain the expansion of R. tanezumi through mainland China, the possible expansion routes of R. tanezumi through mainland China, including the Tea Horse Ancient Road for caravans, shipping transportation along the rivers and coastline of China, and transportation along modern highway and railway systems.

Sampled populations and DNA extraction
Samples belonging to 18 populations were collected at 22 locations (N = 10-39 rats per location) in 13 provinces covering almost the entire range of R. tanezumi in China ( Fig. 1a and Table 1). Samples of four populations (SHH, CHQ, SIC and QNJ) consisted of specimens pooled from two sites in close proximity. Rats were captured with living traps in rice fields and farms after getting verbal consents from owners. Individuals of R. tanezumi were selected and carried to the laboratory. Compressed carbon dioxide in gas cylinder was used to euthanize the rats. For the safety the flow rate of CO 2 was controlled at 9 l per minute. Rats were euthanized in plastic bag with CO 2 for ten minutes, and the death of them was confirmed with checking the heartbeat. Muscle or liver tissue was collected and stored in 95% ethanol at − 80°C until DNA extraction according to the handbook of the DNeasy Blood & Tissue kit (Qiagen, GER).

Microsatellite analysis
All samples were genotyped for 9 microsatellite loci developed for this research (Additional file 1: Table S1). DNA was amplified in a thermocycler (SensoQuest, GER) in 25-μL reaction mixes containing 1 μL of total DNA (about 30 μg), 2.5 μL of 10× LA PCR Buffer II (Mg 2+ Plus) (Takara, JPN), 2.5 mM of each dNTP (Takara, JPN), 10 pmol of each primer, and 0.75 U of LA Taq polymerase (Takara, JPN). The 5′ end of the forward primer was labelled with a fluorescent dye (FAM, HEX, and ROX). Cycling conditions were as follows: 5 min at 95°C, followed by 40 cycles of 30 s at 95°C, 30 s at the annealing temperature (TA), and 30 s at 72°C, and then 5 min at 72°C. The length of the DNA fragments was analyzed manually in GeneMapper v. 4.0 (Applied Biosystems).
The genetic diversity of every locus was characterized by estimating number of different alleles (N a ), number of effective alleles (N e ), Shannon's information index (I), number of private and locally common alleles, allele richness (Rs), observed heterozygosity (H o ), unbiased expected heterozygosity (H e ) and positive inbreeding coefficient (F IS ) using Fstat V2.9.3.3 (https://www2.unil.ch/ popgen/softwares/fstat.htm) and GenAlEx 6.41 [21]. Student's t-Test was used to determine whether two sets of data were significantly different from each other in SPSS 16.0 [22]. Fstat was also used to test for conformance of the genotype to Hardy-Weinberg equilibrium (HWE) for each locus and each population and to explore linkage disequilibrium (LD) between pairs of loci. For all nine loci, Microchecker V2.2.3 [23] was used to detect the presence of null alleles and genotyping errors such as large allele dropout or stuttering using 1000 randomizations. The null-allele-adjusted dataset was compared to the original dataset to explore the effect of null alleles on estimate of genetics differentiation.
Isolation by distance (IBD) was estimated with Mantel's test in Web Service of IBD [24], using the correlation between genetic and geographic distances by the regression of pairwise F ST /(1-F ST ) on the natural logarithm (Ln) of straight-line geographic distance. Two approaches were used for testing genetic bottlenecks at the microsatellite loci for each population, we calculated M, a ratio based on the number of alleles to the allelic size range [25] in Arle-quin3.5 [26], and heterozygosity excess tests implemented in Bottleneck V1.2.02 [27] to assess deviation from Mutation-Drift Equilibrium. The tests were performed using the Infinite Allele Model (IAM) [28], Stepwise Mutation Model (SMM) [29], and Two-Phase Model (TPM) [30], which were run twice for each population, assuming that the percentage of stepwise mutations was 80%. Statistical significance of the tests were assessed for each population across all loci by the Wilcoxon signed-rank test available in Bottleneck [31].
To identify the population genetic structure of R. tanezumi, two Bayesian model-based cluster analyses were contrasted: Structure [32] and Geneland [33]. Structure 2.3.4 [32] was first used to infer the number of clusters (Κ) in the dataset without prior information of the sampling locations. A model in which allele frequencies correlated within populations was assumed (λ was set at 1, the default value). The software was run with the option of admixture, allowing for some mixed ancestry within individuals, and α was allowed to vary. We used 20 independent runs for each value of Κ (K = 1 to 10), with a burn-in period of 50,000 iterations and 250,000 replications. The method of Evanno et al. [34] was used to determine the most likely number of clusters. This approach uses an ad hoc quantity, ΔK, based on the second-order rate of change of the likelihood function between successive values of K. The results of 20 replicate runs for each value of K were combined using the Greedy algorithm of Clumpp 1.1.1 [35], and summary outputs for each value of K were displayed graphically using Distruct v1.1 [36]. Differentiation among clusters was estimated with Wright's F-statistics [37] computed in Fstat according to Weir and Cockerham [38]. Pairwise F ST for statistical significance was also tested in Fstat to evaluate genetic differentiation between populations. The number of putative migrants (N m ) per generation between populations was esti- [39]. Critical significance levels for multiple testing were corrected using Bonferroni correction [40].
The second method relied on the spatial cluster model implemented in the Geneland package [33,41]. Following the user's manual recommendations, the Markov chain Monte Carlo (MCMC) repetitions were set to 100,000, the thinning was set to 100 and the burn-in period was set to 100 (we eliminated the first 100 iterations when the curve was not constant); the number of groups (K) to be tested was set from 1 ± 7. Each individual was assigned to one of K populations (1 ≤ K ≤ 7) based on its multilocus genotypes and spatial coordinates. To confirm that the run was long enough, we obtained 10 different runs and compared the parameter estimates (K, individual population membership, maps). The best result was chosen based on the highest average posterior density. The Neighbour-Joining (NJ) tree was computed with Powermarker 3.25 [42], based on microsatellite Nei's genetic distance [43]. The robustness of population trees was evaluated by bootstrapping over loci. One thousand bootstrapped trees were summarized to obtain a consensus tree, using the Consens module in the Phylip package [44], followed by editing with Treeview [45].

Mitochondrial DNA analysis
In total 502 individuals were used to explore sequence polymorphism in the mitochondrial COI gene and D-loop region. The mitochondrial COI gene was used not only in population genetics analysis bust also as DNA barcoding at first in specimens identification. DNA samples were used as templates to amplify a 702-bp fragment of COI and a 546-bp fragment of D-loop according to Robins et al. [12]. PCR products were sequenced directly in two directions with an ABI 3100 automatic sequencer (Perkin-Elmer, Waltham, Massachusetts) using the ABi PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit with AmpliTaq DNA polymerase (Applied Biosystems, Foster City, California).
Sequences were aligned using Muscle [46] within MEGA 6.0 [47]. Basic sequence statistics, including the number of segregating sites (S), the number of haplotypes per sample (H p ), haplotype diversity (H p D), the average number of nucleotide differences (K) and nucleotide diversity (π) were computed with DnaSP v5 [48]. Neutrality tests used the statistics from the Tajima test, Fu and Li test and Fu test in DnaSP v5.

Microsatellite analysis Genetic diversity
A total of 502 individual rats collected in this study from 18 populations were successfully genotyped at all 9 microsatellite loci. All loci were polymorphic, showing the number of alleles (N a ) as 21 (Rt1), 26 (Rt2), 27 (Rt3), 33 (Rt4), 16 (Rt5), 19 (Rt6), 26 (Rt7), 30 (Rt8) and 24 (Rt9) (Additional file 3: Table S3). The average number of alleles per locus ranged from 9.837 (Rt1) to 14.682 (Rt8). The minimum mean number of alleles of all loci was recorded for the NYC population from Tibet (6.56) and the maximum in the RUL population from Yunnan (14.67). For allele richness, the maximum R s in RUL (12.353) was almost twice the minimum R s in the LHS population from Tibet (6.14). The observed heterozygosity (H o ) for each locus ranged from 0.28 (Rt4) to 1.00 (Rt9); the minimum H o was in LHS (0.58) and the maximum H o in CHQ (0.82) ( Table 2). The expected heterozygosity (H e ) for each locus ranged from 0.37 (Rt1) to 0.94 (Rt4); the minimum H e was in LHS (0.67) and the maximum H e in RUL (0.91) ( Table 2). The same situation also occurred in N e and I: the minimum value was found in LHS of Tibet and the maximum value was in RUL of Yunnan. Private alleles were mainly detected in the LIH, QNJ, BAS and RUL populations and were not found in SJZ, XIS, SHH, CHQ, SIC, NYC, LHS and XIY populations. Student's t-Test showed that the mean values of H e and R s for Yunnan's and other region's populations had statistically significant differences (H e : P = 0.022; R s : P = 0.005). In general, populations in Yunnan showed a higher genetic diversity than other regions, while the genetic variability of populations in the Tibetan areas and SJZ was low.
Some specimens failed to amplify at one locus while succeeded at the remaining loci, suggesting the presence of null alleles. Null alleles were present in only a few populations for three loci, there were one individual at locus Rt6, two individuals at locus Rt5 and four individuals at locus Rt7. To determine whether the null alleles impacted the population genetic analyses, we performed these analyses both before and after the dataset were adjusted for estimated null allele frequencies. The effect of this treatment was minimal and did not significantly change the degree or statistical significance of the estimated parameters.

Hardy-Weinberg equilibrium and linkage disequilibrium
The Hardy-Weinberg exact tests were performed at nine loci. No locus was in HWE for all of samples assayed. However, at the population level, 28 out of 162 (17.28%) comparisons did not conform to Hardy-Weinberg expectations after sequential Bonferroni correction, and the deviations were associated with a positive inbreeding coefficient (F IS ), reflecting heterozygosity deficits. The populations in Yunnan (Southwest China) had the highest number of loci in departure from HWE (21 out of 28), LOY sample had 3 loci in departure from HWE, WZS sample had 2 loci in departure from HWE, SHH and LEZ populations had 1 locus in departure from HWE respectively (Table 3).
Fisher's exact tests were conducted for linkage disequilibrium within all populations. Only 4 pairs out of 648 comparisons (0.62%) were at linkage disequilibrium after sequential Bonferroni correction, and all three such pairs were found in SHH and LOY populations (Rt1/Rt3, Rt8/ Rt9, Rt1/Rt8). No pair of loci appeared in LD in more than one population, suggesting genetic independence between loci.

Demography
Tests for genetic bottlenecks yielded mixed results ( Table 2). The M ratio was low and indicated historical bottleneck in all populations. Significant deviations from mutation-drift equilibrium conditions under the IAM model were found in all populations except the SJZ, CHS, and LHS populations. In contrast, the test for each population gave non-significant results under the SMM  Bold characters denote a significant heterozygote deficiency (p < 0.05) after correction for multiple testing by the sequential Bonferroni procedure  (Table 3 and Fig. 2a), which indicated the coexistence of two gene pools in these two populations.
Moderate levels of genetic differentiation were found between populations. The overall F ST across all loci was 0.064 and was highly significant (P < 0.001). We then tested the genetic heterogeneity among populations within and between the clusters. We chose populations in which more than 70% of individuals were assigned to cluster I or II for analysis. Cluster I included 5 populations (LIH, XGL, QNJ, BAS and RUL), while cluster II included 9 populations (QUZ, SHH, CHQ, XIS, CHS, SJZ, and XIY). Within each cluster, pairwise F ST was moderate: 0.020 (CHQ-CHS) to 0.077 SHH-SIC) in cluster I and 0.022 (LIH-RUL) to 0.048 (LIH-XGL) in cluster  Table S4).
Analysis using Geneland yielded a modal number of populations between 2 and 7, with a higher proportion of K = 4 (posterior probability = 0.52). The run with the highest average posterior density was selected (− 14,771.26). Cluster 1 included three populations from Southwest of Yunnan (RUL, BAS and LIH), and cluster 4 included two populations from North of Yunnan (XGL and QNL). Cluster 2 was constituted by populations from Tibet (LHS and NYC). And cluster 3 was composed of other populations from south, east and middle of China (Fig. 2b).
Tests of isolation by distance were performed for each population cluster and for all the populations together. For all populations, weak correlations were detected between genetic and geographic distances (R 2 = 0.111, P = 0.025), while for cluster I and II, the correlations were stronger (R 2 = 0.527, P < 0.01). The correlations for cluster III were not significant (R 2 = 0.209, P = 0.17). These results suggest that geographic distance weakly contributes to the genetic differentiation in R. tanezumi populations.
The unrooted NJ dendrogram based on Nei's distance depicted the structure of population differentiation (Fig. 1b). It demonstrated that the populations of Yunnan clustered first and then showed a close relationship with populations from coastal areas (LOY of Zhejiang, LEZ of Guangdong, WZS of Hainan, and QUZ of Fujian). At the other end of the tree were populations from Central China and Tibet (LHS and NYC).

DNA sequencing analysis MtDNA polymorphism and divergence
Nucleotide sequences of mtDNA COI (702 bp) and D-loop (546 bp) were obtained for 502 R. tanezumi specimens from 18 populations. Overall, the polymorphism in the COI region was moderate, with 37 segregating sites defining 38 haplotypes. The nucleotide diversity indices of the XGL population from Yunnan were the highest (HpD = 0.842, K = 3.458, π = 0.00493), while the lowest indices occurred in the WZS population from Hainan (HpD = 0.061, K = 0.061, π = 0.00009). The SHH and LHS populations from Shanghai and Tibet also showed lower genetic diversity. The most frequent haplotype, H1 (48.1%), was detected in all geographic populations except LIH and RUL. For the D-loop region, the polymorphism was much higher than in COI. There were 60 segregating sites defining 70 haplotypes. The highest nucleotide diversity indices were similarly among the QNJ population in Yunnan (HpD = 0.905, K = 11.315, π = 0.02076), while the lowest indices occurred in the XIS population from Hubei (HpD = 0.148, K = 0.151, π = 0.00028) ( Table 4). Populations in Tibet (NYC and LHS) clearly manifested low genetic polymorphism compared with other regions. In common with the COI results, the most frequent D-loop region haplotype, H1 (44.5%), was detected in all geographic populations but was absent from LIH and RUL, as well as BAS.

Phylogenetic analysis
The Median-Joining Networks show a significant geographical component to variation. For COI, the network shows three distinct haplogroups. Haplotypes from non-Yunnan regions form a star-like pattern centred at H1, and haplotypes from southwest and northeast regions of Yunnan form two star-like patterns centred around H10 and H17, respectively. Furthermore, H4 (only found in LEZ) and H24 (found in both QNJ and LOY) are most closely related to centre H17 (Fig. 3b, Additional file 5: Table S5 for details). The structure of the network for D-loop (Additional file 5: Table S5) is approximately equivalent to that for COI, but the network is more complicated.
For the mitochondrial COI data, the BI analysis indicated that the whole haplotypes, including haplotypes of reported COI sequences, were separated into three middle supported clades (Fig. 3a). These clades were named after the three centre haplotypes of the networks (Fig. 3b). The H1 and H10 clades were more closely related than to the h17 clade (Fig. 3a). These reported sequences from Southeast Asia and Africa did not share haplotypes with sequences of present research.

Population genetics of R. tanezumi
In Yunnan R. tanezumi is controlled almost every year in either urban or rural areas because it is host of Yersinia pestis, the pathogen of bubonic plague. As a result, the high departure from HWE were observed in all populations from Yunnan, which indicated a population subdivision ( Table 3). The results of genetic bottlenecks test (M ratio test and IAM modle) also suggested that bottleneck occurred in most of populations after control with high pressure.
Rattus tanezumi population was first found at Shijiazhuang of Hebei at 2004 [9]. The genetic diversity of the population in present research showed that SJZ population and Tibet populations had lower genetic variability (Rs and He in Table 2). Because of the short colonization history of SJZ population, the low genetic diversity could be attributed to the founder effect. The low genetic diversity of populations from Tibet (LHS and NYC) could only be explained as a result of the founder effect as there were no rodent control measures in Tibet. The recent demographic expansion of these populations was also supported with the results of Tajima's D, and Fu & Li's statistics.

The origin area of R. tanezumi
Two different analysis methods based on microsatellites data showed different results. According to the result with Structure, all samples of R. tanezumi could be divided into three clusters, and populations collected from Yunnan belonged to cluster III. While the results of Geneland divided populations of Yunnan into two groups, the first including three populations from Southwest Yunnan and the second including two populations from North Yunnan, which was concurrent with the results of haplotype network based on mitochondrial COI sequence. The haplotype network of COI shows two haplotype groups from Yunnan, one from Southwest and another from North Yunnan (Fig. 3b, Additional file 5: Table S5 for detail). The haplotype group from North Yunnan were scarce in other populations.
Populations of R. tanezumi from Yunnan showed distinctive characteristics in this study: first, they had many alleles in microsatellite analysis (average N a for five populations was 12.78, Table 2); second, they had a high number of haplotypes in mtDNA analysis (Table 4). These characters mean that the populations of R. tanezumi from Yunnan have high genetic diversity compared with populations from other areas of China.
Yunnan is located at the border area of Southwest China and Southeast Asia. These areas are special regions with various habitats both for aquatic and terrestrial organisms [56][57][58][59] because of the intricate geological and climatic history and were therefore regarded as two hotspots of high biodiversity and endemism [60]. In these areas, the young uplifting plateau (Tibetan Plateau), the old eastern escarpment and the southeast slope (Hengduan Mountains) yield highly heterogeneous habitats not only to 'living fossil' species, such as the Laotian rock rat Laonastes aenigmamus [61] but also to some new speciated species, for example Apodemus ilex [62]. Besides the species level diversity, high intraspecific genetic diversity of some species in these areas was also frequently reported [63,64] because these areas were refuges for different species during glacial periods. According to Musser and Carleton [65], R. tanezumi is indigenous to the north of the India subcontinent, South and Central China, and mainland Indochina in which Yunnan is the central area. Therefore, according to the genetic diversity of Yunnan populations of R. tanezumi, the location and the biodiversity background of Yunnan, it could be assumed that Yunnan and the adjacent area of Southeast Asia could be regarded as the original centre of R. tanezumi. The expansion route of R. tanezumi into mainland China Although Yunnan is adjacent to Guizhou, Sichuan and Tibet, the rat populations from these provincial areas do not show a close relationship (Additional file 4: Table S4). The microsatellite analysis shows that there is no close relationship between populations of Yunnan, and those from Guizhou, Sichuan and Tibet ( Fig. 1b and Fig. 2). The haplotype analysis of mtDNA sequences shows that the populations of Yunnan have a high diversity of haplotypes (H10 and H17 groups in Fig. 3), while populations from these three provincial areas share only a few haplotypes (H1 group). Populations from Tibet show only two haplotypes of COI and lack any particular commonality with Yunnan haplotypes (Fig. 3b), although the geographic distance between Tibet and Yunnan is very short (Fig. 1a). This result suggests that R. tanezumi in Yunnan did not spread to Tibet naturally or with the help of caravan transportation.
The unrooted NJ dendrogram based on genetic distance demonstrated a visualized relationship of populations from different areas of China (Fig. 1b). Based on the relationship, the early expansion of R. tanezumi in China could be inferred as having two steps: the first step as spreading through coastal areas via ocean-based shipping transportation, and the second as expanding from the coastal areas into Central China via shipping along the Yangzi River. Aplin et a.l reported that R. tanezumi expanded in West Pacific with the diaspora of Austronesian-speakers about 4000 years ago [66], which supported the possibility of expanding with ocean shipping transportation in that period. Because the populations of Tibet are closely related to the populations of Sichuan, the sources of the Tibetan populations are presumed to be Sichuan instead of Yunnan, and the route of expansion was likely the modern Chuan-Zang Highway instead of the Tea Horse Ancient Road.
Data of allelic patterns across populations showed in detail that three populations (SJZ, LHS and NYC) have a low number of alleles and effective alleles (N a and N e ) compared with other populations, with the two populations from Tibet (NYC and LHS) having the lowest values (Table 2). These data supported that the populations from Tibet are newly imported and, as a result, suffer from the founder effect and lose genetic diversity during importation and colonization. The population from Hebei (SJZ) also has few alleles and effective alleles, and was only higher than those of populations from Tibet (Table 2). These data reveal that the SJZ is also a newly colonized population with low genetic diversity. As R. tanezumi was found in Tibet before 1995 [8], and this species was not reported in Hebei until the 2000s, SJZ population could be considered a young population compared with the population from Tibet. The relative high genetic diversity of SJZ compared with that of NYC and LHS suggests that there were more individuals imported to Hebei from South and Central China in a short time period through the complex modern transportation system. The convenient transportation systems between North and Central China give more chances for R. tanezumi to expand to North China [67,68] than to West China.
All reported COI sequences in Genbank were added in the phylogeny analysis. Most of these sequences came from samples of Southeast Asia, such as Thailand and Vietnam [52,54,55]. The result showed that the haplotypes of these sequences belonged to the three haplotype  Table S2). And these sequences did not share haplotype with sequences from China. This result suggested that there was no genetic exchange between populations from sampling areas of China and Southeast Asia, although there were imported individuals intercepted from international cargo ships from different ports of China [69].
The dispersal ability of R. tanezumi and other Rattus species The Mantel test for IBD presented interesting results. The dispersal pattern of R. tanezumi in Yunnan does not fit IBD, while populations in other areas of China show an obvious IBD pattern. Because of the complex landscape, the dispersal of R. tanezumi in Yunnan should be restricted by geographic barriers, such as valleys and mountains, and thus could not fit IBD. The dispersal of R. tanezumi in other areas was also expected to lack IBD because R. tanezumi, as a typical commensal rat, can use human transportation to achieve long-distance dispersal. However, the Mantel test shows a strong pattern of IBD for these populations. This result suggests that long-distance dispersal events were rare in the historical expansion of R. tanezumi in China, even though this species is widely distributed in China. However, in recent decades, the northward expansion of R. tanezumi has become obvious in North China, such as in Shanxi and Qinghai [11,68], which suggests that modern, quick transportation has begun to give more opportunities to R. tanezumi in long-distance dispersal.
Some species of Rattus are regarded as commensal species, including R. rattus, R. norvegicus, and R. tanezumi. These species are all able to utilize human-made environments and transportation tools for survival and expansion. However, when two Rattus species are analyzed together for their commensal ability, there are differences between the different species. R. rattus of the Western Ghats of India shows a stronger dispersal ability compared with R. satarae, a native species of the southwest Indian Peninsula [70]. While R. rattus in the United States shows low long-distance dispersal ability when it competes with R. norvegicus [71]. In China R. norvegicus and R. tanezumi are two important commensal rodents. The former occurs mainly in southern China, and the latter is recorded in all provincial areas except Tibet [72]. In the last half-century, R. norvegicus invaded Xinjiang along the railway and expanded through rural areas of Xinjiang with the help of modern transportation [19,73]. R. tanezumi has been reported in some provinces in North and Northwest China in recent decades [11,68]. Compared to the northwestward expansion of R. norvegicus in China, R. tanezumi shows a low speed in spreading northward in recent years. This phenomenon suggests that R. tanezumi is not as fit as R. norvegicus for long-distance dispersal using modern transportation systems. commented on the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate Ethical approval for this study was obtained from the Ethical Committee of National Institute for Communicable Disease Control and Prevention, Chinese Center for Disease Control and Prevention. All rats were live trapped in rodents control in residential areas, which were not privately owned or protected. No specific permits were required for the described field studies.

Consent for publication
Not applicable.