How many species of Apodemus and Rattus occur in China? A survey based on mitochondrial cyt b and morphological analyses

Apodemus (mice) and Rattus (rats) are the top rodent reservoirs for zoonoses in China, yet little is known about their diversity. We reexamined the alpha diversity of these two genera based on a new collection of specimens from China and their cyt b sequences in GenBank. We also tested whether species could be identified using external and craniodental measurements exclusively. Measurements from 147 specimens of Apodemus and 233 specimens of Rattus were used for morphological comparisons. We analysed 74 cyt b sequences of Apodemus and 100 cyt b sequences of Rattus to facilitate phylogenetic estimations. Results demonstrated that nine species of Apodemus and seven species of Rattus, plus a new subspecies of Rattus nitidus, are distributed in China. Principal component analysis using external and craniodental measurements revealed that measurements alone could not separate the recognized species. The occurrence of Rattus pyctoris in China remains uncertain.


INTRODUCTION
Small volant and nonvolant mammals are important components of ecological communities and play vital roles in ecological systems. They are among the most common agents for infections and, thus, have strongly affected human history. For example, black rats (Rattus rattus) are considered likely agents for the spread of Oriental rat fleas, which drove the Black Death plague throughout Europe and the Mediterranean during the 14th century and killed 30%-60% of the European population (Barnett, 2001; Duplantier et al., 2003). More recent examples of small mammal zoonoses include severe acute respiratory syndrome (SARS) caused by a coronavirus and Ebola hemorrhagic fever caused by Ebolavirus, with hosts including, but not limited to, bats and civets (Klein & Calisher, 2007; Menachery et al., 2015).
Rodent-borne diseases such as plague and hantavirus have made considerable contributions to human illnesses and are responsible for more deaths than all wars combined (Klein & Calisher, 2007). New pathogens, especially hantaviruses, have been isolated from rodents in China and adjacent countries annually (Huang et al., 2017). Because different species have specific immune systems and different levels of tolerance to zoonotic infections, identification of rodent reservoirs of zoonotic pathogens is a high priority (Meerburg et al., 2009).
Rats and mice often top the zoonoses reservoir list of the Chinese Center for Disease Control and Prevention (China CDC) because of the large number of species, substantial population sizes, and high potential for carrying zoonotic pathogens (Wu et al., 2017). Unfortunately, we still do not know how many species of rats and mice occur in China, or which species carry what pathogens, even for the most common genera such as Apodemus and Rattus. The reasons for this are complicated. Both Apodemus and Rattus have complex evolutionary and taxonomic histories, with classifications continuously being updated. Switching between valid species and synonyms causes considerable confusion, especially for non-specialist researchers. Furthermore, many species occur only in remote mountains or near national borders with high species diversity, such as Yunnan, Xizang (Tibet), and Xinjiang. Indeed, the rats and mice of southern Xizang and western Xinjiang remain to be studied carefully. Finally, many rodents are difficult to identify to species level due to the number of morphologically similar species (Galan et al., 2012).
Rattus, another problematic genus, has had 25 subgenera and more than 550 species and subspecies named (Simpson, 1945). Currently, 66 species are recognized but uncertainty persists. Previous supermatrix analysis did not obtain a monophyletic Rattus, indicating that its systematics are far from resolved (Steppan & Schenk, 2017). Arguments also persist for the most common species, including black rats, whose species boundary remains unfixed (Aplin et al., 2011). The number of species of Rattus in China is also uncertain and has been variously reported as four (Corbet, 1978), seven (Smith et al., 2008), or nine (Wang, 2003).
Similar to other rodents, species in these two genera are difficult to identify or distinguish morphologically due to their similar appearance, overlapping measurements, and key diagnostic characters that involve single cusps on their teeth. Diagnosis often requires clean skulls, which are not always available or correctly prepared. DNA barcoding is a promising approach but requires a solid reference database (Moritz & Cicero, 2004). Unfortunately, GenBank data are problematic because many rodent sequences are uploaded by non-specialists such as epidemiological researchers.
This reduces the reliability of environmental assessment reports and hampers our understanding of host and disease associations.
Herein, we revisited the alpha diversity of Apodemus and Rattus in China based on a collection of more than 400 specimens and the integration of cyt b sequences. We evaluated the species of both genera in China and assessed if they could be identified easily using traditional morphometric approaches.

Morphological diagnoses and analyses
We examined 147 specimens of Apodemus and 233 specimens of Rattus collected from multiple localities across China. External and skull measurements followed Liu et al. (2012). External measurements of fresh specimens in the field were taken to the nearest 0.5 mm using a steel tape. These included head-body length (HBL), hind-foot length (HFL), ear length (EL), and tail length (TL); measurements of museum specimens were taken from the original records. Specimens were roughly identified based on external and craniodental morphology, following Kaneko (2010) and Smith et al. (2008). External and craniodental measurements largely overlapped between species (see Results) and were inadequate for identification. However, several diagnostic characters on the upper molars were useful for classification, including the number of lingual angles of the first and second upper molars, presence/absence of cusp t3 on the first upper molar, and number of internal lobes on the third upper molar. We also cross-checked results based on morphological diagnoses with molecular sequences (when available) to refine identification. All specimens were identified by the same researcher (SYL) for consistency. We finally assigned our specimens to nine species of Apodemus, seven species of Rattus, and a new subspecies of Rattus nitidus.
We analyzed morphometric variation using principal component analyses (PCAs) on log10-transformed variables using two datasets for each species. The first dataset included both external and craniomandibular variables, whereas the second dataset included craniomandibular variables only. Inclusion of the external data tested whether these measurements could increase the accuracy of identification. Statistical analysis was performed using SPSS v16.0 (SPSS Inc., USA). When two or more recognized species were not well separated in the principal component (PC) plots, analysis of variance (ANOVA) was applied to analyze among-group differences.
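As an illustration of the PCA step described above, the following is a minimal Python sketch and not the SPSS procedure used in the study. It assumes that the PCA is run on the correlation matrix of the log10-transformed variables (the text does not state this), and it extracts only the first principal component by power iteration. The measurement values and the names SPECIMENS, log10_zscores, correlation_matrix, and first_component are hypothetical and introduced only for this example.

```python
import math

# Hypothetical external measurements in mm (HBL, TL, HFL, EL).
# Invented values for illustration only; not data from the study.
SPECIMENS = [
    [95.0, 100.0, 21.0, 14.0],
    [102.0, 108.0, 22.5, 15.0],
    [88.0, 90.0, 20.0, 13.5],
    [110.0, 118.0, 24.0, 16.0],
    [98.0, 101.0, 21.5, 14.0],
    [105.0, 115.0, 23.0, 15.5],
]


def log10_zscores(rows):
    """log10-transform every value, then standardize each column."""
    logged = [[math.log10(v) for v in row] for row in rows]
    n = len(logged)
    columns = list(zip(*logged))
    means = [sum(col) / n for col in columns]
    sds = [
        math.sqrt(sum((x - m) ** 2 for x in col) / (n - 1))
        for col, m in zip(columns, means)
    ]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in logged]


def correlation_matrix(z):
    """Correlation matrix of already standardized columns."""
    n, p = len(z), len(z[0])
    return [
        [sum(row[i] * row[j] for row in z) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(corr, iterations=500):
    """Power iteration: largest eigenvalue and its unit eigenvector."""
    p = len(corr)
    vec = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in product))
        vec = [x / norm for x in product]
    product = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
    eigenvalue = sum(v * x for v, x in zip(vec, product))
    return eigenvalue, vec


corr = correlation_matrix(log10_zscores(SPECIMENS))
eigenvalue, vector = first_component(corr)
# Loadings are eigenvector entries scaled by the square root of the eigenvalue.
loadings = [v * math.sqrt(eigenvalue) for v in vector]
# For a correlation-matrix PCA, the total variance equals the number of variables.
explained = eigenvalue / len(corr)
print(f"PC1 eigenvalue = {eigenvalue:.2f}, variance explained = {explained:.1%}")
print("PC1 loadings:", [round(x, 2) for x in loadings])
```

Under the same correlation-matrix assumption, the proportion of variance explained by a component is its eigenvalue divided by the number of variables, so an eigenvalue of 6.9 across 12 measurements corresponds to roughly 57.5% of total variation.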

Molecular analysis
We sequenced mitochondrial cyt b for 74 and 100 specimens of Apodemus and Rattus, respectively. Localities of molecular samples used from China are mapped in Figure 1. All sequenced specimens were deposited in the SAF. Total genomic DNA was extracted using the standard phenol-chloroform method (Sambrook & Russell, 2001). We used the universal primers of mammalian cyt b L14724/H15915 for amplification (Irwin et al., 1991). Polymerase chain reaction (PCR) was conducted in a 25-µL reaction volume, including 2.5 µL of 10×EX Taq buffer (Mg2+ free), 2 µL of 2.5 mmol/L dNTP, 1.5 µL of 25 mmol/L MgCl2, 1 µL of 10 µmol/L primers, and 1 unit of EX Taq polymerase (TaKaRa Biotech, Dalian, China). The product was purified using an EZNA™ Gel Extraction Kit (Omega, USA), and was sequenced on an ABI 3730XL sequencer using the same primers as for amplification. Sequences were assembled and edited using SeqMan and EditSeq (DNASTAR, Lasergene v7.1) before subsequent analyses. To avoid misidentification, we first conducted a "naïve identification" for the obtained sequences using the "identify organism" workflow in Geneious v11 (Biomatters, New Zealand). The software blasted each sequence against the GenBank nucleotide collection (nr/nt) database. When pairwise identity between the query (our sequence) and subject (in GenBank) sequences was higher than 98%, Geneious considered them as the same species. We cross-checked the results of both morphological and molecular identifications, and when the identifications were inconsistent, we revisited the skin and skull specimens before assigning an identity.
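The 98% identity rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Geneious or BLAST implementation: it assumes the two sequences are already aligned and of equal length, and it counts every alignment column (including gapped columns) in the denominator. The function names percent_identity and same_species_by_identity are hypothetical.

```python
def percent_identity(query, subject):
    """Percentage of alignment columns with identical bases.

    Assumes both sequences are pre-aligned and of equal length;
    a column containing a gap ('-') never counts as a match.
    """
    if len(query) != len(subject):
        raise ValueError("sequences must be aligned to the same length")
    if not query:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(query.upper(), subject.upper()) if a == b and a != "-"
    )
    return 100.0 * matches / len(query)


def same_species_by_identity(query, subject, threshold=98.0):
    """Apply the 'higher than 98%' rule: strictly greater than the threshold."""
    return percent_identity(query, subject) > threshold


# Toy 100-bp example (invented sequences, not study data).
reference = "ACGT" * 25
one_mismatch = "T" + reference[1:]    # 99 of 100 columns identical
two_mismatches = "TT" + reference[2:]  # 98 of 100 columns identical

print(same_species_by_identity(reference, one_mismatch))
print(same_species_by_identity(reference, two_mismatches))
```

Because the rule is a strict inequality, a hit at exactly 98% identity is not treated as conspecific in this sketch; such borderline cases are where the cross-check against skins and skulls matters most.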

To provide a better picture of species diversity in China, we downloaded cyt b sequences of Apodemus (n=477) and Rattus (n=273) in China from GenBank, discarding sequences shorter than 800 bp. We also included cyt b data representing another 12 species of Apodemus and 14 species of Rattus from outside of China. An additional five sequences of R. pyctoris from Nepal were included. For better estimation of phylogenetic relationships, we downloaded the mitochondrial genomes (mitogenomes) of seven species of Apodemus and 14 species of Rattus (Supplementary Table S2). One mitogenome under the name of "Apodemus chejuensis" may not have been a valid species. Cyt b of Tokudaia spp. (n=3) and a mitogenome of Bandicota indica were selected as outgroup representatives for Apodemus and Rattus, respectively, following Steppan & Schenk (2017). In total, the datasets for Apodemus and Rattus included 572 (with seven mitogenomes) and 397 sequences (15 mitogenomes), respectively. We aligned the sequences for each genus using MAFFT v7.3 implemented in Geneious v11. We removed all tRNAs, D-loop, and ND6 sequences from the alignments, and only used rRNAs and 13 protein-coding genes for phylogenetic analyses. Sequence genetic distances were calculated for cyt b using MEGA v.5 (Tamura et al., 2011) under the Kimura 2-parameter model (Kimura, 1980).
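For readers unfamiliar with the Kimura 2-parameter (K2P) model, the distance between two aligned sequences is d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of sites that differ by a transition and by a transversion, respectively. The Python sketch below illustrates this formula; it is not the MEGA implementation. It assumes pre-aligned sequences of equal length and skips any site where either sequence has a gap or an ambiguous base (pairwise deletion). The function name k2p_distance is hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Skip gaps and ambiguity codes (pairwise deletion).
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (non-positive argument).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


# Toy example (invented sequences): 10 transitions and 5 transversions in 100 sites.
seq_a = "A" * 100
seq_b = "G" * 10 + "C" * 5 + "A" * 85
print(round(k2p_distance(seq_a, seq_b), 4))
```

In the toy example the uncorrected proportion of differing sites is 0.15, and the K2P correction yields a somewhat larger distance because it accounts for multiple substitutions at the same site.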

Phylogenetic analyses
We employed RAxML v8.2.10, a maximum likelihood-based approach, for phylogenetic analyses. We partitioned the alignments by genes, except for cyt b, which we partitioned into the 1st+2nd and 3rd codon positions. Analyses were performed on the CIPRES Science Gateway. We used GTR+G as the evolutionary model for each partition because RAxML does not accept models other than GTR or GTR+G. We ran each analysis using the rapid bootstrapping algorithm and let RAxML halt bootstrapping automatically. We also repeated analyses using alternative strategies, such as different partitioning schemes (e.g., partitioned by gene and codon positions for all coding genes) and evolutionary models (e.g., using GTR model instead of GTR+G), none of which strongly altered phylogenetic relationships (i.e., different relationships supported by bootstrap values (BS) ≥75).
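The partitioning scheme described above (one partition per gene, with cyt b split into 1st+2nd and 3rd codon positions) can be written as a RAxML-style partition file. The Python sketch below generates such lines; the gene coordinates are hypothetical placeholders, not the alignment positions used in the study, and the function name raxml_partition_lines is introduced only for this example. It assumes each codon-split gene starts in frame at its first alignment column.

```python
def raxml_partition_lines(genes, codon_split=("cytb",)):
    """Build RAxML partition-file lines.

    genes: list of (name, start, end) with 1-based inclusive alignment columns.
    Genes named in codon_split are divided into 1st+2nd and 3rd codon positions,
    assuming the gene starts at codon position 1.
    """
    lines = []
    for name, start, end in genes:
        if name in codon_split:
            # "\3" means every third column from the given start.
            lines.append(
                f"DNA, {name}_pos12 = {start}-{end}\\3, {start + 1}-{end}\\3"
            )
            lines.append(f"DNA, {name}_pos3 = {start + 2}-{end}\\3")
        else:
            lines.append(f"DNA, {name} = {start}-{end}")
    return lines


# Hypothetical coordinates for illustration only.
example_genes = [("12S", 1, 950), ("cytb", 951, 2090)]
for line in raxml_partition_lines(example_genes):
    print(line)
```

Writing the partitions from a single coordinate table makes it straightforward to regenerate alternative schemes, such as splitting every protein-coding gene by codon position, when checking whether the topology is sensitive to partitioning.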

Morphological analysis
Morphological analysis of Apodemus
Morphological measurement statistics of the eight Apodemus species, excluding A. semotus, are given in Table 1. In the first PCA, using all 12 measurements (n=139), the first and second principal components accounted for 57.6% (eigenvalue=6.9; Table 2, a) and 11.7% (eigenvalue=1.4) of total variation, respectively, with all other principal components having eigenvalues smaller than 1.
PC1 was positively correlated with all craniodental variables (loadings>0.63), and PC2 was positively correlated with external measurements (loadings>0.55). The PC1 and PC2 plot (Figure 2A) did not clearly separate the species. Apodemus latronum plotted on the positive regions of PC1 and PC2, indicating a large body, long tail, long hindfeet, and long ears. In accordance with its small skull and small external measurements, A. uralensis occurred along the negative regions of PC1 and PC2. The sister or closely related species A. agrarius and A. chevrieri as well as A. pallipes and A. uralensis were well separated, but both pairs overlapped with A. peninsulae, A. draco, and A. ilex, which, in turn, largely overlapped.
For the second PCA, using eight craniodental measurements (n=141), the first principal component accounted for 69.5% of variation (eigenvalue=5.6; Table 2, b). The other principal components accounted for less than 9.4% (eigenvalue≤0.75) of total variation, indicating they were not stable (Shankardass, 2000). Seven variables were positively correlated with PC1 (loading>0.56), except for UMRL (loading=0.076), which was positively correlated with PC2 (loading=0.93). The PC1 and PC2 plot (Figure 2B) was similar to the previous plot. None of these species were clearly separated from all others. Apodemus chevrieri and A. latronum plotted on the positive regions of PC1, indicating a relatively large skull, and A. uralensis occurred along the negative region of PC1 in accordance with its small skull.
One-way analysis of variance (ANOVA) revealed that the seven species differed significantly (P<0.05) in all external and cranial characters tested, except for NBL (P=0.497), TL (P=0.064), and HFL (P=0.094). Results showed significant differences as follows: UTRL, MRL, ABL, and ML between A. peninsulae and A. chevrieri; ZB, SBL, UTRL, HBL, and EL between A. peninsulae and A. ilex; SGL, ZB, SBL, HBL, TL, HFL, and EL between A. peninsulae and A. draco; ZB, SBL, and ABL between A. peninsulae and A. pallipes; SGL, ZB, SBL, UTRL, ABL, ML, HBL, and EL between A. chevrieri and A. ilex; SGL, ZB, SBL, MRL, UTRL, ABL, ML, HBL, TL, and HFL between A. chevrieri and A. draco; SGL, ZB, SBL, UTRL, MRL, ABL, and ML between A. chevrieri and A. pallipes; TL and HFL between A. ilex and A. draco; and ABL, TL, and HFL between A. draco and A. pallipes. Thus, morphological analysis indicated that the eight species of Apodemus could be separated by the 12 morphological characters, validating the taxonomic status of these species in China.
When all individuals of the two subspecies of R. nitidus were subjected to an independent sample t-test for each variable, significant differences appeared in UTRL, UMRL, ML, and TL between R. nitidus nitidus and R. nitidus from Xizang.

Molecular analysis
We obtained cyt b sequences for 78 specimens of Apodemus and 106 specimens of Rattus.
Cyt b K2P interspecies distances for Apodemus ranged from 5.4% to 20.7% (Supplementary Table S4). The smallest distance occurred between A. uralensis and A. pallipes, and the largest between A. sylvaticus and A. latronum. The distances for Rattus ranged from 2.1% to 16.5% (Supplementary Table S5). The smallest distance occurred between R. baluensis and R. tiomanicus, and the largest between R. leucopus and R. argentiventer. The K2P distance between R. nitidus from Xizang and R. nitidus nitidus was 0.019 (1.9%).

Matrilineal genealogy (haplotype phylogeny) of Apodemus
Matrilineal genealogy using the mitogenome and cyt b data for Apodemus (n=569) did not fully resolve the higher relationships (Figure 4), as in previous studies (see Discussion). Representative animals from China fell into nine clades that corresponded to nine species. Notably, A. uralensis from Xinjiang, China, fell into a clade (BS=100) comprised of A. pallipes from Xizang, China, and a sequence from GenBank (origin unknown), thus rendering A. pallipes paraphyletic (BS=69). A sole mitogenome representing "A. chejuensis" from Jeju Island was embedded in a clade containing A. agrarius. Apodemus draco, A. ilex, and A. semotus fell together in a well-supported clade (BS=100), but the relationships among the three species were not resolved (BS<50). Apodemus chevrieri, A. draco, A. ilex, A. latronum, and A. peninsulae also comprised subclades.
Abbreviations are explained in the Materials and Methods section. All measurements are in mm.

Matrilineal genealogy of Rattus
The interspecific relationships of Rattus using the mitogenome and cyt b sequences (n=396) were well-resolved (BS=95-100) or moderately resolved (BS=55-77) (Figure 5). Sequences representing animals from China fell into seven lineages that corresponded with R. nitidus, R. norvegicus, R. exulans, R. andamanensis, R. losea, R. rattus, and R. tanezumi. The clade of R. nitidus had two subclades, one from southern Xizang and the other from southeastern China (Figure 5). The tree depicted GenBank sequences deposited under different names within a shallow clade, most commonly with R. andamanensis. However, some specimens were also associated with R. losea, R. nitidus, R. tanezumi as well as R. nitidus from southern Xizang.

Genealogy and taxonomy
Species of Apodemus are among the most destructive of all animal pests, yet little attention has been paid to their evolutionary relationships. Our trees were consistent with those from the robust study of Steppan & Schenk (2017), indicating the repeatability of both. However, our molecular phylogenetic tree of Rattus was inconsistent with that of Aplin et al. (2011), which may be due to the different tree-building methods (ML phylogeny here, but BI and NJ methods in Aplin et al. (2011)), different numbers of species, or different cyt b sequences (only two individuals of R. pyctoris (GenBank accession Nos. JN675511 and JN675512) from Aplin et al. (2011)). The unresolved relationships within Apodemus were not surprising and are likely due to early radiation in the evolution of this genus, as indicated by the saturation of the mitochondrial gene (Serizawa et al., 2000). Similar problems likely also occur in Rattus due to hybridization and introgression. Previously, Rattus was recovered as a paraphyletic genus (Steppan & Schenk, 2017).
Fully resolved phylogenies require many slowly evolving and unlinked genes, which is not within the scope of the current study.
Despite uncertainty in phylogenetic relationships, questions regarding taxonomy in both genera remain. The differences between A. pallipes and A. uralensis have been discussed previously in depth (Musser & Carleton, 2005). Our carefully identified specimens of A. pallipes were from southern Xizang (Pulan County). The average cyt b genetic distance between A. uralensis and A. pallipes was 5.4%, which was the smallest interspecific genetic distance in Apodemus. All our specimens of A. pallipes matched the original description and holotype (Musser & Carleton, 2005). Thus, A. pallipes undoubtedly occurs in China. The sequences of A. pallipes in GenBank were from Afghanistan and Pakistan, near the type locality of A. pallipes in Pamir Alta. However, as we had no access to these specimens, it was not possible to determine if they matched the morphological description of A. pallipes.
Johnson & Jones (1955) described A. chejuensis. Koh (1991) also recognized the species based on its large body size and mtDNA genotype. Corbet (1978) assigned it as a synonym of A. agrarius ningpoensis, whereas Musser & Carleton (2005) treated it as a synonym of A. agrarius. Our phylogeny embedded A. chejuensis in A. agrarius, and thus our results agree with the assignment of Musser & Carleton (2005).
The taxonomic status of A. draco remains uncertain. Apodemus ilex and A. semotus are close relatives of each other (Figure 4). Kaneko (2011) suggested that A. semotus did not differ significantly from A. draco. However, this endemic species of Taiwan was characterized by a dark gray pelage rather than the reddish-brown color of all other Asian species of Apodemus. Further, our ANOVA results demonstrated significant differences in TL and HFL between A. ilex and A. draco. Thus, we recognize all three as full species to better reflect their long evolutionary histories and distinct distribution patterns. Nevertheless, future comprehensive morphological diagnosis is desirable.
Hodgson described R. pyctoris in 1845 from Nepal (Hodgson, 1845). This name was later replaced by R. rattoides or R. turkestanicus. Musser & Carleton (1993) resurrected the oldest name and it has been reported to occur in China (Allen, 1940; Corbet, 1978; Ellerman & Morrison-Scott, 1951; Musser & Carleton, 1993, 2005; Wang, 2003). Feng et al. (1986) identified a series of specimens of R. pyctoris from Xizang and claimed that R. pyctoris closely resembled R. rattus but with a pale underbelly, relatively long nasal bone, and cusp t3 on M1. Our series of specimens from Xizang coincide with the characteristics of R. pyctoris described by Feng et al. (1986). However, phylogenetic analysis associated the species with R. nitidus. The original description and comments of Musser & Carleton (2005) on R. pyctoris point to its diagnostic characters as a very small cusp t3 on M1, a wide and short rostrum (narrow and slender in R. nitidus), and chunky wide molars (thinner and gracile in R. nitidus). Except for the morphology of M1, the Xizang specimens differed from R. pyctoris. Furthermore, many characters of the Xizang specimens also differed from R. nitidus, including the cusp t3 being present, gray-white underbelly, and larger measurements. The molecular phylogeny also placed the Xizang specimens and R. nitidus in different clades. Accordingly, we assign the Xizang specimens to a new, undescribed subspecies of R. nitidus.
Peale (1848) described R. exulans from Society Island. Nevertheless, its existence in Taiwan, China has been recognized for a long time (Motokawa et al., 2001). The Guangdong Insects Institute collected specimens of R. exulans from Yongxing Island in 1975. Rattus exulans is the smallest Asian species in its genus. The specimens from Yongxing Island conformed to the characteristics of R. exulans. Thus, we confirm that R. exulans occurs in China in Yongxing Island and Taiwan.
The earliest Chinese specimen of R. rattus (black type) was collected by A. B. Howell from Kuliang, Fukien in 1929 (Allen, 1940). In 1955 and 1956, the Fujian Epidemic Prevention Station collected specimens from Fujian, which were confirmed by Shou (1962) as being R. rattus. Our examination of these specimens and one specimen from Guangdong Province resulted in the same conclusion. Thus, we confirm that R. rattus occurs in Fujian and Guangdong.

Morphometrics- and molecular-based species identifications
Although skull and external measurements were similar between species, many interspecific measurements differed significantly. Species of Apodemus were easier to identify than those of Rattus. Furthermore, the different species of Apodemus showed clearer geographic structure in their distributions.
For example, although measurements could not discriminate between A. draco and A. ilex (current study) or A. semotus (Kaneko, 2011), all three were found to be allopatric: A. ilex occurs in Hengduan Mountains, south of the Yangtze River and west of the Jinsha River; A. semotus occurs in Taiwan only; and A. draco occurs in the middle and lower reaches of the Yangtze River and in eastern China. Apodemus chevrieri, A. draco, and A. latronum co-occur in western Sichuan, but they were separated by the third upper molar and certain measurements (Figure 2A, B). Only one sequence of A. peninsulae in GenBank was likely misidentified (assuming no other error). Thus, the confusion between A. draco and A. ilex appears to be due to out-of-date taxonomy rather than misidentification.
Identification of Rattus species using either morphometrics or molecular data requires caution. Unlike for the species of Apodemus, most species of Rattus are invasive in China and have likely experienced strong selection resulting in morphological modification to adapt to local habitats. Notwithstanding, it was possible to identify some species based on morphology alone, such as R. andamanensis, which has a unique white belly, R. norvegicus, which has very short ears, and R. nitidus and R. norvegicus, which do not have the cusp t3 on M1, with the former also having distinctly larger ears. The Chinese population of R. rattus is black all over its body, whereas R. exulans only occurs in islands of the South China Sea, including Taiwan, and has a very small head and body length. However, R. losea and R. tanezumi occur sympatrically in southern China. They are easily confused due to similar appearances and overlapping measurements. Most species showed significant overlap in the PCA plots (Figure 3A, B). Perhaps due to challenges in identification, GenBank contains many misidentifications. For example, sequences under the name of R. norvegicus occur in almost all clades (Figure 5).
Our new sampling and survey of sequences supported the occurrence of nine species of Apodemus and seven species of Rattus in China. However, it is necessary to be cautious with morphometric and molecular analyses for species identification due to considerable intraspecific variation and considerable errors in GenBank.

Alpha diversity of Apodemus and Rattus
We determined that A. agrarius, A. chevrieri, A. draco, A. ilex, A. latronum, A. pallipes, A. peninsulae, A. semotus, and A. uralensis occur in China. In addition, considerable intraspecific diversity occurs in several species. Future comprehensive and integrative analyses can determine if further splitting is necessary and/or desirable.
We determined that R. andamanensis, R. exulans, R. losea, R. nitidus, R. norvegicus, R. rattus, and R. tanezumi occur in China. Future research into the occurrence of R. pyctoris in China is necessary. A new subspecies of R. nitidus is described as follows:

Etymology:
The name is derived from the type locality, southern Xizang (Tibet), China.
Diagnosis: Cusp t3 present on M 1 in first transverse loop, but very small; head and body relatively large; tail length usually larger than head plus body length; belly gray-white; transition between darker dorsal and lighter ventral pelage abrupt; dorsum of feet white, not glossy.
Description: Summer pelage from neck to hip uniform brown-black. Ventral hairs with gray-black base and gray-white tip, transition between darker dorsal and lighter ventral pelage relatively abrupt. Dorsal and ventral tail uniform brown-black; hairs on dorsal and venter of feet white, not glossy.
Skull sturdy (Figure 6), in dorsal profile straight and brain case flattened; highest point of skull in middle of parietal bone. Nasal broad anteriorly, narrowing posteriorly. Posterior margin of nasals irregular and protruding in front of maxilla. Posterior and anterior of frontal broad, middle narrower. Interparietal broad, anterior part triangle-shaped and posterior margin arc-shaped (Figure 6). Interorbital and temporal ridges present. Zygomatic arches medium in size, front part slightly broader. Auditory bullae moderately sized. Incisory foramen broad. Mandibles medium-sized (Figure 6).
Upper incisors medium in size, directed vertically downward, and orange. Molars rooted; 1st upper molar with three transverse dental loops, first dental loop with 3 cusps, t3 present but small; 2nd upper molar with three transverse dental loops, first dental loop only on lingual cusp; 3rd upper molar with three transverse dental loops, first dental loop only on lingual cusp, third loop only single semicircle and second loop rectangular; mandibular condyle and coronoid process large, but lower molar same as in other species of Rattus.
Habitat: Specimens were collected from an abandoned farmland, along the footpath of a rice field where highland barley was grown, forest edge, shrubland, surrounding a house, and salvage station.
Comparison with other subspecies: Compared with R. n. nitidus, t3 of the first dental loop present in Rattus nitidus thibetanus subsp. nov. (vs. t3 absent or just vestigial in R. n. nitidus); belly gray-white, and transition between darker dorsal and lighter ventral pelage relatively abrupt in Rattus nitidus thibetanus subsp. nov. (vs. belly gray-white or yellow-gray, and transition vague in R. n. nitidus); dorsum of feet white, not glossy in Rattus nitidus thibetanus subsp. nov. (vs. dorsum of feet white and shiny pearl in R. n. nitidus). The independent sample t-test demonstrated significant differences in UTRL, UMRL, ML, and TL between R. n. nitidus and R. n. thibetanus. The K2P distance for R. n. thibetanus and R. n. nitidus was 0.019, smaller than the smallest interspecies distance known in Rattus.

COMPETING INTERESTS
The authors declare that they have no competing interests.