Genotype distribution and evolutionary analysis of rotavirus associated with acute diarrhea outpatients in Hubei, China, 2013–2016

Group A human rotaviruses (RVAs) annually cause the deaths of 215,000 infants and young children. To understand the epidemiological characteristics and genetic evolution of RVAs, we performed sentinel surveillance on RVA prevalence in a rotavirus-surveillance network in Hubei, China. From 2013 to 2016, a total of 2007 fecal samples from hospital outpatients with acute gastroenteritis were collected from four cities of Hubei Province. Of the 2007 samples, 153 (7.62%) were identified positive for RVA by real-time RT-PCR. RVA infection in Hubei mainly occurred in autumn and winter. The highest detection rate of RVA infection was in 1–2 years old of outpatients (16.97%). No significant difference of RVA positive rate was observed between females and males. We performed a phylogenetic analysis of the G/P genotypes based on the partial VP7/VP4 gene sequences of RVAs. G9P[8] was the most predominant strain in all four years but the prevalence of G2P[4] genotype increased rapidly since 2014. We reconstructed the evolutionary time scale of RVAs in Hubei, and found that the evolutionary rates of the G9, G2, P[8], and P[4] genotypes of RVA were 1.069 × 10−3, 1.029 × 10−3, 1.283 × 10−3 and 1.172 × 10−3 nucleotide substitutions/site/year, respectively. Importantly, using a molecular clock model, we showed that most G9, G2, P[8], and P[4] genotype strains dated from the recent ancestor in 2005, 2005, 1993, and 2013, respectively. The finding of the distribution of RVAs in infants and young children in Hubei Province will contribute to the understanding of the epidemiological characteristics and genetic evolution of RVAs in China.

Among rotaviruses, group A human rotaviruses (RVAs) are the main cause of acute gastroenteritis among children and annually cause 215,000 deaths of infants and young children worldwide . Even in developed countries, RVAs are also a major cause of morbidity and levy a great financial burden (Charles et al., 2006). From 1994 to 2014, 40% of diarrhea-related hospitalizations and 30% of diarrhea-related outpatients among children under 5 years old in China were caused by RVAs (Wu et al., 2016). The great diversity of predominant genotypes of RVAs may affect the efficacy of rotavirus vaccination programs (Akran et al., 2010;Aminu et al., 2010;Arista et al., 2006;Glass et al., 2006;Todd et al., 2010). Because of the significant fluctuation of circulating genotypes of RVAs in different research periods and populations, continued surveillance programs in the pre-and post-vaccination eras have been carried out. Such programs can provide important information by monitoring the disease burden of RVA, identifying the temporal changes in circulating genotypes and assessing the effectiveness of existing vaccines (da Silva Soares et al., 2014;Hull et al., 2011;Jin et al., 2008;L aszl o et al., 2012;Pukuta et al., 2014;Stupka et al., 2012). Monitoring RVA genotypes detected from diarrhea-related patients will help identify viral strains more effectively, especially imported strains and zoonotic strains, in populations with large-scale vaccination programs.
Here, we performed a four-year study of sentinel surveillance program of RVAs in Hubei, China. We assessed the prevalence of G and P genotypes from RVA strains identified in this program and collected baseline data about circulating genotypes before the implementation of RVA vaccination program. We found that the key population of rotavirus infection was 1-2 years old of outpatients. Our findings could be exploited to understand the epidemiological characteristics and genetic evolution of RVAs in China.

Study populations
This active surveillance was conducted in the gastrointestinal clinics of sentinel hospitals in four cities across Hubei Province (Xiangyang, Yichang, Enshi, and Hanchuan). All outpatients visiting the clinics with symptoms of diarrhea (! 3 loose or watery stools within 24 h), vomiting or fever were included. The epidemiologic settings of this study for RVA infection included people of all ages that meet the case definition. From April 2013 to December 2016, a total of 2,007 stool specimens from outpatients with diarrhea were collected. All the specimens were kept frozen until testing.

RNA extraction and real-time RT-PCR
Stool specimens were diluted 1:10 with 1 Â PBS (pH 7.4). The suspension was centrifuged at 3000Âg for 30 min after shaking. Viral RNA was then extracted from the supernatant using a magnetic bead prefilled DNA/RNA extraction kit (Tianlong, Xi'an, China) according to the manufacturer's instructions. The extracted viral RNA was immediately stored at À80 C for further use. The viral RNA was detected for rotavirus using the AgPath-ID one-step RT-PCR kit (Thermo Fisher Scientific, Foster City, CA) on an ABI 7500 real-time PCR platform (Applied Bioststems, Foster City, CA). The previously published primers and probes were used (Mijatovic-Rustempasic et al., 2013). The reaction mixture (25 μL) consisted of 0.4 μmol/L of each of two primers and 0.2 μmol/L of TaqMan Probe.

Nucleotide DNA sequencing and genotyping analysis
The viral dsRNA was subjected to reverse transcription using a Pri-meScript II First-Strand cDNA Synthesis Kit (Takara, Dalian, China). The complete or partial VP7 and VP4-coding genes of RVA were amplified using the primers, as previously described (Iturriza-G omara et al., 2004;Simmonds et al., 2008). The PCR products were sequenced by Sangon Biotechnology (Shanghai, China). The nucleotide sequences for VP4 and VP7 obtained in this article were submitted to the GenBank database with accession codes MG788354 to MG788515 and MG788516 to MG788633, respectively (Supplementary Table S1).
Homology analysis of the RVA strains was performed using Meg-Align in Lasergene 7.0 (http://www.dnastar.com) (DNASTAR, Madison, WI). The G and P genotypes of RVA were determined from partial VP7 and VP4 sequences (452 and 572 bp) by phylogenetic analysis with MEGA 6.0 (Tamura et al., 2013) using the maximum likelihood method with 1000 bootstrap replicates. The models T92 þ G þ I and GTR þ G þ I were applied for the G and P genotypes of RVA, respectively. The phylogenetic trees were viewed with FigTree 1.4.3 (http://tree.bio.ed.ac.uk/software /figtree/). G and P genotype reference strains with established genotypes were selected from GeneBank (Kiseleva et al., 2018;Zhao et al., 2021).

Evolutionary analyses of RVA
Using the Akaike information criterion, the best-fit models of nucleotide substitutions were selected by the jModelTest program Version 0.1.1 (Posada, 2008). The time to the most recent common ancestor and the rate of nucleotide substitution for RVA were estimated by the Bayesian Markov chain Monte Carlo (MCMC) method using BEAST software Version 1.8.1 (Drummond et al., 2012). The best-fit model of nucleotide substitution was GTR þ G for both the VP7 and VP4 genes. Under an uncorrelated exponential derivation (UCED) model, a Bayesian skyline model for population growth was selected using the BEAST software. MCMC analyses were performed for 100 million generations, sampling each tree every 10,000 steps; the first 10% were discarded as chain burn-in. MCMC convergence and the effective sample size estimates (>200) were checked with the Tracer program Version 1.5. After a 10% burn-in, the maximum clade credibility (MCC) tree was generated by Tree Annotator Version

Statistical analysis
Clinical and epidemiological information, including gender, age, seasonality and date of sampling, was collected for all patients. The statistical analyses of different age groups were performed using the chisquare (χ 2 ) test. A value of P < 0.05 indicates statistically significant differences. The data were analyzed by using SPSS Statistics for Windows, Version 19.0 (IBM, Armonk, New York).

Seasonal characteristics of RVA in Hubei Province
The diarrhea monitoring project started from 2013 was performed in sentinel hospitals in four cities of Hubei Province (Xiangyang, Yichang, Enshi, and Hanchuan) (Fig. 1A). In the first two years, 161 and 313 samples were collected, respectively. From 2015, we added a new sentinel monitoring hospital, and 660 and 873 samples were collected in 2015 and 2016, respectively (Fig. 1B). Of the 2007 acute gastroenteritis outpatients, 153 (7.62%) were RVA-positive by real-time RT-PCR. The detection proportion was 9.32% (15/161), 9.27% (29/313), 6.36% (42/ 660) and 7.67% (67/873) in , 2015. No significant group difference was observed in the detection proportion for RVAs between the four years (χ 2 ¼ 3.344, P ¼ 0.342). Seasonal variation of RVAs prevalence was observed (Fig. 1B). RVA infection in Hubei Province mainly occurred in autumn and winter, the coldest and the driest seasons of the year.

Distribution of G and P genotypes by age and year
We further studied the distribution of G and P genotypes by age and year (Fig. 4). Seven G-P combination genotypes were found in seven different age groups (Fig. 4A). As shown in Fig. 4A, G9P[8] was the only genotype found in the 5-10 year age group and the most prevalent G-P combination genotype in the other six age groups, and G2P[4] was the second most prevalent genotype in the other six age groups. Genotype G4P[6] was only found in the 0-1 year age group and genotype G9P[4] was only found in the 1-2 year age group (Fig. 4A). As shown in Fig. 4B, G9P[8] was the most predominant G-P genotype in all four years but its proportion decreased year by year. G2P[4] was first identified in 2014 and there was a rapid increase in the prevalence in the following two years (Fig. 4B) (Fig. 4B). In the four years of testing, it is particularly noted that genotype G2P[8] was detected for the first time in 2016, the most recent year (Fig. 4B). Therefore, we should closely follow and monitor genotypes G2P   The root-to-tip regression analysis (Rambaut et al., 2016) based on the partial VP7 and VP4 gene sequences collected in this study (Supplementary Table S1) and reference strains (Supplementary Table S2) were performed to obtain sequences for Bayesian evolutionary analysis (Supplementary Figs. S1-S4). Bayesian evolutionary analysis was performed based on 172 selected VP7 gene sequences of G2P[4] and 31 sequences isolated in this study. The earliest evolutionary ancestor of the G2 genotype strains derived from 1933 and the significant time nodes for differentiation were in 1958, 1966, 1972, and 1986. As shown in Fig. 5, most G2 genotype strains from Hubei Province dated from the recent ancestor in 2005. Only a few G2 strains from Hubei Province had the most recent ancestor in 2013 (Fig. 5). Bayesian evolutionary analysis based on 148 selected VP4 gene sequences of G2P[4] and 32 sequences isolated in this study showed that the P[4] strains from Hubei Province were located in three clusters and the earliest ancestor could be dated back to 1959 (Fig. 6)  The MCC tree of the G9 genotype ( Supplementary Fig. S5) using the Bayesian skyline demographic model shows that this genotype of RVA strains in Hubei Province had different evolutionary time nodes, located in two different clusters. Most G9 genotype strains from Hubei Province were derived from the prior ancestor in 2005 and differentiated into five lineages, and some G9 genotype strains of Hubei Province dated from the recent ancestor in 2011 ( Supplementary Fig. S5). The P[8] genotype of RVA strains in Hubei Province was located in cluster-specific MCC trees, and the P[8] gene of RVA from Hubei Province derived from 1993 gradually evolved into two lineages ( Supplementary Fig. S6).
The molecular clock model, as a UCED model, was used to calculate both the rates of evolution and the time to the most recent common ancestor of the G9, G2, P[8] and P[4] genotypes based on the partial VP7 and VP4 gene sequences ( Table 2). The average times to the most recent common ancestor for the partial VP7 gene of the G9 genotype and the partial VP4 gene of the P[8] genotype were 46 years (range 23, 78) and 107 years (range 52, 183), respectively, which dated the most recent ancestor back to 1975 and 1913 ( Table 2). The evolutionary rates (the nucleotide substitution rates) of the G9, G2, P[8] and P[4] genotypes of RVA were 1.069 Â 10 À3 , 1.029 Â 10 À3 , 1.283 Â 10 À3 and 1.172 Â 10 À3 nucleotide substitutions/site/year, respectively, using the UCED clock, and the nucleotide substitution rate of the G9 genotype of RVA was similar to that of the RVA G2 genotype.

Discussion
In this article, we performed the epidemiology of RVAs on the 2,007 fecal samples from outpatients with acute gastroenteritis, in Hubei, from 2013 to 2016. Partial sequencing of the VP7 and VP4 genes revealed the diverse G/P genotypes and alternative prevalence genotypes of RVAs circulating in Hubei. We also analyzed the possible relevance of RVA infections and patient ages as well as seasons. Additionally, we performed the analysis of the evolutionary time scale for G2P[4] and G9P [8] of RVAs. This study provides basal data for understanding the molecular epidemiology and evolution of RVAs.
The epidemiology of RVA-associated diseases depends on not only the social and economic conditions of the study group but also on climatic differences (Cunliffe et al., 1998;Hacimustafao glu et al., 2011;Iturriza-G omara et al., 2011). In China, RVA infections primarily occur between October and the next March Fang et al., 2005;Xiao et al., 2014). RVA infections usually have a lower incidence rate in spring and summer, with an increased rate in late autumn (Dian et al., 2017). In this research, the epidemic seasons occurred during the cold months between October and the next February, consistent with the common characteristics of RVAs. The seasonal peak of onset is in winter, especially in November (Wang et al., 2007(Wang et al., , 2009Yu et al., 2019). We found that the highest positive proportion of RVA was in 1-2 years old children. Most literature suggests that the highest detection rate of RVA is in children under 2 years old Jin et al., 2009); our study is consistent with this conclusion. No association was found between gender and the detection rate of RVA.
In China, serotype G1 was the main strain before 2000, but G3 has been the predominant strain since 2000 . From 1994 to 2012, the genotypes G1, G3, and G2 were the most common G genotypes, followed by G9 but not G4 ). In 1998, G9 was first discovered in Yunnan Province of China, and its prevalence rate increased during subsequent years . By 2011, G9 had replaced the G1 and G3 strains, and has become the most common genotype in Wuhan . During 1994 to 2012, little variation in P genotypes was found in China and serotype P[8] is the most common strain, followed by P[4] . Comparatively, P[9] and P[10] strains were reported frequently in the southern and northern regions of China, respectively   (Yu et al., 2019). Two studies also reported that G9P[8] was the most common genotype combination circulating in the local population of Beijing, Gansu, and Kunming (Zhang et al., 2015. Our data showed that G9 and P[8] were the dominant genotypes in Hubei Province from 2013 to 2016, which was consist with data in other regions of China. Because of the segmented feature of the RVA genome, the genes of VP7 and VP4 can segregate in an independent manner which contributed to the diversity and recombination of RVA genotypes, resulting in a large number of different combinations of strains. Our molecular surveillance of RVA infections is of significance for comparing the changes in RVA strains and provides an important reference for the prevention and vaccine development. In this research, the time-scale evolutionary analysis based on partial VP7/VP4 genes of were performed using the UCED clock. The evolutionary rates of the G9, G2, P[8] and P[4] genotypes of RVA were 1.069 Â 10 À3 , 1.029 Â 10 À3 , 1.283 Â 10 À3 and 1.172 Â 10 À3 nucleotide substitutions/ site/year, respectively, similar to those of most RNA viruses, which evolve at a rate of approximately 1 Â 10 À3 nucleotide substitutions/site/year (Duffy et al., 2008). We then compared the substitution rates of VP7 and VP4 sequences from this study with reported data.  Matthijnssens et al., 2010). The evolutionary rate of the VP4 gene of RVA G2P[4] strains was 1.172 Â 10 À3 nucleotide substitutions/site/year, while the evolutionary rate of VP4 gene calculated from the VP4 sequences of G2P[4] strains and non G/P genes of DS-1-like RVA strains was 8 Â 10 À4 nucleotide substitutions/site/year (Agbemabiese et al., 2016). In our study, the evolutionary rate of VP4 gene of RVA G2P[4] strain is remarkably faster than the previous observation, therefore we should pay attention to the evolutionary rate of P[4] genotype in order to detect the evolutionary trend of rotaviruses.
Herein, the phylogenetic analysis and evolutionary history reconstruction showed that RVA strains from Hubei Province differentiated at different time points. Although the RVA from Hubei Province had a common ancestor in an earlier time, we observed that the genes of G9, G2, P[8] and P[4] divided into different evolution directions gradually. More importantly, the same genotype epidemic strains of RVA in the same year could have different evolutionary paths and sources. Therefore, we should not only monitor the genotypes of RVA but also pay close attention to the intergenotype genetic differences by evolution. Our finding of the distribution of RVAs in infants and young children in Hubei  (Nelson et al., 2008;Leshem et al., 2014). Introduction of rotavirus vaccines could confer partial protection against RVA gastroenteritis (Chandran et al., 2010;Fu et al., 2012;Li et al., 2019), but also impose additional selective pressure on circulating RVA strains, possibly influencing their evolutionary rates. Before the implementation of universal rotavirus vaccination in a country, it is important to understand the status of RVA infections in clinical settings (Nelson et al., 2008). A comparison of the related content research before and after vaccination has not been carried out in Hubei Province so far. Thus, research on this field should be conducted in the future.

Conclusions
In this study, we performed a four-year study of sentinel surveillance program of RVAs in Hubei Province and found that the key population of rotavirus infection is 1-2 years old of outpatients with acute gastroenteritis. Our data showed that G9P[8] was the most predominant strain in all four years and all G9 genotypes of RVA strains have different evolution trend. This rotavirus surveillance can be used as an entry point to further determine the molecular epidemiological characteristics, major circulating genotypes, evolutionary histories, and evolutionary divergence times of RVAs in Hubei Province.

Data availability
All the data generated or analyzed during the current study are included in this published article or available from the corresponding author upon request.

Ethics statement
All institutional and National guidelines about the use of clinical samples were followed. The informed consent has been obtained from all participants and the studies have been approved by the Ethics Committee of the College of Life Sciences, Wuhan University. 1959 (1923,1984) 95% highest posterior density (HPD) lower, 95% HPD upper; the nucleotide substitution rate is the mean rate for the three individual determinations. UCED, uncorrelated exponential deviation clock. TMRCA, time to most recent common ancestor. The values in parentheses are the 95% HPDs.