The Evolution of Post-Vaccine G8P[4] Group a Rotavirus Strains in Rwanda; Notable Variance at the Neutralization Epitope Sites

Africa has a high level of genetic diversity of rotavirus strains, which is suggested to be a possible reason contributing to the suboptimal effectiveness of rotavirus vaccines in this region. One strain that contributes to this rotavirus diversity in Africa is the G8P[4]. This study aimed to elucidate the entire genome and evolution of Rwandan G8P[4] strains. Illumina sequencing was performed for twenty-one Rwandan G8P[4] rotavirus strains. Twenty of the Rwandan G8P[4] strains had a pure DS-1-like genotype constellation, and one strain had a reassortant genotype constellation. Notable radical amino acid differences were observed at the neutralization sites when compared with cognate regions in vaccine strains potentially playing a role in neutralization escape. Phylogenetic analysis revealed that the closest relationship was with East African human group A rotavirus (RVA) strains for five of the genome segments. Two genome sequences of the NSP4 genome segment were closely related to bovine members of the DS-1-like family. Fourteen VP1 and eleven VP3 sequences had the closest relationships with the RotaTeq™ vaccine WC3 bovine genes. These findings suggest that the evolution of VP1 and VP3 might have resulted from reassortment events with RotaTeq™ vaccine WC3 bovine genes. The close phylogenetic relationship with East African G8P[4] strains from Kenya and Uganda suggests co-circulation in these countries. These findings highlight the need for continued whole-genomic surveillance to elucidate the evolution of G8P[4] strains, especially after the introduction of rotavirus vaccination.


Double-Stranded RNA Extraction and cDNA Synthesis
Rotavirus dsRNA extraction and cDNA synthesis were performed as described previously [14]. Briefly, the extracted RNA was incubated with 8 M lithium chloride (Sigma-Aldrich ® , St Louis, MO, USA) for 16 h at 4 • C to enrich it for rotavirus dsRNA, then purified using a MinElute PCR purification kit (Qiagen, Hilden, Germany). Finally, electrophoresis was performed on a 5 µL aliquot of dsRNA in 1% 0.5 TBE agarose (Bioline, London, UK) gel stained with Pronasafe (Condalab, Madrid, Spain) at 95 volts for 1 h to assess the integrity of the extracted and purified rotavirus dsRNA. The cDNA synthesis was performed using the Maxima H Minus Double-Stranded cDNA Synthesis kit and protocol (ThermoFisher Scientific, Waltham, MA, USA), albeit with a slight modification of the first strand synthesis step, whereby we increased the incubation period from 30 min to 2 h. The cDNA was then purified using the MSB® Spin PCRapace kit (Stratec Molecular, Berlin, Germany) following the manufacturer's instructions.

Preparation of DNA Library for Whole Genome Sequencing
The construction of DNA libraries was carried out in accordance with the Nextera XT DNA Library Preparation Kit and Protocol (Illumina, San Diego, CA, USA). Briefly, the process involved fragmentation of genomic DNA through tagmentation, amplification of the tagmented DNA, and purification of the post-tagmented DNA. A volume of 10 µL Tagment DNA buffer was added to each well of a PCR plate containing a 5 µL volume of purified cDNA, within the 0.2-0.3 ng/µL concentration range, followed by the addition of 5 µL of Amplicon Tagment Mix. The reaction mixture was subjected to centrifugation for 1 min at a force of 280× g, utilizing ambient temperature conditions. The PCR plate was placed in a thermocycler, which was programmed with a pre-heated lid option at 55 • C for 5 min, with a 10 • C hold temperature. A 5 µL volume of Neutralize Tagment buffer was added to neutralize the activity of the transposase enzyme. After centrifugation at 280× g and room temperature for 1 min, the PCR plate was subsequently incubated at room temperature for 5 min.
A 10 µL volume of index 1 and 2 primers was added to each sample well containing the tagmented DNA, based on the unique combinations provided on the Illumina Experimental Manager software v.1.18.1 sample sheet (Illumina Experimental Manager (IEM) Software Downloads). A 15 µL volume of Nextera PCR Mastermix was added to each well; then, the plate was centrifuged at 280× g at 20 • C for 1 min. The PCR program was performed on a thermocycler programmed with a pre-heated lid option at 72 • C for 3 min; for 12 cycles at 95 • C for 10 s, 55 • C for 30 s, and 72 • C for 30 s; then, finally, at 72 • C for 5 min, with a hold temperature of 10 • C.
The amplified and post-tagmented DNA library was subjected to purification, during which fragment size selection and elimination of undesired PCR contaminants was performed using AMPure XP beads (Beckman Coulter, Pasadena, CA, USA) and freshly prepared 80% ethanol. The DNA library was then validated using an Agilent 2100 Bio-Analyzer (Agilent Technologies, Waldbronn, Germany). The normalized DNA libraries were combined into a single tube and denatured using sodium hydroxide. They were then diluted with hybridization buffer, and a PhiX control was added until an ultimate concentration of 8 pM was reached. Then, sequencing was performed on a MiSeq platform (Illumina, San Diego, CA, USA) for 600 paired-end cycles at the University of Free State -Next Generation Sequencing (UFS-NGS) Unit, Bloemfontein, South Africa.

Genome Assembly
The raw data from the sequencing process were evaluated for quality using FASTQC, version 0.11.9 [38]. Adapter sequences were eliminated using the BBDuk trimmer tool (https://sourceforge.net/projects/bbmap/ (accessed 12 January 2023)). De novo and reference-based mapping using the prototype DS-1-like reference strain (with accession numbers HQ650116-HQ650126) was performed on Geneious Prime 2023.1 [39]. A consensus was generated from the mapped data by utilizing the Geneious Consensus Tool [39].

Genome Genotyping and GenBanK Accession Numbers
The complete genome constellation was determined by analyzing each genome segment using the rvaGenotyper, accessible at (https://legacy.viprbrc.org/brc/rvaGenotyper. spg?method=ShowCleanInputPage&decorator=reo (accessed 17 January 2023)), in the Virus Pathogen Database and Analysis Resource (ViPR) [40]. Briefly, the user inputted the nucleotide sequence of a given rotavirus genome segment into the FASTA format. The rvaGenotyper tool then compared the sequence to a reference database of known rotavirus sequences and used a combination of nucleotide alignments and genotype-specific cut-off values to determine the genotype of the input sequence. The accession numbers assigned to all gene sequences in this study were OQ201345-OQ201575, and these were deposited into the NCBI GenBank database (Supplementary Table S1).

Sequence and Phylogenetic Analyses
The MUSCLE tool [41] in the MEGA 11 software [42] was utilized for comparative analysis of the ORFs. The DNA Model Test program in MEGA 11 was used to determine the most suitable evolutionary model. Phylogenetic trees, using maximum likelihood (ML), were generated for each genome segment, and branch support was evaluated through 1000 bootstrap replicates. In order to determine the genetic similarities between strains for each gene, the p-distance algorithm, which is available in MEGA 11, was employed. The Virus Variation Resource of the National Center for Biotechnology Information (NCBI) and the basic local alignment search tool (BLAST) were employed to compile reference sequences [43,44]. The accession numbers of these reference sequences, which were used in the construction of the ML trees, are included in the Supplementary Table S2.

H2 (IV)
Color-coding represents genogroup assignment. The light blue color is linked with the G8 genotype; the light red color is associated with the DS-1-like genogroup; and the light green color is associated with the Wa-like genogroup. The nomenclature of the rotavirus strains is based on the Rotavirus Classification Working Group (https://rega.kuleuven.be/cev/viralmetagenomics/virus-classification/rcwg (accessed 15 January 2023)). Roman numerals in parentheses after the genotype names indicate the lineage, as observed in the phylogenetic trees in Figure

Analysis of Neutralization Epitopes
A comparison of the amino acids in the neutralization epitopes between Rwandan G8 VP7 genes and cognate VP7 genes of the vaccine strains [45] showed that only five of the 29 amino acid residues were fully conserved among all the Rwandan G8 strains ( Table 2). The Rwandan G8 strains displayed 24 amino acid differences from the vaccine strains' VP7 component ( Table 2). Apart from N91T, V/M129I, and R143K, the rest of the amino acid differences involved a change in either charge or polarity [46].  Alignment of epitope residues in VP7 between the wild-type G8 strains and the strains present in Rotarix ® , RotaTeq™, Rotavac ® , and Rotasiil ® . The three VP7 epitopes (7-1a, 7-1b, and 7-2) are shown. Amino acids that showed variations between the vaccine strains and the study strains are highlighted in a brown-red color for better visualization.

Phylogenetic Analyses
To determine the genetic relationship between Rwandan G8P [4] strains and other RVA strains from GenBank, we conducted phylogenetic analyses based on the nucleotide sequences of the complete ORFs of all 11 genome segments. We used previously described lineage framework for the G8 VP7 genes [48] and the P [4] genes [49]. The lineage framework for the G8 VP7 genes was based on maximum likelihood phylogenetic analysis of 246 G8 VP7 genes, which identified five G8 lineages [48], while for P [4] VP4 genes, it was based on global P [4] phylogenetic analysis [49]. We also utilized a lineage framework that categorized the nine genotype two backbone genes [50]. This lineage framework defined lineages for genes of genotype two based on stringent bootstrap support and pairwise analysis of the nucleotide sequences [50]. Selected reference sequences for each lineage were utilized in this study to determine the lineage of a particular genotype two strain.
The Rwandan G8 study sequences and GenBank G8 sequences (n = 180) segregated into five G8 lineages ( Figure 1A). Rwandan G8 sequences clustered lineage V into a sublineage comprising sequences predominantly from East Africa, and the highest nucleotide similarities (99.7-100%) were observed with a Kenyan G8 genotype (Table 4; Figure 1A). The P [4] sequences from the Rwandan study and the 43 selected GenBank P [4] reference sequences segregated into four P [4] lineages, and all fell under lineage II, which mainly consisted of African P [4] sequences ( Figure 1B). The VP4 genes from Rwandan G8P [4] strains showed the highest nucleotide similarities (ranging from 98.8% to 100%) with a Kenyan strain (Table 4; Figure 1B). The Rwandan VP6 sequences clustered in lineage V ( Figure 1C), and within this lineage, the highest nucleotide similarities, ranging from 99.5% to 99.9%, were observed with a Ugandan strain (Table 4; Figure 1C). Rwandan R2 sequences segregated into two lineages, lineage V and lineage XII ( Figure 1D), which were distantly related to each other, with nucleotide identities ranging from 85.2-100% (Supplementary Table S4). A total of seven Rwandan G8P [4] VP1 sequences clustered with other human R2 gene sequences, mainly from Africa, in lineage V ( Figure 1D), and the highest nucleotide similarities (99.6-99.9%) were observed with a Kenyan strain (Table 4; Figure 1D). The remaining fourteen Rwandan R2 sequences clustered in lineage XII, a bovine lineage consisting of cow sequences, including the RotaTeq™ vaccine bovine genes ( Figure 1D). In this lineage, the highest nucleotide homology (99.9-100%) was observed with the RotaTeq™ vaccine strain (Table 4). In a different sub-lineage within lineage XII, two Rwandan R2 sequences showed the highest nucleotide identity (97.5-97.6%) with a South African bovine strain (Table 4; Figure 1D). The VP2 genes of Rwandan G8P [4] strains showed a distant relationship with each other, ranging from 85.7-100% (Supplementary  Table S4), and clustered separately into two lineages within the C2 genotype ( Figure 1E). Nineteen VP2 genes clustered together with other human C2 genes in lineage IV ( Figure 1E), with highest nucleotide similarities to a Ugandan strain ranging from 99.5-99.9% (Table 4; Figure 1E). The other two Rwandan G8P [4] VP2 genes, clustered in lineage XIII ( Figure 1E), with the highest nucleotide similarities (96.3-96.4%) found with a Kenyan G8P[1] strain ( Figure 1E). The VP3 genes of the Rwandan G8P [4] strains were distantly related to one another (ranging from 82.2-100% similarity) (Supplementary Table S4). The VP3 genes formed separate clusters, with ten of them found in lineage V ( Figure 1F). These VP3 genes had the highest similarity (ranging from 99.3-99.9%) to a Ugandan strain (Table 4; Figure 1F). The remaining 11 VP3 genes of the Rwandan G8P [4] strains were found in lineage X, a hybrid lineage of human and bovine M2 genes ( Figure 1F). These VP3 genes had the highest level of similarity (ranging from 99.6-99.9%) with the VP3 bovine gene of a RotaTeq™ vaccine strain (Table 4; Figure 1F).
The NSP1 genes of the Rwandan G8P [4] strains clustered in lineage IV ( Figure 1G), and had the highest level of nucleotide similarities (99.3-99.9%) with a Ugandan strain (Table 4; Figure 1G). The NSP2 gene sequences of Rwandan G8P [4] were segregated into two genotypes, N1 and N2 ( Figure 1H). One sequence clustered into genotype N1, while the other twenty sequences clustered into genotype N2 ( Figure 1H). The 20 N2 genes of Rwandan G8P [4] strains that clustered into human lineage V had the highest level of similarity (99.6-100%) to a Ugandan strain ( Table 4). The one strain with the N1 genotype had the closest nucleotide similarity (99.7%) to the NSP2 gene (N1 genotype) of a G9P [8] Ugandan strain ( Figure 1H). Rwandan T2 gene sequences clustered into lineage V alongside other human T2 strains, with the highest nucleotide similarities (99.4-100%) observed with a Ugandan strain (Table 4; Figure 1I). The Rwandan E2 gene sequences had a distant relationship (ranging from 82.7-100%) among themselves (Supplementary Table S4), and clustered separately into two different lineages within the E2 genotype ( Figure 1J). Nineteen NSP4 genes clustered together with other human M2 genes in lineage XXIII ( Figure 1J), with high nucleotide similarities, ranging from 99.2-100%, to a Ugandan strain (Table 4; Figure 1J). The remaining two Rwandan G8P [4] NSP4 genes clustered in lineage XV, a hybrid lineage comprising both human and animal E2 genes ( Figure 1J), and the highest nucleotide similarities ranged from 97.5-97.7% with the NSP4 gene of a South African bovine strain (Table 4; Figure 1J). Rwandan H2 gene sequences clustered in lineage IV alongside other human H2 genes ( Figure 1K), with the highest nucleotide similarities observed with a Kenyan strain ( Figure 1K).

Discussion
The study provides a whole-genome analysis of the genome sequences of 21 Rwandan G8P [4] strains, revealing that 20 of the strains exhibited pure DS-1-like genotype constellations, consistent with reports from other parts of the world [21,48,51,52]. It is suggested that human RVA with pure genome constellations in the same genogroup could have co-evolved to generate sets of proteins that function optimally when maintained together [53]. Notably, one strain exhibited inter-genogroup reassortment at the NSP2 genome segment, and this reassortment phenomenon has also been reported in another G8P [4] study [54]. Reassortment of rotavirus genome segments is a relatively frequent phenomenon that generates reassortant rotavirus strains [11,12,15,[55][56][57][58]. These findings underscore the genotype constellation diversity of G8P [4] strains.
The close phylogenetic relationship of the G8 and P [4] genes with contemporary RVA human strains from Kenya and Uganda suggest co-circulation in these neighboring countries. Rotaviruses are highly contagious and can spread easily between individuals and populations [59]; this finding highlights the need for genetic data to be shared between neighboring countries to track circulating RVA strains. The clustering of the backbone genome segments of Rwandan G8P [4] RVA strains showed that not all of these genes evolved in the same way. The high degree of homology and clustering of some genes with contemporary human RVA strains from Kenya and Uganda suggests local evolution, while the clustering of other genes with artiodactyl genes suggests possible interspecies reassortment.
Notably, for the VP1 and VP3 genes, we observed the closest phylogenetic relationship with RotaTeq™ vaccine WC3 bovine genes, indicating possible reassortment between wildtype DS-1-like and RotaTeq™ WC3 bovine genes. The RotaTeq™ vaccine comprises five reassortants of human-bovine (WC3) rotaviruses, each of which has a bovine/WC3 core and a surface protein derived from a human rotavirus, namely, G1, G2, G3, G4, or P[8] [60]. Reassortment between RotaTeq™ vaccine strains and wild-type strains has been reported in other studies [15,[61][62][63][64]. It remains to be explored whether vaccine-derived strains could cause an increase in virulence. According to some studies, mutations occurring either de novo or through the selection of pre-existing minor variants in the vaccine may cause vaccine strains to revert to increased virulence [65][66][67][68][69].
The study found amino acid differences in the neutralization epitope regions between the VP7 study strains and the vaccine strains (Rotarix ® , RotaTeq™, Rotavac ®, and Rotasiil ® ), were mostly radical, meaning they involved changes in charge or polarity [46]. Amino acid differences in positions 94, 96, 147, 148, 190, 211, 213, and 217 have been identified as significant in altering the antigenicity of rotaviruses and potentially contributing to neutralization escape [70,71]. Similarly, the amino acid differences observed between the neutralization epitopes in the VP4 study strains and vaccine strains (Rotarix ® and RotaTeq™) were mostly radical, and could contribute to escape of host immunity [47,72].
Although analyzing pre-vaccination Rwanda G8P [4] samples would have provided more comprehensive insights and this was not feasible due to limited availability. Furthermore, the prevalence of circulating rotavirus strains is known to naturally fluctuate [73], which further impedes in-depth analysis. Other limitations of our study include a need for more detailed demographic data for a deeper interpretation of the presented data and the fact that our samples were collected a decade ago. Despite these limitations, we believe performing a whole-genome characterization of this uncommon rotavirus strain would still be valuable. Our study provides significant insights into the evolution of G8P [4] strains in Rwanda, which could be useful in predicting their presence in neighboring regions.

Conclusions
The results demonstrated a close evolutionary relationship of Rwandan G8P [4] with other African strains, especially East African G8P [4] strains from Kenya and Uganda, an indication of their co-circulation in this region. Notable radical amino acid differences, which were observed at the neutralization sites when compared with cognate regions in vaccine strains, require further investigation, as they could potentially play a role in neutralization escape. Our findings also suggest the existence of reassortment events between co-circulating human DS-1-like RVA strains with bovine and RotaTeq™ WC3 bovine backbone genes. The high homology and phylogenetic clustering with RotaTeq™ WC3 bovine genes in the VP1 and VP3 genome segments were rather unexpected. Continued whole-genome surveillance of RVA strains is essential to evaluate the effect of RVA vaccines and to provide insights into the frequency of reassortment events that occur naturally, as well as the epidemiological fitness of RVA strains resulting from these events.

Informed Consent Statement:
The diarrheal stool samples were collected as a routine diagnostic clinical specimen when the parents brought their children to health facilities for clinical management, requiring no written informed consent. As part of the World Health Organization (WHO) coordinated rotavirus surveillance network, the archived rotavirus-positive specimens were anonymized and utilized for whole-genomic characterization under a Technical Service agreement to the University of the Free State Next-Generation Sequencing Unit, a WHO-Collaborating Centre for Vaccine-Preventable Diseases Surveillance and Pathogen Genomics based in Bloemfontein, South Africa. The WHO Research Ethics Review Committee granted an "exemption activity", noting that the procedures involved in the study are part of routine hospital-based rotavirus surveillance. Data Availability Statement: All the gene sequences in this study were submitted to the NCBI GenBank database under accession numbers OQ201345-OQ201575, and are included in Supplementary Data S1.