Global Molecular Epidemiology of Respiratory Syncytial Virus from the 2017−2018 INFORM-RSV Study

Respiratory syncytial virus (RSV) is the leading cause of lower respiratory tract infection among infants and young children, resulting in annual epidemics worldwide. INFORM-RSV is a multiyear clinical study designed to describe the global molecular epidemiology of RSV in children under 5 years of age by monitoring temporal and geographical evolution of current circulating RSV strains, F protein antigenic sites, and their relationships with clinical features of RSV disease. During the pilot season (2017–2018), 410 RSV G-F gene sequences were obtained from 476 RSV-positive nasal samples collected from 8 countries (United Kingdom, Spain, The Netherlands, Finland, Japan, Brazil, South Africa, and Australia).

shipped to the University Medical Centre Utrecht for sequencing. Individual patient data collected included: location, sample date, age, gender, referring department, and length of hospital stay (18).
RNA extraction, subtyping, RSV genome amplification, and next-generation sequencing. Nucleic acids were extracted from RSV-positive nasal specimens using the MagNA Pure LC kit (Roche Diagnostics, Mannheim, Germany) as previously described (18). RSV subtyping and quantification were performed by multiplexed TaqMan RT-PCR analysis of the RSV N gene using RSV A and RSV B specific primer/probe mixes. Subsequently, subtype-specific RT-PCR was performed using the SuperScript IV one-step RT-PCR system (Invitrogen, Carlsbad, CA, USA) to amplify 4 overlapping fragments covering the full RSV genome. The resultant 3.5 to 5.0 kb amplicons were pooled, purified from 1% agarose gels, used to construct libraries by means of the Nextera XT DNA Library Prep kit, and sequenced on a NextSeq 500 system (Illumina, San Diego, CA, USA) (18).
Sequence assembly and genotyping analysis. Assembly of next-generation sequencing (NGS) reads into RSV G-F contigs was performed using AstraZeneca's open-source NGS-Microbial Sequencing Toolbox, as previously described (18,19). Alignment of RSV G HVR2 and full-length nucleotide sequences was performed in MUSCLE and evolutionary analyses of full-length RSV G sequences were conducted in MEGA7. Assignment of RSV genotypes was performed by phylogenetic clustering of RSV G HVR2 nucleotide sequences using a previously described 2014 reference database (11).
Amino acid sequence variation analysis of RSV F proteins. The RSV A and RSV B F sequences in FASTA format were translated into amino acid sequences and aligned against reference F sequences derived from year 2013 Netherlands RSV A/13-005275 (GenBank accession no: KX858757) and RSV B/13-001273 (GenBank accession no: KX858756) reference strains, respectively. Amino acid variation per position was determined and reported from pairwise alignments as previously described (18).
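The per-position comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's pipeline: it assumes the query sequences are already aligned to the reference, in frame and without insertions or deletions, whereas the study used pairwise alignments. The function names and the 9-nucleotide toy sequences are hypothetical and are not RSV F data.

```python
from collections import Counter

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    nt = nt.upper().replace("U", "T")
    usable = len(nt) - len(nt) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, usable, 3))


def substitutions(ref_aa, query_aa):
    """List changes as reference residue, 1-based position, observed residue (e.g. 'A2T')."""
    return [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(ref_aa, query_aa), start=1)
        if r != q and q != "X"  # skip positions that could not be translated
    ]


def change_frequencies(ref_aa, queries_aa):
    """Percentage of query sequences carrying each amino acid change."""
    counts = Counter(c for q in queries_aa for c in substitutions(ref_aa, q))
    return {change: 100.0 * k / len(queries_aa) for change, k in counts.items()}


# Hypothetical toy data: a 3-codon "reference" and four "query" sequences.
ref_nt = "ATGGCTACT"
queries_nt = ["ATGGCTACT", "ATGACTACT", "ATGACTACT", "ATGGCTTCT"]

ref_aa = translate(ref_nt)
freqs = change_frequencies(ref_aa, [translate(q) for q in queries_nt])
print(ref_aa, freqs)
```

Changes are labelled with the reference residue first, matching the notation used in the Results (for example, A23T).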
Statistical analyses. A two-sided Fisher's exact test was used to assess statistical significance of global subtype distribution among demographic categories and to compare proportions of amino acid changes between antigenic sites.
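For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2x2 table can be sketched with the Python standard library alone. The statistical software used in the study is not stated here, so this is only an illustration of the method. The function name is hypothetical, and the example counts are illustrative values, not study data. The two-sided P value is taken as the sum of the probabilities of all tables, with the observed margins, that are no more likely than the observed table.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum every table that is as likely as, or less likely than, the observed one.
    # The small relative tolerance guards against floating-point ties.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Illustrative counts only (NOT study data): rows are two groups,
# columns are two outcomes.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

With larger tables, such as subtype by country, an r x c generalization of the test would be needed. This sketch covers only the 2x2 case.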

RESULTS
Geographic and demographic distribution of RSV A and B subtypes and genotypes. Between November 2017 and November 2018, 1,835 nasal samples tested RSV-positive among participating sites in 8 countries. Among the RSV-positive detections, 476 (25.9%) nasal samples were collected for inclusion in the INFORM-RSV study. The frequency and monthly pattern of RSV-positive samples collected from each country are shown in Fig. 2. Delayed study initiation resulted in fewer than the targeted [...] RSV-positive nasal samples failed sequencing due to unsuccessful RT-PCR amplification, insufficient sequencing depth, or low read quality. Among the 410 RSV strains with G-F sequence data, 127 (31.0%) were subtype A and 283 (69.0%) were subtype B. Overall, the proportion of RSV subtypes differed by country (P < 0.001), as RSV B was more prevalent than RSV A in 7 of 8 countries studied, with the exception being South Africa (Fig. 1 and Table 1). Finally, genotype determination revealed that all RSV A strains were of the Ontario 1 (ON1) genotype and all RSV B strains were of the Buenos Aires 9 (BA9) genotype. Distribution of RSV strains by gender, age, and length of hospital stay was also determined. The median age of RSV-positive individuals was 5 months (interquartile range [IQR], 2 to 9 months) and 81.2% (333 of 410) were aged less than 1 year; 56.3% (231 of 410) were males; and 70.5% (289 of 410) were hospitalized for ≥24 h. RSV isolates from outpatients, characterized by a length of hospital stay of <24 h, were mostly derived from 3 countries (Finland, Japan, and Brazil) and accounted for 29.5% (121 of 410) of the total. Stratification by referring department revealed that most RSV isolates came from other/undefined locations (66.3%; 272 of 410), followed by the pediatric ward (PW) (18.0%; 74 of 410), pediatric intensive care unit (PICU) (9.5%; 39 of 410), and emergency room/department (ER/ED) (6.1%; 25 of 410) (Table 1).
Overall, RSV B was more frequent than RSV A in all categories and there were no significant differences in the global proportion of subtypes by age group (P = 0.141) or length of hospital stay (P = 0.722). While a significantly higher proportion of RSV B cases were observed globally in females compared to males (P = 0.0311), no gender differences were observed within individual countries.
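The proportions reported in this section can be re-derived from the stated counts, which is a quick consistency check when reading the tables. The sketch below uses only numbers given in the text. The variable and function names are hypothetical, and the count of samples without a G-F sequence is derived here by subtraction rather than quoted from the source.

```python
# Counts as stated in the text; nothing here is new data.
TOTAL_POSITIVE = 1835  # RSV-positive nasal samples at participating sites
COLLECTED = 476        # samples collected for inclusion in INFORM-RSV
SEQUENCED = 410        # strains with G-F sequence data


def pct(part, whole):
    """Percentage rounded to one decimal place, as reported in the text."""
    return round(100 * part / whole, 1)


by_subtype = {"RSV A": 127, "RSV B": 283}
by_department = {"other/undefined": 272, "PW": 74, "PICU": 39, "ER/ED": 25}
by_characteristic = {"aged <1 year": 333, "male": 231, "stay >=24 h": 289, "stay <24 h": 121}

# Each stratification should account for every sequenced strain.
subtype_total = sum(by_subtype.values())
department_total = sum(by_department.values())

collected_pct = pct(COLLECTED, TOTAL_POSITIVE)
subtype_pct = {k: pct(v, SEQUENCED) for k, v in by_subtype.items()}
department_pct = {k: pct(v, SEQUENCED) for k, v in by_department.items()}
characteristic_pct = {k: pct(v, SEQUENCED) for k, v in by_characteristic.items()}

# Derived by subtraction (an inference, not a figure quoted from the text).
without_sequence = COLLECTED - SEQUENCED

print(collected_pct, subtype_pct, department_pct, characteristic_pct, without_sequence)
```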
Global analysis of RSV genetic variability. To understand genetic variability of the 2017-2018 RSV strains, we performed a phylogeographic analysis of G gene sequences by country. Within both RSV A (all ON1 genotype) and RSV B (all BA9 genotype) phylogenies, some sequences clustered within a country, suggesting microevolution, while other clusters contained sequences from multiple countries (Fig. 3). These data show that RSV A ON1 and RSV B BA9 strains from 2017-2018 were genetically diverse by geographic locale, consistent with wide transmission and continued evolution. [...] (Fig. 4). Only 2 amino acid changes in RSV A F were highly polymorphic: A23T (17.3%) in the signal peptide and T122A (11.8%) in the fusion peptide. In contrast, 7 amino acid changes in RSV B F were detected in a majority of sequences as follows: F15L (99.6%) in the signal peptide, A103V (100%) in F2, and L172Q (100%), S173L (99.6%), K191R (74.2%), I206M (77.4%), and Q209R (76.3%) in F1.
Amino acid variation was further examined in each antigenic site (Ø and I to V) by geography (Table 2) and depicted on prefusion and postfusion F protein trimer structures (Fig. 4). No statistical differences in the global proportion of amino acid changes were observed between antigenic sites (data not shown) and some changes occurred in both RSV A F and B F at the same positions (Y33, I206, S255, and S276). Overall, 11 amino acid changes were detected in 4 of 6 antigenic sites for RSV A F, with frequencies ranging from 0.8 to 9.4%, and 32 amino acid changes were detected in 6 of 6 antigenic sites for RSV B F, with frequencies ranging from 0.4 to 100.0%. Only 5 of the 32 antigenic site changes in RSV B F were highly polymorphic and detected in all countries: I206M (77.0%) and Q209R (76.3%) in site Ø and L172Q (100.0%), S173L (99.6%), and K191R (74.2%) in site V. With few exceptions, antigenic site changes of intermediate polymorphic frequency (≥1% and <10%) were detected in multiple countries. These results indicate that F protein sequences and antigenic sites from 2017-2018 were generally well-conserved compared to year 2013 reference strains, although RSV B strains exhibited greater variability.
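The frequency bands used above can be made concrete with a short sketch. Only the intermediate band (≥1% and <10%) is defined in the text. Treating ≥10% as "highly polymorphic" and labelling <1% as "rare" are assumptions made here for illustration, as are the function names. The sketch also converts reported percentages back to approximate sequence counts, using the 127 RSV A and 283 RSV B sequences as denominators.

```python
N_RSV_A, N_RSV_B = 127, 283  # sequenced strains per subtype, from the text


def classify(freq_pct):
    """Band a change by frequency; only the intermediate band is from the text."""
    if freq_pct >= 10.0:
        return "high"          # assumed threshold for "highly polymorphic"
    if freq_pct >= 1.0:
        return "intermediate"  # >=1% and <10%, as defined in the text
    return "rare"              # assumed label for <1%


def approx_count(freq_pct, n):
    """Approximate number of sequences behind a reported percentage."""
    return round(freq_pct * n / 100)


# RSV B antigenic site changes and frequencies as reported in this paragraph.
site_changes_b = {"I206M": 77.0, "Q209R": 76.3, "L172Q": 100.0, "S173L": 99.6, "K191R": 74.2}
bands = {change: classify(f) for change, f in site_changes_b.items()}
counts = {change: approx_count(f, N_RSV_B) for change, f in site_changes_b.items()}

# Frequency of a change seen in exactly one sequence, per subtype.
singleton_a = round(100 / N_RSV_A, 1)
singleton_b = round(100 / N_RSV_B, 1)

print(bands, counts, singleton_a, singleton_b)
```

Under these assumptions, the lower bounds of the reported ranges (0.8% for RSV A F and 0.4% for RSV B F) equal the frequency of a change observed in a single sequence, which suggests that the rarest changes may have been seen only once.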

DISCUSSION
RSV A and B cocirculate during seasonal epidemic periods with alternating patterns of predominance over time (21). However, little is known about temporal evolution of RSV strains, global spread of unique genotypes, or how these factors relate to disease severity. Also important to the development of vaccines and MAbs is the need to identify and track patterns of F protein antigenic site changes, which may confer selective advantages in transmission or resistance. The INFORM-RSV study aims to describe global molecular evolution and epidemiology of RSV by prospectively monitoring temporal and geographical distribution of currently circulating strains. At the time of writing, the INFORM-RSV study has been ongoing for 3 years and is currently being conducted in 17 countries across 5 continents. The results herein provide baseline information on RSV strain distribution associated with different clinical parameters [...] Because the impact of viral factors on clinical parameters of disease severity has remained inconclusive (28), it was important to understand the distribution of RSV strains among demographic and clinical characteristics. Ultimately, most RSV strains were collected from hospitalized male infants aged less than 1 year, consistent with estimates of incidence and hospitalization rates (29), known risk factors, and the anatomic nature of shorter and narrower airways in infant males who are more likely to develop bronchial obstruction due to RSV infection (5). Unfortunately, the outpatient burden of RSV on health care resources has not been well defined (1,2,30) and few INFORM-RSV countries collected RSV-positive samples from outpatients who were medically managed without hospital admission.
Although hospital-based laboratory data on RSV infections may markedly underestimate the global burden of RSV disease, we observed no significant or meaningful differences in subtype/genotype distribution across clinical features of disease severity as assessed by gender, age group, or length of hospital stay.
The RSV F protein has historically been relatively well conserved, yet continues to evolve (12,31). To that end, data from the INFORM-RSV 2017-2018 pilot season establish an important molecular baseline of RSV F protein sequence and antigenic site variation from which to track frequency, geography, and evolutionary trajectory of potential neutralization escape variants as an early warning for vaccines and MAbs in development. Although the observed variability of the 2017-2018 RSV F sequences was low, with no differences in the proportion of amino acid changes between antigenic sites, the frequency and geographical distribution of some variants suggest a recent positive selection of favorable amino acid changes. Indeed, RSV B strains containing Q209R (site Ø) and L172Q/S173L (site V), first reported in China (2014-2016) (32), have recently emerged as dominant variants, with the addition of the I206M (site Ø) and K191R (site V) changes detected in the United States (2015-2019) (22,33). These additional changes are possibly due to natural selective pressure from maternal or host neutralizing antibodies. Since sites Ø and V elicit the greatest frequency of high-potency antibodies (34) in a structural area requiring a great deal of flexibility (13), these sites may tolerate greater amino acid variation than others. Additional, less frequent amino acid changes detected during the INFORM-RSV 2017-2018 study were frequent enough to be resampled in multiple countries but have yet to spread globally.
While the impact that widespread use of anti-RSV F MAbs will have on the emergence and transmission of resistant variants is unknown, these variants may also arise naturally in the absence of drug selection pressure. To date, palivizumab resistance-associated polymorphisms have been rarely observed in circulating RSV strains (35). Consistent with these reports, the restricted use of palivizumab (Synagis) (7), and the growth disadvantage of resistant variants in the absence of palivizumab selective pressure (36), we observed no known palivizumab target site II polymorphisms among 2017-2018 RSV strains. Also consistent with the rapid emergence and outgrowth of RSV B strains containing L172Q/S173L in the United States (2015-2019) (22,33), these nonconservative polymorphisms in suptavumab target site V were detected in nearly all (≥99.6%) global 2017-2018 RSV B strains and coincide with clinical resistance and the recent failure of suptavumab to reduce overall RSV hospitalizations or outpatient LRTI in preterm infants in a phase 3 trial (6,37). Finally, conservative I206M/Q209R polymorphisms in nirsevimab target site Ø were detected in 77% of RSV B strains but have been shown to retain susceptibility to neutralization by nirsevimab (38). Accordingly, despite the recent emergence of these polymorphisms, nirsevimab significantly reduced medically attended RSV LRTI in healthy preterm infants in a recent phase 2b trial (9).
There are some limitations to the INFORM-RSV study. Key challenges to temporal analyses between geographies include adequate country representation and timing of RSV epidemics by season and location. Although low rates of RSV A and B coinfection (<2%) have been reported (22,39), the use of subtype-specific primers/probes in the INFORM-RSV study did not permit detection of RSV A and B coinfection. Data on patients' viral load are unavailable and therefore additional phylodynamic evolutionary and viral spread analyses are not possible. Since our data are heavily weighted toward infants with severe RSV disease that required hospitalization, we do not know about trends and molecular analyses of RSV from children who were medically managed as outpatients or were asymptomatic and did not seek medical attention. Our use of a 2014 RSV G HVR2 reference database (11) to genotype contemporary isolates has limitations as RSV continues to evolve. Accordingly, an extensible, centralized, curated, open database of reference sequences is needed to standardize genotyping and allow comparability across studies. Finally, future phenotypic susceptibility data would help to understand the functional impact of F protein antigenic site changes against anti-RSV F MAbs.
The strength of the INFORM-RSV study is reflected in its prospective design to characterize temporal and geographic trends in RSV diversity and to progress for several years with widespread global participation. Historically, RSV molecular epidemiology studies have been retrospective, focused exclusively on G gene diversity, and/or limited by geographic coverage and low sampling effort (15,26,40,41). While global RSV surveillance is conducted by the European Influenza Surveillance Network (4) and the World Health Organization (42), neither provides subtype differentiation or sequence analyses when reporting patterns of circulation. Findings from the INFORM-RSV study may have important implications in understanding the impact of RSV evolution on transmission, pathogenesis, and prophylaxis effectiveness. Tracking the frequency, recurrence, and distribution of amino acid changes that may confer selective advantages is a key focus of INFORM-RSV. Recent strains and dominant genotypes have genetic differences from the prototype virus strain used in most vaccine research (43). Since antigenic site changes could alter viral antigenicity for vaccines and affect viral susceptibility to MAbs, novel agents for prophylaxis cannot afford to miss their contemporary targets when they are eventually deployed.
In conclusion, ongoing surveillance of global molecular epidemiology of RSV is important for detecting the emergence and spread of new strains, predicting their clinical impact, and providing an early warning system of antigenic changes that may affect the effectiveness of vaccines and MAbs. To that end, the INFORM-RSV 2017-2018 pilot season establishes an important molecular baseline of RSV strain distribution and sequence variability among hospitalized infants from which to investigate temporal and geographic relationships in the years ahead.

ACKNOWLEDGMENTS
We