Genotype-specific variation in West Nile virus dispersal in California

West Nile virus (WNV) is an arbovirus that was first reported in North America in New York in 1999 and, by 2003, had spread more than 4,000 km to California. However, variation in viral genetics associated with spread are not well understood. Herein, we report sequences for more than 100 WNV isolates made from mosquito pools that were collected from 2003 – 2011 as part of routine surveillance by the California Mosquito-borne Virus Surveillance System. We performed phylogeographic analyses and demonstrated that 5 independent introductions of WNV (1 WN02 genotype strain and 4 SW03 genotype strains) occurred in California. The SW03 genotype of WNV was constrained to the southwestern U.S. and had a more rapid rate of spread. In addition, geographic constraint of WNV strains within a single region for up to 6 years suggest viral maintenance has been driven by resident, rather than migratory, birds and overwintering in mosquitoes.


INTRODUCTION
West Nile virus (WNV) is an arbovirus of the family Flaviviridae that was first identified in North America in 1999. In the U.S., Culex mosquitoes are the primary vectors for WNV (1)(2)(3)(4), and passerine birds are the most common avian host (1,5). As incidental hosts, humans, equines, other mammals, and many bird species do not contribute to the enzootic cycle or evolution of WNV.
WNV was first reported in New York and reached the West Coast within 4 years. Beginning in July 2003, mosquito pools from California tested positive for WNV (6). In addition, three autochthonous human cases of WNV in southern California were reported to ArboNET in 2003. Human WNV disease cases in northern and southern California were reported to ArboNET the following year, and, by the conclusion of the 2004 transmission season, WNV had been detected in every county in the state of California (7).
The ecology of California has presented WNV with physical and climatic barriers as well as an assortment of vector and host communities. Although much of California has a Mediterranean climate with cool, rainy winters and dry summers, it is a very long state extending 1,240 km from the dry SE deserts at the Mexican border to the NW coastal rain forests at the Oregon border. The state is divided longitudinally by the Coast Range and the Sierra Nevadas, which form the east and west boundaries of the intense Central Valley agroecosystem, respectively, and by the Tehachapi and San Gabriel mountains, which enclose the southern end of the Central Valley and isolate the highly urbanized Los Angeles basin and San Diego areas from the remainder of the state. Natural physical divisions have altered mosquito and avian host communities. Culex tarsalis Coq., the primary rural vector throughout most of the state, is divided into distinct clades found in the SE Desert biome and the northern areas of the state (8). Culex pipiens L complex populations vary longitudinally, and recent genetic studies have detected four major groups based on structure analysis: Cx. p. quinquefasciatus south of the Tehachapi Mountains, Cx. pipiens form pipiens and Cx. pipiens form molestus in urban areas near San Francisco and Sacramento, and admixtures of all three forms in the Central Valley (9).
By mapping the sampling location and date of WNV-positive mosquito pools, a single introduction of WNV into southern California, followed by northward expansion, was proposed as the route for WNV emergence in California (6). The large-scale spatial dynamics of WNV emergence in the U.S. also was determined using sequence data from isolates collected across the country between 1999 and 2006 (10). In this analysis, the virus was shown to have spread from the East Coast to the West Coast at an average diffusion rate of 1,200 km/year, which includes a phase of increasing rate from 1999 to 2003 followed by a reduced rate of diffusion from 2004 to 2006. The high rate of WNV dispersal in the U.S. could be explained by WNV infection of migratory birds that travel long distances annually (11); however, evidence for infection in vernal northbound migrants has been limited for WNV (12,13) as well as other North American encephalitides (14,15). In contrast, movements by resident birds have also been shown to be important for WNV enzootic maintenance and movement (16). Furthermore, the phylodynamics of two emergent WNV genotypes, WN02 and SW03, have not been studied. These genotypes are characterized by the amino acid substitutions E-V159A and NS4A-A85T, respectively (17)(18)(19). WN02 genotype isolates have a shorter extrinsic incubation period in Culex mosquitoes than the founding NY99 strain (17,20,21), and WN02 and SW03 genotype isolates replicate to higher peak titers in house sparrows compared to the original NY99 genotype (22), suggesting that WNV transmission dynamics may differ by genotype.
In the current study, we used a consistent sampling method of collecting WNV isolates from mosquitoes, followed by sequence analysis, to evaluate WNV dynamics during its emergence in California from 2003 to 2011. We show evidence for multiple introductions of WNV into California. Next, we compared the rate of dispersal of WN02 and SW03 genotypes to infer adaptive phenotypes. Finally, using isolates collected during winter months, we provide evidence for overwintering of WNV in mosquitoes in southern California.

Mosquito pools
Mosquitoes were collected and pooled by species during routine surveillance by California mosquito control agencies from 2003 to 2011. Pools were triturated, and viral RNA was extracted at the UC Davis Center for Vectorborne Diseases. Samples were screened for WNV using quantitative real time RT-PCR, as recently described (23). Pools were selected for further analysis based on Ct score and date and location of WNV RNA detection. Isolates were stratified by year and month of collection, genotype, or mosquito species. Significance was determined using a chi-square test.

Viral genome sequencing and assembly
Mosquito homogenates were passaged once on Vero or C6/36 cells. Viral RNA was extracted from clarified primary cultures using an ABI MagMax kit. RNA amplification was performed as previously described (24). Illumina library construction was performed using NexteraXT (Illumina) by following the manufacturer's protocol. Sequencing was performed on the Illumina HiSeq2500 platform, generating paired-end 101 bp reads. The reads were assembled using the VICUNA assembly program (25). Sequences for 112 isolates were deposited into GenBank with accession numbers KR348915-KR349026 (http:// www.ncbi.nlm.nih.gov/Genbank/).

Phylogenetic analyses
143 previously sequenced full-length WNV genomes were selected from GenBank based on date and location of sampling and added to the dataset of 112 newly sequenced isolates, creating a final dataset of 255 sequences. These 143 previously-sequenced isolates were collected between 1999 and 2012. 23 out of the 143 isolates were collected in CA, and the remaining 120 isolates were collected across North America outside of CA. Latitude, longitude, and date were taken from GenBank entries or estimated as previously described (10). Dates were converted into decimals.
The coding regions of all 255 sequences were aligned using Clustal Omega (26) and edited manually. jModelTest2 was used to determine the most appropriate nucleotide substitution model, which was a generalized time reversible (GTR) model with a gamma (Γ) distribution of rate variation among sites (27,28). Phylogenies were constructed using the Bayesian MCMC method in BEAST v1.8 (29) with a GTR + Γ substitution model, a lognormal relaxed molecular clock, and a time-aware smoothed GMRF Bayesian Skyride coalescent model (30,31). Distances were calculated based on great circle distances. Spatial parameters Duggal  were estimated using relaxed random walks (RRW), a heterogeneous probability distribution of diffusion in order to allow for variation in dispersal rate, as previously determined to be most appropriate for WNV in North America (10). However, selection for the dispersal model was also performed by using the harmonic mean estimator (HME) to calculate log Bayes Factors for gamma, lognormal, and cauchy heterogeneous models and a homogenous Brownian model. The homogeneous model was rejected. A lognormal distribution of heterogeneous dispersal rates was used for further analyses because the cauchy RRW produced results that were very similar, and MCMC convergence was not reached with the analysis using a gamma RRW. MCMC chains were run for 200 million states, sampling every 20,000 states. Convergence was evaluated using Tracer v1.6. The maximum clade credibility (MCC) tree was determined using TreeAnnotator and visualized using FigTree v1.4.2 and SPREAD (32). Computing resources were used from the Cipres Science Gateway (33). WNV positive mosquito pools were first detected in California in 2003, and the number detected varied annually (Fig. 1A). Therefore, the number of WNV genomes sequenced from mosquito pools in this study also varied by year (Fig. 1B). In addition, the total number of sequenced isolates varied by month (Fig. 1C), mosquito species (Fig. 1D), and location of collection ( Fig. 1E and Fig. 2A). Across the years sampled, the number of isolates that were sequenced was generally proportional to the total number of WNV positive mosquito pools detected in California. The number of sequenced isolates were underrepresented in 2006, likely due degradation of samples, and overrepresented in 2003 due to an attempt to understand the genetic composition of isolates at the presumed time of WNV introduction into California. The majority of sequenced WNV RNA-positive mosquito pools were collected during July, August, and September (Fig. 1C) in a manner proportional to the number of positive mosquito pools and human WNV disease cases reported to ArboNET from California. Most isolates were collected in 5 mosquito control districts: GRLA, KERN, COAV, SAYO, and CNTR ( Fig. 2A), which together accounted for the locations of 50% of human WNV disease cases in CA. In total, isolates from ca. 1% of all WNV RNA-positive mosquito pools collected in CA between 2003 and 2011 were sequenced in this study.

Sampling and sequencing of WNV from mosquito pools
Isolates were stratified by genotype (WN02 or SW03). Approximately 60% of WNV isolates were WN02 genotype, and 40% were SW03 genotype. Analyses of isolates based on genotype showed significant differences based on year of collection (p<0.001; Fig. 1C). There were more WN02 genotype viruses than SW03 genotype viruses collected from 2003-2005, indicating that the WN02 genotype was likely responsible for human outbreaks during these years. The distribution of SW03 genotype viruses was also restricted based on location, such that the proportion of WN02 and SW03 genotype viruses from southern CA was significantly different compared to northern CA (p<0.001; Fig. 1D). The majority of viruses collected in northern CA were WN02 genotype. The mosquito species represented by the isolates included Cx. pipiens and Cx. tarsalis in the north and Cx. quinquefasciatus, Cx. tarsalis, and Cx. stigmatosoma in the south (Fig. 2B). However, there was not a significant difference in the proportion of WNV genotypes by mosquito species.

Phylodynamics of WNV in California
The  (Fig. 3)

Phylogeography of WNV in California
During 2003, WNV was only detected in the 6 most southern counties of California. The most likely sources of the WNV strains introduced into CA were the western and southern states, primarily Arizona and Texas ( Fig. 4A and B). Isolates from Mexico that were included in this analyses were also derived from Arizona (SW03 genotype) or CA (WN02 genotype) and were introduced in 2002-2003 ( Fig. 4A and B).
Previous studies have shown that WNV followed a heterogeneous dispersal pattern during emergence in the U.S., in which rapid dispersal from 1999  (Fig. 4B). In contrast, the WN02 genotype was not found among the 10 isolates from Arizona included in the analysis and was restricted in its diffusion in the southwest (Fig. 4A). However, it spread within northern CA to a much greater extent than the SW03 genotype (Fig. 4A). This suggests that the observation in Fig. 1E that the SW03 and WN02 genotypes were found in different proportions based on location may be due to genetic determinants of viral spread within the different climates, mosquito subspecies, or avian hosts in the southwestern U.S. had been maintained in southern California through the winter by mosquitoes, these winter isolates should phylogenetically cluster with viruses collected at the end of one transmission season and viruses collected at the beginning of the next transmission season. In addition, spatial clustering of these WNV variants would be expected.

Overwintering of WNV in mosquitoes
In Fig. 5A, isolate COAV 179_2011, which was collected on Feb. 15, 2011 from Culex tarsalis, is highlighted on an MCC Bayesian phylogeny. This winter isolate, which clustered with the SW03 genotype viruses, clustered temporally between isolates collected during the 2010 and 2011 WNV transmission seasons (Fig. 5B). COAV isolates were also geographically clustered north of the Salton Sea for 6 years. Similar results were found for a second winter isolate (GRLA 6143_2007, SW03 genotype, collected Dec. 5, 2007 from Cx. quinquefasciatus), also highlighted on the MCC Bayesian phylogeny (Fig. 5A). This isolate was phylogenetically and geographically clustered in the same location for 2 years (Fig. 5C).
These data indicate the overwintering of WNV amongst winter Culex mosquitoes in southern CA.
While no winter isolates were collected from northern CA, there was phylogenetic clustering of WN02 isolates from SAYO, SUYA, SHAS, and CNTR districts from 2005 to 2011. This indicates that WNV was also likely maintained in northern CA, rather than reintroduced annually from southern CA.

DISCUSSION
By sampling mosquitoes from the same locations throughout the emergence and maintenance of WNV in California ( Fig. 1 and 2), a unique dataset of viral isolates was created to evaluate the spatial and temporal evolution of WNV. With this dense sampling strategy, we were able to determine that WNV was introduced into California at least 5 times via southern CA, with the majority of introductions occurring around 2003 and by SW03 genotype viruses (Fig. 3). The first three of the introduced WNV strains (one WN02 and two SW03 genotype viruses) were maintained in California through the most recent sampling dates in 2011, and the first two of the introduced strains also spread to northern California (Fig. 3). These studies build upon previous reports of the variable rate of diffusion of WNV (10) by adding a genetic component to this heterogeneous dispersal pattern. SW03 genotype viruses had a higher dispersal rate than the average of all WNV isolates, and SW03 and WN02 genotype viruses occupy different spatial niches in the U.S. (Fig. 4). Horizontal transmission of WNV during winter months has previously been shown to occur in southern California in birds (35) and Culex tarsalis mosquitoes (36). The phylogeographic continuity of WNV evolution (Fig. 5) confirmed that overwintering contributes to enzootic maintenance. Together, these analyses suggest that the spread and maintenance of WNV is dependent on viral genetics and climate.
Since the introduction of WNV to the U.S., only two non-synonymous mutations in WNV have risen to high frequencies. The E-V159A substitution (WN02 genotype) has been found in all sequenced WNV isolates for several years. The NS4A-A85T substitution (SW03 genotype) has arisen independently at least four times in the southwestern U.S. While WNV lacked strong spatial clustering during its emergence in the U.S., likely due to a complex enzootic cycle involving birds, the isolates from CA showed evidence of spatial restriction based on genotype. Although there were no differences between WN02 and SW03 genotypes in the location of their introductions into California (Fig. 4), SW03 genotype viruses were more frequently located in southern CA than WN02 genotype viruses (Fig. 1), were introduced more frequently to CA than WN02 genotype viruses (Fig. 3), and were estimated to have a higher dispersal rate than WN02 genotype viruses (Fig. 4). These seemingly contradictory characteristics of a spatially restricted yet more diffuse virus suggest that the amino acid substitution that defines the SW03 genotype, NS4A-A85T, may be a temperature-sensitive adaptive mutation that drives a higher dispersal rate only in warmer climates. The mechanism by which SW03 viruses could diffuse faster than WN02 viruses is unknown but may be due to faster kinetics of viral replication or higher competence in mosquitoes (17,21) or peridomestic passerine birds such as house sparrows (22) compared to WN02 genotype viruses. In addition, the warmer climate in southern CA may naturally decrease the extrinsinic incubation period of WNV in mosquitoes.
Other WNV mutations that have been previously characterized include several that are known to increase temperature sensitivity of WNV (37). These mutations were only found in CA isolates from 2003 through 2005. The decreased ability of WNV isolates containing these temperature-sensitive mutations to replicate at the high temperatures that are generated during corvid infections (38) may have contributed to the disappearance of these mutations. Therefore, temperature sensitivity is a critical phenotype to consider in the analysis of WNV spread.
In this dataset, there is evidence of few long-distance WNV dispersal events after the virus was introduced to CA. While the introduction of WNV into a naïve location may be more visibly dependent on migrating birds, the maintenance of WNV in an endemic area is likely dependent on resident birds, which contribute only small-scale movements. In order to understand whether the differential dispersal of emergent WNV genotypes in the U.S. is due to differences in host or vector competence, the replication dynamics of the WNV genotypes will need to be compared in multiple avian and mosquito species.    The SW03 genotype of WNV was constrained to southern California. The SW03 genotype spread with a more rapid dispersal rate than the WN02 genotype. The overall dispersal rate of WNV has decreased dramatically since its emergence. WNV overwintered in California in Culex mosquitoes.