A phylogenomic analysis of Marek's disease virus reveals independent paths to virulence in Eurasia and North America

Abstract Virulence determines the impact a pathogen has on the fitness of its host, yet current understanding of the evolutionary origins and causes of virulence of many pathogens is surprisingly incomplete. Here, we explore the evolution of Marek's disease virus (MDV), a herpesvirus commonly afflicting chickens and rarely other avian species. The history of MDV in the 20th century represents an important case study in the evolution of virulence. The severity of MDV infection in chickens has been rising steadily since the adoption of intensive farming techniques and vaccination programs in the 1950s and 1970s, respectively. It has remained uncertain, however, which of these factors is causally more responsible for the observed increase in virulence of circulating viruses. We conducted a phylogenomic study to understand the evolution of MDV in the context of dramatic changes to poultry farming and disease control. Our analysis reveals evidence of geographical structuring of MDV strains, with reconstructions supporting the emergence of virulent viruses independently in North America and Eurasia. Of note, the emergence of virulent viruses appears to coincide approximately with the introduction of comprehensive vaccination on both continents. The time‐dated phylogeny also indicated that MDV has a mean evolutionary rate of ~1.6 × 10−5 substitutions per site per year. An examination of gene‐linked mutations did not identify a strong association between mutational variation and virulence phenotypes, indicating that MDV may evolve readily and rapidly under strong selective pressures and that multiple genotypic pathways may underlie virulence adaptation in MDV.

The debate has also been reinvigorated following recent research into the evolution of Marek's disease virus (MDV, also referred to as Gallid herpesvirus 2 [GaHV-2]), which is a member of the genus Mardivirus in Herpesviridae (subfamily Alphaherpesvirinae). MDV infects chickens and rarely other avian species and is controlled in the global poultry industry by the near-universal application of modifiedlive virus vaccines, without which infected chickens typically develop acute rash, and edematous neuronal and brain damage: severe lymphomas, paralysis, and death at a very young age (Witter, 1997(Witter, , 2001. The severity of disease in unvaccinated chickens has steadily increased since the adoption of large-scale farming techniques in the 1950s and mass vaccination since the 1970s (Nair, 2005;Osterrieder, Kamil, Schumacher, Tischer, & Trapp, 2006;Witter, 1998). Data from a recent study show that newer MDV lineages that evolved in the vaccination era are significantly fitter than ancestral vaccine-naïve strains (Read et al., 2015). These findings support theoretical predictions (Gandon et al., 2001;Smith & Mideo, 2017) that the use of vaccines to suppress but not block pathogen replication or transmission (so called antidisease, imperfect, or leaky vaccines) results in the evolution of viruses with increased replication (i.e., fitness) or transmission and therefore virulence. Meanwhile, previous studies addressing the industrialization of farming have discussed how MDV virulence can increase independently of vaccine use Rozins & Day, 2017). In these studies, denser flocks and longer durations for rearing in combination with shorter intercohort intervals and limited virus elimination by cleaning and disinfection can lead to reduced MDV virulence. Many of these factors seem counterintuitive from the perspective of disease prevention, but they indicate how substantially shortened host cohort durations, which have halved for broiler chickens in 60 years due to advances in nutrition and breeding (Anthony, 1998), could have inadvertently contributed to the evolution of increased virulence.
Both vaccine use and intensive farming can influence pathogen transmission dramatically because they artificially manipulate the immune status and population dynamics of the host. The impact of these factors on transmission could in principle result in a drive toward increased pathogen virulence. Given that both imperfect vaccination and industrialized farming practices are now pervasive in the global poultry industry, we set out to explore their impact on the evolution of MDV. We present findings from a phylogenomic study aimed at resolving the geographical, temporal, and ancestral trait patterns of MDV virulence evolution in the context of these changes.

| Genome sequencing
Viral DNA was obtained from the blood of infected animals or from chicken embryo cells (CEC) that were infected with the respective strain of MDV as described earlier (Schumacher, Tischer, Fuchs, & Osterrieder, 2000). None of the viruses were passaged more than five times on CEC. For DNA extraction, 200 μl buffy coat from infected animals or infected CEC (at day 7 postinfection) from a 100-mm dish was resuspended in 500 μl TEN buffer (100 mm NaCl, 10 mm Tris, 1 mm EDTA, pH 8.0) and lysed by the addition of 250 μl sarcosine lysis buffer (75 mm Tris, 25 mm EDTA, 3% (w/v) N-lauryl sarcosine, pH 8.0) followed by a 15-min incubation at 65°C. RNA was degraded by the addition 10 μl RNAse A (10 mg/ml) and 30-min incubation at 37°C.
Protein was digested during a 16-hr incubation after the addition of 10 μl proteinase K (20 mg/ml). DNA was extracted using a standard phenol/chloroform procedure followed by ethanol precipitation (Sambrook, Fritsch, & Maniatis, 1989). After centrifugation, DNA was resuspended in 50 μl TE buffer and DNA concentration determined by spectrophotometry.
Five micrograms of total DNA was diluted in 130 μl TE and fragmented to a peak fragment size of 500-700 bp using the Covaris M220 focused ultrasonicator (peak incident power: 75 W, duty cycle: 5%, cycles per burst: 200, duration: 52 s). Fragmented DNA (1 μg) was used for NGS library preparation using the NEBNext Ultra II Library Prep Kit for Illumina platforms (New England Biolabs).
To capture viral sequences from total cellular DNA extracts, we developed a tiling array using a bait set designed to capture the sequence of MDV strain RB-1B (GenBank EF523390.1). The tiling array consisted of approximately 6,200 biotinylated RNA 80-mers, employed according to manufacturer's instructions (Mycroarray, Ann Arbor, MI).
Highly enriched viral sequence libraries were generated from 500 ng of indexed total DNA libraries. For sequencing on a MiSeq instrument (Illumina), target-enriched libraries were quantified by qPCR using the NEBNext Library Quant Kit (New England Biolabs). Up to 48 samples were pooled to equal molarity and subjected to sequencing using Illumina's v3 chemistry for 2 × 300 bp paired-end sequencing.
Read quality of NGS data was assessed with FastQC (Andrews, 2010).
Alignment was performed using BWA ) against the reference genome of the very virulent (vv) MDV strain Md5 (RefSeq NC_002229.3), which consists of 177,874 bp and encodes 103 predicted proteins (Tulman et al., 2000). Freebayes (Garrison & Marth, 2012) and BCFtools  were then used to create a consensus sequence of mapped reads.

| Evolutionary analysis
We combined four newly sequenced consensus genomes with 18 complete or near-complete MDV genomes obtained from GenBank (for detailed strain information, see Table S1). The genomic sequences were aligned using MAFFT v7.205 (Katoh & Standley, 2013). For phylogenetic analysis, alignment gaps associated with incomplete genomic data, variable repeat regions, mini-F vector sequences, and reticuloendotheliosis virus long terminal repeats (the latter two being present in some of the full-genome sequences) were removed using Gblocks v0.91 (Castresana, 2000). Remaining ambiguous nucleotide positions were treated as missing data in subsequent analyses.
Temporal signal was examined by plotting root-to-tip phylogenetic divergences using TempEst v1.5 (Rambaut, Lam, Max Carvalho, & Pybus, 2016). We employed RDP, GENECONV, and BootScan within the software package RDP4 v4.56 (Martin, Murrell, Golden, Khoosal, & Muhire, 2015) to detect evidence of recombination, using a threshold of p < .05 and Bonferroni correction to account for multiple test- 2009) using a symmetric substitution model. MCMC were run and combined as described above, followed by mapping of inferred ancestral traits onto a maximum clade credibility (MCC) tree. Geographical locations (North America, Eurasia) were inferred from the literature as were pathotypes, which are based on a standardized in vivo virulence grading scheme (Witter, 1997), ranging from mild (m), virulent (v), very virulent (vv), and very virulent+ (vv+) categories. More recent field strains with a history of high virulence but not pathotyped within this grading scheme were treated as a separate category that we refer to here as "hypervirulent" (hv).

| RESULTS
The class E genomes of MDV consist of one long (U L ) and one short (U S ) unique region that are separated by a variable long (IR L ) and short (IR S ) internal repeat region ( Figure 1). Together, this core portion of the genome contains all the predicted ORFs. There are additional invertedrepeat long (TR L ) and inverted-short (TR S ) regions that flank this core region. As the TR L and TR S are identical to the IR L and IR S , respectively, those regions were not used in subsequent analysis. The final alignment used in phylogenetic analysis consisted of 141, 110 bp, including 643 variable sites, of which 353 were singletons. Variation was distributed unevenly across the alignment. The first half, coinciding approximately with the U L , proved to be highly conserved with variation steadily increasing thereafter and peaking in the repeat regions of the genome (IR L and IR S ) before decreasing again across the U S (Fig. S1).

| Evolution
Two genome sequences, GA and 584Ap80C (BAC20), were removed from the alignment after inspection of a linear regression between ML tree root-to-tip divergences and time. The passaging history of GA before sequencing is unknown and has nucleotide sequence discrepancies in 14 genes (S. Spatz, pers. comm.), while 584Ap80C has a history of 80 passages in CEC and is known to be attenuated (Schumacher et al., 2000). The disruptive influence of the two sequences on the temporal signal (but not topology) of the tree is therefore not surprising (Fig. S2a vs. b). An analysis of recombination in the remaining 20 MDV genomes pointed to the possible influence of recombination in the evolutionary history of MDV. Two events were detected in two neighboring regions of the alignment spanning ORFs MDV040 to MDV066 and MDV072 to MDV073, respectively (Table S2). The impact of the recombinant regions on ML tree topology was minimal, but their removal influenced temporal signal (Fig. S2b vs. c). The resulting temporal signal in the final dataset was high (Figure 2a; R 2 = 0.93; Correlation Coefficient = .97), indicating that a time-scaled phylogenetic analysis was appropriate.
For dated phylogenetic reconstructions in BEAST, a GTR + Γ 4 site model, constant size coalescent tree prior, and relaxed uncorrelated lognormal clock (Drummond, Ho, Phillips, & Rambaut, 2006) were used. These were selected following Bayes factor comparison of marginal-likelihood estimates ( Table 1)

| Mutation analysis
Variation in standardized ORF mutations mirrored the pattern of genetic variation across the MDV genome with the majority of mutations falling in ORFs in the latter half of the U L , the U S , or the internal repeat regions (IR L , IR S ) (Figure 4a, Fig. S1). We then looked beyond this broader pattern to search for genetic variation that may be associated with the evolution of MDV virulence. We compared standardized point mutations in contemporary strains within the virulent clades EUA and NA (Figure 4b), in addition to quantifying independent mutational changes that occurred between the root of the tree and either the MRCA of EUA or NA (Figure 4c, File S1). We did not detect a strong correlation in the pattern of ORF point mutations between clades EUA and NA in tip genetic variation (Spearman's rho = .205, p-value = .069, Figure 4b), but a marginally significant correlation was detected in mutational changes that evolved between the root and the ancestral sequences of clades EUA and NA, respectively (Spearman's rho = .228, p-value = .043, Figure 4c). Most ORFs harbored mutations that were not obviously associated across virulent viruses. However, we did identify a small number of ORFs with greater-than-average mutations in both EUA and NA (Table 2)

| DISCUSSION
Understanding the evolutionary history of MDV is important for two reasons. Firstly, the history of MDV in the 20th century represents a very useful case study for understanding the evolutionary origin and causes of increasing virulence. Here, high-throughput sequencing in combination with a novel MDV tiling enrichment array has enabled us to determine genomic sequences quickly and gain new insight into the genomic origins and underlying mechanisms of this process. Secondly, understanding how and why MDV virulence has increased is of significant applied relevance not just for the poultry industry, but also in the wider context of the emergence of treatment-resistant pathogens.
The increasing incidence of drug and vaccine resistance represents a major global challenge to human and animal health, companion animals, and wildlife alike, yet meaningful solutions to this growing problem are lacking.   (Duggan et al., 2016;Firth et al., 2010) and MYXV (~1 × 10 −5 ) (Kerr et al., 2012(Kerr et al., , 2017. Overall, such rates are generally higher than typically expected for dsDNA viruses and imply higher rates of nucleotide substitution than those inferred by host-virus codivergence analysis (Firth et al., 2010). One possibility is that these  Despite this general pattern, we did identify a minority of candidate ORFs with mutations that appeared to be associated specifically with virulent MDV strains (  (Amor et al., 2011;Brown et al., 2006;Jarosinski, Osterrieder, Nair, & Schat, 2005;Kamil et al., 2005;Liu et al., 1999;Lupiani et al., 2004;Nair, 2013;Reddy et al., 2002;Tischer, Schumacher, Messerle, Wagner, & Osterrieder, 2002). While known as an important MDV oncoprotein, Meq is also expressed during lytic replication and has a potential role in the rapid production of virus progeny (Coupeau, Dambrine, & Rasschaert, 2012). Watson, & Nair, 2009), which in the context of MDV virulence evolution should not be overlooked (Zhao et al., 2011). Another particularly variable ORF associated with virulent Eurasian and North American viruses is R-LORF4. R-LORF4 is in the direct vicinity of Meq, and splice variants between Meq and R-LORF4 have been identified. In addition, R-LORF4 was shown to be a virulence factor, much like Meq (Jarosinski et al., 2005;Kim, Hunt, & Cheng, 2010). However, overall we consider it as likely that subtle epistatic interactions among many genes and noncoding regions, which are difficult to characterize and detect, have also played a significant role in MDV virulence evolution. For example, of the genes associated with virulence, only 10% were shared by both virulent Eurasian and North American strains, with the majority of genes either containing no variation (>50%) or harboring variants that were specific to one but not both lineages (10%-20%). This is not surprising given the size and complexity of the MDV genome, and it supports findings from recent studies of virulence evolution in the distantly related dsDNA virus MYXV, where similar patterns of genotypic evolution were observed during the processes of attenuation and virulence adaptation (Kerr et al., 2012(Kerr et al., , 2017. This contrasts with smaller RNA viruses such as HIV-1 and influenza (Baigent & McCauley, 2003;Kimata, Kuller, Anderson, Dailey, & Overbaugh, 1999), where the limited repertoire of genes is thought to restrict the mutational landscape and therefore the available routes to adaptive evolution.
The root of the MDV phylogeny is predicted to precede the first records of increased MDV infection severity by only a decade or so, indicating that the diversity of MDV sampled globally since the 1960s has a recent evolutionary past. MDV was first described by Jozsef Marek in Hungary, 1907 (Marek, 1907), but the wider history of MDV in Eurasia or North America is not clearly understood. It is possible that the virus and the chicken host have undergone a long coevolutionary history (Weiss & Biggs, 1972), in which case our analysis indicates expansion of a relatively narrow subset of MDV in the 20th century.
This is perhaps not surprising in the context of increasingly uniform selection dynamics: For example, most production broiler lines have significantly introgressed B21 (MHC-I) alleles. Further whole-genome sampling from additional geographical sources and historical periods, particularly prior to the 1960s, is required to strengthen support for the inferred phylogeography and patterns of historical MDV virulence presented here. It may also be useful in future to sample feral junglefowl and related species, which although less commonly infected (Nair, 2005), will help to characterize the wider genetic diversity of MDV, particularly because such wild bird populations have not been exposed to the extreme and artificial selective pressures found on poultry farms.
An interesting aspect of our genome-scale analysis is that MDV virulence appears to have evolved independently in Eurasia and North America, respectively. Virulent strains clustered together in derived positions within well-supported and geographically coherent clades.
The ancestors of these clades are predicted to have emerged in concert with or soon after the introduction of vaccines to control clinical disease induced by MDV, which had been increasing in frequency and Based on greated-than-average number of point mutations found in the reconstructed ancestors of both EUA and NA (compared to the root of the tree). severity in the late 1950s and 1960s (Witter, 1997). In this context, it is of interest to note that two different routes to vaccine development were followed in Europe and in North America. HPRS-16, originally isolated and characterized in the UK, is an MDV strain that exhibited lower virulence and was modified by serial passage to a homotypic modifiedlive virus vaccine (Churchill, Payne, & Chubb, 1969). On the other hand, HVT was developed as a heterotypic vaccine in the USA (Okazaki, Purchase, & Burmester, 1970). Both vaccines were developed simultaneously for use in Europe and North America, respectively, very shortly after the agent responsible for disease was first identified (Churchill & Biggs, 1967;Purchase & Biggs, 1967). Whereas in the USA, bivalent vaccines were subsequently adopted in the early 1980s in the wake of HVT vaccine failures, in Europe a vaccine derived from a naturally mild isolate of MDV (Rispens/CVI988) was used and later disseminated globally in the 1990s. Given the different approaches of vaccinal control of MDV in different regions of the world, it is not entirely surprising that two clearly distinguishable routes to vaccine resistance may have evolved in each continent over the past five decades.