Molecular characterization of type 1 porcine reproductive and respiratory syndrome viruses (PRRSV) isolated in the Netherlands from 2014 to 2016

Porcine reproductive and respiratory syndrome virus (PRRSV) is the causative agent of a devastating pig disease present all over the world. The remarkable genetic variation of PRRSV, makes epidemiological and molecular analysis of circulating viruses highly important to review current diagnostic tools and vaccine efficacy. Monitoring PRRS viruses supports modern herd management by explaining the source of found viruses, either internally or externally from the herd. No epidemiological or molecular study has been published on circulating PRRS-viruses in the Netherlands, since the early nineties. Therefore, the objective of this study is to investigate circulating PRRS-viruses in the Netherlands in 2014, 2015 and 2016 on a molecular level by sequencing ORF2, ORF3, ORF4, ORF5, ORF6 and ORF7. The results demonstrate that the 74 PRRSV strains belong to PRRSV-1, but the diversity among strains is high, based on nucleotide identity, individual ORF length and phylogenetic trees of individual ORFs. Furthermore, the data presented here show that the phylogenetic topology of some viruses is ORF dependent and suggests recombination. The identity of the strain of interest might be misinterpreted and wrong conclusions may be drawn in a diagnostic and epidemiological perspective, when only ORF5 is analyzed, as performed in many routine sequencing procedures.


Introduction
Porcine reproductive and respiratory syndrome (PRRS) is the most significant swine disease worldwide since its appearance in the eighties, and is endemic in many countries [1], including the Netherlands [2,3]. The disease is characterized by abortions and weak born piglets, increased mortality in suckling and weaned piglets, and respiratory disease in weaners and finishers [4,5]. This disease is caused by the PRRS-virus (PRRSV) and this virus also aggravates infections like Influenza A, Streptococcus suis and porcine respiratory coronavirus infections [1]. PRRSV belongs to the Arteriviridae family within the order Nidovirales and contains a single-stranded positive sense RNA genome ranging from 14.9 to 15.5 Kb in length [6,7]. PRRSV is divided into two genotypes, PRRSV-1, and PRRSV-2 [8]. Both types share only about 50-a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 60% and there is considerable sequence variability within each type [9][10][11][12]. Both subtypes now have a worldwide distribution [10,[13][14][15][16].
The genome encodes at least 10 open reading frames (ORFs) [17]. ORF1a and ORF1b encode non-structural polyproteins with replicase and polymerase activities. Downstream, ORF2, ORF3 and ORF4 encode for the minor structural glycoproteins GP2, GP3 and GP4, respectively and together these proteins form a trimeric complex that is heavily N-glycosylated and functions in viral entry [17]. In addition to GP2, ORF2 encodes an alternative reading frame (ORF2b) for a small unglycosylated envelope protein (E) [18]. Further downstream, ORF5, ORF6 and ORF7 encode for the major structural proteins GP5, matrix (M) and nucleocapsid (N), respectively [17]. Besides the N-glycosylated GP5, that is involved in cell attachment, another small unglycosylated protein that is required for virus viability is translated from an alternative reading frame of ORF5, designated ORF5a [19].
Monitoring PRRS viruses supports modern herd management by explaining the source of found viruses, either internally or externally from the herd. The remarkable genetic variation of PRRSV [20][21][22], makes epidemiological and molecular analysis of this virus of high importance to monitor changes of these circulating viruses. Consequently, it is also important to review current diagnostic tools since the variability of PRRSV has impact on the sensitivity and specificity of used ELISAs and PCR tests and may affect vaccine efficacy. Of the Dutch pigs, approximately 95% of the sows and 30% of the piglets is vaccinated with either modified life vaccine (MLV) Porcilis PRRS (MSD), Reprocyc/PRRSFLEX EU (Boehringer Ingelheim), Unistrain (Hipra) or Suvaxyn MLV PRRS (Zoetis). In addition, on a few sow farms the inactivated vaccine Progressis (CEVA) is used.
Since the first description of PRRSV in the Netherlands in 1991 [23] no comprehensive epidemiological or molecular study has been published on circulating PRRS-viruses in the Netherlands except two posters presented at the International Pig Veterinary Society congress in 2008 [24] and 2010 [25]. These studies showed a phylogenetic analysis of ORF5 sequences of Dutch strains from 2004-2009 and concluded that the average sequence similarity with Lelystad virus (<90%) is decreasing over the years. Therefore, the objective of this study is to investigate circulating PRRS-viruses in the Netherlands in 2014, 2015 and 2016 on an extensive molecular level by sequencing ORF2, ORF3, ORF4, ORF5, ORF6 and ORF7. Subsequently, these sequences are compared to PRRSV sequences originating from other European countries and available in GenBank in order to investigate genetic relatedness.

Samples
Seventy-four serum samples containing genetic material of PRRS type 1 viruses confirmed by PRRSV genotype specific real-time TaqMan PCR targeting the ORF7 region of the PRRSV genome [26] were selected for further molecular characterization. The samples originated from fifty-four Dutch pig farms, predominantly located in the eastern part of the Netherlands and were collected from and including the years 2014 to 2016.

Sequencing PRRS viruses
Viral RNA was extracted from serum samples using the MagMAX pathogen DNA/RNA isolation kit as deposited in protocols.io [27] in combination with the semi-automated Mag-MAX TM Express-96 Deep Well Magnetic Particle Processor (Thermofisher Scientific) according to the manufacturer's protocol. First-strand cDNA synthesis was done with the SuperScript1 III kit (Invitrogen) using the 3'-end poly(dT) reverse transcription (RT)-primer (S1 Table) as described in protocols.io [28,29]. Subsequently, a long range amplification PCR of ORF2-ORF7 was performed with the AccuPrimeTM Taq DNA Polymerase High Fidelity mix (Invitrogen) using a forward and reverse primer (S1 Table) as deposited in protocols.io [30]. Samples without visible product after PCR were re-tested using a second, and when necessary, a third forward primer. Subsequently, amplicons were sent to BaseClear (Leiden, the Netherlands) for purification and Sanger sequence analysis. PCR amplification primers and Sanger sequence primers are presented in S1 Table.

Sequence analysis
For each virus, sequences were edited (trimming of the sequence of the primer binding sites) and assembled using Lasergene Seqman Pro version 15 (DNAstar inc. Madison, Wisconsin USA). The 74 ORF2-ORF7 sequences were deposited in GenBank and accession numbers are presented in Table 1. Dutch sequences were aligned with sequences obtained from GenBank and phylogenetic trees were constructed using MEGA6 software by Neighbor-Joining method (1000 replicates for bootstrap). The evolutionary distances were computed by using the Maximum Likelihood method based on the Tamura-Nei model [31]. The trees were drawn to scale, with branch lengths measured in the number of substitutions per site.

Genetic variation
The large genetic variation of the Dutch PRRSV strains was studied using a percent identity matrix by determining the pairwise nucleotide identities (including gaps) between different ORFs of different isolates using the online Clustal Omega tool [32]. Because the structure of the viral proteins may affect the infectivity, pathogenicity and viral persistence of the virus [33,34], individual ORF lengths were determined and putative Nlinked glycosylation sites were estimated. Nucleotide sequences were translated using EditSeq version 15 (DNAstar inc. Madison, Wisconsin USA) and putative N-linked glycosylation sites were predicted by means of the NetNGlyc1.0 Server [35]. Furthermore, signal peptides were determined with SignalP 4.0 [36]. The PRRSV-1 strain Lelystad virus (M96262) was used as reference strain to compare similarity scores, ORF lengths and glycosylation sites.
The distribution of the frequency of the similarities with Lelystad virus in percentages is presented for the complete ORF2-ORF7 sequence and for the individual ORF genes (S1 appendix). ORF6 and ORF7 have the highest nucleotide identity with Lelystad virus. Although lower similarities can be seen in ORF5, the lowest average similarity with Lelystad virus was observed in ORF3, since five viruses had lower nucleotide identities than 82% (S1 appendix and S2 Table).
While the length of ORF2, ORF5, ORF6 and ORF7 seems conserved compared to the Lelystad reference strain (GenBank: M96262), the length of ORF3 and ORF4 are quite diverse among the isolates (Fig 2 and S3 Table). For ORF3, no less than eight variants in length were observed and five lengths were found more than once (Fig 2). All deletions found in ORF3 are positioned at the 3'-end of this gene and affect automatically ORF4, since the open reading frames overlap. Two viruses (NL/GD-2-8/2014 and NL/GD-3-7/2014) with a short ORF3 length of 732 nucleotides are not caused by deletions, but due to an alternative stop codon. The prediction of the amount of putative N-glycosylation sites for ORF3 varied from 4-7 (S4 Table). For ORF4, three variants in length in the Dutch strains (Fig 2) were observed, and whereas most Dutch viruses contain four N-glycosylation sites, two viruses have 5 putative   sites (S4 Table). The one ORF6 sequence (519 nucleotides) with a codon missing (Fig 2), seemed to be unique among the known ORF6 sequences in GenBank. Besides analysing the complete ORF2-ORF7 sequences of 74 Dutch PRRS viruses, the individual ORFs were also compared with sequences in GenBank in phylogenetic trees (Fig 3). For some viruses the phylogenetic topology varied per ORF. For instance, NL/GD-4-13/2015 was placed with viruses NL/GD-3-19/2014 and NL/GD-6-7/2016 in ORF5 and ORF6, whereas in ORF2, ORF3, ORF4 and ORF7 no clustering can be seen (Fig 3, highlighted in green). Another example is the clustering of NL/GD-3-11/2015 with NL/GD-2-9/2014 and NL/GD-3-7/2014 (Belgium-likes) in ORF2 and ORF3, whereas in ORF4, ORF5, ORF6 and ORF7 NL/GD-3-11/ 2015 seems to be a totally different virus (Fig 3, highlighted in purple). The most striking example is strain NL/GD-5-18/2015 that clustered with Austrian AUT/15-33/2015 (GenBank KU494019) in the phylogenetic trees constructed from complete ORF2, ORF3, ORF4 and ORF5 sequences (86.3-89.5% similarity), whereas in the ORF6 and ORF7 trees it clustered (96.2-96.9% identity) with Lelystad virus (Fig 3, highlighted in red; S2 Table).

Discussion
In regional elimination and national eradication efforts PRRSV genotyping is one of the key tools to assess the performance of action to help improve internal and external biosecurity measures and to better understand the virus ecology. This manuscript describes the first extensive molecular study about Dutch PRRS viruses since its first description in the early nineties. The results demonstrate that all 74 PRRSV strains belong to PRRSV-1 (Fig 1), but the diversity among strains is high, based on the nucleotide identity (S1 appendix and S2 Table), individual ORF length (Fig 2 and S3 Table) and phylogenetic trees of individual ORFs (Fig 3). Furthermore, the most investigated viruses in this study form a distinct Dutch cluster based on ORF2-ORF7 sequences (Fig 1). The Netherlands is exporting approximately 10 million and importing half a million live pigs per year [37] and is endemic for PRRSV, like many other countries in Europe. Therefore, it is expected that with the animals, PRRS virus strains spread over these countries. Sampling bias is a probable clarification for this as many countries do not sequence and/or deposit their sequences in GenBank, resulting in incomplete databases for ORF2-ORF7. With an economic high impact disease like PRRS across country borders, monitoring of circulating viruses is an important tool that would greatly benefit from more frequent sharing of sequence data by different European countries, including the Netherlands.
The high genetic diversity of the presented viruses suggests that PRRSV is constantly evolving to adapt to naturally, as well as vaccine induced immunity as also previously described [38]. Despite vaccination of the majority of the pig population in the Netherlands, still many outbreaks are reported, and in literature it is already suggested to improve vaccines to overcome the disadvantages of current vaccines [39].
The nucleotide identity of individual ORFs with Lelystad virus (S1 appendix) confirms that ORF6 and ORF7 are the most conserved genes and justifies these targets in the diagnostic PRRS detection PCR [40]. However, a primer based approach is biased in advance. This was demonstrated by the fact that some of the viruses selected for ORF2-ORF7 sequencing based on strong positive results in the detection PCR, were not picked-up by the primers designed to obtain the ORF2-ORF7 amplicons. Even after designing two additional forward primers, not for all PRRSV isolates ORF2-ORF7 amplicons could be generated. Therefore, it is likely that the presented diversity of the viruses is an underestimation of the true diversity of PRRS viruses. As a result, as long as primer-based approaches are used for PRRS detection and sequencing, we must be aware that there will be viruses missed in routine testing.
Currently, PRRSV genotyping is being performed mostly based on ORF5 and/or ORF7 sequence analysis [41][42][43][44]. The data presented here shows that the phylogenetic topology of some viruses is ORF dependent, as also previously described [44]. For example, based on ORF5 alone, NL/GD-5-18/2015 would be categorized as a field virus (Fig 3D), whereas based on ORF6 or ORF7 the collected virus would be designated as a vaccine virus derivative (Fig 3E  and 3F). A possible explanation for this ORF dependent clustering in phylogenetic trees is recombination [17]. PRRSV strains can recombine if coinfection of a cell with two or more strains occurs. So, the identity of a given PRRSV strain is largely incomplete if only ORF5 or ORF7 are known. Although whole genome sequencing of the PRRS viruses may give a more complete picture in support of the molecular epidemiology, sequencing ORF2-ORF7 is already an enormous improvement for PRRSV routine sequencing. Based on one ORF alone, the identity of the strain of interest might be misinterpreted and wrong conclusions may be drawn in a diagnostic and epidemiological perspective, as also recently observed [45].
Regarding PRRSV genome diversity studies, ORF5 is the most examined gene and one of the most variable regions of the genome [41,44,46]. Also in the present study, a high genetic diversity, when comparing the 74 Dutch sequences with Lelystad virus, was seen for ORF5 (S1 appendix and S2 Table). Although this was not due to its variation in length (Fig 2) or the amount of N-glycosylation sites (three putative sites for 69 of the 74 viruses; S4 Table).
Of the genome region that encodes for one of the structural proteins, ORF3 is the most variable part. The 3'-end of this region is prone for deletions as previously described [43,44] and the deletions found in the present study always affected the open reading frame of ORF4. Not only the variation in nucleotide identity compared to the Lelystad strain (S1 appendix) was high, but the viruses studied varied also in their ORF3 length (Fig 2). In addition, another eight variants in length were observed at the other ORF3 (non-Dutch) sequences from Gen-Bank used in Fig 1: 744, 750, 762, 765, 774, 780, 786 and 792 nucleotides, respectively. Although it is suggested that viruses with ORF3 deletions will be outcompeting nondeleted viruses in the field [43], our results show that besides shorter ORF3 genes, also longer ORF3 genes compared to Lelystad virus were observed (Fig 2). For ORF4, three variants in the Dutch strains (Fig 2) were observed, whereas the used GenBank sequences in Fig 1 contain another seven variants in length: 516, 519, 528, 534, 540, 546 and 555 nucleotides, respectively. The variation in length of ORF4 can be explained by the fact that the 5'-end of ORF4 and the 3'-end of ORF3 coding regions overlap and hence any deletion found in ORF3 will also affect ORF4 [47].
Future research will be focusing on PRRSV-1 recombination and its implication on genetic analysis. It is of upmost importance that the variation of circulating PRRSV is being monitored and genetically analyzed, in order to improve diagnostics, increase vaccine efficacy and identification of found virus strains to support internal and external biosecurity measures on local farms.
Supporting information S1 Appendix. Genetic similarity (%) with Lelystad virus (LV). Comparison is based on ORF2-ORF7 nucleotide sequences and the individual ORFs sequences of 74 Dutch isolates