Genetic Variability and Recombination of the NSP2 Gene of PRRSV-2 Strains in China from 1996 to 2021

Simple Summary
Genetic variability and recombination of the NSP2 gene are of great significance for gaining an in-depth understanding of the prevalence of PRRSV in China over the past 25 years. We compared the nucleotide and amino acid homologies of the NSP2 sequences of different PRRSV-2 lineages, and examined phylogenetic relationships based on an analysis of the NSP2 sequences of 122 strains. In addition, recombination analysis revealed five recombination events among the 135 selected PRRSV-2 strains. These results provide a theoretical foundation for studying the evolution and epidemiology of PRRSV.

Abstract
Porcine reproductive and respiratory syndrome (PRRS) is one of the most serious infectious diseases detrimentally affecting the pig industry worldwide. The disease, which is typically difficult to control, is an immunosuppressive disease caused by the porcine reproductive and respiratory syndrome virus (PRRSV), the genome of which (notably the NSP2 gene) undergoes rapid mutation. In this study, we sought to determine the genetic variation in the PRRSV-2 NSP2 gene in China from 1996 to 2021. Strain information was obtained from the GenBank database and analyzed from a molecular epidemiological perspective. We compared the nucleotide and amino acid homologies of the NSP2 sequences of different PRRSV-2 lineages, and examined phylogenetic relationships based on an analysis of the NSP2 sequences of 122 strains. The results revealed that NADC-30-like strains, which are represented by lineage 1, and HP-PRRSV strains, which are represented by lineage 8, were the most prevalent in China from 1996 to 2021. Close similarities were detected in the genetic evolution of lineages 3, 5, and 8. For nucleotide and amino acid sequence comparisons, we selected representative strains from each lineage and detected NSP2 homologies of 72.5–99.8% and 63.9–99.4% at the nucleotide and amino acid levels, respectively, among different PRRSV-2 strains, indicating that the degrees of NSP2 nucleotide and amino acid variation differ. Based on amino acid sequence comparisons, we identified deletions, insertions, and substitutions at multiple sites among the NSP2 sequences of PRRSV-2 strains. Recombination analysis revealed five recombination events among the 135 selected PRRSV-2 strains and indicated a high probability of recombination among lineage 1 strains. The findings of this study provide an in-depth understanding of the prevalence of PRRSV in China over the past 25 years and contribute a theoretical basis for studying the evolution and epidemiology of PRRSV.


Introduction
Porcine reproductive and respiratory syndrome (PRRS) is a highly prevalent infectious disease caused by the porcine reproductive and respiratory syndrome virus (PRRSV). PRRSV infection causes immunosuppression; reproductive disorders in pregnant sows, including miscarriage, premature birth, and mummified fetuses; and respiratory diseases in piglets, thereby having particularly detrimental impacts on the pig industry worldwide. PRRS was first discovered in North Carolina in 1987, and the LV and VR2332 strains were subsequently isolated from infected pigs in Europe and America in 1991 and 1992, respectively [1,2]. At present, PRRSV-1 and PRRSV-2 are classified as the species Betaarterivirus suid 1 and Betaarterivirus suid 2, respectively. PRRSV-2 strains are mainly prevalent in China, where PRRSV was isolated for the first time in 1996 [3]. In 2006, a highly pathogenic PRRSV (HP-PRRSV) was detected, which was associated with a high fatality rate among piglets [4,5]. More recently, in 2012, a new strain type, NADC30, appeared in China [6]. PRRSV continues to recombine and mutate, which presents considerable challenges for the prevention and control of PRRS [7].
PRRSV, a single-stranded positive-sense RNA virus, is a member of the genus Betaarterivirus in the family Arteriviridae of the order Nidovirales. It comprises a 15 kb genome containing 11 open reading frames (ORFs), namely, ORF1a, ORF1b, ORF2a, ORF2b, ORF3-7, ORF5a, and ORF1aTF, the last of which overlaps the non-structural protein (NSP) 2-encoding region of ORF1a [8]. Among these, ORF1a and ORF1aTF encode at least 12 NSPs, including NSP1α, NSP1β, NSP2-related proteins (NSP2N, NSP2TF, and NSP2), NSP3, NSP4, NSP5, NSP6, NSP7α, NSP7β, and NSP8 [9,10]; NSP2, at 2.9 kb in length, is the longest. NSP2 and ORF5 are highly variable, and ORF5 is associated with the neutralizing epitope. Both are commonly used as target genes for PRRSV molecular epidemiological surveillance [11]. The MN184 strain, isolated in the United States in 2001, was found to have a discontinuous deletion of 131 amino acids in its NSP2 sequence [12], whereas isolates obtained during the HP-PRRSV outbreak in China in 2006 were found to have a discontinuous deletion of 30 amino acids in the NSP2 sequence [5]. Similarly, the NSP2 sequence of the NADC34-like PRRSV strains recently isolated in China was found to have a continuous deletion of 100 amino acids. Studies have demonstrated that the recombinant strain HLJ/2017/1127a is a result of the recombination of the FZ06A and QYYZ strains. The recombination breakpoint is located at 1892-2730 (1742-2442) [13].
Shi et al. proposed a systematic classification of PRRSV-2 based on the ORF5 gene in 2011, dividing PRRSV-2 into 9 lineages and 37 sublineages [14]. Given the increasing number of PRRSV recombinant strains and the rising frequency of recombination in recent years, the epidemic situation of PRRSV in China has become progressively more complex, indicating that recombination has played an important role in the evolution of the virus [15,16]. Moreover, this continual variation and recombination of the PRRSV genome hampers the measures currently taken to control the incidence and spread of PRRS. Consequently, timely monitoring of the patterns of PRRSV mutation is essential for understanding the evolution and epidemiology of the virus. To this end, in this study, we compared the nucleotide and amino acid homologies of PRRSV-2 NSP2 among strains of different lineages and analyzed their phylogenetic relationships, thereby gaining a better understanding of genetic variation in the NSP2 protein. These findings will contribute to establishing a theoretical basis for assessing future epidemic trends in PRRS and for identifying changes in the NSP2 sequence that these viruses use to evade host immunity, thereby perpetuating the genetic evolution of PRRSV.

Phylogenetic Analysis
Phylogenetic analysis of the NSP2 gene was based on the sequence information of the reference strains shown in Table 1. The sequences were first aligned using the Clustal W method in the MegAlign function of DNAStar software (version 7.0), and a tree was then constructed using the neighbor-joining (NJ) method in MEGA software (version 7.0) with 1000 bootstrap replicates. The sequences were also analyzed using the maximum likelihood (ML) method in PhyloSuite software (version 1.2.2) with 1000 bootstrap replicates.
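To illustrate what a bootstrap replicate means in this context, the following minimal Python sketch resamples alignment columns with replacement, which is the general idea behind bootstrap support values on a tree. It is not the procedure implemented in MEGA or PhyloSuite; the function name `bootstrap_columns` and the toy sequences are hypothetical and are not NSP2 data.

```python
import random


def bootstrap_columns(alignment, rng):
    """Return one bootstrap replicate of an alignment.

    `alignment` maps strain name -> aligned sequence (equal lengths).
    Columns are drawn with replacement, so the replicate has the same
    length as the input but a reshuffled set of sites.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    picks = [rng.randrange(length) for _ in range(length)]
    return {n: "".join(alignment[n][i] for i in picks) for n in names}


# Toy aligned fragments (hypothetical, for illustration only).
toy = {"strainA": "ATGCCGTA", "strainB": "ATGCTGTA", "strainC": "ATGATGTT"}
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_columns(toy, rng) for _ in range(1000)]
```

In a real analysis, a tree would be rebuilt from each replicate, and the support for a branch would be the fraction of replicate trees that contain it.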

Alignment of NSP2 Nucleotide Sequences
To determine similarities among the NSP2 nucleotide sequences of different PRRSV lineages, we analyzed the reference strain sequences listed in Table 1 using the Clustal W method in the MegAlign function of DNAStar software.
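As a rough illustration of how a pairwise homology value and a within-group range can be derived from aligned sequences, consider the Python sketch below. It assumes a simple definition (identical positions divided by compared columns, with a gap opposite a residue counted as a mismatch); MegAlign may use a different formula, so values from this sketch are not expected to reproduce the reported percentages. The function names and toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent of identical positions between two aligned sequences.

    Assumption: columns where both sequences carry a gap are skipped,
    and a gap opposite a residue counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def identity_range(aligned):
    """Lowest and highest pairwise identity within a group of strains."""
    names = sorted(aligned)
    values = [
        percent_identity(aligned[x], aligned[y])
        for i, x in enumerate(names)
        for y in names[i + 1:]
    ]
    return min(values), max(values)


# Toy aligned fragments (hypothetical, not real NSP2 sequences).
toy_group = {"s1": "ATGCAT", "s2": "ATGCAA", "s3": "AT-CAA"}
low, high = identity_range(toy_group)
```

Applying `identity_range` separately to the strains of each lineage would give ranges of the kind reported per lineage.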

Alignment of NSP2 Amino Acid Sequences
Similarities among the NSP2 amino acid sequences of the reference strains were analyzed using the Clustal W method, and multiple sequence alignment was performed using BioEdit software (version 7.2).

Recombination Analysis
Potential recombination events were detected in RDP software (version 4.0) together with GENECONV, BootScan, MaxChi, Chimaera, SiScan, and 3Seq analyses. Events supported by five or more methods with P < 0.05 were regarded as genetic recombination, and the strains thus identified were considered recombinant strains. In addition, we used SimPlot (version 3.5.1) to confirm the detected recombination events.
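The acceptance rule described above (support from five or more methods at P < 0.05) can be sketched in Python as follows. This is only an illustration of the filtering logic applied to a table of per-method P values; it does not call RDP. The `Event` class, the method labels (which assume the RDP method itself is counted as one of seven), and the example P values are all hypothetical.

```python
from dataclasses import dataclass

# Method labels assumed for illustration; a missing or None P value
# means the method reported no signal for that event.
METHODS = ("RDP", "GENECONV", "BootScan", "MaxChi", "Chimaera", "SiScan", "3Seq")


@dataclass
class Event:
    recombinant: str
    p_values: dict  # method name -> P value (or None)


def supporting_methods(event, alpha=0.05):
    """Methods that detect the event with P below the threshold."""
    return [
        m for m in METHODS
        if event.p_values.get(m) is not None and event.p_values[m] < alpha
    ]


def is_accepted(event, min_methods=5, alpha=0.05):
    """Apply the 'five or more methods, P < 0.05' rule."""
    return len(supporting_methods(event, alpha)) >= min_methods


# Hypothetical events, not results from this study.
events = [
    Event("toy_strain_1", {"RDP": 0.001, "GENECONV": 0.01, "BootScan": 0.02,
                           "MaxChi": 0.03, "Chimaera": 0.04, "SiScan": 0.2,
                           "3Seq": None}),
    Event("toy_strain_2", {"RDP": 0.001, "GENECONV": 0.01, "MaxChi": 0.03,
                           "3Seq": 0.04}),
]
accepted = [e.recombinant for e in events if is_accepted(e)]
```

Keeping the threshold and the minimum number of methods as parameters makes it easy to check how sensitive the list of accepted events is to either choice.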

Phylogenetic Analysis
Based on the global PRRSV classification system and NSP2 sequence information in the GenBank database, we selected the NSP2 sequences of 122 PRRSV-2 strains for phylogenetic analysis (Table 1). The phylogenetic tree constructed using these sequences revealed that the PRRSV-2 strains prevalent in China could be classified into four lineages, namely, lineages 1, 3, 5, and 8 (Figures 1 and 2). Among these, lineage 3 is represented by GM2-2011, QYYZ-2011, and FJFS-2012, which appear to be closely related, whereas lineage 1 and lineage 8 strains appear to be separated by comparatively large genetic distances.

Nucleotide Similarity
To further examine the genetic variation in NSP2 that has occurred during the course of PRRSV evolution, we selected 15 strains from the aforementioned four lineages for nucleotide homology analyses, and thereby determined evolutionary relationships among the different lineages at the nucleotide level (Figure 3). We detected nucleotide homologies of between 72.5% and 99.8% among the NSP2 genes of different PRRSV-2 strains, of which the CHsx1401-2014 strain showed the lowest homology, 72.5%, with the QYYZ-2011 and GM2-2011 strains. In contrast, the nucleotide sequence of the BJ-4-1996 strain was highly similar to that of the RespPRRS MLV-1994 strain, with a homology of 99.8%. Within each of the four assessed lineages, we obtained homology values of 86.0% to 91.2% among lineage 1 strains, 92.4% to 98.8% among lineage 3 strains, 99.4% to 99.8% among lineage 5 strains, and 92.3% to 99.6% among lineage 8 strains, of which the homologies of classic-type CH-1a-like strains were between 92.3% and 99.3% and those of HP-PRRSV-like strains were between 99.4% and 99.6%. The largest differences in NSP2 sequences were detected in lineage 1 (NADC-30-like) strains, and we speculate that this reflects extensive mutation and recombination among these strains. Lineages 3 and 5 were found to be highly similar at the nucleotide level, and the clinical detection of these two lineages in China is generally very low.

Amino Acid Sequence Similarity
As in our analysis of nucleotide similarities, we also examined the genetic variation of the PRRSV NSP2 protein at the amino acid level among 15 representative strains of the four lineages, in order to gain an understanding of the evolutionary relationships among the amino acid sequences in each lineage (Figure 4). The results revealed amino acid homologies of between 63.9% and 99.4% among the different PRRSV-2 strains, of which that between CHsx1401-2014 and GM2-2011, at 63.9%, was the lowest. The highest similarities were detected between the TJ-2006 strain and strains HUN4-2006 and JXA1-2006, with homologies reaching 99.4%, which was inconsistent with the differences among these strains detected at the nucleotide level. In this regard, it is conceivable that amino acid substitution between lineage 1 and 5 strains is more common. Within each lineage, we detected amino acid homologies of 80.8% to 87.8%, 88.8% to 98.3%, 98.7% to 99.3%, and 87.9% to 99.4% among lineage 1, 3, 5, and 8 strains, respectively. Of these, homologies of between 89.6% and 98.7%, and between 99.2% and 99.4%, were obtained for the classical-type CH-1a-like and HP-PRRSV-like strains, respectively. The largest differences in NSP2 sequences were identified in lineage 1 (NADC-30-like) strains, which is consistent with the nucleotide alignment results.

Amino Acid Sequence Alignment
The NSP2 nucleotide sequences of the 15 selected strains were initially translated into amino acid sequences, for which we subsequently performed a multiple sequence comparative analysis (Figure 5). We detected differences in the lengths of PRRSV-2 NSP2 amino acid sequences, characterized by significant variability in the sequences encoding NSP2, including deletions, insertions, and substitutions at different amino acid sites. As previously stated, lineage 1 (NADC-30-like) and lineage 8 (HP-PRRSV-like) strains are currently the most prevalent in China and have long been detected in the country, during which time they have caused substantial economic losses in the pig industry. Based on our brief analysis of amino acid mutations in the highly pathogenic lineage 8 strains, we detected a discontinuous deletion of 30 amino acids, at residue 481 and residues 533 to 561, in the HP-PRRSV representative strains HUN4-2006, JXA1-2006, and TJ-2006, which is consistent with the findings of a previous study [4]. Moreover, we identified high amino acid homologies of up to 99.4% among these strains, with no more than 10 amino acid differences. In the TJ-2006 strain, these changes are located at positions 8, 194, 213, 253, 296, 488, 504, 544, 745, and 796, at which we detected proline, asparagine, histidine, glutamic acid, leucine, valine, serine, alanine, and asparagine residues, respectively. In JXA1-2006, changes were identified at positions 8, 253, 296, 488, 745, and 796, at which we detected the substitution of threonine, valine, phenylalanine, valine, and glycine, respectively, whereas in the HUN4-2006 strain, substitutions of threonine, tyrosine, phenylalanine, methionine, and proline were detected at positions 8, 194, 213, 488, 504, and 544, respectively.
Furthermore, in representative strains of lineage 1, we detected different degrees of amino acid deletion at positions 481, 320-323, 325-345, 347-381, 383-429, 431-434, and 502-520, whereas in representative lineage 3 strains, in addition to the deletion mutations, two amino acid deletions were found at positions 300 and 301, and a continuous insertion of 36 amino acids was found at positions 817-852.
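Deletions and substitutions of the kind listed above can be read off an aligned pair of sequences programmatically. The Python sketch below is a minimal illustration, assuming that gaps are written as `-` and that positions are alignment columns counted from 1 (which need not match the residue numbering used in the text). The function names and the short toy peptides are hypothetical.

```python
def gap_runs(aligned_seq, gap="-"):
    """Return (start, end) pairs, 1-based and inclusive, for each gap run."""
    runs = []
    start = None
    for pos, ch in enumerate(aligned_seq, start=1):
        if ch == gap:
            if start is None:
                start = pos
        elif start is not None:
            runs.append((start, pos - 1))
            start = None
    if start is not None:  # gap run that reaches the end of the sequence
        runs.append((start, len(aligned_seq)))
    return runs


def substitutions(reference, query, gap="-"):
    """Return (position, reference_residue, query_residue) for mismatches.

    Columns with a gap in either sequence are ignored here, since they
    are reported separately as deletions or insertions.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, r, q)
        for pos, (r, q) in enumerate(zip(reference, query), start=1)
        if r != gap and q != gap and r != q
    ]


# Toy aligned peptides (hypothetical, not NSP2 residues).
reference = "MSGTLKRAVP"
query = "MTG-LKR--P"
deleted = gap_runs(query)                            # deletions in the query
deleted_total = sum(end - start + 1 for start, end in deleted)
changed = substitutions(reference, query)
```

Gap runs found in the reference rather than the query would correspond to insertions in the query, so the same helper covers both cases.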

Recombinant Analysis
To gain a more complete picture of the recombination of PRRSV-2, we added some classic strains from the United States (Table 2). Recombination analysis of the NSP2 gene revealed seven potential recombination events by RDP4 (Table 3), among which five were verified by SimPlot. According to the RDP4 results, the credibility of all seven recombination events is high, with a statistical significance of P < 0.05. Events 2 and 3 were detected in lineage 1 strains and involved recombination between lineage 1 and lineage 8 sequences (Figure 6).
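The idea behind a similarity plot used to verify a recombination event can be sketched in Python as a sliding-window comparison of a query against each putative parent. This is a simplified illustration using plain percent identity; SimPlot itself offers distance-corrected models and its own window settings, so this sketch is not a reimplementation. The function name, window sizes, and toy sequences are hypothetical.

```python
def window_identity(query, reference, window=200, step=20, gap="-"):
    """Percent identity between query and reference in sliding windows.

    Returns a list of (window_midpoint, percent_identity). Columns with a
    gap in either sequence are skipped (an assumption of this sketch).
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    points = []
    for start in range(0, len(query) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(query[start:start + window],
                            reference[start:start + window])
            if a != gap and b != gap
        ]
        if not pairs:
            continue
        identity = 100.0 * sum(a == b for a, b in pairs) / len(pairs)
        points.append((start + window // 2, identity))
    return points


# Toy example: the query matches parent_a in its first half and parent_b
# in its second half (hypothetical sequences).
parent_a = "A" * 40
parent_b = "C" * 40
toy_query = "A" * 20 + "C" * 20
vs_a = window_identity(toy_query, parent_a, window=10, step=10)
vs_b = window_identity(toy_query, parent_b, window=10, step=10)
```

A switch in which parent is most similar to the query along the alignment, as between the second and third windows in the toy example, is the pattern that suggests a recombination breakpoint.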

Discussion
Our analysis of PRRSV evolution over a period of approximately 25 years revealed apparent changes in the highly variable region of the NSP2 protein. To gain comprehensive insight into the genetic evolution of NSP2, we performed phylogenetic analyses based on the NSP2 sequences of 122 selected PRRSV-2 strains, the results of which revealed a close evolutionary distance between lineage 3 and 5 strains. Lineage 3 comprises variants that have emerged since 2010, the transmission of which has been recorded primarily in southern China (Jiangxi, Fujian, Guangdong, and Guangxi provinces) with a clinical detection rate of less than 10% [17,18]. In contrast, although lineage 5 (BJ-4-like/VR2332-like) strains appeared as early as 1996, they do not appear to be prevalent in China and have a low clinical detection rate [19]. In China, lineage 8 (HP-PRRSV-like) and lineage 1 (NADC30-like) strains have become the predominant strains, which is consistent with previous reports. This predominance may be attributable to the high genetic variation and recombination that characterize these strains, features that facilitate evasion of the immune surveillance induced by existing vaccines. Zhao et al. [20] reported that NADC30-like PRRSV has gradually become the most prevalent genotype in Sichuan Province. Yu et al. [21] showed that NADC30-like PRRSVs are undergoing a decrease in population genetic diversity. Zhou et al. [22] found that NADC30 and HP-PRRSV strains mainly circulated in southwest China.
For each of the four assessed lineages, we selected different representative strains for comparisons of nucleotide and amino acid sequences, which revealed nucleotide and amino acid homologies of 72.5% to 99.8% and 63.9% to 99.4%, respectively, among the NSP2 sequences of different PRRSV-2 strains. We speculate that these strains have undergone relatively limited mutation during the course of genetic evolution, and accordingly have been unable to effectively evade vaccine-mediated immunity and host immune surveillance, although further studies are necessary to ascertain specific details in this regard. Currently, lineage 1 and 8 strains are the prevalent epidemic strains in China, among which the NADC30-like strains appear to be characterized by a high frequency of recombination events, thereby enabling these viruses to adapt to changing environmental pressures during long-term evolution [6]. Further, we speculate that the NSP2 sequence has also undergone a corresponding series of changes as a consequence of mutation and recombination of the different lineage 1 strains. Moreover, we detected high amino acid homologies between lineage 3 and 5 strains, which is consistent with the patterns observed in the nucleotide comparison.
Compared with the classical PRRSV strains, HP-PRRSV strains are characterized by a discontinuous deletion of (29 + 1) amino acids, whereas for NADC30-like strains there is evidence of a discontinuous deletion of 131 amino acids. Lineage 3 strains of PRRSV-2, specifically GM2-2011, QYYZ-2011, and FJFS-2012, exhibit 37 amino acid insertions at positions 817-853 of NSP2. Comparison of amino acid sequences among different PRRSV-2 strains reveals variations in deletions, insertions, and substitutions at multiple sites, which suggest differences in antigenicity.
Several reports have described recombination within NSP2 [13,23]. Taking NADC30-like strains as an example, new deletions continue to occur in these strains, which makes recombination easier. Recombination analysis performed for PRRSV-2 NSP2 revealed a total of five recombination events among the 135 selected strains, a majority of which appear to have occurred between lineage 1 and 8 strains, possibly because there are many types of PRRSV vaccines in the Chinese market (inactivated vaccines: KV; attenuated vaccines: Ingelvac PRRS MLV, MLV CH-1R, MLV R98, MLV JXA1-R, MLV TJM-F92, MLV HuN4-F112, MLV GDr180), and the misuse of vaccines is also a main reason why PRRSV is prone to recombination. There is no solid evidence that changes in the NSP2 gene affect the recombination of PRRSV-2. The effects on recombination could perhaps be observed further through reverse genetics, mutation, or deletion of NSP2; so far, no such result has been reported, and we will conduct such research in the future. Fang et al. found a high frequency of interlineage recombination hot spots in NSP9, so we hypothesize that changes in the NSP2 gene can affect PRRSV recombination. With respect to clinical severity, variation in the NSP2 sequence was not related to viral virulence. Studies have shown that NSP9 and NSP10 contribute to the fatal virulence of the HP-PRRSV that emerged in China, and that two residues in NSP9 contribute to the enhanced replication and pathogenicity of HP-PRRSV [24]. Although HP-PRRSV-derived vaccines have contributed to reducing the severity of clinical signs and restricting the spread of PRRS to a certain extent, their efficacy still falls far short of that initially anticipated. Moreover, the frequent use of vaccines can have a number of undesirable side effects, including reversion to virulence of live attenuated vaccines [25,26], viral recombination, and a significant increase in the rate of mutation [27–29].
In contrast to the study of Jiang [13], our study focuses on the PRRSV-2 NSP2 gene, on the basis of which we divided the strains into four lineages. We conducted nucleotide and amino acid similarity analyses, as well as sequence alignment and recombination analysis, on NSP2. Although we now have a better understanding of the clinical pathogenesis, molecular epidemiology, viral proliferation, immune response, and immune evasion mechanisms of PRRS, PRRSV is still not effectively controlled. In the past, investigators tended to focus on amino acid mutations in the hypervariable regions (NSP2, ORF5) of the virus genome; however, in recent years, several strain recombination events have been discovered, and the frequency of recombination appears to be rising. In this regard, Zhao et al. [30] have confirmed that the highly pathogenic JL580 strain has a mixed genetic background of NADC30-like, HP-PRRSV, and local strains, and Chen et al. [31] concluded that HBap4-2018 is a new PRRSV strain with HP-PRRSV and NADC30-like PRRSV as primary and secondary parental strains, respectively, which retains most of the virulence-related regions of the HP-PRRSV genome and has been shown to be highly pathogenic to piglets. Moreover, Li et al. [32] have established that the HNyc15 strain is derived from the recombination of VR2332 and CH-1a gene fragments. The increase in recombinant strains and the high rates of recombination in recent years indicate that recombination has played a very important role in the evolution of the virus. The PRRSV NSP2 gene is prone to deletion, insertion, and recombination. Therefore, we speculate that changes in the NSP2 gene may affect the evolution of PRRSV.
From the perspective of the prevention and control of PRRSV, it is accordingly imperative to ascertain the epidemiological characteristics of PRRSV, establish the molecular mechanisms underlying the high variability and recombinant frequency, and elucidate determinants of the variable virulence observed among different PRRSV strains.

Conclusions
From 1996 to 2021, PRRSV-2 isolates in China were categorized into lineages 1, 3, 5, and 8 based on variations in their NSP2 genes. Lineages 1 and 8 were found to be predominant among the PRRSV strains prevalent in China. Additionally, deletions in the NSP2 sequence prior to 2006 were identified as a key factor driving the evolution of PRRSV. In recent years, recombination has played a significant role in the evolution of PRRSV-2. Therefore, monitoring the NSP2 gene of PRRSV-2 can serve as a reference for effectively controlling PRRS in China in the future.