Vol. 55 No. 4/2008, 731–739 Regular paper on-line at: www.actabp.pl Complete nucleotide sequence of a Polish strain of Peanut stunt virus

Peanut stunt virus (PSV) is a common legume pathogen present worldwide. It is also infectious for many other plants including peanut and some vegetables. Viruses of this species are classified at present into three subgroups based on their serology and nucleotide homology. Some of them may also carry an additional subviral element - satellite RNA. Analysis of the full genome sequence of a Polish strain - PSV-P - associated with satRNA was performed and showed that it may be classified as a derivative of the subgroup I sharing 83.9-87.9% nucleotide homology with other members of this subgroup. A comparative study of sequenced PSV strains indicates that PSV-P shows the highest identity level with PSV-ER or PSV-J depending on the region used for analysis. Phylogenetic analyses, on the other hand, have revealed that PSV-P is related to representatives of the subgroup I to the same degree, with the exception of the coat protein coding sequence where PSV-P is clustered together with PSV-ER.


INTRODUCTION
Peanut stunt virus (PSV) is a very serious pathogen, mainly of legume plants, infecting peanut (Arachis hypogaea L.), bean (Phaseolus vulgaris L.), pea (Pisum sativum L.), yellow lupine (Lupinus luteus L.), alfalfa (Medicago sativa L.), celery (Apium graveolens L.), black locust trees (Robinia pseudoacacia L.), etc. Together with two other species - the type member Cucumber mosaic virus (CMV) and Tomato aspermy virus (TAV) - it belongs to the Cucumovirus genus in the family Bromoviridae. Viruses from this group are characterized by a tripartite positive-sense single-stranded genome comprising RNA1, RNA2 and RNA3, named in order of decreasing size. In addition to the genomic RNAs, cucumoviruses carry two subgenomic RNAs. Some strains may also contain an additional subviral component - satellite RNA (satRNA) - that may be packaged along with the genomic or subgenomic viral RNA strands (Rossinck et al., 1992).
The genome of cucumoviruses carries five open reading frames (ORFs). ORF1 and ORF2 encode the viral replicase complex consisting of two proteins, 1a and 2a, synthesized from RNA1 and RNA2, respectively. The 1a protein has a putative helicase motif in its C-terminus and, in its N-terminus, a methyltransferase domain that probably functions in capping (Rozanov et al., 1992). The 2a protein carries motifs characteristic of an RNA-dependent RNA polymerase (RdRp) (Koonin & Dolja, 1993). RNA2 also encodes another small protein, 2b, which partly overlaps the 2a protein and is synthesized from the subgenomic RNA4a strand (Ding et al., 1994). RNA3 is bicistronic and encodes two ORFs: ORF3a, with a conserved motif typical of the 30K superfamily of movement proteins (Mushegian & Koonin, 1993), and the coat protein (CP), which is expressed from subgenomic RNA4 and pack-
The presence of satellite RNAs in the genomes of cucumoviruses has been described for some CMV and PSV strains (Militao et al., 1998; Devic et al., 1990; Naidu et al., 1992; Yamaguchi et al., 2005), but no satellite RNAs have been found in TAV. The satRNAs of plant viruses are small, single-stranded linear or circular RNA molecules. They need a helper virus that provides the replication enzymes to amplify satRNA sequences. The satRNAs of CMV contain about 335-368 nucleotides. The satRNAs of PSV are longer, at 391-393 nucleotides. The homology among PSV satRNAs is very high, as it is among those of CMV strains. However, the homology between the satRNAs of PSV and CMV is rather low. Expression of symptoms during pathogenesis depends on the trilateral interaction of the virus, its satRNA, and the host plant (Palukaitis, 1988). The presence of satRNA in the virion may modulate the severity of symptoms in the host plant, and both aggravation and attenuation have been reported (reviewed in Simon et al., 2004; Pelczyk et al., 2006).
Although the incidence of PSV is not as broad as that of CMV, it has been reported in many countries. Since 1966, when it was first identified in the United States of America (Miller & Troutman, 1966), many strains of PSV have been characterized. In Poland, PSV (PSV-P) was first reported in 1983 (Twardowicz-Jakusz & Pospieszny, 1988; Pospieszny, 1988), and later the satRNA sequence of the P strain was described (Ferreiro et al., 1996).
Peanut stunt virus strains were first divided into two subgroups, I (eastern) and II (western), based on serological relationships, homologies confirmed in competition hybridizations, and the identities of RNA3 and its ORFs and UTRs (Diaz-Ruiz & Kaper, 1983; Zeyong et al., 1986; Hu & Ghabrial, 1998). Then a third subgroup was proposed, represented by strains from China, with a much lower homology to the previous two subgroups (Zeyong et al., 1998). Recently, the Mi strain from the third subgroup was described and fully sequenced (Yan et al., 2005).
In this study we present the complete nucleotide sequence of a Polish PSV-P strain, associated with satellite RNA, its comparison with other strains of viruses belonging to the Cucumovirus genus and their phylogenetic relationships.

MATERIALS AND METHODS
Virus strain. The PSV-P strain was initially isolated from Lupinus luteus L., then maintained and propagated in Phaseolus vulgaris L., Nicotiana benthamiana or Pisum sativum L. From the latter it was taken for further analyses.
Purification of viral particles and RNA extraction. Viral particles were purified as described before (Pospieszny, 1998). RNA was isolated from the obtained particles by treatment with proteinase K followed by phenol/chloroform extraction and subsequent ethanol precipitations, according to the protocol of Sambrook et al. (1989). The length and integrity of the extracted RNA were analyzed by electrophoresis on a denaturing agarose gel stained with ethidium bromide and viewed under UV light.
RT-PCR reaction. Viral RNA was reverse transcribed by using Superscript III Reverse Tran- To amplify the 5' and 3' distal fragments of the viral RNA strands, rapid amplification of cDNA ends (5' and 3' RACE) was carried out using the 5' RACE and 3' RACE Systems for Rapid Amplification of cDNA Ends (Invitrogen), according to the manufacturer's instructions. To obtain the 5' terminal sequences, two gene-specific primers, 5' GSP1 and 5' GSP2 (Table 2), for each RNA strand and an abridged anchor primer supplied by the manufacturer were applied. For amplification of the 3' distal fragments, a poly(A) tail was first added to the 3' termini of the RNAs in a reaction with poly(A) polymerase (Amersham) and ATP (Fermentas) at 37°C for 10 min. Next, 3' RACE was performed using an adapter primer containing oligo(dT) (Invitrogen) in reverse transcription, and then the appropriate gene-specific primer 3' GSP1 (Table 2) for each RNA strand and an abridged universal amplification primer provided by the manufacturer for subsequent PCR amplification.
DNA cloning and sequencing. DNA fragments obtained in PCR reactions were cloned in the pGEM®-T Easy Vector System (Promega) or the pCR-XL-TOPO vector from the TOPO XL PCR cloning kit (Invitrogen) according to the manufacturers' instructions. The recombinant plasmids obtained were transformed into Escherichia coli DH5α competent cells by electroporation using a Micro Pulser electroporation system (Bio-Rad). Plasmids from positively screened bacterial colonies were isolated using a Qiaprep Spin Miniprep Kit (Qiagen) and then automatically sequenced. The sequenced fragments were submitted to the GenBank database.
Comparative and phylogenetic analyses. The nucleotide sequences of all RNAs and of their non-coding regions (5' UTRs, 3' UTRs, and the IR region of RNA3), as well as the nucleotide sequences of all ORFs and their predicted amino-acid sequences, were analyzed. They were compared with those of other PSV strains from subgroups I, II and III available in the GenBank database and also with representatives of subgroups I and II of CMV - CMV-Fny and CMV-Trk7, respectively - as well as the TAV-KC strain (Table 3). Multiple sequence alignments (MSA) of all sequences were performed using CLUSTALX (Thompson et al., 1997). The percentage comparisons were performed using the Lasergene package. Molecular masses of proteins were predicted in BIOEDIT (Hall, 1999). After an initial comparison and preliminary determination of the variable and conserved regions, phylogenetic analyses of the coding sequences were carried out using MEGA version 3.1 software (Kumar et al., 2004) with the Neighbor-Joining method (NJ) (Saitou & Nei, 1997) and 1000 repetitions in the bootstrap test. The genetic distance was assessed by Kimura's two-parameter distance method (Kumar et al., 2004). Phylogenetic trees were then drawn and visualized using MEGA 3.1 software.
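The distance measure named above can be illustrated with a minimal sketch. It assumes the standard form of Kimura's two-parameter distance, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. The function name k2p_distance and the choice to skip gapped or ambiguous columns are assumptions made for illustration; this is not the MEGA implementation used in the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(aligned_a: str, aligned_b: str) -> float:
    """Kimura two-parameter distance between two aligned nucleotide sequences.

    Hypothetical helper: columns containing gaps or ambiguous symbols are
    skipped (an assumed convention, not taken from the paper).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    # Treat RNA and DNA alphabets alike by mapping U to T.
    seq_a = aligned_a.upper().replace("U", "T")
    seq_b = aligned_b.upper().replace("U", "T")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a, seq_b):
        if a not in BASES or b not in BASES:
            continue  # gap or ambiguity code
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; everything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1

    if compared == 0:
        raise ValueError("no comparable columns")

    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A matrix of such pairwise distances is what a Neighbor-Joining procedure takes as input; the tree building and bootstrap steps themselves are not sketched here.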

RESULTS AND DISCUSSION
The symptoms caused by this strain were described previously (Twardowicz-Jakusz & Pospieszny, 1988; Obrepalska-Steplowska et al., 2008). In general, in the tested plants, including Phaseolus vulgaris and Pisum sativum, it induced local chlorosis or systemic mosaics; occasionally the infection was latent.

Nucleotide and amino-acid sequence data of the PSV-P virus
The data were collected after sequencing at least four clones obtained for each PCR product. Each clone was sequenced bidirectionally.
The PSV-P strain has three genomic RNAs, as do all peanut stunt viruses, encodes five ORFs and, like some PSVs, has a satellite RNA. The lengths of the RNAs are as follows: RNA1 - 3355 nt, RNA2 - 2982 nt, RNA3 - 2186 nt, and satRNA - 393 nt. Sequences of the genomic strands are available in GenBank under the accession numbers given in Table 3, and that of the satRNA under EF535259.

Genome-wide comparative analysis of PSV-P and other chosen cucumovirus strains
The nucleotide sequences of the whole RNA strands and of their respective constituents, namely the 5' and 3' UTRs, the ORFs for 1a, 2a, 2b, 3a and the coat protein, and the IR (internal non-coding region), were compared in BioEdit. In addition, the amino-acid sequences of the viral proteins were compared. Sequences of fully sequenced PSV strains representing the three subgroups were included in those analyses (two strains, ER and J, representing subgroup I, the W strain from subgroup II, and the Mi strain from subgroup III), as well as two representatives of subgroups I and II of CMV, CMV-Fny and CMV-Trk7, respectively, and the KC strain of TAV.
Comparison of the whole strands of PSV-P with those of the other PSV strains shows that each strand is homologous with PSV-ER and PSV-J to almost the same degree. RNA1 and RNA3 show a higher level of identity with those two strains from subgroup I, amounting to 87.9%, whereas RNA2 shows a lower one, 83.8-83.9%. The situation is similar when the sequence of PSV-P is compared with other subgroups of PSV and other cucumoviruses, where the homology of RNA1 and RNA3 is much higher than that of RNA2.
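The percentage values reported in this section are pairwise identities computed from aligned sequences. A minimal sketch of such a calculation follows; the function name percent_identity and the gap convention (columns gapped in both sequences are ignored, a gap opposite a base counts as a mismatch) are assumptions for illustration, since the exact convention of the software used in the study is not stated here.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a sequence alignment.

    Hypothetical helper with an assumed gap convention:
    - columns where both rows have a gap are ignored,
    - a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    matches = compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column is empty in both rows
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Toy example (not real PSV data): 7 of 8 aligned positions match.
print(percent_identity("ACGUACGU", "ACGUACGA"))
```

Different gap conventions can shift the result by a few tenths of a percent, which matters when values lie close to a classification threshold.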
The analysis of the 5' and 3' UTRs shows that the sequence of PSV-P is most similar to that of PSV-J (with the exception of the 5' UTR of RNA3, which shares 92.5% identity with both PSV-ER and PSV-J). The highest level of identity in these parts, higher than 90%, is shown by the 5' and 3' UTRs of RNA3. The other UTRs are less than 90% identical with those of the J strain, or even less than 80% in the case of the 3' UTR of RNA2. The 5' UTR of RNA3 contains a UG tract important for efficient accumulation of RNA3 (Boccard & Baulcombe, 1993), which is present in all PSV strains sequenced to date; in PSV-P it is identical to that in all other PSV strains with the exception of PSV-Mi, which differs by one nucleotide. Regarding subgroups II and III of PSV, PSV-P is more similar to PSV-W in all UTR fragments except for the 3' UTRs of RNA1 and RNA3, where PSV-Mi is much more similar.
Within subgroup I, the nucleotide sequences of the PSV-P ORFs are more similar to those of the ER strain for 1a, 2b and 3a, and to those of the J strain for 2a and CP. As for subgroups II and III, the PSV-P ORFs for 1a, 2a and 3a are more similar to those of PSV-W, and the remaining ORFs to those of PSV-Mi.
The amino-acid sequences of the PSV-P proteins are in turn more similar to those of PSV-J (1a, 2a, and CP) and PSV-ER (2b and 3a) among subgroup I strains and, among the remaining subgroups, to those of PSV-Mi (1a, 2b, 3a, CP) and PSV-W (2a).
In general, almost all analyzed components of the PSV-P genome have less than the 90% identity with other members of subgroup I that is required for unambiguous classification with them. Detailed results are collected in Table 4.
As mentioned in the Introduction, Peanut stunt virus strains are proposed to be divided into three distinct subgroups based, among other criteria, on their nucleotide sequence homology. The PSV-P strain does not fall strictly within any of these subgroups but is very closely related to the first one.
According to Hu et al. (1997), in order to be classified in the same subgroup viruses have to share at least 90% nucleotide homology, and to be classified in different subgroups their homology level should be lower than 80%. However, such a division, although very straightforward, leaves a gap for virus strains that show a homology level between 80 and 90%, and precludes their classification. They may only be considered as derivatives of the subgroup to which they are most closely related. PSV-P is such a strain: its full genomic RNA strands are 87.9%, 83.9% and 87.9% (for RNA1, RNA2 and RNA3, respectively) identical to those of the PSV-ER strain representing subgroup I. Moreover, the identities of the ORFs, as well as of the deduced amino-acid sequences (except for ORF1a), are also lower than 90%. A similar case has been described for the so-called atypical Old World strain PSV-I, which shares only 88.8% and 86.7% identity in its RNA1 and RNA2 sequences, respectively, with its most closely related W strain from subgroup II, and only its RNA3 may be robustly classified in this subgroup (Hajimorad et al., 1999). Those authors proposed that PSV-I could be a reassortant between a subgroup II strain and another, so far uncharacterized subgroup of PSV. A strain probably resulting from reassortment between subgroups I and II had been found before: PSV-BV-15, whose RNA1 is closely related to subgroup II whilst RNA2 and RNA3 are closely related to subgroup I (Hu et al., 1997).
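The threshold rule attributed above to Hu et al. (1997) can be written down as a short decision function. This is only a sketch of the rule as stated in the text (at least 90% for the same subgroup, below 80% for a different subgroup); the function name, the constant names and the returned labels are assumptions introduced for illustration.

```python
SAME_SUBGROUP_MIN = 90.0       # "at least 90% nucleotide homology"
DIFFERENT_SUBGROUP_MAX = 80.0  # "lower than 80%"


def classify_by_identity(identity_percent: float) -> str:
    """Apply the 90% / 80% rule to one pairwise identity value.

    Hypothetical helper; the labels are illustrative wording only.
    """
    if identity_percent >= SAME_SUBGROUP_MIN:
        return "same subgroup"
    if identity_percent < DIFFERENT_SUBGROUP_MAX:
        return "different subgroup"
    # The 80-90% gap discussed in the text.
    return "derivative of closest subgroup"


# Whole-strand identities of PSV-P versus PSV-ER, as given in the text.
psv_p_vs_er = {"RNA1": 87.9, "RNA2": 83.9, "RNA3": 87.9}
for strand, identity in psv_p_vs_er.items():
    print(strand, classify_by_identity(identity))
```

With the values quoted in the text, all three genomic strands of PSV-P fall into the 80-90% gap relative to PSV-ER.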
In the predicted protein sequences, motifs characteristic of specific functions were also compared. On the basis of homology with the ER strain (GB NC_002038 for protein 1a and NC_002039 for protein 2a), the amino-acid motif of the putative helicase domain was found in PSV-P protein 1a. This motif consists of 92 amino acids (aa 81-173 in PSV-P). In this motif, PSV-P 1a differs from PSV-ER by only two amino acids and is identical with PSV-J. Differences with other members of the subgroups were five and four amino acids (for the W and Mi strains, respectively). The next motif in PSV-P ORF1a, functioning as a putative methyltransferase domain, consists of 240 amino acids (aa 720-959) and differs by nine and eight amino acids from those of the ER and J strains, respectively. The differences between PSV-P and other subgroups of PSV as well as other cucumoviruses were much higher. In ORF2a, a sequence containing the conserved RNA-dependent RNA polymerase core domain was found, consisting of two regions: 11 amino acids and then three amino acids (in PSV-P: aa 563-573 and aa 596-598). This motif is identical in all cucumoviruses tested with the exception of CMV-Fny, in which it differs by one amino acid only.
The motif characteristic of the 30K movement protein family was found in PSV-P on the basis of homology with PSV-Mi (Yan et al., 2005) and consists of 33 aa (aa 86-118). It differs from that in the ER and J strains by four and five amino acids, respectively. With the other tested virus strains the differences are larger. Only in TAV-KC is the analyzed motif more similar to the PSV-P one, differing in only three amino acids.

Analysis of satRNA
Analysis of PSV-P satellite RNA had been performed before (Ferreiro et al., 1996), but it was sequenced again by our group, and no differences that might have been caused by replicase errors since the time of the first sequencing were found. The satRNA sequence of PSV-P was compared with the satRNA sequences of other PSVs available in GenBank (Ag - NC_009522, P6 - Z981197, P4 - Z98198, PARNA-5 - K03110) and those known from the literature (satG and satV; Naidu et al., 1991a). The level of identity was extremely high, amounting to 98.9-99.7%, with only one to four nucleotides differing between satP and the others.
The nucleotide sequences of the satRNAs were analyzed with special emphasis on two satRNAs that have been reported either to have no effect on infection symptoms (satV) or to attenuate them (satG). These two satRNAs differ from each other by only six nucleotides. SatP shows four nucleotide changes relative to each of them: in the case of satV, 166 G→T, 226 T→C, 362 C→A, and 337 T→C, and in the case of satG, 56 G→C, 57 C→G, Δ366 G, and 376 T→C.
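Lists of changes in the notation used above (position, reference base, arrow, new base, with Δ marking a deletion) can be generated mechanically from a pairwise alignment. The sketch below assumes that positions are numbered along the ungapped reference sequence; the function name list_changes, the wording used for insertions and the toy sequences are illustrative assumptions and are not satRNA data.

```python
def list_changes(reference: str, query: str, gap: str = "-") -> list[str]:
    """List differences between two aligned sequences.

    Hypothetical helper. Positions are 1-based and counted along the
    reference without its gaps (an assumed numbering convention).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")

    changes = []
    ref_pos = 0
    for r, q in zip(reference.upper(), query.upper()):
        if r != gap:
            ref_pos += 1
        if r == q:
            continue
        if q == gap:
            changes.append(f"Δ{ref_pos} {r}")               # deletion in query
        elif r == gap:
            changes.append(f"ins after {ref_pos}: {q}")     # insertion in query
        else:
            changes.append(f"{ref_pos} {r}→{q}")            # substitution
    return changes


# Toy example: one substitution at position 2 and one deletion at position 4.
print(list_changes("ACGTAC", "ATG-AC"))
```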
The substitutions 56 G→C and 57 C→G were reported not to have an influence on symptoms. In these positions the satRNA from PSV-P is identical with satV. Reduction of symptom intensity, suppression, and delay of symptom development were related to substitutions in the 3' part of the satellite RNAs studied, especially 362 A→C and 226 T→C (Naidu et al., 1992), found in the symptom-attenuating satG and in the PSV-P-associated satellite RNA. This finding agrees with our conjecture that satP, which contains these substitutions, may attenuate the expression of symptoms, a conjecture based on the observation that infections of the tested plants with PSV-P usually have a rather mild course (Pospieszny, 1988; Obrepalska-Steplowska et al., 2008). However, the symptoms described for this strain by another group (Ferreiro et al., 1996) were much more severe. Those authors suggested that this might have been caused by the presence of the satRNA associated with this virus strain, which in turn was suggested to exacerbate the expression of symptoms. Interestingly, the symptoms of the same virus strain reported previously from Spain and in this paper were completely different. It has been speculated that this is due to differences in climate and light period between these countries, conditions that have been reported to influence the biological effects of satRNAs, at least in CMV (Wu et al., 1993; Kaper et al., 1995; White et al., 1995).
Nonetheless, to clarify the effects of satP, experiments with infection by PSV-P with and without its satRNA should be carried out. This was tested neither here nor in the study mentioned above (Ferreiro et al., 1996), where the satRNA was only added to another PSV strain lacking satRNA and was found not to change the expression of symptoms. We do, however, plan such experiments to elucidate the role of satRNA in PSV-P, at least under the moderate climate conditions of Poland.

Phylogenetic analysis
Analyses of PSV-P phylogeny show clearly that this strain is closely related to representatives of subgroup I (Fig. 1). With the exception of the tree for the CP nucleotide sequence, the trees show consistently that PSV-ER and PSV-J are most closely related to each other, and that PSV-P shares a common ancestor with them and is the sister strain to those two strains. Analysis of phylogeny based on the CP sequence shows, on the other hand, that PSV-P has more in common with PSV-ER, although comparative analysis of the nucleotide sequence (Table 4) indicates that the CP of PSV-P shares more identity with that of PSV-J. Comparison with the other subgroups of PSV shows that PSV-P is more closely related to PSV-W (sequences encoding 2a, 2b, 3a, CP) or that the relatedness of PSV-P to both PSV-W and PSV-Mi is comparable (1a). Based on the topology of the trees for the 2b, 3a and CP ORFs it can be concluded that TAV is more closely related to all PSV strains than is the case for CMV, which agrees with the results of an earlier study (Yan et al., 2005). In the case of the trees obtained for ORFs 1a and 2a, each species analyzed forms a separate cluster.
A. Obrepalska-Steplowska and others
The present comparative study of strains from subgroup I has revealed that specific regions of PSV-P are most similar to PSV-ER or PSV-J, depending on the tested region. On the other hand, phylogenetic analyses show that the relatedness of PSV-P to these strains, which are clustered together, is similar, and only the tree obtained for the CP coding sequence indicates a closer relation to the ER than to the J strain. Such discrepancies are also observed for comparisons with strains representing subgroups II and III, as well as other cucumoviruses. However, this may be explained by the fact that in phylogenetic analyses the first and second bases of the codon are assigned higher weights than the third one, whereas in simple percentage comparisons each base of the codon is treated equally.
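One simple way to see how codon position can affect such comparisons is to break the identity of two aligned ORFs down by codon position. The sketch below is an illustration only and does not reproduce any weighting scheme used by the phylogenetic software; it assumes a gap-free, in-frame alignment, and the function name identity_by_codon_position and the toy sequences are hypothetical.

```python
def identity_by_codon_position(orf_a: str, orf_b: str) -> tuple[float, float, float]:
    """Percent identity at the 1st, 2nd and 3rd codon positions.

    Hypothetical helper. Assumes both ORFs are aligned without gaps,
    start in frame, and have a length that is a multiple of three.
    """
    if len(orf_a) != len(orf_b):
        raise ValueError("ORFs must have the same aligned length")
    if len(orf_a) == 0 or len(orf_a) % 3 != 0:
        raise ValueError("ORF length must be a non-zero multiple of three")

    matches = [0, 0, 0]
    totals = [0, 0, 0]
    for index, (a, b) in enumerate(zip(orf_a.upper(), orf_b.upper())):
        position = index % 3  # 0, 1, 2 -> first, second, third base of the codon
        totals[position] += 1
        if a == b:
            matches[position] += 1

    first, second, third = (100.0 * m / t for m, t in zip(matches, totals))
    return first, second, third


# Toy example: all differences fall on third codon positions.
print(identity_by_codon_position("ATGGCTAAA", "ATGGCCAAG"))
```

If most differences between two strains sit at third positions, an overall percentage that treats every base equally and an analysis that emphasizes first and second positions can rank the same pair of strains differently.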

Figure 1. Phylogenetic trees of cucumoviruses obtained from analysis in MEGA 3.1 for each ORF nucleotide sequence. Numbers at nodes indicate the percent occurrence of nodes in 1000 bootstrap resamplings. Roman numerals indicate the respective PSV subgroups.