Introduction

Neonatal calf morbidity and mortality due to infectious gastroenteritis account for significant economic losses in beef and dairy production. Noroviruses (NoVs) belong to the family Caliciviridae, genus Norovirus. They are positive-sense, single-stranded RNA viruses and are known to be common agents of gastroenteritis in humans. Morphologically similar viruses were detected in bovine species in the late 1970s [44], and some of them were subsequently found to exhibit a molecular relationship to NoVs [23, 33], while Newbury-1 virus and viruses subsequently found to be molecularly related to it have been classified in the recently established genus Nebovirus [34]. Bovine noroviruses (BoNoVs) have been detected worldwide in stool from diarrhoeic and non-diarrhoeic cattle [1, 8, 17, 26, 31, 32, 36, 37, 41, 43, 45].

Three open reading frames (ORFs) have been described in the NoV genome: ORF1 encodes a polyprotein, which is further processed into six non-structural proteins (N-terminal protein, NTPase, 3A-like protein, VPg, 3C-like proteinase, polymerase), ORF2 encodes the single capsid protein, and ORF3 encodes a minor structural protein [16]. Recently, a fourth ORF has been reported in the murine NoV genome [30], but evidence is lacking for its existence in other NoV genomes. The genus Norovirus has been divided into five genetic clusters (genogroups GI to GV). Within each genogroup, strains have been further divided into genotypes on the basis of phylogenetic analysis performed on partial or complete sequences from the polymerase and capsid genes [47]. Genogroup III includes all of the BoNoVs, with Bo/NoV/Jena/1980/DE and Bo/NoV/Newbury2/1976/UK as reference strains for genotype 1 and 2, respectively.

Positive-strand RNA viruses evolve quickly due to mutations (point mutations or recombination), resulting in fitness modifications, antigenic drift and/or receptor-switching capabilities. The mutated strains are thereafter subjected to immune pressure and positive or negative selection in their respective hosts, but also to several other environmental/epidemiological factors that drive and regulate virus evolution [10, 14]. In NoVs, evidence of recombination events has been found in silico in various genogroups [5], and recombination was recently demonstrated in vitro in a murine NoV model [25]. For human noroviruses (HuNoVs), in particular those belonging to GII.4, the hypothetical mechanisms involved in the periodic emergence and/or persistence of epidemic strains seem to depend on herd immunity and the possibility of antibody-driven receptor switching, especially between histo-blood group antigen-related attachment structures [6, 7, 21, 22]. However, these hypothetical mechanisms may not occur in a similar way for all human genotypes [4]. Data for BoNoVs have been lacking, but epidemiological studies have shown a higher molecular prevalence worldwide of genotype 2 viruses [38], and only three complete genotype 2 BoNoV genomes have been completely sequenced: Newbury2/1976/UK, Dumfries/1994/UK, and B309/2003/BE [29, 35].

Accumulation of new nucleotide sequences is therefore crucial to extend the current knowledge on the molecular epidemiology of BoNoVs and to study the mechanisms that govern their particular molecular evolution. The aim of this retrospective study was first to genetically characterise BoNoVs detected in stool specimens from Belgian cattle during the years 2002–2003, in particular those detected later in 2007–2008; second, to genetically compare BoNoVs sequences from around the world; and third, to study the genetic evolution of GIII.2 BoNoVs.

Materials and methods

Faecal specimens

Cattle stool specimens (n = 317) were systematically collected from field samples (individual samples collected by practitioners on farms) received for routine diagnosis in the regional diagnosis laboratory (Association Régionale de Santé et d’Identification Animales, ARSIA) in Belgium over a 2-year period (2002–2003). They were collected without regarding the age of the sampled animal or stool consistency (diarrhoeic or non-diarrhoeic). Two stool specimens that were identified as positive in 2007 and are already described elsewhere (BV15 and BV24, partial sequences from the polymerase coding region, [26]) were also included in the study in order to determine the complete sequences of the capsid coding regions. The specimens were all stored at 4 °C until RNA extraction and then at −80 °C before analysis.

RT-PCR, cloning and sequencing

Briefly, 10 % stool specimen suspensions in phosphate-buffered saline were centrifuged. Extraction of RNA from the supernatants was then performed using a QIAamp Viral RNA Mini Kit (QIAGEN, Leusden, The Netherlands). RT-PCRs were performed using an Access RT-PCR System Kit (Promega, Leiden, The Netherlands), and all samples (N = 317) were tested with different primer sets (listed in Supplementary Table 1) to increase the sensitivity of the procedure during the study.

A separate RT protocol was employed to obtain longer genomic fragments (i.e., from the 3′ end of the polymerase gene up until the poly-A tail) from samples BV15 and BV24. Both Superscript III reverse transcriptase (Invitrogen, Merelbeke, Belgium) and a consecutive PCR with different high-fidelity polymerases—Phusion (Westburg, Leusden, The Netherlands), iProof (Bio-Rad, Nazareth, Belgium) or Platinum (Invitrogen, Merelbeke, Belgium)—were used for this purpose, following the manufacturers’ instructions. These longer fragments were obtained by combining the use of a forward primer targeting the end of the polymerase coding region and the strategy described by Dingle and collaborators [9] (the use of a TVN-linker primer to perform the reverse transcription step and then the use of the linker in PCR reactions).

The complete sequence of the B309/2003/BE strain (from a BoNoV-positive specimen from a calf stool sample from the same cohort of 317 samples) has already been reported elsewhere [29].

The small RT-PCR products of the expected size were excised from a 2 % agarose gel and purified using a QIAquick Gel Extraction Kit (QIAGEN, Leusden, The Netherlands). Purified RT-PCR products were then sequenced in both directions using an automated sequencer (Megabase autosequencer) or in the GIGA facilities of the University of Liège. For long fragments (from the 3′ end of the polymerase gene to the poly-A tail), cloning into TOPO Zero Blunt plasmid (Invitrogen, Merelbeke, Belgium) was carried out using Top10 bacteria, and at least three clones were then sequenced.

Bioinformatics

Nucleotide sequences (primers omitted) were compared with sequences in international databases using the BLAST application (http://blast.ncbi.nlm.nih.gov/Blast.cgi). Sequences were aligned using the ClustalW program within the Bioedit software [15]. Phylogenetic analysis was performed by the maximum-likelihood method with bootstrapping (n = 1000), using MEGA version 5.10. Based on their lowest Bayesian information criterion scores, the Jukes and Cantor + γ distribution and the Tamura 3-parameter + γ distribution were used as substitution models for the phylogenetic trees inferred from the partial polymerase and partial capsid coding sequences, respectively (the number of sequences in the sets was variable) [40]. Similarity plots were created using Simplot software version 3.5.1 (available at http://sray.med.som.jhmi.edu/SCRoftware) with a 200-bp window. Rates of nucleotide substitution per site and per year within the complete genomes of GIII.2 BoNoV were estimated using the Bayesian Markov chain Monte Carlo method as implemented in the BEAST software package [13]. For comparison, complete representative GII.4 sequences (CHDC5191/1974/US, JX023286; Lordsdale/1993/UK, X86557; Farmington Hills/2002/US, AY502023) were arbitrarily selected on the basis of their contemporaneous detection with the available complete GIII.2 BoNoV sequences. The substitution model was the Tamura-Nei + γ distribution; the dataset was not partitioned by codon position. As no demographic assumption could be selected a priori for NoVs, the Bayesian skyline model was preferred [3, 11]. BEAST files were run assuming either a strict molecular clock or an uncorrelated relaxed lognormal relaxed clock (three times each). The 95 % highest probability density values gave the statistical uncertainty in parameter values across the sampled phylogenetic trees. The same analysis was then performed on a set of 30 complete ORF2 sequences from GIII.2 BoNoVs available in GenBank (accession numbers available upon request), including those determined in this study (BV15/2007.BE and BV24/2007/BE) and the previously described B309/2003/BE, with the Tamura and Nei + γ distribution + invariant sites as the substitution model and the dataset portioned by codon positions as the only modified settings. For both analyses, chain convergence was checked with TRACER from the BEAST package.

Statistics

Statistical analysis of the substitution rates was performed using a two-tailed unpaired t-test on the means and standard errors of three runs as obtained in TRACER, with p < 0.001 considered significant.

Results

Bovine norovirus sequences detected in Belgium in 2002–2003

Fifteen different stool specimens from cattle gave one or several RT-PCR products at the expected molecular weight (giving an apparent molecular prevalence of 4.73 %): 6 sequences were obtained using the JV12–JV13 primers (polymerase coding region), 8 using the CBECUF-R primers (polymerase coding region) and 10 using the CCV3–CCV4 primers (capsid coding region). Longer products (about 3.5 kb, including the 3′ end of the genome from the end of the polymerase gene to the poly-A tail) were also obtained and sequenced from two stool specimens (BV15 and BV24) identified as being BoNoV positive during a later study in 2007. Phylogenetic analysis performed on partial polymerase and partial capsid coding regions showed that the detected BoNoV sequences were all closely genetically related to GIII isolates and that they all clustered with the genotype 2 reference strain Newbury2/1976/UK. Amino acid identities between the detected sequences and the Newbury2 reference strain were found to range from 95.9 % to 99.6 %, and the nucleotide sequence identity was not less than 91 % in the polymerase coding region (data not shown). No sequences that were genetically related to HuNoV were found despite the use of primers specifically designed to detect such viruses (JV12–JV13). Phylogenetic relationships were detected in both the partial polymerase and capsid genes of the sequences (Fig. 1), and no evidence of recombination events was found in samples B123, B128, B199, B200 or B242 (as for the previously described contemporaneous sample B309) when their respective phylogenetic relationships in these two regions were compared.

Fig. 1
figure 1

Maximum-likelihood phylogenetic tree of bovine norovirus sequences detected in Belgium (2002–2003) based on partial polymerase (region B, primer pair CBECUF-R, 255 nucleotides, A) and capsid protein gene sequences (region D, primer pair CCV3-CCV4, 237 nucleotides, B) with reference sequences and other sequences detected thereafter (2007–2008) in the same country. Sequences obtained during this study are shown in bold, and all belong to the genogroup III genotype 2 noroviruses. Reference strains are shown in italics and underlined. Bootstrap values (1,000 replicates) are reported. The GenBank accession numbers are as follows: Hu/NoV/Norwalk/1968/US, M87661; Hu/NoV/Bristol/1993/UK, X76716; Hu/NoV/Saint Cloud/624/1998/US, AF414427; Mu/NoV/CW1/2002/US, DQ285629; Bo/NoV/Newbury2/1976/UK, AF097917; Bo/NoV/Jena/1980/DE, AJ011099. Information about bovine norovirus sequences identified in Belgium and used in this study is available in Supplementary Table 2

Genetic analysis of BoNoV sequences detected in Belgium and those from around the world

Phylogenetic relationships were inferred from some Belgian sequences and several other BoNoV sequences from around the world (United Kingdom, Norway, Germany, Japan, United States), detected on different dates. A phylogenetic tree was constructed using sequences covering the start of the capsid coding region (so-called region C, [42]). From that phylogenetic tree, neither topotypes (considered here to be closely genetically related viruses present in the same geographic area) nor an evolutionary trend over time (years of detection) could be deduced (Fig. 2). Sequence identity matrixes between the selected GIII.2 BoNoV nucleotide and amino acid sequences are given in Supplementary Files 1 and 2, respectively.

Fig. 2
figure 2

Maximum-likelihood phylogenetic tree of some of the bovine norovirus sequences detected in Belgium and sequences from around the world, based on partial capsid protein sequence (region C, 402 nucleotides). No evidence for topotypes is seen in the tree. Sequences generated during this study are shown in bold; reference strains are in italic and underlined. Bootstrap values (1,000 replicates) are reported if higher than 70 %. The GenBank accession numbers are as follows: Hu/NoV/Norwalk/1968/US, M87661; Hu/NoV/Bristol/1993/UK, X76716; Hu/NoV/Saint Cloud/624/1998/US, AF414427; Mu/NoV/CW1/2002/US, DQ285629; Bo/NoV/Newbury2/1976/UK, AF097917; Bo/NoV/Jena/1980/DE, AJ011099; Bo/NoV/Thirsk10/2000/UK, AY126468; BoNoV sequences detected in this study, EU794905–EU794907. Other accession numbers are available upon request

On the basis of a multiple alignment analysis performed on the complete amino acid sequence of the capsid protein, very few mutations were observed in Newbury2/1976/UK, Dumfries/1994/UK, B309/2003/BE, BV15/2007/BE and BV24/2007/BE during the 30-year-long period covered by their respective detection dates. As expected, most of these mutations were reported in the sequence corresponding to the protruding domain. Nucleotide and amino acid sequence identities were 89 % and 98 %, 87 % and 96 %, and 88 % and 96 % for B309, BV15 and BV24, respectively, in comparison with the Newbury2/1976/UK reference strain. Alignment of the capsid sequences showed the presence of the LAGNA motif in the S domain of the capsid protein (Supplementary Fig. 1), which is also found in other members of the genus Norovirus. This motif is absent in members of the genera Vesivirus, Lagovirus and Sapovirus.

Genetic comparison of complete BoNoV complete sequences

The complete genome sequences of the genotype 1 (Jena/1980/DE) and genotype 2 (Newbury2/1976/UK) reference strains were plotted and compared to the two other published complete BoNoV sequences: Dumfries/1994/UK and B309/2003/BE (Fig. 3). Nucleotide and amino acid sequence identities are given as percentages for the different open reading frames of bovine strain B309/2003/BE in Table 1. When only genotype 2 nucleotide sequences were analysed, very few differences were noted, except for three interesting patterns showing similarity percentages that were completely divergent between B309/2003/BE and Dumfries/1994/UK. These patterns were localised at nucleotide positions 3900–4200 (the similarity percentage for B309/2003/BE was reduced), nucleotide positions 5900–6100 (the similarity percentage for B309/2003/BE was more closely related to Newbury2/1976/UK than was Dumfries/1994/UK) and nucleotide positions 6700–7100 (the similarity percentage with Newbury2/1976/UK was first higher for Dumfries/1994/UK compared to B309/2003/BE and then reversed). These three regions were located at the 3′ end of ORF1 (polymerase coding region), within the ORF2 (capsid-coding region), and in the 5′ part of ORF3, respectively.

Fig. 3
figure 3

Nucleotide similarity plot of the complete bovine norovirus sequences. Bo/NoV/B309/2003/BE was plotted against the genotype 1 and 2 BoNoV reference strains (Bo/NoV/Newbury2/1976/UK, grey line; Bo/NoV/Jena/1980/DE, dotted line and Bo/NoV/Dumfries/1994/UK, black line). Nucleotide positions are shown on the x-axis, and the similarity percentage on the y-axis. Regions with evident divergence within the main similarity percentage pattern are identified by a shaded box. Plots showing less than 50 % similarity are not shown

Table 1 Nucleotide and amino acid sequence identity between B309/2003/BE and two bovine norovirus reference strains in their three respective open reading frames (ORF), cleavage products of ORF1, and subdomains of the capsid protein

Amino acid similarity for the reference strain Newbury2/1976/UK was then plotted in its three coding regions against the other two time-divergent genotype 2 strains, Dumfries/1994/UK and B309/2003/BE, and the genotype 1 reference strain Jena/1980/DE (Fig. 4). The regions that varied the most in terms of differences in their similarity percentages were found in the N-terminal protein encoded by ORF1, in the hypervariable region of the capsid gene encoded by ORF2, and in the minor structural protein encoded by ORF3.

Fig. 4
figure 4

Amino acid identity plot for each of the three open reading frame (ORF) products of the genotype 2 BoNoV reference strains (Bo/NoV/Newbury2/1976/UK) against other genotype 2 strains (Bo/NoV/Dumfries/1993/UK, Bo/NoV/B309/2003/BE) and the genotype 1 reference strain (Bo/NoV/Jena/1980/DE). Amino acid positions are shown on the x-axis, and similarity percentages on the y-axis. Plots showing less than 70 % similarity are not shown

Genetic evolution of GIII.2 bovine noroviruses

Simplot analysis performed on the available complete GIII.2 BoNoV sequences clearly showed that genetic divergence was lower in comparison with that reported for contemporaneous complete GII.4 sequences (Supplementary Fig. 2). Moreover, Bayesian inferences with BEAST gave systematically lower substitution rates for GIII.2 BoNoV along their entire genome than for GII.4 HuNoV, regardless of the molecular clock model (Table 2). These results were also confirmed when other contemporaneous GII.4 strains were selected for comparative analysis (CHDC2094/1974/US, FJ537135; Camberwell/1994/AU, AF145896; 5M/2004/US, JQ798158; data not shown). When the substitution rates were estimated from the complete ORF2 sequences, both strict and relaxed uncorrelated lognormal molecular clock models gave substitution rates of the same order of magnitude (10−3 substitutions/site/year) but lower than those already estimated for the same genomic region for GII.4 and GII.3 HuNoVs (Table 3).

Table 2 Nucleotide substitution rates, estimated by the Bayesian Markov chain Monte Carlo method with different molecular clock models, of bovine genogroup III genotype 2 and human genogroup II genotype 4 noroviruses (complete genomes, means of the three runs are given)
Table 3 Nucleotide substitution rates, estimated by the Bayesian Markov chain Monte Carlo method, assuming different molecular clock models, in open reading frame 2 of GIII.2 bovine noroviruses (30 complete sequences)

Discussion

In this study, BoNoV sequences were molecularly detected from Belgian cattle stool samples, and all were found to be phylogenetically related to members of genotype GIII.2 in the genus Norovirus. These partial sequences from BoNoV detected in Belgium were also genetically compared to those detected around the world since the end of the 1970s. Based on these analyses, neither topotypes nor an evolutionary trend similar to that of GII.4 HuNoVs could be observed for genotype 2 BoNoVs. In addition, all currently published complete GIII.2 nucleotide sequences were compared genetically, allowing the detection of hot spots of genetic evolution in their genomes. Bayesian inference of the substitution rate for GIII.2 BoNoV supported a slower genetic evolution of these viruses compared to both GII.3 and GII.4 HuNoVs.

During this study, the genotype 2 strains were the most frequently detected ones, as has already been shown in other studies performed around the world on BoNoV [38], but no other clear epidemiological conclusions were drawn from the apparent molecular prevalence observed because the study protocol was not designed to meet this objective. These results and those from other studies thus argue for a better adaptation of these strains to their bovine host compared to the GIII.1 strains. However, bias due to a lack of primer specificity or to the sampling strategy, for example, should not be ruled out. The phylogenetic inferences deduced based on sequences from region C [42] allowed the integration into the phylogenetic tree of the largest number and the most representative set of BoNoV sequences available in GenBank. From that tree, neither topotypes (similar to those already described for other RNA viruses) nor a time evolutionary trend (similar to those already described for GII.4 HuNoVs, [3]) were observed. The accumulation of partial or complete BoNoV sequences should increase the accuracy of these studies in the future. In contrast to studies performed in the same geographic region in humans and cattle [27], no recombinant strains were detected in this study, reinforcing the hypothesis of the relatively low occurrence of such events in the field. According to the recent results on GIII.2 BoNoV attachment factors [46], the higher molecular prevalence of these strains also argues for a low zoonotic risk associated with BoNoVs.

The genetic comparison between complete genotype 2 BoNoVs sequences reveals only limited genetic evolution in comparison with that observed in other NoV clusters, such as HuNoV strains from GI and GII. Interestingly, some mutations identified in the capsid amino acid sequences between Newbury2/1976/UK and Dumfries/1993/UK (these mutations occurred 10 years apart from each other) tended to revert in the B309 genome. These could be considered true reversions or single nucleotide polymorphism hot spots. Not surprisingly, the protruding domain of the single capsid protein, which is exposed to the host’s immune response and is involved in receptor binding, was found to exhibit the largest number of mutations, as has already been observed for other NoVs [24]. It has been demonstrated that antigenic drift in the structure surrounding the binding pocket as well as receptor-switching capabilities govern ORF2 selection and evolution in GII.4 HuNoVs [20]. Binding studies with virus-like particles of two GIII.2 BoNoVs, namely Newbury2/1976/UK and B309/2003/BE, have been performed in recent years, showing the involvement of the same oligosaccharide structure (namely, the alpha-gal epitope) [28, 46]. Thus, taking into account the fact that GIII.2 BoNoVs do not seem to evolve via attachment-factor switching, we can hypothesise that only antigenic drift is currently involved in the model of ORF2 evolution of BoNoVs. In the present study, the only other proteins that showed an evolutionary pattern were the N-terminal protein coding region (in ORF1) and the minor structural protein encoded by ORF3, essentially in its N-terminal coding region. Various roles have been proposed for this poorly studied minor structural protein in members of the genus Calicivirus [2, 39]. However, assuming its evolutionary pattern, which seems to be related to virus-host interactions, it would be very interesting to better understand which properties and functions are relied upon by the virus.

The estimated substitution rates from the complete genome sequences of both GIII.2 BoNoVs and GII.4 HuNoVs need to be carefully considered due to the very small number of sequences introduced in the Bayesian studies. However, the substitution rates estimated from the GII.4 sequences were similar to those obtained during a previous study encompassing a significantly larger number of GII.4 ORF2-derived sequences [3] and more generally from those of most of the RNA viruses [14]. The improvement represented by this study was to encompass complete sequences instead of partial sequences, considering that the evolution rates should be different depending on the genomic region involved. On the other hand, when the substitution rates were inferred only from the ORF2 regions of an larger number of GIII.2 sequences, they were also lower than those already published for both GII.3 and GII.4 HuNoVs [3, 4]. These results were obtained with both a strict and an uncorrelated relaxed lognormal molecular clock. Two different molecular clocks were used in this work, as different studies already reported that the strict molecular clock, which assumes that the substitution rates are equal across all branches of a tree, could not accurately model the natural variation of mutation rates, generation times, population sizes, and structures during phylogenetic studies [12, 19]. From this perspective, an uncorrelated relaxed lognormal molecular clock should be more efficient [18] and was the recommended relaxed model in the BEAST package. When estimating substitution rates from phylogenetic relationships, recombination events need to be taken into account, even if these events occur at a low frequency, as seems to be the case for NoVs. In this study, no recombinant sequences were used in the Bayesian inferences.

Contemporaneous BoNoV B309/2003/BE sequences detected in this study were all genetically related to members of GIII.2, reinforcing the hypothesis that BoNoVs still constitutes a different lineage in the genus Norovirus and arguing that genotype 2 seems to be the most prevalent in the BoNoV genogroup. About 30 years separate the detection of the genotype 2 reference strain Newbury2/1976/UK and the B309/2003/BE strain. Surprisingly for such RNA viruses, very few genetic differences were noted between them, and this was supported by the Bayesian estimation of their substitution rate. This is consistent with the hypothesis that genotype 2 BoNoVs may evolve more slowly than HuNoVs for which a substitution rate has already been estimated (GII.4 and GII.3 strains), introducing interesting questions about varying substitution rates depending on the genogroup/genotype of the strain. BoNoVs are likely to be subjected to the same immunological pressure as HuNoVs; however, the epidemiological context is completely different: while HuNoVs are subjected to the consequences of globalisation (more opportunities to spread, evolve and adapt to new hosts), the ability of BoNoVs to spread is more restricted (e.g., due to cattle movement restrictions and the existence of breeds specific to a given region). Also, the hypothesis of a narrower spectrum of attachment factor/receptor switching available to the virus in its bovine host needs to be investigated. Moreover, the genetic evolution, functions, and interactions of the (Bo)NoV N-terminus and ORF3 proteins with the host proteome require further study.