Use of Norovirus Genotype Profiles to Differentiate Origins of Foodborne Outbreaks

Detection should enable containment of viral foodborne infections.

Because secondary transmission masks the connection between sources and outbreaks, estimating the proportion of foodborne norovirus infections is difficult. We studied whether norovirus genotype frequency distributions (genotype profiles) can enhance detection of the sources of foodborne outbreaks. Control measures differ substantially; therefore, differentiating this transmission mode from person-borne or food handler-borne outbreaks is of public health interest. Comparison of bivalve mollusks collected during monitoring (n = 295) and outbreak surveillance strains (n = 2,858) showed 2 distinguishable genotype profiles in 1) human feces and 2) source-contaminated food and bivalve mollusks; genotypes I.2 and I.4 were more frequently detected in foodborne outbreaks. Overall, ≈21% of all outbreaks were foodborne; further analysis showed that 25% of the outbreaks reported as food handler-associated were probably caused by source contamination of the food.
Noroviruses are members of the family Caliciviridae and recognized as major pathogens in outbreaks of gastroenteritis worldwide. Because these viruses have environmental stability (1), ability to use different transmission routes, and low infective doses (2), their source may be difficult to determine during an outbreak. Transmission can occur through contact with shedding persons; food contaminated during processing, preparation, or serving; sewage-contaminated water used for consumption, cultivation, or irrigation of food; contaminated aerosols resulting from vomiting; and environmental contamination (3,4). Five genogroups have been described (GI-V), subdivided into at least 40 genetic clusters (5,6).
To implement effective measures for prevention, recognition of the transmission routes is necessary. Consequently, the relative importance of different transmission routes in the total number of outbreaks is of interest for estimation of cost-effectiveness of reducing the number and size of norovirus outbreaks, particularly for geographically disseminated foodborne outbreaks. Such outbreaks are difficult to detect when the primary introduction of viruses through food occurs simultaneously in several countries or continents (7-9). Globalization of the food industry with consequential international distribution of products increases the risk for such outbreaks. For example, the first reported GII.b outbreak occurred in August 2000 during a large waterborne outbreak in southern France (10). After this outbreak, in December and January, 4 multipathogen and oyster-related outbreaks with this newly emerging genotype were reported from France. In the same period, Denmark, Finland, and the Netherlands reported norovirus cases resulting from oysters originating from a French batch that probably was sold in these countries, as well as in Sweden, Italy, and Belgium (6). All these outbreaks seemed to involve closely related and newly detected GII.b strains. After active case identification, further linked cases were detected in Germany, the United Kingdom, Spain, Slovenia, and Sweden (11,12). Another example of a geographically disseminated outbreak was several seemingly independent norovirus outbreaks in Denmark that were traced back to consumption of raspberries from Poland. Although raspberries from this contaminated batch were exported to other European countries, an alert in the Rapid Alert System for Food and Feed did not result in further linked outbreak reports (7).
Thus, geographically disseminated outbreaks are sometimes identified but only after the joint and exhaustive efforts of different organizations, such as laboratory networks, food safety authorities, and public health institutions. Knowledge of the proportion of geographically disseminated foodborne outbreaks to all norovirus outbreaks will therefore provide insight into the cost-effectiveness of such efforts.
We studied whether the genotype frequency distributions (genotype profiles) of strains can be used to differentiate foodborne outbreaks related to contamination early in the food chain (i.e., during primary production) from those related to contamination later in the food chain (i.e., during preparation or serving). If so, detection of food origins likely to cause geographically disseminated outbreaks will be enhanced. We considered methods for attribution to multiple sources commonly applied to Salmonella infections (13) because different transmission routes involved in norovirus infections can disguise the foodborne origin. However, such methods require strain collections representative of noroviruses in the potential sources that are as yet unavailable because of difficulties in the direct detection of viruses in food (14-16). Therefore, we compared 2 strain collections: noroviruses identified through filter-feeding bivalve mollusk monitoring representing source contamination of food and noroviruses collected through systematic surveillance of illness in the population. The first was collected by the European Community Reference Laboratory for Monitoring Bacteriological and Viral Contamination of Bivalve Mollusks during 1995-2004 (17) and the second by the Food-Borne Viruses in Europe (FBVE) network, which has conducted surveillance for norovirus outbreaks in Europe since 1999. Prior investigation of the FBVE database of systematically collected epidemiologic and microbiological norovirus surveillance data (6) showed that the epidemiology of norovirus outbreaks in Europe varies between genogroups. An analysis of the properties of reported outbreaks indicated a clear difference between GII.4 strains and other noroviruses; non-GII.4 strains were found more frequently in outbreaks with a foodborne mode of transmission, and GII.4 strains were found more frequently in healthcare settings with person-to-person transmission (18,19).
Here we demonstrate that further specification into genotypes shows additional differences in the epidemiology of norovirus outbreaks.

Data Sources
We used 2 broad databases reflecting norovirus prevalence within the European countries under surveillance. These databases gave us the opportunity to compare genotype proportions as detected in outbreaks, i.e., human surveillance data, with those detected in source-contaminated food products, i.e., bivalve mollusk monitoring data.

Human Surveillance Data
From January 1999 through December 2004, FBVE collected molecular information on 2,727 norovirus outbreaks and sporadic cases in Denmark, Finland, France, Germany, England and Wales, Hungary, Ireland, Italy, the Netherlands, Norway, Sweden, Slovenia, and Spain (20,21). Although the name FBVE suggested a foodborne focus, the network actually investigated outbreaks from all modes of transmission to obtain a comprehensive overview of viral activity in the community (strengths and limitations of the FBVE data collection were described by Kroneman et al. [20]; to compare newly detected strains with the FBVE database and find potential linked outbreaks, we used a comparison tool [www.rivm.nl/bnwww]). Data were reported to FBVE at outbreak level; therefore, no informed consent was needed. Outbreaks were categorized as follows on the basis of the cause of infection as reported in the surveillance system:
• Foodborne-food (FB-food) when an outbreak was reported to be caused by food and the outbreak strain was detected in food;
• Foodborne-feces (FB-feces) when an outbreak was reported to be caused by food and the outbreak strain was detected in human feces only;
• Foodborne (FB) when an outbreak was classified as FB-food or FB-feces;
• Food handler-borne (FHB) when an outbreak was reported to be caused by an infected food handler contaminating the food and the outbreak strain was detected in human feces;
• Person-borne (PB) when an outbreak was reported to be caused by person-to-person transmission and the outbreak strain was detected in human feces;
• Unknown (UN) when the mode of transmission was not reported or was reported to be unknown and the outbreak strain was detected in human feces.
When the mode of transmission was not reported but information was given in text data fields, this information was used to categorize the outbreak. Because we were interested in the origin of the virus, we categorized outbreaks involving PB transmission but starting with food as FB-food, FB-feces, or FHB, depending on available information. Strains detected in sporadic cases were clustered into outbreaks if information was available. The remaining strains detected in sporadic cases were considered of interest with respect to the genotypes causing human illness and representative of potential unreported outbreaks. When we detected multiple genotypes during an outbreak or in sporadic cases, we recorded each genotype.
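The categorization rules above can be written as a small decision function. The following is only an illustrative sketch: the function name `categorize`, the string labels for the reported origin, and the `strain_in_food` flag are hypothetical and do not correspond to actual fields in the FBVE database.

```python
def categorize(reported_origin, strain_in_food=False):
    """Map a reported origin of infection to an outbreak category.

    reported_origin: hypothetical labels "food", "food handler",
        "person-to-person"; None or anything else counts as unknown.
    strain_in_food: True when the outbreak strain was detected in food,
        False when it was detected in human feces only.
    """
    origin = (reported_origin or "").strip().lower()
    if origin == "food":
        # Both subcategories together form the foodborne (FB) group.
        return "FB-food" if strain_in_food else "FB-feces"
    if origin == "food handler":
        return "FHB"
    if origin == "person-to-person":
        return "PB"
    # Mode of transmission not reported or reported as unknown.
    return "UN"


print(categorize("food", strain_in_food=True))
print(categorize(None))
```

Because the study classified outbreaks by the origin of the virus, an outbreak that started with food and continued from person to person would be passed to this function with the food-related origin, not with "person-to-person".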

Bivalve Mollusk Monitoring Data
The European Community Reference Laboratory for Monitoring Bacteriological and Viral Contamination of Bivalve Mollusks systematically collected sequence data on norovirus strains routinely detected in bivalve mollusks in Europe. During January 1999-December 2004, the laboratory systematically collected 295 strain sequences with region A sequence lengths varying from 76 to 78 nt. These strain sequences were detected as part of production area monitoring studies or outbreak investigations of gastroenteritis in Denmark, England and Wales, Ireland, Scotland, and Spain. All samples were first routinely tested with GI and GII PCR methods; then all positive samples were cloned (22), resulting in a representative reflection of norovirus presence in bivalve shellfish. If we detected multiple genotypes in 1 sample, we recorded each genotype.

Assignment of Genotypes
Strains were genotyped by using a previously described method for sequence analysis of a fragment of the RNA-dependent RNA polymerase gene regions B, C, and D (23) because these regions were used in the FBVE network. From the start, the network used sequence-based genotyping of the then most commonly used diagnostic PCR fragment, targeting the RNA-dependent RNA polymerase gene. Since then, however, it has become clear that recombination is common and mainly occurs in the region of overlap between the polymerase and the capsid genes. Therefore, capsid-based and polymerase-based typing may be discordant. Genotype assignment was therefore performed only after clustering of query strains against all relevant available sequences in the FBVE database (M. Koopmans et al., unpub. data). This process resulted in the genotyping of all but 68 (2%) strains. Genotypes were classified on the basis of their similarity to reference strains representing known genotypes by using the norovirus typing library (www.noronet.nl/nov_quicktyping). If a (clustered) genotype occurred <5 times in our data selection covering 5 years, its frequency was considered too low for it to be treated as a separate genotype, and it was excluded. This was the situation for GII.18 and 6 clusters of nonassigned GII strains (n = 25, 1%).

Data Analysis
First, we compared the genotype frequency distributions detected in outbreak categories reported as FB-food, FB-feces, FHB, PB, and UN and in routinely tested bivalve mollusks. To evaluate the correlation and measures of association of these 6 proportional profiles, Pearson correlation coefficient ρ was calculated on the basis of frequencies (ρ₁) and logarithm (ρ₂) of the frequencies of 22 genotypes, as well as Cramer V and simulated p values by using 20,000 replications with the exact variant of the χ² test. The exploratory technique correspondence analysis allows for examining the structure of categorical variables in a multiway table and was used to visualize the measure of correspondence in the 6 genotype profiles. p values <0.05 were considered significant.
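The association measures named above can be illustrated with a minimal sketch. It is not the authors' code: the counts are hypothetical, the helper names are ours, and the log transform is assumed to be log(count + 1) so that genotypes with zero counts stay defined (the text does not say how zeros were handled). The simulated p values are omitted.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_counts(counts):
    # Assumption: log(count + 1), so zero counts do not break the logarithm.
    return [math.log1p(c) for c in counts]


def cramers_v(table):
    """Cramer V for a table of counts (rows = profiles, columns = genotypes).

    Assumes every row and every column has a nonzero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(table), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


# Hypothetical counts for 4 genotypes in 2 profiles (not the study data).
pb = [700, 120, 60, 5]
bm = [40, 30, 90, 60]

rho_1 = pearson(pb, bm)                          # based on frequencies
rho_2 = pearson(log_counts(pb), log_counts(bm))  # based on log frequencies
v = cramers_v([pb, bm])
print(f"rho_1={rho_1:.2f} rho_2={rho_2:.2f} Cramer V={v:.2f}")
```

Taking logarithms compresses the dominant genotypes, so ρ₂ is driven less by one or two peak frequencies than ρ₁ is.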
Second, to differentiate the remaining genotype profiles detected in outbreaks, we used the genotype profiles of the 2 main transmission modes to be distinguished during an outbreak investigation. For each genotype in the human surveillance collection, the fraction of outbreaks of known origin being FB (i.e., FB-food and FB-feces) or PB was estimated on the basis of the proportion of FB outbreaks of all FB + PB outbreaks in each genotype. We used the estimated proportion of FB outbreaks of all FB + PB outbreaks in each genotype to estimate the probability that an FHB or UN outbreak was foodborne. We calculated 95% confidence intervals (CIs) using Monte Carlo simulation with 10,000 random draws from the β distributions, which are the posterior probabilities of the proportions (24).
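The attribution step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the genotype counts are hypothetical, the function name `attribute_foodborne` is ours, and a uniform Beta(1, 1) prior is assumed, which gives a Beta(FB + 1, PB + 1) posterior for the foodborne proportion of each genotype (the text does not state the prior).

```python
import random


def attribute_foodborne(fb, pb, unknown, draws=10_000, seed=0):
    """Estimate how many outbreaks of unknown origin were foodborne.

    fb, pb: dicts genotype -> number of FB and PB outbreaks of known origin.
    unknown: dict genotype -> number of outbreaks to be attributed.
    Returns (mean, lower, upper) with a 95% interval from the simulation.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(draws):
        total = 0.0
        for genotype, n_unknown in unknown.items():
            # Assumed uniform prior: posterior is Beta(FB + 1, PB + 1).
            p_fb = rng.betavariate(fb.get(genotype, 0) + 1,
                                   pb.get(genotype, 0) + 1)
            total += n_unknown * p_fb
        totals.append(total)
    totals.sort()
    lower = totals[int(0.025 * draws)]
    upper = totals[int(0.975 * draws) - 1]
    return sum(totals) / draws, lower, upper


# Hypothetical counts for 2 genotypes (not the study data).
fb_counts = {"I.4": 8, "II.4": 1}
pb_counts = {"I.4": 2, "II.4": 9}
unknown_counts = {"I.4": 10, "II.4": 10}

mean, lower, upper = attribute_foodborne(fb_counts, pb_counts, unknown_counts)
print(f"estimated FB outbreaks: {mean:.1f} (95% interval {lower:.1f}-{upper:.1f})")
```

Each draw samples one foodborne proportion per genotype and applies it to the outbreaks of unknown origin for that genotype, so the resulting interval reflects uncertainty in the per-genotype proportions.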

Results
Of 3,022 detected noroviruses, 25 (1%) were excluded because of low frequencies; for 68 (2%), assignment of a genotype was not possible because of short sequences or inability of the method applied to type the detected norovirus beyond its genogroup. Of the remaining 2,929 strains, 71 (2%) could not be linked to epidemiologic data, and therefore their origin remained unknown, leaving 2,858 (95%) strains for analysis: 922 originating from PB outbreaks, 24 from FB-food outbreaks, 151 from FB-feces outbreaks, 20 from FHB outbreaks, 1,446 from UN outbreaks, and all 295 bivalve mollusk monitoring strains. Among the outbreaks of known origin, 175 (16%) of 1,117 were reported to be FB (i.e., FB-food and FB-feces).
We visualized genotype frequency distributions as profiles for the observed categories of outbreaks and sorted them for their relevance in UN outbreaks, presented with different scales allowing for proportional comparison (online Appendix Figure 1, www.cdc.gov/EID/content/16/4/617-appF1.htm). The genotype profiles vary between these groups. The correlation coefficients based on frequencies, ρ₁, showed that 2 genotype profiles were distinguishable (Table 2): 1 profile typically seen in human feces (FB-feces, FHB, or PB), and another profile typically detected in sources other than human feces, i.e., in food (FB-food) or bivalve mollusks. The ρ₁ reflects some genotypes frequently and others rarely seen in FB-food and bivalve mollusks. Because FB-food strains include oyster-related outbreaks as well, we assumed that the correlation between FB-food and bivalve mollusks can be explained partly by these oyster-related outbreaks. We therefore calculated an additional correlation coefficient using the 14 strains detected in food items other than bivalve mollusks. Despite low numbers, this calculation resulted in a high, significant correlation coefficient (ρ = 0.81, p<0.001). The logarithm of the frequencies, ρ₂ (Table 2), is less sensitive to peak frequencies of genotypes and therefore capable of differentiating profiles with respect to the rare genotypes and approaching the Cramer V. Cramer V and ρ₂ show less clear association of profiles, with diverging results for the FHB and UN profiles. Table 2 shows the quantification of association; the associated genotype profiles illustrated by correspondence analysis are shown in the Figure. The values of the 6 columns in Table 1 can be considered coordinates in a 6-dimensional space, and the distances are computed. These distances summarize information about the similarity between the rows in Table 1.
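The distance idea behind correspondence analysis can be sketched with the chi-square distance between two profiles. This is a minimal illustration, not the analysis used in the study: the counts are hypothetical, the function name is ours, and the full method additionally decomposes these distances into the dimensions shown in the Figure.

```python
import math


def chi_square_distance(counts, a, b):
    """Chi-square distance between the genotype profiles of categories a and b.

    counts: dict category -> list of genotype counts (same genotype order).
    Each profile is converted to proportions; squared differences are
    weighted by the inverse of the overall share (mass) of each genotype.
    """
    categories = list(counts)
    n_genotypes = len(counts[a])
    grand_total = sum(sum(values) for values in counts.values())
    mass = [sum(counts[c][i] for c in categories) / grand_total
            for i in range(n_genotypes)]
    total_a, total_b = sum(counts[a]), sum(counts[b])
    squared = 0.0
    for i in range(n_genotypes):
        if mass[i] == 0:
            continue  # genotype never observed; contributes nothing
        diff = counts[a][i] / total_a - counts[b][i] / total_b
        squared += diff ** 2 / mass[i]
    return math.sqrt(squared)


# Hypothetical counts for 3 genotypes (not the study data).
profiles = {
    "PB": [90, 5, 5],
    "FB-food": [20, 40, 40],
    "BM": [10, 20, 20],
}
print(chi_square_distance(profiles, "FB-food", "BM"))
print(chi_square_distance(profiles, "PB", "BM"))
```

Profiles with the same relative frequencies lie at distance zero regardless of how many strains each contains, which is why profiles of very different sizes can be compared.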
Dimension 1 may be considered to differentiate transmission modes, explaining 59.12% of the correspondence and confirming that the profiles found in bivalve mollusks and FB-food are similar with regard to the pattern of relative frequencies in genotypes (rows in Table 1) and differ from those in PB. It also shows that the FHB, UN, and FB-feces profiles are mutually similar, with their distance somewhere between the PB and FB-food/bivalve mollusk profiles. Dimension 2 may represent dual origin, explaining an additional 31.40% and showing that FB-feces, FHB, and UN mutually correspond and differ from FB-food, bivalve mollusks, and PB, which mutually correspond.
When we compared the proportions of genotypes detected in FB outbreaks with those in PB outbreaks, we detected genotypes I.2 and I.4 significantly more frequently in FB outbreaks (online Appendix Figure 2, www.cdc.gov/EID/content/16/4/617-appF2.htm). On the other hand, genotypes I.6, II.1W, II.2, II.4, II.b, II.c, and II.d were detected significantly more frequently in PB outbreaks. Using these proportional FB and PB genotype profiles and their confidence intervals to distinguish between FB and PB transmission among 20 FHB outbreaks, we could ascribe 5 (95% CI 4-6) to FB and 15 (95% CI 14-16) to PB transmission. Ascribing 1,446 unexplained human norovirus outbreaks to either FB or PB transmission resulted in ≈367 (95% CI 327-417) FB outbreaks and ≈1,079 (95% CI 1,026-1,120) PB outbreaks. Overall, use of the genotype patterns increases the estimated FB proportion of outbreaks to 21% (547/2,563; range 20%-23%) compared with the 16% previously mentioned among the outbreaks of known origin.
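The overall foodborne proportion follows from the counts reported in the Results. The short calculation below only restates that arithmetic; the variable names are ours.

```python
# Counts as reported in the Results.
fb_known = 24 + 151                 # FB-food + FB-feces outbreaks
known_origin = fb_known + 20 + 922  # plus FHB and PB outbreaks
fhb_ascribed_to_fb = 5              # of 20 FHB outbreaks
un_ascribed_to_fb = 367             # of 1,446 UN outbreaks
un_total = 1446

fb_estimated = fb_known + fhb_ascribed_to_fb + un_ascribed_to_fb
all_outbreaks = known_origin + un_total

reported_share = fb_known / known_origin
estimated_share = fb_estimated / all_outbreaks
fhb_share = fhb_ascribed_to_fb / 20

print(f"reported FB share among known origins: {reported_share:.0%}")
print(f"estimated FB share overall: {estimated_share:.0%}")
print(f"FHB outbreaks ascribed to source contamination: {fhb_share:.0%}")
```

The numerator 547 is the sum of 175 reported FB outbreaks, 5 reclassified FHB outbreaks, and 367 attributed UN outbreaks; the denominator 2,563 is the sum of 1,117 outbreaks of known origin and 1,446 UN outbreaks.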

Discussion
Our combined epidemiologic and virologic analysis demonstrated that norovirus genotype profiles, derived from long-term norovirus strain collections, can be used to differentiate foodborne outbreaks caused by food contamination early in the food chain from those caused by food handlers contaminating food. Our study is one step in deriving practical applicable information from the existing record and possible only through the availability of continuously updated databases containing detailed epidemiologic data and virus characterization. We confirmed a significant difference in the GI:GII ratio; GI strains were more prevalent in bivalve mollusks. On the basis of the 5-year strain collections, some genotypes (I.2 and I.4) suggest FB instead of PB preference, and others (II.2 and II.6/II.7) are commonly seen in outbreaks but not detected in bivalve mollusks (and FB-food). Strains detected in food that caused outbreaks (FB-food) showed a genotype profile similar to those in bivalve mollusk monitoring and dissimilar to the profile detected in human feces (i.e., FB-feces, FHB, PB, UN) with respect to the frequently seen genotypes. This finding may reflect the ability of these genotypes to survive outside humans or their diminished ability to spread or replicate within the human population. Genotype profiles of FHB and UN resulted in diverging association outcomes, which may reflect their potential dual origin.
Although consumption of contaminated food causes both types of outbreaks, outbreaks resulting from infected food handlers clearly necessitate different measures than do outbreaks resulting from food contaminated early in the food chain. Consequently, differentiation of these modes of transmission is of interest to food safety authorities and public health institutions. Food handler-borne outbreaks are end-of-chain outbreaks easily recognized as such, as numerous outbreak reports illustrate (25-29). Such outbreaks can be prevented or limited by exclusion of infected or shedding food handlers from work until 48-72 hours after recovery (25,27,29,30), education of food handlers (26), and standard testing of food handlers during outbreaks (28). A common source of contamination early in the food chain, however, may be more difficult to detect. Such contamination may result from sewage influx containing multiple viruses (8,9,31), making a link difficult to identify (31). Moreover, sewage most likely contains noroviruses from person-to-person outbreaks, which can contaminate the food and thereby dilute the genotype profiling effect. Use of genotype profiles is a first step toward recognizing outbreaks resulting from contamination early in the food chain because it allows estimation of the incidence in surveillance data retrospectively and objectively minimizes misclassification of outbreaks. However, genotyping data need to be interpreted with care, and continuous updating of the database remains necessary.
Table 2 note: ρ₁ = based on frequencies; ρ₂ = based on logarithm of frequencies; Cramer V, χ² test with simulated p values; FB-food, foodborne-food, i.e., an outbreak was reported to be caused by food and the outbreak strain was detected in food; FB-feces, foodborne-feces, i.e., an outbreak was reported to be caused by food and the outbreak strain was detected in human feces only; FHB, food handler-borne, i.e., an outbreak was reported to be caused by an infected food handler contaminating the food and the outbreak strain was detected in human feces; PB, person-borne, i.e., an outbreak was reported to be caused by person-to-person transmission and the outbreak strain was detected in human feces; UN, unknown, i.e., the mode of transmission was not reported or was reported to be unknown and the outbreak strain was detected in human feces.
Our study has some limitations. First, our measures of association could not detect differences between genotype profiles with respect to the rare genotypes. Even so, the rare outbreak or sporadic strains are of interest because they may represent potential emerging or zoonotic genotypes with consequences for public health. Types that were initially rare may remain in human surveillance, as seen with the emergence of GII.b after a large waterborne outbreak (10) followed by, among others, foodborne distribution throughout Europe. Since then, GII.b strains have caused 13% of all outbreaks (Table 1), now mainly PB, suggesting good adaptation.
On the other hand, if the rare types are unable to adapt for persistence in the human population, they may be repeatedly reintroduced, causing only sporadic cases but not outbreaks. This repeated introduction of sporadic cases would remain undetected at present because routine surveillance for sporadic cases is rare (32) and is not the current practice of FBVE. To identify the origin of newly emerging and rare strains, systematic monitoring of additional potential sources, such as cattle and swine (33) as well as sporadic human cases, is necessary.
Second, in our analysis, the transmission route was reported as unknown for 57% of outbreak strains. Incompleteness of surveillance data is a common problem (34) and has been recognized in surveillance of foodborne viral infections (35), including in the FBVE database (19,20). Incomplete data may have resulted in underestimation of the number of foodborne outbreaks because they may be complicated to identify. Food safety authorities routinely confirm FB clusters by detecting pathogens in food, but such confirmation is difficult for viruses because viruses, unlike bacteria, do not replicate in food, resulting in a low viral load for extraction and concentration. In addition, the matrix involved may complicate these procedures, and successful detection methods are available primarily for fresh produce with surface contamination and virus-accumulating shellfish (36,37). However, knowledge of the prevalence of strains in the environment, foods, and humans is necessary for the interpretation of matching. Such knowledge requires monitoring, which is limited to shellfish and norovirus outbreaks (38). For monitoring of foods other than shellfish, methods sensitive enough to detect viruses in naturally contaminated (and not spiked) food are required. The technical advisory group (TAG 4) of the Viruses in Food workgroup (WG 6) in the Technical Committee of Horizontal Methods for Food Analysis (TC 275) of the European Committee for Standardization (CEN) is validating standard methods for norovirus detection in bivalve mollusks, soft fruit, leafy vegetables, and bottled water (39). Until such methods are available and provide knowledge about the prevalence of viral presence in foods, the use of genetic profiles retrospectively derived from outbreak surveillance data is likely to improve foodborne viral surveillance. Because the norovirus strain population is continuously evolving, our analysis needs to be repeated periodically to ensure that retrospective findings remain predictive.
Figure. Two-dimensional display of the correspondence analysis of 6 norovirus genotype profiles based on nucleotide sequences in which points close to each other are similar with regard to the pattern of relative frequencies across genotypes. Dimension 1 explains 59.12% and dimension 2 an additional 31.40%. In dimension 1, foodborne-feces (FB-feces; i.e., outbreak reported to be caused by food with the outbreak strain detected in human feces only) and bivalve mollusk (BM) genotype profiles are mutually similar and differ from other profiles; the most distinct profile is person-borne (PB; i.e., an outbreak reported to be caused by person-to-person transmission with the outbreak strain detected in human feces). In dimension 2, food handler-borne (FHB; i.e., outbreak reported to be caused by an infected food handler contaminating the food with the outbreak strain detected in human feces), FB-feces, and unknown (UN; i.e., mode of transmission was not reported or was reported to be unknown with the outbreak strain detected in human feces) mutually correspond and differ from the mutually corresponding foodborne-food (FB-food; i.e., outbreak reported to be caused by food with the outbreak strain detected in food), BM, and PB.
Third, international comparison of norovirus strains is complicated because of their genetic diversity and the involvement of several laboratories in diagnosis; consequential different assays result in sequences with diverging lengths and from diverging genomic regions. However, this limitation is not likely to have influenced our results because it affects mostly the comparison of sequence clusters and not genotypes. Moreover, within FBVE, standardization of diagnostic methods occurs by having participating laboratories regularly test a representative panel of fecal samples (40).
We showed that norovirus genotype profiles can be used to estimate the foodborne proportion of norovirus outbreaks while excluding those of the food handler as a source. Distinction at genogroup level had already indicated epidemiologic differences (19), and we have now demonstrated that genotype profiles can be used to differentiate transmission modes. The profiles and proportions are likely to be helpful for estimating the number of outbreaks with potential of causing geographically disseminated outbreaks. Because identification and investigation of such outbreaks provides insight into effective prevention measures during the production process, detection should enable containment of viral foodborne infection and thus prevent further spread and the consequent potential for large numbers of human infections.