Salmonella enterica Pulsed-Field Gel Electrophoresis Clusters, Minnesota, USA, 2001–2007

This procedure can help identify outbreaks of salmonellosis.

Salmonellosis is a major foodborne illness that results in ≈1.4 million infections, 15,000 hospitalizations, and 400 deaths each year in the United States (1,2). Salmonella infections are primarily of foodborne origin but can also occur through contact with infected animals, humans, or their feces (3). The epidemiology of salmonellosis is complex largely because there are >2,500 distinct serotypes (serovars) with different reservoirs and diverse geographic incidences (4). Changes in food consumption, production, and distribution have led to an increasing frequency of multistate outbreaks associated with fresh produce and processed foods (5).
The development of molecular subtyping by pulsed-field gel electrophoresis (PFGE) has revolutionized Salmonella spp. surveillance. The National Molecular Subtyping Network for Foodborne Disease Surveillance (PulseNet) provides state and local public health department laboratories with standardized methods to subtype Salmonella serovars and normalize PFGE patterns against a global reference standard provided by the Centers for Disease Control and Prevention (CDC) (6,7). Molecular subtyping enhances case definition specificity, enabling outbreaks to be detected and controlled at an earlier stage, and enabling detection of geographically dispersed outbreaks (8–10).
Although the benefits of molecular subtyping, specifically by PFGE, in foodborne disease outbreak detection and investigation have been well established, there is no consensus about when a PFGE cluster warrants further investigation and almost no quantitative analysis about characteristics of PFGE clusters that indicate a common source will be identified (11–15). Cluster size and the number of days from receipt of the first cluster case isolate to the third case isolate received by the public health laboratory were predictors of a source of infection being identified for Listeria monocytogenes clusters in France (16). The objective of this study was to determine characteristics of Salmonella PFGE clusters that could serve as useful predictors for their being solved (i.e., result in identification of a confirmed outbreak). This information could help public health agencies with limited resources prioritize investigation of Salmonella PFGE clusters.

Materials and Methods
Salmonella infections are reportable to the Minnesota Department of Health (MDH) by state law (17). Clinical laboratories are required to forward all Salmonella isolates to the MDH Public Health Laboratory (PHL). PFGE subtyping after digestion with XbaI is conducted on all isolates as soon as they are received according to PulseNet protocols (18). PFGE subtypes are uploaded into the national PulseNet database (6). All Minnesota residents with a culture-confirmed Salmonella infection are routinely interviewed as soon as possible by MDH staff with a standard questionnaire about symptom history, food consumption, and other potential exposures occurring in the 7 days before onset of illness. The questionnaire contains detailed food exposure questions, including open-ended food histories and objective yes/no questions about numerous specific food items, as well as brand names and purchase locations. Clusters are investigated by using an iterative model in which suspicious exposures identified during initial case-patient interviews are added to the standard interview for subsequent cases (19–21). Similarly, initial cluster case-patients may be reinterviewed to ensure uniform ascertainment of the suspicious exposures. This iterative approach is used to identify exposures for further evaluation with formal hypothesis testing, product sampling, or product tracing (19).
A cluster was defined as ≥2 cases of salmonellosis in different households with isolates of the same serovar and PFGE subtype and with specimen collection dates within 2 weeks (22). Thus, a single cluster would be ongoing as long as a new isolate was collected within 2 weeks after the most recent isolate in the cluster. A cluster was considered solved if the epidemiologic evaluation of that cluster resulted in the identification of a common source of infection for those cases and consequently the documentation of a confirmed outbreak. Therefore, the terms solved cluster and confirmed outbreak are equivalent and used interchangeably.
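The rolling 2-week definition above can be illustrated with a short sketch. This is not the MDH implementation. The field names (`serovar`, `subtype`, `household`, `collected`), the subtype labels, and the example cases are hypothetical, and "within 2 weeks" is read here as a gap of at most 14 days between consecutive specimen collection dates.

```python
from collections import defaultdict
from datetime import date, timedelta

WINDOW = timedelta(days=14)  # "within 2 weeks", read here as <= 14 days (assumption)


def _keep_if_cluster(pattern, run, clusters):
    # A run counts as a cluster only if it spans at least 2 different households.
    if len({c["household"] for c in run}) >= 2:
        clusters.append((pattern, run))


def find_clusters(cases):
    """Group cases into clusters of the same serovar and PFGE subtype.

    Each case is a dict with hypothetical keys: serovar, subtype,
    household, and collected (specimen collection date).
    """
    by_pattern = defaultdict(list)
    for case in cases:
        by_pattern[(case["serovar"], case["subtype"])].append(case)

    clusters = []
    for pattern, group in by_pattern.items():
        run = []
        for case in sorted(group, key=lambda c: c["collected"]):
            # A gap longer than the window closes the ongoing run.
            if run and case["collected"] - run[-1]["collected"] > WINDOW:
                _keep_if_cluster(pattern, run, clusters)
                run = []
            run.append(case)
        _keep_if_cluster(pattern, run, clusters)
    return clusters


# Hypothetical cases, for illustration only.
example_cases = [
    {"serovar": "Typhimurium", "subtype": "A", "household": 1, "collected": date(2005, 1, 1)},
    {"serovar": "Typhimurium", "subtype": "A", "household": 2, "collected": date(2005, 1, 10)},
    {"serovar": "Typhimurium", "subtype": "A", "household": 3, "collected": date(2005, 1, 20)},
    {"serovar": "Typhimurium", "subtype": "A", "household": 4, "collected": date(2005, 3, 1)},
    {"serovar": "Enteritidis", "subtype": "B", "household": 5, "collected": date(2005, 1, 2)},
    {"serovar": "Enteritidis", "subtype": "B", "household": 5, "collected": date(2005, 1, 3)},
]
```

In this example the first 3 Typhimurium cases form one cluster because each was collected within 14 days of the previous one, even though the first and third are 19 days apart. The fourth case and the 2 same-household Enteritidis cases do not form clusters.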

Inclusion and Exclusion Criteria
Laboratory-confirmed cases of nontyphoidal Salmonella enterica infection among Minnesota residents with specimen collection dates from January 1, 2001, through December 31, 2007, for which isolates were received and subtyped by MDH PHL were included in the study. Isolates not received through routine surveillance (i.e., testing was requested or conducted by MDH as a part of an ongoing investigation) were excluded from the analysis.
Solved clusters were included if they were detected and identified solely on the basis of investigation of cases identified through submission of isolates to MDH for routine laboratory surveillance. Solved clusters for which a call to the MDH foodborne disease hotline (www.health.state.mn.us/divs/idepc/dtopics/foodborne/reporting.html) (e.g., from the public or a healthcare provider) directly contributed to the identification of an outbreak were excluded from analysis. Secondary clusters, defined as clusters in which the cases were part of a confirmed outbreak that had been previously identified, were also excluded from analysis. Clusters that were part of a probable outbreak (an epidemiologic evaluation suggested, but did not confirm, a common source of infection) were also excluded.

Study Variables
Variables incorporated into the analysis were cluster year, cluster size, cluster case density, cluster serovar, cluster subtype, and cluster serovar diversity. Cluster size was defined as the number of cases in each cluster and was categorized into cluster sizes of 2, 3, 4, and ≥5. For clusters in which a common source was identified, only cases received before the cluster was solved were included. Cluster case density was defined as the number of days from receipt date of the first cluster isolate at MDH PHL to the receipt date of the third cluster isolate and was categorized into cluster case densities of 0, 1-7, 8-14, and >14 days (16).
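A minimal sketch of how the size and case density categories above could be derived from isolate receipt dates follows. The function names are hypothetical. Because the text does not state how density was handled for clusters with fewer than 3 isolates, the sketch returns `None` in that case, which is an assumption.

```python
from datetime import date


def case_density_days(receipt_dates):
    """Days from receipt of the first cluster isolate to receipt of the third.

    Returns None when fewer than 3 isolates were received (assumption:
    the source does not say how such clusters were treated).
    """
    ordered = sorted(receipt_dates)
    if len(ordered) < 3:
        return None
    return (ordered[2] - ordered[0]).days


def density_category(days):
    """Map a case density in days to the study categories 0, 1-7, 8-14, >14."""
    if days is None:
        return None
    if days == 0:
        return "0"
    if days <= 7:
        return "1-7"
    if days <= 14:
        return "8-14"
    return ">14"


def size_category(n_cases):
    """Map a cluster size to the study categories 2, 3, 4, and >=5."""
    if n_cases < 2:
        raise ValueError("a cluster has at least 2 cases")
    return str(n_cases) if n_cases < 5 else ">=5"
```

For example, isolates received on January 3, 5, and 12 give a case density of 9 days, which falls in the 8-14 day category.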
Cluster serovar was coded as a categorical variable on the basis of serovar frequency. Serovars representing >20% of all isolates (Typhimurium and Enteritidis) were categorized as very common, those representing 3%-20% (Newport, Heidelberg, and Montevideo) as common, and those representing <3% (all other serovars) as uncommon. The relationship between common and uncommon PFGE subtypes and solving a cluster was examined for serovars Typhimurium and Enteritidis. For serovar Typhimurium, clusters with CDC PFGE subtype designations JPXX01.0003, JPXX01.0410, and JPXX01.0111 (each representing >8% of all Typhimurium isolates) were categorized as common, and all other subtypes were categorized as uncommon. For serovar Enteritidis, clusters with CDC PFGE subtype designations JEGX01.0004 and JEGX01.0030 (each representing >20% of all Enteritidis isolates) were categorized as common, and all other subtypes were categorized as uncommon.
Cluster serovar diversity was examined by categorizing the 17 most frequent serovars into highly clonal or low clonality serovars on the basis of the Simpson diversity index (23). Serovars with a Simpson index score <0.90 were considered highly clonal, and serovars with a Simpson index score >0.90 were considered to have low clonality. Cluster investigation thresholds were examined by comparing the percentage of outbreak clusters meeting a threshold, cluster investigation positive predictive value, and estimated interview burden in hours per year for various investigational thresholds. The time required to interview each patient with a Salmonella infection by using the MDH standard questionnaire was recorded for a 6-month period in 2008, and the median interview time was calculated.
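The diversity classification can be sketched as follows, using the standard Simpson index of diversity, 1 − Σn(n − 1)/(N(N − 1)), where n is the number of isolates of each subtype and N is the total number of isolates of a serovar. The function names and the subtype labels are hypothetical. The text does not say how a score of exactly 0.90 was classified, so assigning it to low clonality here is an assumption.

```python
from collections import Counter


def simpson_diversity(subtypes):
    """Simpson index of diversity for the PFGE subtypes of one serovar.

    subtypes: one subtype label per isolate.
    Returns 1 - sum(n*(n-1)) / (N*(N-1)); 0 means no diversity.
    """
    counts = Counter(subtypes)
    total = sum(counts.values())
    if total < 2:
        raise ValueError("at least 2 isolates are needed")
    same_pairs = sum(n * (n - 1) for n in counts.values())
    return 1 - same_pairs / (total * (total - 1))


def clonality(index, cutoff=0.90):
    """Classify a serovar by its Simpson index (exactly 0.90 -> low clonality, assumed)."""
    return "highly clonal" if index < cutoff else "low clonality"


# Hypothetical serovar with 4 isolates: 2 share a subtype, 2 are unique.
example_index = simpson_diversity(["a", "a", "b", "c"])
```

With these 4 hypothetical isolates, 1 of the 6 possible isolate pairs shares a subtype, so the index is 1 − 2/12 ≈ 0.83 and the serovar would be classed as highly clonal.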

Statistical Analysis
A descriptive analysis was conducted to characterize the frequency of Salmonella serovars and subtypes. Mantel-Haenszel χ² test for trend was used to characterize temporal trends in the number of Salmonella clusters that were solved. Two-sided Wilcoxon rank-sum tests were used to compare the median cluster size and cluster density of point source and non-point source outbreaks. Univariate analysis was performed to calculate odds ratios (ORs) and 95% confidence intervals (CIs) characterizing the crude associations between Salmonella cluster serovar, cluster PFGE subtype, cluster serovar diversity, cluster size, and cluster case density and a cluster being solved. Mantel-Haenszel χ² tests for trend and interaction terms were used to investigate the linear nature of the relationship between cluster size, cluster case density, and the outcome. SAS software version 9.1 (SAS Institute, Cary, NC, USA) was used for descriptive and univariate analysis. An α value <0.05 was considered significant.
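As an illustration of a crude OR with a 95% CI, the sketch below uses the log-based (Woolf) interval. The analysis itself was run in SAS and the interval method is not stated, so this may not reproduce the published intervals. The counts in the example are hypothetical and are not study data.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio and Woolf (log-based) confidence interval.

    a: solved clusters in the comparison category
    b: unsolved clusters in the comparison category
    c: solved clusters in the reference category
    d: unsolved clusters in the reference category
    All four cells must be > 0 for this approximation.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 10 of 40 clusters solved vs. 10 of 100 in the reference group.
example_or, example_lower, example_upper = odds_ratio_ci(10, 30, 10, 90)
```

An interval that excludes 1 corresponds to a difference that is significant at α = 0.05 under this approximation.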

Cluster and Outbreak Characteristics
During 2001-2007, a total of 376 Salmonella PFGE clusters were detected; they represented 1,399 (35%) isolates. Thirty-two (8.5%) clusters were excluded from analysis (21 secondary clusters, 7 clusters in which a hotline call directly contributed to identification of an outbreak, and 4 probable outbreak clusters). Forty-three (12.5%) of the 344 clusters included in the analysis were solved.
During 2001-2007, a total of 65 confirmed Salmonella outbreaks involving Minnesota cases were identified; these represented 502 (12.5%) isolates. Twenty-two (34%) outbreaks were excluded from analysis (6 were multistate outbreaks in which only 1 case was identified in Minnesota; in 7 outbreaks, a hotline call contributed to identification of the outbreak; 1 was an outbreak that was not detected by PFGE; 4 were outbreaks that did not have cases that met the cluster definition; and 4 outbreaks were considered probable). The remaining 43 outbreaks, representing 287 (7%) isolates, were included in the analysis and were composed of 35 foodborne, 6 person-to-person, and 2 animal contact outbreaks. Of these 43 outbreaks, 30 (70%) involved 1 facility (restaurant, daycare center, school) or event and therefore were classified as point source. Thirteen (30%) involved commercially distributed food items at multiple points of sale (grocery stores, restaurants) and therefore were classified as non-point source. The median cluster size of point source outbreaks was 3 cases, and the median cluster size of non-point source outbreaks was 5 cases (p<0.01, by Wilcoxon rank-sum test). The median cluster density was 6 days for point source and non-point source outbreaks (p = 0.74 by Wilcoxon rank-sum test).

Temporal Trends
During the study period, the median number of Salmonella isolates subtyped per year was 567 (range 507-662 isolates). The median number of Salmonella clusters per year was 50 (range 44-57 clusters). The median number of confirmed Salmonella outbreaks per year was 6 (range 4-8 outbreaks). There were no statistically significant trends in the proportion of Salmonella clusters that resulted in identification of a confirmed outbreak (p = 0.20) (Figure 2).

Cluster Serovar
Clusters of the common Salmonella serovars Newport, Heidelberg, and Montevideo had 2.7× higher odds of being solved than did clusters of the very common serovars Enteritidis and Typhimurium (Table 2). The proportion of uncommon serovar clusters that were solved did not differ significantly from the proportion of very common or common serovar clusters that were solved (Table 2). Low clonality serovar clusters were not significantly more likely to be solved than highly clonal serovar clusters (OR 1.6, 95% CI 0.8-3.1).

Cluster Subtype
No significant associations between the subtype frequency of a cluster and a cluster being solved were observed. Uncommon serovar Enteritidis subtype clusters were not significantly more likely to be solved than were common clusters (OR 1.4, 95% CI 0.4-5.1). Uncommon serovar Typhimurium subtype clusters were not significantly more likely to be solved than were common clusters (OR 0.9, 95% CI 0.3-3.2).

Cluster Size
The probability of a cluster being solved increased significantly as the number of cluster cases increased (Mantel-Haenszel χ² for trend 13.7, p<0.001) (Table 2). The odds of solving a cluster of ≥5 cases were 3.8× higher than the odds of solving a cluster of 2 cases. Clusters of 4 cases were 3.9× more likely to be solved than were clusters of 2 cases. Twenty-four percent of clusters with ≥4 cases were solved (Table 2). Clusters of 3 cases were 2.1× more likely to be solved than clusters of 2 cases, but the difference was not statistically significant. There was statistical evidence of a nonlinear relationship between cluster size and solving the cluster (Wald χ² for interaction 5.0, p = 0.03). The dose response between cluster size and solving a cluster plateaued after a cluster size of 4.

Cluster Case Density
The proportion of clusters solved increased significantly as the density of cluster cases increased (Mantel-Haenszel χ² for trend, 12.7, p<0.001) (Table 2). The odds of solving a cluster if the first 3 case isolates were received on the same day were 25.8× higher than the odds of solving a cluster in which the first 3 case isolates were received during a period >14 days (Table 2). The odds of solving a cluster if the first 3 case isolates were received within 1-7 days were 5.0× higher than the odds of solving a cluster in which the first 3 case isolates were received during a period >14 days. Clusters in which the first 3 case isolates were received within 8-14 days were 2.8× more likely to be solved than clusters in which the first 3 case isolates were received during a period >14 days, but the difference was not statistically significant (Table 2). There was statistical evidence of a nonlinear relationship between cluster case density and solving the cluster (Wald χ² for interaction, 6.96, p<0.01).

Cluster Investigation Threshold
During June-December 2008, 10 MDH staff interviewed 214 persons with Salmonella infections and recorded the time required to complete the MDH standard questionnaire. Interview times did not vary between interviewers. The median interview time was 27 minutes (range 13-56 minutes). Therefore, conducting standard interviews of all cases in the 344 clusters of ≥2 cases (n = 1,182 [31%] cases) required an estimated 76 interview hours/year. This threshold detected all 43 outbreaks identified through routine laboratory surveillance during the study period and resulted in a cluster investigation positive predictive value (percentage of clusters investigated that were solved) of 13% (Table 3). Other cluster investigation thresholds had outbreak detection sensitivities of 53%-81% and positive predictive values of 23%-28% (Table 3).
Note on the Simpson diversity index: the index is calculated as 1 − (Σn(n − 1)/N(N − 1)), where n is the number of isolates of each subtype and N is the total number of isolates of a serovar. A value of 1 indicates infinite diversity, and a value of 0 indicates no diversity.
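The interview burden and positive predictive value for the threshold of all clusters can be reproduced from the figures reported above: 1,182 cluster cases, a median of 27 minutes per interview, a 7-year study period, and 43 solved clusters among 344 investigated. The function names are hypothetical, and this is a sketch of the arithmetic rather than the authors' calculation.

```python
def interview_hours_per_year(n_cases, median_minutes=27, study_years=7):
    """Estimated interview hours per year for n_cases interviews over the study period."""
    return n_cases * median_minutes / 60 / study_years


def positive_predictive_value(clusters_solved, clusters_investigated):
    """Share of investigated clusters that were solved."""
    return clusters_solved / clusters_investigated


def sensitivity(outbreaks_detected, outbreaks_total=43):
    """Share of the 43 surveillance-detected outbreaks that a threshold still captures."""
    return outbreaks_detected / outbreaks_total


hours_all_clusters = interview_hours_per_year(1182)   # about 76 hours/year
ppv_all_clusters = positive_predictive_value(43, 344)  # 0.125, reported as 13%
```

The same functions can be applied to stricter thresholds once the number of cases, clusters, and outbreaks meeting each threshold is known. Those counts appear in Table 3 and are not reproduced here.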

Discussion
During the study period, 344 Salmonella PFGE clusters were identified and 43 (13%) were solved. Cluster size and cluster case density were the most useful predictors of a cluster being solved. The proportion of clusters that were solved increased as the number of cases in the cluster increased (up to 4 cases). The association was not linear and the percentage solved did not increase further for clusters with ≥5 cases. The observed association is logical because as the number of cluster cases increases, the amount of epidemiologic data available for evaluation also increases. Our results suggest that public health officials should not wait to investigate Salmonella clusters if ≥4 cluster cases have been received.
The ability to solve a cluster of cases of Salmonella infection was also strongly associated with the density of the cluster cases. The proportion of clusters that were solved increased as the density of the cluster cases increased, but this relationship was not linear. This association is also logical. Dense clusters increase the likelihood that the cluster cases are epidemiologically linked rather than unrelated sporadic cases. In addition, dense clusters also likely signal larger outbreaks. Our results demonstrated a clear increase in the success of solving clusters in which the first 3 case isolates were received within 7 days.
In theory, PFGE subtyping is less useful for recognizing clusters of unusual serovars worth investigating. In the current study, clusters of the common serovars Newport, Montevideo, and Heidelberg were statistically more likely to be solved than clusters of the very common serovars Enteritidis and Typhimurium. However, clusters of uncommon serovars were not more likely to be solved than were clusters of common or very common serovars. It has been suggested that uncommon serovar clusters may be associated with uncommon food vehicles, which makes them more difficult to solve by using standard methods (24). The relationship between serovar frequency and the likelihood of solving a cluster is unclear and warrants further study. The limited number of solved clusters prevented multivariate analysis from being used to characterize the independent effect of predictors and possible effect modification between predictors. However, comparing the magnitude of the estimated effect of cluster size and cluster case density suggests that cluster case density may be a more useful predictor of a cluster being solved.
1682 Emerging Infectious Diseases • www.cdc.gov/eid • Vol. 16
The 22 confirmed outbreaks that were excluded from the analysis demonstrate the value of national collaboration such as PulseNet and of using outbreak detection methods in addition to PFGE clustering within a given state. Six outbreaks were solved in which Minnesota had only 1 case, which demonstrated the utility of molecular subtyping in detecting geographically dispersed outbreaks. For 7 confirmed outbreaks, a call placed to the MDH foodborne illness hotline contributed to identification of the outbreak and demonstrated the utility of complaint systems in detecting outbreaks.
Interviews required a median of 27 minutes per person with Salmonella infection when the MDH standard questionnaire was used. By extrapolation, MDH staff spent ≈244 hours/year conducting routine interviews of persons with Salmonella infections. This figure does not include time spent attempting to reach persons, gathering demographic information from clinicians, or reinterviewing persons for cluster investigations. We recommend interviewing all persons with Salmonella infection and investigating all PFGE clusters to identify as many outbreaks as possible. However, many health departments do not have the resources to interview all persons with Salmonella infection or investigate all small clusters. Rather, they must balance the time required for these efforts and the ability to detect outbreaks (25).
Incorporating a cluster investigation threshold on the basis of cluster size and cluster case density can decrease the number of unsuccessful cluster investigations and conserve public health resources. However, this approach would also reduce the number of outbreaks that would be identified. One reason for this finding is that outbreaks that are manifested as smaller, less dense clusters would not be investigated. Another potential disadvantage of a cluster threshold approach is that delay of interviews until a cluster is solved can decrease the quality of exposure information obtained and therefore the likelihood that the cluster will be solved (12).
Four confirmed outbreaks during the study did not meet the cluster definition, and many confirmed outbreaks had cases that were outside the cluster definition. This finding is an important reminder that lack of temporal clustering does not eliminate the possibility of an outbreak. Increasing the period covered by a cluster definition will yield the benefit of solving more outbreaks. However, more resources will be expended conducting unsuccessful cluster investigations. The results of this study suggest that the use of a 2-week cluster window is sufficiently sensitive to detect most outbreaks. However, in practice, MDH epidemiologists do not use a strict 2-week cluster window when investigating clusters. Instead, all persons with Salmonella infection are interviewed and cases with matching PFGE patterns are often compared even if the second case is received >2 weeks after the first case.
The potential utility of the cluster investigation thresholds reported is based on the characteristics of the population of Minnesota and MDH surveillance methods: conducting real-time PFGE subtyping of all Salmonella isolates, interviewing all case-patients in real time by using a detailed exposure questionnaire from a central location for the entire state, and investigating clusters by using an iterative model (19–21). These factors aid in the timeliness of outbreak detection and investigation in Minnesota. These results may not be applicable in jurisdictions in which PFGE is not conducted in real time or batching of PFGE isolates occurs. Additional studies at the national level and in other states are needed to understand surveillance characteristics in other states and determine useful predictors of multistate clusters being solved. Although successful cluster investigations will depend on the experience and ability of public health staff involved, this study demonstrates the increased probability of a cluster being solved as the number of cases in a cluster increases and as the cluster density increases. Specifically, investigation of PFGE clusters of ≥4 Salmonella case isolates and clusters in which the first 3 cases were received at the MDH PHL within 1 week yielded a major benefit in terms of outbreak identification. These results establish a benchmark for surveillance of Salmonella infections, and may provide a basis for investigating clusters of Salmonella cases for public health agencies with limited resources.