L. monocytogenes in produce operations are likely to have the ability to cause human disease.
Overall, our data suggest that a large proportion of the LM isolates obtained from the United States produce operations included in this study have the ability to cause human disease, as supported by (i) the classification of a considerable number of isolates into hypervirulent and outbreak-associated CCs, (ii) the frequent presence of LIPI-3 and LIPI-4, and (iii) the infrequent presence of strains with virulence-attenuating inlA PMSCs. More specifically, the most frequent CC (CC388), which represented 15 of the 169 LM isolates in this study, was associated with a pork-related outbreak that sickened more than 200 people in Spain in 2019 (32, 33). The next 3 most frequent CCs in this study were CC4 (n = 13), CC6 (n = 12), and CC1 (n = 11), which have also been associated with human illness (15, 34–38) and have all been suggested to represent hypervirulent LM (15). In addition to other outbreaks (38, 39), CC4 has been associated with a 2013 outbreak in Switzerland that was linked to salad (39). We also identified two CC7 isolates, and this CC was associated with the 2011 United States cantaloupe outbreak (39), indicating that CCs that have previously been associated with produce-related outbreaks continue to be found in produce processing facilities.
Only 9/169 (5%) of the LM isolates in this study were found to have inlA PMSCs, which have been shown to result in virulence attenuation of LM isolates due to their low invasion efficiency into intestinal epithelial cells (40, 41). These isolates were categorized into two CCs that have historically been isolated from food sources rather than from clinical sources: CC199 and CC9 (15, 42). More specifically, a previous study of 6,633 isolates from France, including 2,584 clinical isolates, classified CC9 as a food-associated clone that rarely causes human disease (15). A different study of 300 LM isolates (117 clinical) from 5 continents included 6 isolates from CC199, which were all from food or environmental sources (42). Our finding that only 5% of the LM isolates in our study had inlA PMSCs is in contrast to previous studies that suggest that inlA PMSCs are common among isolates from RTE foods, processing plants, and retail environments (40, 41, 43–45). For example, Van Stelten et al. found that 45% of 502 isolates from RTE foods (i.e., bagged salads, fresh soft cheeses, soft ripened cheeses, smoked seafood, seafood, and deli salads and meats) carried an inlA PMSC, compared to 5% of human clinical isolates (n = 507) (43).
Several of the isolates included in this study had full or partial matches to the pathogenicity islands LIPI-3 (59/169) or LIPI-4 (44/169), including 20/169 isolates with matches to both. Both LIPI-3 and LIPI-4 have been shown to be associated with hypervirulence (
15,
46). Overall, LIPI-3 was found in 35% of the LM isolates characterized in our study. By comparison, Chen et al. (
12) found that 25% of the 102 LM isolates (recovered from 27,389 United States refrigerated RTE food samples) carried LIPI-3, while Kim et al. (
14) found that 37% of 121 LM isolates (recovered from milk, milk filters, and milking equipment on bovine dairy farms) had LIPI-3. However, Hurley et al. (
13) reported that only 10% of 100 LM lineage I and II isolates from food processing environments carried LIPI-3. More specifically, LIPI-3 was found among 74%, 9%, and 0% of the lineage I, II, and III LM isolates, respectively, that were characterized here. While LIPI-3 is generally thought to be restricted to lineage I (
29), previous studies (
13,
14) have found at least one lineage II isolate that had partial matches to LIPI-3. The frequencies of LIPI-3 per lineage found in our study (74%, 9%, and 0% of lineages I, II, and III, respectively) were higher than or consistent with those reported in previous studies. Kim et al. (
14) found that 73%, 2%, and 0% of lineage I, II, and III isolates had LIPI-3, respectively. Hurley et al. (
13) found that 39% and 1% of lineage I and II isolates had LIPI-3, respectively. Chen et al. (
12) found that 51% of the lineage I isolates, but none of the lineage II or III isolates, had LIPI-3.
Overall, LIPI-4 was found in 26% of the LM isolates characterized in our study. This is higher than the percentages reported by Kim et al. (
14), Chen et al. (
12), and Hurley et al. (
13), who found that 17%, 15%, and 1% of isolates had LIPI-4, respectively. More specifically, our study identified LIPI-4 in 44%, 8%, and 23% of lineage I, II, and III isolates, respectively. Kim et al. (
14) found LIPI-4 among 32%, 0%, and 33% (1 isolate) of lineage I, II and III isolates, respectively. Chen et al. (
12) found LIPI-4 among 30.6% of lineage I isolates and none of the lineage II or III isolates in their study. Hurley et al. (
13) found LIPI-4 in 4% of lineage I isolates and in no lineage II isolates. The fact that we identified LIPI-4 among lineage II isolates is surprising and may require further follow-up studies (e.g., long range sequencing) to confirm the presence of LIPI-4, determine the genomic location of LIPI-4, and probe for possible lateral transfer events that may have introduced LIPI-4 into lineage II strains.
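As a consistency check on the figures above, the per-lineage percentages can be converted back into isolate counts and compared with the overall totals (59/169 for LIPI-3 and 44/169 for LIPI-4). The sketch below is illustrative only. The lineage II (66) and lineage III (31) counts are those reported elsewhere in this section; the lineage I count (72) is not stated directly and is inferred here by subtraction, under the assumption that every isolate belongs to lineage I, II, or III.

```python
TOTAL_ISOLATES = 169

# Lineage II and III counts as reported in this section.
LINEAGE_COUNTS = {"II": 66, "III": 31}
# Assumption: all remaining isolates are lineage I (169 - 66 - 31 = 72).
LINEAGE_COUNTS["I"] = TOTAL_ISOLATES - sum(LINEAGE_COUNTS.values())

# Rounded per-lineage percentages as reported in the text.
REPORTED_PERCENT = {
    "LIPI-3": {"I": 74, "II": 9, "III": 0},
    "LIPI-4": {"I": 44, "II": 8, "III": 23},
}

# Overall counts (out of 169) as reported in the text.
REPORTED_TOTAL = {"LIPI-3": 59, "LIPI-4": 44}


def implied_total(percent_by_lineage):
    """Sum the isolate counts implied by rounded per-lineage percentages."""
    return sum(
        round(LINEAGE_COUNTS[lineage] * percent / 100)
        for lineage, percent in percent_by_lineage.items()
    )


for island, percents in REPORTED_PERCENT.items():
    print(island, "implied:", implied_total(percents), "reported:", REPORTED_TOTAL[island])
```

Because the percentages in the text are rounded, a check of this kind can only confirm approximate agreement; it cannot recover the exact per-lineage counts.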
There was a surprisingly high proportion of isolates in this study (31/169) that were categorized as lineage III, which has been historically found to be underrepresented among isolates from human clinical cases (
47). However, lineage III has also been considered to be underrepresented among isolates from foods and instead appears to be more commonly associated with food-production animal sources, particularly ruminants (
47,
48), with a recent report also indicating a high prevalence of lineage III isolates among isolates collected from natural environments across the United States (
49). This suggests that a considerable proportion of produce-associated isolates could come from an animal source or from a yet-to-be-identified source of lineage III isolates that is common to farm animals and produce operations.
While a number of L. monocytogenes isolates from produce operations are likely to have stress response islands, few contain genes that confer reduced sensitivity to metals, detergents, or quaternary ammonium compounds.
In total, there were slightly fewer isolates that had either SSI-1 (55/169) or SSI-2 (12/169), compared to the proportions of isolates with LIPI-3 or LIPI-4. Various studies have shown a diverse range of occurrence of SSI-1 among LM isolates characterized, ranging from 33% to 70% (
12–14,
50–53). When comparing within lineages, our data showed that 1%, 53%, and 61% of lineage I, II, and III isolates had SSI-1, respectively. In comparison, Chen et al. (
12) found SSI-1 in 35%, 80%, and 100% of lineage I, II, and III isolates, respectively, while Kim et al. (
14) found SSI-1 in 47%, 43%, and 33% of lineage I, II, and III isolates, respectively. The 33% SSI-1 prevalence among the isolates characterized here is lower than what was reported by Chen et al. (
12), Hurley et al. (
13), and Kim et al. (
14), who found that 57%, 51%, and 45% of isolates had SSI-1, respectively. For SSI-2, our data showed that 0%, 3%, and 32% of lineage I, II, and III isolates had SSI-2, respectively. Fewer studies appear to screen isolates for the SSI-2 operon, with most studies showing few isolates (0 to 5%) having the operon (
12,
13,
52,
53) and Hurley et al. (
13) showing 12% of isolates (16% of lineage II isolates) having SSI-2. Overall, compared to previous studies, our study found a lower frequency of LM isolates with SSI-1 (
12–14,
50–53) but a higher frequency of LM isolates with SSI-2 (
12,
13,
52,
53), possibly because the isolates characterized here included a larger number of lineage III isolates.
Our study found that only a few isolates contained any of the selected genes we screened for from the “metal and detergent resistance” category (see Materials and Methods). The only genes detected in our isolates were those representing the
bcrABC resistance cassette, which has been shown to confer reduced sensitivity to a quaternary ammonium (“quat”) compound called benzalkonium chloride, which is commonly used for sanitation in food production environments.
bcrABC was found in 10/169 (6%) of the LM isolates studied here, including all 4 isolates from Cluster 7, which appear to have persisted in facility CU-F. The 10 isolates with
bcrABC found here were all from lineage II (10/66, 15%), and the proportion of isolates with
bcrABC is lower than what was found by Chen et al. (
12), who found this cassette in 10/49 (20%) and 35/51 (69%) of lineage I and lineage II isolates, respectively. For the produce-specific isolates within that study (
12), 2/14 (14%) and 9/14 (64%) of the lineage I and II isolates had
bcrABC, respectively. A study investigating 100 LM isolates from three meat and vegetable processing facilities found that 19% of the isolates had the
bcrABC cassette (
13). A different study that characterized 15 produce-associated LM isolates in the United Kingdom found that 2/15 (13%) isolates had
bcrABC. Our data suggest that
bcrABC presence may be less common in isolates found in the produce-associated operations studied here, compared to previous studies. This could be at least partially due to the fact that a considerable proportion of the LM isolates characterized here (i.e., 91%) were obtained from packinghouses, which may be less likely to use quaternary ammonium compounds. Additionally, it is important to note that
cadAC,
emrE, and
qacAH were not detected in any of the isolates studied here. Overall, our findings are consistent with those reported in previous studies (
54,
55) that have not identified a strong association between the presence of specific “persistence” genes and persistent contamination. While the findings to date could suggest that the establishment of persistence may include a strong element of chance (meaning persistence is likely to occur when an appropriate strain is introduced into a facility location that represents a potential niche where
Listeria would not be removed by sanitation), further studies that use even larger isolate sets than those described here and tools such as genome-wide association to identify new genetic markers that are putatively associated with persistence would be valuable. In addition to the large sample sizes needed for these types of studies, a continued challenge with these types of studies will be classifying isolates as truly “sporadic”. Isolates may be misclassified as sporadic if they persist in locations that are difficult to sample or are not sampled for other reasons.
Both sporadic and persistent Listeria spp. and LM contribute to the environmental contamination of produce facilities.
In addition to 141 isolates that did not fall into any hqSNP cluster, we also found that 19/45 clusters in this study (representing 42 isolates) were composed of isolates from a single operation obtained on a single date but from different sites. Hence, the majority of Listeria- or LM-positive sites appear to be due to sporadic contamination, with some representing short-term Listeria spread within an operation; in these cases, contamination was apparently controlled via the standard cleaning and sanitation practices that were in place. However, our data showed the re-isolation (detection on separate dates that are at least 60 days apart) of highly related isolates (<10 hqSNP differences) in 7/16 operations (5 packinghouses and 2 fresh-cut facilities), suggesting that persistent contamination (or reintroduction, as discussed further below) is still frequent among United States produce operations (Table S5). Additionally, our data indicate that persistent
Listeria contamination can occur in both packinghouses and fresh-cut facilities. This is consistent with previous studies that found that a significant proportion of food-associated operations show evidence for LM or
Listeria persistence. For example, in a study of 9 small cheese processing facilities, 7 facilities showed evidence for
Listeria spp. persistence (
20). Similarly, in a study of 30 deli operations, 12 showed evidence for persistence (
18). More specifically, we identified 6 LM clusters that had isolation dates spanning >1 year, providing evidence for long-term persistence. This includes 3 clusters (representing 2 fresh-cut facilities) that had isolates from a single operation that were highly related (<10 hqSNPs) and detected >1 year apart. Importantly, these 2 fresh-cut facilities were operated continuously, whereas the packinghouses in this study, from which data were collected over >1 year, were operated seasonally. Therefore, the packinghouses all had an “off-season” in which equipment was down and able to be disassembled, cleaned, and dried for an extended period. This could potentially explain why we did not detect highly related isolates among those collected >1 year apart in any of the participating packinghouse operations, which is consistent with the findings of a previous study that found low LM prevalence and no evidence for persistence in 2 seasonally operated crawfish processing facilities (
56). Overall, our findings are also consistent with prior observations that LM strains may survive in operations for extended periods of time, as supported by a study that showed a processing plant that had a single strain persisting over at least 12 years (
57). Interestingly, the fact that a single produce processing facility included 3 distinct LM strains that showed evidence for persistence (Packinghouse CU-C had 3 clusters of highly related isolates found between 60 and 365 [exclusive] days apart) suggests that some facilities may be more prone to allowing for the establishment of persistence. Based on observations of this operation, persistence may occur due to equipment with poor sanitary design or infrequent sanitation procedures, and these observations are in concordance with a previous review, which found that one of the two most common risk factors for persistence mentioned in the literature was equipment cleanability (
58). However, we did not formally assess here which specific factors may have the greatest impact on the likelihood of persistence occurring in a given facility. While future studies on risk factors for persistence will be valuable, they will be challenging, as a large number of facilities would need to be enrolled.
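The re-isolation criteria used above (highly related isolates detected in the same operation on dates at least 60 days apart, with a separate category for detection >1 year apart) can be expressed as a simple rule. The sketch below is a hypothetical illustration, not the pipeline used in this study: it assumes that isolates have already been assigned to a cluster of highly related isolates (<10 hqSNP differences), the function and label names are invented for the example, and treating a span of exactly 365 days as "long-term" is an assumption about the boundary.

```python
from datetime import date

REISOLATION_MIN_DAYS = 60   # from the text: dates at least 60 days apart
LONG_TERM_MIN_DAYS = 365    # assumption: 365 days or more counts as >1 year


def classify_cluster(isolates):
    """Classify the isolation-date span of one cluster, per operation.

    `isolates` is a list of (operation_id, isolation_date) tuples for
    isolates that already belong to a single cluster of highly related
    isolates. Returns a dict mapping operation_id to a label.
    """
    dates_by_operation = {}
    for operation, isolated_on in isolates:
        dates_by_operation.setdefault(operation, []).append(isolated_on)

    labels = {}
    for operation, dates in dates_by_operation.items():
        # If the earliest and latest dates are >= 60 days apart, at least
        # one pair of isolates meets the re-isolation criterion.
        span_days = (max(dates) - min(dates)).days
        if span_days >= LONG_TERM_MIN_DAYS:
            labels[operation] = "re-isolation, >1 year apart"
        elif span_days >= REISOLATION_MIN_DAYS:
            labels[operation] = "re-isolation, 60 to <365 days apart"
        elif span_days == 0:
            labels[operation] = "single sampling date"
        else:
            labels[operation] = "short-term, <60 days apart"
    return labels


# Hypothetical example data (operation IDs and dates are invented).
example = [
    ("OP-1", date(2018, 3, 1)),
    ("OP-1", date(2019, 6, 15)),
    ("OP-2", date(2018, 5, 1)),
    ("OP-2", date(2018, 5, 1)),
]
print(classify_cluster(example))
```

A label of "re-isolation" here indicates only that the date criterion is met; as discussed in the text, distinguishing persistence from reintroduction requires additional metadata.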
Evidence of cross-contamination or common sources that contribute similar Listeria spp. and LM in multiple facilities.
While, as discussed above, the repeat isolation of closely related
Listeria often is interpreted as providing evidence for persistence, these types of findings could also be due to reintroduction from outside or upstream sources that are persistently contaminated with a given
Listeria strain. Interestingly, we found that 7 of the 45 hqSNP clusters in this study were composed of isolates from 2 operations. An additional cluster (Cluster 2) included isolates from 3 operations, with all isolates within this cluster differing by <30 hqSNPs. Our finding that 8 clusters contained isolates from at least 2 operations demonstrates the potential for closely related isolates to be collected at different operations and provides evidence of the introduction of specific
Listeria strains into multiple facilities from a common source. More specifically, 1 cluster showed LM isolates that were as few as 17 hqSNPs apart and were collected from 2 packinghouses. This could be due to a common source of raw materials obtained from facilities or fields that harbor this strain or could represent a widespread presence of isolates representing this given hqSNP cluster in the environment. Interestingly, we also identified
L. seeligeri isolates that had 0 hqSNP differences in 2 different packinghouses that shared employees, despite being separately owned and operated. This provided a potential mechanism for cross-contamination between facilities. In concordance with our findings, other studies have also identified closely related
Listeria spp. isolates from different facilities, although it is important to note that different SNP-based data analysis approaches may not be directly comparable (
59). For example, 1 study showed that LM isolates with <10 SNP differences were isolated from multiple delis in separate states in the United States (
22), suggesting the introduction of closely related LM into multiple facilities, likely from an upstream source, such as a common supplier. Another study, which investigated isolates from a cold-smoked salmon facility, showed that an isolate from a different cold-smoked salmon facility was within 11 to 23 SNPs of the other isolates within the cluster (
8). Other studies have shown instances in which closely related isolates have been associated with separate operations, such as an LM strain that was tied to two ice cream production facilities, one of which purchased ingredients from the other (
25). Our findings further support the importance of using WGS data in combination with metadata to help differentiate between re-contamination and persistence. For example, the repeat isolation of closely related
Listeria isolates after sanitation and preoperation in locations with limited traffic during off hours (e.g., production rooms) supports persistence, while the repeat isolation of closely related
Listeria isolates only during operations and from sites close to potential introduction routes (e.g., a receiving dock) more likely indicates re-contamination. In addition, advanced WGS data analysis, including the construction of tip-dated phylogenies (see Harrand et al. for an example [
60]) can provide information on the most recent common ancestor (MRCA) of closely related isolates. In this case, an MRCA that predates the construction date of a facility could also suggest reintroduction rather than persistence (particularly if supported by metadata). Importantly, if closely related isolates are found in different facilities, this may also suggest contamination from higher up in the food supply chain (e.g., agricultural water, fields, field equipment) or from shared employees. Reintroduction may be more likely in supply chains where no kill steps (e.g., heat treatment) are applied (e.g., fresh produce), which would facilitate survival throughout the supply chain. Our study specifically shows that the availability of larger WGS data sets that comprise isolates from multiple facilities, along with detailed root cause and epidemiological approaches, will help to differentiate persistence from reintroduction or cross-contamination. This will help facilities more rapidly address the true root causes of contamination events.
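Two of the checks discussed above lend themselves to a short illustration: flagging hqSNP clusters that contain isolates from more than one operation, and comparing an estimated MRCA date with a facility's construction date. The sketch below is hypothetical and deliberately simplified. The record layout, function names, and example values are assumptions made for this example; in practice, an MRCA estimate from a tip-dated phylogeny carries an uncertainty interval that a single-year comparison ignores.

```python
from collections import defaultdict


def clusters_shared_across_operations(records, min_operations=2):
    """Return clusters whose isolates come from at least `min_operations` operations.

    `records` is an iterable of (cluster_id, operation_id) pairs, one per isolate.
    The result maps cluster_id to a sorted list of operation IDs.
    """
    operations_by_cluster = defaultdict(set)
    for cluster_id, operation_id in records:
        operations_by_cluster[cluster_id].add(operation_id)
    return {
        cluster_id: sorted(operations)
        for cluster_id, operations in operations_by_cluster.items()
        if len(operations) >= min_operations
    }


def mrca_predates_facility(mrca_year, facility_built_year):
    """True if the estimated MRCA is older than the facility itself.

    An MRCA older than the facility is more compatible with reintroduction
    than with within-facility persistence, but is not conclusive on its own.
    """
    return mrca_year < facility_built_year


# Hypothetical isolate records (cluster and operation IDs are invented).
records = [
    ("cluster-A", "OP-1"),
    ("cluster-A", "OP-2"),
    ("cluster-B", "OP-1"),
    ("cluster-B", "OP-1"),
    ("cluster-C", "OP-2"),
    ("cluster-C", "OP-3"),
    ("cluster-C", "OP-4"),
]
print(clusters_shared_across_operations(records))
print(mrca_predates_facility(mrca_year=1995, facility_built_year=2005))
```

Neither check replaces the metadata-driven interpretation described in the text; they serve only to shortlist clusters that merit a closer root cause investigation.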