Seems fishy: environmental DNA impacts on Sketa22 quality control in Salmonidae-dominated waterbodies using qPCR and ddPCR

Globally, water resources used for recreation and drinking water are threatened by fecal pollution. These pollutants can cause gastrointestinal illness and environmental degradation. Additionally, most sources of fecal pollution are non-point sources stemming from multiple species. Identifying these sources is vital to categorizing the exposure risk from contact and improving remediation efforts. A common technique to provide species-specific information for fecal source identification is microbial source tracking (MST). MST quantifies DNA of host or host-associated microorganisms through polymerase chain reaction (PCR) technologies such as quantitative PCR (qPCR) or droplet digital PCR (ddPCR). MST techniques have been implemented globally and are used for routine monitoring. In the United States (US), the US Environmental Protection Agency (EPA) has provided several approved standard PCR methods for MST and other recreational water quality applications. These methods have specified quality controls including sample processing controls (SPC) and assessments for sample inhibition. A standard SPC used in EPA methods involves spiking samples with salmon testes DNA (nominally originating from Chum Salmon, Oncorhynchus keta) and quantifying it using Sketa22, a genus-specific TaqMan™ assay. This quality control (QC) behaves similarly to the microbial species being monitored. MST testing in Fall 2022 indicated elevated Sketa22 recoveries, and re-analysis of samples indicated the detection of external Salmonidae DNA on both qPCR and ddPCR platforms. Our research was designed to identify the cause of this interference. Results indicate that the primer probe set may react with wild Salmonidae DNA. Analyzing the Sketa22 sequence using BLAST indicated matches with many species of Salmonidae present in the sampled stream system. Consequently, further research is required to identify the effectiveness of Sketa22 as a QC when native and migratory Salmonidae are present. General recommendations are provided to account for excess ambient Salmonidae DNA.


Introduction
Fecal pollution in recreational and source waters poses a threat to human health and the surrounding environment globally (Garbossa et al 2017, Hart et al 2023, McLellan et al 2018). Contaminated waters increase exposure to pathogens, such as Cryptosporidium, Campylobacter, and Salmonella, which cause various illnesses if ingested (Wade et al 2022). Fecal pollution originates from a variety of sources, both human and zoonotic. Understanding the source of fecal contamination is important in determining the risks posed to human health and guiding remediation efforts of impacted waterbodies (Boehm and Soller 2020, Soller et al 2010).
Microbial source tracking (MST) is commonly used globally to identify fecal sources. This technique uses polymerase chain reaction (PCR) to target the genetic sequences of a host or host-associated microorganisms. Many studies have used MST on both quantitative PCR (qPCR) and droplet digital PCR (ddPCR) platforms to improve management decisions in a range of waterbodies (Cao et al 2018, Pendergraph et al 2021, USEPA 2010, 2015, 2019a, 2019b). In the United States, there are several standardized protocols and guidelines for using qPCR to enhance existing monitoring efforts and total maximum daily load (TMDL) assessments (USEPA 2010, 2015, 2019a, 2019b). All standardized methods published by the United States Environmental Protection Agency (US EPA) for qPCR require specific quality controls (QC) to ensure accurate quantification of results and to standardize reporting metrics among participating laboratories. QC, for most methods, consists of a sample processing control (SPC) to assess potential target DNA losses in processing and an internal amplification control (IAC) to assess possible inhibition from external inhibitors (Borchardt et al 2021). QC is performed using synthetic DNA targets or DNA from a reference species that should behave similarly to target organisms. Another critical requirement for accurate QC is that the selected genetic sequences are specific to the reference species and are not present in the external environment. Failure to properly establish and validate QC targets can provide inaccurate information, affecting the accuracy of quantified species of interest.
A common source of material used for QC in US EPA protocols is commercially available salmon testes DNA, which is typically used as an SPC but can serve as both an SPC and an inhibition check in some methods (USEPA 2010, 2015). This DNA was selected for its low cost, wide availability, repeatable results, and biochemical similarity to microbial targets of interest. The original molecular assay for US EPA's QC, Sketa2, contained primer and probe sequences homologous to the internal transcribed spacer 2 (ITS2) region of the ribosomal RNA of Oncorhynchus keta from Domanico et al (1997). It was later updated in a subsequent EPA study through the deletion of two nucleotides on the reverse primer, a change that was predicted to maintain the specificity of the assay for most salmonid species (Siefring et al 2008, Haugland et al 2005). This form of QC has been used globally for recreational water quality monitoring and MST testing.
Crockery Creek is a ≈ 414 km² sub-watershed in Ottawa, Muskegon, and Newaygo counties (Michigan, USA) and part of the Lower Grand River watershed. This watershed is connected to Lake Michigan and has a history of anadromous fish migration and spawning. The land use is primarily agriculture (66%), followed by forested areas (15%) and wetland areas (10%), which mainly span the riparian corridor of the streams and tributaries in the watershed (figure 1; Dewitz, U.S. Geological Survey 2021). Crockery Creek has been stocked with Brown Trout, Salmo trutta, and Rainbow Trout, Oncorhynchus mykiss, since 2013; Brook Trout, Salvelinus fontinalis, were stocked in some years but not since 2016 (table 1). Although Chinook Salmon, Oncorhynchus tshawytscha, and Coho Salmon, Oncorhynchus kisutch, are not known to occur in Crockery Creek, both species are stocked in the Grand River basin and could hypothetically migrate into Crockery Creek. The bulk of salmonids (i.e., Brown Trout and Rainbow Trout) in Crockery Creek are likely supported by stocking.
Of the salmonid species mentioned, all but Rainbow Trout are fall spawners (Becker 1983). Rainbow Trout is a spring spawner (Becker 1983). Spawning activity is associated with pulses of eDNA (Duda et al 2021, Tillotson et al 2018). This watershed received an E. coli TMDL in 2003, showed E. coli levels above the statewide partial body contact (PBC) limit (1,000 MPN/100 ml), and showed detections of human and cow MST markers in a statewide analysis in 2019 (MDEQ 2003, EGLE 2022). Because of these non-point source pollutants, the Ottawa Conservation District received 319 non-point source funding in 2022 to assess current non-point source pollutant levels and offer landowners assistance in adopting best management practices. Repeat bacterial analysis of Crockery Creek from September 21st to October 17th yielded average E. coli results over the PBC limit and an elevated presence of human MST markers at all sites sampled using ddPCR.
While human marker elevations were anticipated, the SPC assay Sketa22 was also unexpectedly elevated. All samples were initially spiked with 0.2 ng ml⁻¹ of Chum Salmon DNA (≈13,000-15,000 gene copies/100 ml (GC/100 ml)), and the MST results displayed Sketa22 concentrations exceeding 1,500,000 GC/100 ml for all samples tested in Crockery Creek. These elevations suggest the amplification of external DNA. To our knowledge, no previous studies have reported the amplification of external DNA in environmental samples by the Sketa22 assay used for MST analyses.
This project aims to identify and address the abnormality seen in the SPC of the 2022 MST analysis. The overall project objectives are (1) to identify if there is external DNA interference present in collected environmental samples, (2) to assess if this detection occurs on both qPCR and ddPCR platforms, and (3) to evaluate potential causes and propose solutions for future monitoring efforts. This information will be helpful to laboratories and regulatory agencies involved with MST and methods development.

Sample collection and initial processing
Five sites representative of Crockery Creek were selected by the Ottawa Conservation District and sampled weekly for five weeks to fulfill the requirements of their 319 non-point source grant (figure 1). At each site, a representative sample was collected at each of three locations (left, center, right) while facing upstream in the flowing portion of the stream. For QC, a duplicate sample was collected at one site per sampling event, and three field blanks consisting of DI water were placed with the samples to ensure no cross contamination occurred. All collected samples were transported to the lab at ≈ 4°C. Filtration and storage of all samples and QC occurred within 6 h of sample collection. Only the center sample was used for MST analysis. Upon arriving at the lab, three 100 ml subsamples of each center sample were filtered through a 0.45 μm filter. Then the filters and housing were rinsed with ≈ 20 ml of autoclaved Type 1 water to wash any potential cells from the housing onto the filter. Each filter was then collected using aseptic techniques, placed into a 2.0 ml microcentrifuge tube containing 300 μg of acid washed glass beads (Sigma G-1277), and stored at −80°C for two weeks until batch analysis occurred. All MST samples from the initial sample period were re-analyzed for Salmonidae contamination using a replicate filter that had been stored at −80°C for 1 month. A method blank consisting of 20 ml of autoclaved Type 1 water was collected with each sample batch to ensure no cross contamination occurred during filtration and subsequent DNA extraction processes.

DNA extraction
All samples were processed using Generite DNA-EZ-spin columns following USEPA Method 1696 with minor modifications (USEPA 2019a, 2019b). To assess the effects of external DNA in our samples, all samples were rehydrated with AE buffer that did not contain a salmon DNA suspension. QC consisted of three method blanks that were run through the extraction process: one rehydrated with the qPCR concentration of salmon suspension spike (0.2 μg ml⁻¹), one rehydrated with the ddPCR concentration of salmon suspension spike (0.2 ng ml⁻¹), and one rehydrated in AE buffer. The samples were then homogenized at 6 m s⁻¹ for 30 s in a homogenizer and centrifuged to pellet the sample. Next, 400 μl was transferred to a fresh 1.7 ml microcentrifuge tube and centrifuged. Afterwards, 380 μl was transferred to a new 1.7 ml microcentrifuge tube containing 760 μl of binding buffer. The contents of this tube were run through a DNA-EZ spin column via centrifugation, and the flow through was discarded. The columns were then washed with 1,000 μl of wash buffer via centrifugation, and the flow through was discarded. Finally, the columns were placed in a fresh 1.7 ml microcentrifuge tube, and 100 μl of warmed elution buffer was applied and collected from the columns. The final eluates of all samples and QC were stored at 4°C and analyzed within 12 h. Each final eluate was split and quantified using both PCR methods.

PCR methods
qPCR quantification was performed following USEPA Methods 1696 and 1697 (USEPA 2019a, 2019b). In brief, 25 μl reactions consisted of 12.5 μl of Environmental Master Mix, 2.5 μl of 2.0 mg ml⁻¹ stock solution of bovine serum albumin (BSA) from fraction V powder (Sigma B-4287 or equivalent), 1,000 nM primers (forward and reverse), 80 nM probe, nuclease-free PCR water, and 2.0 μl of the template. Amplification was performed using a Thermo Fisher Scientific StepOnePlus™ (Thermo Fisher Scientific, Grand Island, NY) under the following conditions: 50°C for 2 min, 95°C for 10 min, and 40 cycles of 95°C for 15 s and 60°C for 1 min. The threshold was set at 0.03 ΔRn for all molecular markers. All samples and controls were run in triplicate. Controls included positive controls, extraction blanks, and no-template controls (NTC). A positive sample was defined as any sample having a quantification cycle (Cq) value below 40. The average Cq of each triplicate was used for analysis.
Quantification on ddPCR was performed following methods outlined by Flood et al (2022). In brief, 22 μl reactions were made using Bio-Rad's ddPCR Supermix for Probes (No dUTP) (Bio-Rad Laboratories, Richmond, CA). Each ddPCR reaction contained 16.5 μl of an assay mix with 11 μl of Supermix, 900 nM primers (forward and reverse), 270 nM of probes, and nuclease-free PCR water, plus 5.5 μl of the template. Droplets were then generated with a Bio-Rad AutoDG and amplified using a Bio-Rad C1000 Touch™ Thermal Cycler (Bio-Rad Laboratories) under the following conditions: 95°C for 10 min, then 40 two-phase cycles of 94°C for 30 s and 58°C for 1 min. All ddPCR reactions and controls were analyzed in triplicate. Each ddPCR plate contained positive, negative, and no-template controls for each assay mix. Analysis was performed on a Bio-Rad QX200. Primer and probe sequences for both PCR methods are provided in table 2.
QC was performed for all samples. Samples for ddPCR were required to have at least 10,000 accepted droplets. Thresholds were set for all wells simultaneously, approximately 500 fluorescent units above the negative droplet cloud. Detection of at least three positive droplets per well provided evidence that DNA was present in a sample. Results were calculated as GC/100 ml using equation (2.1). The minimum detection limit (MDL) for ddPCR was 94.5 GC/100 ml.
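The droplet acceptance rules and the conversion to GC/100 ml described above can be illustrated with a minimal Python sketch. This is not a reproduction of equation (2.1); it applies the standard Poisson correction for droplet counts and a simple volume scale-up. The nominal droplet volume (0.85 nl), the function names, and the `lysate_fraction` parameter are assumptions introduced for illustration, whereas the 22 μl reaction, 5.5 μl template, 100 μl elution, and 100 ml filtered volumes are taken from the methods above.

```python
import math

# Assumption: nominal droplet volume in microlitres (0.85 nl); not stated in the text.
DROPLET_VOLUME_UL = 0.00085


def passes_droplet_qc(accepted_droplets, min_accepted=10_000):
    """Well-level QC: enough accepted droplets were read."""
    return accepted_droplets >= min_accepted


def is_detected(positive_droplets, min_positive=3):
    """Detection rule: at least three positive droplets in the well."""
    return positive_droplets >= min_positive


def copies_per_ul_reaction(positive_droplets, accepted_droplets,
                           droplet_volume_ul=DROPLET_VOLUME_UL):
    """Poisson-corrected target concentration in the reaction (copies/ul)."""
    if not 0 <= positive_droplets < accepted_droplets:
        raise ValueError("need 0 <= positive < accepted droplets")
    # Mean copies per droplet, from the fraction of negative droplets.
    lam = -math.log(1.0 - positive_droplets / accepted_droplets)
    return lam / droplet_volume_ul


def gc_per_100ml(copies_per_ul, reaction_ul=22.0, template_ul=5.5,
                 elution_ul=100.0, lysate_fraction=1.0, filtered_ml=100.0):
    """Scale a reaction concentration up to gene copies per 100 ml of water.

    lysate_fraction is a hypothetical correction for the share of the bead
    lysate carried onto the column; 1.0 means no correction is applied.
    """
    copies_in_reaction = copies_per_ul * reaction_ul
    copies_per_ul_template = copies_in_reaction / template_ul
    copies_in_eluate = copies_per_ul_template * elution_ul
    copies_on_filter = copies_in_eluate / lysate_fraction
    return copies_on_filter * 100.0 / filtered_ml


if __name__ == "__main__":
    positives, accepted = 1_000, 10_000  # illustrative counts, not study data
    if passes_droplet_qc(accepted) and is_detected(positives):
        conc = copies_per_ul_reaction(positives, accepted)
        print(f"{gc_per_100ml(conc):.0f} GC/100 ml")
```

With these default volumes, each copy per μl of reaction corresponds to 400 GC/100 ml; a laboratory would substitute its own droplet volume and any extraction correction.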

Statistical analysis
Statistical analysis was performed using R (R Core Team 2021). The overall detection rate of each PCR technology was assessed, and its 95% confidence interval (CI) was calculated using the Wilson method (Brown et al 2001). Paired qPCR and ddPCR results were compared using simple linear regression to assess agreement between methods.
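For readers who wish to check the interval calculation, the Wilson score interval can be sketched in a few lines. The analysis in this study was run in R; the Python function below is an independent illustration with a hypothetical name (`wilson_interval`), not the authors' script.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (default: 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Detection counts reported in the PCR analyses section (out of 24 samples).
    for platform, positives in (("qPCR", 17), ("ddPCR", 21)):
        low, high = wilson_interval(positives, 24)
        print(f"{platform}: {positives / 24:.0%} ({low:.0%}-{high:.0%})")
```

Applied to the reported counts (17 of 24 for qPCR and 21 of 24 for ddPCR), the function gives intervals of roughly 51%-85% and 69%-96%, in line with the values reported in the results.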

PCR analyses
Both PCR methods detected the Sketa22 target in Crockery Creek samples. Of the 24 samples analyzed, 17 samples, 71% (51%-85%), were positive on qPCR, and 21 samples, 88% (69%-96%), were positive on ddPCR (table 3). All samples that were positive on qPCR were also positive on ddPCR. A plot of paired positive samples between both methods displayed good agreement in the concentration of Sketa22, with an R² value > 0.9 (figure 2). Additionally, every date sampled had positive detections at multiple sampling sites. From this, it was determined that external DNA interference was occurring in the environmental samples and that this external DNA was detected on both qPCR and ddPCR platforms.

BLAST analysis
BLAST results indicated matches of the primer probe sequence to several species of salmon and trout present in the Great Lakes region, including all stocked and resident populations of salmonids within Crockery Creek, as well as additional species of salmon and trout that are not present in the Great Lakes region (table 4). This indicates potential reactivity between the Sketa22 primer probe sequence and environmental DNA. Some species, such as Rainbow Trout, have complete matches on sequences outside the ITS2 region, with amplicon lengths identical to the target sequence (table 4). These results were consistent with the phylogenetic analyses of the genus Oncorhynchus described previously (Domanico et al 1997).

Discussion
The detections of Sketa22 in both ddPCR and qPCR experiments suggest the presence of external Salmonidae DNA that is amplifying in Crockery Creek. These elevations present possible false positive QC results, which could cause an underestimation of inhibition and inaccurate SPC results, making users unable to assess potential cross contamination in sample processing depending on the molecular method being conducted. False negative QC results could increase the risk of exposure to pathogens in inhibited samples and affect data interpretation when using qPCR methods following standardized protocols such as US EPA Draft Method C or MST for qPCR (Aw et al 2019, Gibson et al 2012). Draft Method C instructs users to dilute and rerun samples and change the initial concentration of salmon suspension when the Sketa22 concentration is above acceptable workbook limits. Given the magnitude of amplification seen in the experiments, it is likely that DNA from salmonids, including milt and eggs from the fall spawning species mentioned above, was amplified by the Sketa22 primer probe set. While Brown Trout were only a partial match to the Sketa22 primer probe sequence, they display enough overlap to make them another potential candidate to cause amplification. Rainbow Trout also have perfect matches to the Sketa22 primers and probes outside the ITS2 region. Any one or several of these species may have amplified to produce these unexpected QC findings.
Table 2. Oligonucleotide sequences for Sketa22 on qPCR and ddPCR. The sequences for qPCR and ddPCR represent the updated sequence. The original Sketa2 assay sequence is also provided. These sequences were based on Oncorhynchus keta (AF170538.1). Fwd = Forward, Rev = Reverse.
NCBI BLAST sequence findings (table 4) for the Sketa22 primers and probes detected multiple matches on both the ITS2 and other genomic regions of other Salmonidae alongside Chum Salmon. Species that produced similar but not identical amplicons for the Sketa22 primer probe sequence include Ohrid Trout, Asiatic Trout, European Cisco, Stechlin Cisco, Danube Salmon, Atlantic Salmon, and Arctic Char. Regarding amplicon length, all but Arctic Char and Cherry Salmon produce amplicons of identical length to the desired ITS2 gene of Chum Salmon. Coho Salmon, Pink Salmon, Cherry Salmon, Rainbow Trout, and Cutthroat Trout display perfect binding affinity to the desired Chum Salmon sequence with no mismatches. Chinook Salmon and Brown Trout have 1-2 base pair differences from the selected primers and probes. Any of these species' DNA may react with the primers and probes, causing unintended amplification due to the generalized reactivity in PCR (Eischeid 2019) if excess Salmonidae DNA is present. Many variables impact binding affinity, such as base pair mismatch location, G/C versus A/T mismatches, and relation to primer or probe sequences (Stadhouders et al 2010). Further analysis is needed to determine whether DNA from local species is causing amplification on MST qPCR and ddPCR and how strongly it amplifies during spawning season. Methods including fyke net catching, eDNA metabarcoding, or electrofishing along the waterbody in tandem with seasonally directed MST analysis may help determine which species react with the Sketa22 primers and probes. This testing could also assess if this is a temporal issue surrounding spawning events or other migration periods for Salmonidae. Other regions may need to perform preliminary seasonal investigations of the DNA present in their waterbodies before using Sketa22 as a QC. While this is concerning for molecular analyses in streams with Salmonidae species present, this currently appears to be a temporal issue surrounding spawning events.
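The kind of mismatch tally discussed above (number of mismatches and whether they fall near the 3' end of an oligonucleotide) can be sketched as follows. The sequences in the example are hypothetical placeholders, not the Sketa22 primers or any salmonid sequence, and the five-base 3' window is an illustrative assumption rather than a validated threshold. The binding site is assumed to be supplied in the same 5' to 3' orientation as the oligonucleotide.

```python
def mismatch_positions(oligo, site):
    """Return 0-based positions where oligo and its aligned site differ."""
    if len(oligo) != len(site):
        raise ValueError("oligo and site must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(oligo.upper(), site.upper()))
            if a != b]


def summarize_binding(oligo, site, three_prime_window=5):
    """Count mismatches and how many fall within the 3'-terminal window."""
    positions = mismatch_positions(oligo, site)
    cutoff = len(oligo) - three_prime_window
    near_3prime = [i for i in positions if i >= cutoff]
    return {
        "mismatches": len(positions),
        "positions": positions,
        "near_3prime": len(near_3prime),
    }


if __name__ == "__main__":
    # Hypothetical 20-mer and two hypothetical binding sites (placeholders only).
    primer = "ACGTTGCAAGGCTTACGATC"
    perfect_site = "ACGTTGCAAGGCTTACGATC"
    two_mismatch_site = "ACGCTGCAAGGCTTACGAAC"
    print(summarize_binding(primer, perfect_site))
    print(summarize_binding(primer, two_mismatch_site))
```

A tally like this only flags candidates for cross-reactivity; whether a given 1-2 base pair difference prevents amplification would still need to be tested empirically.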
External DNA interference from Salmonidae species has not been documented in any published study globally that uses Sketa22 as a QC in molecular methods for recreational water quality monitoring of E. coli and Enterococci, including studies of some waterbodies with resident Salmonidae populations (Aw et al 2019, Cao et al 2013, Crain et al 2021, Haugland et al 2021, Raith et al 2014). Additionally, this issue has not been documented in studies using Sketa22 as a QC for MST analysis (Shanks et al 2008, Cao et al 2018, Gentry-Shields et al 2012, González-Fernández et al 2021, Green et al 2014, Kinzelman et al 2020, Kirs et al 2017, Peed et al 2011). While salmonid spawning in the Great Lakes occurs before or after the summer beach season, episodic events from avian botulism (Chun 2013) and the occurrence of dead fish on beaches (Haack et al 2003) may contribute to occasional excess Sketa22 recoveries. One limitation of this study is that the original samples were processed with the lower concentration of salmon suspension spike used for the ddPCR platform. If these samples had been processed with the qPCR salmon suspension spike, the elevated Sketa22 recoveries may not have been observed. Since salmon DNA is added in high concentrations for qPCR QC (0.2 μg ml⁻¹) (USEPA 2015, 2019a, 2019b), small amounts of errant Salmonidae DNA would not impact beach or MST monitoring. Additional assessment is required to determine the impacts of elevated Salmonidae DNA on both PCR platforms.
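The effect of spike concentration can be made concrete with a rough calculation. The sketch below assumes that gene copies scale linearly with the mass concentration of the spike, so that the qPCR spike (0.2 μg ml⁻¹) carries 1,000 times the copies of the ddPCR spike (0.2 ng ml⁻¹). It uses 14,000 GC/100 ml as the midpoint of the reported 13,000-15,000 GC/100 ml range and 1,500,000 GC/100 ml as the ambient signal. The Cq shift further assumes ideal doubling per cycle. These are illustrative assumptions, not measured values.

```python
import math

# Midpoint of the reported ddPCR spike range (13,000-15,000 GC/100 ml).
DDPCR_SPIKE_GC = 14_000
# Assumption: copies scale linearly with spike mass (0.2 ug/ml vs 0.2 ng/ml).
QPCR_SPIKE_GC = DDPCR_SPIKE_GC * 1_000
# Ambient Sketa22 signal reported for Crockery Creek samples (lower bound).
AMBIENT_GC = 1_500_000


def ambient_to_spike_ratio(ambient_gc, spike_gc):
    """Ambient salmonid signal expressed as a multiple of the spike."""
    if spike_gc <= 0:
        raise ValueError("spike_gc must be positive")
    return ambient_gc / spike_gc


def ideal_cq_shift(ambient_gc, spike_gc):
    """Cycles by which Cq would drop, assuming perfect doubling per cycle."""
    return math.log2(1.0 + ambient_to_spike_ratio(ambient_gc, spike_gc))


if __name__ == "__main__":
    print(f"ddPCR spike: ambient is "
          f"{ambient_to_spike_ratio(AMBIENT_GC, DDPCR_SPIKE_GC):.0f}x the spike")
    print(f"qPCR spike: ambient is "
          f"{ambient_to_spike_ratio(AMBIENT_GC, QPCR_SPIKE_GC):.1%} of the spike")
    print(f"qPCR Cq shift: about "
          f"{ideal_cq_shift(AMBIENT_GC, QPCR_SPIKE_GC):.2f} cycles")
```

Under these assumptions the ambient signal is roughly 100 times the ddPCR spike but only about a tenth of the qPCR spike, a shift of well under one cycle, which is consistent with the suggestion that the elevation could go unnoticed at the qPCR spike concentration.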

Molecular Target
Further testing and investigation may help laboratories identify any undetected inhibition or potential environmental DNA interference from local populations. Potential solutions to this problem include selecting a different molecular target for SPC or inhibition assessment. The requirement for an effective QC species is that it does not reside in the sampling region. Another form of QC could be the use of a synthetic plasmid to remedy this issue. It is important to verify that any proposed target displays selectivity and specificity to the selected organism and does not exhibit cross reactivity to any of the species being monitored (Borchardt et al 2021). Ideally, this species or plasmid should behave similarly to the targeted environmental species to provide the most accurate comparison possible. Other options for assessing inhibition involve using a universal marker and running a dilution to assess inhibitors (Gibson et al 2012).
Table 4. Results of the NCBI BLAST genomic analysis of the Sketa22 primer probe sequence (AF170538.1). Bolded rows indicate species that are found in our testing region. All species listed yielded perfect alignment with the primer-probe sequences, primarily on the internal transcribed spacer 2 (ITS2) region, and have near identical amplicon lengths to the target sequence. Both partial and complete gene matches were observed. Partial matches are discussed below. Italicized entries represent sequences outside of the ITS2 region with amplification.

Conclusion
Salmon DNA paired with the Sketa22 primer probe set is a utilitarian SPC and inhibition assessment. The DNA is long-lasting, stores easily in solution at 4°C, and is affordable. The sequence was selected from a homologous gene in the ITS2 region of Oncorhynchus, a genus which includes Pacific Salmon species. Recent MST results have highlighted the possibility of reactivity of Sketa22 with external Salmonidae DNA in streams. This was observed on both qPCR and ddPCR platforms, which are both commonly used in MST testing and recreational water quality monitoring. NCBI BLAST analysis demonstrated the potential for amplification, particularly on the ITS2 regions of other common Salmonidae. This could present an issue with recreational water quality monitoring for E. coli, Enterococcus, and Bacteroides targets if standard methods are used. However, no published studies of recreational water quality monitoring have reported this issue. Because spike concentrations differ between ddPCR and qPCR, the higher concentration spikes in qPCR analyses may have caused this elevation to go unnoticed. Future work is needed to assess which species react with the Sketa22 sequence and cause excess amplification, and to establish any temporal connections with Salmonidae semi-annual migrations or potential resident Salmonidae populations. Future analyses should assess what concentration of environmental Salmonidae DNA impacts the interpretation of results on both PCR platforms using a combination of spiked and un-spiked samples. Finally, if these analyses are unable to discern what concentration of external Salmonidae DNA will impact PCR results on either platform, efforts should focus on new QC species that would be viable in regions where ambient environmental DNA is an issue.