crAssphage as a human molecular marker to evaluate temporal and spatial variability in faecal contamination of urban marine bathing waters

⁎ Corresponding author. E-mail address: wim.meijer@ucd.ie (W.G. Meijer). https://doi.org/10.1016/j.scitotenv.2021.147828 0048-9697/© 2021 The Author(s). Published by Elsevier B


Highlights
• The crAss_2 marker effectively discriminates between human and animal (dog and seagull) pollution.
• crAss_2 and HF183 levels in human contaminated marine bathing waters are closely correlated.
• Co-occurrence between crAss_2 and HF183 in compliance sampling is 76%.
• Up to 2.5 log₁₀ within-day variation in FIB is observed over a 12-hour tidal cycle.
• Faecal pollution is not homogeneously distributed in a marine bathing water.

Abstract
Bathing water quality may be negatively impacted by diffuse pollution arising from urban and agricultural activities and wildlife; it is therefore important to be able to differentiate between biological and geographical sources of faecal pollution. crAssphage was recently described as a novel human-associated microbial source tracking marker. This study aimed to evaluate the performance of the crAssphage marker in designated bathing waters. The sensitivity and specificity of the crAss_2 marker were evaluated using faecal samples from herring gulls, dogs, sewage and a stream impacted by human pollution (n = 80), which showed that all human impacted samples tested positive for the marker while none of the animal samples did. The crAss_2 marker was field tested in an urban marine bathing water close to the discharge point of human impacted streams. In addition, the bathing water is affected by dog and gull fouling. Analysis of water samples taken at the compliance point every 30 min during a tidal cycle following a rain event showed that the crAss_2 and HF183 markers performed equally well (Spearman correlation ρ = 0.84). The levels of these markers and faecal indicators (Escherichia coli, intestinal enterococci, somatic coliphages) varied by up to 2.5 log₁₀ during the day. Analysis of a high-tide transect perpendicular to the shoreline revealed high levels of localised faecal contamination 1 km offshore, with a concomitant spike in the gull marker. In contrast, both the crAss_2 and HF183 markers remained at a constant level, showing that human faecal contamination is homogeneously distributed, while gull pollution is localised. Performance of the crAss_2 and HF183 assays was further evaluated in bimonthly compliance point samples over an 18-month

Introduction
Faecal contamination of recreational bathing waters may pose a significant health risk not only to swimmers but also to those pursuing other water-related activities such as sailing or surfing (Colford et al., 2007; Kay et al., 1994; Prieto et al., 2001; Soller et al., 2014; Zmirou et al., 2003). Intestinal illnesses as well as eye and skin irritation are some of the negative health outcomes following exposure to recreational waters with high levels of faecal indicators (Cabral, 2010; Colford et al., 2007; Prüss, 1998; Wade et al., 2006). From a European Union (EU) regulatory perspective, the assessment of bathing water quality relies on the quantification of two microbiological parameters, namely Escherichia coli and intestinal enterococci. Recreational bathing waters in member states are therefore classified as excellent, good, sufficient or poor based on the 90th and 95th percentiles of these microbiological indicator levels in water samples taken over a four-year period, as specified by the 2006/7/EC European Bathing Water Directive (EU, 2006).
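The percentile-based assessment described above can be illustrated with a short sketch. It assumes the log-normal percentile evaluation set out in Annex II of the Directive, in which the upper percentile is the antilog of the mean plus 1.282 (90th) or 1.65 (95th) standard deviations of the log₁₀-transformed counts. The function name, the use of the sample standard deviation and the example counts are our own assumptions and are not taken from this study.

```python
import math
import statistics


def upper_percentile(counts_per_100ml, z):
    """Antilog of (mean + z * SD) of the log10-transformed counts.

    Counts must be greater than zero; how non-detects are handled is
    outside the scope of this sketch.
    """
    logs = [math.log10(c) for c in counts_per_100ml]
    mu = statistics.mean(logs)
    sigma = statistics.stdev(logs)  # sample SD: an assumption of this sketch
    return 10 ** (mu + z * sigma)


# Hypothetical E. coli counts (cfu/100 ml), for illustration only.
e_coli = [10, 100, 1000]
p90 = upper_percentile(e_coli, 1.282)  # upper 90th percentile
p95 = upper_percentile(e_coli, 1.65)   # upper 95th percentile
```

The resulting percentile values would then be compared with the class limits given in the Directive, which are not reproduced here.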
Human faecal contamination, originating from discharge of contaminated rivers and streams, wastewater treatment plants and combined sewer overflows, can be a major source of contamination in urban bathing waters (Ahmed et al., 2020; Bedri et al., 2015; Brown et al., 2004; Reynolds et al., 2020; Unc and Goss, 2004). Pollution episodes in this regard can be most acute during severe weather events that trigger the activation of combined sewer and stormwater overflows but which also drive the release and transport of faecal matter from catchments such that faecal indicator concentrations from both point and diffuse sources are increased (Ahmed et al., 2019a; Panasiuk et al., 2015). In addition, these urban bathing waters may be contaminated by faecal matter originating from foraging seabirds and waterfowl (Calderon and Mood, 1991; Unc and Goss, 2004; Yamahara et al., 2007). Among wildlife species, gulls have been reported as being major contributors of faecal pollution in coastal waters (Araújo et al., 2014; Lu et al., 2008). Dog fouling may also significantly contribute to faecal contamination, an issue more significant in the case of urban beaches (Ervin et al., 2014; Oates et al., 2017; Walker et al., 2015).
A wide range of molecular source tracking markers have been developed to identify the origin of faecal contamination, many of which target host-specific Bacteroidales 16S ribosomal RNA genes and mitochondrial DNA markers (Ballesté et al., 2020; Bernhard and Field, 2000; Gawler et al., 2007; Gómez-Doñate et al., 2016; He et al., 2015; Roslev and Bukh, 2011). Recently crAssphage, a bacteriophage that infects the human gut bacterium Bacteroides intestinalis (Shkoporov et al., 2018), has been described as a novel human-associated molecular source tracking marker (García-Aljaro et al., 2017; Stachler et al., 2017). The crAssphage DNA sequence was assembled through metagenomic analysis of human faecal samples (Dutilh et al., 2014). The abundance and distribution of gut microbiota vary between geographical regions, affecting the performance of microbial source tracking markers (Boehm et al., 2013; Mayer et al., 2018; Roslev and Bukh, 2011; Yahya et al., 2017). However, the crAssphage sequence has since been detected in the majority of publicly available human faecal metagenomes collected in a range of geographic locations, suggesting that it is near-ubiquitous in the global human population (Edwards et al., 2019; Yutin et al., 2018). Recently, two crAssphage-like phages have been reported to infect other Bacteroides species such as B. thetaiotaomicron (Hryckowian et al., 2020). Following its discovery, a number of crAssphage assays have been developed for its detection (García-Aljaro et al., 2017; Stachler et al., 2017). This novel human-specific bacteriophage is highly abundant in sewage with no seasonal patterns reported and is therefore a useful tool to detect human contamination in waterbodies (Chen et al., 2021; Crank et al., 2020; Farkas et al., 2019). Furthermore, the crAssphage marker correlates with human viruses, e.g., norovirus, adenovirus, bocavirus and polyomavirus, in sewage and activated sludge (Farkas et al., 2019; Stachler et al., 2018; Wu et al., 2020).
It has therefore been suggested that crAssphage may be used as a viral water quality monitoring tool (Bivins et al., 2020;Malla et al., 2019).
The use of crAssphage as a human faeces marker has been demonstrated in the field, predominantly in rivers, lakes and estuaries (Ahmed et al., 2020, 2018b, 2018a; Ballesté et al., 2019; Kongprajug et al., 2019; Malla et al., 2019). This study evaluates the use of crAssphage as a tool to differentiate between human and animal pollution in spatial and temporal transects as well as in routine compliance sampling in a complex marine environment that is impacted by multiple sources of faecal pollution, including seabirds, dogs and contaminated urban streams.

Bathing water samples
Dublin, the capital of Ireland, is a coastal city with a population of approximately 560,000 people residing within its city limits and more than 1.9 million people living in the Greater Dublin Area (a region comprising Dublin together with its neighbouring counties). Dublin Bay is a UNESCO biosphere and is home to thousands of protected native and migratory birds that roost and forage on or near the coast. Several small streams that are heavily urbanised along their courses discharge into Dublin Bay. The Elm Park stream is 3.8 km long and flows through a heavily urbanised catchment with a population of approximately 40,000 people before discharging near two of Dublin's three designated bathing areas (Sandymount Strand and Merrion Strand), which were selected for this study. Bathing water samples were collected in the study area under three conditions (Fig. 1):
1. After a rain event: Water samples were collected at Merrion Strand 3 h after 22.8 mm of rainfall in 4 h, every 30 min during a 12-hour tidal cycle. Sampling was carried out in a transect perpendicular to the foreshore.
2. High tide transect: Water samples were collected along a 2 km transect perpendicular to the foreshore during a high tide at Sandymount Strand. A total of ten samples were collected every 250 m, giving a maximum offshore sample collection distance of 2.25 km. The initial collection point was closest to the shore; the successive water samples were collected using a canoe to avoid disturbing the sediment. The transect was completed within 1 h, with depth measurements also being recorded.
3. Compliance point sampling: A total of 80 samples from Sandymount and Merrion Strands were collected on a bi-monthly basis from April 2018 to November 2019. Water samples were taken at high tide.
Water samples were collected in sterile containers at approximately 20 cm below the surface and were subsequently refrigerated and processed within 8 h.

Sampling of gull and dog faeces, sewage and river water
Individual herring gull (Larus argentatus) droppings (n = 20) were collected from Howth Harbour, Co. Dublin (53°23′19.4″ North 6°03′49.5″ West). Dog samples (n = 20) were collected from the Dublin Society for Prevention of Cruelty to Animals, County Dublin, and the Kildare and West Wicklow Society for Prevention of Cruelty to Animals, County Kildare. Sewage samples (n = 20) were collected from the Dublin sewerage system or from the influent of three local wastewater treatment plants. In addition, samples were taken over a year from a small urban stream (Elm Park Stream, 53°18′51.8″ North 6°12′12.6″ West) that discharges onto Merrion Strand. All samples were collected in sterile containers and kept on ice, and DNA was extracted within 8 h after collection.

Enumeration of faecal indicator bacteria and somatic bacteriophages
Water samples were filtered through 0.45 μm nitrocellulose membranes (ThermoFisher Scientific) and cultured on TBX agar (Tryptone Bile X-Glucuronide agar; Sigma-Aldrich) at 37°C for 4 h followed by a further incubation at 44°C for 18 h to enumerate E. coli. Intestinal enterococci were enumerated by placing the filters on SB (Slanetz and Bartley, Oxoid) medium at 37°C for 48 h. Positive colonies were confirmed using BAA (Bile Aesculin Agar, Sigma-Aldrich) at 44°C for 2 h.

Nucleic acid extraction
Water (100 ml) and sewage samples (30-50 ml) were filtered through 0.22 μm mixed cellulose ester membrane filters (Merck Millipore). The filters were subsequently placed in 500 μl of GITC buffer (5 M guanidine thiocyanate, 100 mM EDTA [pH 8] and 0.5% sarkosyl) and stored at −20°C. DNA was extracted using a previously described modification of the DNeasy Blood and Tissue kit protocol (Qiagen). The QIAamp Fast DNA stool mini kit (Qiagen) was used to extract genomic DNA from 180 to 220 mg of fresh faeces. The DNA was dissolved in a final volume of 200 μl of elution buffer.

Microbial source tracking (MST) marker quantification
The crAssphage marker (crAss_2) was quantified using a FAM-labelled TaqMan probe (García-Aljaro et al., 2017). The human HF183, canine and gull markers targeting Bacteroidetes sp. (human, dog) and Catellicoccus marimammalium (gull) are SYBR green based assays (Bernhard and Field, 2000; Dick et al., 2005; Sinigalliano et al., 2010). Undiluted and 10-fold diluted samples were analysed in a reaction mixture (20 μl) containing the appropriate primers (Table S1) and either the FastStart Essential DNA Probes Master or FastStart Essential DNA Green Master (Roche). Linearised standards between 10⁰ and 10⁶ gene copies were included in each run to quantify target gene levels in each sample (García-Aljaro et al., 2017; Reynolds et al., 2020). Positive (plasmid standard) and negative no-template (PCR grade water) controls were included. The amplification efficiency (E) of each assay was calculated from the slope of the linear regression line of the standard curve using the equation $E = 10^{-1/\mathrm{slope}} - 1$ (Rutledge, 2003) (Table S2). The efficiencies of all assays were between 90 and 110%. All qPCR cycle conditions included a 10 min incubation at 95°C and a melt curve analysis. The limit of detection (LOD) was defined as the lowest concentration of DNA that could be detected in at least 95% of replicates and the limit of quantification (LOQ) was determined as the lowest concentration of DNA quantified within 0.5 standard deviations of the log₁₀ concentration (AFNOR, 2015; Blanchard et al., 2012) (Table S2).
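As an illustration of the efficiency calculation, the sketch below fits a standard curve (quantification cycle, Cq, against log₁₀ gene copies) by ordinary least squares, derives the amplification efficiency from the slope, and converts a Cq value back to gene copies. The Cq values are idealised, hypothetical numbers and the function names are ours; this is not the analysis software used in the study.

```python
def fit_standard_curve(log10_copies, cq_values):
    """Least-squares fit of Cq = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """E = 10^(-1/slope) - 1; a slope near -3.32 gives E close to 1 (100%)."""
    return 10 ** (-1 / slope) - 1


def copies_from_cq(cq, slope, intercept):
    """Invert the standard curve to estimate gene copies per reaction."""
    return 10 ** ((cq - intercept) / slope)


# Hypothetical standards spanning 10^0 to 10^6 gene copies.
log10_copies = [0, 1, 2, 3, 4, 5, 6]
cq_values = [38.0 - 3.32 * x for x in log10_copies]  # idealised Cq values

slope, intercept = fit_standard_curve(log10_copies, cq_values)
efficiency = amplification_efficiency(slope)
```

With these idealised standards the efficiency falls inside the 90 to 110% window quoted above; real standard curves would scatter around the fitted line.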

Data analysis
Sensitivity and specificity values for the crAss_2 marker as a human MST marker were calculated as described previously (Gawler et al., 2007). Data were analysed using GraphPad Prism 8 (GraphPad Software). Spearman correlation was used for correlation analysis and the Mann-Whitney test was used to determine significant differences between concentrations of microbial source tracking markers. A p value < 0.05 was considered significant.

Sensitivity and specificity of the crAssphage marker crAss_2
Merrion and Sandymount Strands are popular with dog walkers and are home to many seabirds. Furthermore, the urban Elm Park Stream, which is frequently contaminated by human faecal pollution (Reynolds et al., 2020), discharges close to these bathing waters. The main sources of faecal contamination of these bathing waters are therefore likely to be seabirds, dogs and humans. We therefore evaluated the performance of the crAss_2 assay as a human MST marker using DNA extracted from gull and dog faeces, from samples taken from the Elm Park stream, and from raw sewage taken from the influent of wastewater treatment plants and from the sewage system. All dog and seabird samples were negative for the crAss_2 marker. In contrast, all Elm Park Stream and raw sewage samples were positive. The sensitivity and specificity of the crAss_2 assay were therefore both 100% (Table 1). The levels of the crAss_2 marker ranged from 6.30 × 10² to 2.87 × 10⁵ gc/100 ml in samples from the Elm Park stream and from 1.39 × 10⁵ to 4.9 × 10⁷ gc/100 ml in raw sewage samples (Fig. S1).
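The 100% figures follow from the standard definitions of sensitivity (true positives divided by all target-host samples) and specificity (true negatives divided by all non-target samples). The sketch below applies those definitions to the sample counts reported here; the stream sample count of 20 is inferred from the total of n = 80 rather than stated directly, and the exact procedure of Gawler et al. (2007) is not reproduced.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of human-impacted samples that test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-human samples that test negative."""
    return true_neg / (true_neg + false_pos)


# Human-impacted samples: 20 sewage + 20 stream (stream count inferred
# from n = 80 in total); all were positive for crAss_2.
human_positive, human_negative = 40, 0
# Animal samples: 20 gull + 20 dog; all were negative for crAss_2.
animal_negative, animal_positive = 40, 0

crass2_sensitivity = sensitivity(human_positive, human_negative)
crass2_specificity = specificity(animal_negative, animal_positive)
```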

Correlation of crAssphage marker with human marker HF183 during a rain event
Significant rainfall events frequently cause a rapid increase in faecal indicator levels in bathing waters due to an increased discharge of faecal contamination from rivers. We therefore decided to evaluate the crAss_2 marker in marine bathing waters in conjunction with E. coli, intestinal enterococci, somatic coliphages and the human HF183 marker. The compliance point at Merrion Strand was sampled at 30-minute intervals over a 12-hour tidal cycle following a rainfall event, with the sampling sequence commencing 3 h after the rainfall event and coinciding with high tide.
E. coli and intestinal enterococci levels decreased by two orders of magnitude, from 3.94 × 10⁴ and 3.88 × 10⁴ cfu/100 ml, respectively, at high tide following the rain event to 2.5 × 10² and 1.21 × 10² cfu/100 ml. A similar trend was observed for somatic coliphages, which decreased from 3.84 × 10³ to 3.27 × 10¹ pfu/100 ml (Fig. 2a).
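The size of these decreases can be expressed as a log₁₀ reduction, i.e. the log₁₀ of the ratio between the highest and lowest reported levels. The short sketch below does this for the three indicators using the values quoted in the text; the variable and function names are ours.

```python
import math


def log10_reduction(start, end):
    """Log10 of the ratio between the starting and final concentration."""
    return math.log10(start / end)


# Highest and lowest levels over the tidal cycle, as reported in the text
# (cfu/100 ml for the bacteria, pfu/100 ml for somatic coliphages).
levels = {
    "E. coli": (3.94e4, 2.5e2),
    "intestinal enterococci": (3.88e4, 1.21e2),
    "somatic coliphages": (3.84e3, 3.27e1),
}

reductions = {name: log10_reduction(hi, lo) for name, (hi, lo) in levels.items()}
```

The largest of the three values, for intestinal enterococci, is approximately 2.5 log₁₀, the upper bound quoted for within-day variation.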
All samples (n = 24) throughout the tidal cycle were positive for both the crAss_2 and HF183 markers, indicating the presence of human faecal pollution (Fig. 2b). Both markers decreased by two orders of magnitude over the tidal cycle, from 6.94 × 10⁴ to 3.03 × 10² gc/100 ml for crAss_2 and from 8.96 × 10⁴ to 1.70 × 10³ gc/100 ml for HF183, and were significantly correlated (ρ = 0.84, Fig. 3). Interestingly, both crAss_2 and HF183 were correlated, albeit to a lesser extent, with the levels of E. coli, intestinal enterococci and somatic coliphages (Fig. S2). All correlations were statistically significant (p < 0.0001).
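For readers unfamiliar with the statistic, Spearman's ρ is the Pearson correlation of the ranks of the two series, so it measures whether two markers rise and fall together without assuming a linear relationship. The following is a minimal pure-Python sketch of that calculation (average ranks for ties); the study itself used GraphPad Prism, and the marker values below are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical marker levels (gc/100 ml) for five samples.
crass2 = [6.9e4, 2.1e4, 8.0e3, 1.5e3, 3.0e2]
hf183 = [9.0e4, 1.0e4, 3.0e4, 1.7e3, 2.5e3]
rho = spearman_rho(crass2, hf183)
```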

crAssphage discriminated diffuse pollution in a bathing area in high tide conditions
The spatial distribution of human and seabird faecal contamination was examined along a transect perpendicular to the shoreline and intersecting the compliance point at high tide. Surprisingly, faecal indicators were not homogeneously distributed along the transect but increased strongly, from 2.38 × 10² and 3.91 × 10¹ cfu/100 ml for E. coli and intestinal enterococci, respectively, at the shoreline to maximum values of 4.94 × 10³ and 6.40 × 10² cfu/100 ml 1 km offshore. Somatic coliphage levels increased from 1.40 × 10¹ to 9.75 × 10² pfu/100 ml (Fig. 4a). The faecal indicator levels subsequently declined to 30 cfu/100 ml 2.25 km offshore. Peak levels of faecal indicators coincided with a shallow seabed, suggesting a submerged sandbank (Fig. 4c). Both the human HF183 and crAss_2 markers remained constant throughout the transect, suggesting a homogeneous distribution of human faecal contamination along the transect. In contrast, the levels of the gull marker increased by at least one order of magnitude 1 km offshore, coinciding with the maximum levels of E. coli and intestinal enterococci. The dog marker was not detected (Fig. 4b).

Detection of crAssphage during routine bathing water monitoring
The performance of the crAss_2 marker was further evaluated in compliance point samples over an 18-month period and was analysed in conjunction with the levels of E. coli, intestinal enterococci, somatic coliphages and the human HF183 marker. Faecal indicator levels in compliance point samples varied by up to four orders of magnitude, whereas the crAss_2 and HF183 markers varied by one order of magnitude (Fig. 5). In most samples from both Merrion and Sandymount Strands, the level of the HF183 marker was significantly higher than that of the crAss_2 marker (Mann-Whitney test, p = 0.0017 and 0.0015 for Merrion and Sandymount Strands, respectively).
The crAss_2 and the HF183 markers were not present in all compliance point samples, reflecting the strong degree of bathing water quality variability at these sites. In general, however, Merrion Strand was more polluted than Sandymount Strand (Fig. 5). This is reflected in the higher percentage of samples that tested positive for the crAss_2 or HF183 markers. At Merrion Strand, 62.5% and 77.5% of samples were positive for the crAss_2 and HF183 markers, respectively, compared with 52.5% and 62.5% at Sandymount Strand. In 77.5% (Merrion) and 75% (Sandymount) of the samples, the two markers were either both above or both below the detection limit. The remainder of the samples (24%) was positive for one marker only (Table 2). Interestingly, the correlation between the crAss_2 and HF183 markers was less pronounced in compliance point samples (ρ = 0.53, Fig. 6) than in samples taken after the recent rainfall event in the transect study.
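The overall co-occurrence of 76% can be recovered from the per-strand agreement figures. The sketch below assumes that the 80 compliance samples were split evenly between the two strands (40 each), which is consistent with the reported percentages but is not stated explicitly; the variable names are ours.

```python
# Assumption: 80 compliance samples split evenly between the two strands.
samples_per_strand = 40

# Fraction of samples in which crAss_2 and HF183 were both above or both
# below the limit of detection, as reported in the text.
agreement = {"Merrion": 0.775, "Sandymount": 0.75}

# Convert each fraction back to a whole number of concordant samples.
concordant = sum(round(f * samples_per_strand) for f in agreement.values())
total_agreement = concordant / (samples_per_strand * len(agreement))
discordant_fraction = 1 - total_agreement
```

Under this assumption 61 of the 80 samples are concordant, which rounds to the 76% co-occurrence reported and leaves roughly 24% of samples positive for one marker only.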

Discussion
Among recently described MST markers, the Bacteroides bacteriophage crAssphage has been proposed as a potential marker of human faecal pollution for use in bathing water management. This study aimed to assess the implementation of a crAssphage marker (crAss_2) for marine bathing water quality monitoring in an urban environment that is primarily impacted by human pollution, but is also subject to dog and seabird fouling (Reynolds et al., 2020). The crAss_2 assay showed 100% sensitivity and specificity for these sources of pollution and is therefore useful in correctly identifying waters impacted by human waste. A previous study, which did not include dog and seabird faeces, showed some cross-reactivity with porcine, bovine and chicken faecal samples (García-Aljaro et al., 2017). These were not tested in the present study as these animals are not present in this urban environment. The lack of cross-reactivity of the crAss_2 assay with dog and bird faeces is in agreement with the results of the CPQ_056 and CPQ_064 crAssphage assays (Ahmed et al., 2018b, 2018a; Stachler et al., 2017). Field testing of the crAss_2 marker in bathing waters during a 12-hour tidal cycle following a rainfall event showed that the levels of the crAss_2 marker and the HF183 marker were closely correlated and present at comparable levels, pointing towards recent human faecal pollution that, in this case, most likely emanated from the Elm Park stream, the outfall of which discharges in close proximity to these bathing waters and which has been shown to be highly polluted following rainfall (Reynolds et al., 2020). Similar observations have been reported in estuarine waters in Australia receiving urban stormwater runoff (Ahmed et al., 2018a). Short-term pollution of this type can necessitate beach closure as required by the EU Bathing Water Directive, presenting significant challenges to competent authorities charged with managing recreational bathing waters.
Our study shows that the crAss_2 marker is effective for monitoring and identifying short-term episodes of faecal pollution that might arise from any one, or a combination, of the rivers and streams that can have increased faecal pollution loads at times of heavy rainfall. Dublin Bay is a UNESCO Biosphere that is home to a large population of seabirds that forage in shallow nearshore waters. We made use of the localised presence of seabirds to evaluate the ability of the gull, HF183 and crAss_2 markers to discriminate between various pollution sources. A sharp increase in faecal indicators 1 km offshore coincided with an increase in the gull marker, whereas both the HF183 and crAss_2 markers remained unchanged. This confirms that the crAss_2 assay does not respond to bird fouling. The results from this study demonstrate that human faecal contamination along the tested transect is homogeneously distributed, while in contrast, seabird contamination is more localised and remains restricted to a relatively small area.
We observed a 100-fold variation in faecal indicator levels during a 12-hour tidal cycle. Such variability is consistent with that reported by Wyer et al. (2018) for a bathing area in the United Kingdom. Furthermore, inputs of faecal matter from gulls resulted in a 1000-fold variation in faecal indicator levels within a localised area occupied by these birds. The analysis of both the temporal and spatial distribution of faecal indicators and MST markers undertaken in this study clearly demonstrates that the current practice of classifying bathing waters based on a limited number of compliance samples taken during each bathing season over a four-year period, as required by the EU Bathing Water Directive, is potentially flawed. As suggested by Wyer et al. (2018), developing models for the daily prediction of water quality in designated bathing waters can improve management of these waters, but such models need to be underpinned by intensive sampling regimes.
The performance of the crAss_2 marker was further evaluated over an 18-month period, during which compliance point samples from Merrion and Sandymount Strands were analysed. The two markers displayed a high degree of co-occurrence, with 76% of samples testing either both positive or both negative for the HF183 and crAss_2 markers. This percentage is slightly lower than that reported by Ahmed et al. (2020), in which the two markers showed 85% agreement on presence or absence in estuarine waters. This discrepancy may be attributed to the different persistence rates of these host-specific markers in the environment. A previous study reported a T₉₀ value of 1 day for HF183 in seawater and 3 days for crAssphage, suggesting that HF183 could be useful for detecting recent contamination in bathing water (Ahmed et al., 2019b, 2019c). In fresh water environments, crAssphage T₉₀ values of 2 and 10 days have been reported for summer and winter conditions, respectively (Ballesté et al., 2018). Correlation between the HF183 and crAss_2 markers was less pronounced at low levels of faecal contamination, which may be related to differences in decay rates as noted above, or to differences in the level of each marker in human faeces. This observation emphasises the need to deploy more than one MST marker to identify and discriminate between human and animal pollution, as suggested by others (Ahmed et al., 2020; Blanch et al., 2006; Gourmelon et al., 2010).
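To make the effect of different persistence rates concrete, the sketch below compares the fraction of each marker remaining over time using the seawater T₉₀ values quoted above (1 day for HF183, 3 days for crAssphage). It assumes simple first-order (log-linear) decay, in which each T₉₀ interval removes 90% of the remaining signal; real decay curves may deviate from this, and the function name is ours.

```python
def remaining_fraction(days, t90_days):
    """Fraction of the initial signal left after `days`, assuming
    first-order decay with a 90% reduction every `t90_days`."""
    return 10 ** (-days / t90_days)


T90_HF183 = 1.0       # days, seawater value quoted in the text
T90_CRASSPHAGE = 3.0  # days, seawater value quoted in the text

# Fraction of each marker remaining three days after a pollution event.
hf183_left = remaining_fraction(3.0, T90_HF183)
crass_left = remaining_fraction(3.0, T90_CRASSPHAGE)

# How much more of the crAssphage signal persists relative to HF183.
persistence_ratio = crass_left / hf183_left
```

Under these assumptions HF183 drops to 0.1% of its initial level after three days while crAssphage is still at 10%, so aged contamination could leave a sample positive for crAss_2 but below the detection limit for HF183.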

Conclusions
We have shown that the performance of the crAss_2 marker in a marine environment adjacent to an urban environment is closely correlated to that of the human HF183 marker and is effective in discriminating between human and animal pollution. The crAss_2 assay is therefore a useful addition to the MST toolbox to identify short-term human faecal pollution events. Correlation between the two human markers was less pronounced at low levels of faecal contamination, suggesting that the use of these two human markers is advisable under these conditions. This study revealed 1-2.5 log₁₀ within-day variation in the levels of E. coli and intestinal enterococci depending on both the time of day and location of sampling. As was noted by others, current practices that depend on a single sample during a day to determine bathing water quality are therefore inadequate.

Figure caption (partial): In the boxplots the lower hinge represents the 25% quantile, the upper hinge the 75% quantile, and the centre line the median. The whiskers are drawn down to the 10th percentile and up to the 90th. Points below and above the whiskers are drawn as individual symbols. Only values above the limit of quantification of the crAss_2 (Sandymount n = 21, Merrion n = 25) and HF183 (Sandymount n = 25, Merrion n = 31) assays are shown. I.E., intestinal enterococci; SOMCPH, somatic coliphages.

Table 2
Pairwise comparison of the detection of the human-associated HF183 and crAss_2 markers in bathing water samples collected at two strands in Dublin Bay. Shown are the percentages of samples in which both markers were either above or below the limit of detection (LOD), and in which one of the markers was above and the other below the limit of detection. The total agreement was calculated on the sample number (n = 80).

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.