Seasonality of enteric viruses in groundwater-derived public water sources

We investigated the seasonal prevalence of seven enteric viruses in groundwater-derived public water sources distributed across the dominant aquifers of England. Sampling targeted four periods in the hydrological cycle with typically varying microbial risks, as indicated using a decade of Escherichia coli prevalence data. Viruses were concentrated onsite by filtration of raw groundwater, and extracted nucleic acid (NA) was amplified by qPCR or RT-qPCR. Seven out of eight sources, all aquifers, and 31% of samples were positive for viral NA. The most frequently detected viral NA targets were Hepatitis A virus (17% samples, 63% sites), norovirus GI (14% samples, 38% sites), and Hepatitis E virus (7% samples, 25% sites). Viral NA presence was episodic, being most prevalent and at its highest concentration during November and January, the main groundwater recharge season, with 89% of all positive detects occurring during a rising water table. Seasonal norovirus NA detections matched its seasonal incidence within the population. Viral NA is arriving with groundwater recharge, as opposed to persisting for long periods within the saturated zone. Neither total coliforms nor E. coli were significant predictors of viral NA presence-absence, and there was limited co-occurrence between viruses. Nevertheless, a source with an absence of E. coli in regularly collected historical data is unlikely to be at risk of viral contamination. To manage potential groundwater viral contamination via risk assessment, larger scale studies are required to understand key risk factors, with the evidence here suggesting viral NA is widespread across a range of typical microbial risk settings.


Introduction
Groundwater supplies around half of all drinking water globally (WWAP, 2009), including 75% of European Union (EU) inhabitants (European Commission, 2008). Furthermore, it comprises 43% of the total consumptive irrigation water use (Siebert et al., 2010). Groundwater is generally considered to be lower microbial risk than surface water sources due to physical, chemical and biological attenuation in the unsaturated zone (Pedley et al., 2006) and supplies are often perceived to be potable (Jones et al., 2005). Consequently, untreated or undertreated groundwater is commonly consumed across high-income countries from private and community supplies (Hynds et al., 2013, 2014; Wallender et al., 2014), as well as large municipal supplies in places, e.g. Christchurch, New Zealand (Pang et al., 2017). Nevertheless, outbreaks of disease related to enteric pathogens in groundwater are frequently reported (Murphy et al., 2017; Wallender et al., 2014). In a 12-year study of waterborne illness relating to drinking water in the USA, Reynolds et al. (2008) attributed 76% of reported outbreaks and 33% of all illness to groundwater consumption.
Enteric viruses are of a particular concern in groundwater. A systematic review of North American studies identified that enteric viruses were more frequently identified than either bacterial or protozoan pathogens in groundwater (Hynds et al., 2014). Viruses are the smallest enteric pathogens and can penetrate all common aquifer matrices.
Viruses can also survive for extended periods in the subsurface, due to favourable conditions (Pinon and Vialette 2018), such as a perennially low water temperature in temperate regions and an absence of sunlight. For example, Charles et al. (2009) demonstrated adenovirus and poliovirus remained infectious in groundwater for 364 and 140 days, respectively, and Seitz et al. (2011) showed norovirus capsids remained intact in groundwater for at least three years.
There has been increasing appreciation that the monitoring of bacterial indicator organisms is unsuitable to assure that drinking water sources are free from non-bacterial waterborne pathogens, such as enteric viruses (WHO, 2017). Consequently, many countries have responded by adopting risk-based approaches for individual water sources, often based around water safety plans (WSP) (WHO, 2009, 2017). These approaches evaluate potential hazards within the catchment of each source, assess their risks, and determine and validate control measures. The effective risk assessment and control of enteric viruses in groundwater sources requires evidence concerning their source, transport, and persistence. However, there is limited evidence concerning these properties for viruses, compared with other pathogens, because viruses have historically been difficult and expensive to analyse for (Hunt and Johnson, 2017) and there is a lack of association between viruses and bacterial indicator organisms (Fout et al., 2017; Wu et al., 2011).
Studies investigating the occurrence of viruses in groundwater have traditionally been heavily focussed within North America (e.g. Abbaszadegan et al. 1999, Allen et al. 2017, Borchardt et al. 2004, Borchardt et al. 2012, Hunt et al. 2010, Stokdyk et al. 2020, Yates et al. 1985). Groundwater virus studies have been undertaken elsewhere, but these tend to be locally focused investigations of a specific virus associated with a known disease outbreak (Kauppinen et al., 2018; Lugoli et al., 2011; Shin et al., 2017), with a lack of spatial-temporal studies investigating a range of viral targets.
In the UK, there is minimal understanding of viruses in groundwater (Gregory et al., 2014). Historical work demonstrated the effective attenuation of viruses in the Chalk unsaturated zone at wastewater recharge sites (Baxter et al., 1981), at a time when 300 ML/d of sewage was infiltrated directly into the ground. Two other projects have investigated viruses in five multi-level piezometers and confirmed enterovirus, norovirus and coliphage contamination of the sandstone aquifer beneath the cities of Birmingham and Nottingham (Powell et al., 2003), and Doncaster (Morris et al., 2006). As a precursor to our field study, we sent questionnaires to all twenty water companies in England and Wales in 2018 to assess current virus monitoring in the water industry. The seventeen companies that responded confirmed there is no monitoring for viruses; instead, viral risks are managed by WSPs.
In our study, we examined enteric viruses in groundwater-derived public water sources across England. We selected eight sites within the most important water supply aquifers, which are also important water resources in many parts of Europe. The sites are of varying microbial risk, according to historical faecal indicator organism data, and in contrasting risk settings (rural/urban, depth below the surface, overlying protective geological cover). We investigated the seasonal occurrence of seven virus targets in raw groundwater, including Hepatitis E virus which is an emerging threat in high-income countries where the role of transmission through water is unclear (Fenaux et al., 2019;Wang et al., 2020). We relate viral nucleic acid (NA) occurrence and concentration to the hydrological conditions and evaluate the use of faecal indicator organisms (total coliforms and E. coli) to predict viral NA presence-absence.

Selection and hydrogeology
Eight sites (1-8) were selected to investigate viral occurrence across the major aquifers of England (Fig. 1A). These sites were distributed across the country with a focus towards aquifers that are most utilised for water supply. Where multiple sites were chosen within the same aquifer, they were located in separate geological basins, where aquifer properties are more likely to be contrasting. Three sites (1, 2, 5) were selected in the Cretaceous Chalk. The Chalk is the principal aquifer providing more than half of total licensed groundwater abstraction in England and Wales (Monkhouse and Richards, 1982) and is the most important source of freshwater in north-western Europe (Downing et al., 1993). The Chalk is a dual porosity soft white limestone comprising a low permeability matrix intersected by vertical joints and horizontal fractures. Groundwater recharge is generally considered to occur via piston displacement through the matrix, with episodic movement through joints/fractures during periods of intense rainfall occurring at times of high soil moisture content (Ireson and Butler 2011;Sorensen et al., 2015b). Water movement through the saturated zone is mainly through joints/fractures. The Chalk is more karstified in some locations and groundwater velocities from 0.5 to 6.8 km/day have been reported from sinking streams to discharge locations (Maurice et al., 2010). Sites 1 and 2 are both located in areas of the Chalk where karstic features are mapped abundantly at the surface and tracer tests have demonstrated travel times from these features to the sources in under 24 h. There is no evidence that Site 5 is connected to surface karst, although features are mapped in the vicinity.
Two sites (7, 8) were selected in the Permo-Triassic sandstone: the second most important aquifer accounting for around a quarter of licensed abstraction in England and Wales (Monkhouse and Richards, 1982), and the aquifer also covers large parts of west and central Europe (Crampon et al., 1996). The Permo-Triassic sandstones are predominantly interlayered sequences of pebbles, sands, and silts of varying cementation. Groundwater movement is typically through the matrix, although fracture flow can provide preferential flow paths on a local, or even regional, scale (Price et al., 1982).
A further two sites (4, 6) were located on Jurassic limestones that comprise the third most important source of groundwater in the UK (Neumann et al., 2003). The Jurassic limestones are relatively thin beds where groundwater flow is almost entirely through fractures/joints. Sites 4 and 6 are sited on the Jurassic Inferior Oolite Group and Lincolnshire Limestone Formation, respectively. Site 4 is 50 m upgradient from a spring, known to be hydraulically connected to the borehole, and historical records report it is drilled into an "underground river". Site 6 is protected by 10 m of low-permeability deposits and receives recharge from the aquifer outcrop at least 1 km to the West that is likely to be through the soil given the lack of proximal sinking streams (Bottrell et al., 2000).
Site 3 is a Carboniferous limestone spring in the Yorkshire Dales where groundwater flow is almost entirely along solution-enhanced fractures, including caves (Worthington and Ford, 2009). The Carboniferous limestone has the largest fractures of any aquifer in the UK and is the endmember in terms of karstic features and behaviour (Atkinson and Smart, 1981).
The sites include two springs (2, 3) and six boreholes (1, 4-8) (Table S1). The rest water levels at the boreholes are shallow (<10 m from the surface), although the screens start as deep as 61 m below the surface (Site 7). All sites, with the exception of Site 8, have overlying protective superficial cover of between around 2 and 15 m, which is either sand/gravel (1, 2, 7) or silt/clay dominated (3, 4, 5, 6). Land use within 500 m of the sites can be classified as either rural (1, 3, 4, 6, 7), where enteric viral sources could comprise leaking septic tanks, leaking low density sewers and agricultural sources, or urban (2, 5, 8), where the predominant viral source would be a leaking high density sewerage network. Site 1 comprised two separate boreholes (a and b), 8 km apart, that are similar from a hydrogeological and microbial risk perspective (Fig. 1B). Site 1a was sampled at the beginning of the study before the site was taken out of operation, after which Site 1b was sampled.

Historical bacterial indicator organism data
The sites are ordered 1-8 according to the likelihood of E. coli (Fig. 1B) and total coliform (Fig. S1) presence over ten years of collated water company data collected between January 2010 and December 2019 at approximately weekly to monthly intervals (n > 184). Sites 1a and 1b are the highest microbial risk sites with E. coli and total coliforms present in >96% and >99% of samples, respectively. The lowest risk sites (7, 8) are those in the Permo-Triassic sandstone. Three percent of samples have tested positive for total coliforms at Site 7, with a single detection of E. coli. There was not a single positive detection at Site 8, which is located in the centre of a historic town, with no protective superficial deposits overlying the aquifer, and a shallow water table (Table S1). Small pore throats and slow flow in the sandstone are likely to impede live bacterial transport over appreciable distances, which would explain why no indicator organisms have been detected at Site 8. Occasional positive indicator organism detects at >61 m depth at Site 7 would only be possible if fractures are providing preferential flow horizons within the environs of the site.

Hydrological observations
Hydrological observations comprising spring discharge and groundwater level data were collated from January 2010 until February 2020 (Table S1). Spring discharge data were collated for Sites 2 and 3 from water company records. It was not possible to utilise groundwater level observations from the borehole sites due to pump duty cycles, pump rotation, and longer periods of site shutdown. Consequently, regionally representative groundwater level data were retrieved from observation boreholes within 2-40 km of the sites from a combination of the Environment Agency (environmental regulator in England) and water company observation boreholes (Table S1).

Sample rounds
Sampling was conducted in four rounds (R1-4) targeting typical distinct periods of differing hydrological conditions and microbial risk (Fig. 2) in 2019 (R1-3) and 2020 (R4). R1 was selected as the period of typically lowest microbial risk, which also generally coincides with peak, or near-peak, groundwater levels/spring discharge. R2 targeted midsummer when microbial risks tend to be higher than R1 and groundwater levels/spring discharges are typically falling. The summer corresponds to a period where extreme rainfall, often convective, is more common as a result of higher air and sea temperatures (Dunstone et al., 2018; Hand et al., 2004; Jones et al., 2013). Such intense rainfall events can lead to significant, sporadic faecal contamination of public water sources; for example, the 28th June 2012 supercell and other large storms in June and July 2012 (Hannaford and Parry, 2012; Marsh and Parry, 2012; Parry et al., 2013) align with peak E. coli counts at Sites 2, 6, and 7. R3 was conducted towards the beginning of the hydrological year, which starts in October, when microbial risks are greatest. At this time of year, groundwater levels/spring discharges are low, but soil moisture deficits are typically being overcome and the annual recharge season is commencing. R4 was undertaken four months into the hydrological year, when groundwater levels/spring discharges have been rising rapidly and microbial risks remain elevated. These descriptions represent mean conditions across all sites and site-to-site variability is shown in Fig. S2.
Samples were not obtained from Site 1a in R3 and R4, Site 4 in R2, and Site 6 in R3. Site 1a was not in operation during R3 and R4 due to elevated turbidity and Site 1b was sampled in R4 instead. Site 6 was shut down in R3 because of elevated levels of bacterial indicator organisms. Site 4 had been sampled in R2, but the filter cartridge was mistakenly destroyed before extraction.

Virus sample collection and analysis
Virus extraction, concentration, and analysis followed methodology validated by the AQUAVALENS project (Gunnarsdottir et al., 2020 and references therein). Virus particles were concentrated at each public water supply by filtering through hollow fibre Asahi Polysulfone filters (Asahi Kasei, Oita, Japan). In sampling rounds R1 and R2, Rexeed-25A filters were used before the global unavailability of the units forced a transition to Leoceed-21H filters in rounds R3 and R4. The two filters are identical with the exception of the housing diameters and the membrane surface areas, which are 2.5 and 2.1 m² for the Rexeed-25A and Leoceed-21H, respectively.
Four filter units were used in parallel for each sample to maximise the potential flow rate. These units were housed in a bespoke sampling rig that contained analogue pressure gauges and ball valves to monitor and regulate water pressure, respectively, to ensure the water pressure did not exceed the operational pressure limits of the filters. Total flow was monitored following filtration on the rig using a flow meter (Digiflow 6710 M, Savant Electronics Inc., Taiwan). The sampling rig was sterilised between uses by re-circulating boiling water with a peristaltic pump (410, Solinst, Canada) for one hour. Laboratory trials confirmed no carryover of target viral NA prior to deployment. Furthermore, sequential field samples from our study were not positive for identical viral NA, confirming no cross-contamination.
On arrival at each site, the raw water sampling tap was sterilised with >99.5% ethanol and the tap was run to waste for one minute, before connecting the sampling rig. The aim was always to filter 1000 L of water, though this was not always possible where the water pressure was insufficient, particularly at gravity-fed springs (2, 3), or where site operations restricted the available filtration time. A median of 886 L of water was filtered with a range of 235-1039 L, which took between 2.75 and 6 hours. Following completion, the filters were transported in a cool box to the laboratory where they were stored at 4 °C for up to 48 h before elution.
At the laboratory, viral NA was eluted from each filter by back-flushing with 250 mL of a sterile buffer containing 0.001% Antifoam, 0.01% Tween 80, and 0.01% Sodium hexametaphosphate solution.
Approximately 400-500 mL of sample was recovered at this primary concentration stage. Secondary concentration involved polyethylene glycol BioUltra 8000 (PEG) precipitation and centrifugation. 280 mL splits of eluted sample were mixed with 100 mL of 50% (w/v) PEG precipitation buffer and 10 mL of 37.5% (w/v) solution of beef extract, both in sterile deionized water, and incubated overnight at 4 °C.
Where a sample split was <280 mL, the volume was made up to 280 mL with additional elution buffer. The sample split was then centrifuged at 12,000 × g for 40 min at 4 °C to produce a pellet. The supernatant was aspirated off and the pellets from each sample split and all four filters were combined and suspended in 1.5-4 mL resuspension buffer (0.001% Antifoam, 0.01% Tween 80 solution in phosphate buffered saline (PBS)). The volume of the suspension concentrate was recorded, before it was transferred to a 7 mL plastic Bijou bottle and stored at −80 °C until NA extraction. The NA extraction was performed on 700 µL aliquots of concentrates in 1.5 mL Eppendorf tubes. Samples were lysed chemically using 650 µL of UNEX lysis buffer (Microbiologics®, USA) and 50 µL of Proteinase K (>600 mAU/mL), at 56 °C for 1 h in a water bath. The supernatant was transferred to a clean Eppendorf tube containing 0.5 g of 0.1 mm glass beads and 0.5 g of 0.7 mm zirconia beads and vortexed for 15 s. The samples were processed for 2 × 30 s in a FastPrep-24™ 5G instrument (MP Biomedical, USA) at a speed setting of 6.0 m/s, before centrifuging at 10,000 × g for 30 s. The supernatant was transferred to a silica nucleic acid extraction column (Qiagen, Germany) for purification and the column was washed with 500 µL of 100% ethanol, then 500 µL of 70% ethanol, with centrifuging at 10,000 × g for 60 s and the filtrate being discarded between each step. Finally, viral NA was eluted by adding 50 µL of nuclease-free water to the column and centrifuging at 13,000 × g for 60 s. This final step was repeated twice and the two volumes were combined and samples stored at −20 °C for up to 3 months or at −80 °C for later use.
Quantitative polymerase chain reaction (qPCR) or reverse transcriptase qPCR (RT-qPCR) was performed on the extracted NA using a CFX96 (Bio-Rad, USA). Seven viruses that are commonly observed in the environment were targeted using validated commercially available kits: Hepatitis A (HAV), Norovirus (NoV) GI, NoV GII (Q standard ceeramTools™, BioMérieux, France); Hepatitis E (HEV), Human Adenovirus F (HAdV-F), Rotavirus A (RV-A) (Genetic PCR Solutions™, Spain); and Enterovirus (EV) (genesig®, Primerdesign™ Ltd, UK). Further details for the BioMérieux and Primerdesign kits are available from ISO 15216-1:2017 (International Organization for Standardization, 2017) and Dierssen et al. (2008), respectively. All kits are TaqMan™ probe-based assays that target specific amplicons, hence melt curve analysis was not required. Details of the cycling conditions are provided in Table S2. Where a reverse transcription was required, this was undertaken as part of a one-step reaction before the PCR cycle. Each viral target was quantified on a single 96-well plate, each containing all samples, positive and negative controls. In all instances, 5 µL of template was used.
Positive controls consisted of the reference standards provided to generate the standard curves (Fig. S3), with BioMérieux providing additional positive controls for their kits and for EV, which all amplified (Table S4). Negative controls were performed on each batch of the extraction and pellet re-suspension buffers used, comprising six in total, for each target. Nuclease-free water was also used as a negative control during reverse transcription and amplification, with two controls analysed for each target, in addition to BioMérieux providing negative controls with their kits. None of the negative controls amplified within the respective maximum number of cycles. No inhibition tests were performed on these low turbidity groundwater samples, with none of the samples being coloured after extraction.
Figure caption: Monthly mean standardised (min-max) groundwater level/discharge and mean probability of E. coli ≥ 1 cfu/100 mL across the eight sites. In the mean calculation across all sites, site 1 was represented as the mean of sites 1a and 1b to avoid bias. All means calculated using hydrological observations (Table S1) and E. coli data collected by the respective water companies over ten years commencing from January 2010.
The BioMérieux, Genetic PCR Solutions, and Primerdesign kits had maximum amplification cycles of 45, 40, and 50, respectively. Positive amplifications observed >5 cycles from the respective cycle limits were deemed true positives, given negative controls did not amplify, in line with BioMérieux and Primerdesign instructions, and Bustin and Nolan (2004). Starting quantities (Sq) were quantified by comparing the quantification cycle (Cq) values (Table S3) to standard curves (Fig. S3) produced by serial dilution of reference standards provided by the manufacturers, in line with their instructions. Two positive samples had Sq values of <1 gene copy per reaction for HAV NA, although the Cq values were less than the positive controls (Tables S3 and S4). Sq values were then converted to gene copy number per litre of filtered groundwater assuming 100% efficiency during extraction and treatment. Therefore, the reported concentrations should be considered minimum values, as losses would have occurred during extraction and treatment.
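The quantification steps described above can be illustrated with a minimal Python sketch. It is not the authors' workflow: it assumes a linear standard curve of the form Cq = slope × log10(Sq) + intercept, and it back-calculates copies per litre through the volumes named in the text (template, combined eluate, 700 µL aliquot, recorded concentrate volume, litres filtered) with no losses. The function names, the curve parameters, and every numeric value in the example are hypothetical.

```python
import math


def is_true_positive(cq, max_cycles, margin=5.0):
    """Return True when amplification occurs more than `margin` cycles
    before the kit's cycle limit; cq=None means no amplification."""
    return cq is not None and cq < max_cycles - margin


def starting_quantity(cq, slope, intercept):
    """Invert an assumed linear standard curve
    Cq = slope * log10(Sq) + intercept to get copies per reaction."""
    return 10 ** ((cq - intercept) / slope)


def copies_per_litre(sq, template_ul, eluate_ul, aliquot_ul,
                     concentrate_ul, litres_filtered):
    """Scale copies per reaction to copies per litre of filtered water,
    assuming 100% recovery at every step (so a minimum estimate)."""
    copies_in_eluate = sq / template_ul * eluate_ul
    # The extracted aliquot is only part of the resuspended concentrate.
    copies_in_concentrate = copies_in_eluate * concentrate_ul / aliquot_ul
    return copies_in_concentrate / litres_filtered


# Hypothetical curve and sample values, for illustration only.
sq = starting_quantity(cq=31.36, slope=-3.32, intercept=38.0)
conc = copies_per_litre(sq, template_ul=5, eluate_ul=100, aliquot_ul=700,
                        concentrate_ul=2800, litres_filtered=1000)
```

The 100 µL eluate in the example assumes two combined 50 µL elutions; the actual combined volume should be taken from laboratory records.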

Indicator organism sample collection and analysis
A sample for total coliforms and E. coli was collected after the virus sample in a 1 L PET sample bottle (VWR, Cat No. 331-0269) for the first 12 of the 29 samples (all of R1 and four of the seven R2 samples). These samples were analysed at the University of Surrey by the Colilert-18/Quanti-Tray method (ISO 9308-2:2012) before closure of the laboratory by the University.
Subsequently, the project was reliant upon the routine raw water sampling and analysis undertaken by the water companies (SCA 2009). Of the 17 water company samples, five were taken on the same day, five within 18 h, and three within 42 h of the virus sample. Of the remaining samples, three were taken 3-5 days apart from the virus sample, but these were at Sites 7 and 8 where indicator organisms are very rarely or never recorded and were all negative. Finally, two indicator organism samples were positive four days either side of the R3 virus sample at Site 3.

Statistical analysis
All analysis was conducted in R version 3.4.0 (R Core Team 2017) and figures were produced using the ggplot2 package (Wickham, 2016). Bacterial indicator organisms were assessed as predictors of viral NA presence-absence using logistic regression models developed using the core command glm. Models were evaluated in terms of significance, true-positive rate (sensitivity), false-positive rate, true-negative rate (specificity), and false-negative rate.
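The evaluation metrics named above can be sketched as follows. This is an illustrative Python example rather than the R code used in the study; it treats indicator presence as the predicted class and viral NA presence as the observed class, and the paired observations are toy values, not study data. The function name `diagnostic_rates` is hypothetical.

```python
def diagnostic_rates(indicator_present, virus_present):
    """Compute sensitivity, specificity, and their complements from
    paired presence-absence observations (truthy = present)."""
    pairs = list(zip(indicator_present, virus_present))
    tp = sum(1 for i, v in pairs if i and v)          # indicator +, virus +
    fn = sum(1 for i, v in pairs if not i and v)      # indicator -, virus +
    fp = sum(1 for i, v in pairs if i and not v)      # indicator +, virus -
    tn = sum(1 for i, v in pairs if not i and not v)  # indicator -, virus -
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one virus-positive and one "
                         "virus-negative sample")
    return {
        "sensitivity": tp / (tp + fn),
        "false_negative_rate": fn / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (tn + fp),
    }


# Toy data for illustration only (1 = present, 0 = absent).
indicator = [1, 1, 0, 0, 1, 0, 0, 1]
virus = [1, 0, 1, 0, 1, 0, 0, 0]
rates = diagnostic_rates(indicator, virus)
```

Sensitivity and the false-negative rate sum to one, as do specificity and the false-positive rate, so reporting all four is a presentational choice.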
To contextualise the hydrological conditions at the point of sampling, monthly standardised groundwater indices (SGIs) (Bloomfield and Marchant, 2013) were calculated using the hydrological observations (Table S1). However, instead of the inverse normal distribution applied by Bloomfield and Marchant (2013), we employed an inverse uniform distribution between 0 and 1, so SGIs were more easily interpretable. Before estimating SGIs, gaps of three and two months in observations near sites 1 and 6 were infilled by linear interpolation. SGI was calculated by splitting data from each site into mean observations for each calendar month, these were ordered and assigned a rank, an inverse uniform cumulative distribution function was applied, and the normalised monthly indices were merged to form a continuous SGI time series. Therefore, a value of 1 refers to groundwater levels/spring discharge being at a monthly maximum over the ten-year period.
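The ranking procedure described above can be sketched in Python as follows. The mapping from rank to index, (rank − 1)/(n − 1), is an assumption chosen so that the monthly minimum maps to 0 and the monthly maximum to 1; the text does not state the exact plotting position used. The function name `uniform_sgi` and the example values are hypothetical, and tied values are simply ranked in year order.

```python
from collections import defaultdict


def uniform_sgi(monthly_means):
    """Rank-based standardised index per calendar month.

    monthly_means: list of (year, month, value) in chronological order.
    Returns a list of (year, month, index) in the same order, where the
    index is 0 for the lowest and 1 for the highest value recorded for
    that calendar month.
    """
    by_month = defaultdict(list)
    for year, month, value in monthly_means:
        by_month[month].append((value, year))

    index = {}
    for month, entries in by_month.items():
        entries.sort()  # ascending by value, ties broken by year
        n = len(entries)
        for rank0, (_, year) in enumerate(entries):
            # Assumed uniform mapping; a single observation gets 0.5.
            index[(year, month)] = rank0 / (n - 1) if n > 1 else 0.5

    # Merge the monthly indices back into one continuous time series.
    return [(y, m, index[(y, m)]) for y, m, _ in monthly_means]


# Hypothetical groundwater levels for two calendar months over three years.
series = [(2010, 1, 10.0), (2010, 7, 8.0),
          (2011, 1, 12.0), (2011, 7, 7.0),
          (2012, 1, 11.0), (2012, 7, 9.0)]
sgi = uniform_sgi(series)
```

Because each calendar month is ranked separately, the index expresses how unusual a level is for that time of year rather than its absolute height.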

Widespread viral nucleic acid prevalence
Target viral NA was detected at seven of the eight sites, in 31% of samples, and 7% of samples tested positive for multiple types (Fig. 3). HAV was the most frequently detected NA (17% samples, 63% sites), followed by NoV GI (14% samples, 38% sites), HEV (7% samples, 25% sites), and a single detection of HAdV-F. There were no positive detections of EV, NoV GII, or RV-A NA. There were no consistent patterns of co-occurrence amongst viral NA.
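As a consistency check on the percentages above, the short Python sketch below converts counts to rounded percentages. The per-target counts are not stated in the text; they are back-calculated here on the assumption of 29 samples and eight sites, and should be read as inferred values rather than reported data.

```python
def percent(count, total):
    """Percentage rounded half up to the nearest whole number."""
    return int(count * 100 / total + 0.5)


N_SAMPLES = 29  # total virus samples in the study
N_SITES = 8

# Inferred (sample count, site count) per target; assumptions, see lead-in.
inferred = {
    "HAV": (5, 5),
    "NoV GI": (4, 3),
    "HEV": (2, 2),
}

summary = {
    target: (percent(n_samples, N_SAMPLES), percent(n_sites, N_SITES))
    for target, (n_samples, n_sites) in inferred.items()
}
any_positive = percent(9, N_SAMPLES)  # nine positive samples overall
```

Half-up rounding is used deliberately, since Python's built-in `round` rounds 62.5 to 62 rather than 63.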
NoV GI was the only viral NA to be detected multiple times at the same site, occurring at forty times the concentration in R4 (January) compared with R1 (April), and was absent in between (R2 and R3). The two highest concentrations of viral NA relate to NoV GI, at up to 9.6 × 10³ copies L⁻¹ (Fig. 3; Table S5).
Seven sites were positive for viral NA on at least a single occasion (Fig. 3). Therefore, viral NA was present in all four aquifers, beneath both rural and urban land uses, and across a range of microbial risk settings (Figs. 3 and 1B). The lowest microbial risk location according to historic indicator organism data (Site 8), was the only location testing negative for viral NA throughout the study. HAV was the only viral NA detected in all four aquifers.
Viral NA was most prevalent during sampling rounds R3 and R4 (Fig. 3). Seven of the nine positive samples for viral NA and the only ones testing positive for multiple types of viral NA occurred in rounds R3 and R4. Furthermore, the highest five NA concentrations were all during these rounds.

Viral nucleic acid prevalence relates to groundwater recharge
Eighty-nine percent of positive samples for viral NA were associated with a rising groundwater level/spring discharge (Fig. 4). R1 samples were collected around peak groundwater level/spring discharge, with only a sample at Site 7 being positive for viral NA (Fig. 4). This positive sample was taken when the site was responding to recharge from several low-pressure weather systems bringing persistent heavy rainfall that resulted in local pluvial flooding. Site 2 contained the only positive detect on the falling limb of a hydrograph, occurring in R2. The majority of detections were in R3 and R4, after the main 2019/20 recharge season commenced consistently across the sites in late September.

Viral nucleic acid relationship to bacterial indicator organisms
Neither total coliforms (p-value 0.625) nor E. coli (p-value 0.562) were significant single predictors of the presence-absence of viral NA. Sensitivity for total coliforms and E. coli was 63 and 50%, respectively (Fig. 5). Specificity for total coliforms and E. coli was 48 and 62%, respectively (Fig. 5).
All sites (1-7) with previous evidence of total coliforms and E. coli (Fig. 1B) tested positive for viral NA in at least one sampling round. These sites all tested positive for total coliforms and all but Site 7 tested positive for E. coli between February 2019 and January 2020. Site 8 with no previous evidence of any coliforms showed no evidence of virus contamination. All bacterial indicator organism data are shown in Tables S5 and S6.

Hepatitis A
Hepatitis is a major waterborne disease resulting in fever, pain, malaise, diarrhoea, vomiting, and jaundice. HAV was associated with 50% of global hepatitis cases in 2008 (Bosch et al., 2008). In high-income countries, HAV infections are considered uncommon and typically considered to be imported from low-income countries where the virus is endemic, and sanitation and hygiene may be poor, although outbreaks in certain at-risk groups do occur (Carrillo-Santisteve et al., 2017). In 2018, Public Health England confirmed 452 cases and reported that cases have been falling from a peak incidence in 1990 of 7545 cases (PHE, 2019). The limited confirmed cases amongst the population raises the question of why HAV was the most commonly detected viral NA in our study.
Firstly, infections are frequently asymptomatic or subclinical, particularly amongst children, and are likely to be underreported (Matin et al., 2006; PHE, 2019). For example, a 2008 blood donor study in Southwest England demonstrated an incidence of 0.4% in 4503 assays (Dalton et al., 2008). Secondly, viruses are shed in large numbers from infected individuals and are frequently detected in sewage and surface water samples in high-income countries (Hellmér et al., 2014; Pina et al., 2001). Thirdly, HAV has been detected in groundwater in other high-income countries (Borchardt et al., 2003; Shin et al., 2017); notably 8.6% of 150 public water supply boreholes spread across 35 states of the USA (Abbaszadegan et al., 1999). The virus's occurrence in groundwater has been attributed to its enhanced mobility through the unsaturated zone and its persistence in the environment compared to other enteric viruses (Borchardt et al., 2003; Sobsey et al., 1986). Indeed, HAV was the only viral NA detected in all four aquifers.
Fig. 4. Presence-absence of viral NA plotted on min-max normalised hydrographs for all sites. Each aquifer is grouped into a separate subplot and areas outside of sampling rounds are greyed. There is a gap in the hydrological data at Site 4, which covers R2, and linear interpolation was used to infill following confirmation that there was a rise in water level at the next nearest observation borehole (E4473). [Permissions received © Drinking Water Inspectorate, Defra].
HAV has been responsible for multiple virus-related groundwater outbreaks globally (Murphy et al., 2017) and was the most common pathogen (8.5% of those with known aetiology) linked to outbreaks in untreated groundwater supplies in the USA between 1971 and 2008 (Wallender et al., 2014). The virus has also been one of the most common causes of infectious disease outbreaks related to drinking water, both surface water- and groundwater-derived, in Canada between 1971 and 2001 (Schuster et al., 2005).

Hepatitis E
HEV is an emerging threat in high-income countries and the main cause of acute hepatitis worldwide, with an estimated 20 million people infected annually (Fenaux et al., 2019;Hakim et al., 2017). In high-income countries, there is a shift in human infections towards zoonotic genotypes (GIII and IV), that also occur in pigs (Pavio et al., 2015). Runoff from pig farms and land treated with pig slurry, as well as wastewater from slaughterhouses can introduce HEV into the aquatic environment (Fenaux et al., 2019;Krog et al., 2017). Nevertheless, the human source is also appreciable and 93% of untreated sewage samples from Edinburgh's domestic sewage works, receiving no agricultural effluent or runoff, tested positive for HEV over a period of six months (Smith et al., 2016). Infections are often linked to the consumption of pork and shellfish (Fenaux et al., 2019), although drinking water source was associated with seroprevalence in France (Mansuy et al., 2016) and HEV has been shown to persist through water treatment into tap water in Sweden (Wang et al., 2020).
In England, HEV infections are increasing year-on-year with numerous indigenously acquired infections. A study of 225,000 blood donations in Southeast England demonstrated 0.035% of donors were viraemic (Hewitt et al., 2014). Therefore, there is a large source within the human population, although both detects (sites 4 and 7) were from boreholes located in predominantly rural areas, hence could be from either agricultural or human sources. There appears to be a lack of comparable studies that have either investigated or documented HEV in groundwater in high-income countries, despite the consumption of groundwater being suggested as a potential pathway (King et al., 2018).

Norovirus
Acute gastroenteritis causes the second greatest burden of all infectious diseases globally and NoVs are estimated to account for 20% of all cases (Ahmed et al., 2014). Globally, NoV is the enteric pathogen most frequently responsible for groundwater-related disease outbreaks reported in the academic literature (Murphy et al., 2017), including examples from the USA (Borchardt et al., 2011), France (Gallay et al., 2006), and South Korea (Kim et al., 2005). In the UK, there are estimated to be around 3 million cases of gastroenteritis per year resulting from NoV (Tam et al., 2012), although NoV GII is the dominant genogroup (van Beek et al., 2018).
NoV is commonly detected in treated wastewater and rivers in the UK (Merrett et al., 2013; Palfrey et al., 2011), and has been confirmed in groundwater (Powell et al., 2003). Therefore, our results are supported by the prevalence of NoV within the UK population and aquatic environment, as well as its common occurrence in groundwater in high-income countries. UK infections also show pronounced seasonality year-on-year, typically being highest from November through to April (PHE, 2020). This seasonality matches the occurrence of NoV GI NA in our data, with detections only in January, April, and November.

Viruses related to groundwater recharge
Viral contamination was transient, which is consistent with a study of 50 wells sampled in four different seasons in the USA (Borchardt et al., 2003). There is limited evidence to indicate viral NA is persisting in the sampled locations for long periods, with viral NA arriving when groundwater was responding to recharge. If this NA is transported directly from recent sources at the surface, then the associated viruses are more likely to be infectious. However, the viruses could equally be re-mobilised from the unsaturated zone, or emanate from older sources, which could be treated in the case of certain effluent, and may be non-infectious. Research from the USA has also noted that virus occurrence in groundwater relates to groundwater recharge events (Bradbury et al., 2013; Gotkowitz et al., 2016).
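The association between detections and recharge can be expressed as the fraction of positive samples collected while the water level was rising. The following is a minimal sketch of that calculation, not necessarily the procedure used in this study: the function name, the record layout (water level at the preceding observation, water level at sampling, detection flag), and the example values are all hypothetical.

```python
def fraction_detects_on_rising_limb(samples):
    """Fraction of virus-positive samples taken while the water level was rising.

    samples: iterable of (level_before, level_at_sampling, virus_detected),
    where the two levels are in the same units (hypothetical layout).
    Returns None when there are no positive samples.
    """
    detects = [(before, at) for before, at, detected in samples if detected]
    if not detects:
        return None
    rising = sum(1 for before, at in detects if at > before)
    return rising / len(detects)


# Illustrative values only; these are not measurements from the study sites.
example = [
    (10.2, 10.9, True),   # positive, level rising
    (10.9, 11.4, True),   # positive, level rising
    (11.4, 11.1, True),   # positive, level falling
    (11.1, 10.8, False),  # negative, level falling
    (10.8, 11.0, False),  # negative, level rising
]
frac = fraction_detects_on_rising_limb(example)
print(f"{frac:.2f} of positive samples coincided with a rising water level")
```

A comparison against only the immediately preceding observation is a simplification; a longer window or a smoothed hydrograph may be more robust where levels fluctuate over short periods.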

Viruses transported by preferential flow
The most commonly detected viral NA (HAV, HEV and NoV GI) were all detected at Site 7, which has the deepest screen, demonstrating viral NA transport to at least 60 m below the ground surface in a sandstone aquifer. This NA could potentially have been transported through the pores of the sandstone, as viruses are small enough to enter the aquifer matrix. However, the travel time to 60 m is likely to be excessive for the survival of viral NA and, combined with previous evidence of culturable bacterial indicator organisms, supports microbial transport via fractures from leaking sewers, as hypothesised nearby in the aquifer (Morris et al., 2006). Viral NA at the other six positive sites is also likely to have been transported through fractures, given the hydrogeology, depth to screen, evidence for bacterial indicator organisms, and seasonal occurrence during the recharge season. There is no evidence for preferential flow at Site 8 where viral NA was absent, with no previous positive detections of coliforms despite the urban and shallow water table setting.
The different location settings were nearly all susceptible to the transmission of viral NA. These results indicate that the immediate environs of the location do not relate to virus risk: be that urban/rural setting or protective overlying geology (including 15 m of silt and clay). This is likely to reflect the nature of rapid, preferential flow paths in both the unsaturated and saturated zones, which are capable of delivering viruses from where sources are present and protective cover is lacking or can be bypassed. Furthermore, differences in sorption capacity between aquifers, overlying deposits, and their associated soils appear unimportant because viral NA was omnipresent across aquifer settings, although only HAV was present in all four aquifers.

Temporal representativeness of results
Rainfall during winter and early-spring in 2018/19 was below average across the study area, resulting in subdued groundwater recharge and groundwater levels/spring discharges typically being below normal during round R1 (Fig. 6). Contrastingly, June to October 2019 was the second wettest on record in England and Wales over the last 50 years (Parry et al., 2019), with persistent heavy rainfall continuing during November resulting in serious flooding across the centre and north of England. Groundwater levels/spring discharges transitioned from generally below normal in R2 to well-above normal by R3, reaching peaks at Sites 6 and 7, and their second highest at Sites 2 and 4, for the period 2010-2019. Rainfall remained above average over much of the study area in December (Turner et al., 2020) before returning to more typical values in January (Barker et al., 2020). As a result, groundwater levels/spring discharges continued to receive recharge and increased during round R4, typically maintaining exceptionally high levels/discharge (Fig. 6).
It is possible that the exceptionally wet conditions encountered prior to and during the 2019/20 recharge season (R3 and R4) enhanced the transport of viral NA to groundwater, particularly given the identified association of viral NA with recharge here. Alternatively, unusually high precipitation may have diluted existing sources of viruses arriving at the water table. The only available evidence to support either of these two arguments comes from bacterial indicator organism data. However, total coliforms and E. coli were typically close to or below monthly means during the study period (Fig. S4), hence there is no strong evidence for the meteorological conditions enhancing or diminishing faecal contamination of the sites during the study period.

Viral indicators
The lack of relationship between either total coliforms or E. coli and viral NA on an individual sample basis in our study is supported by previous meta-analyses (Fout et al., 2017; Wu et al., 2011). Fout et al. (2017) compiled data from twelve groundwater studies and demonstrated total coliforms and E. coli had very low sensitivity (12 and 2%), but higher specificity (88 and 97%), respectively, for the prediction of viruses that were quantified by molecular methods. Sensitivity was slightly higher for predicting viruses quantified by culture methods, perhaps because this is a comparison of viable organisms, but still poor (total coliforms 29% and E. coli 26%) (Fout et al., 2017).
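For reference, sensitivity and specificity of an indicator follow from a two-by-two table of indicator result against virus result. The sketch below is illustrative only: the function name and the counts are hypothetical and are not taken from Fout et al. (2017) or from our data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) of an indicator judged against virus results.

    tp: virus-positive samples in which the indicator was also positive
    fn: virus-positive samples in which the indicator was negative
    tn: virus-negative samples in which the indicator was also negative
    fp: virus-negative samples in which the indicator was positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one virus-positive and one virus-negative sample")
    sensitivity = tp / (tp + fn)  # share of virus-positive samples flagged by the indicator
    specificity = tn / (tn + fp)  # share of virus-negative samples with the indicator absent
    return sensitivity, specificity


# Hypothetical counts chosen to illustrate a low-sensitivity, high-specificity indicator.
sens, spec = sensitivity_specificity(tp=2, fn=98, tn=291, fp=9)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

With these assumed counts the indicator is present in only 2 of 100 virus-positive samples but absent from 291 of 300 virus-negative samples, which is the pattern of low sensitivity and high specificity described above.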
Total coliforms and E. coli appear to have merit for evaluating whether a site is at-risk or not of virus contamination and all sites in our study could be correctly classified using historical data. The presence of these indicators would confirm a potential rapid pathway to the (near-) surface and potential sources of viruses in the environment. The availability of long-term records (10+ years) for this purpose is invaluable because viral sources can be transient and rapid pathways more likely to be active during extreme hydrological conditions. Strong support for the use of virus risk classification using E. coli is provided by Fout et al. (2017) who demonstrated a specificity of 94% for E. coli prediction of viruses by molecular methods at the site level. We consider this high specificity at the site level to be the main contributor to the high specificity estimated at the individual sample level in their dataset. Sensitivity remains very low (12%), with abundant false-positives, hence why E. coli is not suitable to be used as a viral indicator at the sample level.

Fig. 6. Monthly standardised groundwater indices (SGIs) for the study period determined using data from January 2010 until January 2020. Each aquifer is grouped into a separate subplot and areas outside of sampling rounds are greyed. A horizontal line is shown at SGI = 0.5 to illustrate average monthly conditions. [Permissions received © Drinking Water Inspectorate, Defra].
The lack of co-occurrence amongst viral NA targets here provides no evidence that certain viruses, such as the previously suggested EV (Hot et al., 2003) or adenoviruses (Farkas et al., 2020), can be used as indicators of a broader range of viruses in groundwater. Indeed, EV was absent, and there was only a single detection of HAdV F despite adenoviruses being shown to be one of the more stable viruses in water and hence a potentially conservative indicator of other viruses (Farkas et al., 2020; Sidhu et al., 2015). Furthermore, the meta-analysis of Fout et al. (2017) demonstrated that somatic coliphages, now suggested as an indicator of enteric viruses in the EU (WHO, 2017), are no better than E. coli as viral indicators in groundwater. Alternative proposed novel viral indicators often associated with sewage include ibuprofen (Allen, 2013), or in-situ fluorescence spectroscopy that can be measured in real-time (Sorensen et al., 2015a, 2018).

Conclusions
The episodic prevalence of viral nucleic acid across the majority of public water sources, all major water supply aquifers, and a range of typical microbial risk settings in both urban and rural areas, indicates potentially widespread seasonal viral risks in groundwater used for drinking. The public water sources that were sampled all have suitable treatment measures in place for the provision of safe drinking water before supply. However, there are likely to be other sites, notably private water sources, where water treatment is insufficient and public health risks from viruses may be present. To manage potential groundwater virus contamination via water safety plan (WSP) risk assessments, larger scale studies are required to further understand key risk factors within catchments, for example viral sources and relative loading, subsurface transport, viral persistence, and viral viability.
Sampling for viruses should be focussed during periods of groundwater recharge, when they are most likely to occur, if investigating viral risks at a source. The lack of co-occurrence amongst viral targets suggests a widespread suite of viruses would be more suitable than investigating a single indicator target, such as adenoviruses, in untreated groundwater. Bacterial indicator organisms do have value to assess whether a viral risk is present: a source with an absence of indicators in regularly collected historical data is unlikely to be at risk of virus contamination.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.