Are sanitation interventions a threat to drinking water supplies in rural India? An application of tryptophan-like fluorescence

Open defecation is practised by over 600 million people in India and there is a strong political drive to eliminate this through the provision of on-site sanitation in rural areas. However, there are concerns that the subsequent leaching of excreta from subsurface storage could be adversely impacting underlying groundwater resources upon which rural populations are almost completely dependent for domestic water supply. We investigated this link in four villages undergoing sanitary interventions in Bihar State, India. A total of 150 supplies were sampled for thermotolerant (faecal) coliforms (TTC) and tryptophan-like fluorescence (TLF): an emerging real-time indicator of faecal contamination. Sanitary risk inspections were also performed at all sites, including whether a supply was located within 10 m of a toilet, the recommended minimum separation. Overall, 18% of water supplies contained TTCs, 91% of which were located within 10 m of a toilet, 58% had TLF above detection limit, and sanitary risk scores were high. Statistical analysis demonstrated TLF was an effective indicator of TTC presence-absence, with a possibility of TTCs only where TLF exceeded 0.4 µg/L dissolved tryptophan. Analysis also indicated proximity to a toilet was the only significant sanitary risk factor predicting TTC presence-absence and the most significant predictor of TLF. Faecal contamination was considered a result of individual water supply vulnerability rather than indicative of widespread leaching into the aquifer. Therefore, increasing faecal contamination of groundwater-derived potable supplies is inevitable across the country as uptake of on-site sanitation intensifies. Communities need to be aware of this link and implement suitable decentralised low-cost treatment of water prior to consumption and improve the construction and protection


Introduction
It is widely recognised that inadequate separation of human excreta from human contact represents a serious threat to public health (Bartram and Cairncross, 2010; Prüss-Ustün et al., 2014). It is estimated to be responsible for around 10% of the global disease burden, particularly among children under five years old (Mara et al., 2010). Consequently, improving access to basic sanitation is one of the United Nations' Millennium Development Goals. Nevertheless, one billion people still practise open defecation around the world (WHO/UNICEF, 2014), including over 600 million in India (WHO, 2012). India has been attempting to tackle open defecation for decades under various guises, including the Total Sanitation Campaign that had ambitiously set out to eliminate the practice by 2012 (Patil et al., 2014). Last year the Prime Minister of India declared this a national priority and pledged to provide a toilet in every home by 2019 (Guiteras et al., 2015).
In many low-income communities installation of a sewerage system is unfeasible as it requires significant capital investment and a piped water supply. Therefore, on-site sanitation technologies such as pit latrines and pour-flush toilets, with disposal to a leach pit or septic tank, are often the most appropriate solution (Paterson et al., 2007). In fact, it is estimated that 1.77 billion people worldwide use pit latrines as their principal means of sanitation (Graham and Polizzotto, 2013). However, on-site sanitation allows the focussed leaching of high loads of human excreta directly into the subsurface within the built-up area. On the other hand, open defecation generally results in excreta spread diffusely across open areas with greater opportunities for breakdown and attenuation before reaching the water table. Therefore, on-site sanitation may pose a greater threat to nearby groundwater-derived potable supplies through the introduction of enteric pathogens or elevated concentrations of nitrate (Dzwairo et al., 2006; Pujari et al., 2007, 2012; Templeton et al., 2015). Nevertheless, studies associating pit latrines and groundwater contamination are limited (Graham and Polizzotto, 2013) and do not always indicate there is a significant relationship (e.g. Howard et al., 2003).
Faecal contamination of drinking water is typically deduced using bacterial index organisms cultured over at least 18 h (Savichtcheva and Okabe, 2006), such as thermotolerant (faecal) coliforms (TTCs). These organisms occur in enormous numbers in human excreta and are used to infer the presence of enteric pathogens. A promising alternative approach is the use of portable fluorimeters targeting tryptophan-like fluorescence (TLF): a real-time indicator of TTCs (Sorensen et al., 2015a) and enteric pathogens (Sorensen et al., 2015b) in African freshwater. TLF is a proteinaceous component of the fluorescent dissolved organic carbon spectrum and a well-established indicator of wastewater (Baker, 2001; Lapworth et al., 2008; Henderson et al., 2009; Stedmon et al., 2011; Cohen et al., 2014). Furthermore, evidence suggests the source(s) of TLF are more resilient and more easily transported in groundwater than TTCs and, therefore, TLF could be a better indicator of smaller, more long-lived enteric pathogens in the environment such as some viruses (Sorensen et al., 2015a). However, transportation efficacy will ultimately depend upon the association of TLF with different sources and their associated size fractions in groundwater, for example particulate matter, bacteria, or in dissolved phases, of which little is known.
This study focusses on India where open defecation is the most widespread in the world and there is a strong political drive to tackle this issue through the provision of basic sanitation. We aim to: i) investigate whether current sanitary intervention practices are a threat to potable supplies derived from a confined aquifer; ii) evaluate the ability of TLF as an indicator of faecal contamination in this setting; and iii) improve understanding of the association of TLF with different size fractions in groundwater.

Study area
The majority of the Indian population resides in rural areas that are heavily groundwater dependent (Coffey et al., 2014; Kulkarni et al., 2015). There is also a strong rural-urban divide in sanitation provision with no access to toilet facilities in 69% of rural households, as opposed to 19% in urban areas (Ghosh and Cairncross, 2014). Regionally, toilet access is lowest in northeastern India, particularly Bihar State where 77% of households are without. Consequently, rural Bihar State was designated as the study area, with four villages (Maksudpur, Shahjahanpur, Sigariyawan, Taraura) selected within the Daniyawan Block of Patna District (Fig. 1). The population of each village is between c. 1000–3000 people contained within c. 250–600 households (Indian Census, 2011). Sanitation interventions are being promoted within these villages, but open defecation is still commonplace amongst the surrounding fields. On-site sanitation installations, comprising latrines and pour-flush toilets with disposal to a leach pit or septic tank, have generally commenced within the last two years and are typically completed to less than 2 m below ground level. Local NGOs currently recommend that sanitation should be installed at least 10 m from the nearest water supply.
The villages lie on the monotonously flat South Bihar Plain in the Ganges Basin, which is infilled with up to 2 km of Quaternary alluvial sediments (Sinha and Tandon, 2014). The upper 200 m of sediments represent a massive multi-layered unconsolidated aquifer system, containing an estimated 11,800 km³ of high quality freshwater exploited through around 15–20 million tubewells. Locally, sediments at the surface comprise clayey silts which persist to around 25 m below the surface (particle size distributions, Fig. S1), providing a barrier to the vertical migration of contaminants, before transition to fully saturated sands and fine gravels in excess of 25 m thick. These confined saturated sediments represent the sole untreated source of drinking water for the villages, which are predominantly tapped by tubewells drilled to around 50 m depth and completed with 3 m of screen at their base. Dug-wells completed to <10 m depth are also in use, but are much less frequent. The potentiometric surface of the aquifer here is perennially within <5–10 m of the ground level (CGWB, 2007). Both groundwater and surface water drain northeast towards the Ganges. The climate is subtropical with around 1000 mm of annual precipitation mostly delivered by the monsoon between June and September.

Groundwater sampling and analysis
A total of 150 groundwater supplies were sampled towards the end of the dry season in April and comprised 145 tubewells fitted with handpumps and 5 dug-wells in the four villages. Supplies were selected to achieve spatial coverage across the villages whilst ensuring a sufficient balance between those near to (<10 m) and those away (>10 m) from on-site sanitation. All tubewells were regularly used for domestic purposes and were pumped for one minute prior to sampling to ensure the pipework was flushed and the sample representative of the source. Any tubing at the surface was also removed from the headworks prior to sampling to minimise the risk of sample contamination. A sanitary inspection of each tubewell was undertaken to assess the presence of risk factors using pre-set questions (e.g. on-site sanitation/animal excreta/surface water within 10 m) (Table 1). The sanitary risk score (SRS) refers to the total number of positive responses to these ten risk factors.
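As a minimal sketch of how the SRS could be tallied from one inspection record, the Python snippet below counts the positive (Y) answers to the ten questions. The function name and the example answers are hypothetical and are not taken from the study.

```python
def sanitary_risk_score(responses):
    """Count the 'Y' answers among the ten sanitary inspection questions."""
    if len(responses) != 10:
        raise ValueError("expected answers to all ten questions")
    normalised = [r.strip().upper() for r in responses]
    if any(r not in ("Y", "N") for r in normalised):
        raise ValueError("answers must be 'Y' or 'N'")
    return sum(r == "Y" for r in normalised)


# Hypothetical inspection record for a single handpump (questions 1 to 10)
example = ["Y", "N", "Y", "Y", "N", "Y", "Y", "N", "N", "N"]
print(sanitary_risk_score(example))  # 5
```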
In-situ TLF was determined using a portable UviLux fluorimeter targeting the excitation-emission peak at λex = 280 ± 30 nm and λem = 360 ± 50 nm (Chelsea Technologies Group Ltd, UK). The minimum detection limit for the fluorimeter is 0.17 ± 0.18 (3σ) µg/L dissolved tryptophan (Khamis et al., 2015). The sensor was calibrated using dissolved laboratory grade L-tryptophan (Acros Organics, USA) in 20 °C ultrapure water at concentrations of 0, 1, 5, 10, 20 and 50 µg/L. These standards were also analysed on a bench top Varian™ Cary Eclipse fluorescence spectrophotometer for the equivalent wavelength pair, hence concentrations on the portable device can be converted to equivalent Raman Units by multiplication by 0.0024 at 20 °C. Coloured dissolved organic matter (CDOM) fluorescence was also measured at 35 handpumps using a UviLux fluorimeter targeting the peak at λex = 255 ± 30 nm and λem = 450 ± 55 nm. This was undertaken because there is overlap between the TLF and CDOM optical space on these fluorimeters, which could result in an apparent TLF signal in a sample containing CDOM fluorescence but no TLF. CDOM intensity is reported in equivalent mg/L of IHSS Suwannee River fulvic acid standard using the factory calibration.
At each site, 200 mL of groundwater was collected in a 1 L polypropylene beaker to allow complete submergence of the fluorimeter windows. Readings were taken in the dark by placing the beaker within a covered stainless steel container. The sensor and beaker were thoroughly rinsed with fresh sample at each site. Following determination of fluorescence, the temperature and turbidity of the water were immediately measured using a HI766EIE1 thermocouple with HI935005 thermometer (Hanna Instruments, USA) and 2100Q turbidimeter (Hach Company, USA), respectively. This was undertaken as both variables can strongly impact upon TLF (Henderson et al., 2009;Khamis et al., 2015).
To investigate the size fractionation of TLF sources within groundwater, TLF was additionally quantified on filtered samples at a subset of 30 supplies across the range in intensity. The original water sample was passed through a 2.7 µm pore size filter (GF/D, Whatman, USA). This was considered to remove all fluorescing particulate organic matter and particles causing an apparent signal through the scattering of emitted light (Khamis et al., 2015), with the exception of that associated with any clay fraction. As many bacteria as possible would also be retained in the sample, which can contribute directly to TLF (Elliott et al., 2006; Bridgeman et al., 2015). Subsequently, the sample was passed through a 0.22 µm pore size filter (Sterivex, Millipore, USA) to exclude all suspended particles and bacteria. Therefore, any remaining TLF would be considered to be in a proteinaceous free-form (Baker et al., 2007).
Samples for TTC analysis were collected in sterile 0.2 L polypropylene bottles and stored in a cool box (up to 8 h) prior to transportation back to the laboratory for immediate analysis. TTCs were isolated and enumerated using the membrane filtration method with Membrane Lauryl Sulphate Broth (MLSB, Oxoid Ltd, UK) as the selective medium. A 100 mL sample was filtered through a 0.45 µm cellulose nitrate membrane (GE Whatman®, UK), with the exception of more contaminated sites (based on the TLF intensity) where 10 or 1 mL were used. This membrane was then placed atop an absorbent pad (Pall Gelman, Germany) saturated with MLSB in a plate, pre-incubated for 1 h at ambient temperature (20 °C), then incubated at 44 °C for a further 17–21 h. Plates were examined within 15 min of removal from the incubator and all cream to yellow colonies greater than 1 mm counted as TTCs. Blanks, using distilled water, and sample replicates were run at the beginning and end of each day's analysis.
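Because 100, 10 or 1 mL of sample was filtered depending on the expected contamination, plate counts have to be scaled to a common volume before they can be compared. A minimal sketch, assuming simple proportional scaling to 100 mL (the function name and example counts are illustrative only):

```python
def cfu_per_100ml(colonies, filtered_volume_ml):
    """Scale a plate count to colony forming units per 100 mL of sample."""
    if filtered_volume_ml <= 0:
        raise ValueError("filtered volume must be positive")
    if colonies < 0:
        raise ValueError("colony count cannot be negative")
    return colonies * 100.0 / filtered_volume_ml


# Hypothetical plates: the same count means very different things
# depending on how much water passed through the membrane.
print(cfu_per_100ml(5, 100))  # 5.0
print(cfu_per_100ml(5, 10))   # 50.0
print(cfu_per_100ml(5, 1))    # 500.0
```

Under this scaling, one colony on a 1 mL plate corresponds to 100 c.f.u/100 mL, so plate-count resolution is coarsest at the most contaminated sites.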

Predicting the presence of thermotolerant coliforms
Table 1
Pre-set sanitary inspection questions; each is answered Y/N.
1. Is there a latrine within 10 m of the handpump? Y/N
2. Is the nearest latrine on higher ground than the handpump? Y/N
3. Is there any other pollution (e.g. animal excreta, rubbish, surface water) within 10 m of the handpump? Y/N
4. Is the drainage poor, causing stagnant water within 2 m of the handpump? Y/N
5. Is the handpump drainage faulty? Is it broken, permitting ponding? Does it need cleaning? Y/N
6. Is the fencing around the handpump inadequate, allowing animals in? Y/N
7. Is the concrete floor less than 1 m wide all around the handpump? Y/N
8. Is there any ponding on the concrete floor around the handpump? Y/N
9. Are there any cracks in the concrete floor around the handpump which could permit water to enter the well? Y/N
10. Is the handpump loose at the point of attachment to the base, so that water could enter the casing? Y/N

Statistical models and tests were performed with MATLAB v14.1. Firstly, the extent to which TLF, SRS and turbidity indicate the presence of TTCs was investigated. The probability that a water supply was contaminated with TTCs was represented by a logistic regression model (Dobson, 2001). If p_i is the probability that there is contamination at supply i, then the logistic model is written:
$$\ln\left(\frac{p_i}{1 - p_i}\right) = \beta_0 + \sum_{j} \beta_j x_{ij} \qquad (1)$$
where x_ij is the value of predictor j at water supply i and the β_j are coefficients which, in this study, were estimated by least squares (Draper and Smith, 1981). A stepwise regression algorithm (Draper and Smith, 1981) was used to decide which of the available predictors (TLF, SRS, turbidity) should be included in the model. This approach prevents the inclusion of too many predictors, where the model could then be considered overfitted, i.e. it is overly suited to the intricacies of the dataset and will not achieve the same accuracy on independent validation data.
The stepwise algorithm considers all of the available predictors not yet included in the model in turn and selects the one which causes the largest decrease in the mean squared residuals. If this decrease is deemed significant at a p-value of 0.05, according to an F-test (Draper and Smith, 1981), this parameter is added. The iterative process continues with the remaining parameters until none of them lead to a significant improvement to the model. This process was implemented using the MATLAB function 'stepwiseglm'.
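The forward-selection logic described above can be illustrated with a short, dependency-free Python sketch. It is not a reimplementation of 'stepwiseglm': the coefficients are fitted by Newton-Raphson maximum likelihood, and the F-test on mean squared residuals is replaced here by a likelihood-ratio test on the deviance (3.841 is the 95th percentile of a chi-square distribution with one degree of freedom). All function names and the tiny dataset are hypothetical.

```python
import math

CHI2_1DF_95 = 3.841  # critical value for adding one predictor at p = 0.05


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_logistic(columns, y, iterations=30):
    """Return (coefficients, deviance); coefficients[0] is the intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for row, yi in zip(rows, y):
            p = sigmoid(sum(b * v for b, v in zip(beta, row)))
            for a in range(k):
                grad[a] += (yi - p) * row[a]
                for c in range(k):
                    hess[a][c] += p * (1.0 - p) * row[a] * row[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    deviance = 0.0
    for row, yi in zip(rows, y):
        p = sigmoid(sum(b * v for b, v in zip(beta, row)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard the logarithms
        deviance -= 2.0 * (yi * math.log(p) + (1 - yi) * math.log(1.0 - p))
    return beta, deviance


def forward_stepwise(predictors, y):
    """Add predictors one at a time while the deviance drop is significant."""
    selected = []
    _, current = fit_logistic([], y)
    while True:
        best_name, best_dev = None, current
        for name in predictors:
            if name in selected:
                continue
            _, dev = fit_logistic([predictors[n] for n in selected + [name]], y)
            if dev < best_dev:
                best_name, best_dev = name, dev
        if best_name is None or current - best_dev <= CHI2_1DF_95:
            return selected
        selected.append(best_name)
        current = best_dev


# Hypothetical data: TTC presence rises with TLF; "noise" is unrelated to it.
tlf = [round(0.1 * (i + 1), 1) for i in range(20)]
ttc_present = [0] * 8 + [1, 0, 1, 0] + [1] * 8
noise = [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0]
print(forward_stepwise({"TLF": tlf, "noise": noise}, ttc_present))
```

The sketch assumes the data are not perfectly separable; with complete separation the maximum-likelihood coefficients diverge and a production implementation would need to detect that case.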
A classifier of TTC contamination was then formed by designating all water supplies with p_i greater than some threshold p* as contaminated. However, there is uncertainty associated with such a classification. False-positive errors occur when the classifier erroneously suggests that the supply is contaminated, whilst false-negative errors occur when the supply is erroneously classified as uncontaminated. Similarly, true-positive and true-negative results refer to correctly classified contaminated and uncontaminated supplies, respectively. The effectiveness of a classifier can be assessed in terms of the receiver operating characteristic curve (Hanley and McNeil, 1983). This is a plot of the false-positive rate (the proportion of uncontaminated sites falsely classified as contaminated) against the true-positive rate (the proportion of contaminated sites correctly classified) as p* is varied. The area under this curve is a measure of the effectiveness of the classifier: it will be 1 for a perfect classifier and 0.5 if the classifier is performing no better than a random choice. Subsequently, to examine the relationship between TTC counts and each of TLF, SRS and turbidity, Spearman's rank tests were performed. This test was selected because it does not require the variables to conform to a particular statistical distribution. Standard linear regression models were not employed because 82% of TTC counts were equal to zero.
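To make the construction of the receiver operating characteristic curve concrete, the sketch below sweeps the threshold p* over a set of predicted probabilities and integrates the resulting curve with the trapezoidal rule. The probabilities and labels are invented for illustration, and the function names are hypothetical; supplies are treated as contaminated when their probability is at or above the threshold so that the curve ends at (1, 1).

```python
def roc_points(probabilities, contaminated):
    """Return (false-positive rate, true-positive rate) pairs as p* decreases."""
    n_pos = sum(contaminated)
    n_neg = len(contaminated) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(probabilities), reverse=True):
        tp = sum(1 for p, c in zip(probabilities, contaminated)
                 if p >= threshold and c == 1)
        fp = sum(1 for p, c in zip(probabilities, contaminated)
                 if p >= threshold and c == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical model output for eight supplies (1 = TTCs present)
probabilities = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
contaminated = [1, 1, 0, 1, 0, 0, 1, 0]
print(area_under_curve(roc_points(probabilities, contaminated)))  # 0.75
```

The area equals the proportion of contaminated/uncontaminated pairs in which the contaminated supply receives the higher probability (12 of 16 pairs in this example).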

Examining the significant risk factors resulting in faecal contamination at a water supply
The extent to which individual risk factors from the sanitary inspection could predict the presence of faecal contamination in tubewells was investigated. This was initially explored through the presence of TTCs using the stepwise logistic regression model procedure outlined above. It was then undertaken considering TLF as the dependent variable, which was represented by a linear regression model:
$$\mathrm{TLF}_i = \beta_0 + \sum_{j} \beta_j x_{ij} + \varepsilon_i \qquad (2)$$
where x_ij is the value of risk factor j at supply i and the ε_i were independent realizations of a Gaussian random variable with zero mean and variance σ². The β_j and σ² were the model parameters which in this study were estimated by least squares. Again, the optimal predictors for the model were selected by stepwise regression. The model was estimated for the natural logarithm of TLF (plus 0.2 to define the logarithm where intensities were occasionally negative) to ensure the model residuals conformed to an approximate Gaussian distribution.
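The offset log transform and its inverse can be sketched as follows. The example fits a single yes/no risk factor, for which the least-squares solution reduces to the difference between group means on the log scale. The readings, the flag values and the function names are hypothetical.

```python
import math
from statistics import mean

OFFSET = 0.2  # offset added before taking logarithms, as stated in the text


def to_log_scale(tlf):
    """Transform a TLF reading to ln(TLF + 0.2)."""
    return math.log(tlf + OFFSET)


def from_log_scale(value):
    """Back-transform a value on the log scale to a TLF intensity."""
    return math.exp(value) - OFFSET


def fit_single_binary_predictor(tlf_values, flags):
    """Least-squares fit of ln(TLF + 0.2) = b0 + b1 * flag for a 0/1 flag."""
    absent = [to_log_scale(t) for t, f in zip(tlf_values, flags) if f == 0]
    present = [to_log_scale(t) for t, f in zip(tlf_values, flags) if f == 1]
    b0 = mean(absent)            # mean log-intensity where the risk is absent
    b1 = mean(present) - b0      # shift associated with the risk factor
    return b0, b1


# Hypothetical readings, including one slightly negative intensity
tlf = [-0.05, 0.10, 0.30, 0.20, 0.90, 1.50, 0.60, 2.40]
near_toilet = [0, 0, 0, 0, 1, 1, 1, 1]
b0, b1 = fit_single_binary_predictor(tlf, near_toilet)
print(b0, b1)
print(from_log_scale(b0 + b1))  # typical intensity where the risk is present
```

Because the model is fitted on the log scale, back-transformed predictions describe a typical (geometric-mean-like) intensity rather than the arithmetic mean.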

Extent of faecal contamination and spatial trends within villages
Thermotolerant coliforms (TTCs) were identified in 18% of village water supplies with counts up to 13,760 c.f.u/100 mL. The supplies testing positive were classified according to the WHO (1997) risk categories as 7 (very high risk), 6 (high risk), 3 (intermediate risk) and 11 (low risk) (Fig. 2). All five dug-wells were at least high risk with counts ranging between 380 and 9880 c.f.u/100 mL. Taraura contained the most contaminated supplies, with 27% testing positive, as opposed to Shahjahanpur where only 8% contained TTCs.
Tryptophan-like fluorescence (TLF) in excess of the minimum detection limit (MDL) was more spatially extensive than TTCs and observed in 58% of water supplies (Fig. 2). Nevertheless, where present within handpumps, concentrations were low, with a median and interquartile range of only 0.85 and 0.89 µg/L, respectively. At the dug-wells concentrations were generally highest with a median and interquartile range of 21.04 and 23.57 µg/L, respectively. Only three handpumps exceeded 10 µg/L. Environmental variables that can impact TLF did not vary appreciably. Median water temperature was 27.0 °C and 90% of the data were within 3.0 °C (Fig. S1). Turbidity was low with a median of 1.1 NTU and 90% of the data were less than 6.0 NTU.
Spatially, TTC occurrence was concomitant with TLF above the MDL, with a tendency for higher counts to be observed at higher TLF (Fig. 2). Supplies contaminated with TTCs appear isolated with no clear spatial pattern, whilst the more numerous sites elevated with respect to TLF appear clustered in certain locations. For example, in northern and eastern Shahjahanpur, clusters appear centred on supplies where TTCs were present and TLF is relatively high. Away from the centre of these clusters, TLF concentrations generally remain the same or decrease. This trend is also apparent in the western section of Maksudpur.

Size fractionation of tryptophan-like fluorescence sources
The majority of the TLF signal was within the <0.22 µm size fraction in 93% of supplies across the range of representative concentrations (Fig. 3). Generally, filtration through the 2.7 µm membrane had negligible impact and filtration through the 0.22 µm membrane removed only a minor proportion of the fluorescence. Overall, the median percentages were 2, 15 and 86% within the >2.7 µm, 0.22–2.7 µm and <0.22 µm fractions, respectively. Filtration significantly reduced turbidity from a mean of 3.1 NTU (raw), to 1.6 NTU (<2.7 µm), and finally 0.6 NTU (<0.22 µm) (Fig. S2). Temperature remained relatively constant following filtration with only a slight warming (<0.5 °C) of the 0.22 µm samples at the surface before analysis (Fig. S2).
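The text does not state how the percentages in each size fraction were calculated. One plausible reading, sketched below as an assumption, is that each fraction is the difference between successive readings (raw, after the 2.7 µm filter, after the 0.22 µm filter) expressed relative to the raw intensity. The function name and the example readings are hypothetical.

```python
def size_fractions(raw, after_2_7_um, after_0_22_um):
    """Percent of raw TLF in the >2.7, 0.22-2.7 and <0.22 µm fractions."""
    if raw <= 0:
        raise ValueError("raw intensity must be positive")
    coarse = 100.0 * (raw - after_2_7_um) / raw                # retained at 2.7 µm
    intermediate = 100.0 * (after_2_7_um - after_0_22_um) / raw  # retained at 0.22 µm
    fine = 100.0 * after_0_22_um / raw                         # passes both filters
    return coarse, intermediate, fine


# Hypothetical readings for a single supply
fractions = size_fractions(2.00, 1.96, 1.70)
print([round(v, 1) for v in fractions])  # [2.0, 13.0, 85.0]
```

Measurement noise can make a filtered reading slightly exceed the preceding one, in which case this calculation returns a small negative percentage; how such cases were handled in the study is not known.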

Predicting the presence of thermotolerant coliforms
Tryptophan-like fluorescence was the only significant predictor of TTC presence (β = 1.53, p-value < 0.001). The area under the receiver operating characteristic curve was 0.91 (Fig. 4A), which is much closer to the perfect classifier value of 1 than the random selection value of 0.5. False-negative errors are frequent and occur across the range of TLF, whilst false-positive errors are few and restricted to low TLF intensity (Fig. 4B). Since TLF is the only predictor, the relationship between these readings and the probability of contamination is monotonic. For example, at a TLF intensity threshold of 1.5 µg/L there is a 5% probability that a contaminated site is incorrectly classified (Fig. 4C). On the other hand, there is a 48% probability of an uncontaminated site being incorrectly classified for the same intensity threshold. Examination of the raw data shows TTC contamination only where TLF exceeds 0.43 µg/L, and where TLF exceeds 2.74 µg/L all sites are contaminated. There is also a significant tendency for higher TTC counts as TLF intensity increases (Fig. 5).

Significant risk factors resulting in faecal contamination
Water supplies within the villages were generally (77%) within 10 m of sources of pollution, such as animal excreta, rubbish, and surface waters (Fig. 6). In fact, stagnant water was within 2 m of 71% of handpumps. The majority (86%) of supplies were also vulnerable to these sources, with concrete floors less than 1 m in diameter potentially allowing infiltrating contaminated water to come into close contact with the casing and, if the casing is not competent, to flow directly into the tubewell. Nevertheless, 87% of handpumps were securely attached to the casing, inhibiting the direct ingress of contaminated water at the surface. The majority of sites had faulty drainage (55%), and animals had free access to 68% of sites through inadequate fencing. Toilets were within 10 m of 62% of supplies, but were not located on higher ground in 91% of instances (a reflection of the lack of topography within the area). Overall the mean SRS was 6, considered to be high risk.
Location of a toilet within 10 m of the handpump was the only risk factor that was a significant predictor of TTCs (Table 2), with 91% of all TTC contaminated supplies near a toilet (Fig. 6). These contaminated supplies included those near both pit latrines (16 supplies) and septic tanks (6 supplies). In total, 20% of handpumps within 10 m of a toilet were contaminated, compared with only 4% of those more than 10 m away (Fig. 6). Therefore, given that 80% of sites near a toilet remained uncontaminated, it is unsurprising that the false-positive rate of the model was relatively high and comparable to many of the other risk factors. However, the false-negative rate was low because the raw data highlight that only 4%, or 2 supplies, were contaminated away from a toilet.
Proximity to a toilet (R1) was also the most significant single predictor of TLF (Table 2). It was also added first by the stepwise regression algorithm, although the model was subsequently improved through the addition of faulty handpump drainage (R5), then inadequate fencing (R6). It is noted that TLF was generally lower where fencing was inadequate around handpumps (Fig. 6). Nevertheless, the addition of these two parameters produced a model significant at a p-value of <0.001, although the r² was only 0.13. The final model is written:
$$\ln(\mathrm{TLF} + 0.2) = -0.56 + 0.39\,\mathrm{R1} + 0.45\,\mathrm{R5} - 0.50\,\mathrm{R6} \qquad (3)$$

Discussion

Pathways for faecal contamination of water supplies and wider implications
Unconsolidated aquifers are considered the least vulnerable in India to pathogen contamination (ARGOSS, 2001): flowpaths are intergranular, inducing greater potential for sorption and filtration, and can be relatively slow, facilitating significant die-off. The 25 m of protective clayey silt overlying the exploited horizon will be prohibitive to pathogen transport, with long travel-times most probably negating any risk from pathogenic bacteria due to die-off, assuming the absence of permeable windows. Furthermore the small matrix apertures, most likely to be <1 µm in silt dominated sediments (Pedley et al., 2006), would be effective at filtering out and retaining many pathogens. This would also preclude Escherichia coli, the dominant TTC in areas of poor sanitation (Howard et al., 2003), which is around 1 × 3 µm (Reshes et al., 2008). Therefore, the detection of TTCs at a tubewell is likely to reflect contamination as a result of inadequate headwork completion or sanitary seal, incompetent casing, or poor backfill contact facilitating rapid vertical by-pass down the casing, with contamination through natural recharge pathways unlikely. To summarise, in this hydrogeological setting it is likely to be an issue of water point vulnerability rather than aquifer vulnerability. This is supported by: i) the spatially isolated nature of supplies contaminated with TTCs, ii) the fact that some sanitary risk factors were significant predictors of faecal contamination, and iii) infrequent contamination of supplies despite widespread sources of pollution at the surface.
Table 2
Estimated coefficients and p-values for the hypothesis that the coefficients are zero for single-predictor models of probability of the presence of thermotolerant coliforms and intensity of tryptophan-like fluorescence. False positive rates (FPR) and false negative rates (FNR) are also shown for the logistic regression model.

The sediments of the Gangetic-Brahmaputra-Meghna Basin grade systematically from coarse gravel and sand close to mountainous margins, to silt dominated in the delta region (Singh, 2004). The aquifer within Bihar is hydrogeologically representative of that across much of the Ganges Basin: inter-layered highly permeable stacked channel and interchannel deposits, groundwater levels within around 5 m of the surface, and in excess of 750 mm/yr of precipitation. Therefore, much of the aquifer could be considered similarly vulnerable to leaching from on-site sanitation. However, ultimately, aquifer vulnerability is dependent on the vertical location, thickness, and continuity of any subordinate low permeability horizon(s), such as the protective clayey silts at the surface in the Daniyawan Block. Where this layer is absent at the surface (e.g. Lapworth et al., 2015), particularly if a very shallow permeable horizon is tapped, then the aquifer would be considerably more vulnerable than here. Towards the deltaic region of the basin, the aquifer would be least vulnerable with more extensive near-surface silts and clays, as demonstrated by the good water quality in the confined alluvial aquifer beneath the megacity of Kolkata (Pujari et al., 2012). Irrespective of the hydrogeological setting, if a proportion of tubewells are intrinsically vulnerable due to poor construction, then toilets need to be sited at a sufficient lateral distance to minimise contact between excreta and water supply. Local NGOs recommend the installation of toilets at least 10 m from the nearest water supply. This appears effective within this setting as only two adhering supplies were contaminated, including a site where discharging water could be seen re-entering the same tubewell. However, the numerous occurrences of toilets in close proximity (several within a metre) to water supplies suggest an ignorance of the guidance amongst households and toilet constructors. Notwithstanding this, the high population density and widespread private handpump occurrence can actually prevent the siting of a toilet at sufficient distance, thus any recommendations cannot always be implemented. Consequently, increasing sanitation provision in India is likely to lead to a higher incidence of faecal contamination of water supplies.
Overall the incidence of TTCs is surprisingly low in comparison to other rural studies in the developing world where the majority of sites are frequently contaminated (e.g. Nevondo and Cloete, 1999; Bordalo and Savva-Bordalo, 2007). Whilst this is partially attributable to the low vulnerability of the aquifer here, it also reflects positively on the transition from the highly contaminated legacy dug-wells to tubewells for water supply in these villages. However, it should also be considered that this study was conducted at the end of the dry season and there is potential for greater contamination during monsoon season. This is supported by the elevated presence of TLF, against a background appearing close to zero, in the majority of supplies. In groundwater with a strong seasonal recharge pattern in Africa, Sorensen et al. (2015a) demonstrated that sites that tested negative for TTCs in the dry season, yet contained elevated TLF, would then frequently test positive for TTCs in the subsequent wet season.

Tryptophan-like fluorescence as a predictor of thermotolerant coliforms
The results of this study reinforce previous relationships between TLF and TTCs observed in river and sewage effluent in the UK (Cumberland et al., 2012), rivers in South Africa, and groundwater impacted by poor sanitation in Zambia (Sorensen et al., 2015a). Remarkably, the raw datasets from these studies all overlap despite the contrasting settings, variation in environmental variables that could impact TLF, and the use of fluorimeters from different manufacturers (Fig. 7). The significant vertical scatter in the dataset may raise questions over the use of the technique for direct inference of plate counts, although this scatter is partially attributable to the lack of repeatability in plate counts themselves. Nevertheless, it is clear that more intense TLF is indicative of higher TTC counts and thus a likely greater risk of enteric pathogens in freshwater.
A more pertinent question is whether TLF could be adopted in regulation from a water supply compliance perspective. This could be desirable as TLF is detected almost instantaneously and has been demonstrated here, and in Sorensen et al. (2015a), to be a superior predictor of TTC presence than turbidity. Other authors have also demonstrated little relationship between these two commonly used water parameters (e.g. Pronk et al., 2006). However, turbidity is still widely used as a real-time proxy for microbial contamination, and monitored continuously at public groundwater abstraction sites in the UK (UKWIR, 2012). Therefore, TLF monitoring could represent a significant improvement to current monitoring practices.
Any uptake by regulators, and resultant implementation in the water industry, would require the definition of suitable TLF thresholds. In this study, we suggest a precautionary intensity of 0.43 µg/L to infer faecal contamination, and data from Sorensen et al. (2015a) indicate an intensity of 0.33 µg/L. Such precautionary thresholds would result in numerous false-positives with respect to TTCs, which may be appropriate given that TTCs are frequently absent in the presence of enteric pathogens (Noble and Fuhrman, 2001; Hörman et al., 2004).
It had previously been hypothesised that such false-positives may be a result of the more efficient transport of TLF through the subsurface, as tryptophan-like fluorophores may be smaller than TTCs (Sorensen et al., 2015a). This is supported by the results presented here, showing the predominance of fluorescence within the <0.22 µm fraction where bacterial cells have been removed.
Furthermore, there were apparent spatial patterns in TLF, which were not observed in the TTC data. This may also occur because TLF is more resilient in groundwater than TTCs (Sorensen et al., 2015a). Hence, any fluorescence may relate to earlier episodes of contamination whereas TTCs are indicative of recent contamination within 16–45 days (Taylor et al., 2004). Alternatively, the linear regression model suggested TLF was related to faulty drainage from the handpump, so there may also be an additional contributory source.

Uncertainty in thermotolerant coliform counts and tryptophan-like fluorescence data
There was no evidence of plate contamination in the laboratory: all blanks tested negative, as did eight negative sample replicates. Furthermore, the four positive sample replicates were similar, with all within two colony forming units. This was equivalent to being within 100 c.f.u/100 mL, with the results of the replicates as follows: 3 and 5, 600 and 600, 500 and 600, 5 and 3 c.f.u/100 mL. TLF repeatability is within ±0.18 mg/L up to a concentration of 5 mg/L using the UviLux fluorimeter (Sorensen et al., 2015a).
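The replicate agreement quoted above can be checked directly from the four reported pairs. The short sketch below uses only the counts given in the text; the variable names are illustrative.

```python
# Positive sample replicate pairs reported in the text (c.f.u/100 mL).
replicate_pairs = [(3, 5), (600, 600), (500, 600), (5, 3)]

# Absolute difference between the two replicates in each pair.
differences = [abs(first - second) for first, second in replicate_pairs]

# Largest disagreement between any pair of replicates.
max_difference = max(differences)
```

The largest difference is 100 c.f.u/100 mL, consistent with the statement that all replicates were within that margin.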
Fig. 7. Relationship between tryptophan-like fluorescence and E. coli/thermotolerant coliforms from all published studies (n = 389). Note that in areas of poor sanitation, between 90 and 99% of all thermotolerant coliforms comprise E. coli in freshwater (Leclerc et al., 2001; Howard et al., 2003). All negative counts are displayed as −1.
However, TLF can be impacted by a range of variables including temperature, turbidity, and the matrix composition (Henderson et al., 2009; Khamis et al., 2015). In groundwater, temperature is typically stable and turbidity is very low, allowing favourable application of the technology (Khamis et al., 2015). In this study area, the typical 3 °C range in temperature could account for around 3% uncertainty in the dataset, based on the study of Baker (2005), who identified around a 30% change in TLF over 35 °C in a range of rivers and wastewaters. Water turbidity in our study was <200 NTU, the level at which attenuation has previously been observed, although low-level apparent TLF enhancement due to light scattering may have been observed where turbidity was most elevated (Khamis et al., 2015). Nevertheless, at the highest turbidities in the dataset of 33.4 and 14.8 NTU, TLF was only 0.36 and 0.45 mg/L, respectively, indicating any turbidity-induced TLF was minimal. pH within the alluvial aquifers of the Ganges plains is within 6–8 (Kumar et al., 2010; Saha et al., 2011; Mukherjee et al., 2012), which is within the region of marginal impact on TLF (Reynolds, 2003).
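The temperature estimate above can be reproduced with simple arithmetic. The sketch below assumes the roughly 30% change over 35 °C reported by Baker (2005) scales linearly with temperature; that linearity is a simplifying assumption made for illustration, and the function name is hypothetical.

```python
def temperature_uncertainty_pct(change_pct, over_range_c, local_range_c):
    """Scale a reported TLF change linearly to a smaller temperature range.

    Linear scaling is an assumption made for illustration only.
    """
    return change_pct / over_range_c * local_range_c


# ~30% change in TLF over 35 degrees C (Baker, 2005), applied to a 3 degree C range.
estimate_pct = temperature_uncertainty_pct(30.0, 35.0, 3.0)
rounded_pct = round(estimate_pct)
```

Under this assumption the estimate is about 2.6%, which rounds to the figure of around 3% quoted above.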
Portable tryptophan-like fluorimeters are targeted at the fluorescent peak, but there is potential for bleed-through from elsewhere within the fluorescing spectrum given the large bandpasses that are used to maximise the received signal. CDOM intensity was zero in 66% of handpumps, with a mean of 0.07 mg/L where present. Examining bleed-through from IHSS Suwannee River fulvic acid standards in the laboratory (see Supplementary) suggests CDOM interference was negligible at these low intensities. For example, a CDOM intensity of 1 mg/L (0.09 RU) would only induce an equivalent 2 mg/L peak in the tryptophan-like region. Furthermore, the study area is rural and dominated by non-mechanised agriculture, so groundwater contamination and bleed-through due to polycyclic aromatic hydrocarbons (PAHs) are unlikely. Sample absorbance was not investigated to evaluate any inner-filtering effects, as previous studies have demonstrated absorbance is typically very low in groundwater (Lapworth et al., 2008; Sorensen et al., 2015a).

Conclusions
The eradication of open defecation via access to sanitation will dramatically improve the lives of hundreds of millions of people in India. However, we suggest that sanitary interventions in rural areas are also contaminating with excreta the groundwater-derived potable supplies upon which these communities are completely dependent. In this low-vulnerability hydrogeological setting, this is likely to be a result of excreta entering individual supplies through inadequate headworks, poor sanitary seals, and/or incompetent casing. Therefore, it is considered that the widespread implementation of current on-site sanitation systems across India will inevitably lead to the faecal contamination of adjacent water supplies, irrespective of the setting.
The development of on-site sanitation needs to consider the potential adverse impacts on water supplies by mandating appropriate vertical and lateral separation. In the study villages, vertical separation was adequate, but lateral separation was problematic due to high population densities and the widespread coverage of private handpumps. Therefore, there is a need to increase awareness within these communities of the potential risks of on-site sanitation and of mitigation measures to ensure domestic water is clean prior to consumption. Given that centralised treated water supplies in rural areas are not realistic options in at least the medium term, individuals should employ a range of standard low-cost treatment measures such as boiling or solar water disinfection. Nevertheless, every effort should be made to adequately protect water supplies from faecal contamination through suitable sanitary seals, headworks and subsurface installation. This could be undertaken by raising awareness of the risks and current guidance, and perhaps better regulation of the drilling industry. In low-vulnerability settings such as this, such protection may effectively negate any risks for new tubewells.
This study reinforces the premise of tryptophan-like fluorescence (TLF) as a real-time, reagentless indicator of the faecal contamination of drinking water. In this well-protected hydrogeological setting, where the baseline intensity appeared close to zero, it was a significant predictor of thermotolerant coliforms (TTCs) and there was a significant tendency for more intense fluorescence at higher plate counts. TLF was predominantly associated with fluorophores <0.22 µm, and hence is likely to be transported more easily in groundwater than bacterial index organisms, thus serving as a more precautionary indicator of smaller enteric viruses. The collation of all concurrent published TTC-TLF data shows similar relationships across a wide range of environmental settings, and an overall significant correlation, suggesting that the technique is widely applicable for monitoring faecal contamination in drinking water supplies.