The importance of nutrient ratios in determining elevations in geosmin synthase (geoA) and 2-MIB cyclase (mic) resulting in taste and odour events

Geosmin synthase (geoA) and 2-MIB cyclase (mic) are key biosynthetic genes responsible for the production of the taste and odour (T&O) compounds geosmin and 2-MIB. These T&O compounds are becoming an increasing global problem for drinking water supplies. It is thought that geosmin and 2-MIB may be linked to, or exacerbated by, a variety of different environmental and nutrient triggers. However, to the best of our knowledge, no studies to date have evaluated the combined effects of seasonality, temperature


Introduction
Taste and odour (T&O) are the primary sensory considerations used by customers to assess the quality of drinking water (Kehoe et al., 2015). Odorous or unpalatable T&O compounds in treated drinking water can erode customer trust in water quality and generate complaints to water companies worldwide (Webber et al., 2015). Although these compounds pose no risk to human health (Sotero-Martins et al., 2021), significant, costly treatment of drinking water is required to remove them. Adsorption by activated carbon is considered an effective measure in the removal of T&O compounds (Kim et al., 2014). However, Rodriguez (2018) estimated that to remove 15 ng L −1 of a T&O compound at a flow rate of 40 million gallons per day, 5077 kg of Powdered Activated Carbon (PAC) would be required. With PAC having a market cost of around 1.2-2 $ kg −1 (Alhashimi and Aktas, 2017), treatment at this scale is very costly.
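The cost implied by these two figures can be worked through directly. The following is a minimal sketch only, assuming that the 5077 kg PAC estimate and the quoted PAC price range refer to the same dosing scenario; the function name is illustrative and does not come from the cited studies.

```python
def pac_cost_range(pac_mass_kg, price_low_per_kg, price_high_per_kg):
    """Return the (low, high) cost in $ for a given mass of PAC."""
    return pac_mass_kg * price_low_per_kg, pac_mass_kg * price_high_per_kg


# Values quoted in the text: 5077 kg of PAC at 1.2-2 $ per kg.
low_cost, high_cost = pac_cost_range(5077, 1.2, 2.0)
print(f"Estimated PAC cost: {low_cost:.0f} to {high_cost:.0f} $")
```

Under these assumptions the PAC required for the quoted scenario would cost roughly 6100 to 10,200 $.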
Geosmin (trans-1,10-dimethyl-trans-9-decalol) and 2-MIB (2-methylisoborneol) are the most common compounds associated with T&O complaints worldwide (Clercin and Druschel, 2019; Echenique et al., 2006; Hayes and Burch, 1989; Menezes et al., 2020; Perkins et al., 2019; van Rensburg et al., 2016). These compounds are produced by a variety of distantly related bacteria including Actinobacteria, Cyanobacteria and Proteobacteria (Watson, 2003), but Cyanobacteria are considered to be the main producers of the volatile T&O compounds in aquatic environments (Suurnäkki et al., 2015). Cyanobacteria have also established important connections with other phytoplankton. For example, Bar-Yosef et al. (2010) revealed a close relationship between other phytoplankton capable of alkaline phosphatase (AP) production and the cyanotoxin cylindrospermopsin. This cyanotoxin stimulates algae to produce APs used to access orthophosphate from organic phosphorus, which in turn facilitates cyanobacterial growth (Bar-Yosef et al., 2010). This is supported by Olsen et al. (2017), who suggested that diatoms specifically influence geosmin and 2-MIB production, as demonstrated by stronger correlations between 2-MIB production and diatom abundance than between 2-MIB production and cyanobacterial abundance. This is consistent with previous findings that link diatoms with the proliferation of geosmin and 2-MIB (Izaguirre and Taylor, 1998; Schrader et al., 2011; Sugiura et al., 1998, 2004).
Geosmin and 2-MIB exist as an irregular sesquiterpenoid and a monoterpene, respectively (Watson and Juttner, 2019). Geosmin and 2-MIB are produced along the metabolic pathways for isoprenoid synthesis involved in the 2-methylerythritol-4-phosphate isoprenoid (MEP) pathway, the mevalonate pathway, and the leucine pathway (Jüttner and Watson, 2007). The molecular foundation of geosmin production is the geosmin synthase gene (geoA), which encodes a bi-functional domain enzyme acting in a two-step Mg 2+ dependent reaction (Churro et al., 2020). The N-terminal part of the enzyme causes the ionization and cyclization of farnesyl diphosphate (FPP) into germacradienol, whilst the C-terminal part facilitates the protonation, cyclization, and fragmentation of the precursor germacradienol molecules into geosmin and acetone (Watson et al., 2016). For the biosynthesis of 2-MIB in Cyanobacteria, there are two metabolic steps: firstly, an S-adenosylmethionine-dependent methylation of the monoterpene precursor geranyl diphosphate (GPP) to 2-methyl-GPP catalysed by geranyl diphosphate 2-methyltransferase (GPPMT), and secondly, further cyclization of 2-methyl-GPP to 2-MIB catalysed by 2-MIB cyclase (mic), forming a putative operon (Fig. 2) (Giglio et al., 2011). Discovery of the geoA and mic genes has enabled biomolecular methods such as quantitative polymerase chain reaction (qPCR) to become available to monitor their abundance (Cane and Watt, 2003; Dickschat et al., 2007; Giglio et al., 2011; Gust et al., 2003; Komatsu et al., 2008; Wang and Cane, 2008). However, to the best of our knowledge, no primers currently developed target all geosmin and 2-MIB producing Cyanobacteria.
Geosmin and 2-MIB are recalcitrant to conventional drinking water treatment procedures such as clarification, filtration, and oxidation using chlorine (Srinivasan and Sorial, 2011). With both compounds exhibiting extremely low odour thresholds (1.3 ng L −1 for geosmin and 6.3 ng L −1 for 2-MIB; Young et al., 1996), it is of vital importance that water companies remove the source of these compounds through mitigative measures before proceeding to water treatment in order to avoid customer dissatisfaction and ultimately complaints. This is of particular concern considering that climate change can promote cyanobacterial bloom formation, which in turn has the potential to lead to more T&O events worldwide (Davis et al., 2009; Taranu et al., 2015; Zhang et al., 2017). Warming can selectively encourage cyanobacterial growth as Cyanobacteria have higher optimal growth temperatures than eukaryotic algae (Zhang et al., 2017). Shen et al. (2022) found that warmer temperatures favoured cyanobacterial growth, leading to an increase in T&O compounds. However, lower temperatures (15 °C) have also been shown to promote the expression levels of geoA and mic genes compared to 25 °C and 35 °C (Shen et al., 2022). Jeong et al. (2021) reported that Pseudanabaena yagii produced 2-MIB during the summer season and released 2-MIB under low temperature conditions in the autumn.
Many studies have focused on the environmental triggers for geosmin and 2-MIB production (Saadoun et al., 2001; Journey et al., 2013; Oh et al., 2017; Clercin and Druschel, 2019). Individual studies have focused on geoA and mic copy numbers in relation to seasonal occurrences of benthic production of geosmin and 2-MIB (Gaget et al., 2020), the transcription of the genes in response to temperature (Shen et al., 2022), and developing early detection methods (Chiu et al., 2016; John et al., 2018; Suurnäkki et al., 2015). However, to the best of our knowledge, no studies to date have evaluated the combined effects of seasonality, temperature, and nutrient concentrations on geoA and mic copy numbers together. Kutovaya and Watson (2014) developed taxon-specific PCR and qPCR assays for the early detection of geosmin; however, they were unable to determine any correlations between gene expression, temperature, and nutrient concentrations. The nutrients usually implicated in the production of geosmin and 2-MIB are nitrogen and phosphorus (Harris et al., 2016). Further, Molot et al. (2014) proposed a critical role of ferrous iron and sulphate for cyanobacterial bloom formation, yet neither ferrous iron nor sulphate has been evaluated in relation to geosmin and 2-MIB production before.
Here we assess the associations between seasonality, temperature and nutrients and the production of geosmin and 2-MIB in reservoir drinking water. We employ quantitative Polymerase Chain Reaction (qPCR) to quantify the gene abundance of geoA and mic in nine reservoirs across Wales, U.K., using a newly developed reverse mic primer to aid the detection of 2-MIB producing Cyanobacteria. Findings are discussed with relevance to triggers for, and prediction of, T&O events in drinking water supply.

Defining a T&O event
For geosmin, customer complaints start at 7 ng L −1 , and for 2-MIB customer complaints start at 12 ng L −1 (Simpson and MacLeod, 1991). According to Dŵr Cymru Welsh Water, increased sampling frequency on reservoir water begins when geosmin is >5 ng L −1 and 2-MIB is >2.5 ng L −1 . When concentrations are >10 ng L −1 for geosmin and >5 ng L −1 for 2-MIB, Granular Activated Carbon (GAC) filters are turned on if the drinking water treatment works have the facilities. If no GAC is present, Powdered Activated Carbon (PAC) dosing at low levels commences and customer complaints are checked daily. Reversion to normal operation is only authorised once at least 3 samples are below 10 ng L −1 for geosmin and 5 ng L −1 for 2-MIB.
For the purposes of this study an event is defined as a concentration measurement >10 ng L −1 for geosmin and >5 ng L −1 for 2-MIB.
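The thresholds described in this section can be expressed as a simple classification rule. The sketch below is illustrative only: the function and label names are assumptions, and the rule uses only the increased-sampling and event thresholds stated above, not the reversion criterion of three consecutive samples.

```python
# Thresholds in ng/L as stated in the text (strictly greater-than comparisons).
THRESHOLDS_NG_L = {
    "geosmin": {"increased_sampling": 5.0, "event": 10.0},
    "2-MIB": {"increased_sampling": 2.5, "event": 5.0},
}


def classify_measurement(compound, concentration_ng_l):
    """Classify a single concentration measurement for 'geosmin' or '2-MIB'."""
    limits = THRESHOLDS_NG_L[compound]
    if concentration_ng_l > limits["event"]:
        return "event"
    if concentration_ng_l > limits["increased_sampling"]:
        return "increased sampling"
    return "normal"


print(classify_measurement("geosmin", 7.0))   # above 5 but not above 10 ng/L
print(classify_measurement("2-MIB", 58.0))    # above the 5 ng/L event threshold
```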

Sample locations
Between July 2019 and August 2020, 500 mL of reservoir water was collected using bankside sampling (water depth max 0.5 m) once a month for molecular and water chemistry analysis at 28 sampling sites across nine reservoir locations within Wales, U.K. The exception to this was a sampling break between March and April 2020 due to COVID-19 restrictions. In North Wales, the selected reservoirs were reservoir 1, reservoir 3, reservoir 4, reservoir 8, and reservoir 2, and in South Wales the selected reservoirs were reservoir 6, reservoir 7, reservoir 9, and reservoir 5 (Fig. 1). The map was created using R 4.1.0 and the package 'leaflet' (Cheng et al., 2017).

Sample collection and genomic DNA extraction
The 500 mL reservoir water samples were filtered through a Sterivex filter (0.2 µm) using a vacuum manifold. One mL of ATL buffer (Qiagen, Germany) was added to the Sterivex filter before storing each sample at −20 °C for later use. Prior to extraction, 100 µL of proteinase K was added to each filter and left on a turntable for 2 h. DNA was extracted using 100 µL of sample removed from each Sterivex filter, following methods described by Fawley & Fawley (2004), with the agitation step modified to 30 s at 5 m s −1 , repeated twice, with a 5-minute interval. In brief, after bead-beating, the sample was centrifuged for 2 min at 14,000 rpm and the supernatant was recovered for use in the next step. The remaining stages of the extraction protocol used the DNeasy® Blood & Tissue Kit (QIAGEN, Germany) following manufacturer instructions. Each DNA sample was eluted in 50 µL of nuclease-free water.
A.S. Hooper et al.

Standard curves for geoA and mic
The strains used to generate each qPCR standard curve (geoA, mic, 16S rRNA) are displayed in Table 1, along with the putative 16S rRNA gene classification of each strain (for materials and methods, see Appendix A.1). 16S rRNA gene copy numbers were used to normalise geoA and mic copy numbers in samples to account for biomass.
For the generation of standard curves for 16S rRNA, equimolar concentrations of Anabaena sp. 1446/1c, Anabaena flos-aquae 30.87, Cylindrospermopsis raciborskii 1.97, Planktothrix sp. 18, and Streptomyces coelicolor M145 were amplified using primers 27F (5′-AGAGTTTGATCMTGGCTCAG-3′) and 1492R (5′-GGTTACCTTGTTACGACTT-3′), as described by DeLong (1992). The thermocycling conditions were as follows: initial denaturation at 95 °C for 2 min, followed by 40 cycles of 94 °C for 30 s (denaturation), 52 °C for 30 s (annealing), and 72 °C for 1:30 min with the extension time increased by one second every cycle (extension), and a final extension step at 72 °C for 5 min. All amplicons were cleaned and purified with the QIAquick PCR purification kit (Qiagen Ltd, Venlo, Netherlands). Additionally, prior to use, standards were quality control checked using the QIAxcel (QIAGEN, Germany) to ensure amplicons were of the expected size: 905 bp, 307 bp and 1490 bp for geoA, mic and 16S, respectively. The sources of the cultures are listed in Table 1.

mic qPCR and reverse primer development
For detection of the mic gene, the primer MIBS02F (5′-ACCTGTTACGCCACCTTCT-3′) adapted from Chiu et al. (2016) was used in conjunction with a newly developed reverse primer MIBAHR (5′-GTCATGGAGGTGTAGAAGCTGTCG-3′). The reverse primer, MIBAHR, was designed in Geneious 9.1.8 using 27 known 2-MIB producing species of Cyanobacteria (see Appendix B.1 for a phylogenetic representation of organisms used for the construction of the reverse primer, MIBAHR). Additionally, in-silico testing of the two primers was performed in the statistical software R 4.1.0 using the package 'primerTree' (Cannon et al., 2016), which identified 30 species of Cyanobacteria that can be detected using the MIBS02F and MIBAHR primer set (Appendix B.2), compared to only 23 species detected when using MIBS02F in conjunction with the reverse primer detailed in Chiu et al. (2016). Further, when using primers Mtcf/Mtcr from Wang et al. (2011) only 13 species were identified, and when using primers MIB3324F/MIB4050R and MIB3313F/MIB4226R from Suurnäkki et al. (2015) only 17 and 7 species were identified, respectively (Appendix B.3). qPCR reactions were performed in a 10 µL reaction mixture containing 5 µL of 2x GoTaq qPCR Master Mix (Promega, USA), 0.4 pmol µL −1 of each primer and 2.5 µL of template DNA. qPCR reactions were executed on a QuantStudio™ 7 Flex Real-Time PCR System, 384-well (ThermoFisher Scientific, USA) using the following conditions: hot start activation at 95 °C for 1 s (denaturation), and 65 °C for 20 s (annealing and extension), followed by melting curve analysis of the amplified products. The linear dynamic range for copy numbers in this qPCR was between 24.1 and 24.1 × 10 6 copies µL −1 .
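For readers unfamiliar with how a qPCR standard curve converts quantification cycle (Ct) values into copy numbers, the general approach can be sketched as follows. This is a generic illustration, not the analysis code used in this study: the Ct values are hypothetical, the ten-fold dilution series is assumed from the stated dynamic range, and the source does not report slopes, intercepts, or efficiencies.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per µL from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Standard efficiency formula; 1.0 corresponds to perfect doubling."""
    return 10 ** (-1.0 / slope) - 1.0


# Assumed ten-fold dilutions spanning 24.1 to 24.1e6 copies per µL.
standard_copies = [24.1 * 10 ** k for k in range(7)]
# Hypothetical Ct values generated from an idealised curve (not measured data).
hypothetical_ct = [35.0 - 3.32 * math.log10(c) for c in standard_copies]

slope, intercept = fit_standard_curve(standard_copies, hypothetical_ct)
efficiency = amplification_efficiency(slope)
estimated = copies_from_ct(hypothetical_ct[3], slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, efficiency={efficiency:.3f}")
```

Target gene copy numbers estimated in this way would then be divided by the corresponding 16S rRNA copy numbers to give the geoA:16S and mic:16S values used throughout.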

Determination of T&O compounds
To determine concentrations of geosmin and 2-MIB, Gas Chromatography-Tandem Mass Spectrometry (GC-MS/MS) was employed. 500 mL of each water sample was extracted using a solid phase extraction (SPE) technique at a Dŵr Cymru Welsh Water accredited laboratory (ISO/IEC 17025:2017). The analytes were eluted from the SPE cartridge with 1.0 ± 0.01 mL dichloromethane. An internal deuterated standard was added to the extract before it was transferred into a labelled 2 mL auto sampler vial, to assess efficiency of the run. The extract was then injected into a gas chromatograph through a multimode inlet in solvent vent mode. A flow of helium carried the analytes on to a 30 m Ultra Inert HP-5MS capillary column (Agilent, Santa Clara, USA) where they were separated by boiling point. Analytes were detected by a triple Quadrupole Mass Selective detector operating in multiple reaction monitoring mode.

Chemical and physical parameters
Mean temperature (°C) for all reservoir locations was obtained through the Meteorological Office (Met-Office) using the longitude and latitude of each reservoir's location. Nutrients: ammonium (NH 4 + ), total oxidized nitrogen (TON), nitrate (NO 3 − ), nitrite (NO 2 − ), total phosphorus (TP), sulphate (SO 4 2− ), and dissolved reactive silicate were analysed using a discrete analyser (Thermo Scientific Aquakem 600) at a Dŵr Cymru Welsh Water accredited laboratory (ISO/IEC 17025:2017). Nutrient concentrations were measured colourimetrically or turbidimetrically. The analytical method of measurement was dependent upon the reaction between the analyte within the sample and the reagents. Sulphates reacted with the reagent to produce an insoluble precipitate which was measured turbidimetrically. All other determinants produced a coloured complex when reacted with the reagents and were measured colourimetrically. At a predetermined wavelength the intensity of the coloured or turbid sample solution was proportional to the concentration of the analyte within the sample.
Two generalised additive models (GAMs) were fitted for the response variable geoA:16S copies mL −1 . The first included only data from reservoir 1; the second included all other eight reservoirs, with reservoir as an additional explanatory variable. Reservoir 1 was modelled independently from the other reservoirs because it was the only reservoir to experience high geosmin concentrations according to the GC-MS/MS data. The reservoir factor was re-levelled so that reservoir 5 was the reference level, as this was the reservoir with the lowest range of geosmin concentrations. For the response variable mic:16S copies mL −1 , one GAM included all nine reservoirs, with reservoir as an additional explanatory variable. Reservoir 5 was again the reference level, as this was the reservoir with the lowest range of 2-MIB concentrations. Nutrient ratios (NH 4 + :NO 3 − and TIN:TP) were used in preference to nutrient concentrations (TP, NO x , NH 4 , TIN) to avoid multicollinearity; the choice was made by comparing model fit and efficiency (using the Akaike Information Criterion; AIC) between models containing either nutrient fractions or nutrient ratios. Nutrient ratios produced a better model fit and explained more of the variation in the data than the nutrient concentrations. Explanatory variables were selected according to AIC, using a backward stepwise model fitting approach (Akaike, 1974; Bozdogan, 1987).
Smoother functions were applied to the covariates geosmin, 2-MIB, NH 4 + :NO 3 − and TIN:TP. Remaining variables were used as linear parametric terms. All GAMs had an inverse Gaussian error distribution with an identity link function.
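The two nutrient ratios and the AIC comparison described above can be illustrated with a short sketch. This is not the authors' modelling code: the definition of TIN as the sum of ammonium, nitrate and nitrite is an assumption (the text does not define TIN), the concentrations and log-likelihoods are hypothetical, and whether ratios were calculated on a mass or molar basis is not stated.

```python
def nutrient_ratios(nh4, no3, no2, tp):
    """Return NH4:NO3 and TIN:TP from concentrations in the same units."""
    if no3 <= 0 or tp <= 0:
        raise ValueError("nitrate and total phosphorus must be positive")
    tin = nh4 + no3 + no2  # assumed definition of total inorganic nitrogen
    return {"NH4:NO3": nh4 / no3, "TIN:TP": tin / tp}


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical concentrations (mg/L) for one sample.
ratios = nutrient_ratios(nh4=0.05, no3=0.5, no2=0.01, tp=0.02)

# Hypothetical fitted models: (log-likelihood, number of parameters).
aic_ratio_model = aic(log_likelihood=-120.0, n_params=5)
aic_concentration_model = aic(log_likelihood=-126.0, n_params=7)
preferred = "ratios" if aic_ratio_model < aic_concentration_model else "concentrations"
print(ratios, aic_ratio_model, aic_concentration_model, preferred)
```

In a backward stepwise procedure, a term would be dropped whenever removing it lowers the AIC, and the process repeats until no removal gives a further reduction.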
For 2-MIB, mean concentrations remained below the event classification threshold of 5 ng L −1 across all reservoirs throughout the sampling duration. Reservoir 7 demonstrated the highest 2-MIB concentration, ranging from 0.57 to 58 ng L −1 , but with a generally low mean (4.44 ng L −1 ± 11.59) and median concentration (0.86 ng L −1 ) as shown in Table 3. Reservoir 9 had a maximum 2-MIB concentration of 7.90 ng L −1 , which was also categorized as a 2-MIB event. Nevertheless, reservoir 9 generally had low mean (1.75 ng L −1 ± 2.14) and median concentrations (0.68 ng L −1 ).
No consistent relationships could be deduced between log 10 mic:16S copy numbers mL −1 and log 10 2-MIB concentrations ng L −1 (Appendix C.1). The maximum 2-MIB concentration (58 ng L −1 ) was seen in September 2019, which coincided with a high mic:16S copy numbers mL −1 value (5.12 mic:16S copy numbers mL −1 ). A slight positive relationship was observed in August 2020, although this was not significant (p = 0.688), and a significantly negative relationship was witnessed in October 2019 (p < 0.01).

Temporal changes in gene copy numbers and T&O concentrations by season and year
Highly significant positive relationships were observed between geosmin and geoA:16S gene copy numbers (Fig. 4) for summer 2019 (p < 0.0001) and summer 2020 (p < 0.0001). Significant relationships were apparent for winter 2019 (p < 0.01) and autumn 2019 (p < 0.05). When geosmin concentrations fell below 3.16 ng L −1 a slightly negative non-significant correlation was detected, as seen in winter 2020 (p = 0.305).
For 2-MIB and mic:16S no significant relationships could be determined. In both summer 2019 and 2020 a weak positive relationship can be seen in Appendix C.2 (p = 0.611 and 0.151, respectively). In autumn 2019 and winter 2020, non-significant negative relationships are displayed (p = 0.437 and 0.127). Winter 2019 was removed from the log 10 2-MIB and log 10 mic:16S copy numbers plot (Appendix B.2) because no variation was seen in log 10 2-MIB concentrations.

geoA:16S abundance in reservoirs with minor geosmin concentrations
In the geoA:16S copy numbers mL −1 GAM for all reservoirs except reservoir 1, summer 2020 and summer 2019 were significantly and positively different from winter 2019 (p < 0.001 and p < 0.05, respectively) (Table 5). In addition, geoA:16S in all reservoirs apart from reservoir 9 was significantly different from geoA:16S in reservoir 5.

mic:16S abundance in all reservoirs
Autumn 2019 and summer 2019 were significantly positively and negatively different, respectively, from winter 2019 (p < 0.001 for both) for mic:16S copy numbers mL −1 (Appendix D.4). Dissolved reactive silicate also had a marginally significant negative relationship with mic:16S copy numbers mL −1 (p < 0.1). 2-MIB concentrations were not found to be significantly non-linearly associated with mic:16S copy numbers mL −1 up to concentrations of 8 ng L −1 , as depicted by the horizontal estimated smooth function in Appendix D.5A. NH 4 + :NO 3 − had a significantly nonlinear relationship with mic:16S copy numbers mL −1 (p < 0.1), showing a positive trend at low ratios of NH 4 + :NO 3 − (~0.00-0.07) and at higher ratios (> 0.18) (Appendix D.5B). An inhibitory effect of NH 4 + :NO 3 − on mic:16S copy numbers mL −1 was observed between ~0.07 and 0.18.
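The three NH 4 + :NO 3 − ranges reported above can be summarised as a simple lookup. The sketch below is illustrative only: the function name and labels are assumptions, and the boundaries (0.07 and 0.18) are approximate values read from the fitted smooth, so how exact boundary values are assigned is arbitrary.

```python
def mic_response_zone(nh4_no3_ratio):
    """Map an NH4:NO3 ratio to the direction of the reported partial effect on mic:16S."""
    if nh4_no3_ratio < 0:
        raise ValueError("ratio cannot be negative")
    if nh4_no3_ratio <= 0.07:
        return "positive (low ratio)"
    if nh4_no3_ratio <= 0.18:
        return "inhibitory (intermediate ratio)"
    return "positive (high ratio)"


for ratio in (0.03, 0.10, 0.25):
    print(ratio, mic_response_zone(ratio))
```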

Discussion
This is, to our knowledge, the first study to demonstrate variations in geoA and mic abundance between different seasons. It is also the first study to report the associations between iron, sulphate, and dissolved reactive silicate with geoA and mic copy numbers in relation to T&O levels. Significant correlations were determined between geoA and geosmin concentrations by month and by season, whereas no relationships could be deduced between mic and 2-MIB concentrations by month or by season. geoA was deemed to be a suitable indicator of geosmin concentrations, but only when geosmin concentrations were elevated (>100 ng L −1 ), as seen in reservoir 1 in this study. Indicators of heightened geoA abundance, both in reservoirs experiencing elevated geosmin concentrations and in those with non-elevated concentrations, were negative linear relationships with mean temperature and dissolved reactive silicate. Both nutrient ratios (TIN:TP and NH 4 + :NO 3 − ) were significantly associated with the abundance of geoA. TIN:TP generally had the greatest effect on geoA abundance at low ratios, with inhibitory effects witnessed at intermediate TIN:TP ratios that were lifted at higher TIN:TP ratios. Model analysis also revealed that when NH 4 + :NO 3 − ratios were high, geoA abundance was also high. In addition, the positive linear relationships between sulphate and dissolved iron and geoA in a reservoir experiencing geosmin "events" should not be ignored. Although no correlations between mic and 2-MIB concentrations could be determined, model analysis revealed a significant negative linear relationship between mic and dissolved reactive silicate and a smoothed relationship with NH 4 + :NO 3 − .
A limitation of this study was the frequency of sampling (monthly), which was a constraint of the water industry project. Paerl et al. (2022) demonstrated that a large change in geosmin concentration can occur on a week-to-week basis during spring-summer, and Pochiraju et al. (2021) reported that geosmin concentration may decline by 12% within a week. Hence monthly monitoring may miss significant spikes in geoA and geosmin concentrations. Similarly, monthly monitoring of mic and 2-MIB may miss spikes in 2-MIB concentrations or the resuspension of underlying sediment containing cyanobacterial species that carry the mic gene. It is therefore suggested that weekly or at least biweekly monitoring should be implemented by water companies, to facilitate suitable accuracy in predictive capacity.

Triggers of geosmin "events"
Of the nine reservoirs studied, reservoir 1 showed the most elevated geosmin concentrations considered to be significant "events", and hence was chosen for modelling triggers of geoA prevalence relating to geosmin concentrations. Findings from this study identified geoA to be a suitable indicator of geosmin concentrations, although significant correlations were only apparent when geosmin concentrations had a large range with high maximum concentrations (≤420 ng L −1 ). In accordance with this, previous studies that have reported significant correlations between geoA and geosmin also had large ranges of concentrations with elevated maximum concentrations (10 5 ng L −1 , Su et al., 2013; 10 2 ng L −1 , Tsao et al., 2014; 10 3 ng L −1 , Otten et al., 2016). In addition, Jørgensen et al. (2016) were unable to detect geoA from surface and bottom waters that had geosmin concentrations of 1.4 and 5.8 ng L −1 , respectively. In contrast, Gaget et al. (2020) found low correlations between geoA and geosmin with geosmin concentrations up to 18 ng L −1 . This highlights the need for physical and chemical parameters to be measured in parallel with qPCR to better monitor changes in geoA prevalence in the absence of elevated geosmin concentrations (≤15 ng L −1 ). For reservoirs experiencing mild geosmin concentrations (≤15 ng L −1 ) and elevated geosmin concentrations (≤420 ng L −1 ), negative relationships existed between mean temperature, dissolved reactive silicate and geoA abundance. Similarly, Shen et al. (2022) found geoA gene expression to be higher at 15 °C than at 25 °C and 35 °C. This was consistent with findings from Zhang et al. (2009) that showed geosmin production by Lyngbya kuetzingii was maximal at low temperature (10 °C), while Saadoun, Schrader and Blevins (2001) suggested that at low temperatures, more geosmin was synthesized by Anabaena sp. Negative relationships between geoA and dissolved reactive silicate could be used as a gauge for diatom formation, as depletion of dissolved reactive silicate is usually an indicator of diatom production of silicified cell walls containing amorphous silica (frustules) (Shimizu et al., 2001). Negative relationships between dissolved reactive silicate and geoA point towards a potential mutualistic symbiotic relationship between the two phytoplankton groups. Olsen, Chislock and Wilson (2016) found that T&O production throughout their study may have been linked to Synedra sp. being used as a substrate for cyanobacterial growth and the proliferation of T&O compounds. In addition to mean temperature and dissolved reactive silicate, geosmin and both nutrient ratios (TIN:TP and NH 4 + :NO 3 − ) were good indicators for elevated geoA levels in a reservoir experiencing geosmin "events" and in reservoirs with mild geosmin concentrations. This coincides with the findings of Howard (2020), who suggested that low TN:TP favoured the growth and dominance of Cyanobacteria, whilst low NO 3 − :NH 4 + promoted the production of T&O compounds. In this study, both low TIN:TP and high NH 4 + :NO 3 − ratios were shown to be significant in relation to geoA abundance. Interestingly, at intermediate ratios of TIN:TP the abundance of geoA was reduced below the average value of expected geoA. Cyanobacteria have been reported to assimilate NH 4 + more efficiently than NO 3 − (Hampel et al., 2018), and NO 3 − has been shown to have inhibitory effects on the production of T&O compounds, for example, geosmin in Dolichospermum (Saadoun et al., 2001). TIN:TP ratios revealed that low levels of TIN:TP favoured geoA abundance in reservoirs experiencing extreme and mild geosmin concentrations. However, at intermediate ratios the response of geoA was inhibited below the average value, and when TIN:TP was high the response of geoA was regained. Youn et al. (2020) found that cyanobacterial community composition affected geosmin levels; when nitrogen concentrations were high (changing to a high TN:TP), non-nitrogen-fixing Cyanobacteria dominated. Regained geoA levels in this study after an intermediate inhibitory effect of heightened TIN:TP may reveal a transition from nitrogen-fixing Cyanobacteria to non-nitrogen-fixing Cyanobacteria. Non-nitrogen-fixing Cyanobacteria typically prefer a high TN:TP ratio, whereas nitrogen-fixing Cyanobacteria are more commonly observed in water columns experiencing low TN:TP (Elliott and May, 2008; Vrede et al., 2009). Here, we support these findings and identify high NH 4 + in proportion to low NO 3 − to be a key trigger in causing elevations in geoA in reservoirs experiencing mild and extreme geosmin concentrations, with a 0.1 NH 4 + :NO 3 − ratio threshold. Therefore, the most useful water chemistry parameter was the ratio of ammonium to nitrate, as previously found in an analysis of drinking water reservoirs in Wales and England (Perkins et al., 2019); ammonium data from this study fell within the limits set out by Perkins et al. (2019) (0.00-0.30 mg L −1 ).
Figure caption fragment: ... (C) TIN:TP ratio. The y-axis denotes the partial effect size, the comb on the x-axis shows where the values of the predictor data points lie, and the points are the residuals. The horizontal red line at the y = 0 intercept indicates the overall mean of the response (geoA abundance).
Dissolved iron and sulphate were both significantly positively associated with geoA for a reservoir experiencing geosmin "events". Molot et al. (2014) proposed that the availability of ferrous iron (Fe 2+ ) regulates the ability of Cyanobacteria to compete with other phytoplankton counterparts to assert dominance. Cyanobacteria also possess siderophores to readily convert ferric iron (Fe 3+ ) to usable Fe 2+ forms in Fe-limited environments (Wilhelm and Trick, 1994). In combination with dissolved iron, sulphate reduction to sulphide can limit Fe 2+ diffusion rates from anoxic sediments due to insoluble iron sulphide formation (Molot et al., 2014). The increase of sulphate concentrations can thus promote the availability of Fe 2+ for cyanobacterial dominance assertion. However, the significant negative relationship with dissolved reactive silicate shown in this study may serve as a better early indicator for elevated geoA.

mic and 2-MIB concentrations
In this study, no relationship between mic and 2-MIB concentrations could be determined by month or by season; however, this was likely due to the low concentrations of 2-MIB detected throughout most of this study period (0.57-58 ng L −1 ). Chiu et al. (2016) found that mic gene levels in some open water samples were below the limit of detection despite 2-MIB being detected. They proposed that this was likely a result of 2-MIB production not being indigenous to the pelagic region, e.g., originating from benthic Cyanobacteria, with 2-MIB diffusing to the open water sampling site, which would explain why no mic genes were detected. Low concentrations of 2-MIB observed throughout this study, despite high levels of mic detected, could be due to sediment resuspension, suspending benthic species containing mic. Another possible reason for poor correlation is the periodicity of sampling, as 2-MIB is lost more readily from the water column than geosmin owing to its higher volatility and biodegradation (Cho, 2007; Li et al., 2012). Both T&O compounds are associated with the thylakoid and cytoplasmic membrane proteins, although 2-MIB is less closely bound and more easily excreted than geosmin (Wu and Juttner, 1988).
For mic, the NH 4 + :NO 3 − ratio was considered a better indicator of elevated mic gene levels in the water column than 2-MIB concentrations, along with a negative linear relationship with dissolved reactive silicate. Both low and high NH 4 + :NO 3 − ratios showed the greatest partial effect on mic levels, with intermediate ratios inhibiting the levels of mic. This could be explained by the preference of Cyanobacteria for NH 4 + ; when Cyanobacteria absorb NH 4 + they immediately incorporate it into amino acids, whereas they require enzymatic reduction to use NO 3 − (Kim et al., 2017). Thus, Cyanobacteria that use NH 4 + prior to NO 3 − may experience inhibition of NO 3 − uptake (Dortch, 1990) before being able to produce the enzymes capable of reducing NO 3 − to NO 2 − and then finally to NH 4 + . Dissolved reactive silicate was also a significant proxy for elevated mic. This supports additional studies that have linked diatoms to 2-MIB production (Izaguirre and Taylor, 1998; Schrader et al., 2011; Sugiura et al., 1998, 2004). Because many studies have only been able to identify a correlation between 2-MIB and diatoms (Olsen et al., 2016), additional research is required to understand the relationship.

qPCR primer specificity for mic and geoA
In silico testing comparing established primer sets for mic quantification (MIBS02F and MIBS02R, Chiu et al., 2017; Mtcf and Mtcr, Wang et al., 2011; MIB3324F and MIB4050R, Suurnäkki et al., 2015; MIB3313F and MIB4226R, Suurnäkki et al., 2015) revealed a lack of universality (Appendix A.3). The reverse primer MIBAHR designed in this study, in combination with MIBS02F (Chiu et al., 2017), allowed for the detection of 30 cyanobacterial strains that possess mic (Appendix A.3), seven more strains than the original primer set (Appendix A.3; MIBS02F and MIBS02R; Chiu et al., 2016). Capturing a larger proportion of Cyanobacteria that possess the mic gene enabled us to better quantify the mic present in the water bodies in this study. However, Wang et al. (2011) stated that more than 40 Cyanobacteria species have been identified to produce 2-MIB. This would imply that the mic:16S copy numbers mL −1 recorded in this study may be underestimated. Thus, the lack of universality in primers used for mic detection would indicate that the data reported here are not fully representative of all Cyanobacteria that possess the mic gene in the water column.
Likewise, the geoA qPCR primers used in this study, namely geo799F (John et al., 2018) and geo982R (Suurnäkki et al., 2015), were not universal. For future reference, forward primer geo799F should be used in conjunction with the reverse primer geo927R (John et al., 2018). If taxon-specificity is required to determine which producers are present and how much geoA each contributes, multiple sets of taxon-specific primers with differing protocols would be needed (Devi et al., 2021).

Seasonal influence of T&O compounds
In accordance with Oh et al. (2017), geosmin was predicted to have the potential to cause drinking water problems in all seasons; this is reaffirmed by the winter 2019 results from this study. When geosmin concentrations were low during winter 2020 (≤0.4 log₁₀ (ng L⁻¹)), an uncoupling of the relationship between geosmin and geoA occurred, illustrated by a slight negative association. Model analysis on a reservoir experiencing geosmin "events" revealed significant differences between geoA levels during all seasons when compared to summer 2019. Dzialowski et al. (2009) found that elevated geosmin concentrations were not necessarily confined to summer months, and heightened concentrations of geosmin were found during the winter in some of the reservoirs they studied, as in this study. For mic levels, autumn 2019 was the only significant season when compared to winter 2019; this was likely due to the 2-MIB "event" (58 ng L⁻¹) witnessed in reservoir 7 during this time.
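The seasonal coupling and uncoupling described above can be examined with a per-season least-squares fit. The sketch below is illustrative only: the data points are invented, the function names are hypothetical, and placing log₁₀ geosmin on the x-axis with log₁₀ geoA:16S on the y-axis is an assumption (R² does not depend on that choice). Significance testing is omitted.

```python
from collections import defaultdict


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary to fit a line")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def fit_by_season(records):
    """Fit one regression per season from (season, log_geosmin, log_geoa) rows."""
    grouped = defaultdict(lambda: ([], []))
    for season, log_geosmin, log_geoa in records:
        grouped[season][0].append(log_geosmin)
        grouped[season][1].append(log_geoa)
    return {season: linear_fit(xs, ys) for season, (xs, ys) in grouped.items()}


# Invented values, for illustration only (not data from this study).
toy_records = [
    ("summer", 0.5, 1.0), ("summer", 1.0, 2.0),
    ("summer", 1.5, 3.0), ("summer", 2.0, 4.0),
    ("winter", 0.1, 2.0), ("winter", 0.2, 1.9),
    ("winter", 0.3, 2.1), ("winter", 0.4, 1.8),
]
fits = fit_by_season(toy_records)
```

In this toy dataset the summer fit has a positive slope with a high R², while the winter fit has a weak negative slope, which is the pattern an uncoupling at low concentrations would produce.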

Conclusions
This study demonstrates that geoA copy numbers can be used as a suitable direct proxy for geosmin concentrations during periods of elevated geosmin concentrations. Through modelling the response of geoA in parallel with physical and chemical water parameters, it can be concluded that geoA also has suitable predictive applications for geosmin concentrations ≥4 ng L⁻¹. From these data, nutrient ratios (TIN:TP and NH₄⁺:NO₃⁻) were better predictors of 2-MIB events than 2-MIB concentrations. Negative linear relationships between geoA and both dissolved reactive silicate and mean temperature should also be considered as variables for inclusion in modelling T&O event prediction. Therefore, a combined molecular dataset with water chemistry data provides a powerful predictor of geosmin-based T&O events. However, samples need to be taken at least bi-weekly to ensure that fluctuations in nutrients, gene levels and T&O concentrations are detected and are fully representative of the water body. Sample type should also be considered in relation to 2-MIB-producing species (i.e., sediment samples) to ensure benthic communities are assessed for mic abundance. To evaluate the geoA response to high and low extremes in TIN:TP ratios, metabarcoding should be implemented to allow species composition to be assessed in relation to N-fixing and non-N-fixing […]

Fig. 2. Box and whisker plot showing log₁₀ geosmin concentrations (ng L⁻¹) for all nine reservoirs over the length of this study. The length of the box indicates the interquartile range, extending from the 25th to the 75th percentile. The horizontal bar within the box denotes the median value, and the diamond shape represents the mean value. The whiskers display the range, with outliers depicted as black dots.

Fig. 3. Scatterplots of log₁₀ concentrations of geosmin (ng L⁻¹) and geoA:16S (copy numbers mL⁻¹) from all reservoirs. Individual points are coloured according to reservoir, and facet wrapped according to sampling month and year. Each scatterplot includes a linear regression line of best fit with the R² value and associated significance assigned by p values.

Fig. 4. Scatterplots of log₁₀ concentrations of geosmin (ng L⁻¹) and geoA:16S (copy numbers mL⁻¹) from all reservoirs. Individual points are coloured according to reservoir, and facet wrapped according to sampling season and year. Each scatterplot includes a linear regression line of best fit with the R² value and associated significance assigned by p values.

Fig. 5. Smooth function plots for predictor variables in the reservoir 1 geoA:16S copy numbers mL⁻¹ GAM. Estimated smooth functions (solid lines) with 95% confidence intervals (grey shaded area) are shown for each smoothed predictor: (A) geosmin concentrations (ng L⁻¹), (B) NH₄⁺:NO₃⁻ ratio and (C) TIN:TP ratio. The y-axis denotes the partial effect size, the comb on the x-axis shows where the predictor data points lie, and the points are the residuals. The horizontal red line at y = 0 indicates the overall mean of the response (geoA abundance).

Fig. 6. Smooth function plots for predictor variables in the control geoA:16S copy numbers mL⁻¹ GAM. Estimated smooth functions (solid lines) with 95% confidence intervals (grey shaded area) are shown for each smoothed predictor: (A) geosmin concentrations (ng L⁻¹), (B) NH₄⁺:NO₃⁻ ratio and (C) TIN:TP ratio. The y-axis denotes the partial effect size, the comb on the x-axis shows where the predictor data points lie, and the points are the residuals. The horizontal red line at y = 0 indicates the overall mean of the response (geoA abundance).

Table 1
List of strains used for standard curves in all qPCR reactions with their origin and the putative 16S classification after genomic classification with corresponding percentage identity.

Table 2
Statistical data of geosmin concentrations per reservoir over duration of this study.

Table 3
Statistical data of 2-MIB concentrations per reservoir over duration of this study.

Table 4
GAM model results for reservoir 1 with geoA:16S copy numbers mL⁻¹ as the response variable, using summer 2019 as the reference level for seasonal comparison.

Table 5
GAM model results for all reservoirs except reservoir 1, with geoA:16S copy numbers mL⁻¹ as the response variable, using winter 2019 as the reference level for seasonal comparison and reservoir 5 as the reference level for reservoir comparison. Note: Variables with significant influences are indicated by: . p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001.
A.S.Hooper et al.