Interactive comment on “Regional-scale identification of groundwater-surface water interaction using hydrochemistry and multivariate statistical methods, Wairarapa Valley, New Zealand”

In this paper, the authors use multivariate statistical methods to infer spatial patterns of surface water-groundwater interaction using a regional, historical water quality data set. The authors first refine the large chemical data set (of both surface and ground waters) by integrating and reducing reported field and laboratory results, summarizing multiple samples from individual sites using medians, and assessing data quality using a charge balance approach. They then apply hierarchical cluster analysis (HCA) to group sets of sites that are statistically similar and then use principal component analysis to better


Introduction
It is widely acknowledged that groundwaters and surface waters interact at a variety of spatial and temporal scales, to a degree that is highly dependent on meteorological, fluvial, anthropogenic and geological processes (Winter et al., 1998). Such interactions play a significant role in determining the quantity and hydrochemical composition of water bodies at both local and regional scales. Therefore, identifying locations and understanding mechanisms of groundwater-surface water interaction is vital for effective environmental management (Dahm et al., 1998).
Existing approaches generally attempt to infer groundwater-surface water interaction by quantifying changes in water temperature, stage and discharge, water chemistry and/or hydrograph separation (Kalbus et al., 2006; Sophocleous, 2002). These methods generally assume that a change in one or more of these parameters is caused or influenced by the transfer of water across the aquifer-stream boundary. Hydrograph separation is an exception: it involves separating stream flow into the runoff components base flow and quickflow (e.g. Laudon and Slaymaker, 1997; Hugenschmidt et al., 2010) and then assuming that base flow represents groundwater discharge into the stream (Kalbus et al., 2006). These methods have a range of benefits and limitations, and the appropriateness of each method is highly dependent on the study's desired outcomes and available resources.
Published by Copernicus Publications on behalf of the European Geosciences Union.
In this study, our focus is on extending the hydrochemical methods for identifying groundwater-surface water interaction. The hydrochemical composition of fluvial systems is largely influenced by precipitation, regional geology, aquifer-stream transfers and, increasingly, anthropogenic contamination. Consequently, water quality data offer insight into the various flow pathways, geological settings and physical and chemical processes that water experiences (Freeze and Cherry, 1979; Güler and Thyne, 2004; Kumar et al., 2009). Numerous studies suggest that similarities in water composition between neighbouring groundwater and surface water bodies, such as ion ratios or the concentration of total dissolved solids (TDS), can be used to qualitatively or statistically infer potential groundwater-surface water interaction (e.g. Burden, 1982; Kumar et al., 2009; Taylor et al., 1989) or, if water flux data are available, to quantitatively estimate fractions of water derived from different sources or end-members with distinct hydrochemistry (Schmidt et al., 2010; Vásquez-Suñé et al., 2010; de Louw et al., 2011).
Multivariate methods such as Hierarchical Cluster Analysis (HCA) and Principal Components Analysis (PCA) have become increasingly popular in the qualitative and statistical analysis and interpretation of environmental datasets, especially where water flux data are not available. These techniques enable the simplification of large datasets by grouping data into components or clusters based on the relationship between specified variables (Kumar et al., 2009; Mencio and Mas-Pla, 2008). These components or clusters can then be related to the processes that create variability in the dataset, to gain a greater understanding of natural earth systems and their operations. Both HCA and PCA have been extensively applied in hydrogeology to identify groundwater facies that merit further exploration in relation to natural groundwater evolution (e.g. Banoeng-Yakubo et al., 2009; Cloutier et al., 2008; Demlie et al., 2007; Güler and Thyne, 2004; Koh et al., 2009; Delgado-Outeriño et al., 2011), contamination from salt water intrusion (e.g. Kim et al., 2003; Ritzi et al., 1993) and contamination from anthropogenic sources (e.g. Demlie et al., 2007; Koh et al., 2009). Several studies have also utilized multivariate statistics in the classification of surface waters (e.g. Alther, 1989; McNeil et al., 2005; Panda et al., 2006), while Güler and Thyne (2004) and Thyne et al. (2004) classified both groundwater and surface water bodies simultaneously at a multi-watershed and single watershed scale, respectively, to characterize the hydrogeology of an area and identify localised areas of groundwater recharge and discharge. Recent work by Kumar et al. (2009) utilised similar statistical methodologies to specifically identify the provision of contaminated groundwater base flow to an urban reach of the Yamuna River in India, classifying groundwater and surface water bodies together based on hydrochemical similarities. Lautz and Finelli (2008) applied similar multivariate statistical techniques to evaluate groundwater-surface water interaction at a more localised scale. These studies demonstrate significant chemical correlation between groundwater and surface water and highlight the use of multivariate statistical methods as an investigative tool to identify potential areas of groundwater-surface water interaction.
To date, this methodology of inferring potential sites of groundwater-surface water interaction using existing hydrochemical data and multivariate statistical methods has been neglected at a regional or multi-watershed scale, despite the availability of extensive regional hydrochemical datasets from both groundwater and surface water bodies (e.g. State of Environment monitoring in New Zealand, US Environmental Protection Agency monitoring data, European-wide national infrastructure monitoring programmes in place to satisfy the legislative demands of the recent European Water Framework Directive). Such datasets have been utilised to classify groundwaters and surface waters individually at a regional scale (e.g. McNeil et al., 2005; Thyne et al., 2004), but to our knowledge, few studies have attempted to simultaneously classify both groundwaters and surface waters into similar hydrochemical facies to determine their interrelationships and degree of interaction. Identification of areas of groundwater-surface water interaction using existing data would greatly assist in the selection of key areas or specific watersheds that are sensitive to environmental change and so may require future monitoring and investigation.
This study makes two main contributions with regard to the use of multivariate statistical techniques to specifically identify areas and mechanisms of potential groundwater-surface water interaction. First, although some previous studies have employed multivariate statistics to assess hydrochemical variations amongst rivers, or amongst groundwaters, very few studies have applied these statistical methods simultaneously to groundwaters and rivers. Second, of the few studies that have evaluated hydrochemical variations between groundwaters and rivers using multivariate statistics, none to our knowledge has focused on the regional, multiple-catchment watershed scale. Hydrochemical data from groundwater and surface water bodies within the pastorally dominated Wairarapa valley in New Zealand were subjected to HCA and PCA and classified into hydrochemical facies based on their chemical composition. The Wairarapa valley was chosen for investigation because the intensification of agriculture and subsequent degradation of water quality is leading to the need for a greater understanding of hydrological systems and the potential transport of contaminants across the aquifer-stream boundary in this region. Groundwater and surface water monitoring sites were grouped together in a number of facies, suggesting similarities in their hydrochemical composition. The drivers of hydrochemistry, including groundwater-surface water interaction, were inferred and then validated by comparing the hydrochemical facies defined by HCA and PCA to previous measurements of ¹⁸O, water age and excess air (Morgenstern, 2005).

Site description: Wairarapa Valley
The Wairarapa valley is a 3000 km² geological depression located in the south-east corner of the North Island of New Zealand (Fig. 1). The valley extends approximately 80 km north-east to south-west and largely overlies the locked subduction interface of the Australian and Pacific plates (McConchie, 2000). The valley is bounded by a series of axial Torlesse greywacke ranges on its western periphery with the Rimutaka and Tararua Ranges and similarly on the southeast periphery with the Aorangi ranges (Fig. 2a). These hard, heavily vegetated, Triassic-Jurassic greywackes form the highest relief in the area (ca. 1500 m) and are at times dissected by sheared blocks of basalt, chert, limestone and mudstone (Kamp, 1992). The north-east boundary of the valley is bounded by soft, early Pleistocene/Late Tertiary marine sandstone, siltstone, mudstone and limestone ranges (McConchie, 2000). The valley itself is filled with successive layers of well sorted to poorly sorted Quaternary glacial-fluvial sedimentary fans, dissected and reworked by fluvial systems (Fig. 2b). Substantial marine, estuarine and lacustrine depositional layers have accumulated in the lower subsiding 25 km of the valley due to global climate cycles and subsequent sea level fluctuations (Begg et al., 2005). As a result, the lower basin contains 40-50 m thickness of postglacial estuarine mud, underlain by a sequence of lacustrine deposits that houses at least six thin artesian gravel layers.
The complex mosaic of sedimentary layers within the Wairarapa basin has resulted in a dynamic regional groundwater system that overrides a low permeability deposit of middle Quaternary clay and silt sediments (Begg et al., 2005). This groundwater system is compartmentalized into various sub-regional flow systems by the Mokonui, Masterton and Carterton faults that traverse the upper Wairarapa valley (Jones and Gyopari, 2006). Along these faults, older, less permeable sedimentary layers have been pushed towards the surface, creating barriers that locally restrict the movement of groundwater. Localised regions of subsidence and uplift, the result of tectonic fault movement, have also resulted in elevated basement and Quaternary sediments such as Tiffen Hill and Fernhill, and localised depressions such as the Te Ore Ore Basin (Fig. 2c).
Groundwaters within the Wairarapa valley generally flow down valley in a south-easterly direction (Fig. 2c). The potentiometric surface largely follows changes in topography but is also heavily influenced by the presence of impermeable barriers such as Tiffen Hill, Fernhill and the Te Maire Ridge that redirect groundwater flow (Fig. 2c). In the lower valley the piezometric gradient is reduced as groundwaters flow into the subsiding valley and are directed toward the area beneath Lake Wairarapa (Begg et al., 2005). In general, aquifers located within the upper and middle Wairarapa valley are largely unconfined and hosted within mid- to well-sorted alluvial gravels. In contrast, aquifers located within the lower valley are typically confined (Fig. 2c). Recharge of the Wairarapa groundwater system is largely determined by Quaternary surface sediments and the location of river systems. Rainfall recharge dominates the majority of the valley, whereas groundwaters underlying permeable Q1 river gravels are thought to receive recharge primarily from overlying river systems (Morgenstern, 2005). A large proportion of groundwaters within Q1, Q2 and Q4 sediments receive recharge from both mechanisms.
The climate of the area is highly influenced by the Rimutaka and Tararua ranges, which act as a topographic barrier, sheltering the valley from the predominantly westerly winds and creating a vast disparity in rainfall distribution. The highest mean annual rainfall is experienced in the Rimutaka and Tararua ranges (ca. 6000 mm yr⁻¹), while progressively less precipitation is experienced in the valley itself (ca. 800-1200 mm yr⁻¹). Rainfall is more variable during the summer months, which are generally drier; higher intensity rainfall is recorded during the winter. The valley floor experiences mean annual temperatures of 12-14 °C and it is common for dry föhn winds to move across the valley during the summer months. [...] months November to March (ca. 16.6 °C) and can be affected by summer drought conditions (Hawke et al., 2000).
A number of river systems flow across the Wairarapa Valley, with their headwaters in the low to mid altitude Rimutaka, Tararua and Eastern Wairarapa hills (Figs. 1 and 2a). Water is largely provided to these systems by rainfall and snowmelt; however, a significant proportion is also supplied to these rivers from underlying groundwater systems, subsurface and soil flows. As a result of the numerous input sources, flow displays significant seasonal variability, with high flows generally experienced during the winter months May to August. In general, these rivers lose water to underlying groundwater systems as they exit the surrounding hills and cross historic permeable outwash plains. Further downstream these same rivers usually switch to gaining systems in which a proportion of base flow is provided by groundwater systems (Jones and Gyopari, 2006). Concurrent flow gauging surveys conducted by the Greater Wellington Regional Council (GWRC) in 2006 resulted in the classification of river systems within the valley into three interaction categories: rivers that lose significant volumes of water to underlying groundwaters, rivers that gain significant volumes of water from underlying groundwaters, and neutral rivers that do not gain or lose significant volumes of water from groundwater.
Land use in the Wairarapa Valley is dominated by pastoral agriculture, which covers approximately 76 % of the valley floor and includes viticulture and market gardening. These activities are known to add nutrients and chemicals (e.g. NO3, NH4, K, Cl) to the land through fertilizers, soil cultivation and the discharge of effluents (Watts, 2005). Further, these land use activities place significant pressure on groundwater and surface water systems for water abstraction. In 1999, groundwater allocation was 48 million m³ yr⁻¹, the majority of which was abstracted for irrigation during the warmer months of October to March (Jones and Baker, 2005). Groundwater and surface water abstraction has continued to increase substantially over the last decade in response to rising agriculture and horticulture needs (Jones and Baker, 2005). In December 2004, there were over 150 individual surface water and 318 groundwater extraction permits, a number which is likely to have increased in recent years. The extent to which abstraction influences groundwater-surface water interaction in the area is not well understood.

The hydrochemical database
[...] monitoring sites. The dataset included results for 50 different parameters (not all analysed for every sample), including both "dissolved" and "total" concentrations of Na, K, Ca, Mg, B, Fe, P, Mn, SO4, Cl and SiO2. These referred to analyses conducted on field-filtered (dissolved) and unfiltered (total) water samples, respectively. Dissolved concentrations were selected for use, when available, as they were deemed less likely to be affected by post-sampling chemical processes. For samples that had not been analysed for dissolved concentrations, total concentrations were used instead. Likewise, both "field" and "lab" measurements of pH and conductivity were available for some samples. Preference was given to field measurements as post-sampling reactions (e.g. degassing) can influence pH, but lab measurements were used if field measurements were not available. These substitutions aimed to maximize the amount of data included in subsequent statistical analysis.
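The substitution rules for this database (dissolved preferred over total concentrations, field preferred over lab measurements) can be sketched as a simple column fallback. This is only an illustration in Python with pandas; the table layout, column names and values are hypothetical and are not taken from the GWRC dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sample table; column names and values are illustrative only.
samples = pd.DataFrame({
    "site": ["S1", "S1", "S2"],
    "Ca_dissolved": [12.0, np.nan, np.nan],
    "Ca_total": [12.5, 14.0, np.nan],
    "pH_field": [np.nan, 6.8, 7.1],
    "pH_lab": [7.0, 7.2, 7.4],
})

def prefer(df, primary, fallback):
    """Return the primary column, filled from the fallback where it is missing."""
    return df[primary].combine_first(df[fallback])

# Dissolved concentration where available, otherwise total.
samples["Ca"] = prefer(samples, "Ca_dissolved", "Ca_total")
# Field measurement where available, otherwise lab.
samples["pH"] = prefer(samples, "pH_field", "pH_lab")
```

A value stays missing only when both sources are missing, which mirrors the aim of maximizing the data retained for later analysis.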
Although multivariate statistical methods such as HCA can process datasets of significant size, it was not feasible to process every individual sample from each monitoring site. Therefore, in order to reduce the database to a practical size, a representative value was determined for each parameter at each site, based on all of the available samples. Of the groundwater sites, 250 had been sampled only once, and results from these single samples were therefore used directly in subsequent statistical analysis. For the remaining groundwater sites and all of the river sites, the log-probability method, based on the underlying theory of Helsel and Cohn (1988), was employed to calculate representative median values for each analyte at each site, using an automatic water quality processing program developed by Daughney (2005, 2007). Median values were chosen, as opposed to means, in order to reduce the influence of outliers and provide a more accurate snapshot of water quality (Helsel and Hirsch, 1992). The log-probability method was deemed appropriate as it can account for up to 70 % of the dataset being below a known detection limit (censored values) (Helsel and Cohn, 1988). Censoring is a common occurrence in hydrochemical datasets and, with other techniques, would usually reduce the number of input values used in the calculation of the medians. Median parameter values calculated in this manner were based on at least twelve samples for all 28 of the river water sites and 79 of the groundwater sites; median parameter values for the remaining 278 groundwater sites were based on between two and eleven samples. The resulting output was a 50 parameter × 635 site data array; if data were not available for a given parameter at a given site, it was left blank in the array. It is acknowledged that biases may have been created in this array, i.e. where representative parameter values are based on just one sample at some sites, or are calculated medians based on two to more than twelve samples at other sites. Thus three different QA/QC procedures were applied to ensure that meaningful results would be obtained from subsequent statistical analysis: assessment of temporal trends, calculation of charge balance error (both described below in this section), and identification of sites/samples with unusual chemistry (Sect. 3.2).
The dataset was screened for the presence of temporal trends on a per-parameter basis for all sites that had been sampled at least twelve times (79 groundwater sites and 28 river water sites). This was intended to determine whether the calculated median parameter values based on samples collected in different years might be biased by temporal changes in water quality, and whether data from single samples collected at different sites years apart could be meaningfully compared. Identification of temporal trends was based on the seasonal Mann-Kendall trend test (α = 0.05), with the magnitude of significant trends quantified using Sen's slope estimator (Helsel and Hirsch, 1992; Daughney, 2005, 2007). Significant parameter-specific temporal trends could be detected for roughly 25 % of the sites considered, but rates of change were typically less than 2 % per year. It is assumed that the same magnitude and frequency of occurrence of trends applies at the other sites for which insufficient data were available to conduct the Mann-Kendall test. Hence, we conclude that the 50 parameter × 635 site data array derived as described above is not strongly biased by existing temporal trends, because the trends are generally slow and occur at a minority of sites.
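As a rough illustration of this screening step, the sketch below implements the plain (non-seasonal) Mann-Kendall test and Sen's slope estimator in Python. It is a simplification and not the procedure used in the study: it ignores seasonality and tied values, and the function names are hypothetical.

```python
import math
import numpy as np

def mann_kendall(values):
    """Non-seasonal Mann-Kendall test without tie correction.

    Returns the S statistic and a two-sided p-value from the normal approximation.
    """
    x = np.asarray(values, dtype=float)
    n = len(x)
    # S is the number of increasing pairs minus the number of decreasing pairs.
    s = sum(np.sign(x[j] - x[i]) for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))
    return float(s), p

def sen_slope(times, values):
    """Sen's slope: the median of the slopes between all pairs of observations."""
    t = np.asarray(times, dtype=float)
    x = np.asarray(values, dtype=float)
    slopes = [(x[j] - x[i]) / (t[j] - t[i])
              for i in range(len(x) - 1)
              for j in range(i + 1, len(x))
              if t[j] != t[i]]
    return float(np.median(slopes))
```

A trend would be flagged when the p-value falls below α = 0.05. Dividing the Sen slope by the site median and multiplying by 100 expresses the rate of change in percent per year.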
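The dataset was also screened based on charge balance error (CBE), calculated after Freeze and Cherry (1979), to identify samples that were electrically unbalanced (Eq. 1).

$$\mathrm{CBE} = \frac{\sum v\, m_c - \sum v\, m_a}{\sum v\, m_c + \sum v\, m_a} \times 100\,\% \qquad (1)$$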
Here, v is the absolute value of the ionic valence, m_c is the molarity of the cationic species and m_a is the molarity of the anionic species. CBE values were calculated at each site using the median concentrations of the cations Na, K, Ca, Mg and the anions HCO3, Cl and SO4. Other analytes (e.g. Mn, NH4, NO3, Fe) were excluded from the CBE calculations because they had not been analysed at a significant number of sites. For sites at which these other parameters had been analysed, their concentrations are low and so their inclusion in the calculations affected CBE values by less than 0.5 %; hence, CBE values based on major ion concentrations are considered meaningful for this study. For sites that had been sampled only once, calculation of CBE as described above is conventional. However, for sites that had been sampled more than once, CBE values were calculated for each monitoring location using site-specific median concentrations rather than results for individual samples. This approach was followed because a large number of individual samples lacked the full suite of parameters required to calculate CBE. Although unconventional, this approach was reasonable for this study because temporal trends are slow or non-detectable for most parameters at most sites. In total, CBE values were calculated for 383 sites, with the remaining 252 sites excluded because median values for two or more major ions were missing due to incomplete datasets. Of the 383 sites at which CBE could be calculated as described above, 56 sites had CBE above +10 % and 22 sites had CBE below −10 %. All sites for which CBE could not be calculated or where CBE was above +10 % or below −10 % were excluded from subsequent statistical analysis. The CBE threshold of ±10 % was selected (Güler et al., 2002), as opposed to the industry standard threshold of ±5 % (Freeze and Cherry, 1979), to exclude only those sites with severe charge imbalances indicative of sampling, analytical or data entry errors. The resulting output was thus a 50 parameter × 305 site data array. Of these 305 sites, 239 were groundwater bores ranging from 2.0 to 113.0 m depth (25th, 50th and 75th percentiles of 7.8, 16.0 and 35.8 m, respectively).
www.hydrol-earth-syst-sci.net/15/3383/2011/ Hydrol. Earth Syst. Sci., 15, 3383-3398, 2011
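The CBE screening can be sketched in a few lines of Python. The example assumes median concentrations are supplied in mg/L (the units are an assumption, as is the dictionary layout) and converts them to charge equivalents using standard molar masses; because CBE is a ratio, any consistent molar unit gives the same result. Function names are hypothetical.

```python
# Molar mass (g/mol) and absolute valence for the seven major ions named in the text.
CATIONS = {"Na": (22.99, 1), "K": (39.10, 1), "Ca": (40.08, 2), "Mg": (24.305, 2)}
ANIONS = {"HCO3": (61.02, 1), "Cl": (35.45, 1), "SO4": (96.06, 2)}

def charge_sum(conc_mg_l, ions):
    """Sum of valence times molar concentration (meq/L) over the given ions."""
    return sum(v * conc_mg_l[name] / mass for name, (mass, v) in ions.items())

def cbe_percent(conc_mg_l):
    """Charge balance error in percent; positive values indicate excess cations."""
    cations = charge_sum(conc_mg_l, CATIONS)
    anions = charge_sum(conc_mg_l, ANIONS)
    return 100.0 * (cations - anions) / (cations + anions)

def passes_cbe(conc_mg_l, threshold=10.0):
    """True if the site falls within the +/- threshold (10 % in this study)."""
    return abs(cbe_percent(conc_mg_l)) <= threshold
```

In this sketch a site with a missing major ion raises a `KeyError` rather than being silently balanced, so such sites have to be excluded explicitly beforehand.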

Multivariate statistical methods
In order to infer locations of potential groundwater-surface water interaction, the 50 parameter × 305 site data array was subjected to HCA. This and other statistical procedures discussed in this paper were conducted using the statistical package STATGRAPHICS Centurion (Version 15.2.12). HCA is a data reduction tool that works by partitioning a set of observations (e.g. monitoring sites) into a distinct number of clusters based on the statistical similarity of a given set of parameters (e.g. median values of water quality parameters) (Timm, 2002; Kumar et al., 2009). Observations grouped together within the same clusters are statistically similar (at a 95 % confidence level), while observations in different groups show little statistical similarity. Following previous studies, HCA was conducted using site-specific median values of selected parameters (e.g. Reiley et al., 1990; Güler et al., 2002). In this study, HCA was conducted using Ca, Mg, K, Na, HCO3, Cl, SO4 and electrical conductivity. These analytes were selected as they were deemed likely to reflect changes in regional lithology and are most likely to indicate groundwater-surface water interaction because they are known to differ substantially between groundwater and surface water bodies. Further, these parameters were the most commonly analysed across all sites, whereas many additional parameters (e.g. nutrients, pH) were missing from individual samples and monitoring sites. Including these additional parameters in the clustering process would have substantially reduced the number of sites analysed. In total, 276 individual monitoring sites provided a full suite of these eight parameters and satisfied the CBE test, and so were included in the clustering algorithm. Prior to this process, data were log-transformed to meet the assumptions of homoscedasticity and normal distribution that are required for the cluster analysis procedure and then standardized according to their z-scores (Venugopal et al., 2008; Woocay and Walton, 2008):

$$z = \frac{x - \bar{x}}{s} \qquad (2)$$

where z represents the z-score for the site and analyte of interest, x represents the median of that analyte at that site, $\bar{x}$ represents the mean of the medians of that analyte over all surface water and groundwater monitoring sites, and s represents the standard deviation of the medians of the analyte over all monitoring sites. The statistical similarity between individual monitoring sites was measured by the Squared Euclidean Distance (SED) between the z-scores of the medians of the water quality parameters at sites i and k, as given in Eq. (3):

$$\mathrm{SED}_{ik} = \sum_{j=1}^{n} \left( z_{ij} - z_{kj} \right)^2 \qquad (3)$$

The SED is the squared geometric distance in multidimensional space between two sites i and k, based on the difference in the j-th hydrochemical parameter, and summed over all n hydrochemical parameters considered in the HCA algorithm (Kumar et al., 2009). The SED has been found to produce the most distinctive sample groupings (e.g. Adar et al., 1992; Cloutier et al., 2008; Daughney and Reeves, 2005, 2006; Güler et al., 2002).
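The transformation and distance calculation described here can be illustrated with NumPy. This is a minimal sketch, not the STATGRAPHICS procedure: the base-10 logarithm and the sample standard deviation (`ddof=1`) are assumptions, and the function names are hypothetical. The input is a sites × parameters array of positive site medians.

```python
import numpy as np

def zscores(medians):
    """Log-transform site medians, then standardize each parameter (column)."""
    logs = np.log10(np.asarray(medians, dtype=float))
    return (logs - logs.mean(axis=0)) / logs.std(axis=0, ddof=1)

def sed_matrix(z):
    """Squared Euclidean distance between every pair of sites (rows of z)."""
    diff = z[:, None, :] - z[None, :, :]   # shape: sites x sites x parameters
    return (diff ** 2).sum(axis=-1)
```

For example, three sites with medians `[[1, 10], [10, 100], [100, 1000]]` standardize to z-scores of −1, 0 and 1 in both parameters, giving an SED of 2 between neighbouring sites and 8 between the first and last.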
Following the calculation of the distance (similarity) between monitoring sites, each observation is placed automatically into an individual cluster. Clusters are then combined in a stepwise fashion, two at a time, based on their similarity measurement. This agglomerative process continues until all clusters have been joined, with the user specifying a target number of clusters (e.g. seven clusters) (Hair et al., 2006). Two methods of agglomerative HCA were employed. The first, the Nearest Neighbour linkage rule, was used to identify sites with unusual chemistry compared to other sites in the dataset. These sites are termed "outliers", and may exist in the dataset due, for example, to the approach used in this study to determine median parameter values, in spite of the CBE and temporal trend calculations applied for data QA/QC specifically to avoid this issue. It is not known a priori how many outlier sites might exist in the dataset, and so they were identified by a two-step process. First, potential outlier sites were identified by a sharp change in slope on the agglomeration-distance plot, as is typical in many applications of HCA (Timm, 2002). Second, following a standard approach for outlier detection (Helsel and Hirsch, 1992), sites were confirmed as outliers if, for each parameter considered in HCA, their concentrations were more than two times the median absolute deviation greater or less than the median across all non-outlier sites. Outlier sites identified in this way were removed to prevent them biasing further statistical analysis, but their hydrochemistry is discussed separately in Sect. 4.1.
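The second, confirmatory step of the outlier screening can be sketched as follows. The example assumes the candidate sites have already been flagged (in the study, from the Nearest Neighbour agglomeration-distance plot, which is not reproduced here) and applies the two-times-MAD rule to every parameter. The function name and array layout are hypothetical.

```python
import numpy as np

def confirm_outliers(values, candidates, k=2.0):
    """Confirm candidate outlier sites with a median absolute deviation (MAD) rule.

    values:     sites x parameters array of site medians
    candidates: boolean array marking sites flagged in the first step
    A candidate is confirmed only if every parameter lies more than k * MAD
    from the median of the non-candidate sites.
    """
    values = np.asarray(values, dtype=float)
    candidates = np.asarray(candidates, dtype=bool)
    reference = values[~candidates]
    med = np.median(reference, axis=0)
    mad = np.median(np.abs(reference - med), axis=0)
    far = np.abs(values - med) > k * mad
    return candidates & far.all(axis=1)
```

Requiring all parameters to exceed the limit follows the wording "for each parameter considered in HCA"; a looser reading (any parameter) would replace `all` with `any`.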
Following the exclusion of outlier sites, the remaining sites were subjected to further clustering using Ward's linkage method. Ward's linkage method evaluates the distance between clusters using an analysis of variance procedure and has been found by other studies to create smaller and more distinct clusters than other linkage rules (Cloutier et al., 2008; Daughney and Reeves, 2005).
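For readers without access to STATGRAPHICS, a comparable Ward clustering can be sketched with SciPy. This is an assumed substitute and not the software used in the study; in particular, SciPy reports merge heights on a Euclidean rather than squared Euclidean scale, so dendrogram thresholds are unlikely to be directly comparable between packages. The function name is hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def ward_clusters(z, n_clusters):
    """Cluster standardized site data (sites x parameters) with Ward's method.

    Returns one integer label per site, cutting the tree so that at most
    n_clusters clusters remain.
    """
    tree = linkage(np.asarray(z, dtype=float), method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

Calling `ward_clusters(z, 7)` on the z-scored array would return a label for each site; the full `tree` can also be passed to `scipy.cluster.hierarchy.dendrogram` to inspect alternative cut levels.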
PCA was also employed to evaluate the water chemistry dataset. PCA is a statistical method that reduces data sets without removing underlying patterns or variance of the original data (Wackernagel, 2003). PCA achieves this by identifying components that statistically explain variance in the original data set (Hagg and Westrich, 2002). PCA was conducted using site-specific median values for the eight parameters included in HCA, plus log-transformed median concentrations for the analytes Fe, total reactive phosphorus, NO3 and NH4. All input variables were standardised by z-scoring according to Eq. (2) prior to PCA. In order to reduce the overlap of water quality parameters over the principal components, a varimax rotation was conducted to produce either extreme negative, extreme positive, or near-zero loadings (Davis, 1986). Following the established Kaiser criterion, only components with an eigenvalue greater than 1 were retained.
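The PCA workflow described here (eigen-decomposition, Kaiser criterion, varimax rotation) can be sketched with NumPy. This is an illustrative implementation under stated assumptions, not the STATGRAPHICS routine: components are extracted from the correlation matrix of the z-scored data, loadings are scaled by the square root of the eigenvalues, and the rotation uses the common SVD-based varimax iteration without Kaiser normalization. Function names are hypothetical.

```python
import numpy as np

def pca_kaiser(z):
    """PCA on standardized data; keep components with eigenvalue > 1 (Kaiser).

    Returns all eigenvalues (descending) and the loadings of retained components.
    """
    corr = np.corrcoef(np.asarray(z, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a parameters x components loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        column_ss = np.diag((rotated ** 2).sum(axis=0))
        u, s, vt = np.linalg.svd(loadings.T @ (rotated ** 3 - rotated @ column_ss / p))
        rotation = u @ vt
        if s.sum() < criterion * (1 + tol):
            break  # the varimax criterion has stopped increasing
        criterion = s.sum()
    return loadings @ rotation
```

Because the rotation is orthogonal, each parameter's communality (the row sum of squared loadings) is unchanged; only the distribution of loadings across components shifts toward extreme or near-zero values.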

Hierarchical cluster analysis
Eight monitoring sites with unusual hydrochemistry were identified using the Nearest Neighbour linkage rule and were excluded from further HCA. The hydrochemistry of these outlier sites is summarised in Table 1. There are no obvious spatial correlations of these sites within the Wairarapa valley. This suggests that the unusual chemistry displayed at these sites is not influenced by a spatially distributed set of processes such as lithology or land use. All but one of these outlier sites had been sampled only once. It is therefore likely that analytical results from these outlier sites are biased by sampling or data reporting errors, despite having CBE within the acceptable range. One outlier site, S27/0442, had been sampled more than twelve times and is shown to display hydrochemistry that is consistent with extremely evolved groundwaters. Because this investigation aims to identify areas of groundwater-surface water interaction, which is expected to involve relatively fresh waters, the exclusion of this site is not considered an issue. Although further analysis of these outliers is not warranted in this paper, it should be noted that the outlier sites can still provide important information for understanding the water resources of the study area.
The remaining 268 groundwater and surface water monitoring sites were subjected to HCA using Ward's linkage method. In our study the resulting dendrogram, presented in Fig. 4, allowed for the visual identification of seven major clusters, or hydrochemical facies, at a SED threshold of 600. As identified by Daughney and Reeves (2005) and Güler et al. (2002), the separation threshold can be lowered or raised to allow for the identification of more or fewer clusters. Seven clusters were deemed appropriate for this study, this number being a practical size for further analysis; further evidence supporting the appropriateness of this cluster threshold is provided in the following section. The number of monitoring sites assigned to each cluster and their hydrochemical medians are presented in Table 2. Surface water monitoring stations were assigned to three of the seven arbitrarily named clusters, with 36 % of all surface water sites assigned to cluster A1, 46 % to cluster A2 and 18 % to cluster B1. Clusters B2-B5 consisted entirely of groundwater monitoring sites, showing that these clusters have different hydrochemistry compared to the surface water monitoring locations.

Hydrochemical differences between clusters
Although HCA can assign monitoring sites to individual clusters based on similarity in their hydrochemical parameters, information on the specific water quality parameters that distinguish and differentiate each cluster cannot be extracted from a dendrogram (Daughney and Reeves, 2005, 2006). Thus, we examined the hydrochemical characteristics of each cluster with the aid of Piper diagrams, Box and Whisker plots and Principal Components Analysis (PCA) (Table 2 and Figs. 5 and 6). Note that the Piper diagram employs exaggerated scales for Mg and SO4; this is simply to aid visual interpretation by increasing separation of points on the diagram, but it means that care must be exercised when evaluating ion ratios based on the position of points in the triangular fields. Selections of additional parameters (e.g. nutrients), not included in the original HCA, were also compared between clusters because they offered further insight into chemical pathways and potential processes that influence groundwater and surface water bodies (Dahm et al., 1998; Freeze and Cherry, 1979).
The seven clusters identified by HCA were largely differentiated by their TDS concentrations (determined by summation of major ion concentrations), redox condition (inferred from measured concentrations of Fe, Mn, NO3, NH4 and SO4) and major ion ratios (Fig. 7). Concentrations of major ions, TDS and electrical conductivity progressively increased in the following cluster sequence: A1-A2-B1-B2-B3-B5-B4 (Table 2 and Fig. 5). This indicates waters assigned to clusters A1-B1 are generally dilute and can be interpreted as young, whereas those assigned to the latter clusters most likely reflect older waters (see Daughney et al., 2010 and further discussion below). Concentrations of SO4 and NO3 showed an inverse sequence to that of the other ions, with SO4 concentrations statistically highest in clusters B1 and B5 and NO3 concentrations highest in cluster A2 (Fig. 5). Groundwaters assigned to clusters B3 and B4 displayed low concentrations of SO4, indicating highly reduced waters. The deepest groundwater sites tended to be assigned to clusters B3 and B4, with shallow wells assigned to clusters A1 and A2. The descriptive statistics and Piper diagrams indicate the ratio of major ions varied between clusters, with clusters A1, B1 and B2 largely dominated by Ca and HCO3 (Fig. 6). Waters assigned to the remaining clusters tended to display higher Na to Ca and higher Cl to HCO3, reflecting the importance of sedimentary lithology, evaporative concentration, ion exchange processes and/or older evolved groundwaters higher in Na and Cl.
An often perceived weakness in the determination of hydrochemical clusters is the subjective nature by which the number of clusters is selected from the dendrogram (i.e. the selection of a separation threshold). In order to investigate the sensitivity of clusters to the separation threshold and validate our selection of seven clusters, two alternative separation thresholds yielding 2 and 13 clusters were explored (Fig. 4). Box and Whisker plots and Kruskal-Wallis tests were conducted for the alternative cluster groupings to determine any statistical difference in water quality parameters between the clusters at each separation threshold. For ease of interpretation only Ca, SO4 and electrical conductivity are compared in this paper, because they were generally representative of hydrochemical differences between the clusters. At the two-cluster threshold, clusters A1 and A2 are grouped together within cluster A whereas clusters B1-B5 are grouped within cluster B. Clusters A and B are differentiated by conductivity, Ca and SO4, with concentrations all highest in cluster B (Fig. 8). However, this two-cluster threshold fails to capture the hydrochemical variability amongst monitoring sites and therefore it is useful to further sub-divide into 7 or 13 clusters. At the lowest separation threshold presented (13 clusters), cluster A is subdivided into A1 and A2a-b, and cluster B is subdivided into B1a-c, B2a-b, B3a-b and B4a-c. However, the significance of hydrochemical differences between clusters begins to diminish at the lowest separation threshold. Figure 8 shows little statistical difference for a range of parameters across the various sub-clusters (e.g. Ca cannot be statistically differentiated amongst clusters B4b-c, B1a-c, B2a-b, B3a-b). This highlights the over-sensitivity of a threshold that yields 13 clusters, and the lack of additional insight provided. It is therefore deemed appropriate to remain with seven clusters, due to the statistically significant differences in most parameters between these clusters and the feasible presentation of results.
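The Kruskal-Wallis comparison used above can be sketched as follows. The function computes the H statistic (with the standard correction for ties) for one parameter across several clusters; the function name and the example values are hypothetical and do not reproduce the study data. For three groups, H can be compared against the chi-squared critical value with two degrees of freedom (5.991 at the 5 % level).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    groups: list of lists of observations, one list per cluster.
    """
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                      # number of tied observations
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        return 0.0  # all observations identical: no evidence of a difference
    return h / correction

# Hypothetical Ca concentrations (mg/l) for three clusters.
example = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h_stat = kruskal_wallis_h(example)
differs_at_5_percent = h_stat > 5.991  # chi-squared critical value, 2 d.f.
```

A large H indicates that at least one cluster differs in its distribution of the parameter; it does not by itself identify which clusters differ, which is why the Box and Whisker plots are inspected alongside the test.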
PCA also confirmed that surface water and groundwater monitoring sites were largely differentiated according to their concentration of TDS and redox condition. PCA produced two components with eigenvalues greater than one that together explained 73.3 % of the variance in the data set. The majority of variance within the data is contained in Component 1 (57.5 %), which is associated with the positive weighting of the analytes electrical conductivity, HCO3, Ca, Cl, Mg, Na and K, and indicates the importance of natural mineral dissolution and chemical evolution of water bodies (Table 3). Component 2 explains a further 15.8 % of the variance in the dataset and displays strong positive weightings of NH4, Fe and Mn and strong negative weightings of NO3 and SO4, highlighting the importance of different redox conditions. The third component is not considered to be significant because it has an eigenvalue of 0.8 and explains only 6.9 % of the variation in the dataset. Figure 9 shows the principal component scores for all sites assigned to each of the hydrochemical clusters. Waters assigned to clusters A1 and A2 showed negative scores for both Components 1 and 2, suggesting these waters are fresh, low in solutes and oxic with the presence of NO3 and SO4. In contrast, sites assigned to clusters B1-B5 tend to show a positive score for Component 2, suggesting a shift to anoxic conditions and the presence of Fe, Mn and NH4 as conditions become more reducing. An inverse relationship exists between TDS and redox condition, with highly evolved waters with high TDS (e.g. cluster B4) tending to lack NO3 and SO4, which is to be expected for many groundwater systems (Freeze and Cherry, 1979).
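The variance figures quoted above follow directly from the eigenvalues. As a worked note, and assuming that the PCA was performed on the correlation matrix of p standardised variables (an assumption; the symbols p and λ_i are introduced here for illustration only), the eigenvalues sum to p, so that:

```latex
\[
  \text{proportion of variance explained by component } i
  = \frac{\lambda_i}{\sum_{j=1}^{p} \lambda_j}
  = \frac{\lambda_i}{p},
\]
\[
  \text{retain component } i \iff \lambda_i > 1,
  \qquad
  \underbrace{57.5\,\%}_{\text{Component 1}}
  + \underbrace{15.8\,\%}_{\text{Component 2}}
  = 73.3\,\% .
\]
```

Under this convention an eigenvalue of one corresponds to the variance of a single standardised variable, which is why a component with an eigenvalue of 0.8 carries less information than one original variable and is not retained.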
Figure 10 shows that the hydrochemical clusters defined by HCA and PCA are consistent with previous interpretations of recharge source and water age (Morgenstern, 2005). Groundwater recharge source can be inferred from δ18O, because rainfall on the floor of the Wairarapa Valley typically has δ18O values that are more negative than −6.5 ‰, whereas rivers that are sourced from the surrounding hills typically have δ18O more positive than −6.5 ‰. Groundwater recharge source can also be inferred from the concentration of excess air, i.e. dissolved air in excess of the amount that is soluble at equilibrium at a given recharge temperature (Heaton and Vogel, 1981). Excess air is thought to originate by processes such as bubble entrapment occurring during recharge through the vadose zone. Morgenstern (2005) used measured concentrations of dissolved nitrogen and argon to calculate excess air concentrations for groundwaters in the Wairarapa Valley. Morgenstern (2005) interpreted values of excess air greater than 2 ml at STP per kg to indicate groundwater recharge primarily from rainfall, values close to zero to indicate recharge primarily from river seepage, and values significantly less than zero to indicate degassing taking place in reduced groundwater. High values of excess air can also result from denitrification in anoxic groundwater (Hinkle et al., 2010). Morgenstern (2005) also showed that mean residence time of groundwater in the Wairarapa Valley varies between sites from ca. 1 yr to more than 150 yr, based on fitting of the exponential-piston flow model (Zuber et al., 2005) to measured concentrations of tritium, chlorofluorocarbons and sulfur hexafluoride. Although the tracer measurements have been made at relatively few sites considered in this study, the available data suggest that cluster A1 is dominated by river-recharged groundwater with mean residence time of 1 to 2 yr. Cluster A2 appears to represent groundwater with mean residence time ca. 40 yr, possibly sourced from a mixture of rainfall and river seepage. Groundwaters in cluster B1 have mean residence time ca. 30 to 60 yr and appear to be sourced primarily from rainfall recharge. There are few sites in clusters B2, B3 and B4, and no sites in cluster B5, at which the tracers have been measured, and so generalizations cannot be made regarding recharge source or water age. However, based on the available data, it appears that sites in clusters B2, B3 and B4 represent old groundwater, with mean residence time greater than 150 yr, the limit of the dating techniques used by Morgenstern (2005). Negative values or highly positive values of excess air suggest that cluster B contains some highly reduced groundwaters, i.e. waters that may be influenced by degassing and denitrification, respectively.
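The excess air interpretation summarised above can be written as a simple decision rule. The sketch below is illustrative only: the function name, the labels and in particular the tolerance used for "close to zero" (`near_zero`) are assumptions, since the text gives only the 2 ml at STP per kg threshold explicitly. Values between the tolerance and that threshold are left unclassified, and the caveat that high values can also arise from denitrification is not captured by the rule.

```python
def recharge_from_excess_air(excess_air, near_zero=0.5):
    """Label a likely recharge source from excess air (ml at STP per kg).

    near_zero is a hypothetical tolerance for "close to zero".
    """
    if excess_air > 2.0:
        # High values: primarily rainfall recharge (or denitrification).
        return "rainfall"
    if excess_air < -near_zero:
        # Clearly negative values: degassing in reduced groundwater.
        return "degassed"
    if abs(excess_air) <= near_zero:
        return "river seepage"
    # Between the tolerance and the 2 ml threshold: possibly mixed sources.
    return "indeterminate"

# Hypothetical measurements, not taken from Morgenstern (2005).
labels = [recharge_from_excess_air(v) for v in (3.1, 0.1, -1.5, 1.2)]
```

In practice such a rule would be read together with δ18O and residence time, as in Fig. 10, rather than used on its own.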

Identifying areas of groundwater-surface water interaction
From the areal distribution of sites assigned to each cluster, the groundwater and surface water bodies within the Wairarapa valley can be separated into two main areas based on hydrochemistry: (1) the upper and middle valley and (2) the lower valley (Fig. 11). Surface water and groundwater monitoring sites assigned to clusters A1, A2, B1 and B2 are largely located within the upper and middle valley, whereas groundwater sites assigned to clusters B3-B5 are located almost entirely in the lower Wairarapa valley. These two areas largely reflect the level of aquifer confinement, which may in turn influence the degree of groundwater-surface water interaction. In the following discussion, we aim to test the hypothesis that similarities in hydrochemistry between groundwater and surface water may be the result of interaction between them for at least some types of hydrochemical signatures observed in the study area.
Reaches of the Waiohine, upper Ruamahanga, Waingawa, Waipoua and Tauherenikau Rivers are inferred to provide recharge to underlying groundwater systems in the upper and middle Wairarapa valley. These river systems largely drain the resistant western Tararua and Rimutaka (greywacke) ranges. Monitoring sites on these rivers within or near their emergence from these ranges are assigned to cluster A1. These sites are typified by Ca-HCO3 water, low in major solutes and conductivity, as is typical of global fresh water river systems dominated by carbonate dissolution (Berner and Berner, 1996). There are 26 groundwater monitoring sites that are also assigned to cluster A1. These groundwaters also have a Ca-HCO3 signature and are typically oxic, as indicated by low concentrations of NH4, Mn and Fe. Groundwaters assigned to cluster A1 also have δ18O and excess air concentrations that are indicative of river seepage (Fig. 10). Moreover, the groundwater sites assigned to cluster A1 are generally shallow (<10 m) and located in highly permeable Q1 and Q2 alluvial gravels that are expected to foster high connectivity between surface water and groundwater. These sites are down-gradient of river reaches that are shown by flow gauging data to lose water to groundwater (Fig. 11). Overall then, the similarity in hydrochemistry between groundwater and surface water sites assigned to cluster A1 is consistent with other data from river gauging, δ18O and excess air, all suggesting that these groundwaters are provided with a consistent source of oxygen-rich recharge from A1-type rivers. This is a similar result to that found by Burden (1982), in which some shallow groundwaters of the Canterbury Plains of New Zealand closely reflected the chemical composition (high proportions of Ca-HCO3, low Na, Cl and NO3) of the adjacent Rakaia and Ashburton Rivers.
Groundwaters are inferred to provide base flow to some river systems in the upper and middle Wairarapa valley, based on hydrochemical similarities of sites assigned to cluster A2. There are 65 groundwater monitoring sites assigned to cluster A2, most of which are 10-30 m deep and located within Q2 alluvial fan deposits. When compared to monitoring sites assigned to cluster A1, sites assigned to cluster A2 have slightly higher median values of all major ions, higher electrical conductivity, higher NO3, higher Na relative to Ca, and higher Cl relative to HCO3. This hydrochemistry could suggest that groundwaters assigned to cluster A2 receive recharge from rainfall, because Na, Cl and NO3 accumulate as rainwater moves through the soil column (Rozemeijer and Broers, 2007; Whittemore et al., 1989). This inferred source of recharge is supported by measured concentrations of excess air, which also imply rainfall recharge, although the available δ18O data suggest that A2-type groundwaters receive at least some recharge from rivers (Fig. 10). Groundwater-surface water interaction can be inferred from monitoring sites in gaining reaches of rivers such as the Mangatarere, lower Whangaehu and Ruamahanga, which have the same Na-Cl-NO3 signature and are also assigned to cluster A2. Hiscock (2005) reported a similar transfer of chemical signature from rainfall-recharged groundwaters into stream base flow.
The rivers that drain the eastern Wairarapa foothills (e.g. Whangaehu, Huangarua and Taueru Rivers) may also provide recharge to underlying groundwater systems. Monitoring sites on these rivers near their emergence from the eastern foothills are assigned to cluster B1. B1-type waters display considerably higher TDS and major ion concentrations in comparison to those rivers that drain the western ranges (e.g. cluster A1). This increased concentration of solutes in B1-type rivers likely derives from the easily eroded Pliocene marine sedimentary lithology of the eastern hills, relative to the chemically more resistant Mesozoic Torlesse greywacke that forms the western ranges. There are 50 groundwater monitoring sites also assigned to cluster B1. Many of the B1-type groundwater sites are in Q1 and Q2 alluvial gravels along the eastern periphery of the Wairarapa valley, near the river systems that drain the eastern foothills; other B1-type groundwater sites occur as small groupings located in the Parkvale and Te Ore Ore Basins (Fig. 11). These groundwater sites show a greater average well depth relative to A1 and A2 groundwaters, and elevated concentrations of Mn, Fe and NH4, which suggest anoxic conditions. Values of δ18O and excess air indicate that B1-type groundwaters may be sourced from river recharge, rainfall recharge or a mixture of the two (Fig. 10). Hence the similarity of hydrochemistry between groundwaters and rivers assigned to cluster B1 does not necessarily indicate groundwater-surface water interaction. In some locations, groundwaters may be recharged from neighbouring B1 surface waters, and/or in other locations these aquifers may be hosted by chemically reactive aquifer materials, including alluvium derived from the Miocene-Pliocene marine deposits in the eastern hills.
Interaction between groundwater and surface water is not clearly indicated among the sites assigned to cluster B2. Groundwater sites assigned to cluster B2 may receive some recharge from the easterly draining river systems, because they share a similar hydrochemistry with those surface waters assigned to cluster B1. However, these groundwater sites typically have slightly lower concentrations of major ions and higher concentrations of NH4, Fe and Mn. These higher concentrations, coupled with the observed low SO4 concentrations, suggest that B2-type waters are anoxic, with sulphate reduction occurring at some sites. Groundwaters assigned to cluster B2 are of shallow to moderate depth and show considerable spatial agglomeration in the Parkvale basin (Fig. 11). The sequence of Q3, Q5, Q6 and Q8 sediments in the basin may present various confining layers of silty gravels and clay that reduce oxygen supply to these deeper groundwaters.
The lower Wairarapa valley is largely dominated by clusters B3-B5, which consist entirely of groundwater monitoring sites. This suggests significant hydrochemical differences between groundwater and surface water in the lower Wairarapa valley and indicates that there is little interaction between the two. Groundwater wells in the lower valley are typically relatively deep and are located in the substantial marine, estuarine and lacustrine depositional layers that occupy the lower 25 km of the valley under permeable Q1 alluvial gravels. These groundwaters generally display a Na-Cl water type relatively high in TDS and electrical conductivity. This suggests these groundwaters are older (Figs. 5 and 6) (Daughney et al., 2010), consistent with the results of Morgenstern (2005) (Fig. 10). This is further supported by cluster B3-B5 monitoring sites showing high scores for Component 1 (Fig. 9). Concentrations of all analytes are higher in B4-type waters, and the increased concentrations of Mn, NH4 and Fe coupled with reduced NO3 and SO4 indicate a heavily reduced anoxic environment with little connection to the atmosphere or overlying surface water systems (Taylor et al., 1989). This is likely due to the deep confining layers of marine and estuarine sediment. Values of δ18O and excess air suggest that recharge to these groundwater sites is provided by rainfall recharge that has percolated through overlying groundwater units. Five groundwater sites located at the outlet and east of Lake Wairarapa are assigned to cluster B5 and display a Na-Cl water type with high TDS and conductivity. However, these wells are shallow to moderate in depth, and low concentrations of K, HCO3, Mn and Fe, coupled with high NO3 and SO4, suggest these sites are oxygen rich, a phenomenon unlikely to occur in older chemically evolved groundwaters. Saltwater intrusion is the likely cause of high Na, Cl, SO4 and subsequent conductivity at sites located by Lake Onoke. Other relatively shallow B5 groundwaters further inland may be influenced by some degree of groundwater-surface water interaction. For example, leaching from fertilizer inputs may have caused elevated SO4 exhibited in these groundwaters. The hydrochemistry of these groundwaters might also be affected by interaction with localised volcanic deposits (McLaren and Cameron, 1996; Stanton, 1972). There are no δ18O or excess air measurements that can constrain recharge source for these B5-type groundwaters.

Conclusions and implications
Identifying areas of groundwater-surface water interaction is crucial for effective environmental management because this interaction is known to influence the quantity and quality of both groundwater and surface water. This study utilised well-established multivariate statistical methods in a novel manner, to uniquely group surface water and groundwater monitoring sites from the Wairarapa valley of New Zealand into seven hydrochemical clusters, and identify potential areas and mechanisms of interaction based on hydrochemical similarities.
A two-stage HCA linkage approach was employed, first to identify sites with unusual hydrochemistry (Nearest Neighbour Linkage rule) and then to group the remaining 268 surface water and groundwater sites into seven clusters, based on similarities in their hydrochemistry (Ward's Linkage method). Surface water and groundwater monitoring sites were grouped together in three (A1, A2 and B1) of the seven clusters, with the inference made that similarities in hydrochemistry may indicate potential interaction. PCA and Box and Whisker plots were used to determine the hydrochemical differences between the seven clusters. Clusters were largely differentiated by their concentrations of TDS, redox condition and ratio of major ions. PCA revealed two components, which explained a combined 73.3 % of the variability in the dataset. Component 1, characterized by high loadings for the major ions, was related to natural mineral dissolution and the degree of water evolution or concentration of TDS. Component 2 had high loadings for NH4, Fe and Mn and negative loadings for NO3 and SO4, indicating the importance of redox reactions and the presence of both aerobic and anaerobic waters in the Wairarapa valley.
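The first stage of this approach can be illustrated with a simplified sketch. Under nearest neighbour (single) linkage, an isolated site remains a singleton on the dendrogram until the separation threshold reaches the distance to its closest neighbour, so sites with an unusually large nearest-neighbour distance stand out as candidates for removal. The function below flags such sites directly; its name, the use of plain Euclidean distance and the threshold value are assumptions for illustration and do not reproduce the procedure or data of the study.

```python
from math import dist

def nearest_neighbour_outliers(points, threshold):
    """Return indices of points whose nearest neighbour is farther than threshold."""
    flagged = []
    for i, p in enumerate(points):
        # Distance from site i to its closest other site.
        nearest = min(dist(p, q) for j, q in enumerate(points) if j != i)
        if nearest > threshold:
            flagged.append(i)
    return flagged

# Hypothetical standardised values for four sites and two parameters.
sites = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
outliers = nearest_neighbour_outliers(sites, threshold=3.0)
```

A limitation of this shortcut is that two unusual sites lying close to each other would not be flagged, whereas they would still appear as a small, late-joining branch on a full nearest neighbour dendrogram.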
Clusters were spatially distributed into geochemical zones that reflect aquifer confinement, regional lithology and the degree of groundwater-surface water interaction. Shallow aerobic aquifers, located in close proximity to losing reaches of rivers such as the Waiohine and Waipoua, were grouped with similar Ca-HCO3 type surface waters in cluster A1, indicating potential recharge to groundwaters from these river systems. Further, these sites were located in permeable Q1 and Q2 alluvial gravels that foster high connectivity between groundwater and surface water. Likewise, surface waters such as the Whangaehu and Taueru Rivers that drain the Pleistocene sedimentary geology of the eastern hills were grouped with similar Ca-HCO3 groundwaters in cluster B1. These surface water sites had slightly elevated concentrations of TDS in comparison to other surface waters in the area, a phenomenon that likely reflects the marine sedimentary lithology of the eastern hills. B1-type groundwaters were located in close proximity to losing reaches of these river systems on the eastern side of the valley and showed an increase in concentrations of Mn, Fe and NH4, suggesting the onset of slightly anoxic conditions. It is possible that recharge is provided to these groundwaters from overlying B1-type surface waters. Groundwater monitoring sites that displayed a rainfall-recharged chemical signature with higher Na relative to Ca, higher Cl relative to HCO3 and an accumulation of NO3 were grouped with gaining river reaches such as the Mangatarere Stream in cluster A2. This grouping suggests the provision of groundwater base flow to these river systems and the transfer of this chemical signature from underlying A2-type groundwater systems. The drivers of hydrochemistry inferred from major and minor element data are consistent with previous measurements of recharge source and water age (Morgenstern, 2005).
The hydrochemical techniques used in this study did not reveal surface water and groundwater interaction in the lower Wairarapa valley. Here, deep anoxic aquifers, high in TDS with a distinct Na-Cl signature, were grouped together in clusters B3-B5, but showed no apparent link to surface water sites. These groundwaters were largely restricted to the lower Wairarapa valley, an area dominated by various mud and estuarine confining layers. Five groundwater sites were assigned to cluster B5 and showed a similar hydrochemical signature to those sites assigned to clusters B3 and B4. However, B5-type groundwaters were relatively shallow, and high concentrations of SO4 suggest seawater intrusion, anthropogenic contamination and/or a localised volcanic lithology as sources of Na and Cl.
The main assumption upon which this study is based is that similarities in hydrochemistry can be used to infer groundwater-surface water interaction. Although this assumption is extensively made in the literature (e.g. Burden, 1982; Kumar et al., 2009; Taylor et al., 1989), it is possible that hydrochemical similarities are due to other phenomena such as similar flow paths, similar regional geology or similar episodes and styles of contamination. It is therefore useful to support findings from this method with other investigative approaches such as the quantification of flow loss/gain on river systems, or measurement of δ18O, excess air, water age and temperature. In terms of this paper, findings support existing classification of river systems determined using concurrent surface water flow gaugings (e.g. Jones and Gyopari, 2006).
This investigation and the grouping of groundwater and surface water bodies was entirely based on a historic dataset from the Wairarapa valley, a dataset that encapsulated significant timescales and a variety of water quality parameters. It is important to note the limitations of our dataset, in particular the variability of parameters collected at each monitoring site, data quality issues surrounding the collection and maintenance of the original dataset and the use of median values to simplify temporal variability. Despite these limitations, HCA was still able to extract meaningful relationships between the site-specific median values of the eight parameters included in the clustering algorithm. When coupled with Box and Whisker plots and PCA, which analysed a range of additional parameters, HCA results were successfully able to identify distinct hydrochemical clusters that were used to infer potential areas and mechanisms of groundwater-surface water interaction. This highlights the potential of these methodologies as a preliminary tool to explore groundwater-surface water interaction at a regional scale using existing datasets that may be of poor quality or resolution. However, in order to establish the extent to which these methodologies are applicable in various hydro-climatic regimes and watersheds, and with data of various quality levels, further clustering attempts need to be undertaken. Further, HCA outputs and interpretations of hydrochemical clusters and potential areas of groundwater-surface water interaction should be supplemented with findings from other investigative techniques such as measurement of δ18O, excess air, water age, stream temperature and stage quantification. As noted by Kalbus et al. (2006), in order to reduce the limitations and uncertainties that surround the investigation of groundwater-surface water interaction, a multi-scale approach that utilises various methods should be employed.
This paper does not attempt to investigate temporal variability in water quality, hence the use of site-specific medians, and therefore encourages those wishing to employ a similar methodology to undertake multiple clustering attempts with data from different timescales (e.g. base flow and storm flow). Similar approaches in local-scale studies have been executed by Kumar et al. (2009) and Thyne et al. (2004), which demonstrate that monitoring sites can show temporal variability in water quality and shift between cluster groupings. It is essential for future research to determine the impact that this temporal shift in cluster assignment is likely to have on the interpretation of surface water and groundwater interaction.
This study has shown that multivariate statistical methods can provide insight into areas and mechanisms of groundwater-surface water interaction at a regional scale using existing hydrochemical datasets. Although limitations surround this methodology and highlight the difficulties of using a widespread dataset that encapsulates significant timescales, the use of hydrochemical medians and HCA to identify areas of potential interaction in the Wairarapa valley was successful. Furthermore, multivariate statistical tools constitute a potential method that could be used to identify areal zones of groundwater and surface water chemical evolution and anthropogenic contamination at a regional scale, as has been demonstrated by Güler and Thyne (2004) and Thyne et al. (2004). From a water management perspective, our study presents a rapid and cost-effective initial means by which the interaction between groundwater and surface water could be identified and investigated using water quality data alone.

Fig. 1. Location of Wairarapa Valley and major river systems, North Island, New Zealand.

Fig. 2. Catchment characteristics of the Wairarapa Valley, New Zealand; (a) simplified regional geology, (b) Quaternary (Q) alluvial surface sediment units, (c) groundwater flow direction and potentiometric surface map. Only major fault lines in the Wairarapa Valley are named in Fig. 2a.

Figure 3 presents a summary of the methodologies employed to prepare the hydrochemical database for the statistical techniques employed in this paper. The dataset provided by GWRC consisted of over 6000 individual samples collected from 607 groundwater monitoring sites and 28 surface water

Fig. 3. Schematic representation of the methodologies used for compiling and analyzing the hydrochemical database. "SW" denotes surface water monitoring sites, "GW" denotes groundwater monitoring sites and "n" denotes sample size.

Fig. 4. Dendrogram for the 246 groundwater (GW) and 22 surface water (SW) monitoring sites from the Wairarapa Valley classified into seven clusters. Based on the format of Cloutier et al. (2008).

Fig. 6. Piper diagram showing the variation of mean values for major ions (see Table 2) amongst the seven defined clusters and eight outlier sites. Mg and SO4 scales are exaggerated to increase separation of points on the diagram; circles in center diamond are proportional to conductivity.

Fig. 7. Simplified schematic representation of differences amongst the seven clusters (A1-B5) in relation to their TDS, aquifer depth and redox condition. Scales of axes are simplified representations of increase only.

Fig. 8. Box-Whisker plots showing the variations in electrical conductivity, Ca and SO4 across the two, seven or thirteen clusters defined at different separation thresholds shown in Fig. 4.

Fig. 9. Plot of principal component scores for Components 1 and 2 for all sites assigned to each of the seven clusters identified by HCA.

Fig. 10. Relationships between the hydrochemical clusters defined in this study and previous measurements of δ18O, excess air (based on measured concentrations of dissolved nitrogen and argon) and mean residence time (in years, shown by the data point labels, and based on measured concentrations of tritium, chlorofluorocarbons and sulfur hexafluoride).

Fig. 11. Spatial distribution of groundwater and surface water monitoring sites assigned to seven hydrochemical clusters in the Wairarapa Valley, New Zealand. Distribution is shown with regard to pre-determined river interacting properties: losing, gaining and neutral (Jones and Gyopari, 2006).

Table 1. Median values of chemical parameters for the eight outlier sites (µS cm−1 for conductivity, mg l−1 for other parameters). Mean of site-specific median values from the remaining 268 monitoring sites are presented for comparison.

Table 2. Assignment of surface water (SW) and groundwater (GW) monitoring sites to hydrochemical clusters and mean values of chemical parameters for each cluster.

Table 3. Principal component weightings and explained variance for the two significant components. Component scores presented in bold show the highest weighting for each component.