Efficacy of Fuzzy c-Means Cluster Analysis of Naturally Occurring Radioisotope Datasets for Improved Groundwater Resource Management under the Continued Risk of Climate Change

Research Article. British Journal of Environment & Climate Change, 3(3): 464-479, 2013.

Global change is recognized as an additional potential stressor on already over-tapped groundwater systems. Mitigating its impacts requires planning for sustainable use of groundwater systems. Identifying and developing such mitigation plans requires detailed knowledge of aquifer dynamics and temporal behavior, so that a knowledgeable group of stakeholders can make decisions with a higher level of certainty. The principal hypothesis of this study was that a robust set of uranium (U) and thorium (Th) decay series data from multiple wellfields representing different confining and geochemical conditions would cluster in a meaningful manner using a fuzzy c-means technique, providing a better understanding of aquifer dynamics for management purposes. Three conceptual models were represented by the wellfields: 1) a well-confined artesian aquifer; 2) an area receiving recharge via a confining layer window; and 3) a regional recharge zone where the aquifer sub-crops near the land surface. These conceptual models were defined as C1, C2, and C3, respectively. Eleven samples from the three wellfields were analyzed for ten parameters consisting of U and Th decay series isotopes. The data clustered successfully into three cluster types, providing discrimination of behavior within each wellfield. Clusters C2 and C3 were characterized by higher values of radon (Rn) and radium (Ra) isotopes, whereas C1 was characterized by higher values of 228Th, which was mostly absent from C2 and C3. The data clustered as expected between the well-confined, window, and regional recharge conceptual models, with insights into individual well behavior. The data offer a robust conceptualization of aquifer dynamics in the regional area that may benefit decision makers.


INTRODUCTION
Global climate change is recognized as one of the stressors on already over-tapped groundwater systems [1]. Mitigation of impacts due to global climate change requires planning for sustainable use of groundwater systems that is dependent upon a detailed understanding of the storage and flow dynamics of the system. Accurate conceptual models and observational data are needed to understand the complex relationship between groundwater reservoirs and sources and sinks that may alter storage dynamics locally or over larger regional areas. The role of recharge and the importance of better understanding of diffuse versus focused recharge mechanisms as they relate to potential aquifer vulnerability have been recognized in the literature as an important need [2][3][4][5]. More recently the focus of the discussion in the literature has been on how groundwater storage may provide a more stable resource under extreme hydrologic variation due to the stress of global climate change [1].
The purpose of this study was to test the hypothesis that fuzzy c-means clustering techniques could identify groundwater system types (confined versus recharge regime) using naturally occurring uranium and thorium decay-series isotopes. The setting for the method testing included three wellfields (the Morton, Shaw and Sheahan wellfields) in the Memphis, Tennessee, USA area, as shown in Fig. 1.

Hydrogeology of Case Study Area
The case study application is focused on the Memphis aquifer, which is formally defined as a part of the Claiborne group within the regional context of the Mississippi Embayment and has been the focus of numerous studies, with the U.S. Geological Survey being a primary leader in the effort [6,7]. Detailed descriptions of the hydrogeology at the regional Mississippi Embayment scale and at the local county scale exist in the literature and are not repeated here for brevity [8][9][10].
It is important to note that the Memphis aquifer, defined regionally as a part of the Claiborne group sequence, is the major drinking water resource for western Tennessee, and that for many years water resource managers have remained concerned about the vulnerability of the system to shallow leakage [6,8]. Of particular interest is the transition from the Quaternary to Late Tertiary age fluvial deposits, which constitute the local regional shallow aquifer system, to the Tertiary age Cook Mountain formation, which forms the upper confining unit of the Memphis aquifer. Research has shown that there are areas near active wellfields where the clay is known to be thin or absent [11]. These areas, termed aquitard or confining layer windows, allow for the direct exchange of recharge fluxes from the shallow aquifer and potentially other near-surface features [10,12,13]. More recent research has focused on the spatial identification and assessment of aquifer vulnerability within the localized source water area, using modeling and geochemical techniques. A map of the regional study area, the local municipal wellfields and the suspected or confirmed confining layer window locations is shown in Fig. 1. The figure also shows the wellfields that were sampled as the focal point of this study.
Previous studies in the literature have identified various mixing regimes and hydraulic fluxes associated with diffuse and focused recharge areas (i.e., through faults or confining layer windows) to the Memphis aquifer and have sought more efficacious ways of identifying the presence of recharge from a confining layer window through various chemical, isotopic and modeling approaches [8,10,11,12,14,15,16,17,18,19]. For the purpose of the discussion in this study, three primary recharge types have been considered for identification, as shown in Fig. 2. These types would likely require differing source water management approaches because of their differing potential vulnerability.

Fig. 2. Example conceptual models for groundwater recharge systems in the regional Memphis aquifer: (a) diffuse uniform recharge through a confining layer; (b) focused recharge through a confining layer window; (c) diffuse recharge through a regional sub-crop zone. Note: As shown, R denotes precipitation, and q_r denotes a volumetric flux over an areal opening to an aquifer, or through a low conductivity matrix.

Uranium and Thorium Decay Series
Uranium and thorium decay series analyses have been used as a valuable tool for the study of groundwater systems. A graphical representation of the relevant decay series is shown in Fig. 3. A good presentation of the early research and a review of the behavior of uranium and thorium isotopes in groundwater can be found in Nimz [20]. Later research by Luo et al. [21] demonstrated that 234U/238U, 234Th/230Th, and 224Ra/228Ra activity ratios exhibit a strong correlation with aquifer recharge and flow paths. In systems with active groundwater exchange, the movement of radionuclides can be retarded by multiple chemical and physical processes [22].
Luo et al. [21] noted in the study at the Idaho National Engineering and Environmental Laboratory (INEEL) that 238U was not entirely free of interactions with the aquifer solids. Implicit in the literature is that redox conditions can strongly influence the phase or pool in which the isotopes preferentially reside [23,24]. For short-lived radionuclides, sorption and desorption also play an important role [25]. Gentry et al. [10] observed that the Th isotope activities in a semi-confined area of the Memphis aquifer are all similar to those observed in the unconfined (oxygenated) basaltic aquifer at INEEL, Idaho [21], as well as in other sandy unconfined aquifers [26]. Tricca et al. [26] hypothesized that the concentration of Th in groundwater was dominated by the chemical solubility of Th minerals. However, as noted in Gentry et al. [10], this hypothesis was not supported by observations of groundwater samples from the Memphis aquifer area, with the most likely source of the variance being the occurrence of colloids in groundwater [21]. Gentry et al. [10] demonstrated that redox and pH control of uranium behavior can be used to explain the source and mixing behavior in and near a confining layer window. A conceptual model was proposed in which near-surface waters with high uranium concentrations enter the Memphis aquifer through an aquitard window and, due to redox reactions and changes in pH occurring within the redox barrier, uranium is depleted. Downgradient flow paths gain 234U through alpha-recoil mobilization from 234Th, and through possible dissolution and precipitation of uranium along reducing flow paths from the confining layer window. The current study provides a more robust test of differences in uranium and thorium isotope hydrochemical facies across multiple hydrologic recharge regimes.

Fuzzy c-Means Clustering
For the purpose of describing the approach, the nomenclature and variable definitions of Güler and Thyne [27] were adopted for consistency. Building upon the work of Bezdek [28] and Güler and Thyne [27], the fuzzy c-means (FCM) technique uses multivariate data analysis to partition a dataset, X, into fuzzy clusters, which are identified by their cluster prototypes. The partitioning process is an optimization problem, with the goal of minimizing the objective function J_FCM (equation 1) [27], in which M is the membership matrix, C is the cluster prototypes (centers) matrix, c is the number of clusters, n is the number of data points, and u_ik is the degree of membership of sample k in cluster i. If we consider the Euclidean distance (p) as the straight-line distance between the datum x_k and the cluster prototype v_i, then when p is large, J_FCM is minimized by assigning a small membership value; when p is small, the membership value approaches unity [27,29]. The weighting exponent m controls the degree of fuzziness of the classification, such that m > 1, with commonly accepted values reported in the literature [27,30,31]. Elements of the membership matrix, M, are constrained to the range of 0 to 1 and are subject to the constraints given in the literature [27,30]. As identified in the literature, a two-step iteration process is used to minimize J_FCM: C is initialized randomly and M is estimated using the dataset X, m > 1, and C, after which the degree of membership and the cluster prototypes are recalculated in turn [27]. Several stopping criteria for the algorithm have been suggested, based upon the relative change in M or in the cluster prototypes between subsequent iterations [27,32].
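For reference, the standard fuzzy c-means relations can be written in the notation defined in this section as sketched below. This is a hedged reconstruction rather than the authors' exact typesetting: the subscripted distance p_ik, the summation index j, and the assignment of equation numbers (chosen to agree with the references to equations 1 through 5 in the methodology) are assumptions.

```latex
\begin{align}
J_{FCM}(M, C) &= \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m}\, p_{ik}^{2},
  \qquad p_{ik} = \lVert x_k - v_i \rVert \tag{1} \\
\sum_{i=1}^{c} u_{ik} &= 1, \qquad k = 1, \dots, n \tag{2} \\
0 < \sum_{k=1}^{n} u_{ik} &< n, \qquad i = 1, \dots, c \tag{3} \\
u_{ik} &= \left[ \sum_{j=1}^{c}
  \left( \frac{p_{ik}}{p_{jk}} \right)^{2/(m-1)} \right]^{-1} \tag{4} \\
v_i &= \frac{\sum_{k=1}^{n} u_{ik}^{m}\, x_k}{\sum_{k=1}^{n} u_{ik}^{m}} \tag{5}
\end{align}
```

In this form, equation 4 makes the distance argument explicit: as p_ik grows relative to the distances to the other prototypes, u_ik shrinks, and as p_ik approaches zero, u_ik approaches unity.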

METHODOLOGY
The implementation of this study was dependent upon samples collected from production wells in the Memphis aquifer and fuzzy c-means cluster analysis of the U- and Th-decay series data. The details associated with the sampling, analysis and study-specific fuzzy c-means algorithm development are provided in the following sections.

Radiochemical Sample Collection and Assay
As described in Gentry et al. [10], samples were collected from eleven wells with top of screen (TOS) depths ranging from 76 to 236 meters below ground surface (mBGS). In this manuscript we include the data from two wellfields with differing hydrogeologic conditions (i.e., near the aquifer sub-crop and well-confined with no suspected leakage) that have not been reported in the literature. However, the samples reported in the literature and those reported here were collected at the same time and under the same sampling methodology [10]. For brevity, the radiochemical analysis techniques are not repeated here, since all samples were analyzed at the same time as those reported in the literature and under the same conditions [10,21,33].

Fuzzy c-means Cluster Analysis
The theoretical and mathematical basis for fuzzy c-means cluster analysis was summarized in equations 1 through 5. For the purposes of implementing fuzzy c-means analysis for this study, a modified algorithm from Bezdek et al. [34] was used. Specifically, the only modifications made to the algorithm were to allow input and output via the computer screen and files, as opposed to older forms of input, and to provide run-time diagnostics for assessing any failure modes. The modified algorithm allows for differing norms (Euclidean, Diagonal, or Mahalanobis) in the calculation of J_FCM from equation 1. For the purposes of this study, a Euclidean norm (the identity matrix) was used. For the weighting exponent, m, in equations 4 and 5, a value of 1.7 was used. Bezdek et al. [34] suggested that values of 1.5 ≤ m ≤ 3.0 would give a good result for most data, which is consistent with Güler and Thyne [27] and Hathaway and Bezdek [31]. The algorithm input and output routines were modified and compiled to the FORTRAN 77 standard using Absoft Pro Fortran 7.5 (http://www.absoft.com). For the purposes of conducting the fuzzy c-means analysis, any non-detect data were input as a value of 0.001, which is several orders of magnitude less than the detectable levels of U and Th isotopes measured in other samples and their associated uncertainties.
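The settings described above (Euclidean norm, m = 1.7, non-detects entered as 0.001) can be illustrated with a minimal Python sketch of the two-step fuzzy c-means iteration. This is not the authors' FORTRAN 77 code: the function names, the stopping tolerance, the iteration cap, the seeding, and the choice to initialize prototypes from randomly selected samples are all assumptions made for illustration.

```python
import math
import random

NON_DETECT = 0.001  # substitute value for non-detects, as stated in the text


def substitute_non_detects(rows, value=NON_DETECT):
    """Replace None entries (hypothetical marker for a non-detect) with `value`."""
    return [[value if x is None else float(x) for x in row] for row in rows]


def fuzzy_c_means(data, c, m=1.7, tol=1e-6, max_iter=500, seed=0):
    """Sketch of fuzzy c-means with a Euclidean norm.

    Returns (memberships, centers), where memberships[i][k] is the degree of
    membership of sample k in cluster i. tol, max_iter and seed are assumed
    values, not taken from the study.
    """
    rng = random.Random(seed)
    n, dims = len(data), len(data[0])
    # Assumption: prototypes start at c randomly chosen samples.
    centers = [list(row) for row in rng.sample(data, c)]
    memberships = [[0.0] * n for _ in range(c)]
    exponent = 2.0 / (m - 1.0)

    for _ in range(max_iter):
        previous = [row[:] for row in memberships]

        # Step 1: update memberships from the current prototypes.
        for k, x in enumerate(data):
            dists = [math.dist(x, v) for v in centers]
            zero = [i for i, d in enumerate(dists) if d == 0.0]
            for i in range(c):
                if zero:
                    # Sample coincides with a prototype: share membership there.
                    memberships[i][k] = 1.0 / len(zero) if i in zero else 0.0
                else:
                    memberships[i][k] = 1.0 / sum(
                        (dists[i] / dj) ** exponent for dj in dists
                    )

        # Step 2: update prototypes as membership-weighted means.
        for i in range(c):
            weights = [u ** m for u in memberships[i]]
            total = sum(weights)
            centers[i] = [
                sum(w * x[j] for w, x in zip(weights, data)) / total
                for j in range(dims)
            ]

        # Stop when the largest change in any membership is below tol.
        change = max(
            abs(memberships[i][k] - previous[i][k])
            for i in range(c)
            for k in range(n)
        )
        if change < tol:
            break

    return memberships, centers
```

A call such as `fuzzy_c_means(substitute_non_detects(rows), c=3)` would return one membership row per cluster and one prototype per cluster; repeating the call for c = 2, 3 and 4 mirrors the range of cluster counts examined in the results.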

RESULTS AND DISCUSSION
Analysis of samples from the Sheahan, Shaw and Morton wellfields was performed, with the results from the Sheahan wellfield having been presented previously in the literature [10]. The results from the U and Th analyses are summarized in Table 1, which also includes the uncertainty associated with each value. All isotopes were detected in all samples except for 228Th, which was non-detectable in select samples. These non-detect values of 228Th were input into the fuzzy c-means algorithm as 0.001. The algorithm was allowed to choose between two and four primary clusters. After inspection, the most meaningful results based upon the recharge system types were obtained with three primary clusters. The three clusters were designated as: C1, a regional confined area (presented earlier in Fig. 2a); C2, an area receiving recharge from a localized confining layer window (presented earlier in Fig. 2b); and C3, recharge via a regional sub-crop (presented earlier in Fig. 2c). Clustering with membership between two primary clusters was successful in distinguishing wells with recharge characteristics from wells with confined characteristics, but clustering with more than three clusters was not meaningful based upon the hydrogeological interpretation of the area. The results of the cluster analysis and the partitioned membership of each well within C1, C2 and C3 are summarized in Table 2.
All wells located in the Morton wellfield partitioned primarily to C1 membership, the regional confined system. All wells in the Sheahan wellfield partitioned membership to either C1 (wells 78, 87 and 95) or C2 (wells 88 and 99). Previous research has shown that wells 87 and 99 in the Sheahan wellfield receive a component of modern water with unique U- and Th-decay series behavior [10,12]. The membership distribution in the wellfield further demonstrates the behavioral differences in the transport and retardation of U- and Th-decay series isotopes in regional well-confined systems versus those impacted by localized recharge. These differences were reported earlier by Gentry et al. [10] for the Sheahan wellfield, although that analysis did not include the additional data from the Morton wellfield. Similar to the Sheahan wellfield, the Shaw wellfield partitioned membership between two clusters, in this case C2 (well 704) and C3 (wells 721 and 722). Well 704 is the shallowest well in the Shaw wellfield dataset, and the membership distribution demonstrates that the wells receiving recharge via a confining layer window have a hydrochemical facies similar to that of shallow regional recharge zone wells. Thus, C3 is indicative of the regional recharge system. These results demonstrate the information-rich nature of the data, which reflects the behavior of the isotopes under the changing redox conditions of the differing recharge regimes and the likely influence of colloidal transport of select isotopes, and is consistent with the geochemical conceptual model presented by Gentry et al. [10]. Further, the fuzzy c-means membership partitioning tends to follow the likely mixing between the end-member conceptual models represented by C1, C2 and C3. This further demonstrates that these types of data are helpful in the identification of hydrochemical facies, which have relevance for managing the resource with respect to vulnerability.
To further explore these mixing relationships and their patterns, the data were normalized to the maximum value for each isotope. The distinct cluster centers determined from the fuzzy c-means analysis were also normalized to the sample maximums for the purpose of pattern comparison. The normalized cluster centers are shown in Fig. 4, where each axis on the radar plot is scaled between 0 and 1. The regional confined zone cluster (C1) is characterized by lower values of radium and radon isotopes and the presence of 228Th, whereas the localized confining layer window cluster (C2) has higher values of uranium and thorium isotopes than C1, but 228Th is absent. The regional recharge zone cluster (C3) had the highest values of radium and the lowest values of 238U, 234Th, and 230Th, with 228Th being similarly absent. For purposes of exploring these patterns, the data from each wellfield in Table 1 were normalized and plotted in a similar manner. The normalized data for Sheahan are shown in Fig. 5a, for wells not in close proximity to the known focal window recharge area, and in Fig. 5b, for wells 88 and 99, which are known to receive recharge from a near-window area [10,12]. The normalized data for the Shaw and Morton wellfields were plotted in a similar manner and are shown in Fig. 6 and Fig. 7, respectively. It is apparent from the data patterns that the hydrochemistry associated with the three conceptual models can distinguish, within each wellfield, wells that behave as deep, well-confined systems from those influenced by shallow recharge systems. The Sheahan non-window wells shown in Fig. 5a show a very similar pattern to the Morton wells shown in Fig. 7, particularly with respect to the presence of 228Th.
The primary difference noted between the Sheahan non-window wells and the Morton wells is the high value of 234Th noted in well 95 in the Sheahan wellfield, which is likely due to the colloidal behavior of thorium in the system as noted by Gentry et al. [10,21]. The data from the Shaw wellfield and the window recharge wells in the Sheahan wellfield show the same absence of 228Th and a similar pattern of higher radium and radon isotope concentrations. The variability seen in the Sheahan wellfield is the result of mixing between [...] Table 2 correlate with the mixing ratios from these two sources. The cluster analysis explicitly captures the differences between the regional recharge wellfield (Fig. 6), which shows the highest radium isotope signature, and the confining layer window recharge area (Figs. 5a and 5b), which shows the highest uranium and thorium isotope concentrations, except for the absence of 228Th. Further research should be done to corroborate these findings and to investigate the cause of the high radium isotope signature associated with both recharge sources. This approach, using data-rich hydrochemical information, may be useful as a future technique for better understanding the sources and behavior of individual wells in complex regional aquifer settings. This is particularly true given the state of global climate change and the current scenarios of groundwater management globally. These types of techniques would further improve understanding of the impacts of long-term pumping and of the long-term aquifer storage response.
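The way a membership table of this kind can be read, with each well assigned to its highest-membership cluster and wells with a substantial secondary membership flagged as possible mixtures, can be sketched as follows. The function name, the placeholder well labels, the membership values, and the 0.25 flagging threshold are all hypothetical and are not taken from Table 2.

```python
def primary_clusters(memberships, labels=("C1", "C2", "C3"), mixed_above=0.25):
    """Map each well to (primary cluster label, possible-mixing flag).

    memberships: dict of well id -> memberships in the order given by `labels`.
    mixed_above: assumed threshold on the second-largest membership.
    """
    result = {}
    for well, values in memberships.items():
        ranked = sorted(zip(values, labels), reverse=True)
        top_label = ranked[0][1]
        second_value = ranked[1][0]
        result[well] = (top_label, second_value > mixed_above)
    return result


# Hypothetical memberships for two placeholder wells (not study data).
example = {
    "well_A": (0.90, 0.06, 0.04),
    "well_B": (0.30, 0.60, 0.10),
}
summary = primary_clusters(example)
```

Under these assumed values, `well_A` would be read as a C1 well with no mixing flag, while `well_B` would be read as a C2 well with a notable C1 component.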
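The normalization used for the radar plots, in which each isotope is scaled by its maximum across the samples and the cluster centers are scaled by the same sample maximums, can be sketched as below. The function names and the numerical values are hypothetical; the zero-maximum guard is an added assumption.

```python
def column_maxima(rows):
    """Maximum of each column (isotope) across all rows (wells)."""
    return [max(column) for column in zip(*rows)]


def scale_by(rows, maxima):
    """Divide each value by the matching column maximum (0.0 if the maximum is 0)."""
    return [
        [value / mx if mx > 0 else 0.0 for value, mx in zip(row, maxima)]
        for row in rows
    ]


# Hypothetical activities for two wells and two isotopes (not study data).
samples = [[2.0, 0.001], [4.0, 0.5]]
centers = [[3.0, 0.25]]

maxima = column_maxima(samples)
normalized_samples = scale_by(samples, maxima)
normalized_centers = scale_by(centers, maxima)  # centers use the sample maxima
```

Because the cluster centers are weighted means of the samples, scaling them by the sample maximums keeps every radar-plot axis between 0 and 1.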

CONCLUSIONS
Eleven wells from three wellfields with differing recharge source water were analyzed for uranium and thorium decay series isotopes (ten parameters for each well). These data were analyzed using a fuzzy c-means algorithm to determine the efficacy of the technique for discriminating meaningful hydrochemical facies.
Overall, the study conclusions can be summarized as follows:
1. The findings indicate that the fuzzy c-means technique coupled with robust U- and Th-decay series data can identify the differences between: (C1) well-confined settings with no leakage; (C2) localized confining layer window recharge; and (C3) regional recharge zone settings.
2. The overall technique was efficacious, given that it was capable of determining behavior characteristics at the individual well level within the wellfield groupings, with meaningful interpretations with respect to the given conceptual models.
3. The fuzzy c-means technique may be used with a robust hydrogeochemical dataset to further elucidate aquifer storage behavior and response for management purposes where vulnerability is linked to the aquifer recharge mechanism, as demonstrated by its ability to identify possible mixing relationships representative of the conceptual model types.