Trace Element Analysis, Model-Based Clustering and Flushing to Prevent Drinking Water Contamination in Public Schools

Drinking water samples taken from cafeteria sinks and water fountains in each of the 76 schools in the Winston-Salem/Forsyth County Schools (WSFCS) district (North Carolina, United States) were analyzed by inductively coupled plasma tandem mass spectrometry (ICP-MS/MS) to determine As, Cd, Cr, Cu, Pb, Sb, Se and Tl. All samples from currently active schools tested below the maximum contaminant level (MCL) set for each element. Model-based clustering was employed to identify schools more prone to drinking water contamination. This multivariate approach may be used in a prevention program that can be tailored to specific school districts, with each school tested at a frequency compatible with its contamination risk level. Water flow stagnation during the summer break results in higher elemental concentrations in school drinking water, but a simple 5-60 min flushing procedure significantly reduces the contamination levels.


Introduction
Within the state of North Carolina, the State House Legislature introduced legislation in 2016[4] requiring drinking water testing and monitoring for Pb at elementary schools and day care facilities built before 1987. Current United States legislation only requires mandatory testing in public schools that are also regulated as a public water supplier. However, approximately 89-92% of all schools in the country fall outside of this category, and testing is only done on a voluntary basis.[2] Monitoring of drinking water for Cu and Pb falls under the mandates established by the Lead and Copper Rule (LCR) and its revisions.[1] The Environmental Protection Agency (EPA) has set forth plans to revise the LCR and address concerns and weaknesses within the legislation, such as the lack of enforceable health-based standards. Issues associated with adverse corrosion control methods, inappropriate sampling or sample processing, delay in action, lack of clarity in what determines compliance, as well as partial lead service line replacements (PLSLRs) and potential spikes in Pb concentrations in water as a result of PLSLRs are also being evaluated.[1,3] In this context, the EPA released a white paper in 2016 advocating for the revision of the LCR with the following key goals: further reduce exposure to Pb in drinking water, set clear enforceable requirements, and improve transparency, environmental justice and children's health.
For Cd, Cr, Cu, Se and Tl, the potential negative health effects of long-term exposure are also recognized by the EPA and the World Health Organization (WHO), with critical concentrations ranging from 2 µg L⁻¹ (Tl) to 1,300 µg L⁻¹ (Cu).[5,10] The most significant sources and/or causes of Pb in drinking water include Pb service lines, faucets and fixtures with leaded brass, pipes with Pb solder, and adverse water chemistry.[1] Common sources of Cu, Sb, As, Cd, Cr, Se and Tl in drinking water include corrosion of household plumbing systems and erosion of natural deposits (Cu); discharges from petroleum refineries, fire retardants, ceramics, electronics and solder (Sb); erosion of natural deposits, runoff from orchards, and runoff from glass and electronics production wastes (As); corrosion of galvanized pipes, erosion of natural deposits, discharge from metal refineries, and runoff from waste batteries and paints (Cd); discharge from steel and pulp mills, and erosion of natural deposits (Cr); discharge from petroleum and metal refineries, erosion of natural deposits and discharge from mines (Se); and leaching from ore-processing sites, and discharge from electronics, glass and drug factories (Tl).[5] The United States Congress amended the Safe Drinking Water Act in 1986 to prohibit the use of pipes with more than 8% Pb, and solder or flux with more than 0.2% Pb. Effective January 2014, the Reduction of Lead in Drinking Water Act reduced the maximum allowable amount of this element from 8% to no more than a weighted average of 0.25% on a wetted surface.[1]
Although the amount of Pb in infrastructure built after 1986 in the United States has been significantly reduced due to legislation, many schools were built prior to 1987 and may have Pb service lines, as well as building materials containing solder and flux with relatively high amounts of this element. Implementation of corrosion control practices, such as water treatment with orthophosphate (OrthoP), can prevent leaching of Pb into drinking water.[11] However, the amount of Pb (as well as Cu and other elements) found in an individual tap varies over time due to several parameters such as variation in the composition of plumbing materials, plumbing age, water quality, sampling procedure (e.g., water stagnation time, and flow rate during sampling), water use, and temperature.[12,13] Studies involving elemental analysis of school drinking water have been carried out in the United States,[2,14,15] Canada[16,17] and Saudi Arabia;[18] however, most of them focused on just Cu and Pb, or Pb alone. In one of these studies, concentrations of Al, Cd, Cu, Fe, Ni and Zn in school water coolers in Saudi Arabia were found to exceed the guideline values set by the European Economic Community.[18] Spikes in drinking water concentration of Pb as high as 1,520, 1,600, and 13,000 µg L⁻¹ were found in Washington D.C. Public Schools (DCPS), Seattle Public Schools (SPS), and schools in the Los Angeles Unified School District (LAUSD), respectively.[2,14,15] Results from these studies caused public alarm and negative media coverage, leading to water remediation strategies such as installing water filters, supplying consumers with bottled water, and removing Pb in plumbing to reduce potential accumulation of this metal in school children.[2,15]
In the present work, tap water samples were collected from all 76 schools in the Winston-Salem/Forsyth County Schools (WSFCS) district (North Carolina, United States), which serves approximately 54,385 students. Samples were taken from cafeteria sinks and water fountains, as they may be some of the most used sources of drinking water in a given school. Inductively coupled plasma tandem mass spectrometry (ICP-MS/MS) was used to determine the concentrations of eight elements (i.e., As, Cd, Cr, Cu, Pb, Sb, Se and Tl), and the results were compared with the maximum contaminant levels (MCLs) and action levels (ALs) set by the EPA.
Multivariate analysis encompasses techniques to explore and analyze multivariate data in order to extract the maximum amount of signal from the noise and make inferences about the sources of the data.[19] This strategy has been extensively applied in environmental water research,[20][21][22][23] and to study potable water systems in Greece and the United States.[24] A data-driven approach was used, for example, after residential water testing in Flint (Michigan, USA) to develop machine learning models capable of predicting whether a given parcel would have a Pb concentration above the EPA action level of 15 µg L⁻¹.[25] Although widely used in many diverse fields of research, multivariate analysis has yet to be evaluated as a tool to study trace elements in the drinking water of an entire school district. In the present work, principal component analysis (PCA) and cluster analysis (CA) are applied to identify subtle patterns within the data and provide information that may be essential for preventive action programs on drinking water contamination in schools. In this case, school building age and the concentrations of eight trace elements in drinking water from cafeteria sinks and water fountains are used in PCA and CA.
In an additional study, the effect of water stagnation during the summer break on Cu and Pb concentrations in school drinking water was also evaluated. Four schools, selected according to the age of the building, Pb concentration during the school year, and activity level over the summer break, were tested after 52 days of drinking water stagnation. The same taps tested during the school year were retested after the summer break, at five points throughout a 2 h flush period, to evaluate the effect of flushing on minimizing water contamination.

Instrumentation
An Agilent 8800 ICP-MS/MS (Tokyo, Japan) was used to determine As, Cd, Cr, Cu, Pb, Sb, Se and Tl in drinking water at the mass-to-charge ratios (m/z) of 75, 111, 52, 63, 208, 121, 78 and 205, respectively. On-mass ICP-MS/MS was used in all determinations, with H₂ gas flowing at 4 mL min⁻¹ in the collision/reaction cell.[26] Instrumental parameters and optimized conditions for elemental analysis are listed in Table 1.

Reagents, standard reference solutions and samples
All samples and analytical solutions were prepared using trace-metal-grade nitric acid (Fisher, Pittsburgh, PA, USA) and distilled-deionized water (18 MΩ cm, Purelab Option-Q, Elga, Woodridge, IL, USA). Single-element stock solutions of As, Cd, Cr, Cu, Pb, Se, Sb and Tl (1000 mg L⁻¹, SPEX CertPrep, Metuchen, NJ, USA) were used to prepare the calibration standards. The external standard calibration method was employed in all determinations. A certified reference material (Trace Elements in Water, NIST 1643e) from the National Institute of Standards and Technology (NIST, Gaithersburg, MD, USA) was used to check the method's accuracy. Water samples were directly analyzed, with dilutions performed when necessary to fit the calibration curve concentration range.

Sample collection
Tap water samples from cafeteria sinks and water fountains in all 76 schools of the WSFCS district were collected. Cafeteria sinks were chosen based on ease of access at the time of sampling and high usage. Water fountains were chosen based on location in high-traffic areas of the schools. Fountains located outside the main office were sampled in most cases. Sample collection took place during school activity hours (8:00 am to 4:30 pm) between October 2016 and May 2017. Samples for the summer stagnation study were collected in August 2017. A P8 coarse filter paper (Fisher Scientific, Pittsburgh, PA, USA) and a plastic funnel were used to collect the samples into prewashed 50 mL conical polypropylene tubes (Thermo Scientific, Waltham, MA, USA), which were preloaded with 5 mL of 10% v v⁻¹ HNO₃. The collection tube was then filled up to the 50 mL mark with the drinking water sample, for a final acid concentration of 1% v v⁻¹. Prior to sample collection, the plastic funnel and filter paper were conditioned with the respective water sample. All glassware and plasticware used in this study were kept overnight in a 10% v v⁻¹ HNO₃ bath, and thoroughly rinsed with distilled-deionized water before use.
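The final acid concentration quoted above follows from a simple dilution, assuming the 5 mL of preloaded acid and the water sample together reach exactly the 50 mL mark. The symbols below are introduced here for illustration only:

```latex
C_{\mathrm{final}} = C_{\mathrm{acid}} \, \frac{V_{\mathrm{acid}}}{V_{\mathrm{total}}}
                   = 10\% \times \frac{5\ \mathrm{mL}}{50\ \mathrm{mL}}
                   = 1\%\ (\mathrm{v\ v^{-1}}),
\qquad
\frac{V_{\mathrm{total}}}{V_{\mathrm{total}} - V_{\mathrm{acid}}} = \frac{50}{45} \approx 1.11
```

The second ratio is the corresponding dilution factor of the water sample itself under the same assumption; the text does not state whether or how this factor was applied to the reported concentrations.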

Data analysis
After comparing the results for all eight elements with their respective MCLs and ALs, the data were further investigated to identify schools that should be prioritized for drinking water monitoring because of potential contamination by toxic elements. The data were first analyzed to observe the proportions and patterns of data below the limits of detection (LODs), which are reported here, along with maximum and minimum values for each element. Element concentration averages and multivariate analyses were restricted to elements with more than 50% of their results above the respective LODs. For each element, values below the LOD were imputed with the expectation-maximization (EM) algorithm[27] applied to log-ratio transformed data,[28] using the zCompositions package in R.[29] Model-based clustering was performed using the mclust package in R.[30] Finite mixture models were fitted using the EM algorithm,[28] and the best model was selected according to the Bayesian information criterion (BIC).[31] PCA was used as an exploratory and visual tool to uncover mutual relationships and correlations in the data. In this case, multivariate relationships are described in terms of complementary scores and loadings plots.[32] Spatial representations of the data were produced using the ggmap package in R.[33] All of the R code used in this study is available upon request.
Different vocabulary is used in the data mining community to refer to samples and variables. In this work, samples are referred to as observations, and variables are referred to as features. Therefore, the observations are the samples of drinking water, and the features include the concentrations of trace elements in drinking water and the school building construction year.
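For reference, the model selection criterion can be written out explicitly. The form below is the convention used in the mclust documentation, in which larger values indicate a better model; that this exact convention applies here is an assumption, since the text only names the criterion. The symbols are introduced for illustration: the maximized likelihood of a mixture with covariance structure M and G groups, its number of free parameters k, and the number of observations n.

```latex
\mathrm{BIC}_{M,G} = 2 \ln \hat{L}_{M,G} - k_{M,G} \ln n
```

The first term rewards goodness of fit, while the second penalizes additional groups and more flexible covariance structures, so a model with more groups is preferred only if it improves the likelihood enough to offset its extra parameters.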

Trace element concentrations in schools' drinking water
The LODs, limits of quantification (LOQs) and percent recoveries from a certified reference material (CRM) are listed in Table 2 for As, Cd, Cr, Cu, Pb, Sb, Se and Tl. The LODs ranged from 0.004 to 0.3 µg L⁻¹, and are well below the respective MCL and AL values set by the EPA (Table 3). The percent recoveries ranged from 87 to 98%. To further evaluate the accuracy of the method used in this work and compare it with that routinely used to assess water quality in the county, concentrations of Cu and Pb in 56 samples were also determined by the Winston-Salem water treatment plant laboratory (Winston-Salem, NC, USA). No statistically significant difference was observed between the results from the two laboratories when applying a two-sample t-test at the 95% confidence level.
The water treatment plant in Winston-Salem uses OrthoP as a passivation chemical to prevent elemental leaching from pipes into the drinking water. For all occupied buildings evaluated in this study, and for both taps sampled, no element concentration exceeded the respective MCL or AL. Average values for all eight elements were at least 40-fold (Pb) lower than the MCL and AL values, and even the highest concentrations found were 3-fold (Pb) to more than 200-fold (Cr) lower than these limits (Table 3).
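The fold margins quoted above are ratios of a regulatory limit to a measured concentration. The following is a minimal sketch of that arithmetic; the function name is hypothetical, the Pb action level of 15 µg L⁻¹ is the EPA value cited in the Introduction, and the 5.0 µg L⁻¹ maximum is the upper bound for Pb given in the Conclusions.

```python
def fold_below_limit(concentration, limit):
    """Return how many times lower a concentration is than a regulatory limit.

    Both arguments must share the same unit (here, micrograms per litre).
    """
    if concentration <= 0:
        raise ValueError("concentration must be positive")
    return limit / concentration


PB_ACTION_LEVEL = 15.0  # EPA action level for Pb, ug/L (cited in the Introduction)
pb_maximum = 5.0        # upper bound on Pb in active schools, ug/L (from the Conclusions)

margin = fold_below_limit(pb_maximum, PB_ACTION_LEVEL)
print(f"Highest Pb result is {margin:.0f}-fold below the action level")
```

The same ratio applied to the other elements in Table 3 would give the remaining margins, provided each concentration is compared against its own MCL or AL.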

Investigation of school buildings more likely to present elemental concentration spikes
As discussed earlier, drinking water contamination with toxic elements such as Pb is primarily a result of leaching from service lines, faucets and fixtures, which may be exacerbated by changes in the water chemistry.[1,34,35] Under normal conditions, one may argue that the main source of drinking water contamination in a building is the materials used in its construction. Therefore, older buildings (especially those constructed before stricter legislation on building material safety was established)[4] may be more prone to element concentration spikes in their drinking water supply. In this context, we carried out a study to identify school buildings that would be more likely to present higher concentrations of the eight elements evaluated, and that should be more closely monitored for drinking water contamination. We applied a multivariate approach using PCA and CA to identify more nuanced patterns, involving not only the age of the building, but also trace element concentration data and geographical location. Considering the proposed new legislation requiring drinking water testing for Pb at elementary schools and day care facilities built before 1987,[4] this strategy may contribute to optimizing public resources. Water testing efforts may target the most "at risk" buildings more intensely, and potentially toxic elements other than Pb may also be monitored. Summary statistics for all water samples analyzed are listed in Table 3.
Figures S1 and S2 (Supplementary Information (SI) section) show the proportions of data below elemental LODs for cafeteria and fountain samples, respectively. For both taps, more than 50% of observations were below the LOD for Cd, Se and Tl. Therefore, these three features were discarded when determining means and when performing multivariate analyses. On the other hand, all observations had detectable levels of Cu and Sb. Thus, the statistics shown in Table 3 were calculated only for school building construction year, and As, Cr, Cu, Pb and Sb (i.e., elements for which less than 50% of the data were below the respective LODs). These same features were used in PCA and CA.
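The screening rule described above can be sketched in a few lines. This is an illustration only: the function names, the example concentrations and the example LODs are all hypothetical, and the treatment of a feature with exactly 50% of its values below the LOD (dropped here) is an assumption, since the text does not address that case.

```python
def fraction_below_lod(values, lod):
    """Fraction of observations that fall below the limit of detection."""
    return sum(v < lod for v in values) / len(values)


def keep_features(data, lods, max_fraction=0.5):
    """Keep features with less than `max_fraction` of observations below the LOD."""
    return [
        name
        for name, values in data.items()
        if fraction_below_lod(values, lods[name]) < max_fraction
    ]


# Hypothetical concentrations (ug/L) and LODs, for illustration only.
example_data = {
    "Cu": [1.2, 0.8, 2.0, 1.5],
    "Cd": [0.001, 0.002, 0.05, 0.001],
}
example_lods = {"Cu": 0.01, "Cd": 0.004}

print(keep_features(example_data, example_lods))
```

In this made-up example, three of the four Cd values lie below its LOD, so Cd is dropped while Cu is retained, mirroring the exclusion of Cd, Se and Tl described in the text.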
Figure 1 shows the results for a univariate analysis of data on trace element concentration values and the age of the school buildings. It presents the feature correlation plots for cafeteria sink and water fountain observations. Significant correlations (p < 0.05) are the same for both taps: building year-Cr (negatively correlated), building year-As (positively correlated), and Cr-As (negatively correlated). A negative correlation between a given elemental concentration and building year implies that older buildings have a higher concentration of that element. Thus, older school buildings in this study present higher concentrations of Cr and lower concentrations of As in the drinking water, which is consistent with the negative correlation observed for As and Cr (Figure 1). On the other hand, newer school buildings show higher concentrations of As and lower concentrations of Cr in the drinking water. However, the As associated with Fe materials and other solids in the drinking water distribution system (DWDS) may vary and is difficult to predict,[40] and further research is required to explain the relationships observed in this study. It is possible that the trends observed in Figure 1 for both As and Cr are related to the materials available at the time the school was built. Although these results may provide useful information for preventive policies associated with drinking water safety, the maximum concentrations found for As and Cr were more than 50-fold and 200-fold lower than the respective MCLs (Table 3).
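To make the sign convention concrete, the sketch below computes a Pearson correlation coefficient between building year and an element concentration. It assumes that Pearson's r is the coefficient behind Figure 1, which the text does not state; the function name and the data are hypothetical, and the significance test (p < 0.05) is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: construction years and Cr concentrations (ug/L).
building_year = [1925, 1955, 1965, 1995, 2005]
cr_concentration = [0.40, 0.35, 0.30, 0.15, 0.10]

r = pearson_r(building_year, cr_concentration)
print(f"r = {r:.2f}")  # negative: older buildings pair with higher Cr in this example
```

A negative r in this made-up dataset corresponds to the building year-Cr relationship described above: the earlier the construction year, the higher the concentration.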
Cafeteria sink and water fountain data were also independently evaluated using a multivariate approach. Data were standardized to a mean of 0 and a standard deviation of 1 for both sets of data. Model-based clustering was used to objectively classify the samples. The models were set to optimize parameters according to a user-specified number of groups. Multivariate analyses involving 1-5 sample groups were tested, and the most efficient interpretation of clustered data was achieved with four groups.
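The workflow in this paragraph (standardize, fit mixture models for several candidate numbers of groups, compare them) can be illustrated with a deliberately simplified sketch. It is not the mclust implementation used in this study: it handles a single feature rather than six, uses a plain EM loop with a deterministic start, and scores models with the BIC written in the "larger is better" form. All function names and data are hypothetical, and the use of the sample standard deviation for scaling is an assumption.

```python
import math
import statistics


def standardize(values):
    """Scale values to mean 0 and (sample) standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant feature")
    return [(v - mean) / sd for v in values]


def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_mixture_1d(data, n_groups, n_iter=200, var_floor=1e-6):
    """Fit a 1-D Gaussian mixture by EM and return its maximized log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    if not 1 <= n_groups <= n:
        raise ValueError("n_groups must be between 1 and the number of observations")
    # Deterministic start: split the sorted data into contiguous chunks.
    chunks = [xs[i * n // n_groups:(i + 1) * n // n_groups] for i in range(n_groups)]
    weights = [len(c) / n for c in chunks]
    means = [sum(c) / len(c) for c in chunks]
    variances = [max(sum((x - m) ** 2 for x in c) / len(c), var_floor)
                 for c, m in zip(chunks, means)]

    def weighted_densities(x):
        return [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]

    for _ in range(n_iter):
        # E-step: probability that each observation belongs to each group.
        resp = []
        for x in xs:
            dens = weighted_densities(x)
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances from those probabilities.
        for k in range(n_groups):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty group unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            variances[k] = max(
                sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk, var_floor)

    return sum(math.log(sum(weighted_densities(x))) for x in xs)


def bic(loglik, n_groups, n_obs):
    """BIC in the 'larger is better' form: 2 ln L - k ln n."""
    n_params = 3 * n_groups - 1  # means, variances and (n_groups - 1) free weights
    return 2.0 * loglik - n_params * math.log(n_obs)


def select_n_groups(data, candidates):
    """Return the candidate number of groups with the highest BIC."""
    return max(candidates, key=lambda g: bic(fit_mixture_1d(data, g), g, len(data)))


# Hypothetical concentrations (ug/L) with two well-separated levels.
concentrations = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
z = standardize(concentrations)
print("groups selected:", select_n_groups(z, candidates=(1, 2)))
```

In the multivariate case each group also carries a covariance matrix, and the choice among covariance structures becomes part of the model comparison, which is what a dedicated package such as mclust handles.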
Figure 2 is a biplot of the scores and loadings for the first two principal components (PCs) after performing PCA on the standardized cafeteria sink data. Ellipses are drawn around observations according to cluster group assignment. Although there is some overlap amongst the four groups, clear patterns arise from the first two PCs. Considering the loading vectors, observations in group 2 separate according to As, Cu, Pb and Sb. Groups 1 and 3 separate according to the negative correlation between Cr and building year. Group 4 consists of observations with near-zero scores in the first two PCs.
Figure 3 shows the boxplots of standardized cafeteria sink data for each feature according to group assignment. The y-axis in this figure represents the distance from the mean in number of standard deviations. Group 1 consists of cafeteria sinks in newer school buildings, with relatively low concentrations of each element in drinking water, except for As. School buildings in group 2 have varied dates of construction, but present generally higher concentrations of As, Cu, Pb and Sb in drinking water when compared with the other groups. Groups 3 and 4 contain samples from older school buildings (all built before 1977). Samples in these groups present generally lower element concentrations than those in group 2, except for Cr in group 3.
The results in Figures 2 and 3 suggest that regulations introduced to limit the amount of toxic elements in building materials, combined with corrosion control treatments administered by the water treatment plant, were effective at reducing drinking water contamination. All school buildings in group 1, which present the lowest trace element levels in drinking water, were built after 1986.[1] This observation is reinforced by the results in Figure 4, which shows the geographical distribution of the four groups of samples associated with the cafeteria sink dataset. Schools in group 1 tend to be outside the older main urban center of the school district (Winston-Salem, Figure 4). As the city grew over the years, so did the awareness of environmental risks related to trace elements, which resulted in cleaner construction materials and lower concentrations of toxic elements such as Pb in drinking water. On the other hand, it is interesting to note that the relationship between the age of the building and the level of trace elements in drinking water is not always direct, especially for construction periods before 1987. Most schools in the WSFCS district were built in the 1920s (ca. 9% of the buildings), the 1950s and 1960s (ca. 49% of the buildings), and the 1990s and 2000s (ca. 33% of the buildings). Despite consisting of schools built before 1977, groups 3 and 4 present relatively low levels of most elements in drinking water when compared to the other groups. In contrast, schools in group 2, with varied dates of construction, present generally higher concentrations of As, Cu, Pb and Sb (Figure 3).
Similar results to those discussed for cafeteria sinks were observed for water fountain samples (SI section, Figures S3-S5). Considering the multivariate analysis results for both taps, the approach employed in this study is more effective at identifying school buildings prone to toxic element concentration spikes than relying solely on building construction date. It provides valuable information on location and the most critical elements for effective targeting in a public policy involving preventive action and drinking water safety in schools. Water samples collected from cafeteria sink and water fountain taps in group 2, for example, may be more susceptible to spikes in Pb concentration (for which the maximum value is only 3 times lower than the respective MCL, Table 3) due to periods of water stagnation or changes in water chemistry.[34,35] To further investigate the potential negative effects of water stagnation on Pb concentrations, we carried out a study with a school that was vacant for approximately 2 years (not included in the previous discussions and data), and confirmed a significant spike in Pb concentration in its drinking water (44 µg L⁻¹ of Pb in a cafeteria sink sample). To remediate such a high contamination level, we evaluated the effect of simply flushing the tap for a few minutes. After 5 min of flushing, Pb concentration was reduced to less than 1.0 µg L⁻¹. The results of a similar study performed with four schools during the summer break are discussed in more detail in the next section.

Water stagnation during summer break and potential Cu and Pb concentration spikes in drinking water
Drinking water samples from four schools were retested after 52 days of summer break to evaluate the effect of water stagnation on Cu and Pb concentrations. These elements were chosen due to their importance in current water safety legislation.[1] The schools were chosen according to Pb concentration in drinking water during the school year and the level of activity in the building during the summer. Considering these parameters and group assignment after multivariate analyses, we hypothesized that the cafeteria sink and water fountain at location C (Table 4) would be the taps most prone to spikes in Cu and Pb concentration. Locations A and B had summer activity. Location D is an old building with summer activity. Location C is an old building with no summer activity, and is part of group 2 (Figures 2 and 3, and S3 and S4 (SI section)), with a relatively higher Pb concentration during the school year.
As observed in Table 4, there was an increase in Pb concentration in drinking water for half of the cafeteria sinks evaluated. An even greater proportion of the water fountains tested presented an increase in Pb concentration during the summer break (i.e., 3 out of 4 taps). As expected, samples from the cafeteria sink and water fountain in school location C (group 2, Figures 2 and 3, and S3 and S4 (SI section)) presented the highest concentrations of Pb after the summer break. Most results for Cu also showed an increase in concentration due to water flow stagnation. It is important to note that, despite the spikes observed, neither Cu nor Pb concentrations reached values above the respective MCLs.
Similar to the experiment previously described for the unoccupied school building, a flushing procedure was evaluated as a simple method to minimize element contamination due to water stagnation during the summer break. Figure 5 shows the concentration profiles for Cu and Pb during the flushing of each tap at location C. Similar results are observed for the other locations evaluated (Figure S6, SI section). For all but one tap there was a drop in Cu and Pb concentration after flushing for 5 min. Interestingly, Pb concentrations at the water fountain in school location C increased after 5 min, and only showed signs of reduction (compared to the original Pb concentration) after 1 h of flushing. A similar effect, although less pronounced, was observed for the cafeteria sink in location B (Figure S6, SI section). These results may be related to any combination of the factors affecting sampling that were discussed earlier. The water flow was held relatively constant during sampling at all taps, so the spikes in Pb concentration observed during the flushing procedure may be associated with inherent and random particulate release from Pb-containing materials at the school.[12]

Conclusions
All samples collected from currently active schools in the WSFCS district tested below the MCL value set for each element (e.g., Pb values were all ≤ 5.0 µg L⁻¹). Univariate analysis identified significant correlations (p < 0.05) between the age of the school building and concentrations of As and Cr in drinking water. CA and PCA identified four distinct groups of schools, which were separated according to the age of the building and the concentrations of trace elements in the drinking water. This strategy may be useful to water safety administrators, as resources can be focused on locations most likely to present concentration spikes due to water stagnation or changes in water chemistry.
The method described in the present study is applicable to other school districts. It may be used as a model for a broader water contamination prevention program, in which samples from each school would be collected frequently over a long period of time (e.g., once a month for one year), and the results from a multivariate analysis would identify priority schools. Finally, policy makers could set the sampling frequency (once a month, once every three months, once a year, etc.) and identify the elements to be determined in drinking water for each school. The main advantage of such an approach is that it can be tailored to specific school districts, with each school tested at a frequency compatible with its contamination risk level.
The results presented in this study suggest that water stagnation does increase the chance of higher elemental concentrations in drinking water, especially Pb. They also indicate that a simple 5-60 min flushing procedure is capable of significantly minimizing water contamination due to flow stagnation in unoccupied or temporarily unused buildings. For example, a water sample with a Pb concentration of 2.3 µg L⁻¹ during the school year reached a value of 4.2 µg L⁻¹ after a 52-day stagnation period. However, after 5 min of flushing, the concentration at that tap (cafeteria sink at location C) dropped below 1.0 µg L⁻¹ for Pb.
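The example in the preceding paragraph can be expressed as relative changes. The sketch below uses only the three Pb values quoted there; the function and variable names are hypothetical, and because the post-flush concentration is reported only as "below 1.0 µg L⁻¹", the computed reduction is a lower bound on the actual one.

```python
def percent_change(before, after):
    """Percent change from `before` to `after`; negative values are reductions."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before * 100.0


pb_school_year = 2.3      # ug/L, cafeteria sink at location C during the school year
pb_after_break = 4.2      # ug/L, same tap after 52 days of stagnation
pb_after_flush_max = 1.0  # ug/L, upper bound reported after 5 min of flushing

stagnation = percent_change(pb_school_year, pb_after_break)
flushing = percent_change(pb_after_break, pb_after_flush_max)
print(f"stagnation: {stagnation:+.1f}%")
print(f"flushing:   {flushing:+.1f}% (the actual reduction is at least this large)")
```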

Figure 1. Correlation matrices for the cafeteria sink and water fountain datasets. Insignificant correlations (p ≥ 0.05) are marked with crossed boxes.

Figure 2. Biplot of the scores and loadings from the first two PCs after PCA of the standardized cafeteria sink dataset. Groups 1-4 are composed of 19, 14, 15 and 25 cafeteria sink samples, respectively.

Figure 3. Boxplots of standardized cafeteria sink data. The x-axis represents the sample group assignment after model-based clustering, and the y-axis represents the number of standard deviations from the mean.

Figure 4. Geographical location of WSFCS school buildings according to group assignment after model-based clustering of cafeteria sink data. The background in this figure was prepared with Google Maps and includes Winston-Salem (36.099861° N, 80.244217° W) and neighboring towns in Forsyth County, NC, USA.

Figure 5. Two-hour flushing profile for Cu and Pb concentration in drinking water from cafeteria sinks and water fountains in school location C after 52 days of summer break.

Table 1. ICP-MS/MS operating parameters used to determine trace element concentrations in drinking water

Table 2. Limits of detection (LOD) and quantification (LOQ), and percent recoveries from a certified reference material of water (NIST 1643e) for As, Cd, Cr, Cu, Pb, Sb, Se and Tl determined by ICP-MS/MS
a The numbers in parentheses are the certified concentrations for each element in µg L⁻¹. LOD: limit of detection; LOQ: limit of quantification.

Table 3. Summary statistics for cafeteria and fountain drinking water samples, and maximum contaminant levels (MCL) or action levels (AL) set by the EPA
a Not determined (N.D.), as more than 50% of observations were below the respective LODs. MCL: maximum contaminant level; AL: action level; LOD: limit of detection.

Table 4. Concentrations of Cu and Pb in drinking water collected during the school year and after a 52-day summer break