Modeling the influence of various water stressors on regional water supply infrastructures and their embodied energy

Water supply consumes a substantial amount of energy directly and indirectly. This study aims to provide an enhanced understanding of the influence of water stressors on the embodied energy of water supply (EEWS). To achieve this goal, the EEWS in 75 North Carolina counties was estimated through an economic input-output based hybrid life cycle assessment. Ten water stressor indicators related to population, economic development, climate, water source, and land use were obtained for the 75 counties. A multivariate analysis was performed to understand the correlations between water stressor indicators and the EEWS. A regression analysis was then conducted to identify the statistically significant indicators in describing the EEWS. It was found that the total amount of water supply energy varies significantly among selected counties. Water delivery presents the highest energy use and water storage presents the least. The total embodied energy was found to be highly correlated with total population. The regression analysis shows that the total embodied energy can be best described by total population and temperature indicators with a relatively high R square value of 0.69.


Introduction
Water and energy are highly interdependent: providing water needs energy for pumping and treatment, and providing energy requires a large amount of water for cooling and processing. It has been argued that this 'water-energy nexus' could substantially increase the vulnerability of water and energy resources under future global changes [1,2]. Over the past decade, a growing awareness of the water-energy nexus has led to proliferated research efforts to understand and quantify the energy embodied in varied water infrastructures. For instance, energy audits and risk assessments have been carried out to estimate operational energy use during a myriad of treatment processes [3][4][5][6]. Recently, with the development of life cycle assessment (LCA), an increased number of studies have examined energy consumption over the life cycle of water infrastructure, including construction, operation and maintenance (O&M), and decommission phases [7][8][9][10][11][12][13][14][15][16][17][18]. Such energy consumption is commonly referred to as 'embodied energy' or 'life cycle energy'.
Previous life cycle studies have revealed the significance of indirect energy flows associated with providing chemicals and services in water systems, which are commonly neglected in water utilities' energy audits [17,19]. Meanwhile, the heterogeneity of local socioeconomic conditions and water availability have been found to influence regional energy use of water supply [17]. Nevertheless, there is still little understanding on how and to what degree regional heterogeneity is correlated with the embodied energy of water supply (EEWS).
Water demand, availability, and quality could potentially serve as important indicators of the EEWS. When water demand increases, a larger amount of water has to be treated and pumped, requiring a higher energy input. The need to expand treatment capacity or construct new treatment plants may arise, furthering energy consumption. In existing or future cases where local freshwater availability is no longer able to meet the growing demand, alternative water sources, such as brackish groundwater or seawater, need to be adopted; these are usually much more energy intensive to treat compared with freshwater [9,17]. Degraded freshwater quality could also lead to higher energy consumption associated with providing higher amounts of aeration or chemicals to meet quality standards.
Certain region-specific characteristics, such as climate, population, land use, urbanization, and economic growth, have multiple effects on water demand, availability, and quality, and are considered to be primary global 'water stressors' [20]. Previous studies have projected water stressor influence on water resources, showing that population growth and economic development, especially in urban areas, have a significant impact on local water demand [21][22][23][24]. These factors also degrade water quality by diminishing the return flow and impairing water bodies' self-cleaning abilities [25]. Studies have also shown a significant relationship between land use and water quality [26][27][28][29] and quantity [30,31]. It has been reported that agricultural and impervious urban lands produce much higher levels of nitrogen and phosphorus in surrounding water bodies than other land surfaces [27,29]. Regional climate characteristics related to temperature and precipitation could potentially influence agricultural [32] and residential [33] water demand.
Though the implications of water stressors such as population growth and climate change on water resource management have been intensively studied in the past two decades [34][35][36][37][38][39][40][41], very few studies have further discussed how changes in water stressors could influence energy management in water systems [42,43]. Quantifications of the 'water stressor-water-energy' relationships are not found in the literature, implying a lack of understanding of how human interventions influence resource depletion from a water-energy nexus perspective. Missing this critical aspect could potentially lead to underestimation of resource depletion rates and of the urgency of adopting alternative water and energy management measures. In reality, water stressor influences have been increasing at accelerated rates in terms of scale and intensity over the past 150 years and will continue to change rapidly in the foreseeable future [20]. It is vital to understand how water-related energy consumption changes in order to provide guidance for future water and energy decision making and resource planning. This study attempts to fill an initial piece in this complex puzzle by examining the relationships between water stressors (population, economy, land use, and climate) and EEWS via statistical methods. This is accomplished through a case study in North Carolina (NC), where population, land use, climate, and EEWS were estimated on a county basis. Although water scarcity and quality degradation are not immediate issues in NC, the method developed in this case study could potentially be applied (given uncertainties) to other regions that do face such issues.

Case study area
NC was selected as a case study area based on its highly heterogeneous socioeconomic and environmental conditions and well-documented water supply infrastructure database. NC is ranked the 9th most populous state in the US with the 6th highest net population growth in 2014. Population is unevenly distributed across the state; around two thirds of the population concentrate primarily in the middle third of the state's landmass [44]. NC is a major producer of textiles, chemicals, food and tobacco, furniture, and electrical machinery in the nation, representing 14.5%, 8.8%, 8.3%, 7.4%, and 7.4% of the nationwide production of each industry respectively in the year of 2012 [45]. The thriving agricultural industry occupies around 27% of the state's 31 million acres land area [46]. Percentages of agricultural land across the NC counties are highly diverse, ranging from less than 1% to greater than 50% (figure 1). NC's climate is mainly influenced by the altitude of its three principal physiographic divisions, the Coastal Plain (<200 feet elevation), the Piedmont (200-1500 feet elevation), and the Mountains (1500-6684 feet elevation) [47]. The annual mean temperature across NC varies more than 20°F from the lower coast to the highest mountain. The Mountains also present a unique precipitation pattern.
Detailed Geographical Information System (GIS) file data on water supply infrastructure, including water treatment plants, pipelines (public water mains only), wells, surface water intakes, storage tanks, and pumping stations developed during 1997-2000, were obtained for 75 out of 100 counties statewide [48]. These data were originally developed by the NC Rural Economic Development Center (NCREDC) to facilitate water and wastewater systems' planning, siting, and impact analysis in NC. Locations of varied water infrastructure were digitized from state or owner supplied maps and digital files with tabular attribute data populated by information obtained from public water supply systems and individual owners [48]. A list of the 25 counties without water infrastructure data is provided in table S-1 of the supporting information; these counties were not analyzed. Figure 1 shows the locations of water treatment plants included in this study as well as each county's total population and percentage of agricultural land. Public water supply in NC mainly relies on its abundant surface water resources (85%) and groundwater withdrawals (15%) [49]. Average uphill/downhill slope was assumed to be zero for the pipelines, and its associated uncertainty is discussed in section 5.

Indicator selection and estimation
A variety of quantitative indicators, selected based on their potential correlations with water supply infrastructures and data availability, were used to describe the different aspects and current states of each water stressor (population, economy, land use, and climate) across NC counties. To improve temporal consistency, most indicators were obtained for a time period similar to that in which the water infrastructure data were generated (1997-2000), except for climate indicators, for which thirty-year normal data were used. Each county's population is described using two indicators: total population (P, number of people) and population density (P d , number of people per acre). While total population is directly correlated with water demand and supply, population density influences the distribution and density of water pipelines. County level estimations of the two population indicators in 2000 were obtained from the US Census Bureau [44]. The economic development of each county was described by the average per capita personal income (I, $) over 1997-1999 obtained from the US Bureau of Economic Analysis [45].
Due to potential correlations with water demand and quality, percentages of urban land (L u , %) and agricultural land (L a , %) over the total land area of each county were used to describe land use conditions. State land use was obtained from the 1996 Statewide Land Cover Raster Map (93.5 feet resolution) provided by the NC Center for Geographic Information and Analysis [50]. GIS was used to process the map and estimate L u and L a . The map was divided into 100 counties using a county boundary layer obtained from the US Census TIGER database [51]. Land cover was classified into 22 categories in accordance with the state's Standard Classification System for the Mapping of Land Use and Land Cover [50]. Among the 22 categories, 'high intensity developed land' and 'low intensity developed land' were included as urban land, while 'cultivated land' and 'managed herbaceous cover' were included as agricultural land. The remaining 18 categories represent unmanaged natural land, such as forest land, shrub land, water bodies, or barren land. L u and L a were then calculated as the number of grid cells assigned to the target land covers divided by the total number of grid cells within the county.
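The grid cell counting described above can be sketched as follows. This is a minimal illustration rather than the GIS workflow used in the study: the integer land cover codes, the no-data value, and the toy raster are all hypothetical, since the actual 22-category coding is defined in [50].

```python
import numpy as np

# Hypothetical land cover codes; the real 22-category scheme is defined in [50].
URBAN_CODES = [1, 2]         # assumed: high and low intensity developed land
AGRICULTURAL_CODES = [3, 4]  # assumed: cultivated land, managed herbaceous cover
NODATA = 0                   # assumed: marks cells outside the county boundary


def land_use_percentages(county_grid):
    """Return (L_u, L_a) in percent for one county's land cover raster."""
    grid = np.asarray(county_grid)
    inside = grid != NODATA
    total_cells = int(inside.sum())
    if total_cells == 0:
        raise ValueError("county raster contains no valid cells")
    urban_cells = int((np.isin(grid, URBAN_CODES) & inside).sum())
    agricultural_cells = int((np.isin(grid, AGRICULTURAL_CODES) & inside).sum())
    return (100.0 * urban_cells / total_cells,
            100.0 * agricultural_cells / total_cells)


# Toy 4 x 5 raster: 0 = outside the county, 9 = an unmanaged natural class.
toy_grid = np.array([
    [0, 1, 1, 9, 9],
    [0, 2, 3, 9, 9],
    [0, 3, 4, 9, 9],
    [0, 9, 9, 9, 9],
])
L_u, L_a = land_use_percentages(toy_grid)
```

In the toy raster, 16 cells fall inside the county, of which 3 are urban and 3 are agricultural.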
Climate indicators include mean temperature (T, °F), cooling degree days (CDD, °F day), heating degree days (HDD, °F day), and monthly precipitation (Pr, inch). Original temperature and precipitation information of all climate stations within each county was obtained from the NC Climate Retrieval and Observations Network of the Southeast Database [52]. As each station provides climate information for a variety of time periods (ranging from a few months to more than 100 years), only stations with longer climate records were included. All four indicators were calculated as averaged 30 year normal values from 1971 to 2000.
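To illustrate why CDD and HDD track mean temperature so closely, the sketch below computes degree days from daily mean temperatures. The 65 °F base temperature is an assumption (it is the conventional US base, but the source does not state which base was used), and the function names and sample temperatures are hypothetical.

```python
def degree_days(daily_mean_temps_f, base_f=65.0):
    """Return (CDD, HDD) in degF day for a sequence of daily mean temperatures.

    base_f = 65 degF is an assumed base temperature, not a value given in the study.
    """
    cdd = sum(max(0.0, t - base_f) for t in daily_mean_temps_f)
    hdd = sum(max(0.0, base_f - t) for t in daily_mean_temps_f)
    return cdd, hdd


def thirty_year_normal(annual_values):
    """Average a list of annual values, e.g. one value per year for 1971-2000."""
    return sum(annual_values) / len(annual_values)


# Five hypothetical daily mean temperatures (degF).
cdd, hdd = degree_days([30.0, 50.0, 65.0, 70.0, 85.0])
```

Because both quantities are deterministic functions of the same temperature series, strong correlations among T, CDD, and HDD are to be expected.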
Different water sources usually require specific water intake and treatment infrastructures. Generally, groundwater supply requires more energy for water intake and pumping, while surface water supply requires more energy for system constructions and providing chemicals [6,9] because of a lower raw water quality [53]. As surface and groundwater are the two primary water sources in NC, the percentage of surface water supply (w s , %) provided by NCREDC [48] was used as an indicator to describe the water source composition of each county.
The values of all aforementioned indicators for the 75 NC counties are provided in table S-2 of the supporting information.

Embodied energy calculation
The current study examines the embodied energy (in the form of primary energy) associated with providing energy, materials, and services during the construction and O&M of all water supply infrastructures (treatment plants, wells, surface water intakes, pipelines, storage tanks, and pumping stations) in each county. Equation (1) shows the calculation of embodied energy associated with water supply infrastructures in each county:

$$EE = \sum_{i} \left( C_{o,i} \, e_{o} + \frac{C_{c,i} \, e_{c}}{T_{i}} \right) \qquad (1)$$

where EE is the total embodied energy of supplying water in a certain county each year, Terajoule (TJ)/year; i is the water infrastructure index; C o,i is the annual O&M cost of water infrastructure i, $ million (2002 USD) per year; C c,i is the total constructional cost of water infrastructure i, $ million (2002 USD); e o is the embodied energy intensity of water infrastructure O&M in primary energy form, TJ of primary energy/$ million (2002 USD); e c is the embodied energy intensity of water infrastructure construction in primary energy form, TJ of primary energy/$ million (2002 USD); and T i is the life span of water infrastructure i, years. Life span of all water infrastructures was assumed to be 100 years [7,9] with regular maintenance of water treatment plants occurring throughout [54]. Embodied energy intensities used in the current study (e o and e c ) were adapted from Mo et al (2011) [9], estimated as averages of constructing or operating and maintaining an entire water supply system using an input-output based hybrid LCA approach (table 1). These embodied energy intensities were chosen because they have been tailored to both a typical groundwater supply system and a typical surface water supply system. Most NC groundwater is treated through simple disinfection and most surface water supply systems adopt conventional treatment processes, including coagulation, sedimentation, filtration and disinfection [48], which is consistent with Mo et al (2011) [9].
Constructional and operational costs (C o,i and C c, i ) were calculated for each type of water supply infrastructure by integrating infrastructure parameters mapped by NCREDC (e.g., length of pipeline, well depth, water flow etc) [48] into cost equations and/or curves obtained from a variety of previous studies (table 2) [53][54][55]. The operational cost for water delivery was included as the operation of pipelines instead of pumping stations. Cost curves for the O&M of water treatment plants include pumping energy for providing a finished dynamic water head of 100 feet. Additional pumping energy for elevated storage tanks and other operational costs were neglected. Pumping energy used for groundwater wells and surface water intake is accounted as a part of water intake energy instead of water delivery energy. All dollar values were adjusted to 2002 $USD for consistency with the unit of embodied energy intensities.
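The annualized calculation described in this section can be sketched as follows, assuming that equation (1) sums, over infrastructure types, the annual O&M cost times e o plus the construction cost times e c divided by the life span. The class name, the cost figures, and the two intensity values are hypothetical placeholders; the intensities actually used are those of table 1.

```python
from dataclasses import dataclass


@dataclass
class Infrastructure:
    name: str
    om_cost: float            # C_o,i: assumed annual O&M cost, $ million (2002 USD)/year
    construction_cost: float  # C_c,i: total construction cost, $ million (2002 USD)
    life_span: float = 100.0  # T_i in years (100 years assumed in the study)


def embodied_energy(items, e_o, e_c):
    """Annual embodied energy (TJ/year) summed over all infrastructure items."""
    return sum(
        item.om_cost * e_o + item.construction_cost * e_c / item.life_span
        for item in items
    )


# Hypothetical county with two infrastructure types.
county = [
    Infrastructure("pipelines", om_cost=5.0, construction_cost=200.0),
    Infrastructure("treatment plant", om_cost=1.0, construction_cost=50.0),
]
# Hypothetical intensities in TJ per $ million; NOT the table 1 values.
EE = embodied_energy(county, e_o=10.0, e_c=8.0)
```

With these placeholder numbers the operational terms contribute 60 TJ/year and the annualized construction terms 20 TJ/year.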

Multivariate and regression analyses
A multivariate analysis was conducted to understand correlations between the socioeconomic, water source, and climate indicators and the EEWS, and a stepwise regression analysis was performed to find the statistically significant contributors to the EEWS. Both analyses were performed in JMP® Pro 11 developed by SAS Institute Inc. The multivariate analysis utilizes the pairwise method to estimate Pearson correlation coefficients among variables. The values of Pearson correlation coefficients range from −1 to 1, indicating the strength and direction of the linear correlations among variables [56][57][58]. Values closer to 1 indicate stronger positive correlation, while values closer to −1 indicate stronger negative correlation. Values closer to 0 indicate weaker correlation. All data were standardized (equation (2)) before the regression analysis in order to avoid bias associated with the scale differences among the datasets:

$$X' = \frac{X - \bar{X}}{\sigma} \qquad (2)$$

where X′ is the transformed data after standardization; X is the original data; X̄ is the average of the original dataset; and σ is the standard deviation of the original dataset. Stepwise regressions were then performed on the standardized datasets to identify important contributors to the EEWS. This approach is usually adopted when there is little theory or understanding to guide the selection of terms for a model. The program examines all possible models to correlate socioeconomic, water, and climate indicators with the EEWS, and selects a best model with the minimum Bayesian Information Criterion (BIC) value defined by equation (3):

$$BIC = -2 \ln(L) + k \ln(n) \qquad (3)$$

where BIC is the Bayesian Information Criterion value; L is the maximized value of the likelihood function for the regression model; k is the number of parameters in the regression model; and n is the sample size.
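The standardization and minimum-BIC model selection described above can be sketched with NumPy. This is an illustrative re-implementation under stated assumptions, not the JMP procedure: it assumes ordinary least squares with Gaussian errors, uses the sample standard deviation, counts the error variance as a parameter in k, and runs on synthetic data with hypothetical variable names.

```python
import itertools

import numpy as np


def standardize(x):
    """Z-score a 1-D array: (x - mean) / sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)


def ols_bic(X, y):
    """BIC = -2 ln(L) + k ln(n) for an OLS fit, assuming Gaussian errors."""
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # intercept + predictors
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    log_lik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    k = design.shape[1] + 1  # assumption: coefficients plus error variance
    return -2 * log_lik + k * np.log(n)


def best_subset(columns, y):
    """Try every non-empty subset of predictors; return (min BIC, subset)."""
    names = list(columns)
    best = (np.inf, ())
    for size in range(1, len(names) + 1):
        for subset in itertools.combinations(names, size):
            X = np.column_stack([columns[name] for name in subset])
            bic = ols_bic(X, y)
            if bic < best[0]:
                best = (bic, subset)
    return best


# Synthetic example with 75 "counties": EE depends on P and T but not on Pr.
rng = np.random.default_rng(0)
n = 75
data = {name: standardize(rng.normal(size=n)) for name in ("P", "T", "Pr")}
ee = standardize(0.8 * data["P"] + 0.4 * data["T"] + rng.normal(scale=0.3, size=n))
bic, selected = best_subset(data, ee)
```

An exhaustive search is feasible here because 8 candidate indicators give only 255 non-empty subsets; with many more candidates a forward or backward stepwise search would be the usual compromise.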

Embodied energy of water infrastructures
Of the 75 counties studied, the overall embodied energy of each county (EE) ranges from around 10 TJ of primary energy in sparsely populated Clay County to over 1500 TJ of primary energy in more densely populated Brunswick and Robeson Counties. Figure 2 provides the estimated embodied energy for each of the 75 NC counties arranged in order of most to least energy intensive from left to right. EE was further separated into four subcategories, indicating the constructional and operational embodied energy of water intake (EE i ), water treatment (EE t ), water delivery through pipelines (EE p ), and water storage (EE s ). Pipelines account for the largest share of the total embodied energy [7][8][9][18]. Within the water delivery subcategory, operational energy usages (around 97-99% of EE p ) greatly outweigh the constructional energy usages, highlighting the importance of improving pumping energy efficiency and reducing water delivery needs. Water intake (0%-21% of EE), treatment (0%-37% of EE), and storage (0%-4% of EE) are relatively less significant in terms of embodied energy consumption compared with water delivery. The water intake subcategory in figure 2 includes the constructional and operational embodied energy of both surface water intake infrastructures and groundwater intake wells. Water withdrawal (including public, industrial, agricultural, and recreational water supply) in the 75 counties represents around 40% of the total water withdrawal of the entire state, because some of the most populous counties are not included in the available datasets. Within the 75 counties, surface water and groundwater withdrawals represent 70% and 30% of the total respectively [49], while their associated embodied energy represents 79% and 21% of the total water intake energy (EE i ) respectively, implying that surface water intake is more energy intensive than groundwater intake in NC (2.6 versus 1.7 MJ m−3).
Most surface water supply systems in the selected counties employ conventional processes of coagulation, sedimentation, and filtration, while 11 systems apply direct filtration. Most groundwater supply systems employ direct disinfection, while around 50 systems have hardness and iron removal. These treatment processes are relatively simple and energy efficient, which explains the relatively insignificant contribution of water treatment to EE in each county. Water storage is the least significant contributor to EE. The tanks are used to store either raw water or finished water. Most tanks in the selected counties are made of metal, and only a few are made of concrete. Around two thirds of these tanks are elevated; the majority of the rest are ground storage tanks, while a few (less than 7%) are hydro-pneumatic.

Multivariate analysis
For the multivariate analysis, all indicators and embodied energy were classified into four groups: socioeconomic indicators, water source indicators, climate indicators, and embodied energy. L u has relatively high positive correlations with P d and P, implying the potential effect of population growth on land use patterns. P is highly correlated with P d , with a correlation coefficient of 0.85 (tables S-3 and 4 in SI). L a and I, on the other hand, do not have significant correlations with any other socioeconomic indicators, showing they are relatively independent variables. The climate indicators T, CDD, and HDD are highly interdependent, with absolute values of their correlation coefficients above 0.97. Because CDD and HDD are both calculated from temperature, both were removed from the subsequent regression analysis to improve its fitting performance. Pr does not have a high correlation with any other climate indicator, and is considered an independent variable. w s is relatively independent, and does not have high correlations with any socioeconomic or climate indicators. The socioeconomic, water source, and climate indicator groups do not have high correlations with each other.
The high positive correlation between EE and EE p can be explained by the high percentage of water delivery energy within the total embodied energy. EE i and EE t are also positively correlated with EE and EE p , but to a lesser extent because although the amount of water is the same, water delivery distance varies across counties. EE i and EE t have a relatively high correlation with each other, which can be explained by the fact that they are both dependent on systems' water flow and the variance can be explained by the different intake systems and treatment technologies adopted. EE s is not highly correlated with any other embodied energy variables, showing the construction of water storage tanks might be relatively independent of the other types of water infrastructures. Across the indicator groups, most embodied energy variables except EE s have relatively high correlations with population. EE and EE p also have positive correlations with L u , with correlation coefficients of 0.54 and 0.52 respectively.
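The pairwise Pearson correlations reported in this section can be sketched as below. This is a minimal illustration, assuming that "pairwise" means each coefficient uses every observation available for that pair of variables (missing values coded as NaN); the function names and the 0.97 screening threshold applied in the helper are hypothetical choices for the example.

```python
import numpy as np


def pairwise_pearson(x, y):
    """Pearson r using only the observations where both x and y are present."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = ~(np.isnan(x) | np.isnan(y))
    x, y = x[keep], y[keep]
    if len(x) < 2:
        raise ValueError("need at least two complete pairs")
    dx, dy = x - x.mean(), y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))


def highly_correlated_pairs(columns, threshold=0.97):
    """List variable pairs whose |r| exceeds the threshold (collinearity screen)."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pairwise_pearson(columns[a], columns[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged


# Toy data: B is a linear function of A wherever both are observed.
toy = {
    "A": [1.0, 2.0, 3.0, float("nan")],
    "B": [2.0, 4.0, 6.0, 1.0],
    "C": [3.0, 1.0, 2.5, 0.0],
}
flagged = highly_correlated_pairs(toy)
```

A screen of this kind is one way to spot groups such as T, CDD, and HDD before regression, so that only one representative of each group is retained.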

Regression analysis
Stepwise regression analysis identifies the most statistically significant variables in determining the EEWS of a county. Figure 4 provides the 5 selected variables out of 10 initial variables (CDD and HDD excluded, and variables that are not selected in any individual regression analyses are not listed) in determining the values of EE, EE i , EE t , EE p , and EE s through five individual regression analyses. The variables selected for the total embodied energy are total population and temperature. Both variables have p-values of less than 0.0001, indicating their very strong statistical significance in determining the values of EE. The two variables combined have an R square value of 0.69 in determining the values of EE, which is higher than using either one of them as a predictor. Given the high correlation between EE and EE p , the selected variables of water delivery energy replicate those of the total embodied energy. The R square value for using the same two variables in determining the values of EE p is also relatively high (0.67). Selected determinants of water intake energy are percentage of urban land and total population with high statistical significance (p-values <0.01). The R square value of this equation is 0.52, indicating a relatively good predictability of water intake energy using the two selected variables. However, none of the 8 initial variables could offer good estimations of water treatment energy and water storage energy; even the best models selected yield very low R square values. Nevertheless, water source, urban land, and temperature have been identified to have relatively high statistical significance in determining the values of EE t , while population density has high statistical significance in determining the values of EE s .
It must be noted that the variables selected for each type of embodied energy might not explain the actual causes of the changes in EEWS, but rather identify statistically significant factors in determining the values of each type of embodied energy. For instance, population density and total population are highly positively correlated with each other; hence, the selection between the two variables depends purely on how well either variable improves the model performance in terms of BIC rather than on which variable actually explains the embodied energy results. Nevertheless, the regression results still provide an important means to roughly estimate the embodied energy of a region based on variables that could be easily obtained (e.g., population and temperature).
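Because the regressions were fitted on standardized data, using the model for a new region involves standardizing the inputs with the study's means and standard deviations and then transforming the result back to TJ/year. The sketch below shows that round trip; every number in it (coefficients, means, standard deviations) is a hypothetical placeholder, since the fitted coefficients are reported only in figure 4 and the dataset statistics in the supporting information.

```python
def predict_ee(population, temperature_f, coef, stats):
    """Predict EE (TJ/year) from a model fitted on standardized variables."""
    p_z = (population - stats["P_mean"]) / stats["P_std"]
    t_z = (temperature_f - stats["T_mean"]) / stats["T_std"]
    ee_z = coef["P"] * p_z + coef["T"] * t_z + coef["intercept"]
    # Undo the standardization of the response variable.
    return ee_z * stats["EE_std"] + stats["EE_mean"]


# Hypothetical placeholders, NOT the values fitted in the study.
coef = {"P": 0.7, "T": 0.3, "intercept": 0.0}
stats = {
    "P_mean": 50_000.0, "P_std": 40_000.0,  # people
    "T_mean": 59.0, "T_std": 3.0,           # degF
    "EE_mean": 300.0, "EE_std": 250.0,      # TJ/year
}

ee_at_mean = predict_ee(50_000.0, 59.0, coef, stats)
ee_one_sigma_up = predict_ee(90_000.0, 62.0, coef, stats)
```

A county at the mean of both predictors returns the mean EE, and moving both predictors up by one standard deviation shifts the prediction by (0.7 + 0.3) standard deviations of EE under these placeholder coefficients.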

Uncertainties
It is important to understand the uncertainties associated with this study to guide future investigations and improvements. In the current study, most of the raw data were obtained from quality sources as detailed in section 3.1 with relatively high confidence. Some assumptions were made for estimating embodied energy intensities and costs of water infrastructures. First, an assumption of 100 year life span was made for all types of water infrastructures in accordance with previous practices [7,9]. A previous study conducted a sensitivity analysis investigating the potential influence of infrastructure life span on embodied energy estimations [17], and it was found that when the life span is between 50 and 150 years, insignificant changes (<10%) in EEWS were observed due to the insignificance of constructional energy compared with operational energy. Nevertheless, when the life span is further reduced to 20 years, embodied energy might increase significantly (15%-25%). Another uncertainty is associated with the assumption of zero average uphill/downhill slope in estimating the operational cost of pipelines, which could potentially underestimate the pumping energy given NC's elevation changes from east to west. When the average uphill/downhill slope changes to an expected maximum of 300 feet per 1000 feet, the operational energy of pipelines will increase by 241% in this study. An additional source of uncertainty is that the selected counties in this study are primarily water abundant rural areas, and hence outcomes of the current study may not apply as well to regions that are primarily urban areas where water system scales, raw water sources (e.g., inclusion of seawater or brackish groundwater as alternative sources), treatment technologies, and/or water demand could be significantly different. The authors also acknowledge that not all indicators that could potentially influence the EEWS were exhausted. 
Hence, future studies are needed to understand the mechanistic cause-and-effect relationships among the indicators and the EEWS.
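The life span sensitivity discussed above follows from how construction energy is annualized: operational energy recurs every year, while construction energy is spread over the assumed life span. The sketch below illustrates the effect with hypothetical numbers in which operation dominates; the 95 TJ/year and 500 TJ figures are placeholders chosen for illustration, not results of the study.

```python
def annual_ee(operational_tj_per_year, construction_tj_total, life_span_years):
    """Annualized embodied energy: operation plus construction spread over life."""
    return operational_tj_per_year + construction_tj_total / life_span_years


# Hypothetical system: 95 TJ/year of operation, 500 TJ of construction energy.
baseline = annual_ee(95.0, 500.0, 100.0)

relative_change = {
    life: (annual_ee(95.0, 500.0, life) - baseline) / baseline
    for life in (20.0, 50.0, 150.0)
}
```

With these placeholders, moving from a 100 year life span to 50 or 150 years changes the annual total by only a few percent, whereas a 20 year life span raises it by 20%, the same qualitative pattern as the cited sensitivity analysis [17].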

Implications
This study estimated the EEWS in 75 NC counties, explored its correlations with water stressors, and provided an approach to rapidly estimate the EEWS using socioeconomic and environmental variables. Among the selected counties, the total amount of water supply energy varies significantly. As water and energy infrastructures are critical to city and regional planning, it is important to understand their drivers and predictors of changes to guide future decision making. EEWS estimation shows that water distribution consumes the highest amount of embodied energy, while water storage consumes the least. This sheds light on a potential benefit of integrating onsite or small-scale decentralized water supply technologies such as rainwater harvesting, greywater recycling, or community-scale water supply systems to supplement future water supply in an energy efficient way by reducing the need for long distance water delivery. The total EE was found to be highly correlated with total population, and a rough prediction of EE can be made using total population and temperature indicators. Doubling of the average total population per county could potentially result in a 72% increase in the total EEWS. Future climate change might influence the need for water infrastructures, and could potentially impact the water-related energy demand. Based on our results, a 10% increase in average temperature could result in a 40% increase in the total EEWS.
Figure 4. Outcomes of the stepwise regression analysis showing the statistically significant water stressors of each type of embodied energy of water supply in the studied North Carolina counties (blue numbers: coefficients a 1 -a 6 in equations EE (or EE i , or EE t , or EE p , or EE s ) = a 1 × L u + a 2 × P + a 3 × P d + a 4 × w s + a 5 × T + a 6 ; black numbers: p value of each coefficient based on t-statistics; green numbers: R 2 value of each equation; brown characters: standardized embodied energy and water stressor indicators. Grey shades indicate statistical significance: the darker the color, the higher the statistical significance).
The average total volumetric energy intensity calculated in this study (∼34 MJ m−3) is higher than previous estimations of around 11 MJ m−3 [9,17] (yet still of the same order of magnitude), which could be attributed to economies-of-scale effects, as mostly small water systems (<10 thousand m3 d−1) were included in this study. We also examined the applicability of the regression model in estimating the EE of individual water supply systems located in regions outside of NC [8,9]. A groundwater supply system (76.8 thousand m3 d−1 of daily flow and serving a population of 121 000) located in Kalamazoo, Michigan was previously estimated to have an EE of 289 TJ/year via an input-output based hybrid LCA approach [8,9]. Additionally, a surface water supply system (287 thousand m3 d−1 of daily flow and serving a population of 657 000) located in Tampa, Florida was previously estimated to have an EE of 1131 TJ/year [8,9]. Applying the EE regression model and an annual mean temperature of 49.15°F [59] to the Kalamazoo system results in an EE of 345 TJ/year, which is within 20% of the original estimate [8,9]. On the other hand, when applying the regression model and an annual mean temperature of 73.35°F [60] to the Tampa system, an increase of roughly 150% in the EE estimate (2867 TJ/year) was observed, although both estimates are still of the same order of magnitude. This result indicates a potential broader applicability of the regression model in different regions and study scales across the US. The difference in accuracy between the Kalamazoo and Tampa estimates aligns with our expectation that the regression model might be more applicable to rural/smaller scale systems than to urban/larger scale systems. This study is a first attempt to describe the EEWS by indicators of water stressors. Often, such indicators are easy to obtain, while estimating the EEWS for a region could be time and data intensive. This study provides an alternative estimation method. It could be useful for integrated water and energy planning under climate change and population growth [61]. Such an approach could also benefit future studies of water-energy-society interactions.
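The two out-of-state comparisons above can be checked with a short calculation of the relative deviation between the regression estimate and the earlier hybrid LCA estimate. The EE values are those quoted in the text; the function name is only an illustrative choice.

```python
def relative_deviation(model_estimate, reference_estimate):
    """Signed deviation of the model estimate relative to the reference."""
    return (model_estimate - reference_estimate) / reference_estimate


# EE values in TJ/year as quoted in the text.
kalamazoo = relative_deviation(345.0, 289.0)  # regression vs hybrid LCA
tampa = relative_deviation(2867.0, 1131.0)    # regression vs hybrid LCA

print(f"Kalamazoo: {kalamazoo:+.1%}")
print(f"Tampa: {tampa:+.1%}")
```

The Kalamazoo deviation is just under 20%, while the Tampa deviation is about 150%, that is, the regression estimate is roughly 2.5 times the earlier value.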