1 Introduction

A large proportion of the optimization models for water distribution systems focus on networks consisting of pipes only. Very few published works simultaneously incorporate the sizing and operation of tanks and pumps, multiple operating conditions and demand variations which are all typical features of water distribution systems. This is mainly attributed to the significant increase in complexity which stems from the additional design variables and multiple operational constraints. In addition, accurate dynamic simulation of the system is extremely time-consuming, particularly in evolutionary optimization algorithms that operate on large populations of candidate solutions.

The Anytown network is a hypothetical system with multiple loadings and multiple storage tanks and pumps (Walski 1987). Murphy et al. (1994) applied an evolutionary algorithm to optimize the design based on single-objective optimization with constraint-violation penalties. Walters et al. (1999) solved the problem using multi-objective optimization. Dimensional weightings were required to facilitate the aggregation of the various system improvements considered. Vamvakeridou-Lyroudia et al. (2005) employed multi-objective optimization combined with fuzzy membership functions for constraint handling purposes based on aggregators that essentially are weightings. The performance is heavily dependent on the many parameters and operators introduced.

Several researchers modified the Anytown problem by incorporating other measures of hydraulic performance. Farmani et al. (2005, 2006) considered the resilience index (Todini 2000) and water age (Rossman 2002). Prasad and Tanyimboh (2008) investigated the statistical flow entropy (Tanyimboh and Templeman 2000; Tanyimboh et al. 2011) and resilience index. More recently Atkinson et al. (2014) maximized flow entropy and resilience index at once in an attempt to generate hybrid solutions with properties derived from the two measures. The above-mentioned extensions and other aspects such as the uncertainties associated with the nodal demands (Spiliotis and Tsakiris 2012) were not considered in this article.

Kurec and Ostfeld (2013) addressed tank sizing for a network in the literature. They employed constraint-violation tournaments, and coefficients for tank-level imbalances. Constraint handling methods in evolutionary algorithms generally employ penalties (e.g. Kougias and Theodossiou 2013) or some form of weighting to standardize different constraint violation measures. The major disadvantage of using these parameters (Dridi et al. 2008) is that there is no rigorous method to obtain their values. Users often have to find the most effective values by experimentation that is time consuming. Moreover, penalty parameters and weightings are often problem specific in that they may perform well on some problems but not so well on others. Hence the whole cycle of calibration and trial runs has to be repeated every time a different optimization problem is solved. In addition, the optimality of the solution heavily depends on these parameters.

This paper extends previous work by the authors that was concerned with the optimization of networks with pipes only (Siew and Tanyimboh 2012a; Siew et al. 2014). The development and application of a multi-objective evolutionary optimization approach for the design and rehabilitation of water distribution systems that avoids the above-mentioned difficulties is described. The minimum node-pressure constraints are addressed implicitly with pressure-driven analysis of the distribution system. Tank sizing is carried out seamlessly using pressure-driven extended-period simulation. Tank depletion over the operational cycle (typically 24 hours) is promoted explicitly, as an extra objective to improve water quality alongside the tank replenishment objective. Furthermore, optimal tank siting and pump scheduling are fully integrated in the optimization procedure. Upgrading of pumping stations and the operation and energy consumption costs are considered. This new formulation achieved many new solutions that are fully feasible, satisfy both pressure and other hydraulic and operational constraints, and are cheaper than the previous solutions in the literature. The solutions achieved were assessed further, in terms of the fluctuations of the water levels in the tanks, plus the temporal and spatial variations of the water quality in the distribution system. Given the context of the Anytown network that is hypothetical, the values used in the water quality simulations were selected to achieve the prescribed minimum chlorine residual throughout the distribution system.

2 Brief Overview of Anytown Network Optimization Problem

A brief summary of the network is presented here for completeness. The layout of the Anytown network is shown in Fig. 1a. The source is a water treatment plant located at node 40 with a fixed water level of 3.05 m. Water is pumped from the plant into the system via three identical pumps operating in parallel. There are two existing storage tanks located at nodes 14 and 17 both with operating water levels between 68.58 m and 76.20 m. The volume of water below 68.58 m and above 65.53 m is emergency storage. Other data for pipes and nodes are available in Walski (1987).

Fig. 1
figure 1

The Anytown water distribution network. Solutions with system performance values less than unity are infeasible. Tanks 41 and 42 are existing tanks

A minimum pressure of 28.12 m must be provided at all nodes for the average day flow of 24 hours duration as well as the instantaneous peak flow, i.e. 1.8 times the average day flow. There are three different critical fire flows during which water is supplied at a minimum pressure of 14.06 m. With one pump out of service and all tanks starting at their lowest operating levels for normal storage, the emergency storage in each tank must be sufficient for a 2-hour fire and at the same time supply peak flow demands, i.e. 1.3 times the average day flow. 35 existing pipes are considered for paralleling or cleaning and lining. In addition, there are six new pipes to be sized. One or two new tanks can be added to the network. Potential tank locations can be any of the 16 available nodes which are not connected directly to an existing tank. Tanks are connected to a node by a riser of length 30.78 m with the diameter to be determined.

New pumping stations are not considered but an upgrade of the existing pumping station is allowed through the addition of one or two new pumps with identical characteristics to the existing ones. Given eight average-day demand factors (one each for the eight 3-hour durations in 24 hours), eight ON-OFF status control variables are used for the operation of a single pump. As such, each status control variable corresponds to a demand factor. This enables the pump scheduling to be optimized for the different demand periods.

3 Methodology for Tank Siting and Design

Tank siting and design are addressed seamlessly in the optimization model. With extended-period pressure-driven analysis, the tanks are simulated realistically. Consequently no additional constraints are needed to prevent tanks from overfilling or dropping below emergency operating levels. The reason is that the limits of the operational levels are explicitly recognised by the hydraulic simulator during the extended-period simulation as standard (Siew and Tanyimboh 2010a). This approach avoids the problems of tank flow imbalances (e.g. Murphy et al. 1994; Walters et al. 1999; Kurek and Ostfeld 2013).

Moreover, the philosophy here is different (Siew and Tanyimboh 2010b). For example, constraints to balance the respective initial and final water levels in the tanks and are not required. Similarly, a constraint to balance the total inflow and outflow volumes of the tanks for the network as a whole is not required. In addition to pressure-driven extended-period simulation, we have incorporated criteria that promote tank depletion and replenishment in the objective function (Section 4). Accordingly, solutions having tanks with excessive cost, insufficient or excessive elevation, insufficient or excessive capacity, partial depletion or partial replenishment are optimized out seamlessly through natural selection. If a candidate solution has a new tank, the solution’s chromosome specifies the site and other relevant parameters. The pressure-driven hydraulic simulator together with penalty-free Pareto dominance permits any solution (feasible or infeasible) to be rated realistically and without bias.

It may be recalled that in extended-period simulation, a snapshot analysis is executed at the beginning of a hydraulic time step and the system is checked for any status changes during the hydraulic time step. If, for example, the water level in a new tank reaches the minimum level before the end of the time step, the tank’s riser is temporarily closed and an additional snapshot analysis is performed due to the changed system state. This sequence is carried out in each hydraulic time step in the extended-period simulation. This is the standard procedure for extended-period simulation in EPANET-PDX (a pressure-driven hydraulic simulation model developed in the present research) (Siew and Tanyimboh 2010a, 2012b). Conceptually the extended period simulation procedure is the same as in EPANET 2. Tank designs achieved at the end of the optimization are final and no further tank adjustments (as in e.g. Vamvakeridou-Lyroudia et al. 2005) are required.

New tanks were assumed to be cylindrical. Four design variables for the new tanks were used as follows (Vamvakeridou-Lyroudia et al. 2005; Prasad 2010).

(a) Total volume (V);

(b) Ratio of diameter to height (D/H);

(c) Ratio of emergency volume to total volume (v/V); and

(d) Level of the bottom of the tank (i.e. the tank’s elevation).

Upper and lower bounds on the shape parameter D/H and emergency storage fraction v/V were employed to avoid solutions that may be undesirable in practice (Prasad 2010). All nodes except those already connected directly to an existing tank were considered as possible locations for new tanks. Table 1 summarizes the decision variables of the optimization problem. The range of tank volumes and their associated costs was taken from Walski (1987). Intermediate tank sizes were considered here and corresponding costs were interpolated linearly. The tank-sizing decision variables were discretized to provide eight options per decision variable (Table 2).

Table 1 Overview of the decision variables
Table 2 Tank costs and associated discrete decision variable options

4 Formulation of the Optimization Model

The objective is to upgrade the system to meet future demands and pressure in the most economical way, considering both capital expenditure and operational costs. The procedure developed is based on two primary objectives. The first objective minimises the total cost. The second objective aims to improve the system’s performance. The formulation of the second objective function involves normalising the operational constraints and combining the normalised operational constraints with a measure of hydraulic performance i.e. the demand satisfaction ratio. The demand satisfaction ratio, derived by pressure-driven analysis (see e.g. Gupta and Bhave 1996; Giustolisi et al. 2008; Gorev and Kodzhespirova 2013; Kovalenko et al. 2014; Ciaponi et al. 2015), is the ratio of the flow delivered to the flow required (Ackley et al. 2001; Tanyimboh et al. 2003; Kalungi and Tanyimboh 2003; Tanyimboh and Templeman 2010). The demand satisfaction ratio is also known as the available demand fraction (see e.g. Abdy Sayyed et al. 2015; Gupta 2015).

The first objective function f 1 represents the total cost while the second objective function f 2 represents the system’s performance; f 1 is minimized and f 2 is maximized.

$$ {f}_1={\chi}^2 $$
(1)
$$ {f}_2={\pi}^4 $$
(2)

where χ and π are the normalised total cost and system performance functions, respectively. The exponents 2 and 4, in Eqs. 1 and 2 respectively, are default empirically derived values in Siew and Tanyimboh (2012a). For solution i,

$$ {\chi}_i=\frac{C_i}{C_{\max }};\kern1.5em i=1,\kern0.5em \dots, \kern0.5em NI $$
(3)

where NI is the number solutions, C i is the total cost for solution i and χ i is the normalised total cost. This includes all the costs incurred (i.e. pipes, pumps, tanks, energy, etc.). The present worth of energy costs is based on an interest rate of 12 % and an amortisation period of 20 years. Details of the costs of pipe paralleling, cleaning and lining, pump operation and tanks are available in Walski (1987). C max is the cost of the most expensive of solutions in the population.

We investigated two alternative definitions of the system performance function π in Eq. 2. For solution i, where NI is the number solutions,

$$ {\pi}_i=\frac{1}{2}\left(\frac{1}{NL}{\displaystyle \sum_{l=1}^{NL}{\mu}_l}+\frac{1}{NR}{\displaystyle \sum_{r=1}^{NR}{\rho}_r}\right);\kern1em i=1,\kern0.5em \dots, \kern0.5em NI $$
(4)

where NL is the number of loading conditions (e.g. average day, peak flow, etc.), NR is the number of service reservoirs or tanks. For service reservoir r, ρ r is the replenishment ratio; i.e. the ratio of the volume of water at the end of the last hour of the extended-period simulation to the total operational water volume. This criterion aims to refill each reservoir at the end of the operational cycle (typically 24 hours). For loading condition l, μ l is the average network demand satisfaction ratio; i.e. the ratio of the available flow to the demand and is derived from the pressure-driven simulation model. Maximizing this criterion aims to satisfy all the nodal demands (Ackley et al. 2001). In this way, the minimum node pressure constraints are addressed seamlessly (Siew and Tanyimboh 2012a; Siew et al. 2014).

The average network demand satisfaction ratio μ l for loading condition l is

$$ {\mu}_l=\frac{1}{N{T}_l}{\displaystyle \sum_{t=1}^{N{T}_l}{\sigma}_t};\kern0.36em l=1, \dots,\;NL $$
(5)

For hydraulic time step t, σ t is the network demand satisfaction ratio; and NT l is the number of hydraulic time steps for loading condition l. The parameters μ l , ρ r and σ t all have values between 0 and 1. Hence, the system performance function π i (Eq. 4) reaches a maximum value of 1 when all the criteria defined are met. As π i comprises normalised necessary conditions, additional weights or coefficients are not required.

The system performance function π i may achieve fully feasible solutions. However, the solutions so obtained need not make full use of the variable storage of the service reservoirs during the average day. Therefore, to improve the formulation further, an additional criterion was introduced in Eq. 6 to promote service reservoir depletion.

$$ {\pi}_i=\frac{1}{3}\left(\frac{1}{NL}{\displaystyle \sum_{l=1}^{NL}{\mu}_l}+\frac{1}{NR}{\displaystyle \sum_{r=1}^{NR}\Big({\rho}_r}+{\delta}_r\Big)\right);\kern.5em i=1,\kern0.5em \dots, \kern0.5em NI $$
(6)

For service reservoir r, δ r is the depletion ratio i.e. the ratio of the maximum cumulative depletion achieved for the average-day flow to the total operational volume. This criterion promotes cost effectiveness and improves water quality through full depletion of the service reservoirs in each cycle (see e.g. Edwards and Maher 2008).

5 Computational Solution of the Optimization Problem

We used a genetic algorithm PF-MOEA (penalty-free multi-objective evolutionary algorithm) developed in this research (Siew and Tanyimboh 2012a; Siew et al. 2014) to solve the optimization problem. PF-MOEA is based on NSGA II (non-dominated sorting genetic algorithm) (Deb et al. 2002). PF-MOEA rates all solutions according to the performance function π i (Eq. 4 or 6) followed by Pareto dominance based purely on the performance and cost functions π i and χ i .

The pressure-driven hydraulic simulator EPANET-PDX (Siew and Tanyimboh 2012b; Seyoum and Tanyimboh 2014) is embedded in PF-MOEA. The pressure-driven hydraulic simulator provides a realistic assessment of the hydraulic properties of the solutions with insufficient flow and pressure as it takes account of the relationship between the pressure at a demand node and the flow that is available (see e.g. Elhay et al. 2016). Unlike NSGA II (Deb et al. 2002), PF-MOEA rates both feasible and infeasible solutions strictly according to their cost effectiveness. Accordingly, in PF-MOEA all feasible and infeasible non-dominated solutions are considered superior to all feasible and infeasible dominated solutions. In other words, non-dominated infeasible solutions are considered superior to dominated feasible solutions. Additional details on the methodology and its effectiveness are available in Siew and Tanyimboh (2012a) and Siew et al. (2014). Also, demonstrations of the effectiveness of an alternative penalty-free formulation that uses the EPANET 2 hydraulic simulator are available in Saleh and Tanyimboh (2013, 2014, 2016). The hydraulic modelling approach in EPANET 2 assumes that the flow that is available at a demand node is the same as the demand and is thus characterised as demand-driven analysis (see e.g. Spiliotis and Tsakiris 2011).

We wrote the program for the genetic algorithm in C++ (Siew and Tanyimboh 2012a). We used binary coding and the operators were single-bit mutation, single-point crossover and binary tournament selection for crossover. The population size was 200; crossover probability was 1; and mutation rate was 0.005; 15 optimization runs with different initial populations were conducted for each formulation (Eqs. 4 and 6) of the performance function π i . The members of the initial populations were selected randomly except for the minimum and maximum solution vectors that are always included by default. Each decision variable in the minimum solution vector has the smallest permissible value within the solution space. Similarly each decision variable in the maximum solution vector has the largest permissible value within the solution space.

Each optimization run lasted 5,000 generations i.e. one million function evaluations. We carried out extended-period simulation with a hydraulic time step of one hour for the 24-hour average-day flow. Previous studies used hydraulic time steps of three hours (Vamvakeridou-Lyroudia et al. 2005; Prasad 2010) and six hours (Walters et al. 1999). Also, we used a hydraulic time step of 30 minutes for the fire flows. A personal computer (Intel Core 2 Duo, CPU 2.66 GHz, RAM 3.23 GB) was used for this study. A typical optimization run required on average 22.7 hours; i.e. 16.36 seconds per generation or 0.08 seconds per function evaluation.

6 Results and Discussion

The solutions achieved were assessed further by investigating the spatial and temporal variations of the water age, chlorine residual and disinfection by-products for the average-day flow in the hydraulic simulator EPANET 2 (Rossman 2002). Complete mixing was assumed for all tanks. All water quality simulations were run for 72 hours to enable the results to stabilize. Results for the last 24 hours are presented here. For the water quality simulations, both the hydraulic and water quality time steps were one minute. Bulk and wall reaction rate constants of 0.5/day and 0.1 m/day, respectively, were assumed (Carrico and Singer 2009). To ensure the chlorine residual at all demand nodes and tanks remained just above 0.2 mg/L (WHO 2011) the chlorine concentration at the treatment plant was kept constant at 0.6 mg/L. A maximum total THM (trihalomethane) concentration of 100 μg/L (European 1998) was adopted. For water age and THM, initial values of zero were assumed for all nodes and tanks. The 72-hour extended-period simulation in EPANET 2 required 1.6 seconds for water age, 1.3 seconds for chlorine and 2 seconds for THM, on an Intel Xeon workstation (with two processors of CPU 2.4 GHz and RAM of 16 GB). To avoid misunderstanding, this workstation was used for the water quality modelling only. The other aspects i.e. optimization and results verification were performed on the PC (personal computer) mentioned previously in Section 5.

Two of the best solutions achieved i.e. Solutions 1 and 2, are presented and discussed in detail. These solutions are fully feasible as they do not violate any node pressure constraints while operating under all five loading conditions and all tanks fully refill for the average-day 24-hour cycle. Both solutions have been simulated with a hydraulic time step of one minute to confirm their feasibility. Figure 1b shows the Pareto-optimal front from which Solution 2 was obtained. The optimization algorithm consistently found many feasible designs as in Fig. 1b. Therefore, post-optimization, artificially stringent criteria were adopted for the hydraulic simulations of the solutions achieved. Aside from Solution 2, there are many fully feasible designs in the same optimization run that are cheaper than the previous best solution in the literature. Table 3 provides a cost comparison with the previous best solutions.

Table 3 Cost comparison with previous best solutions

The algorithm also achieved consistently many competitive solutions that are feasible based on larger hydraulic simulation time steps of 30 or 60 minutes. Two such solutions, i.e. Solutions 3 and 4, from different optimization runs, are included for completeness but not discussed in detail. All solutions presented are cheaper than the cheapest feasible solution reported in the literature to date with a total cost of $10.59 million (Prasad 2010). Solutions 1, 2 and 4 achieved the lowest tank costs compared to previous published solutions. The least cost solution achieved (Solution 1) has a total cost of $10.31 million. New pumps were not required and a single new tank was added.

Maximum velocity constraints were not considered herein as they are not in the original problem specification (Walski 1987). However, all solutions presented have average day flows with velocities less than 2 m/s (Prasad 2010). Take Solution 2 for instance whose maximum flow velocity is 1.51 m/s in pipe 4 at the 10th hour when the demand factor of 1.3 is the highest. Pipe 4 connects node 1 to the pumping station.

Pipe upgrading and rehabilitation details for the best two solutions are summarized in Table 4. The values of the pipe diameters in Table 4 have been converted from inches to metres; the pipe sizes are in fact discrete. The results generally appear to suggest that pipe paralleling (PP) is preferred to cleaning and lining (CL). In each of Solutions 1 and 2, only one pipe was selected for cleaning and lining (Table 4).

Table 4 Pipe upgrading and rehabilitation results

Solution 1 is the cheapest solution achieved at a total cost of $10.31 million. Unlike most previous solutions in the literature with two tanks (e.g. Prasad 2010), a single new tank was added at node 7 (Tank 7(N) hereafter). No new pumps were added to the pumping station. One of the three existing pumps operates when demands are high from 9 a.m. to 6 p.m. i.e. nine hours and the remaining two operate for the entire 24 hours. The pumps collectively use 18,733.5 kWh of energy per day (Table 5).

Table 5 Properties of the new tanks

The pump operation strategy achieved is somewhat different from other solutions published in the literature (e.g. Walters et al. 1999) where the third pump is usually switched on during the low demand period to re-fill the tanks. Herein, for Solution 1, additional flow during the peak demand hours is supplied by both the new tank and the third pump. Probably this could be the reason why the algorithm only allocated one new tank instead of two as in some of the previous solutions (Prasad 2010; Vamvakeridou-Lyroudia et al. 2005; Murphy et al. 1994).

Table 5 provides details of the new tank. Figure 2a shows the tank operating levels over a cycle of 24 hours for the average day flow. All tanks refill fully by the end of the day. The new Tank 7(N) and existing Tank 41 (Tank 41(E) hereafter) drain rapidly (approximately 6 and 3 hours respectively). The water level in existing Tank 42 (Tank 42(E) hereafter) fluctuates and only approximately 40 % of the total operational volume is utilised. For comparison purposes, the tank operations from Prasad (2010) are presented also in Fig. 2c. It is evident that the capacity of Tank 42(E) is not fully utilised. Similarly, the best solution in Vamvakeridou-Lyroudia et al. (2005) has this weakness for the average day and two fire flows. The above-mentioned solutions including Solution 1 are thus undesirable from the standpoint of tank operation and water quality.

Fig. 2
figure 2

Water levels in the tanks for the average day flow. (E) and (N) refer to existing and new tanks respectively

Solution 2 costing $10.41 million was the cheapest feasible solution achieved with the enhanced performance function in Eq. 6. One new tank was added at node 6, i.e. Tank 6(N). No new pumps were added; the pumping cost is slightly higher than Solution 1. All three pumps operate from 6 a.m. to 3 p.m. i.e. nine hours when the demands are high. Only two pumps are required for the rest of the day (Table 6). Costs for the new tank and pipe rehabilitation are similar to Solution 1.

Table 6 Optimized pumping schedules

Figure 2b shows the operation cycles of the tanks for the average-day flow for Solution 2. The available operational volumes for all three tanks are utilised effectively. Indeed the water level in existing Tank 42(E) reaches a minimum level of 0.21 m above the minimum operating level. This shows that the proposed new formulation in Eq. 6 with the tank depletion criterion is advantageous. The Appendix shows the tank operations for the three fire flows. The total emergency storage provided by the tanks satisfies the fire flows. Existing Tanks 41(E) and 42(E) drain fully in each case. The new Tank 6(N) reaches a maximum depletion of approximately 90 % at the end of Fire-flow 2.

Table 7 provides the values of the pressures at the most critical nodes for the various loading conditions and shows that all the solutions satisfy the pressure requirements in full. Results presented herein are based on steady state and extended period simulations (with a hydraulic time step of one minute unless otherwise stated) performed with EPANET-PDX. Also, the solutions were re-checked in EPANET 2 that gave the same results. For the results verification, a 24 hour extended-period simulation with a hydraulic time step of one minute took 0.843 seconds in EPANET-PDX on the PC.

Table 7 Minimum pressures for the various loading conditions

Over the 24 hour cycle (with hydraulic time step of one minute) the pumps of Solution 1 operate consistently near their best efficiency point of 65 % and do not drop below 60.5 %. For Solution 2, the efficiency of the three pumps varies between 60 and 65 % throughout the 24 hour operating cycle. Aside from the ON-OFF status variables for pump scheduling in Table 1, no extraneous operational constraints were applied to the pumps. For example, Prasad (2010) and Vamvakeridou-Lyroudia et al. (2005) used a constraint on the pump operational capacity to meet daily demand variation.

Finally, in terms of water quality, Fig. 3 shows the two new tanks 6(N) and 7(N) are generally comparable. On the other hand water quality is better in the existing Tank 42(E) in Solution 2 due to the criterion that promotes tank depletion. Also Fig. 4 suggests that in Solution 1 the water quality at node 7 would have the largest deviation from the average among all the demand nodes in both Solutions 1 and 2. These results seem to suggest that overall, considering both the tanks and demand nodes, Solution 2 is better. However, without additional investigations it is difficult to say whether the improved formulation of the performance function π i in Eq. 6 is the most dominant factor. Figure 5 shows the depletion of the emergency storage for Solution 2, for the three fire-fighting flow scenarios.

Fig. 3
figure 3

Water quality variations in the tanks

Fig. 4
figure 4

Water quality variations at the demand nodes over 24 hours

Fig. 5
figure 5

Emergency storage depletion in Solution 2 during the fire flows. Tanks 41 and 42 are existing tanks; Tank 6 is new

7 Conclusions

This article concerns the development and application of a new multi-objective evolutionary optimization approach for the design and upgrading of water distribution systems with multiple pumps and service reservoirs. The tank siting and design methodology was based on pressure-driven extended-period simulation and was shown to be highly effective and the performance of the pumps was consistently efficient. Explicit criteria for the depletion and replenishment of the service reservoirs were included in the optimization model. The optimization procedure developed achieved many optimal and near optimal solutions consistently when applied to the benchmark Anytown water distribution network. The new best solutions found for the Anytown network were competitive and fully feasible. The advantages of the proposed new approach are that it is practical and generic.