A SIMPLIFIED SANITARY SEWER SYSTEM GENERATOR FOR EXPLORATORY MODELLING AT CITY-SCALE

Future climatic, demographic, technological, urban and socio-economic challenges call for more flexible and sustainable wastewater infrastructure systems. Exploratory modelling can help to investigate the consequences of these developments on the infrastructure. In order to explore large numbers of adaptation strategies, we need to re-balance the degree of realism of the sewer network and its ability to reflect key performance characteristics against the model’s parsimony and computational efficiency. We present a spatially explicit algorithm for creating sanitary sewer networks that realistically represent key characteristics of a real system. Basic topographic, demographic and urban characteristics are abstracted into a square grid of ‘Blocks’, which are the foundation for the sewer network’s topology delineation. We compare three different pipe dimensioning approaches and find a good balance between detail and computational efficiency. With a basic hydraulic performance assessment, we demonstrate that we attain a computationally efficient and high-fidelity wastewater sewer network with adequate hydraulic performance. A spatial resolution of 250 m Block size in combination with a sequential Pipe-by-Pipe (PBP) design algorithm provides a sound trade-off between computational time and fidelity of relevant structural and hydraulic properties for exploratory modelling. We can generate a simplified sewer network


Introduction
Wastewater infrastructure systems face several challenges in meeting their future requirements. The applicability and sustainability of conventional centralised sewer systems for wastewater management are increasingly questioned (Wong and Brown 2009, Hering et al. 2012, Marlow et al. 2013, Larsen et al. 2016, Bakhshipour et al. 2019). Aging or underperforming infrastructure and predicted climatic, demographic, technological, urban and socio-economic developments call for an adaptation of current urban wastewater management systems, considering future requirements and challenges related to service provision (Baron et al. 2016, Larsen et al. 2016). This adaptation involves not only technical aspects, but also resource efficiency, resilience and sustainability. Recent research suggests a transition towards more decentralised schemes (e.g., on-site, cluster or community-level infrastructure for treating, dispersing or reusing wastewater at or near its source) that align better with development scenarios (Kaufmann Alves 2013, Larsen et al. 2013, Baron et al. 2016, Bakhshipour et al. 2019, Hoffmann et al. 2020, Elmqvist et al. 2021) and changing goals (Hering et al. 2012). Their 'degree of centralisation', defined as the ratio of sinks and sources (Eggimann et al. 2015), can vary from fully centralised (e.g., all sources are connected to one centralised wastewater treatment plant) to fully decentralised (every source has its own local sink) (Larsen et al. 2013, Eggimann et al. 2015, Poustie et al. 2015). Considering this additional degree of freedom is one of the grand challenges in current wastewater infrastructure planning.
Three core issues emerge: i) generation of new or novel infrastructures within existing systems, e.g., because existing infrastructures require renewal or due to urban growth, ii) appropriate representation of the spatial characteristics of infrastructure and of its underlying catchment and iii) integration of deep or scenario uncertainty into the planning process.
Several algorithms to generate water infrastructure have been developed and used to model and evaluate different system alternatives. Sitzenfrei et al. (2010a) present the Virtual Infrastructure Benchmarking (VIBe) model to generate ensembles of virtual case studies of entire urban water systems (e.g., water supply, drainage and sewer systems (Urich et al. 2010)) that resemble real systems for testing new measures and generalising theories. Spatial layouts of water infrastructure networks in this framework are generated through graph theoretical algorithms (Sitzenfrei et al. 2010b, Duque et al. 2020) or agent-based models, in which possible/optimal sewer placement is identified (Urich et al. 2010).
Graph theory (Ahuja et al. 1993) is commonly used to represent water networks, for example when solving optimization problems for design, operation or maintenance using heuristics (Sitzenfrei et al. 2010b, Urich et al. 2010, Haghighi and Bakhshipour 2015, Bakhshipour et al. 2017, Moeini and Afshar 2017) or exact algorithms (Newman et al. 2014, Duque et al. 2016, Navin et al. 2019, Duque et al. 2020). Bakhshipour et al. (2019) generate multiple infrastructure layouts and explore different degrees of centralization in both green-fields and existing urban areas. Others considered the impact of the urban characteristics on water infrastructure (Kaufmann Alves 2013, Baron et al. 2016, Bach et al. 2018). Most of these aim to obtain a detailed representation of the sewer network and are therefore often computationally expensive, depending on the size of the network (i.e. number of pipes).
Water infrastructures are closely tied to the urban form and therefore strongly conditioned by spatial characteristics. For example, location, shape and dimensions of sewer infrastructures depend on the density and spatial distribution of water users, but are also influenced by terrain and street layout. Geographic Information Systems (GIS) have been used to represent these spatial parameters, to assess and improve the management and planning of urban drainage networks, and to localise different elements of the system (Leitão et al. 2005, Abbas et al. 2019). A relevant open question in this regard is the required degree of spatial abstraction or resolution for different planning purposes, especially in light of considerable uncertainties of future developments. Lempert et al. (2003) coined the term 'deep uncertainty' to describe situations where "we know absolutely nothing about probability distributions and little more about the possible outcomes", as is the case concerning future water use, population growth, and urbanisation. One way to consider such uncertainties is to explore the consequences of system alternatives under different scenarios in a long-term planning process. Such an approach has been applied to explore potential urban drainage transition pathways (Baron et al. 2015), for layout and size optimisation of sewer networks (Duque et al. 2016, Bakhshipour et al. 2019, Moeini and Afshar 2019, Turan et al. 2019, Duque et al. 2020, Zaheri et al. 2020), operation and maintenance assessment (Marzouk and Omar 2013, Petit-Boix et al. 2015), virtual case study generation of urban water infrastructure (Urich et al. 2010), quantification of decentralised water management opportunities, risk and vulnerability assessment (Sitzenfrei et al. 2011, Meijer et al. 2018), among others.
DAnCE4Water (Dynamic Adaptation for enabling City Evolution for Water) is an example of an integrated planning support model that simulates the development of water infrastructure over long time periods considering urban and societal dynamics. The model uses multi-criteria assessment to evaluate numerous stormwater management options at various spatial scales. Given its size and scale, DAnCE4Water requires significant data input to its sub-models. Baron et al. (2017) also present a highly detailed model, SinOptikom, to model transitions of combined sewer networks towards source-separated wastewater systems with decentralised treatment over a 50-year period. Integer linear programming is used to optimize the transition strategy using an objective function with eight objectives. This approach was applied to several small, rural villages in Germany with 500-800 inhabitants (approximately 4 km network length). SinOptikom is computationally hard to scale and details on the modelling approach and its implementation are not publicly accessible (Baron et al. 2016).
From a long-term, city-scale planning perspective, the above-mentioned approaches suffer from high complexity. A zeal for close representation of the real system, together with the integrated modelling of physical processes, seems to have led to detailed spatial models that require vast amounts of data and computational resources even for small water infrastructure systems.
We need models that allow us to analyse large numbers of sewer infrastructure alternatives and future scenarios for robust decision making and options exploration at larger spatial scales than what is currently available (Davis et al. 2007). Therefore, the generation of sewer networks with different levels of spatial abstraction is necessary to 1) create appropriate sewer models for the modelling task (e.g. more detailed hydraulic performance evaluation vs. city-scale urban drainage planning) and 2) allow the comparison of multiple alternatives, while maintaining the integrity of information about the urban characteristics and associated water infrastructure. This paper presents a sewer infrastructure generator for exploratory modelling in wastewater system planning, focussed on sanitary sewers (i.e., disregarding stormwater). The automatic generation of representative sewer networks is essential to test the impact of changes in the urban form. We developed a spatially explicit algorithm for creating approximate sewer network topologies that realistically represent the main properties of a real system, based on topographic, demographic and urban characteristics. To explore different degrees of realism, levels of abstraction and computational efficiency, two different methodologies for generating virtual wastewater sewer networks are tested and compared to the existing infrastructure. A third, highly abstract (non-spatial) algorithm is used to compare the physical network characteristics and the computational effort required to generate such an abstract network. These three approaches are contrasted to better understand the trade-offs of different simplified representations in comparison to models using topology and dimensions of real sewer infrastructure. We also evaluate the hydraulic performance of the most promising network generation approach against the expected performance of the real system.

Methodology
Following good modelling practice, we first define the specific purpose and context before determining model structure and parameters and ultimately evaluating how well a model fulfils its purpose (Jakeman et al. 2006, Schmolke et al. 2010). Our aspiration is a sewer network generator for exploratory analysis of topological and loading changes of sewer systems for long-term planning (i.e. >10 years horizon) at city-scale (typically several hundreds to thousands of km of pipes). Ideally, we would like to use it interactively with stakeholders, where only the most relevant out of thousands of candidate solutions would be selected and evaluated. This would require the generation of hydraulically feasible designs and the computation of key hydraulic performance indicators for each relevant solution with high computational efficiency. The sewer topology should fulfil requirements of pertinent planning/dimensioning norms and be able to approximate the real system at district level, while reflecting common constraints imposed by spatial characteristics such as topography, the location of major roads, surface water bodies or parklands. Furthermore, the pipe diameter distributions should not diverge substantially from that of a real infrastructure system, so that aggregate infrastructure cost and construction efforts can be estimated. Key hydraulic performance characteristics under typical flow as well as high and low flow conditions should be reflected appropriately. In summary, we aim to develop a sewer network generator that is significantly faster, but also coarser, than the existing approaches reviewed in the introduction, while still respecting the spatial characteristics. Figure 1 presents our proposed approach to achieve this. We use basic spatial information on population, land use and topography to create a spatial abstraction of the urban characteristics using the Planning-Support Model UrbanBEATS (Bach et al. 2018).
Urban characteristics are abstracted into a square grid of Blocks (as defined in section 2.1) in order to determine the need for wastewater management depending on the land use area. This information is then used by the sewer infrastructure generator to produce a network topology. These network designs are compared against a fully abstract, spatially non-explicit approach known as the Urban Water Infrastructure Model (UWIM) (Maurer et al. 2013) and the real network. Since we want to sufficiently represent the characteristics of the real network, we compared the physical characteristics concerning topology, network connectivity, and pipe diameter frequency distributions of the real system and the generated system at three spatial resolutions. Finally, a basic hydraulic performance assessment was done for the algorithms that fulfilled the above-mentioned requirements for our envisaged modelling purpose.

Input data and abstraction of urban characteristics
To generate a sewer network, we require information on the (a) land use, (b) population and (c) elevation, which define the urban environment, to calculate wastewater volumes based on local planning regulations (e.g. average housing occupancy, water demands). To support this process, we use the UrbanBEATS Planning-Support Model as a starting point. UrbanBEATS is a spatial model designed for planning decentralised stormwater management solutions in urban catchments (Bach et al. 2018). It processes GIS inputs of land use, population and elevation among other data sets to produce a gridded map of cells known as Blocks. Each Block is a grid cell containing a geodatabase of urban characteristics for its corresponding geographic location and represents an abstraction of the urban environment. The Block size is flexible and depends on the modelling aims, and the model is capable of simulating urban areas of different sizes (it has been tested for areas ranging from 5 km² to 1200 km²).
We selected UrbanBEATS as a suitable starting point for aggregating all relevant spatial information to undertake our sewer network generation. The model considers the variability in land uses in a spatially explicit manner at a reasonably fine spatial scale. Additionally, the spatial resolution in UrbanBEATS allows us to design an abstract network that can encompass a broad range of pipe diameters and spatial scales.
To set up our input layer for sewer network generation, we proceed in three steps: (1) Delineation of Blocks, (2) Abstraction of the urban form and (3) Spatial mapping of wastewater generation.

Delineation of Blocks
UrbanBEATS processes three essential spatial input maps of (a) land use (using a water-centric classification from Bach et al., 2015b), (b) population and (c) elevation (see Figure 2a-c). A grid of Blocks containing the essential information to define the water infrastructure networks and technologies as well as their interconnectivity amongst the adjacent Blocks is delineated for the spatial extents of these maps as described by Bach et al. (2018). Blocks represent the smallest spatially explicit unit within the model in which input data is aggregated. Block size is user-defined and influences the level of abstraction of urban characteristics. In this study, we use a 250 m by 250 m square Block size for the abstraction since we want to generate not only the trunk mains of the sewer network but also the collection sewers. This size provides us with an extensive sewer network without excessively aggregating the spatial data. However, we also test the sensitivity of our algorithm to larger square Blocks with 500 m and 1000 m sides to better understand the impact of resolution.

Abstraction of the urban form and mapping of wastewater generation
The available higher-resolution input data is aggregated to the Block level. Note that each Block retains information about the mix of land uses while total population and average elevation are determined by overlaying the grid onto the input rasters. UrbanBEATS then uses standard parameters, e.g., from statutory planning guidelines and other relevant urban planning documents, to create an abstraction of the urban form based on the aggregated input data. This involves a procedural algorithm that systematically identifies characteristics for each land use such as the size and geometric form of residential allotments, impervious fractions, breakdown of land covers in industrial and commercial zones among others.
Employment is determined from population data and non-residential land uses. For a full description, we refer the reader to Bach et al. (2018). To determine the wastewater volumes generated, we use an end-use analysis that considers the frequency and duration of different household water demands (e.g. shower, toilet, kitchen, laundry) and multiplies these with flow rates for the various indoor household appliances. For this study, we applied water efficiency ratings from the Australian Standards AS6400:2016 (Standards Australia, 2016). Non-residential wastewater volumes are determined based on unit flow rates per capita or floor space. The calculation is carried out on a Block-by-Block basis. Other than irrigation, all end uses are assumed to discharge into the sewer system. Output values are annual volumes, but can be downscaled using appropriate diurnal and seasonal patterns or modified to reflect peak discharge for the subsequent hydraulic design.
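As a rough illustration of the end-use calculation described above, the following Python sketch multiplies frequency, duration and flow rate per end use and scales the result to an annual Block volume. The end uses, their values and the function name are hypothetical placeholders chosen for illustration only; they are not taken from AS6400:2016 or from UrbanBEATS.

```python
# Hypothetical end uses (illustrative values only, not from AS6400:2016):
# (events per person per day, minutes per event, litres per minute)
END_USES = {
    "shower": (1.0, 7.0, 9.0),
    "kitchen tap": (4.0, 1.0, 6.0),
}


def block_wastewater_m3_per_year(population, end_uses=END_USES):
    """Annual residential wastewater volume [m3] of one Block.

    Each end use contributes frequency * duration * flow rate per person
    and day; all listed end uses are assumed to drain to the sewer.
    """
    litres_per_person_day = sum(f * t * q for f, t, q in end_uses.values())
    return population * litres_per_person_day * 365 / 1000.0


# Example: a Block with 500 residents.
annual_volume = block_wastewater_m3_per_year(500)
print(annual_volume)
```

In this sketch irrigation is simply left out of the dictionary, mirroring the assumption that it does not reach the sewer.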

Figure 2. Abstraction of the urban characteristics based on (a) topography, (b) land use and (c) population density maps into a coarse grid representation using Blocks, representing (d) elevation, (e) employment and population density and (f) wastewater discharge at the peak hour. Note that in this particular case study, the highest elevation is located in the north-east part of the catchment and the terrain falls towards the coastline.

Sewer topology and hydraulic design
Sewer network topology delineation is intended to follow basic urban characteristics. It should generate a network from scratch in new urban developments or represent an existing network when the cadastre is not available. Other methods for simplification (e.g. skeletonization) of an existing network could be used if all the data were known (Grayman and Rhee 2000, Cantone et al. 2008). In this study we focus on the methodology for topology delineation based on the abstraction of the urban characteristics into a Blocks map.
The Blocks map is used by the infrastructure generator, assuming that sewer connections are required in areas where people live or work. The spatial distribution of population and employment defines the areas needing wastewater management. Thereafter, two key procedures are run: (1) Generation of sewer network topology (section 2.2.1), followed by (2) Hydraulic design of sewer pipes (section 2.2.2).

Generation of sewer network topology
Based on the abstracted data of the city, the sewer network topology is generated through the following steps: (1) identify existing wastewater treatment plants, main roads, and water bodies; (2) correct terrain depressions found after the abstraction of the topography; (3) delineate sewer topology based on gravity-driven flow paths influenced by the infrastructure identified in step 1 and (4) generate the trunk sewers to connect all local outlets.
To start, we define a final outfall at the spatial system boundary (typically representing a wastewater treatment plant or a trunk main leading thereto) and use spatial features such as main roads and rivers to guide the sewer delineation. The modeller can determine whether or not to use certain types of streets as an attractor for the sewer delineation. Since Blocks are the lowest abstraction level, the nodes of the flow paths are defined by each Block's centroid, with an elevation equal to the average elevation of the Block. Pipes within the Blocks are not considered. Topology delineation takes place over the entire area that requires wastewater management. Flow paths are delineated using the D8 algorithm (O'Callaghan and Mark 1984) assuming gravity-driven flow, following the steepest slope. This algorithm determines the largest elevation drop among each Block's eight neighbours and saves that neighbour as its downstream Block. The topography of the study area is pre-processed to avoid the presence of internal depressions and flat areas by using a carving method (Sedgewick 1988), which is explained in more detail in appendix A1.1.
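The D8 step can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the study: it assumes the Blocks are stored as a nested list of average elevations, it ranks neighbours by slope (elevation drop divided by centroid distance, so diagonal neighbours are farther away), and it omits the road and river attractors and the carving pre-processing. The function name and the `None` convention for Blocks without wastewater demand are our own.

```python
import math

# Row/column offsets to the eight neighbours of a Block.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]


def d8_downstream(elev, block_size=250.0):
    """Map each Block (row, col) to its steepest-descent neighbour.

    elev holds the average Block elevations [m]; None marks Blocks that
    need no wastewater management. Blocks without a lower neighbour map
    to None, i.e. they act as local outlets.
    """
    rows, cols = len(elev), len(elev[0])
    downstream = {}
    for r in range(rows):
        for c in range(cols):
            if elev[r][c] is None:
                continue
            best, best_slope = None, 0.0
            for dr, dc in NEIGHBOURS:
                r2, c2 = r + dr, c + dc
                if not (0 <= r2 < rows and 0 <= c2 < cols):
                    continue
                if elev[r2][c2] is None:
                    continue
                dist = block_size * math.hypot(dr, dc)
                slope = (elev[r][c] - elev[r2][c2]) / dist
                if slope > best_slope:
                    best, best_slope = (r2, c2), slope
            downstream[(r, c)] = best
    return downstream


# Example: a 3 x 3 grid that drains towards the lower-right corner.
grid = [[10.0, 9.0, 8.0],
        [9.0, 7.0, 6.0],
        [8.0, 6.0, 3.0]]
flow_to = d8_downstream(grid)
print(flow_to[(0, 0)], flow_to[(2, 2)])
```

The Blocks that map to `None` are the local outlets that the trunk sewers subsequently have to connect.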
Local depressions that cannot be overcome with the pit removal method normally correspond to the lowest Blocks either at the edge of the case study or natural depressions (e.g. rivers or lakes). These Blocks are then the outlets of the generated sewer (sub-)catchment. To ensure that all these catchments drain to a defined wastewater treatment facility, we subsequently generate the trunk sewers. This is done by connecting the catchment sinks using a minimum spanning tree (Ahuja et al. 1993) algorithm to get the minimum pipe length of such a trunk system. The Minimum Spanning Tree to delineate the trunk sewers considers the location of outlets of the different sub-catchments and the wastewater treatment plant, as well as the elevation of these points. So far, we allow for 'negative slopes' (i.e. against gravity), which means that pumping is allowed. We also allow crossings of water bodies, considering that we do not necessarily want to have the trunk sewers being delineated along the rivers.
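A minimal sketch of the trunk sewer delineation, assuming Prim's algorithm on the complete graph of outlet points with planimetric Euclidean distance as edge weight. The study also considers the elevation of these points, which this sketch ignores for brevity; the coordinates and the function name are hypothetical.

```python
import math


def trunk_mst(points):
    """Prim's algorithm on the complete graph spanned by points.

    points is a list of (x, y) coordinates [m]; index 0 is taken to be the
    wastewater treatment plant. Returns the tree as a list of (i, j) index
    pairs in the order in which they were added.
    """
    in_tree = {0}
    edges = []
    while len(in_tree) < len(points):
        best = None
        for i in in_tree:
            for j in range(len(points)):
                if j in in_tree:
                    continue
                d = math.dist(points[i], points[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges


# Example: treatment plant at the origin and three sub-catchment outlets.
outlets = [(0.0, 0.0), (1000.0, 0.0), (1000.0, 750.0), (3000.0, 0.0)]
trunk = trunk_mst(outlets)
trunk_length = sum(math.dist(outlets[i], outlets[j]) for i, j in trunk)
print(trunk, trunk_length)
```

Because the tree ignores elevation, some of its edges may run uphill, which corresponds to the pumped 'negative slopes' allowed in the text.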

Hydraulic Design of Sewer Pipes
This part of the model builds on the generated sewer topology by assigning a diameter and upstream and downstream invert elevations to each pipe generated in the previous section. We apply and compare two distinct dimensioning algorithms to size the generated gravity sewer network: (a) the Sewer Network Design (SND) algorithm (Duque et al. 2016), which gives a cost-optimal hydraulic design after an exhaustive search over all the feasible design combinations, and (b) the Pipe-by-Pipe (PBP) method, which returns the first feasible hydraulic design obtained when designing one pipe at a time, starting from the extremes of the network towards the outfall and choosing the minimum pipe diameter capable of transporting the design flow. These contrasting approaches were chosen to identify which general method is suitable to represent a realistic system vis-à-vis our modelling purpose.
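The sequential idea behind approach (b) can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the slope is fixed, capacity is checked with Manning's formula for a full circular pipe only, and velocity, filling ratio and excavation limits are not checked. The diameter set, roughness, slope and function names are hypothetical.

```python
import math

DIAMETERS = [0.2, 0.3, 0.4, 0.5, 0.6, 0.8, 1.0]  # hypothetical set [m]


def full_pipe_capacity(d, slope, n=0.013):
    """Manning discharge [m3/s] of a full circular pipe of diameter d [m]."""
    area = math.pi * d ** 2 / 4.0
    radius = d / 4.0  # hydraulic radius of a full circular pipe
    return area * radius ** (2.0 / 3.0) * math.sqrt(slope) / n


def pipe_by_pipe(downstream, inflow, slope=0.005):
    """Size the pipe leaving each node, working from the extremes inwards.

    downstream maps node -> next node (None at the outfall); inflow maps
    node -> local design flow [m3/s]. Returns node -> (flow, diameter).
    """
    pending = {node: 0 for node in downstream}  # undesigned upstream pipes
    for nxt in downstream.values():
        if nxt is not None:
            pending[nxt] += 1
    flow = dict(inflow)
    min_d = {node: DIAMETERS[0] for node in downstream}
    ready = [node for node, k in pending.items() if k == 0]
    design = {}
    while ready:
        node = ready.pop()
        nxt = downstream[node]
        if nxt is None:
            continue  # the outfall has no outgoing pipe
        # Smallest diameter that is not below any upstream diameter and
        # carries the accumulated design flow.
        d = next(d for d in DIAMETERS
                 if d >= min_d[node] and full_pipe_capacity(d, slope) >= flow[node])
        design[node] = (flow[node], d)
        flow[nxt] += flow[node]
        min_d[nxt] = max(min_d[nxt], d)
        pending[nxt] -= 1
        if pending[nxt] == 0:  # wait until all upstream pipes are designed
            ready.append(nxt)
    return design


# Example: two branches (A, B) joining at C, which drains to outfall D.
network = {"A": "C", "B": "C", "C": "D", "D": None}
local_flow = {"A": 0.02, "B": 0.03, "C": 0.01, "D": 0.0}
sizes = pipe_by_pipe(network, local_flow)
print(sizes)
```

The first feasible diameter is accepted for each pipe, so the result is a feasible design and not necessarily a cost-optimal one.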
The calculated average daily wastewater discharge at each Block is used in equation 1 to estimate the maximum daily peak flow. The model does not consider inflow from infiltration.
We consider residential, industrial and commercial wastewater discharges per Block assuming a separate foul sewer network. The maximum daily peak flow is used as the design flow for the hydraulic design of the infrastructure:
$Q_{max} = f_{max} \cdot f_{ww} \cdot Q_{ww}$ (Eq. 1)
where $Q_{max}$ is the wastewater peak or design flow [m³·s⁻¹], $Q_{ww}$ is the average wastewater production [m³·s⁻¹], $f_{max}$ is the peak water use factor and $f_{ww}$ is the wastewater return factor representing the amount of used water that is discharged to the sewer. We use an $f_{max}$ value of 1.2 and an $f_{ww}$ of 0.85, following Butler and Davies (2011).
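A small worked example of the design flow calculation, assuming that equation 1 multiplies the average flow by both factors (our reading of the variable definitions above). With fmax = 1.2 and fww = 0.85 the combined factor is 1.02. The function and argument names are illustrative.

```python
def design_flow(q_ww, f_max=1.2, f_ww=0.85):
    """Peak (design) flow [m3/s] from the average wastewater production.

    Assumes Eq. 1 is the product of the peak factor, the return factor
    and the average flow q_ww [m3/s].
    """
    return f_max * f_ww * q_ww


# Example: a Block producing 10 L/s on average.
q_design = design_flow(0.010)
print(q_design)
```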
The design algorithms for pipe diameter and slope are parameterised to reflect local design regulations and customary hydraulic design constraints. For both dimensioning algorithms, the hydraulic constraints presented in Table 1 are considered during the hydraulic design.
Additional parameters and considerations for the hydraulic design under very steep or shallow terrain slopes, using sewer drops or pumps respectively, are presented in appendix A2.1. The pipe diameter is a discrete variable, which is defined by the regulatory minimum diameter and the set of commercially available or used pipes. We used the set of diameters D = {…}.

Method 1: Sewer Network Design
In the first method, we used the dimensioning algorithm of Duque et al. (2016) to solve the hydraulic design problem (i.e., the diameter-slope combination for each pipe). Duque et al.'s (2016) framework finds the cost-optimal design of a series of sewer pipes using Dynamic Programming. The input data are the network topology, the flow rate for each pipe, a cost function in terms of the diameter and excavation depth, the roughness of the pipe material (Manning or Strickler coefficients) and hydraulic constraints that guarantee proper system operation (see Table 1). The hydraulic design problem is modelled as a Shortest Path Problem (cf. Bellman 1958, Ahuja et al. 1993), where the directed graph reflects the network topology and considers every feasible slope-diameter combination for each pipe. The slope is defined by the discretized invert elevations of the pair of nodes at the two ends of each link, while the diameter of the link is defined by its downstream node. The shortest path algorithm identifies the cost-optimal hydraulic design by selecting the least-cost path. Each pipe has an estimated cost based on its diameter and required excavation volume. The evaluation of every feasible design combination makes this algorithm computationally expensive; however, it solves the problem in polynomial time, which makes it an NP-easy problem (Garey and Johnson 1979).
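To illustrate the shortest-path formulation, the following Python sketch runs a Bellman-style dynamic programme over a single series of pipes. It is not the SND algorithm of Duque et al. (2016): the cost function, the diameter set, the candidate depths and the rule that diameters may not decrease downstream are assumptions made here, and only a full-pipe Manning capacity check stands in for the constraints of Table 1.

```python
import math

DIAMETERS = [0.2, 0.3, 0.4]  # hypothetical commercial diameters [m]
DEPTHS = [1.5, 2.5, 3.5]     # hypothetical invert depths [m], within 1.2-5 m


def capacity(d, slope, n=0.013):
    """Manning discharge [m3/s] of a full circular pipe."""
    return math.pi * d ** 2 / 4 * (d / 4) ** (2 / 3) * math.sqrt(slope) / n


def pipe_cost(d, length, depth_up, depth_down):
    """Hypothetical cost that grows with diameter and mean excavation depth."""
    return length * (100 * d + 20 * (depth_up + depth_down) / 2)


def cheapest_design(ground, lengths, flows):
    """Least-cost design of a series of pipes; pipe i joins node i to i + 1.

    A state is (invert depth at the current node, index of the smallest
    diameter still allowed). Each layer keeps only the cheapest way of
    reaching a state, which is the shortest-path recursion.
    """
    best = {(dep, 0): (0.0, []) for dep in DEPTHS}
    for i, (length, q) in enumerate(zip(lengths, flows)):
        nxt = {}
        for (dep_up, d_min), (cost, design) in best.items():
            z_up = ground[i] - dep_up
            for dep_down in DEPTHS:
                slope = (z_up - (ground[i + 1] - dep_down)) / length
                if slope <= 0:
                    continue  # gravity flow only in this sketch
                for d_idx in range(d_min, len(DIAMETERS)):
                    d = DIAMETERS[d_idx]
                    if capacity(d, slope) < q:
                        continue
                    c = cost + pipe_cost(d, length, dep_up, dep_down)
                    key = (dep_down, d_idx)
                    if key not in nxt or c < nxt[key][0]:
                        nxt[key] = (c, design + [(d, round(slope, 4))])
        best = nxt
    return min(best.values(), key=lambda item: item[0])


# Example: two 250 m pipes on gently falling ground.
best_cost, best_design = cheapest_design(
    ground=[10.0, 9.0, 8.0], lengths=[250.0, 250.0], flows=[0.02, 0.05])
print(best_cost, best_design)
```

The number of states per node is the number of depths times the number of diameters, so the work grows linearly with the number of pipes in the series.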

Method 2: Pipe-by-Pipe design
The Pipe-by-Pipe design (PBP) algorithm designs one pipe at a time, starting from the upstream extremes of the network topology and moving towards the outfall. We calculate the minimum and maximum feasible slopes for the given diameter using Manning's formula in equation 3, for the minimum and maximum velocity (Table 1), assuming a maximum filling ratio of the pipe:
$S = \left( \frac{n \, v}{R^{2/3}} \right)^{2}$ (Eq. 3)
where $S$ [-] is the slope of the pipe, $v$ [m·s⁻¹] is the flow velocity, $n$ [s·m⁻¹ᐟ³] is the Manning roughness coefficient and $R$ [m] is the hydraulic radius. We set initial values at the minimum slope for each pipe and select the first slope that fulfils the hydraulic constraints for the pipe, including the minimum and maximum excavation limits of 1.2 m and 5 m respectively. The real filling ratio for each diameter-slope combination, given the design flow, is recalculated before the hydraulic constraints are evaluated. Once a feasible discrete diameter is determined, we continue downstream and design the next pipe. When two or more pipes converge to one node, the algorithm waits until all the upstream pipes have been designed. This ensures connectivity between pipes. Each pipe starts at the depth of the deepest pipe arriving at its upstream node. Diameters increase as we proceed downstream, implying that the minimum starting diameter for each pipe is that of its upstream neighbour. Once we reach the network outlet (i.e., the sink), a feasible hydraulic design is found. This method thus only proposes a feasible hydraulic design that complies with the constraints in Table 1, without seeking an optimal design.

Model Testing and Application
UWIM's input data comprise housing density, total population, amount of wastewater to drain and catchment area to calculate the total length and diameter distribution of sewer pipes. It divides the total catchment area into a rectangular grid and uses three model parameters to characterise the catchment's geometric shape and urban form. The parameter f1 [-] is the shape factor of the catchment area, f2 [-] is the shape factor of housing plots and f3 [-] is the fraction of secondary lines above the main trunk. They determine the height and width of the cells that subdivide the catchment. The number and length of pipes are directly affected by the dimensions of the cells, making the diameter distribution very sensitive to these parameters. Here, f1 = 60, f2 = 60 and f3 = 0.6 are chosen, calibrated to the cell area to obtain sub-catchments comparable to those obtained when using a 250 m Block size to generate the topology.
This simple model can represent key parameters of real sewer systems realistically without any explicit spatial design. We use this dimensioning algorithm as a benchmark for our abstraction process. UWIM is expected to be computationally efficient, but also the coarsest and most abstract representation of the sewer system available.

Case Study Description
We selected a region within inner metropolitan Melbourne, Australia (Figure 2). The spatial input maps (Bach et al. 2015) were converted to a 10 m x 10 m raster for UrbanBEATS input (Figure 2a-c). Additionally, to aid the delineation and network generation, main roads and rivers were also obtained in ESRI Shapefile format (DataVic 2020). To allow comparison with the real sewer infrastructure, we obtained a sewer assets database from the water authority Melbourne Water. The baseline used for the assessment was 2010, with land use and population data from data sets around this time period.
The abstraction obtained from UrbanBEATS was created not only for a 250 m Block size, but also for 500 m and 1000 m to test the sensitivity of our sewer generation algorithm. Outputs for a 250 m Block size are shown in Figure 2 (d-f). We can see from the employment map that the city's business activity is concentrated in the centre of the case study and north of the port in the west. The darkest parts reflect the Central Business District and the St. Kilda region, representing the highest office density in this case study. Another hotspot with high employee density is the port, dominated by heavy industry and trade. The population density follows the residential land use and shows a high concentration of residents along the coast to the south and across the hills. Areas without employment or population are either green spaces or undeveloped areas, which do not need wastewater management.
Figure 2 (f) shows the wastewater discharge per Block that is used as inflow for the wastewater sewer network. It is calculated based on the water consumption per Block, considering the land use mix. Areas with higher discharge are those with high density residential, commercial and industrial areas.

Validation and Performance Criteria
We compare the physical characteristics and hydraulic performance from the generated networks obtained through the SND, PBP and UWIM algorithms against the real network.
Since we sacrifice accuracy to reduce computational times, we highlight the trade-offs between more accurate physical characteristics and the computational effort required to design a wastewater sewer network. As outlined at the beginning of this section, an acceptable generated sewer network would have total pipe lengths and dimensions in the same order of magnitude as the real network and would suitably reproduce its basic hydraulic performance.

Representation of physical components versus computational effort
A validation process was carried out to compare the generated systems with the existing sewer system. We assess how faithfully the generated networks represent the real sewer network by comparing the sewer network topology and the pipe diameter distributions qualitatively as well as quantitatively, using the Kullback-Leibler (KL) divergence $D_{KL}$ for discrete probability distributions in equation 4 (Kullback 1959):
$D_{KL}(P \,\|\, Q) = \sum_{d} P(d) \, \log \frac{P(d)}{Q(d)}$ (Eq. 4)
The KL-divergence compares the probability of occurrence of each observation between two probability distributions. Here, the results from the dimensioning algorithms are translated into frequency distributions of the pipe diameters, where each diameter is considered an observation when calculating $D_{KL}$. $P(d)$ and $Q(d)$ represent the two probability distributions of network pipe diameters ($d$) to compare. An ideal KL-divergence value is 0, i.e. both distributions are equal. This measure is asymmetric, i.e. $D_{KL}(P \,\|\, Q)$ generally differs from $D_{KL}(Q \,\|\, P)$, and it does not require or favour any specific probability distribution.
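The KL-divergence between two pipe diameter distributions can be computed as in the following Python sketch. The natural logarithm is used, the symbols P and Q and the function names are our own, and the small `eps` that guards against diameter classes missing from the second distribution is an assumption of this sketch, not part of the study.

```python
import math
from collections import Counter


def diameter_distribution(diameters, classes):
    """Relative frequency of each diameter class in a list of pipe diameters."""
    counts = Counter(diameters)
    return [counts[c] / len(diameters) for c in classes]


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) for two discrete distributions over the same classes.

    Classes with zero probability in P contribute nothing; eps avoids a
    division by zero where Q has an empty class (an assumption made here).
    """
    return sum(pi * math.log(pi / max(qi, eps))
               for pi, qi in zip(p, q) if pi > 0)


# Example with two diameter classes.
p = [0.5, 0.5]
q = [0.25, 0.75]
print(kl_divergence(p, q), kl_divergence(q, p))
```

Swapping the two arguments changes the result, so the direction of the comparison should be stated whenever values are reported.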
We also evaluate the computational effort required by the three different dimensioning approaches. We compare for SND and PBP the impact of the spatial resolutions of the Blocks during the topology delineation on the computational times. Likewise, we compare the computational time used by UWIM, whose geometry parameters as specified in section 2.3.1 roughly reflect a 250 m resolution Block.

Hydraulic performance
Both the real and the generated network topologies were dimensioned with the PBP algorithm under the same hydraulic parameters and conditions and modelled in EPA's Storm Water Management Model (SWMM) (Rossman 2010) to compare their basic hydraulic characteristics. Table 2 presents the hydraulic performance indicators used and the relevant flow conditions for their computation. Detailed water demand patterns and other modelling assumptions are presented in appendix A3.

Wastewater infrastructure generator
There is a trade-off between the spatial resolution determining the level of detail of generated sewer infrastructure and the computational effort required to generate and evaluate it.
Smaller Block sizes give more detail for the delineation of the urban characteristics and of the collection sewers between the Blocks, while increasing the number of pipes needed to represent the sewer infrastructure for the same urban area (see appendix A1.2). Table 3 compares the results of the generated sewer network topology delineation for the areas of inner Melbourne to the real network topology. The generated topology includes the collection and trunk sewers, taking the wastewater to the discharge point in the west of the city. There are similarities in the location of trunk sewers along the coastline in the south, but also key differences in other areas of the city. Topological differences between the collection sewers of the real network and the generated one are clearly evident in Figure 3. The generated network produces fewer but longer pipes. Since the pipes are generated from centroid to centroid of the Blocks, each Block can only be connected to one of its eight neighbours.
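The eight-neighbour constraint can be sketched as follows. This is a minimal illustration, assuming a regular grid indexed by row and column and a 250 m Block size; the names `NEIGHBOUR_OFFSETS` and `neighbour_links` are hypothetical and the rule for choosing among the candidate neighbours is deliberately left out.

```python
import math

BLOCK_SIZE_M = 250.0  # Block edge length in metres (assumed for this example)

# Row/column offsets of the eight Blocks surrounding a Block.
NEIGHBOUR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1),
    (0, -1),           (0, 1),
    (1, -1),  (1, 0),  (1, 1),
]


def neighbour_links(row, col, n_rows, n_cols, block_size=BLOCK_SIZE_M):
    """Candidate centroid-to-centroid pipes from Block (row, col).

    Returns a list of ((neighbour_row, neighbour_col), pipe_length_m),
    skipping neighbours that fall outside the grid.
    """
    links = []
    for d_row, d_col in NEIGHBOUR_OFFSETS:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            # Orthogonal links span one Block size, diagonal links sqrt(2) times that.
            length = block_size * math.hypot(d_row, d_col)
            links.append(((r, c), length))
    return links


centre_links = neighbour_links(1, 1, n_rows=3, n_cols=3)
corner_links = neighbour_links(0, 0, n_rows=3, n_cols=3)
```

Under these assumptions every generated pipe is either one Block size or about 1.41 Block sizes long, which is consistent with the generated network having fewer but longer pipes than the real one.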

Hydraulic design
Pipe diameter frequency distributions for the real and generated networks using the three design methodologies are shown in Figure 4. This distribution gives a good first impression of the dimensions of the infrastructure produced by the different dimensioning algorithms. It allows for a quick comparison of the proportion of the different pipe diameters in a sewer network based on the shape of the distributions. In addition, considering the level of abstraction of UWIM, it is the most intuitive way to compare the three dimensioning algorithms (SND, PBP and UWIM) against the real network. It shows that 80% of the real network is composed of the smallest-diameter pipes and around 1% consists of trunk sewers with pipe diameters greater than or equal to 1 m.
As apparent from Figure 4, SND and PBP match the real network's diameter distribution closely, whereas UWIM does not. SND obtains a higher frequency of pipes with a diameter of 2.5 m in comparison to PBP, while PBP has a higher frequency of the smallest diameter. Although UWIM is similarly able to represent the proportion of smallest-diameter sewers, it appears to omit small- to medium-diameter pipes and returns much larger proportions of diameters between 0.8 m and 1.5 m. Table 4 shows how the three generated sewer networks differ in terms of the fidelity of representing the physical components, using the KL-divergence measure, and in computational effort. The computational effort (CPU time) required by SND is orders of magnitude higher than for the other two algorithms and also increases strongly with decreasing Block size.

Representation of physical components versus computational effort
UrbanBEATS spatial abstraction and topology delineation took about 17 s on a standard computer (Intel Core i7-8550U CPU at 1.80 GHz, 4 cores, 16 GB RAM). As such, it was not critical for the total computation time. This time must be added to the time used by SND and PBP to dimension the pipes. Excluding topology delineation, SND took around 2 181 s (36 min) to solve a 763-pipe network design with 250 m Block size, while the PBP algorithm took less than 1 s. The UWIM computational time accounts only for the dimensioning of pipes, which took just 1.5 s and does not grow much with the size of the catchment area.
These stark differences are not apparent in the fidelity as reflected by the KL-divergence. At 250 m spatial resolution, both SND and PBP match the real network very closely (KL-divergence close to zero), with SND's fidelity only slightly better than that of PBP. The coarser the resolution, the lower the fidelity (the higher the KL-divergence). UWIM has an acceptable KL-divergence of 0.32 considering its lack of a spatial representation.

Hydraulic performance and computational effort
The hydraulic performance assessment was carried out for the hydraulic designs of the generated and real topologies using the PBP algorithm only. SND was omitted from this analysis because its network generation module required too much computational effort to be suitable for our purposes. Figure 5 shows that the hydrograph at the outfall node of the generated network closely reflects that of the real network during a 24 h simulation, despite a small delay. Table 6 presents the computational effort for 48 h hydraulic simulations in SWMM. The computational effort needed for the generated sewer network amounts to less than 1/10th of that for the real network for kinematic wave calculations and 1/50th for dynamic wave calculations.

Discussion
The main aim is to identify a network generator that mimics the characteristics of real networks sufficiently well while significantly increasing the computational efficiency of network design and (hydraulic) performance evaluation. One of the key challenges lies in demonstrating that the network generators are 'fit-for-purpose' for exploratory modelling of changing sewer systems in support of long-term infrastructure planning.
We did this by comparing the generated network with the existing one. Such a direct comparison faces two fundamental causes for deviation:
1. The abstraction and idealisation of the network generator might produce fundamental topological or dimensioning differences because it ignores design-relevant spatial details on the ground.
2. There is no reason to believe that an existing network is the one and only best way to create a drainage system. Current systems represent one possible 'good enough' solution for a functioning drainage system, usually the product of a historically grown system.
We are interested in minimising the first cause of deviation. However, it is impossible to assign observed differences clearly to one or the other deviation type. Also, the fidelity and acceptable computational time deemed necessary depend on which outcomes of interest one wants to model and at which spatial resolution. With this difficulty in mind, we assess the suitability of the presented sewer network generator.
This model is not only applicable to urban drainage planning and exploratory modelling, but could also be used for other aims such as operation and maintenance modelling of sewer systems. For example, it is possible to use this simplified sewer generator to analyse different sewer deterioration models, implement different rehabilitation strategies, or change characteristics of the urban area and analyse the consequences for the sewer system, among others. However, the nature of the model also imposes limitations in terms of the resolution that is required for the different modelling aims. It is not meant to be used as a tool for the detailed design of a sewer system, nor for a detailed simulation of the operation of the network.

Wastewater infrastructure generation module (network topology)
We found that a 250 m Block size allows for simplification without losing important information on the land use classification and the average elevation per Block. The obtained topology delineation shows similarity between the trunk sewers in the real and generated networks, as presented in Figure 3. The generated topology reflects the two main trunk lines in the north and the south of the case study area, and the difference in their lengths is marginal (0.3 %). They differ in that the real trunk mains tend to follow roads and highways, while the sewer generator prioritises topography and follows the river.
The topology of the collection sewers shows more substantial differences. The total pipe length of the generated network is 17 % shorter than that of the real network. Figure 3 shows that the real collection sewers tend to follow the urban form (e.g., road alignments and district shapes) more closely, rather than optimising the slope relative to the surface elevation.
Overall, the sewer network coverage in the case study area is similar and follows the need for foul sewer networks mainly in residential, industrial and commercial areas. An exception is the area of the port (in the west), where the generated network has more sewer coverage.
This is an artefact of the simple rule that translates the employment per Block into a need for wastewater handling. The water consumption of the port area is not distributed equally over its entire land use, as is assumed by the model. Such special cases are quite easy to correct for in the model, by creating a 'no sewer industrial area' if needed.
There is also a very obvious difference in pipe length. This can be attributed to the different definition of nodes in the model and in reality. SND and PBP use the centroids of the Blocks as nodes, whereas in reality a pipe unit is usually defined between a pair of inspection shafts. In the model, this can easily be corrected by introducing a 'node rule' that limits the maximum pipe length.
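One way such a 'node rule' could look is sketched below: a centroid-to-centroid pipe is split into the smallest number of equal segments that each respect a maximum length, with intermediate nodes standing in for inspection shafts. This is an assumption about how the rule might be implemented, not the model's actual procedure; the function name `split_pipe` and the 100 m limit are hypothetical.

```python
import math


def split_pipe(length_m, max_length_m):
    """Split one pipe into equal segments no longer than max_length_m.

    Returns the list of segment lengths; their sum equals length_m.
    """
    if length_m <= 0 or max_length_m <= 0:
        raise ValueError("lengths must be positive")
    n_segments = math.ceil(length_m / max_length_m)
    return [length_m / n_segments] * n_segments


# A 250 m centroid-to-centroid pipe with a hypothetical 100 m maximum pipe length.
segments = split_pipe(250.0, 100.0)
short_pipe = split_pipe(80.0, 100.0)
```

Applying such a rule changes only the pipe count and the individual pipe lengths; the total network length stays the same.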
A more important issue when comparing real with generated networks is their different abstraction levels. The generator defines the Block as a sub-catchment, thereby ignoring the pipes within a Block and defining the centroid as a node. It is not possible to achieve the same abstraction level with the real network, as the real sub-catchments do not align well with the square Blocks used. Additionally, the fairly large minimum pipe diameter used (0.225 m) does not allow a simple cut-off diameter to harmonise the representation of the two networks. As a consequence, we would expect the real network to be substantially larger than the more efficiently generated network. Nevertheless, we can show that this generated network is capable of evacuating all the wastewater with a similar hydraulic behaviour.

Hydraulic design
The diameter distribution represents two relevant factors: it is a short-hand reflection of the network-level hydraulic characteristics and, together with the length, it is an important measure of replacement value (e.g., Maurer et al. 2013). The high computational effort (Table 4) and the poor scalability (Duque et al. 2020), and therewith the slow analysis of all possible hydraulic designs, are a major drawback of SND. PBP was several orders of magnitude faster than SND, was able to generate suitable network designs in less than a second, and seems to scale well.
The overall conclusion is that the PBP design seems to be a good and robust approach to dimension the pipe network generated for a 250 m Block resolution within a reasonable computational time. This 'goodness' was confirmed by the hydraulic performance validation. SND can generate a more cost-optimised network, but requires substantially more computational effort. Additionally, this optimisation is not reflected in the real sewer network data. At lower resolution, the UWIM approach performs comparably well, and if the purpose were only to reflect the pipe diameter distributions under data-poor conditions, then UWIM is a good choice.

Hydraulic Performance Validation
The hydraulic performance evaluation aimed to establish whether the PBP algorithm achieves hydraulic characteristics that are sufficiently 'fit-for-purpose'. To make the real and generated networks comparable, we re-dimensioned the real network using the PBP algorithm.
Under low flow conditions, both generated and real networks present a similar sedimentation risk in the smallest pipe diameters at the upstream pipes of the network. Likewise, under typical flow conditions, the resulting flow hydrograph at the outfall of the simulated sewer catchment area closely resembles that of the simulated real sewer network. This is, however, not the case for the other hydraulic performance criteria. Travel times are poorly represented by the generated network, as on average they are 34 % longer than for the real network. This can be explained by the 12% increase in the average path length from upstream extremes to the outfall in the generated network, despite its total pipe length being 17 % shorter. We believe that this can be explained by the artefacts of the delineation of the trunk sewers. The real trunk sewers do not follow an optimised least distance approach and consequently they are 18% longer and generate lengthier travel paths (12%) towards the outfall from pipes in the east (see Figure 3). As a consequence of the lengthier travel paths, we found that the generated network takes more time to drain than the real network.
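The average path length from upstream extremes to the outfall can be computed by walking each upstream end node down to the outfall and summing pipe lengths. The sketch below assumes a tree-shaped network stored as a dictionary that maps each upstream node to its downstream node and pipe length; the node names and lengths are hypothetical, and travel times would additionally require flow velocities from the hydraulic simulation.

```python
def path_lengths_to_outfall(pipes):
    """Path length (m) from every upstream end node to the outfall.

    pipes maps upstream_node -> (downstream_node, pipe_length_m).
    The outfall is the node that has no outgoing pipe. The network is
    assumed to be a tree draining to the outfall (no loops).
    """
    downstream_nodes = {down for down, _ in pipes.values()}
    # Upstream extremes are nodes that no other pipe drains into.
    heads = [node for node in pipes if node not in downstream_nodes]

    lengths = {}
    for head in heads:
        node, total = head, 0.0
        while node in pipes:  # stop once the outfall is reached
            node, pipe_length = pipes[node][0], pipes[node][1]
            total += pipe_length
        lengths[head] = total
    return lengths


# Hypothetical four-node network: A and B drain into C, C drains to the outfall.
example_pipes = {
    "A": ("C", 250.0),
    "B": ("C", 353.6),
    "C": ("OUT", 250.0),
}
lengths = path_lengths_to_outfall(example_pipes)
average_path_length = sum(lengths.values()) / len(lengths)
```

Running the same function on the real and the generated network would give the two averages whose relative difference is reported above.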
Under high flow conditions, the portion of the network experiencing surcharge and the percentage of volume that is flooded at the peak hour are twice as large for the generated network as for the real one. The surcharge is a consequence of the lengthier travel paths under high flow conditions, where the water demand pattern has a peak factor of 3.0. Note that the pipes are designed for the maximum daily peak flow under typical flow conditions, where the peak factor reaches 1.2.
We could identify that, under the same flow conditions, the differences in hydraulic performance between the real and generated networks come from the topology delineation rather than the PBP dimensioning algorithm. The delineation of trunk sewers seems to have a higher impact than that of the collection sewers. Thus, if one needs to represent the above-mentioned hydraulic characteristics (e.g. outfall hydrograph, travel times, sedimentation, surcharge, flooding) more closely, the trunk sewer delineation should be improved to mimic the real layout and forego the length optimisation.
The PBP-based sewer network generator does provide the necessary gains in computational speed, requiring only about 11-16 s per run including network generation and hydraulic evaluation. Thus, for applications where a consistent, high-level representation of the main network and hydraulic characteristics is sufficient, PBP is found to be the most suitable among the compared algorithms for exploratory modelling of sanitary sewer networks at city scale.

Conclusions
For exploratory modelling of changes in sewer topology at city-scale, the automatic generation of representative sewer networks is essential to test the impact of changes in the urban form. Relying on the real system has its own challenges. Typically, asset databases are fraught with uncertainties, where errors and missing information impede explicit modelling.
Integrated models are often more detailed than necessary for strategic planning purposes and too computationally expensive for system-level exploratory modelling. We need models that allow us to analyse large numbers of sewer infrastructure alternatives and future scenarios for robust decision making and options exploration at larger spatial scales than what is currently available.
The presented sanitary sewer infrastructure generator allows for rapid creation of idealised, yet sufficiently consistent, realistic and reproducible representations of sanitary sewer systems compared to historically grown real sewer networks:
• A spatial resolution of 250 m for network topology generation in combination with the PBP algorithm provides a sound trade-off between computational time and fidelity of relevant structural and hydraulic properties for exploratory modelling.
• There is a clear trade-off between the spatial resolution determining the level of detail at which sewer infrastructure designs are generated and the computational effort required to do so.
• The minimal UWIM algorithm yields only a very rough estimation of the amount of infrastructure required, but has a low data demand and is computationally efficient.
• The SND algorithm is capable of creating cost-optimised networks, but has a substantial computational demand and scales poorly.

Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.