Hydrological modelling of green and grey roofs in cold climate with the SWMM model

Rooftop retrofitting targets the largest land-use type available for reducing impervious surface area in urban areas. Extensive green and grey roofs offer a solution for retention and detention of stormwater in densely developed urban areas. Among the available green roof types, the extensive green roof has become a commonly adopted choice. These solutions provide multiple benefits for stormwater and environmental management owing to their retention and detention capacities. The Storm Water Management Model (SWMM) 5.1.012 with Low Impact Development (LID) Controls was used to model the hydrological performance of a green roof and a grey roof (a non-vegetated detention roof based on extruded lightweight aggregates), both located in the coastal area of Trondheim, Norway, by defining the physical parameters of individual layers in the LID Control editor. High-resolution 1-min data from the previously monitored green and grey roofs were used for calibration. Six parameters within the individual LID layers, four in the soil layer and two in the drainage mat, were selected for calibration. After calibration, the SWMM model simulated runoff with a Nash-Sutcliffe model efficiency (NSME) of 0.94 (green roof) and 0.78 (grey roof) and a volume error of 3% (green roof) and 10% (grey roof). Validation of the calibrated model indicated a good fit between observed and simulated runoff, with an NSME of 0.88 (green roof) and 0.81 (grey roof) and volume errors of 29% (green roof) and 11% (grey roof). For snowmelt modelling, the calibrated model achieved an NSME of 0.56 (green roof) and 0.37 (grey roof) through the winter period. However, the winter volume errors of 30% (green roof) and 11% (grey roof) show that additional model development for winter conditions is needed. Optimal parameter sets were proposed for both the green and grey configurations.
The results from calibration, and especially validation, indicated that SWMM could be used to simulate the performance of different rooftop solutions. The study gives urban planners insight into how to target and focus the implementation of rooftop solutions as stormwater measures.


Introduction
The combined effect of urban growth and climate change is altering the hydrological balance in developed urban areas (Gill et al., 2007; Leopold, 1968; Semádeni-Davies et al., 2008). There is also increasing public awareness of the impact of stormwater on flash flood occurrence and water quality in receiving water bodies (Jia et al., 2015). Generally, rooftops remain unused, even though they account for a large part of the impervious surfaces. Rooftop retrofitting, using either vegetated (Stovin et al., 2012) or non-vegetated solutions (in this paper, a "grey" detention roof based on extruded lightweight aggregates, which uses its filter media to attenuate stormwater runoff) (Hamouz et al., 2018), has shown multiple benefits in terms of hydrology, building physics, biodiversity, and usage as living areas (for work and/or recreation) (Ahiablame and Shakya, 2016; Berndtsson, 2010). Rooftop retrofitting offers a method to manage stormwater at the source while providing retention and runoff detention in already urbanized areas (Cipolla et al., 2016).

Urban green roof runoff modelling
A large part of the research has been conducted as monitoring studies to understand the hydrological performance of green roofs. However, it remains challenging to predict the hydrological performance in general, as each result reflects a specific type of green roof and its location. There have been several attempts to simulate green roof runoff on an individual roof scale (Carson et al., 2013; Hilten et al., 2008; Johannessen et al., 2019; Kasmin et al., 2010; Krebs et al., 2016; Locatelli et al., 2014; Metselaar, 2012; Stovin et al., 2012; Villarreal, 2007) or a catchment scale (Ashbolt et al., 2013; Carter and Jackson, 2007; Krebs et al., 2013, 2014; Palla and Gnecco, 2015; Rosa et al., 2015; Warsta et al., 2017). These models can either be categorized as data-based, where runoff is calculated as an empirical function of rainfall or
https://doi.org/10.1016/j.jenvman.2019.109350 Received 15 January 2019; Received in revised form 17 July 2019; Accepted 1 August 2019

Urban snowmelt modelling
Snowmelt and rain-on-snow events in urban areas in the coastal Nordic regions are complex processes. Frontal systems drive low-pressure systems as they hit the coast. Their paths and precipitation intensities are often difficult to predict accurately, resulting in rapidly changing weather patterns. The winter season often brings heavy snowfall followed by rainfall, and these rain-on-snow events create flood risk in urban areas because rainfall and snowmelt combine (Moghadas et al., 2017). Moreover, in cold and wet coastal regions such as Trondheim on the coast of Norway, continual alternation between freezing and thawing periods can create an intermittent impermeable cover (Matheussen, 2004; Paus et al., 2016). Furthermore, the snowpack distribution is influenced by meteorological variables (temperature, precipitation, wind, radiation), spatial disposition (topography, vegetation, insulation conditions, albedo) and anthropogenic activities (Beven, 2011; Førland et al., 1996; Matheussen and Thorolfsson, 2004; Moghadas et al., 2016; Semádeni-Davies, 2000, among others). This adds challenges to urban snowmelt modelling, as snow characteristics (e.g., snow density, albedo, grain size, porosity, solar energy absorption) in urban areas differ substantially from those in rural areas (Bengtsson and Westerström, 1992; Gray and Male, 1981; Semádeni-Davies, 2000; Sundin et al., 1999).
There is a knowledge gap regarding the hydrological performance of the different solutions under variable climates and geographical locations, especially at a large scale rather than in small pilot tests. Applying modelling software in combination with observed data offers a tool to simulate expected hydrological performance under various current and future climate conditions (Peng and Stovin, 2017). Therefore, a more generic approach has been adopted in order to model green and grey roof hydrological performance on-site. In this study, the Storm Water Management Model (SWMM), including the LID module for green roofs, has been applied to simulate runoff from the aforementioned green and grey roofs located in the coastal area of Trondheim, Norway.
The literature review revealed several attempts to model the retention performance of small-scale green roofs using the SWMM model. However, there is still a lack of knowledge with respect to modelling the detention performance of green roofs. This applies even more to the very new concept of grey detention roofs, which has shown promising results, especially for stormwater detention. It should also be mentioned that in this study the green and grey roofs were tested at full scale (i.e., the area of a family house, 100 m²). Another major gap within the SWMM model is the transferability of initial parameters representing runoff characteristics (Johannessen et al., 2019). This study revealed the importance of calibration against local meteorological data.
Several urban snowmelt models have been developed; however, finding an optimal level of complexity remains a challenging part of snowmelt modelling. This is because more sophisticated models do not necessarily provide better results in a diverse urban environment, owing to the lack of available data (Moghadas et al., 2016). Previous attempts to model snowmelt in urban areas were performed on a catchment scale (Ho and Valeo, 2005; Moghadas et al., 2017; Semádeni-Davies, 1997). However, one of the main issues in snowmelt modelling is snow redistribution in the urban environment, as reported above. In this study, the focus is on rooftops only, where human-made snow redistribution is not expected. This might facilitate snowmelt simulation from a single green or grey rooftop.
The research questions that this study aims to answer are:
1. What is the performance of the SWMM model after calibration for long-term continuous simulations¹ in terms of the Nash-Sutcliffe model efficiency and volume error of green and grey roofs in coastal regions during the warm² period?
2. Does the model calibration provide an optimal parameter set that satisfies both objective functions, the Nash-Sutcliffe model efficiency and the volume error?
3. Is the SWMM model able to accurately simulate snowmelt and rain-on-snow events from a green and a grey roof in coastal regions during the cold³ period?

Characteristics of the green and grey roof
A full-scale field setup was built (approximately 10 m above ground and 50 m a.s.l.) to study the hydrological performance of three different roof configurations at Høvringen in Trondheim, Norway (63°26′47.5″ N 10°20′11.0″ E). A conventional (black) roof with bituminous waterproofing served as a reference for the green and grey roofs. Each roof measured 8 × 11 m, with a longitudinal slope of 2%. Thus, 88 m² were available for green/grey roof retrofitting within each field, and an additional 12 m² were accounted for as runoff contribution from impervious surroundings. The grey roof was made up of an underlying protection layer and a 200 mm thick layer of lightweight extruded clay aggregates (LWA), covered with concrete pavers (200 × 200 × 70 mm). The green roof consisted of an underlying protection layer, a 25 mm plastic drainage layer (egg box), a 10 mm retention mat and a 30 mm pre-grown reinforced extensive Sedum mat (Fig. 1). Based on FLL (2008), the maximum water holding capacity (MWHC) was estimated to be 52.8 mm for the grey roof and 20.6 mm for the green roof (theoretical values⁴). A more detailed description of the grey roof setup can be found in a previous study (Hamouz et al., 2018).
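Because 1 mm of water over 1 m² corresponds to 1 L, the theoretical MWHC depths can be converted into storage volumes for the retrofitted roof area. The following minimal sketch illustrates that conversion; the function and constant names are hypothetical and are not part of the study.

```python
# Illustrative unit conversion only; names are hypothetical.
ROOF_AREA_M2 = 8 * 11  # retrofitted area of each roof field (8 m x 11 m)


def storage_litres(depth_mm: float, area_m2: float) -> float:
    """Convert a water depth (mm) over an area (m2) to litres.

    1 mm of water over 1 m2 equals 1 litre.
    """
    return depth_mm * area_m2


if __name__ == "__main__":
    # Theoretical MWHC values reported in the text (FLL, 2008).
    for roof, mwhc_mm in (("grey", 52.8), ("green", 20.6)):
        litres = storage_litres(mwhc_mm, ROOF_AREA_M2)
        print(f"{roof} roof: {litres:.1f} L theoretical storage")
```

These volumes inherit the caveats attached to the MWHC method itself and should be read as theoretical upper limits.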

Input data
The data were collected at the field station from May 2017 to April 2018. Precipitation was measured by a heated tipping bucket rain gauge (Lambrecht meteo GmbH 1518 H3, Lambrecht meteo GmbH, Göttingen, Germany) with a resolution of 0.1 mm at 1-min intervals and an accuracy of ±2%. Runoff was measured using a weight-based system (accuracy class C3 according to OIML R60) with two tanks downstream of the drainage outlets. The collection tanks were emptied automatically every 30 min and whenever the collected water reached the capacity of the tank. All data were recorded at 1-min intervals with a CR 1000 data logger (Campbell Scientific, Inc.). Air temperature was registered using a thermosensor (Vaisala HMP155A Temperature and Humidity, accuracy ±0.03 °C), and wind speed using an ultrasonic anemometer (Lufft VENTUS Ultrasonic anemometer, 240 W heater, accuracy ±2%). Actual evapotranspiration was estimated as the water loss from direct measurements of precipitation and runoff. Soil moisture sensors were not available during the model calibration process. Therefore, the initial moisture was estimated, and the first rainfall event was used as a warm-up period.
¹ Long-term continuous simulation in this paper means simulation through several months, including several events.
² Period without snow and negative temperatures.
³ Period with snow and temperatures that can influence runoff (≤0 °C).
⁴ The values of MWHC are, however, very theoretical, since the method assumes a comparison between a wet and an oven-dry sample, which cannot be achieved under field conditions. The methodology also assumes 2 h of dripping following total immersion for 24 h, and the dripping period is questionable for the detention materials used in the green/grey roof build-up.

Model application and parameters estimation
The Storm Water Management Model (SWMM) 5.1.012, including the Low Impact Development (LID) Controls module specifically designed for modelling SUDS (Sustainable Urban Drainage Systems) structures, was used for long-term and short-term simulation of runoff quantity using the rainfall/runoff process with a 1-min reporting time step. The green and the grey roofs were each modelled as a subcatchment in SWMM, where the rooftop occupied 88% of the subcatchment and impervious area the remaining 12%. This impervious area is covered by standard asphalt roofing, the same as used for the reference roof. For this area, the Manning's surface roughness was set to 0.015 and the depression storage to 0.01 mm. The impervious area was also routed to the LID module. The LID module consists of three layers (surface, soil and drainage mat). Only parameters in the soil and drainage mat layers were selected for calibration (Table 1), as the surface layer is assumed not to contribute to retention or detention in the LID module because of its high infiltration capacity. Four parameters were calibrated within the soil layer: porosity (the potential space within the soil layer for storing stormwater), field capacity (the amount of water remaining in the soil layer after free water drainage), conductivity (the rate at which water can flow through a porous medium), and conductivity slope (the slope of the curve of log(conductivity) vs. soil moisture content). Two parameters were calibrated within the drainage mat: the void fraction (the ratio of void volume to total volume in the mat) and roughness (used to compute the lateral flow rate of drained water through the mat).
The initial green and grey roof parameters, as well as the lower and upper bounds used during the calibration, were estimated from field measurements, literature (Carson et al., 2017; Krebs et al., 2016; Peng and Stovin, 2017; Rosa et al., 2015), or defaults (Rossman and Huber, 2016a, 2016b). The thickness of the substrate layer of the grey roof, 200 mm, was used in the LID module for the soil layer, and part of it was also assigned to the drainage mat layer. The drainage mat thickness, which represents flow through the lightweight aggregate, was set to 5% of the whole thickness, i.e. 10 mm. This was estimated based on the high infiltration capacity of the lightweight aggregate, whose saturated hydraulic conductivity was determined experimentally in the laboratory to be 1432 mm/h. The Green-Ampt and the curve number infiltration methods were used for the green roof and the grey roof, respectively. The kinematic wave routing method, which solves the continuity equation with a simplified form of the momentum equation, was applied for overland flow calculations (Rossman and Huber, 2016a, 2016b).

Model performance
Two objective functions were applied to evaluate the model performance. The model accuracy was quantitatively assessed with the Nash-Sutcliffe model efficiency (NSME) (eq. (1)) (Nash and Sutcliffe, 1970), which is used here to evaluate the peak performance. For the water balance evaluation, the volume error (VE) (eq. (2)) was used to calculate discrepancies between observed and simulated (modelled) runoff.
$$\mathrm{NSME} = 1 - \frac{\sum_{i=1}^{n} \left(Q_{obs,i} - Q_{sim,i}\right)^2}{\sum_{i=1}^{n} \left(Q_{obs,i} - \bar{Q}_{obs}\right)^2} \quad (1)$$
$$\mathrm{VE} = \frac{V_{obs} - V_{sim}}{V_{obs}} \times 100\% \quad (2)$$
where $Q_{obs,i}$ is the observed discharge, $Q_{sim,i}$ is the modelled discharge, $\bar{Q}_{obs}$ is the mean of the observed discharge, and $V_{obs}$ and $V_{sim}$ are the observed and simulated runoff volumes, respectively. The measured precipitation and outflow were used to evaluate how well the model matched this outflow using the NSME, which ranges from −∞ to 1. The NSME was used as an objective function in order to find the optimal parameter set and to measure the goodness of fit. In general, the closer the model efficiency is to 1, the more accurately the model can predict the performance of green roofs, whilst an NSME greater than 0.5 indicates acceptable model performance (Rosa et al., 2015). The final parameters were obtained by applying the Shuffled Complex Evolution (SCE) algorithm (Duan et al., 1992). The SCE method is based on four concepts that aim for efficient global optimization. The calibration process was based on random sampling from a predefined variable range in which each parameter had a lower and an upper bound. The SCE algorithm starts from an initial guess and generates a sequence of improving approximate solutions in order to reach the highest NSME, with the n-th approximation derived from the previous ones. The termination criteria of the calibration process were based on the principle of convergence (the objective function stabilizes) (Duan et al., 1992).
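The two objective functions described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the function names are hypothetical, and the sign convention of the volume error (positive when the model underestimates the observed volume) is an assumption inferred from the reported results.

```python
from typing import Sequence


def nsme(observed: Sequence[float], simulated: Sequence[float]) -> float:
    """Nash-Sutcliffe model efficiency of a simulated series (range -inf to 1)."""
    if len(observed) != len(simulated) or len(observed) == 0:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared residuals between observation and simulation.
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    # Sum of squared deviations of the observations from their mean.
    variance = sum((o - mean_obs) ** 2 for o in observed)
    if variance == 0:
        raise ValueError("observed series is constant; NSME is undefined")
    return 1.0 - residual / variance


def volume_error(v_obs: float, v_sim: float) -> float:
    """Volume error in percent.

    Assumed convention: positive values mean the simulated volume is
    lower than the observed volume.
    """
    return 100.0 * (v_obs - v_sim) / v_obs
```

A perfect simulation gives an NSME of 1, and a simulation that is no better than the mean of the observations gives 0, which is why values above 0.5 are commonly treated as acceptable.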

Model calibration and validation
A long-term continuous calibration was chosen in order to prevent potential validation issues when comparing events with different characteristics. Data from 11 May to 31 July served for model calibration. The calibration period included five larger events, while six events were used for model validation (Table 2). The model was evaluated by the NSME and VE in both the calibration and validation periods using a long-term continuous dataset as well as an event-based dataset.

Snowmelt modelling
The model, which was calibrated against long-term continuous observed flows generated from rain events only, was applied to the period between November 2017 and April 2018 in order to identify the essential parameters for the snowmelt processes. Time series of hourly temperature and wind speed were used to distinguish between liquid and solid precipitation and to recognise snowmelt generation. The SWMM model employs either the degree-day method or a simplified energy budget method (Anderson, 1968; Rossman and Huber, 2016a). The degree-day method was used for all snow events except rain-on-snow events to compute the melt rate for any particular day (eq. (3)). Minimum and maximum snowmelt coefficients are used to estimate a melt coefficient that varies by day of the year (Rossman and Huber, 2016a). The relationship between snowmelt and air temperature can be expressed as:
$$SM = C_M \left(T_A - T_B\right) \quad (3)$$
where $SM$ is the snowmelt generated (mm/day), $C_M$ is the melt coefficient (mm/°C/day), $T_A$ is the index air temperature in °C (the mean daily temperature observed at the field station was used), and $T_B$ is the base temperature in °C (0 °C was used, following Rossman and Huber (2016a)). Capacity (FFWC = 0.1) (Rossman and Huber, 2016a). The simplified energy budget (heat budget) method is applied during rain-on-snow events, where the energy is supplied by sensible heat (from the air) and advective heat (from rain). Other forms of energy, as well as albedo and snow age, are neglected. Within this method, snowmelt increases with increasing air temperature, wind speed and rainfall intensity (data from the field station were used). In this study, no human-made snow redistribution occurred; therefore, to exclude snow redistribution from the model, the depth at which snow removal begins was set to 1000 mm.
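The degree-day relation can be illustrated with a short sketch. It assumes a fixed melt coefficient, whereas SWMM varies the coefficient by day of the year, and it clamps melt to zero at or below the base temperature, which is an assumption made here for clarity. The function name and the coefficient used in the example are hypothetical.

```python
def degree_day_melt(c_m: float, t_air: float, t_base: float = 0.0) -> float:
    """Daily snowmelt (mm/day) from the degree-day relation C_M * (T_A - T_B).

    c_m    : melt coefficient in mm/degC/day (held constant in this sketch)
    t_air  : index (mean daily) air temperature in degC
    t_base : base temperature in degC (0 degC, as used in the study)

    Assumption: no melt is generated when t_air is at or below t_base.
    """
    return max(0.0, c_m * (t_air - t_base))


if __name__ == "__main__":
    # Hypothetical melt coefficient of 2 mm/degC/day, for illustration only.
    for temperature in (-2.0, 0.0, 1.5, 4.0):
        print(temperature, degree_day_melt(2.0, temperature))
```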

Results and Discussion
The main intention of this section is to present results from the model calibration and validation of the green and grey roofs, evaluate the six calibrated parameters, and test the calibrated model during the cold period. In addition, monitoring data on several rooftop management aspects (retention and detention) can be derived from Table 2, which compares peak runoffs and runoff volumes from the individual rooftops.

Calibrated long-term continuous simulation
Long-term continuous simulations using the initial uncalibrated parameter sets indicated agreement between observed and simulated runoff from the green roof, with an NSME of 0.5. The same did not apply to the simulated runoff from the grey roof, where the NSME was −2.87 and calibration was required.
Six parameters within two LID layers (the soil and the drainage mat) were calibrated. Their values before and after calibration are presented in Table 1. On the green roof, overland flow does not usually occur because of the high infiltration capacity of the soil, and ponding is not allowed. Therefore, the parameters associated with the surface layer were excluded from the calibration process. Some of the parameters (berm height, surface slope, thickness) were kept fixed to preserve the physical description of the field setup and to avoid overparameterization. Calibration of both the green and grey roof parameters substantially improved model performance. The NSME calculated from observed and simulated runoff from the green roof improved from 0.50 to 0.94, and that of the grey roof improved from an insufficient −2.87 to 0.78. It should be noted that the NSME values include inter-event periods (periods without rain), since the increased detention effect (which led to higher baseflow) made it challenging to distinguish when runoff stopped.
Long-term continuous model simulations and comparisons of the observed and simulated runoff from the green and the grey roof following calibration are shown in the Appendix in Figs. A7, A9 and A10. These figures show a better fit within the calibration period for the green roof, which can be seen visually in the spread of observed versus simulated runoff, or mathematically in the lowest norm of residuals (and in the mean squared error MSE, coefficient of determination R² and correlation coefficient). The model simulated lower volumes (flow rates) than observed in most cases, which is also shown by the slope and intercept of the regression line. Visually, the models simulated the runoff fairly well. For both the green and the grey roof, the simulated runoff tended to underestimate the observed peak flow responses to the highest-intensity rainfall. At the same time, the model had more difficulty simulating the tails of the grey roof runoff than those of the green roof. The simulated cumulative runoffs (total volumes) were close to the observed data.
In comparison, the volume errors between the simulated and observed runoff were 10% for the grey roof and 3% for the green roof. Firstly, the volume errors were caused by inaccurately estimated seasonal evapotranspiration (ET) rates; ET occurs during dry periods and regenerates the storage capacity of the roof. Actual ET rates decay with time during a dry period, whereas the SWMM model assumes fixed ET rates (Peng and Stovin, 2017). Secondly, the volume errors were caused by inaccurate model simulations of runoff. Thirdly, the model runoff outputs had a coarser resolution than the observed runoff, which made the volume comparison more challenging.
Five events were chosen to evaluate the model performance in terms of event-based simulation during the calibration period (Fig. 2). The difficulties with the simulation and the underestimation of the runoff tails for the grey roof are apparent (Fig. 4). This means that the ability of the model to simulate runoff detention is partly limited. Thus, the equations describing the detention processes, as well as the detention parameters (porosity, soil conductivity, soil conductivity slope and soil suction head within the soil layer, and the parameters within the drainage mat that serve to estimate the baseflow), should be further investigated in order to improve the simulation of runoff prolongation.

Validation
In order to assess model performance outside the calibration period, the calibrated green and grey roof models were tested in terms of the NSME and VE through part of the summer and the whole autumn (see Figs. A8, A11 and A12 in the Appendix). During this study, only one event with a 2-year return period was registered by the rain gauge at Høvringen, Trondheim, in August 2017 (Fig. 3). However, several larger events, as well as a long-term continuous dataset with a high resolution of 1 min, were also used to validate the model performance.
Six events were chosen to evaluate the model performance in terms of event-based simulation during the validation period (Fig. 4). Events V2 and V6 gave interesting results, reaching an NSME of 0.8 or higher for both roofs, with the volume error also within reasonable limits. Both events lasted several days and produced relatively large volumes. Such events are of interest because the SWMM model showed its ability to reproduce the registered runoffs and because the roofs were able to reduce the maximal flow (Table 2). The green roof runoff was simulated reasonably well except for one event in September during the validation period (Fig. 4), which followed an almost one-week dry period during which the roof storage dried out substantially. This event was, however, captured by the grey roof model, which generally showed a poorer fit to peak runoffs.
Table 4. Model sensitivity to parameter adjustments of ±10% and ±50%. Abbreviations: GN = green roof, GY = grey roof, 10UP = 10% increment, 10DOWN = 10% decrement, 50UP = 50% increment, 50DOWN = 50% decrement, P = porosity, FC = field capacity, C = conductivity, CS = conductivity slope, VF = void fraction and Mn = Manning's n. Orange marking indicates performance deterioration and green marking indicates performance improvement.
Figure caption (partially recovered): (Table 2). It should be noted that the model output has a much coarser resolution than the observed runoff.
Overall evaluation of the model performance is presented in Table 2. For the long-term continuous datasets, the green roof model performance deteriorated during validation, with the NSME falling from 0.94 to 0.88 and the VE rising from 3% to 29%, while the grey roof model improved in terms of the NSME, from 0.78 to 0.81.

Parameter evaluation
During the calibration process, nearly 2300 iterations were carried out for the green roof and nearly 2700 for the grey roof. Each parameter was plotted as a 3D histogram in order to show the distribution of the values sampled during calibration (Figs. A13 and A14). The optimal value of each parameter corresponds to the highest column of each graph. The histograms show the lower and upper bounds of the individual parameters as well as a pattern that tended to be skewed towards one of the bounds.
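To make the idea of repeated sampling within parameter bounds concrete, the sketch below uses plain uniform random search. This is a deliberately simpler stand-in for the SCE algorithm used in the study, not an implementation of it. The bounds and the objective function are hypothetical placeholders; in practice the objective would run a SWMM simulation and return the NSME.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def random_search(
    bounds: Dict[str, Tuple[float, float]],
    objective: Callable[[Params], float],
    n_iter: int,
    seed: int = 0,
) -> Tuple[Params, float, List[Tuple[Params, float]]]:
    """Maximise `objective` by uniform random sampling within `bounds`.

    Returns the best parameter set, its score, and the full sampling history,
    which can later be used to inspect the distribution of sampled values.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    best_params: Params = {}
    best_score = float("-inf")
    history: List[Tuple[Params, float]] = []
    for _ in range(n_iter):
        # Draw each parameter independently between its lower and upper bound.
        trial = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = objective(trial)
        history.append((trial, score))
        if score > best_score:
            best_params, best_score = trial, score
    return best_params, best_score, history
```

Unlike SCE, this search does not learn from earlier samples, so it needs many more iterations to reach a comparable optimum; it is shown only because its sampling history has the same structure as the data plotted in the histograms.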
The randomly sampled values of the six calibrated parameters were sorted by NSME, and only those with an NSME higher than 0.7 were taken into consideration (Table 3). The table shows the initial value, median, mean and optimal value of each parameter; the medians of most parameters are very close to the optimal values. The optimal parameters differ considerably between the two roofs. The standard deviation, which shows the dispersion of each individual parameter, was lower for the grey roof parameters than for the green roof parameters, except for conductivity. Thus, the conductivity of the grey roof showed the largest spread of data points. The distribution of each parameter vs. the NSME from the calibration, and the low spread of the parameters, can be seen in Figs. A13 and A14. A low standard deviation of a parameter, however, suggests that the model might be sensitive to that parameter.
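The filtering and summary step described above can be sketched as follows. The input format (pairs of a sampled parameter value and the NSME it produced) and the function name are assumptions made for illustration; the 0.7 threshold is the one stated in the text.

```python
import statistics
from typing import Dict, Iterable, Tuple


def summarise_samples(
    samples: Iterable[Tuple[float, float]], threshold: float = 0.7
) -> Dict[str, float]:
    """Summarise sampled values of one parameter whose NSME exceeds `threshold`.

    `samples` holds (parameter_value, nsme) pairs from the calibration run.
    """
    kept = [value for value, score in samples if score > threshold]
    if len(kept) < 2:
        raise ValueError("need at least two samples above the threshold")
    return {
        "n": len(kept),
        "median": statistics.median(kept),
        "mean": statistics.mean(kept),
        "stdev": statistics.stdev(kept),  # sample standard deviation
    }


if __name__ == "__main__":
    # Hypothetical porosity samples, for illustration only.
    demo = [(0.40, 0.90), (0.50, 0.80), (0.60, 0.75), (0.90, 0.20)]
    print(summarise_samples(demo))
```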
Four parameters within the soil layer were calibrated (Table 3). The calibrated porosity (P) of 0.559 (green) and 0.45 (grey) was comparable with the values suggested in the SWMM manual (P = 0.4-0.5) (Rossman and Huber, 2016a). Similarly, Krebs et al. (2016) obtained P = 0.41 after calibration and concluded that porosity was the most sensitive parameter. Other researchers used the measured porosity and did not calibrate it (Peng and Stovin, 2017). The calibrated field capacity (FC) was 0.267 (green) and 0.095 (grey). This is in line with the SWMM manual, which suggests field capacities between 0.062 and 0.378 for various soils; other research reported FC = 0.29 (Krebs et al., 2016). The calibrated conductivity (C)
Fig. 5. Impact of parameter adjustments of ±10% and ±50% on model performance (NSME and VE).
The grey roof served more as a detention solution and the green roof as a retention solution. Therefore, the parameters representing runoff detention (conductivity, void fraction and Manning's roughness) varied compared with the SWMM manual or other green roof research. The model includes a large number of physical parameters, some of which might be determined from laboratory or field measurements (e.g., porosity, conductivity). This is, however, questionable, as the SWMM model is unable to fully represent the green and grey roof structural design (e.g., lightweight aggregates representing both the soil layer and the drainage mat layer, and/or the plastic drainage layer (egg box) affecting retention parameters) (Johannessen et al., 2019). The model sensitivity to parameter deviation (±10% and ±50% adjustment) was evaluated by comparing the objective functions (NSME and VE) for different periods (calibration period, validation period and whole period) (Table 4 and Fig. 5).
Additionally, several basic statistics (Pearson product-moment correlation coefficient, slope of the linear regression line and its intercept with the y-axis, and correlation coefficient between the observed and simulated data) were supplied. A ±10% adjustment had little impact on the model performance of the green roof, whereas some variation in model performance was observed for the grey roof. A 10% increase in the field capacity of the grey roof had a positive impact on the VE (−2.48% to 1.21%), whereas a 10% increase in the porosity worsened the VE (38.31% to 49.26%). The green roof model was mainly sensitive to a 50% decrease in the porosity, which resulted in negative NSMEs (−1.251 to −2.546). In contrast, this adjustment improved the VE (0.01% to 14.32%). The grey roof model was mainly sensitive to a 50% decrease in the porosity and in the conductivity slope, which resulted in negative NSMEs (−0.617 to −2.209 and −0.561 to −2.214, respectively). A 50% increase in the porosity resulted in large VEs (42.48% to 59.95%). A positive impact on model performance was achieved by a 50% increase in the conductivity, which resulted in low VEs (2.8% to 4.82%) while the NSMEs (0.631 to 0.826) remained relatively high.
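A one-at-a-time sensitivity test of this kind can be sketched as below. Each parameter is scaled by the four factors while the others are held at their calibrated values. The `evaluate` callable is a hypothetical placeholder for running the model and returning a performance metric such as the NSME or VE; the adjustment labels follow the abbreviations of Table 4.

```python
from typing import Callable, Dict

# Adjustment labels as in Table 4, mapped to multiplicative factors.
FACTORS = {"50DOWN": 0.5, "10DOWN": 0.9, "10UP": 1.1, "50UP": 1.5}


def one_at_a_time(
    base: Dict[str, float],
    evaluate: Callable[[Dict[str, float]], float],
) -> Dict[str, Dict[str, float]]:
    """Perturb each parameter in turn and record the resulting metric.

    Returns results[parameter][adjustment_label] = metric value.
    """
    results: Dict[str, Dict[str, float]] = {}
    for name in base:
        results[name] = {}
        for label, factor in FACTORS.items():
            trial = dict(base)  # copy so other parameters stay at base values
            trial[name] = base[name] * factor
            results[name][label] = evaluate(trial)
    return results
```

Because only one parameter changes at a time, this design cannot reveal interactions between parameters, which is relevant given the equifinality discussed later in the paper.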

Snowmelt modelling
The SWMM model, which was calibrated over a rainfall period, was tested during the winter period between November 2017 and April 2018. Within this period, solid precipitation as well as freezing temperatures were registered. Therefore, some of the winter precipitation did not contribute directly to the runoff but remained on the roofs in the form of snow (Fig. 6). The partition was governed by a threshold temperature of 1 °C, below which precipitation falls as snow instead of rain. During the period, the heated rain gauge alone registered 346 mm of precipitation, considerably less than the 390 mm of cumulative precipitation registered from the reference black roof (both located at the field station). Therefore, the total cumulative precipitation was corrected according to the reference black roof, assuming that the error was caused by wind and snow drift.
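The two steps described above, partitioning precipitation by a threshold temperature and correcting the gauge total against the reference roof, can be illustrated with a short sketch. The text does not state how the correction was distributed in time, so the single scaling factor used here is an assumption, and the function names are hypothetical.

```python
from typing import Tuple

SNOW_THRESHOLD_C = 1.0  # threshold temperature stated in the text


def split_precipitation(
    precip_mm: float, temp_c: float, threshold_c: float = SNOW_THRESHOLD_C
) -> Tuple[float, float]:
    """Return (rain_mm, snow_mm); precipitation below the threshold is snow."""
    if temp_c < threshold_c:
        return 0.0, precip_mm
    return precip_mm, 0.0


def gauge_correction_factor(gauge_total_mm: float, reference_total_mm: float) -> float:
    """Uniform scaling factor bringing the gauge total up to the reference total.

    Assumption: the undercatch is corrected with one constant factor.
    """
    return reference_total_mm / gauge_total_mm


if __name__ == "__main__":
    factor = gauge_correction_factor(346.0, 390.0)
    print(f"correction factor: {factor:.3f}")
    print(split_precipitation(2.0 * factor, 0.5))
```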
The model was able to simulate snowmelt and rain-on-snow reasonably well (Fig. 6). The observed runoff volume was 371 mm from the green roof and 384 mm from the grey roof, whereas the model simulated 259 mm for the green roof and 340 mm for the grey roof. The green roof runoff was simulated with an NSME of 0.56 and a VE of 30%, and the grey roof runoff with an NSME of 0.37 and a VE of 11%. The model had difficulties simulating the water remaining in the roof layers (substrate in the green roof, lightweight aggregate in the grey roof), which was slowly released in the first half of April. A limitation of the snowmelt simulation was the choice of the start and end of the cold period, which may affect the water balance.
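As a rough arithmetic check on the reported winter volume errors, and assuming that the volume error is the difference between observed and simulated volume relative to the observed volume (an assumed convention, with positive values meaning underestimation), the volumes above give:

```latex
\begin{align*}
VE_{\text{green}} &= \frac{371 - 259}{371} \times 100\% \approx 30.2\% \\
VE_{\text{grey}}  &= \frac{384 - 340}{384} \times 100\% \approx 11.5\%
\end{align*}
```

Both values agree with the reported 30% and 11% to within the rounding of the published volumes.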
Modelling remarks:
• The model was less accurate for short, intense rainfall following a long antecedent dry period. This suggests that the model assumed the storage capacity of the green roof to be fully regenerated and did not take into account the detention of the roof and/or climatic effects (condensation of atmospheric vapor).
• The NSME of the grey roof improved from 0.78 during calibration to 0.81 during validation. The median and mean values of the observed flow were 0.0012 L/s (median) vs. 0.003 L/s (mean) during calibration and 0.006 L/s (median) vs. 0.0036 L/s (mean) during validation. Thus, baseflow was about twice as high during calibration, while runoff was slightly peakier during validation. This may mean that the model calibrated on the wetter part of the year performed better during the drier part, which raises the question of whether the same part of the year should be used for validation.
• The calibrated SWMM models reproduced the observed runoff sufficiently well, but the calibration revealed that there might be several parameter sets that perform equally well. This makes the parameters valid only for the roof setups and climatic conditions of the study site, or potentially for roofs that have the same components as the roofs tested and are exposed to similar climatic conditions. The skewness in the parameter sets could be due to the optimization algorithm, which may have found only local minima instead of the global minimum. Sensitivity analysis of the tested parameters showed that porosity was the most sensitive parameter for the green roof. For the grey roof, the porosity and conductivity slope were found to be the most sensitive parameters (Table 4 and Fig. 5).
• Limitation due to equifinality, i.e. the non-uniqueness of the optimized parameter sets. The issue is that similarly performing optimized parameter sets may not all be equivalent in terms of transferability and sensitivity. This will also affect the distribution of parameters into identifiable regional patterns; two similar models may have different optimized parameter sets. It is also likely that an optimized parameter set should be transferred in its entirety and not as individual parameters (Beven and Freer, 2001).
• Limitation due to singularity: the information contained in the optimized parameter set is specific to the modelled roof and thus not transferable to another roof model. This might be due to compensation for data errors (precipitation and/or runoff), as the precipitation measurements are affected by wind and by undercatch during intense rain. Another likely issue is the bias induced in the model by the optimization period and its specific climatic conditions, as well as the model conceptualization error (Poissant et al., 2017).
There is a possible explanation for the mismatch between observed and simulated runoff during the winter period. The SWMM model simulates runoff from rainfall, or snow accumulation from snowfall, according to the observed air temperature. This temperature, however, does not correspond to the temperature inside the medium (substrate) or around the outlet. Negative air temperatures combined with positive medium temperatures may therefore lead to continuously observed runoff while the model accounts for snow accumulation. Additionally, rainfall was measured over a period of 4 months using manual rain gauges (simple non-digital plastic cones that must be handled manually; 8 gauges per roof) in order to examine the rainfall distribution over the roofs and to check whether the roofs received equal amounts of precipitation during the non-snow period. No differences were found between the registered volumes.
However, the volume error, which rose from 3% during the warm period to 29% during the cold period, revealed that the wind effect during the cold period still has to be considered. A limitation of this study is the duration of the dataset, which spans less than one year and includes only one event with a return period greater than 2 years. This may also raise the question of whether it is appropriate to validate the model using periods from different seasons within the same year.
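The equifinality remark above can be illustrated with a deliberately simple toy model. This is not the SWMM model or the optimization used in the study: the rainfall series, the two parameters `a` and `b`, and the acceptance threshold of 0.95 are all hypothetical. Because only the product of the two parameters affects the toy runoff, several distinct parameter pairs reproduce the synthetic observations equally well.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency of a simulated series against observations."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def toy_runoff(rain, a, b):
    """Toy model: only the product a*b matters, so many (a, b) pairs fit equally well."""
    return [a * b * r for r in rain]


rain = [0.0, 2.0, 5.0, 1.0, 0.0, 3.0]     # hypothetical rainfall series
observed = toy_runoff(rain, 0.4, 0.5)     # synthetic "observations"

grid = [i / 10 for i in range(1, 11)]     # candidate values 0.1, 0.2, ..., 1.0
behavioural = [
    (a, b)
    for a in grid
    for b in grid
    if nse(observed, toy_runoff(rain, a, b)) >= 0.95   # hypothetical threshold
]

print(f"{len(behavioural)} parameter pairs reach NSE >= 0.95:")
for pair in behavioural:
    print(pair)
```

Scanning the whole parameter grid, rather than relying on a single optimizer run, is one simple way to see whether a calibrated set is unique or merely one of many equally good candidates.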

Practical implications for environmental management
In light of the results and limitations, one can conclude that the SWMM model can simulate the hydrological performance of green/grey roof solutions. It is a user-friendly tool, which may support urban planners and decision makers in activities related to project planning, implementation, and assessment. From the management point of view, important results are presented in Table 2. The grey roof outperforms the reference black roof and the extensive green roof in terms of stormwater detention, changing extreme runoff to a more natural flow with a low peak and long duration (Hamouz et al., 2018; Mentens et al., 2006; Stovin, 2010). The difference in runoff attenuation between the green and grey roofs can also be seen in the figures in the Results and Discussion chapter (Figs. 2, 4 and 6, and Figs. A7 and A8 in the Appendix). This difference will influence the number of CSO events as well as their duration (Ahiablame and Shakya, 2016). [...] reference black roof during a warm period while the grey roof retained 46 mm. This clearly shows that green roofs should be favoured from a retention point of view.
This study is an important step prior to a large-scale implementation of green and grey roofs in a watershed. More attention should be given to the grey roof, particularly in cool-season locations, owing to its low maintenance needs and the fact that evapotranspiration is a limiting factor for green roofs in cold climates.

Conclusions
In this study, runoff from a green and a grey retrofitted roof was simulated using SWMM version 5.1. The model simulated runoff with a good fit within the calibration period, with Nash-Sutcliffe model efficiency (NSME) values of 0.94 (green roof) and 0.78 (grey roof). Similarly, a good fit was achieved during the validation period, with NSME values of 0.88 (green roof) and 0.81 (grey roof). The model simulated the water balance satisfactorily during the calibration period, with volume errors of 3% (green roof) and 10% (grey roof). A marked deterioration was observed for the green roof within the validation period, with a volume error of 29%, whereas almost no change occurred for the grey roof, with a volume error of 11%. Concerning snowmelt modelling, the calibrated model showed an NSME of 0.56 (green roof) and 0.37 (grey roof), with volume errors of 30% (green roof) and 11% (grey roof), through the winter period. The results indicate that there is a need for further research related to the volume errors during the winter season.
The SWMM model can simulate runoff from the green and grey roofs with a good fit between observed and simulated runoff, but only after calibration and with limitation to the specific local climate. The parameters may deviate with different roof layer build-ups, geometries, and climates, which was confirmed by comparison with other studies and laboratory measurements; they nevertheless remained within the recommendations and limits of green roof standards and manuals. The study provides insight for urban planners, who may use the output from the SWMM model as an aid in the implementation of roof retrofitting in urban watersheds.