A new risk probability calculation method for urban ecological risk assessment

The ecological risk associated with urbanization is of great concern where multiple stressors and risk receptors co-exist. Probabilistic risk characterization methods were rarely applied in past urban ecological risk assessments because of the difficulties in the derivation of theoretical probability distribution functions and the definite integral calculation. Therefore, we proposed a new method which is based on computer simulation and able to facilitate the calculation of risk probabilities. This method quantifies multiple ecological risk-related indicators using ecological models, implements Monte Carlo simulation to calculate the risk probability of single indicators, and applies the copula model to calculate the joint risk probability of multiple indicators. We conducted an assessment of urban ecological risk related to urban surface water environment in Beijing as a case study to validate this method. The results show that the means of surface runoff risk probability, total nitrogen pollutant load risk probability, and comprehensive (joint) risk probability were 0.33, 0.44, and 0.23, respectively, in the areas within Beijing Sixth Ring Road. All three types of risk were at moderate levels in the study areas, but exhibited high spatial heterogeneity and urban–suburban gradient. The average contributions of the three risk types were 25% (surface runoff risk), 32% (total nitrogen pollutant load risk), and 43% (comprehensive risk), indicating that the joint risk was overall the major risk type. In conclusion, our method considering multiple indicators and their probabilistic attributes can handle the uncertainties in ecological models and thus has potential to evaluate different types of urban ecological risks.


Introduction
With the rapid urbanization in the last century, urban ecological risk, defined as the possible adverse effects on urban ecological and environment factors, ecological processes, and ecosystem services caused by urban development, has drawn more and more attention (Perrodin et al 2011, Wang et al 2014. The economy, society, and human welfare in cities may be undermined because of such adverse effects . Previous research has focused on the negative impacts of physical, chemical, and biological stresses induced by urbanization on the urban ecosystem, finding that urbanization can increase the risks of urban flooding, pollution, and species invasion (Luo et al 2011, Tixier et al 2011, Ju and Li 2012. Ecological risk assessment is the process of evaluating the likelihood that adverse ecological effects may occur or are occurring as a result of exposure to one or more stressors (US Environmental Protection Agency (EPA) 1998a). This definition also applies to urban ecological risk assessment, but different from traditional ecological risk assessment, urban ecological risk is characterized by multiple risk sources, multiple risk receptors, and complex exposure pathways, making the risk assessment process more complex and uncertain (Wang et al 2014). Therefore, it is necessary to consider multiple risk indicators to obtain comprehensive results in urban ecological risk assessment.
Facing the need to consider multiple risk indicators, comprehensive index methods were usually used to characterize the urban ecological risk in previous research (Li et al 2008, Gasiorek et al 2017, Kang et al 2018. However, comprehensive indices can only describe the risk qualitatively or semi-quantitatively and reflect whether the risk exists or not. Compared to comprehensive index methods, probabilistic risk characterization is a quantitative method describing the likelihood of risk occurrence, which can better reflect the concept of risk and increase the veracity and practicability of risk assessment results (Kaplan 1997, Solomon et al 2000, Li et al 2017. The key points in using probabilistic risk characterization to urban ecological risk assessment are calculating the single indicator risk probability (SRP) and the multi-indicator risk probability (MRP). According to traditional probability theories, the SRP and MRP can be calculated using integral calculation based on the theoretical probability density function (PDF) of single indicators and the theoretical multivariate probability density function (MPDF) of multiple indicators. However, there are two main difficulties in using traditional methods to calculate the SRP and MRP. Firstly, the theoretical PDF and MPDF of the variables of complex ecological or environmental models are often difficult to derive. Secondly, even though the theoretical PDF and MPDF are derived, it is difficult to calculate the SRP and MRP using definite integral calculation in the situation that the structures of the PDF and MPDF are complex. Because these two difficulties were not completely overcome in the past, there have been a few studies using probabilistic risk characterization in urban ecological risk assessments with multiple indicators (Li et al 2017).
In order to facilitate the application of probabilistic risk characterization in the urban ecological risk assessment with multiple indicators, this study attempted to build a new risk probability calculation method and conducted a case study in Beijing to evaluate the effectiveness of this method. The remainder of this article is organized as follows. Section 2 provides a detailed description of our method. Sections 3 and 4 present a case study using the method and the results, respectively. Section 5 presents some discussions regarding our methods.
2. Methodology 2.1. Core idea of our method We proposed a numerical simulation method, coupling Monte Carlo simulation with the copula model to calculate the SRP and MRP. This method is not dependent on the theoretical PDF and MPDF and the definite integral calculation. Monte Carlo simulation was used to produce enough samples for numerical simulation and calculate the SRP, and based on the calculations of the SRP, we used the copula function to calculate the MRP, because the function can estimate multivariate probability without using theoretical MPDF by fitting samples (Favre et al 2004, Genest et al 2007, Salvadori and De Michele 2007.

Framework of the proposed method in calculating SRP and MRP
The framework of this method is displayed in figure 1. We proposed the following five-step procedure for implementing the calculation of the SRP and MRP: Step 1. Select relevant ecological models that can reflect and calculate the dose-response relationships between stressors and risk receptor indicators according to the risk assessment question concerned.
Step 2. Determine the best fitted probability distributions of all the models' parameters and independent variables.
Step 3. Based on the results from step 2, perform a large number of Monte Carlo simulations for every indicator to calculate all the values of indicators as much as possible.
Step 4. Set risk thresholds for every risk receptor indicator. Calculate the SRP of every indicator based on every risk threshold and every indicator's simulations from step 3.
Step 5. Based on the samples from step 2, select the best fitted copula function, and apply the selected copula function to calculate the MRP.

Calculation of SRP
Assuming that EI denotes the value of a certain risk receptor indicator and EI c is the risk threshold value of EI, the general form for calculating the SRP is given as equation (1) if EI is a positive indicator or as equation (2) if EI is a negative indicator.
(·) F is the theoretical cumulative distribution function of EI. But in our method, to calculate the SRP, the probability distribution functions of model parameters and independent variables (stressors) to calculate the EI of the selected stressors (dose-response relationship) are needed, which can be given as where (·) f is the function calculating EI, q i is the model parameter i, and S i is the independent variable i.
In urban ecosystems, due to the uncertainties of model parameters and input data, the results of model simulation have a large difference from the real value (Refsgaard et al 2007, Skinner et al 2016. In fact, model parameters and independent variables obey their respective probability distributions (van Oijen et al 2011). How to integrate these probability distributions into the calculation of the SRP is critical to improve the assessment accuracy. In this regard, the Monte Carlo simulation has the advantage of fully taking into account the uncertainty of model parameters and independent variables to make the calculation results of SRP more accurate (Ghersi et al 2017). To be specific, in each Monte Carlo simulation, each model parameter and independent variable is randomly sampled from its theoretical probability distribution. According to the law of large numbers, the more times the Monte Carlo simulation is performed, the more possible values of EI are generated, and the closer the risk frequency gets to the risk probability (Haas 1999, Maccheroni andMarinacci 2005).
For example, n EI samples were generated from n Monte Carlo simulations, including m EI samples larger (negative indicator) than or lower (positive indicator) than the risk threshold (EI c ). Therefore, In practical calculation of the SRP, the parameters or independent variables with low uncertainties can be considered as non-random variables and can be excluded in Monte Carlo simulation.

Calculation of MRP
The MRP, which describes comprehensive risk, is defined as the probability that the values of all indicators exceed the risk thresholds. From the point of view of risk damage, interactions between different risk indicators usually need to be considered in the comprehensive risk assessment. When several risk indicators act on the same receptor, the risk damage effect may be higher or lower than the sum of the risk damage effects of the individual risk indicators. Therefore, in the process of calculating the MRP, unless the damage effects induced by different risk indicators are independent, the risk thresholds of each indicator are not the same as those in the calculation of the respective SRP and need to be adjusted based on the relationships between the damage effects induced by different risk indicators.
The relationships between the damage effects induced by different risk indicators include four cases: independent effects, additive effects, positive interaction, and negative interaction (Brzoska and Moniuszko-Jakoniuk 2001, Borgert et al 2004). The way in which these four cases affect the determination of risk thresholds in the comprehensive risk assessments is described in the supplementary materials available online at stacks.iop.org/ERL/15/024016/mmedia.
In theory, the copula model can be applied to calculate the multivariable probability of random vectors in any dimension (Salvadori andDe Michele 2007, Shiau et al 2007). In recent years, the theory of the copula model has been constantly developing (Sadegh et al 2017), and has been widely used in economics, hydrology, and medicine because of its simplicity and accuracy in calculating multiple probabilities Therefore, the copula model has the potential to be applied to urban ecological risk assessment.
As the bivariate case illustrates, the general form of the copula function is: is the joint cumulative probability distribution function of X and Y. ( ) F x X and ( ) F y Y are the cumulative probability distribution functions (marginal distribution) of X and Y, respectively.
In the case that there are two risk receptor indicators X and Y, X mt and Y mt are the respective risk thresholds adjusted based on the relationships between the damage effects on X and Y. The MRP in the four situations can be obtained by using the following formulas.
If both X and Y are positive indicators, If both X and Y are negative indicators, If X is a positive indicator and Y is a negative indicator, Y mt is the copula function, which can be determined based on the fitting to samples of X and Y generated from Monte Carlo simulation.

Question analysis
In this research, we applied the proposed method to evaluate the ecological risk related to surface water environment in the urban-suburban areas of Beijing. In our case study, land use and the urban surface water environment are a risk source and risk receptor, respectively. We selected urban land cover type and impervious surface ratio as two stressors to describe land use. Surface runoff and non-point source total nitrogen pollutant load were selected as two risk receptor indicators to describe the urban surface water environment, respectively. We referred to Heaney et al (1977) and selected the following models.
it F i is the total annual surface runoff on grid i (m 3 ); Q i is the total annual non-point source total nitrogen pollutant load on grid i (kg); A i is the area on grid i (m 2 ); P i is the total annual precipitation on grid i (mm); IR i is the impervious surface ratio on grid i (%); EMC i is the mean concentration of total nitrogen in rainfall events (mg l −1 ); F it is the monitored volume of surface runoff on grid i at time t in a rainfall event (L); C it is the monitored total nitrogen concentration on grid i at time t in a rainfall event (mg l −1 ); 0.001 is the unit conversion coefficient.

Study area
The areas within Beijing Sixth Ring Road (about 2267 km 2 ) were selected as the study area in this research. The artificial surface, which occupied about 66% of the land areas within the Sixth Ring Road in 2015, is mainly distributed in the urban center (Kuang 2012). The land use types of the marginal areas of the city are mainly forest, grassland, and farmland. Such land use spatial pattern indicates that the areas within the Sixth Ring Road exhibit an obvious urbansuburban gradient (Peng et al 2016), and thus it is an ideal area for urbanization research.

Determination of probability distributions of model parameters
According to equations (10) and (11), precipitation and the mean concentration of total nitrogen in rainfall events (EMC) were selected to identify probability distributions, because the other parameters and independent variables are less uncertain.
The uncertainty in precipitation is affected by many factors, especially climate change. In general, without dramatic climate changes, annual precipitation is stationary and follows normal distribution, otherwise it is non-stationary and follows skewness distribution (Yang et al 2002). According to the World Meteorological Organization, data for at least 30 years of meteorological elements and statistical properties of meteorological phenomena can reflect relatively stable climate standards (Moron et al 1995, De Luis et al 2000, Donat et al 2013. Moreover, previous studies found some fluctuations in annual precipitation in the past 40 years in Beijing (Liang et al 2019). Therefore, we used the annual precipitation data for around 40 years before 2015 (1978-2015), which is the same period as the land use data.
EMC is the average concentration of a pollutant in a rainfall event (equation (12)), containing all information of the rainfall event, including the information about the surface runoff and time (figure S1). Meanwhile, the surface runoff also changes with time. Therefore, the uncertainty in EMC is mainly related to the change of time (Dauphin et al 1998). In probability statistics, exponential distribution and Weibull distribution are often used to describe random variables associated with time variations (Lawless and Fredette 2005). To identify which distribution is the best probability distribution of total nitrogen EMC, we used some values of total nitrogen EMC from relevant literature (Wang et al 2004, Ouyang et al 2010.
Firstly, the p-value of the Kolmogorov-Smirnov test was used to test the significance of probability distribution fitting. If the p-value is far greater than 0.05, the tested probability distribution is suitable at the statistical level of significance. Secondly, the adjusted R 2 of the curve fitting was used as the discriminant standard to evaluate the goodness of fit of the probability distributions passing the hypothesis test. The probability distribution whose adjusted R 2 is the closest to 1 is regarded as the optimal probability distribution.
In this case study, the best distributions of annual precipitation and EMC are normal distribution (p=0.47, adjusted R 2 =0.98) and Weibull distribution (p>0.7 and adjusted R 2 >0.75 for all types of land use), respectively (tables S1 and S2).

Determination of best copula function
After determining the best probability distributions of total annual precipitation and total nitrogen EMC, 10 000 Monte Carlo simulations were performed to produce 10 000 surface runoff values and 10 000 total nitrogen pollutant load values, in order to select the best copula function to connect surface runoff and total nitrogen pollutant load. Five copula functions, including the Gaussian copula function, Student copula function, Clayton copula function, Frank copula function, and Gumbel copula function, were widely used in past research (Liu et al 2011, Mirabbasi et al 2011, Zhang et al 2015. Detailed information about these five types of copula function is provided in the supplementary materials. Five copula functions were used to fit the values from 10 000 Monte Carlo simulations. The Kolmogorov-Smirnov test was applied for the statistical tests of these copula functions. The root-mean-square error (RMSE) and Akaike information criterion (AIC) were employed to estimate the goodness of fit between the samples and the five theoretical copula functions. According to the test results (table S3), the Frank copula function was selected as the best copula function in this case study (p=0.91, RMSE=0.0073, and AIC=−99 729).

Risk threshold setting and risk level determination
The risk thresholds of surface runoff were determined referring to the Code for design of outdoor wastewater engineering GB50014-2006 (Ministry of Housing and Urban-Rural Development of the People's Republic of China 2006). According to the literature, different land use types exhibit different risk thresholds of runoff coefficient (table 1). The risk threshold value of the total annual surface runoff is the production of runoff coefficient and the total volume of annual precipitation.
The risk threshold of the total nitrogen source pollutant load was determined referring to Integrated discharge standard of water pollutants DB11/307-2013 (Beijing Municipal Ecological Environment Bureau 2018). According to this standard, the total nitrogen concentration of water discharged into natural water bodies should not exceed 15 mg l −1 . Accordingly, the risk threshold value of the total nitrogen pollutant load is the production of this threshold concentration and total annual surface runoff.
In this case study, the surface runoff is an indicator of urban water quantity, which reflects adverse impacts on the surface of urban areas, such as urban waterlogging. The annual non-point source total nitrogen load is an indicator of urban water quality, which reflects adverse impacts on the downstream water body in urban areas. Therefore, the damage caused by surface runoff and total nitrogen pollution can be regarded as independent, and the comprehensive risk thresholds of surface runoff and total nitrogen pollution load were not adjusted in the calculation of the MRP.
Because of lacking risk level standards for urban ecological risk probability, we referred to the thresholds proposed by the USEPA to determine risk levels (US Environmental Protection Agency (EPA) 2014). According to the thresholds, a risk probability higher than 0.05 indicates that the risk reaches a significant level and needs to be addressed, between 0.05 and 0.1 indicates a low risk level, between 0.1 and 0.5 means a moderate risk level, and higher than 0.5 means a high risk level (table 2).

Risk contribution analysis
In this case study, comprehensive risk probability (MRP) is the intersection of surface runoff risk probability (SRP1) and total nitrogen pollutant load risk probability (SRP2) (figure S2). The contributions from surface runoff risk probability (C SRP1 ), total nitrogen pollutant load risk probability (C SRP2 ), and comprehensive risk probability (C MRP ) to the ecological risk can be calculated based on equations (13)-(15). For explanations of the three equations, see section S5 in the supplementary materials.

Data collection and model running
The data used in this research mainly include land use data, impervious surface ratio data, precipitation data, and EMC of total nitrogen. Among them, the land use spatial map (30 m×30 m) of the research area was obtained from the Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences (Yu et al 2016, Yu andZhou 2017). The land use was classified into six types: forest, grassland, wetland, farmland, artificial surface, and barren (figure 2). Based on the classification of land use, the ratio of impervious surface was calculated in each assessing unit (Kang et al 2018). Precipitation data were downloaded from NOAA (http://www.noaa.gov/web. html). EMC of total nitrogen data were collected from previous research (Wang et al 2004, Ouyang et al 2010.
In our case study, the grid size of the assessing unit is 100 m×100 m. Therefore, the study area was divided into approximately 220 000 grids. 10 000 Monte Carlo simulations and copula function fittings were performed in every grid. Every grid has three risk probability types, including two SRP and one MRP. These risk probabilities were calculated using MATLAB 9.2. Spatial data preprocessing and the mapping of the three risk probabilities were realized in ArcMap 10.5.

Risk assessment results
The ranges of surface runoff risk probability, total nitrogen pollutant load risk probability, and comprehensive risk probability in the study areas were 0 to 0.89 with a mean value of 0.33, 0 to 0.64 with a mean value of 0.44, and 0 to 0.58 with a mean value of 0.23, respectively. According to the averages of risk probabilities in the entire study areas, all three risk types were at the middle risk level.
However, all three types of risk presented high spatial heterogeneity and an urban-suburban gradient. Areas exhibiting a high surface runoff risk, high total nitrogen pollutant load risk, and high comprehensive risk were mainly distributed in the areas within the Beijing Fifth Ring Roads ( figure 3). Furthermore, the areas between the Beijing Fifth and Sixth Ring Roads mainly exhibited no surface runoff and comprehensive risks, but still some regions presented moderate and high risk levels (figures 3(a) and (c)). Differently, large areas showing moderate and high levels in total nitrogen pollutant load risk could be found between the Beijing Fifth and Sixth Ring Roads ( figure 3(b)).

Risk contribution results
The ranges of contributions from surface runoff risk, total nitrogen pollutant load risk, and comprehensive risk in the areas within Beijing Sixth Ring Road were 0% to 31% (mean, 25%), 0% to 89% (mean, 32%), and 0% to 70% (mean, 43%), respectively. Therefore, the joint risk from surface runoff and total nitrogen pollutant load was overall the major risk type in the study area.
The major risk indicator in a certain region was discerned by mapping the contribution proportions of the three types. Comparing the contribution proportions of the three risk types, the contribution proportion of the joint risk from the two stressors was the highest (70% to 85%) and the contribution proportion of the individual risk from surface runoff was relevantly lower (15% to 30%) in the central areas of Beijing (figures 4(a) and (c)). These findings indicate that the occurrence of surface runoff risk is often accompanied with the occurrence of non-point source pollution risk in the areas with high impervious surface ratio. In other words, the central areas of Beijing were mainly affected by comprehensive risk. In addition, the findings can explain why the spatial pattern of the MRP (figure 3(c)) is identical to that of the SPR related to surface runoff ( figure 3(a)). Moreover, in the areas covered by farmland, the contribution proportion of the individual risk from total nitrogen pollutant load was the highest (85% to 90%), suggesting that the predominant contribution to nitrogen load came from agriculture (figures 2 and 4(b)). In contrast, the contribution proportion of the individual risk from surface runoff was very low (<10%) in these areas ( figure 4(a)). These findings suggest that although the volume of surface runoff was low, the nitrogen load was very high on farmland, which may induce negative impacts on surface water and groundwater. Therefore, both surface runoff and nitrogen pollutant load need to be managed and controlled in the central areas of Beijing, and only nitrogen pollution needs to be controlled in the areas covered by farmland.

Discussions
The risk assessment results of the case study show that the spatial patterns of artificial surface, high impervious surfaces ratio, and high comprehensive risk probabilities were similar (figures 2, S3, and 4(c)), indicating that the expanding of artificial surfaces is a major stressor of the urban water environment. These  In addition to risk stressors, the non-stationarity of important parameters would affect the risk assessment result. For example, if the climate has drastic changes in Beijing, extreme events of precipitation or drought would occur. In this situation, the surface runoff risk probability would not be the same as the results of our case study. In fact, according to historical data on the climate in Beijing, the annual precipitation was stationary in the past 40 years as it follows a normal distribution (Yang et al 2002). Therefore, the results would not be significantly affected by climate change in our case study.
Actually, our method considers the uncertainties associated with different factors, including the uncertainty related to climate change. Even though the uncertainties associated with some variables are large, a relatively accurate risk probability can be calculated by increasing the number of samples for Monte Carlo simulation (Maccheroni andMarinacci 2005, Ghersi et al 2017). In other words, they can be handled in the ecological risk assessment in complex urban ecosystems by using our method.
Risk threshold setting is also important for the risk assessment results. In our case study, the risk thresholds of the surface runoff and pollutant load were not adjusted in the multiple risk assessment, because the damage effects of the two single risk indicators are approximate independent. In fact, in the case that there are interactions between the damage effects of different risk indicators, adjusting the comprehensive risk thresholds will increase the rationality and accuracy of risk assessment results. In general, when the risk receptors are the same, there may be positive or negative interaction between these risk indicators. In this situation, the risk threshold of a single risk indicator in the multiple risk assessment is more stringent or looser than that of the corresponding indicator in the single risk assessment (Borgert et al 2004). For example, as important indicators of urban thermal environment, surface temperature and net radiation flux enhance each other (Kato and Yamaguchi 2005). Therefore, the risk threshold of surface temperature (or net radiation flux) in the calculation of the MRP should be lower than that in the calculation of the SRP associated with surface temperature (or net radiation flux). For an opposing example, urban surface runoff can effectively reduce urban surface temperature (Jimenez-Munoz andSobrino 2003, Olivera andDeFee 2007), thus the risk threshold of surface temperature in the comprehensive risk assessment can be higher than that in the single risk assessment associated with urban surface temperature.

Conclusions and outlooks
We have presented a new method that connects Monte Carlo simulation and the copula model to calculate the risk probabilities of the SRP and MRP for urban ecological risk assessment. The method requires ecological models and the probability distributions of the models' parameters and independent variables. Compared with traditional risk probability calculation methods, this method does not need the derivation of theoretical probability distributions of the single indicators and multiple indicators, and calculates the SRP and MRP by numerical simulations rather than definite integral operations. In addition, the proposed method can calculate the risk contributions of different risk indicators in different areas, has strong practicability in a variety of ecological risk probability calculations, and can provide scientific support for further risk management. Further applications of our method can look at other urban ecological risk types, more than two risk receptor indicators, and the applicability of the method in other regions.