A process-based flood frequency analysis within a trivariate statistical framework. Application to a semi-arid Mediterranean case study

This paper proposes a trivariate methodology for flood frequency estimation. It combines the flood peak, storm magnitude, and initial soil moisture condition (ISMC) as the main flood-related statistical variables to be considered. The semi-arid Mediterranean “ Rambla del Poyo ” catchment has been used as a representative case study where the influence of the spatio-temporal variability of the storms and the ISMC on floods can lead to differences of up to two orders of magnitude in quantiles when the most commonly used methods are applied. In order to incorporate the main flood-generating mechanisms, the integrated use of a multidimensional storm generator with distributed hydrological modelling is proposed. Flood quantiles are then estimated by combining the maximum flows with the storm magnitude and ISMC in a trivariate probability distribution function through the application of Bayes ’ theorem and Lagrange ’ s Mean Value theorem. Although the methodology proposed in this paper has been applied and tested in only one case study, it can be extended to other case studies due to its process-based orientation.


Introduction
Flash floods, common events in the Mediterranean hydro-climatic region, exert major socio-economic impacts (Gaume et al., 2009).Such events are preponderant in this area due to the convergence of relevant factors such as high mountain ranges close to the sea, high intensity convective storms (especially in autumn), small catchments with sharply differentiated steep slopes, sparse vegetation, thin soils, and permeable rocks (Marco, 1995).In the Western Mediterranean, recent studies have revealed a positive trend for extraordinary flashflood events for the 1900-2011 period (Llasat et al., 2016).In fact, according to the EM-DAT database (https://www.emdat.be), in the Spanish Mediterranean area there have occurred 4 of the 10 largest catastrophic flash-floods events registered in that country for the period 1960-2019.In addition, an intensification of torrential rain events in autumn is to be expected for this region due to climate change (Millán, 2014).For this reason, a process understanding is required for flashflood risk management because the dominant processes of the generation of runoff may change with the increase of storm severity (Borga et al., 2011).
Within the framework of flood risk management, one of the main studies supporting flood hazard assessment is that of flood frequency analysis.It is related to the so called "design flood estimation" and can be defined as the value of the peak flow rate corresponding to an assigned non-exceedance probability, usually expressed in terms of return period (Stedinger et al., 1993).In several cases, the requirements are high return period flood quantiles to ensure maximum safety levels or a socially acceptable level of risk.For example, the Flood Directive (European Union, 2007) establishes that the scenario of medium probability of flooding should have a return period greater than or equal to 100 years, and higher for that of low probability of flooding.Spanish legislation establishes that the scenario for this low probability of flooding should correspond to a return period of 500 years (España, 2010).
In order to estimate high return period flood quantiles, there are many methods that can be roughly grouped into the following 3 categories: statistical or probabilistic, deterministic, and hybrid or mixed.Probabilistic methods are based on the existence of recorded data that characterises flooding at the point of interest.Its "standard approach" is based on the adjustment of a mathematical distribution function of probability to the sequence of maximum flows recorded at the site of interest and on the extrapolation of this function tail for very low exceedance probabilities (Klemeš, 1993).Within the "deterministic" group, the most widely applied method is based on the use of eventbased rainfall-runoff models, where the return period of extreme precipitation design events is equivalent to those of the extreme flows generated by those events.This is called the "iso-frequency hypothesis".The hybrid or mixed group is based on the use of the above groups of methods, either through their combination or by adding sources of information or techniques, and they are generally referred to as "derived flood frequency" approaches.
Despite the widespread use in engineering and mathematical refinements of the "standard approach" of the probabilistic methods, there remains extensive debate regarding its high degree of uncertainty.Francés (1998) poses that this approach can give rise to highly variable results due to: the uncertainty of the statistical model; the data recording errors (principally on large floods); the high variance and asymmetry of the maximum flows; and the length of the time series.The latter constitutes the major issue, since the extrapolation of short time series to estimate high return-period flood quantiles may lead to major errors (Klemeš, 1993).Indeed, flash floods very often occur in poorly instrumented or ungauged basins (Gaume et al., 2009), which directly influences the method to be used.In fact, the length of the time series and the low density of monitoring stations constitute the main source of uncertainty and one of the main challenges to be faced in flood frequency analysis in arid/semi-arid regions (Metzger et al., 2020).In the case of the "standard approach" of the deterministic methods, there are, among others, three critical hypotheses according to Rogger et al. (2012): the choice of the design rainfall hyetograph; the iso-frequency hypothesis; and the selection of initial soil moisture conditions (ISMC, hereinafter) previous to the flood-generating storm.
Among the so-called "hybrid or mixed methods", is the pioneer work given by Eagleson (1972) who introduced the concept of the "derived frequency curve" based on a simple outline of understanding the hydrological processes that generate a certain frequency of events using precipitation records and rainfall-runoff models.Within this typology, certain authors suggest that, for flood frequency analysis, continuous simulation approaches are preferable to those that are event-based (Verhoest et al., 2010;Grimaldi et al., 2012;Lamb et al., 2016;Grimaldi et al., 2021).As emphasised by Grimaldi and Petroselli (2015), event-based approaches take advantage of the digital information and the computational power currently available, which has positioned these approaches as an effective way to sidestep the problems inherent in widely used methods such as the rational formula.However, one of the main drawbacks of this approach involves the need to select the ISMC prior to the storm, which is one of the reasons that gives the continuous frameworks an advantage over those that are event-based (Winter et al., 2019).Nevertheless, as pointed out by Astagneau et al. (2021), common failure in predicting floods is still encountered due to seasonal model bias.Furthermore, this approach cannot be generalised to all types of hydrological regimes, such as ephemeral systems in Mediterranean areas, in which there are no flows for most of the year (Marco, 1995;Metzger et al., 2020) and floods are sporadic (Boughton and Droop, 2003).In our experience, moreover, the current practical difficulties involved in implementing such weather generators should be considered (Beneyto et al., 2020;Beven 2021), as should, to a lesser extent, the computational time required for the long continuous subdaily simulations (Yu et al., 2019); a way to reduce this computational time is given in Grimaldi et al. (2021).
Given the disadvantages of the aforementioned hydrological modelling practices for flood frequency analysis in Mediterranean areas, this paper proposes a hybrid approach following the "combinatorial approach" reasoning as laid out by Klemeš (1993).The idea is to involve the main flood-generation factors (see Section 3.1) in the probability through an innovative trivariate methodology (see Section 3.2).As Grimaldi et al. (2021) have underlined, it could be a contradiction to use a complex approach in case studies without sufficient data for calibration and validation.However, the contribution of a process-based approach, such as the one in this paper, is that frequency estimation is supported by the generation of pieces of information that take full advantage of the available digital information of the catchment, scientific knowledge of the hydrological cycle, and modelling tools existing today.
Recent studies highlight both the potential and the challenges of process-based approaches.Perez et al. (2019) proposed an estimation of peak flow quantiles in Iowa (USA) using representations of regional peak flow distributions derived from a combined framework of stochastic storm transposition, radar rainfall observations, and distributed hydrological modelling.These authors point out that one of the aspects to be considered in future studies involves the development of representations of seasonally varying hydrological processes of a more realistic nature and their interaction within synthetic peak flow simulations.In the same vein, Yu et al. (2019) emphasise the advantages of using process-based approaches for frequency analysis; they draw attention, however, to the need to properly consider the effect of seasonality in two ways: input variables and process variation within the modelling.The effect of the structure of the hydrological model on flood analysis should also be borne in mind.As has been pointed out in several studies (Zhu et al., 2018;Sun et al., 2018;Tarasova et al., 2020), the spatial interaction between rainfall structure and soil moisture exerts significant impacts on the hydrological response to floods.The use of lumped models may therefore introduce a limiting factor in the prediction.In fact, such an effect has been pointed out by Astagneau et al. (2021) as one of the possible reasons for the deficiencies found in their study of a large set of catchments in France.
In light of the above challenges identified in the literature, this study S. Salazar-Galán et al. presents an innovative contribution based on the identification of the main processes to be considered and a way to combine them statistically through a trivariate methodology (see Section 3).This is based on the exploitation of information generated from a multidimensional stochastic storm model and hybrid distributed hydrological modelling: a continuous daily model to identify the predominant ISMCs of registered flood events that considers their seasonality, and an event-based subdaily model to generate flood events using synthetic storms and the predominant ISMCs.In order to verify the validity of the proposed trivariate methodology, it was applied in a representative case study in Spain where a catastrophic flood occurred in the autumn of 2000 and where hydrometeorological data are available for the validation of the initial hypotheses (see Section 2).The rest of the paper is organised as follows.The fourth section summarises the results obtained from the applied methodology.The fifth section presents a discussion considering the findings achieved.Finally, the sixth section presents the main conclusions.

Case study
The "Rambla del Poyo" catchment is part of the Jucar River Basin Authority (Confederación Hidrográfica del Júcar, CHJ, hereinafter).This catchment is located in the region of Valencia (Spain), near the coast.Its principal channel flows in a northwest/southeast direction into Lake Albufera (see Fig. 1) for approximately 43.5 km.As can be seen in Fig. 1, the lower part of the catchment is very close to the western side of the city of Valencia where part of its metropolitan area, including its international airport, lies within the "Rambla del Poyo" floodplain.
The catchment area up to its mouth at Lake Albufera measures 430 km 2 .At the headwaters, the slope is greater than 16.4%, and in the floodplain the slope is less than 2.7%.According to the lithostratigraphic and hydrogeological map of Spain (IGME, 2010), the catchment has the following characteristics: i) in the headwaters, there are predominant carbonate formations of moderate to high permeability; ii) in the central part, detritic formations of low permeability exist; iii) in the lower part, there are quaternary detritic formations of moderate to low permeability, and to a lesser extent, carbonate formations of moderate to high permeability; iv) the principal channel flows from the head to the lower part through quaternary detritic formations of very high permeability.According to the soil maps of the Valencian region (Generalitat Valenciana, 1996), there is a variety of taxonomic units which can be grouped, in accordance with level one of the FAO legend, in the following main categories: Leptosols and Luvisols at the head, Calcisols in the central part, and Fluvisols, Luvisols and Regosols on the floodplain.According to the CORINE land cover map (EEA, 2019), the catchment had the following coverage: 10% urban areas, 63% agricultural areas, 23% shrubland, 1% pine forests, and 3% water bodies.
The climate in "Rambla del Poyo" is semi-arid Mediterranean, with a mean annual rainfall of less than 500 mm and mean annual reference evapotranspiration of approximately 1,100 mm, which results in an ephemeral river.According to Francés (1998), in the Spanish Mediterranean catchments there are two kinds of floods: ordinary and extraordinary.The "ordinary floods" are generated by the most frequent types of rainfall, with low or medium convectivity, which produces minor floods.The "extraordinary floods" are less frequent but larger and are generated by heavy convective rainfall events occurring mainly during the autumn.
Regarding the hydrometeorological data available for this catchment, there are two sources of information.One of these is from the network of the Spanish Meteorological Agency (Agencia Estatal de Meteorología, AEMET, http://www.aemet.es)with daily data available for the period 1951-2019.The other network is the Automatic Hydrological Information System of the CHJ (SAIH-CHJ, hereinafter, http://sa ih.chj.es/chj/saih/) with daily data available on flow and precipitation from 1988 to 2019 at the Poyo station, which has a sub-catchment area of 184 km 2 .Observed hydrographs of the maximum events that occurred in the period 1988-2012 are available in five-minute time discretization (9 events with complete hydrometeorological data).Given the differences in spatial and temporal coverage of the available data, the period to be considered for the hydrological modelling (see Section 3.2.2) is 1951-2012 for the daily resolution and 1988-2012 for the sub-daily resolution.

Methodological framework
In this paper, a methodological framework is proposed based on the physically possible combinations of the principal characteristics involved in the behaviour of flash floods at the catchment scale, and their integration using a multivariate statistical approach.Following Klemeš (1993), this framework has two parts: estimation of the magnitude of an event, and estimation of its probability.To this end, the following flood-generating processes and methods are considered in order to obtain the flood quantiles in the catchment under study.

Main factors affecting the hydrological response to flash-floods
Floods are the result of numerous hydrological processes, characterised by significant levels of spatial and temporal variability, where the generating factors mainly depend on hydro-climatic characteristics of the study area (Merz and Blöschl, 2008).The case study analysed herein, "Rambla del Poyo", typically represents the predominant hydrological processes in small semi-arid Mediterranean catchments.
Flash floods frequently take place in these types of catchments, mainly during the autumn season (Merheb et al., 2016), resulting from heavy precipitation events developed in mesoscale convective systems (Romero et al., 1998;Pastor et al., 2010).As reported by Peñarrocha et al. (2002), several daily precipitation values over 800 mm have been recorded in the region where the case study is located.Runoff is mainly overland flow generated by infiltration excess.However, mixed generation mechanisms of runoff can also occur depending on the water retention capacity of the hillslopes and the ISMC (Calvo-Cases et al., 2003).It is particularly important to take this last issue into account in the modelling process.In fact, as highlighted by Perrin and Tournoud (2009), subsurface flow is sometimes ignored in flash-flood analysis, although flow registers show that hydrograph can continue for several hours (or several days) after the rain has actually stopped.
One of the key characteristics for the correct modelling of flash-flood events in Mediterranean regions is the space-time variability of convective storm intensities.As suggested by Sangati et al. (2009), an incorrect representation of the spatial variability of rainfall intensities can yield errors of up to 35% in peak flow predictions.Likewise, these authors also found that such errors could be even larger in the case of dry ISMC.
Indeed, ISMC plays a fundamental role in the runoff generation processes, since this condition controls infiltration capacity at the beginning of the storm.Several studies refer to the key influence of ISMC on the flood processes in the Mediterranean catchments (e.g., Marchi et al., 2010;Efstratiadis et al., 2014;Merheb et al., 2016).Mateo Lázaro et al. (2014) place the ISMC as the most influential factor in the magnitude of flash floods produced by the same amount of rainfall.These authors also remarked on the crucial role played by the temporal distribution of rainfall intensities on the hydrograph shape and magnitude of the flash floods.In the Valencian region of Spain, Camarasa Belmonte and Segura Beltrán (2001) found that the runoff coefficients in the ephemeral streams are highly variable and mainly depend on the rainfall intensity and the ISMC.
Finally, the spatial variability of the geomorphological characteristics and the river channel network connectivity of the catchment have also been identified in the literature as decisive factors to suitably model flash-flood response (Marchi et al., 2010;Mateo Lázaro et al., 2014).

Proposed trivariate methodology
Fig. 2 shows a diagram of the sequence to follow in the methodological framework proposed in this paper.Solid lines indicate input data, dashed lines are results, which in turn serve as input in another task.Accordingly, the proposed trivariate methodology needs the implementation of two models: (i) a stochastic multidimensional sub-daily model for the generation in space and time of convective storms; (ii) a distributed rainfall-runoff model to reproduce the existing spatial variability at two different temporal scales.The first is a continuous model at daily time scale for the identification of the predominant ISMCs of registered flood events, while the second is an event-based model at subdaily time scale to generate flood events using the synthetic storms and the predominant ISMCs.
The trivariate methodology is made up of the following tasks: (a) a storm frequency analysis based on regional records of the annual maximum of daily point precipitation, in order to assign probabilities regarding the magnitude of the synthetic storms (represented by the continuous random variable R, the point-rainfall daily equivalent); (b) an ISMC frequency analysis based on the results of the continuous model (which is a discrete random variable denoted by H); and finally (c) the estimation of the flood quantiles based on a trivariate statistical approach that takes into account the generated peak flows (called X) obtained in synthetic flood generation and the probability distribution function of both R and H.

Storm generator
Blazkov and Beven (1997) pointed out the practical interest of using stochastic models to generate representative ensembles of storm events.The aim is to obtain a wide range of storms suitably defined with a fine space-time resolution.In fact, a significant number of major storms (large return period T) are required by the proposed methodology.Indeed, such requirement is not possible to fulfil using only recorded historical events, as there are only a few or none of such events available that can be associated to large T values.Therefore, the strategy adopted in this research involves the generation of high-resolution space-time rainfall scenarios from a suitable stochastic rainfall model, whereby certain internal features and statistics observed in the most intense rainfall events recorded in the study region are reproduced.To this end, the multidimensional stochastic model proposed by Salsón and Garcia-Bartual (2003) has been used.
The model proposed herein was specifically conceived to reproduce structural properties of torrential convective storms characteristic of the  Mediterranean regime.Consequently, the rainfall field is conceptualised based on the description of convective cells, which are known to be the most relevant features of these types of rainfall events.Moreover, a suitable representation of these cells is crucial for the reproduction of the observed spatio-temporal rainfall variability.Following previous modelling research lines, rainfall intensity field results from the observed cellular structure associated with ensembles of rain cells (Waymire et al., 1984;Sivapalan and Wood, 1987;Willems, 2001).This family of Poisson cluster-based rainfall models, which also include popular temporal rainfall models (e.g., Rodriguez-Iturbe et al., 1987;Cowpertwait, 1991;Velghe et al., 1994), have been successfully applied in a wide range of practices for hydrological risk assessments (for example in Kim et al., 2014).
The identification, parametrisation, and shape description of rain cells has received major attention in a variety of experimental studies (Rigo and LLasat, 2004;Peleg and Morin, 2012).In the proposed model, the spatial location of the cell, the time of birth, and cell-centre rainfall intensity are random variables, while the individual cell intensities have a deterministic description accounting for the rain cell phases of growing, maturing, and gradual decay.The maximum intensities occur in the centre of cells and are assumed to follow an exponential distribution (Rodriguez-Iturbe et al., 1987).The validity of this hypothesis has also been proved in recent experimental studies (Goudenhoofdt et al., 2017).For the case of the Mediterranean coastal area, where our case study is located, the sample of maximum intensities corresponding to the 73 rainstorms reported in García-Bartual and Andrés-Domenech (2017) is analysed.Fig. 3 shows the frequency histogram of maximum intensities of the sample of rainstorms.The exponential distribution shows a satisfactory agreement with the data.Further details of the model, including analytical expressions for the covariance function for both the rainfall intensity and cumulative rainfall process, can be found in Salsón and Garcia-Bartual (2003).Büchele et al. (2006) recommend the combined use of statistical theory with the knowledge of the characteristics of the hydrological processes resulting from the use of rainfall-runoff models to reduce the levels of uncertainty inherent in the estimation of extreme values and to achieve reliable results.In the case of Mediterranean catchments, the use of distributed hydrological models is convenient since it is possible to consider both the effect of the high spatial-temporal variability of storms and the spatial variability of the catchment attributes that together shape the hydrological response.In this paper, the conceptual and distributed hydrological model TETIS (Francés et al., 2007) has been used for the establishment of the catchment water cycle.This has proved to be robust for the representation of the main hydrological processes in the catchment under study (Bussi et al., 2013;Salazar et al., 2012).

Distributed hydrological modelling
TETIS represents the spatial heterogeneity of the physical characteristics of a catchment by means of regular grid cells and their related model parameters.The conceptualisation of the model is an interconnected three-dimensional grid representing the main hydrological processes.The basis of the calculation involves the estimation of the water balance for each cell, and assumes that the water is distributed across six interconnected conceptual storage tanks, plus a seventh tank representing the river channel when it exists in the cell.In this way, TETIS considers the principal mechanisms of streamflow generation: infiltration-excess and saturation-excess overland flow, interflow, and base flow resulting from groundwater discharge.
For the case study, this paper proposes the following modelling strategy: while keeping the continuous modelling advantages for ISMC estimation with a broad temporal discretization (a daily resolution), the event-based modelling approach is used with a much finer discretization (ten-minute resolution) to reflect the high spatio-temporal variability of flood-generating storms.In both cases, the spatial discretization used in TETIS is 100 m.

Flood quantile estimation through a trivariate statistical approach
The problem to be solved involves the quantification of the marginal probabilities of the random variables of the flood-generating processes (storm magnitude and ISMC, represented by R and H, respectively) and subsequently its combination with the flood peak X based on a multivariate statistical approach, as explained below.

3.2.3.1.
Estimating the probabilities of synthetic storms.This task refers to the estimation of the frequency of occurrence of each synthetic storm, expressed in terms of return period.It is clearly impossible to assign a single probability to the entire storm due to its intrinsic multivariate characteristics.A practical procedure is therefore proposed for the reduction of a multidimensional synthetic storm to a single point-rainfall daily-equivalent (R).The exceedance probability of R, and hence the corresponding return period, can be obtained with an extreme value distribution previously adjusted to the observed annual maximum daily precipitation within the catchment.Of course, in order to reduce the uncertainty associated with estimating probabilities, the use of a regional frequency analysis is recommended (a discussion of methods is given in Svensson and Jones, 2010).

Estimating the probability of ISMCs.
Given that floods in Mediterranean ephemeral systems are sporadic and mainly occur in autumn, the ISMC has been considered as a random variable H with a discrete probability distribution (see an analysis into the influence of ISMCs on flood frequency distribution in De Michele and Salvadori, 2002).The results obtained from the implementation of the continuous hydrological model form the basis of the estimation of the frequency of H, given a threshold for the daily discharge to select the flood events.In this task, it is also necessary to verify whether a statistically significant correlation exists between R and H.In the case of "Rambla del Poyo", there is independence between these two random variables (see Section 4.4).For this reason, this paper assumes independence between H and R in the trivariate flood frequency analysis.

Flood quantile estimation.
The methodology for the frequency analysis of the maximum flows (X) is based on the existence of a joint probability distribution function of X , R (the point-rainfall dailyequivalent of the synthetic storms) and H (the predominant ISMC for significant floods).As explained in the previous paragraphs, it is possible to estimate the marginal distribution functions of R and H. Therefore, the objective is to determine the marginal distribution function of X from a sample of × , each of which is conditional on given values of storm magnitude r and ISMC h obtained with the simulations of the subdaily event-based hydrological model.By application of Bayes' theorem and if H and R are independent, then the joint density function of the trivariate model can be expressed as: where π j is the probability of H = h j with j = 1, … m.From the previous equation, the marginal distribution of X is given by: The latter can be rewritten to be evaluated by intervals (Ri, Ri + 1), as follows: According to Lagrange's Mean Value Theorem, an approximation of the previous integral can be carried out in this way: where r* is an intermediate point in the interval of integration.Finally, S. Salazar-Galán et al. the value of the continuous distribution function can in turn be approximated by a plotting position obtained from the sample of x values: where n ij (x) is the number of observations less than or equal to x within the range [Ri, Ri + 1] for the ISMC h j , and N ij is the total number of observations under the same conditions.

Synthetic storm generation
Thanks to the model formulation, it is possible to verify its capability to reproduce observed empirical statistics of the historical records, such as mean, variance, spatial correlation, point-rainfall temporal autocorrelation, duration of the event, maximum intensities distribution, normalised mean function, and total cumulative rainfall.Other statistics were also computed after the synthetic generation in order to verify the credibility of the generated events.For instance, derived synthetic intensity-duration curves for every point of the spatial grid.
Being essentially event-oriented, the outcome of the model consists of a number of independent synthetic rainfall events, each of which is a continuous space-time random field, with rainfall intensity values defined with time resolution of 10 min and spatial resolution of 1 km 2 .This choice follows previous research on hydrological modelling in the studied catchment that considers the effect of the spatio-temporal variability of storms on hydrological response (Guichard- Romero et al., 2009).
The estimation of model parameters follows the procedure proposed in Salsón and Garcia-Bartual (2003) at the regional scale.In particular, the Method of Moments is applied, whereby the mean and variance of the accumulated total rainfall, the spatial correlation function, the temporal autocorrelation in selected rain gauges, and the normalised mean function (temporal evolution of cumulated rainfall normalised by total cumulative rainfall) are all considered.High resolution rainfall data from the regional automatic hydrological information system (SAIH-CHJ) is used for parameter estimation.Fig. 4 shows both the autocorrelation of the temporally averaged intensity process and the correlation of total depth based on the event that occurred in October 2000, which is the most extreme event recorded to date by the SAIH-CHJ.
The stochastic rainfall model was employed for the generation of 100 synthetic storms.This ensemble provides a wide range of space-time rainfall intensity fields.Each of the synthetic events was carefully analysed to check its feasibility.To this end, several derived quantitative properties were checked, including point-and areal-derived hyetographs, intensity-duration curves, maximum intensities reached, and maximum cumulative rainfall quantities.Consequently, each of the generated synthetic events can provide an animation showing the evolution of the space-time rainfall intensity field.Fig. A1 in Appendix A shows an example of the spatial evolution of the intensity field for the most intense 6-hour period.In this example, there are 4 storms with a similar R with its relative hydrological response at the gauging station (see Fig. A2).As supplementary supporting material, the animation sequence of the whole storm duration of Appendix A is presented in Appendix B. Thus, the result of this modelling step was a representative and realistic collection of space-time storms which serve as input for both synthetic flood generation and trivariate frequency analysis.For the generated family, the ten-minute maximum cell intensity ranged from 26 to 95 mm/h.The spatial density of rain cell occurrence varied between 0.002 and 0.06 cells/km 2 .The duration of the storms exhibited a large variation, ranging from 1.5 h to 262 h.Finally, the areal average rainfall depth among the different events generated also presented a significant variation, ranging from 37 to 868 mm.

Implementation of the distributed hydrological model
The effective parameters of the model were obtained following a separate structure (see details in Francés et al., 2007) consisting of: i) maps of parameters estimated a-priori based on all available spatial information; and ii) a correction factor for each parameter map.In the calibration process, only the correction factors were adjusted.The suggestions of Klemeš (1986) were followed for the calibration-validation of the model in a split-sample test.A distinction between the modelling objectives of the different implemented temporal resolutions was made in order to accept each hydrological model.The aim of the continuous model (daily resolution) is to obtain the ISMC prior to flood events.For this reason, the acceptability criteria used were bias in volume (PBIAS_V %), the Nash-Sutcliffe efficiency index (NSE), and the ratio of the root mean square error to the standard deviation of measured data (RSR).The aim of the event-based model (ten-minute resolution) is to obtain the hydrograph at the point of interest for each combination of synthetic storm and ISMC.In this case, the acceptability criteria used were the NSE and the RSR.In addition, the bias in peak flow (PBIAS_PF%) and time to peak (BIAS_TPmin) were employed to check the behaviour of the hydrograph at its maximum value.
Daily data on rainfall and maximum and minimum temperatures from the AEMET network allowed the establishment of a continuous hydrological model for the period 1951-2012.However, there has been only one flow gauge station located in the central part of the catchment since 1988.The data from October 1990 to September 2003 was used for automatic calibration with a short warm-up period (aquifer is not connected), while the remaining data was used for validation (October 2003-September 2012).The calibration showed a PBIAS_V% of 4.5%, the NSE was 0.85, and the RSR was 0.14.Within the validation period, the PBIAS_V% was 19%, the NSE was 0.64 and the RSR was 0.75.According to Moriasi et al. (2015), these results should be considered suitable to achieve their purpose: the sample of ISMCs.
The range of maximum annual events recorded at sub-daily resolution by the SAIH-CHJ was considered for the establishment of the eventbased model.In all the events, the ISMC was taken from the proper date in the continuous modelling.For the calibration, the maximum recorded event of the available data (the October 2000 event) was used.For this event, the NSE was 0.85 and the RSR 0.39 (see Fig. 5) with a PBIAS_PF% of 9% and a PBIAS_TPmin of 10 min.The results of the model validation, considering the remaining events with sub-daily data for the 1988-2012 period (9 events in total considering the quality of the hydrometeorological records), showed values of the NSE index between 0.55 and 0.68 and between 0.57 and 0.69 for the RSR.The PBIAS_PF% were between 18% and 35% and the BIAS_TPmin were between 20 and 40 min.
Considering the reference values suggested by Moriasi et al. (2015), the results in terms of NSE and RSR, in calibration and validation, showed that the catchment model is acceptable for the established objective: the sample of peak flows.In all the simulated events, the results showed that the predominating runoff generation mechanism is the infiltrationexcess overland flow (rainfall exceeds infiltration capacity).However, in the case of large events such as the calibration event, mixed generation mechanisms of runoff were found with additional mechanisms such as saturation-excess overland flow and interflow.

Estimation of the exceedance probability of the magnitude of synthetic storms
The magnitude of each multidimensional synthetic storm was represented by a single point-rainfall daily-equivalent (R), which can be obtained as the areal averaged maximum daily precipitation of the storm, divided by the areal reduction factor of the catchment.The magnitude of the synthetic storms R ranged from 37 to 670 mm.
A regional study at a national scale of the annual maximum daily precipitation recommended by the Spanish authorities (Ministerio de Fomento, 2016) has been used to assign exceedance probabilities to the generated Rs.In this regional study, the probability distribution function is the square-root exponential type extreme distribution (SQRT-ET-max) proposed by Etoh et al. (1987).See Salas and Fernández (2007) for a discussion on the relevance of the use of the SQRT-ET-max distribution in the Spanish Mediterranean region from a regional frequency analysis point of view.This distribution has two parameters which were obtained using: i) the coefficient of variation of the homogeneous region from the original regional study (Ferrer and Ardiles, 1995); and ii) the local mean value obtained for this research from AEMET data (1951-2012 period).

Estimation of the probability of ISMC
The predominant ISMCs were obtained from the continuous hydrological modelling.To this end, the maximum annual flood events were located and their corresponding ISMCs were then selected.The ISMC prior to each event was obtained from the TETIS model state variable called "relative soil moisture content", which represents the quantity of capillary water contained in the upper part of the soil from wilting point (0%) to field capacity (100%).This variable has provided valuable information on the statistical behaviour of the ISCMs of floods over a long period of time (see Fig. 6).Fig. 6 shows that, for events occurring outside autumn, ISMCs must be greater than 40%.This result is consistent with expected higher correlations between flood-producing storms (R) outside autumn, typically of low and medium convective nature, and their ISMCs (H).In contrast, in autumn, when flood-generating storms generally occur with heavy convective rainfall events, there is a high dispersion in the R and H relationship.
Two rank correlation methods, Spearman (Spearman, 1904) and Kendall (Kendall, 1938), were applied to verify whether a statistically  significant correlation exists between R and H.To consider the behaviour of the maximum storms observed during the seasons of the year, these tests were carried out: i) for all events without seasonal distinction; and ii) by making a distinction between those events occurring in autumn and those occurring in other seasons.It was established as a null hypothesis that rainfall is independent of the ISMC.This was verified to a 95% confidence interval.If the entire sample is analysed without seasonal distinction, the null hypothesis is rejected in both tests.However, making the seasonal distinction, the independence between rainfall and the ISMC is accepted in both correlation methods.
For autumn, the focus in this case study, a "dry ISMC" and a "wet ISMC" were established using the 40% needed for a significant flood out of autumn as the threshold.The representative (sample mean) soil moisture content for these two states are 10% for dry and 70% for wet, and their probability is 0.4 and 0.6, respectively (horizontal lines in Fig. 6).

Flood quantiles estimation
Two hundred synthetic hydrographs were generated using the eventbased hydrological model with a combination of the one hundred synthetic storms and the two representative ISMCs (dry and wet) as input data.Fig. 7 shows the application of the trivariate plotting position given by Eq. 5, which refers to the location of the flow gauge station.The quantiles are plotted with an empty red circle, which is filled by two different colours to visualise dry (orange-filled) and wet (brown-filled) events.The method has been validated by comparison with the standard statistical approach using all the observed annual maximum floods (period from 1988 to 2019) in two ways: with the plotting position of the observations (following the compromise formula suggested by Cunnane (1978); and the parametric best-fit flood frequency distribution (log-Pearson Type III -(LPIII) using the Maximum Likelihood method for parameter estimation) with its respective 90% confidence interval.Due to the relatively short length of the observations (42 years), these comparisons are nonsensical for high return periods.

On the limitations of the statistical approach
As was previously explained in the introduction, the standard approach of the statistical methods can lead to major errors, among others, due to the sample length.As can be seen in Fig. 7, the plotting position of the observations show a plateau and a step change in slope because there is one extraordinary flood observed in the period 1988-2019, which also conditions the fitted LPIII distribution function.Fig. 7 shows that for low-return periods (quantiles less than 10 years), the slopes of the frequency curve of the statistical methods and the trivariate method are similar.For flood quantiles between 10 and 100 years, the LPIII smooths the actual behaviour of the catchment represented by a plateau followed by a step change in the observed plotting position probabilities and a lower slope in the trivariate method.For the estimation of floods with a high return period, the use of the best distribution function adjusted to the observed data results in a wide range of uncertainty (grey dashed line in Fig. 7).This result reflects the limitations of using a standard statistical method when extrapolating extreme values outside the length of the observed period.In fact, in the case of catchments with flash-flood regimes with high sample variability such as those in the Mediterranean, this is the main limitation as these are usually ungauged or have short time series and poor spatial representativeness (Gaume et al., 2009;Metzger et al., 2020), as in our case study.
The plateau followed by a step change in the plotting position of the observations and the lower slope in the trivariate method is likely to be reflecting the threshold effect and the non-linear behaviour typical of Mediterranean catchments (Salinas et al., 2013), where there are two types of flood populations (ordinary and extraordinary, according to Francés, 1998).In the ordinary floods, the main runoff type is overland flow.In the extraordinary floods, there are mixed processes of runoff generation.For the extraordinary flood recorded in the case study (October 2000, shown in Fig. 5), hydrological modelling has shown that the peak flow was produced by mixed runoff generation mechanisms, as has also been found in other studies (see Calvo-Cases et al., 2003).These characteristics (non-linearity, threshold effect, two flood populations), which cannot be properly considered with the statistical method, can be  S. Salazar-Galán et al. integrated with a method based on the understanding of the main factors affecting floods, such as that proposed in this paper.In Fig. 7, it can be observed that, with the trivariate plotting position the step change of the observed plotting positions is replaced by a curve with a change of slope that could be reflecting the different combinations of flood-generation processes.

On the critical hypotheses of the deterministic approach
As was also previously mentioned, the "standard approach" of the deterministic methods has three critical hypotheses related to the storm spatio-temporal characteristics, its ISMC and the iso-frequency assumption.
The combination of different spatio-temporal characteristics of the storms with a similar magnitude (represented as point-rainfall dailyequivalent R) and their ISMC in the hydrological response to floods (represented as maximum flow X) is illustrated in Fig. 8.In this figure, the maximum peak discharge obtained with a totally dry ISMC (at wilting point) can be up to four times higher for other storms with similar magnitude but with different spatio-temporal characteristics, even for those with the same ISMC.Such variability, considered in stochastic storm modelling, can only be exploited with a distributed hydrological model as proposed in this paper.This variability is therefore likely to influence: the favouring of one or more runoff generation mechanisms; the total or partial saturation of the drainage area; and/or the contribution to the magnification of the maximum peak discharge by the synchronisation of the tributary response.As has been pointed out in other studies (see Perez et al., 2019), a combination of tools such as those used in the present study has a high value in frequency analysis, since it has the ability to explicitly incorporate the physical interaction between rainfall, land surface, subsoil, and drainage network.In contrast, approaches using lumped models have shown that this can constitute a limiting factor for a suitable representation of flood behaviour (Astagneau et al., 2021).
By considering the entire physically feasible range of ISMCs (from wilting point to field capacity), there is an increase in the dispersion of the R-X relationship.As can be observed in Fig. 8, for storms with a similar depth, it is possible to obtain maximum flows with differences ranging from one time up to one order of magnitude (see in Section 5.3 how this effect is transferred in the flood frequency curve).For example, note that for a value of R of approximately 80 mm (±5 mm), it is possible to obtain a range of maximum flows between 4 and 51 m 3 /s, while for a value of R of approximately 205 mm (±5 mm), there is a range of maximum flows between 89 and 880 m 3 /s (see the example of the spatio-temporal variability of these storms and their hydrological response in Appendix A, and see the animated sequence for these storms in Appendix B).The spatio-temporal variability of the storm characteristics combined with different ISMC therefore clearly influence the response of the catchment and cannot be neglected in a hydrological analysis that supports flood frequency estimations.
Concerning the iso-frequency hypothesis, Fig. 9 compares the storm magnitude R return period and its related flood peak X return period obtained with the trivariate analysis for each extreme ISMC (wilting point and field capacity).Fig. 9 shows a very high dispersion (log scale), with differences up to two orders of magnitude between the rainfall and flood return periods.For example, note that for a storm magnitude with a return period of approximately 100 years (±10 years), it is possible to obtain a range of flood return periods between 6 and almost 420 years.Conversely, for floods with return periods of approximately 100 years (±10 years), there are associated storm return periods between 15 and more than 1000 years.As Viglione et al. (2009) analytically demonstrated, the relationship between the return period of both the maximum flow and its generating storm depends strongly on the ISMC, and can reach a difference of between two to three orders of magnitude.As Bocchiola and Rosso (2009) stated, the iso-frequency hypothesis is nonrealistic because no univocal relationship exists between a given flood return period and the generating storm, as was demonstrated in Fig. 9. Again, this relationship depends on the convergence of factors considered in this paper such as the influence of the spatio-temporal characteristics of the storm and its ISMC on the hydrological response.
By considering these results in relation with the fundamental hypotheses of the "standard" deterministic approach in frequency estimations, it has been demonstrated that its use would be inappropriate in the analysed case study and also probably in all catchments similar to those found in the Mediterranean.In this respect, an alternative method based on the main mechanisms that generate floods, as proposed in this paper, is justified.S. Salazar-Galán et al.

Sensitivity of the trivariate approach to ISMC
In order to quantify the effect of the initial soil moisture on the behaviour of the flood quantiles, the two extreme states of ISMC (wilting point and field capacity) have been considered as approximate lower and upper bounds of the trivariate plotting position in Fig. 10.The plotting positions for each state were estimated following Eq.( 5).These bounds reflect the physical limits imposed by the extreme ISMC on the hydrological response, that is, it is a way to introduce information regarding the feasible range of flood quantiles that consider the effect of ISMC.The same approach can be used if other sources of uncertainty are known (e.g., storm characteristic representations, storm return period assignment, input data and both the structure and parametrisation of the hydrological model) since these form the basis for the estimation of the associated uncertainty to the frequency analysis.This last issue was not considered since it falls outside the scope of this paper.However, other similar process-based approaches to that proposed by Perez et al. (2019) have already made progress in the quantification of error components stemming from epistemic assumptions, the parameter estimation method, the sample size, and, in the regional approaches, the number of pooled sites.For the specific case of uncertainty in hydrological models, see Moges et al. (2020).

Practical considerations
In this paper, the advantages of continuous modelling were maintained to describe the ISMC prior to maximum flows at a daily scale for a long period, and subsequently to lend support to its frequency analysis with a physical basis.At the same time, event-based distributed modelling was established to reflect the predominant hydrological processes associated with the typology of flash floods.As Boughton and Droop (2003) mentioned, due to the existence of isolated extreme events in ephemeral systems, which are typical of semi-arid areas as in the case study of this paper, sub-daily continuous modelling is not the most advantageous.If we consider the computational cost of hydrological modelling in the aforementioned systems, where high temporal and spatial resolutions need to be considered to properly capture flash flood processes, the applicability of such modelling strategy may be restrictive for practical purposes.Li et al. (2014) showed that a hybrid approach, based on both event-based and continuous hydrological models, provides estimates of the flood frequency distribution with an accuracy similar to that of the continuous simulation approach, but with a dramatically reduced computational time.
In both cases (continuous and event-based modelling), the challenge is to improve the performance of the model by reducing uncertainty in the input data, model structure and parametrisation (Garavaglia et al., 2017;Bárdossy et al., 2020;Moges et al., 2021), as well as improving the strategies used for the validation of the initial hypotheses (Arsenault et al., 2018).In the present study, the model has been shown to be robust given the performance in the validation process.In particular, the model is able to reproduce the peak flows as a synthesis of the main dominant processes.However, for the evaluation of the model performance, it was only possible to consider 9 events out of the total number of events for the period 1988-2012, given the quality of the sub-daily hydrometeorological records.For these events, the PBIAS_PF% were less than 35% and the BIAS_TPmin were less than 40 min.In any case, considering that this is a real case, it can be stated that the sub-daily model will suitably reproduce the flood-generation mechanisms.
The results of the 200 hydrological simulations were obtained from one hundred synthetic storms with two representative ISMCs.In the set of 200 simulated events, we include the entire range of observed rainfall quantities, as well as the predominant ISCM values.As was previously mentioned, R ranges from 37 to 670 mm, which is in line with what is observed in the region for convective storms (see Peñarrocha et al., 2002).Furthermore, the ten-minute maximum cell intensity of this family of storms (ranging from 26 to 95 mm/h) is in the most observed ranges of historical storms as shown in the histogram in Fig. 3.In the case of predominant ISMCs, these were obtained from a continuous hydrological catchment model validated with observed daily data.This case study used two predominant ISMCs, although Eq. ( 5) does allow any number thereof.In this vein, it is possible to obtain relevant information on the physics of the phenomena involved by considering the flood-generating mechanisms in the catchment of interest (Klemeš, 1993) and by following an integrated storm and hydrological modelling strategy as proposed in this paper.
Although there are studies suggesting the use of copulas in multivariate event-based analysis (Candela et al., 2014), that approach can have practical limitations due to its need for large temporal series of discharges.This is the case of flash floods, which often occur in poorly instrumented or ungauged basins (Gaume et al., 2009), as found in the case study presented in this paper.As Tong et al. (2015) concluded, the accuracy of copula models depends on the data length and this also exerts a negative effect on the marginals, which is an important factor when using a copula model to perform bivariate analysis.Therefore, a multivariate approach such as that presented herein has a high potential for application in the case of ungauged catchments or catchments with short flow records.
It should be underlined that the proposed approach in this paper is a way to advance in a "flood frequency hydrology" framework, by expanding causal information through the incorporation of the principal dominant hydrological processes in flood generation (Merz and Blöschl, 2008).This paper does not combine independent data sources, such as systematic at-site records and maximum flows obtained from hydrological modelling; that information, however, could be integrated through Bayesian approaches, such as that proposed by Viglione et al. (2013).
When it is necessary to estimate high return flood quantiles in ungauged catchments or in changing conditions (e.g., storm severity or land use), the present methodology is useful thanks to the use of a multidimensional storm generator (which can incorporate changes in the severity of storms) and a distributed hydrological model (which can incorporate changes in both the storm characteristics and land uses).This kind of process-based approach provides a meaningful pathway towards understanding current and future flood frequency in nonstationary conditions and can therefore be valuable in supplementing existing practices (Yu et al., 2019).For these cases, the hydrological model can be established using one of the methods proposed by Rosbjerg et al. (2013), and its evaluation is made possible by applying a proxybasin test or a differential split-sample test following Klemeš (1986).
Recalling Rogger et al. (2012), the selection of the appropriate method for estimating flood quantiles should consider the strengths and weaknesses of the different applicable methods.In the present study, it has been shown that the widely used standard probabilistic and deterministic methods contain a series of weaknesses that would be transferable into results with high uncertainty for decision-making (e.g., a flood risk management plan).Ultimately, changes in the flood frequency curve are a consequence of the changes in the runoff generation processes (Sivapalan et al., 1990), which cannot be adequately reflected by means of the standard approaches of frequency analysis.Indeed, it poses a major challenge when the availability of hydrometeorological records is a limiting factor in understanding the underlying processes associated with flash floods, as is the case of the Mediterranean area.In the trivariate methodology proposed in this paper, it is possible to explicitly consider the influence in the upper tail of the distribution function of physical characteristics that impose upper limits such as the spatiotemporal variability of storms, soil saturation of the whole catchment and the activation of mixed runoff generation processes, as was concluded by Klemeš (1993).

Conclusions
The integrated use of stochastic modelling of convective storms with distributed hydrological modelling at different time resolutions has enabled high return period flood quantiles to be estimated by considering the main flood-generating processes involved.The "Rambla del Poyo" has been used as a representative case study of the hydrological response in semi-arid Mediterranean catchments.On this basis, it has been possible to consider the principal flood-generation mechanisms and their probability distribution functions through a trivariate flood frequency analysis.This approach is based on the consideration of the spatio-temporal behaviour of the typical storms of the region studied through the multidimensional stochastic generation of storms of a convective nature.The statistical analysis of their magnitude should be carried out through the selection of an appropriate random variable (point-rainfall daily-equivalent) and the application of a regional annual maximum daily rainfall frequency study.The trivariate methodology also considers the spatial heterogeneity of the catchment physical support through a distributed hydrological model and the effect of the ISMC S. Salazar-Galán et al. on the processes of flood generation, where the ISMC has been contemplated as a random variable with a discrete probability distribution.With this approach, it is possible to reduce computational cost using a catchment distributed hydrological model for two reasons: continuous simulation at daily time step for ISMC analysis; and eventbased simulation at sub-daily time step for hydrograph generation using a wide range (in magnitude and spatio-temporal characteristics), but limited in number, of synthetic storms.
This paper has demonstrated that the use of probabilistic as well as deterministic "standard approaches" for flood quantile estimations of high return periods are not suitable in semi-arid Mediterranean catchments, such as the "Rambla del Poyo".In the hydro-climatic region of S. Salazar-Galán et al.

Fig. 3 .
Fig. 3. Frequency histogram: maximum intensity of the rain cells to the observed storms in the Spanish Mediterranean (data from García-Bartual and Andrés-Domenech 2017).

Fig. 4 .
Fig. 4. Temporal and spatial correlation functions of the stochastic rainfall model for the October 2000 event.

Fig.
Fig. Observed and simulated hydrograph of the October 2000 calibration event.

Fig.
Fig. Predominant ISMC prior to floods obtained from continuous modelling.

Fig. 8 .
Fig.8. between the magnitudes of storms and maximum flows for ISMC in the whole catchment when totally dry (wilting point) and fully wet (field capacity).

Fig. 9 .
Fig. 9. between the return periods of storms and maximum flows for the two extreme ISMCs.Fig. upper and lower bounds imposed by the ISMC on the flood frequency curve.

Fig. A1 .
Fig. A1.Example of the spatial evolution of the intensity field for the most intense 6-hour period for 4 storms with a similar R.

Fig. A2 .
Fig. A2.Hyetographs and associated hydrographs for the two predominant ISMC (dry and wet) referred to the gauging station for the examples in Fig. A-1.