Flood generating mechanisms investigation and rainfall threshold identification for regional flood early warning

A cost-effective and easily applied methodological approach is proposed for identifying the main factors involved in flood generation mechanisms and for developing rainfall thresholds that can be incorporated in flood early warning systems at regional scale. The methodology was tested at the Pinios upstream flood-prone area in Greece. High-frequency rainfall and water level/discharge monitoring time-series were investigated statistically. Based on the results, the study area is impacted by “long-rain floods” triggered by several-days-long, low-intensity precipitation events in the mountainous areas that saturate the catchment and cause high flow conditions. The time lag between the peaks of rainfall and water level was 17–25 h. The relationship between cumulative rainfall R_sum in the mountainous areas and maximum water level MaxWL of the river at the particular river site can be expressed as MaxWL = 1.55 ln(R_sum) − 3.70, and the rainfall threshold estimated for the mountainous stations can be expressed as R_sum = 20.4 D^0.3, where D is the duration of the event. The effect of antecedent moisture conditions prior to each event was limited to a decrease of the time lag between rainfall and water level response. The limitations of the specific methodological approach are related to the uncertainties that arise because other variables contributing to the complex flood generating mechanisms are not considered (e.g., the effect of snowmelt and air temperature, soil characteristics, the contribution of tributaries, or the inadequate maintenance of the river network that may cause debris accumulation and river bank failure).


Introduction
Floods have been reported to be one of the most destructive natural hazards affecting both the natural environment and infrastructure, especially in the Mediterranean area (Petersen 2001; Llasat et al. 2010; Gaume et al. 2016; Diakakis et al. 2020). They have often been associated with high rates of fatalities and injuries (Menne and Murray 2013; Diakakis and Deligiannakis 2017) and severe economic and social impacts (Petersen 2001; Barredo 2009). Operational real-time flood forecasting systems focus on identifying the flood generating processes in time. Floods can be exacerbated by land use change (Hall et al. 2014; Mentzafou et al. 2019) and high antecedent soil moisture conditions (Norbiato et al. 2009; Montesarchio et al. 2015; Grillakis et al. 2016). Apart from structural failure and rapid snowmelt, the most common and important flood triggering factor is precipitation (Hong et al. 2013). Therefore, the determination of rainfall thresholds is of particular interest as one of the evolving flood forecasting approaches (Golian et al. 2010) and can be used for local and regional flood forecasting and flood early warning systems (Norbiato et al. 2008; Montesarchio et al. 2009, 2015; Abancó et al. 2016). Flood early warning systems are usually based on thresholds that assume a physical or statistical relationship between a variable called "predictor" (e.g. rainfall) and a variable called "predictand" (e.g. river discharge or water level) (Martina 2011) and are incorporated in decision support systems (Kochilakis et al. 2016). Rainfall thresholds can be defined as the cumulated volume of rainfall during a specific storm event that occurs over a given catchment area and can result in a critical discharge or water level (bankfull flow) at a specific river section (Georgakakos 2006; Martina 2011; Henao Salgado and Zambrano Nájera 2022).
Although Greece has been subjected to high flood risk since antiquity, an increasing trend of flood events during the last decades has been reported (Angelakis et al. 2020). Recently, major flood events were followed by major adverse consequences for socioeconomic activity and even fatalities (Papaioannou et al. 2019; Giannaros et al. 2020). In Greece, the Ministry of Climate Crisis and Civil Protection (MCCCP) (Hellenic Republic 2021) operates the Civil Protection Operation Centre (CPOC) and is responsible for planning, organizing and coordinating actions regarding risk assessment; for responding to natural, technological or other disasters or emergencies; for coordinating rehabilitation operations; and for informing the public on these issues (Hellenic Republic 2020). The CPOC is informed of Emergency Weather Deterioration Bulletins (EDEK) and Emergency Severe Weather Forecast Bulletins (EDPEKF) produced by the Hellenic National Meteorological Service (HNMS) (http://www.emy.gr/emy/en/warning/meteoalarm). HNMS is the official agency of Greece responsible for issuing severe weather alerts, such as heavy rain with risk of flooding, and severe thunderstorms (Hellenic Republic 1997). Apart from HNMS, many national or European research centers and educational institutions of Greece produce similar meteorological products (Kallos et al. 1997; Papadopoulos et al. 2002; Smith et al. 2016; Lagouvardos et al. 2017; Varlas et al. 2021). Nevertheless, based on the recent General Emergency Confrontation and Direct Mitigation Plan against Flood Events "Dardanos" of MCCCP, these products should not be used by the state for the official early warning alerts (Administrative Region of Thessaly 2020).
Flood forecasting and effective flood early warning systems for a specific area presuppose an understanding of how the hydrometeorological processes leading to extreme weather events trigger water overflow at a river reach (Alfieri and Thielen 2015). Although the tendency is towards the use of physically based hydrometeorological systems that combine atmospheric models with hydrological and hydraulic models (Papaioannou et al. 2019), there are still significant problems to be overcome to improve flood forecasting accuracy and reliability. For example, data errors (including those describing the physiographic characteristics of the watershed), sensitive dependence on the initial conditions, and errors introduced because of imperfections in the models' integration may reduce the value of stream forecasting and routing. The present study intends to provide a defensible proof-of-concept when only available measurements are considered for issuing a quick flood warning. In particular, it aims to fill the gap between atmospheric forcing and severe weather alerts on the one hand, and a river's response and possible bankfull flow on the other, using a threshold-based flood warning approach. The final task is to propose a simple and cost-effective methodological approach for better understanding the underlying flood generating mechanisms and for identifying rainfall thresholds that can be incorporated into a flood forecasting and warning system for issuing generalized alerts for flood events at regional scale (Fig. 1). The methodology proposed was tested at the flood-prone upper part of the Pinios river's catchment in Greece but can be applied to any regional area.
The proposed methodology relies on high-frequency monitoring time-series (hourly rainfall and water level/discharge data). Information regarding the antecedent moisture conditions (AMC) of the catchment area prior to each severe weather event was also taken into consideration during the analysis. To identify the severe weather events that resulted in a corresponding water level rise of the Pinios river, and to determine the time lag between the occurrence of the two events, cross-correlation analysis between rainfall and water level time-series was performed. Additionally, linear regression analysis and descriptive statistics, frequency histograms, box-plots and correlation analysis between the variables of interest were performed. The variables examined were related to:
- severe weather event: duration of the severe weather event, maximum hourly rainfall, cumulative rainfall, rainfall intensity;
- river's response: maximum water level, maximum and average discharge, and time lag between the peaks of rainfall and water level;
- antecedent moisture conditions: Soil Water Index (SWI) on the date the severe weather event started.
Finally, rainfall thresholds were developed by defining the upper limit of conditions of storms that did not lead to flooding (Cannon et al. 2008;Diakakis 2012). The identification of these thresholds was achieved using the support vector machines (SVM) method. SVM is a supervised learning algorithm that can be used to determine the lines or boundaries dividing an n-dimensional space into separate groups, so that they can be classified as their proper categories when new data are given (Cortes and Vapnik 1995). This approach has been efficiently used in the past in similar applications (Nayak and Ghosh 2013;Pan et al. 2019;Chu et al. 2022).

Study area-monitoring program
The study area is located at the upstream, western part of the Pinios catchment, in Greece (Fig. 2i). The Pinios catchment area is about 11,000 km², while the upstream area of interest is about 2250 km². The Pinios river is located in Thessaly, Central Greece, which is the second most productive agricultural area in Greece (about 4150 km² of utilized agricultural area for the year 2019) (Hellenic Statistical Authority 2022). The climate of the study area is typical Mediterranean, with cold winters and a moderate precipitation rate, followed by relatively hot and dry summers (hot-summer Mediterranean climate Csa) (Kottek et al. 2006), and with large temperature differences (Mylopoulos et al. 2009) (Fig. 2ii). The rainfall of the Pinios catchment exhibits large spatial variability, ranging between 360 and 1850 mm at the eastern-coastal and the western-mountainous part of the basin respectively, while the average annual precipitation of the entire basin is about 780 mm (Mylopoulos et al. 2009). The main course of the Pinios river is perennial and is fed mainly by winter rainfall and spring snowmelt (Bathrellos et al. 2018). Rainfall events in the mountainous areas usually take place between October and January (Hellenic National Meteorological Service 2022).
The Pinios river and its tributaries have been flood prone since antiquity, and several flood protection structures along the entire hydrological network have been constructed over the last 2500 years (Mimikou and Koutsoyiannis 1995). The study area is located in the western part of the Thessalian plain, downstream of the city of Trikala, where the smooth morphology and low gradient favored the creation of a paleo-lake for the period between late Quaternary and late Holocene. This lake was gradually filled in by sedimentary processes (Migiros et al. 2011; Caputo et al. 2021); for this reason the study area is associated with many flood events (Bathrellos et al. 2018). Additionally, the western part of the Pinios watershed has been identified as an Area of Potential Significant Flood Risk (code EL08APSFR003), and in the study area over 22 flood events have been reported since 1979 (Ministry of the Environment and Energy of Greece 2020) (Fig. 2iii).
For the present study, hourly time series covering the period between 01/09/2019 and 31/03/2022 were employed. The water level telemetric monitoring station Nomi at the Pinios river is part of the automatic monitoring network of the Hellenic Centre for Marine Research (HCMR), installed through HIMIOFoTS national project (HIMIOFoTS 2020). The time series from the Nomi telemetric station are automatically stored on an FTP server. Graphical visualization of the dataset is available online in real time, while the alert water level is set at 2.0 m empirically (Panagopoulos et al. 2021) (Table 1).
Hourly rainfall data were available from private meteorological stations that are part of the weathercloud network. Only stations upstream of the Nomi water level monitoring station and with good data quality were selected for the present study. Overall, four meteorological stations met these criteria. Three of these stations are located in mountainous areas (Fylakti, Gardiki, and Elati stations), while one is located in the lowland (Trikala station). Finally, Gardiki station is located in the neighboring catchment (Table 1).
Before proceeding to the statistical analysis of the current database, quality assurance and control was conducted following the common practice proposed by the World Meteorological Organization (2013). Data screening and processing operations were performed by checking the data against specified screening criteria such as the allowable variable ranges, historical maxima or minima, allowable rates of change, comparison with measurements conducted by different instruments, etc.
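The screening criteria listed above can be illustrated with a minimal Python sketch. It covers only two of the checks (allowable range and allowable rate of change) plus missing values; the function name, the flag labels and the numeric limits in the example are hypothetical and are not taken from the study or from the WMO guidance.

```python
from typing import List, Optional, Sequence


def screen_hourly_series(
    values: Sequence[Optional[float]],
    lower: float,
    upper: float,
    max_step: float,
) -> List[str]:
    """Flag each hourly value as 'ok', 'missing', 'range' or 'step'.

    lower/upper : allowable variable range (hypothetical limits).
    max_step    : allowable change relative to the last accepted value.
    """
    flags: List[str] = []
    last_accepted: Optional[float] = None
    for value in values:
        if value is None:
            flags.append("missing")
        elif value < lower or value > upper:
            flags.append("range")
        elif last_accepted is not None and abs(value - last_accepted) > max_step:
            flags.append("step")
        else:
            flags.append("ok")
            last_accepted = value  # only accepted values become the reference
    return flags


# Example with made-up water levels (m) and made-up limits.
levels = [1.2, 1.3, 4.0, 1.4, None, 9.9, -0.5, 1.5]
print(screen_hourly_series(levels, lower=0.0, upper=8.0, max_step=1.0))
```

Comparing each value with the last accepted one flags an isolated spike once, but it would also flag every value after a genuine abrupt shift, so flagged records would still need manual review.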

Antecedent moisture conditions (AMC)
Many studies have highlighted the contribution of AMC at the beginning of a severe weather event to a catchment's hydrological response and flood generation (Berthet et al. 2009). Information regarding the AMC of the catchment area upstream of the Nomi water level monitoring station, prior to each severe weather event, was retrieved from satellite measurements of the Soil Water Index (SWI). The SWI quantifies the moisture condition at various depths in the soil and is mainly driven by precipitation via the process of infiltration. For the present study, the product used was the Copernicus Global Land Service Soil Water Index at 1 km resolution (SWI1km), derived using a data fusion approach, named SCATSAR (Scatterometer-Synthetic Aperture Radar) algorithm (Bauer-Marschallinger et al. 2018), from microwave radar data observed by the MetOp ASCAT and the Sentinel-1 C-SAR satellite sensors (Bauer-Marschallinger et al. 2020). The algorithm is based on a two-layer water balance model proposed by Wagner et al. (1996) to estimate profile soil moisture from Surface Soil Moisture (SSM) retrieved from scatterometer data (Eq. 1):

$$\mathrm{SWI}(t_n) = \frac{\sum_{i}^{n} \mathrm{SSM}(t_i)\, e^{-(t_n - t_i)/T}}{\sum_{i}^{n} e^{-(t_n - t_i)/T}}, \quad t_i \le t_n \tag{1}$$

where t_n is the observation time of the current measurement, t_i are the observation times of the previous measurements and T is the time constant of the filter in days. The parameter T characterizes the temporal variation of soil moisture within the root-zone profile (Laiolo et al. 2016), and is mainly controlled by climatic variables (precipitation and evaporation), land cover and runoff signatures, rather than soil properties (Bouaziz et al. 2020), and usually ranges between 5 and 100 (Wagner et al. 1996; Bauer-Marschallinger et al. 2020). In vegetated areas, T values are higher than in bare ground (Loizu et al. 2018), while T values are low in areas with high evaporative demand and less frequent but more intense precipitation (Albergel et al. 2008; Bouaziz et al. 2020).
Very often, and especially in cases of soil data deficiency, a T value of about 20 days is used in hydrological applications, while values between 15 and 30 days provide reasonable results (Wagner et al. 1996). Nevertheless, based on the literature, in agriculture-dominated catchments (Ceballos et al. 2005; Brocca et al. 2010) or in forested areas (Brocca et al. 2010) a T value of about 50 days has been estimated. Since the catchment upstream of Nomi station is mainly covered by forests (68%) and secondarily by agricultural areas (28%) (Corine CLC 2018) (European Environment Agency 2020), a T value of 40 was adopted for the specific application. SWI gridded data are available to the public (after registration) through the Copernicus Global Land Service online product portal (European Commission-European Environment Agency 2022).
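To make the role of the time constant T concrete, the exponential filter that underlies the SWI can be sketched in a few lines of Python. This is an illustration of the weighting scheme only, not the SCATSAR processing chain used for the SWI1km product; the function name and the sample values are hypothetical, and T = 40 days is taken from the text.

```python
import math
from typing import Sequence


def soil_water_index(times_days: Sequence[float],
                     ssm: Sequence[float],
                     T: float = 40.0) -> float:
    """SWI at the most recent observation time.

    times_days : observation times in days, in ascending order.
    ssm        : surface soil moisture values (%) at those times.
    T          : filter time constant in days.
    """
    if len(times_days) != len(ssm) or len(ssm) == 0:
        raise ValueError("times_days and ssm must be non-empty and equally long")
    t_n = times_days[-1]
    # Older observations receive exponentially smaller weights.
    weights = [math.exp(-(t_n - t_i) / T) for t_i in times_days]
    weighted_sum = sum(w * s for w, s in zip(weights, ssm))
    return weighted_sum / sum(weights)


# Made-up example: a dry reading 10 days ago followed by a wet reading today.
print(soil_water_index([0.0, 10.0], [20.0, 80.0], T=40.0))
print(soil_water_index([0.0, 10.0], [20.0, 80.0], T=5.0))
```

With a smaller T the older, drier observation is discounted more strongly, so the index follows the latest surface reading more closely; a larger T gives a smoother, slower-responding index.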

Time-series statistical analysis
In the specific study, the statistical analysis was performed between hourly data series (water level/discharge measurements of Nomi station, and rainfall of four upstream meteorological stations: Fylakti, Trikala, Gardiki and Elati). The statistical analysis was based on linear regression analysis, descriptive statistics, frequency histograms, box-plots and bivariate correlation analysis using the Pearson correlation coefficient (CC). Additionally, the statistical tool employed to examine firstly the time lag, and secondly the inter-relationship between rainfall and water level/discharge of the selected monitoring stations for each severe weather event, was cross-correlation analysis. This approach allows the determination of the extent to which two time-series exhibit oscillations differing by a distance of k units in time (Legendre and Legendre 2012). The time lag between lag 0 and the lag of the maximum value of the cross-correlation function gives an estimation of the response of the system to a unit impulse (Benavente et al. 1985). The cross-correlation coefficient r_xy(k) can be defined as (Eq. 2):

$$r_{xy}(k) = \frac{C_{xy}(k)}{\sigma_x \sigma_y}, \qquad C_{xy}(k) = \frac{1}{n}\sum_{t=1}^{n-k}\left(x_t - \bar{x}\right)\left(y_{t+k} - \bar{y}\right) \tag{2}$$

where C_xy(k) is the cross-correlogram, n is the number of observations, and σ_x and σ_y are the standard deviations of the time-series. The overbar represents the temporal mean value of the signal (Larocque et al. 1998). The higher the r_xy(k), the more prominent is the interrelation between the two time-series (Lee and Lee 2000).
To identify the weather events that resulted in a corresponding response of the water level of the river, the following interpretation criteria were used: r_xy(k) < 0.1: no correlation; 0.1 ≤ r_xy(k) < 0.3: weak correlation; 0.3 ≤ r_xy(k) < 0.5: moderate correlation; r_xy(k) ≥ 0.5: strong correlation (Shi et al. 2018). Only those severe weather events whose cross-correlation with river water level resulted in r_xy(k) ≥ 0.3 (at least moderate correlation) were further analyzed.
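The lag search and the interpretation criteria can be sketched as follows. This is a minimal pure-Python illustration of the cross-correlation coefficient normalised by the two standard deviations, evaluated for non-negative lags only; the function names and the synthetic series are hypothetical, and the study's actual computation may differ in detail.

```python
import math
from typing import List, Sequence, Tuple


def cross_correlation(x: Sequence[float], y: Sequence[float], k: int) -> float:
    """r_xy(k) for rainfall x leading water level y by k time steps (k >= 0)."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sigma_x = math.sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sigma_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    c_xy = sum((x[t] - mean_x) * (y[t + k] - mean_y) for t in range(n - k)) / n
    return c_xy / (sigma_x * sigma_y)  # assumes neither series is constant


def best_lag(x: Sequence[float], y: Sequence[float], max_lag: int) -> Tuple[int, float]:
    """Lag (in time steps) with the highest r_xy(k), and that coefficient."""
    coefficients: List[float] = [cross_correlation(x, y, k) for k in range(max_lag + 1)]
    lag = max(range(max_lag + 1), key=lambda k: coefficients[k])
    return lag, coefficients[lag]


def interpret(r: float) -> str:
    """Interpretation classes quoted in the text (Shi et al. 2018)."""
    if r < 0.1:
        return "no correlation"
    if r < 0.3:
        return "weak correlation"
    if r < 0.5:
        return "moderate correlation"
    return "strong correlation"


# Synthetic hourly example: the water level echoes the rainfall pulse 3 h later.
rain = [0, 0, 5, 10, 5, 0, 0, 0, 0, 0, 0, 0]
level = [1, 1, 1, 1, 1, 2, 3, 2, 1, 1, 1, 1]
lag, r = best_lag(rain, level, max_lag=6)
print(lag, round(r, 3), interpret(r))
```

An event would be kept for further analysis when the returned coefficient is at least 0.3, in line with the selection rule stated above.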
The variables examined were related to:
- severe weather event: duration of the severe weather event (D, in h), maximum hourly rainfall (R_max, in mm/h), cumulative rainfall (R_sum, in mm), rainfall intensity (R_int, in mm/h) recorded at the four meteorological stations;
- river's response: maximum water level (MaxWL, in m), maximum (Q_max, in m³/s) and average (Q_av, in m³/s) discharge at Nomi water level monitoring station, and time lag between the peaks of rainfall and water level (Lag, in h);
- antecedent moisture conditions: average SWI (%) of the catchment upstream of Nomi station on the date the severe weather event started.
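As an illustration of how the rainfall variables in the list above could be derived from one event's hourly record, a short Python sketch follows. It rests on two assumptions that the text does not state: that the duration D equals the number of hourly records between the start and end of the event, and that the rainfall intensity R_int is the mean intensity R_sum/D. The function name and the sample values are hypothetical.

```python
from typing import Dict, Sequence


def event_rainfall_variables(hourly_rain_mm: Sequence[float]) -> Dict[str, float]:
    """Summarise one event's hourly rainfall record (one value per hour, in mm)."""
    if len(hourly_rain_mm) == 0:
        raise ValueError("the event must contain at least one hourly record")
    duration_h = len(hourly_rain_mm)   # D, assuming one record per hour
    r_sum = sum(hourly_rain_mm)        # R_sum, cumulative rainfall (mm)
    r_max = max(hourly_rain_mm)        # R_max, maximum hourly rainfall (mm/h)
    r_int = r_sum / duration_h         # R_int, assumed to be the mean intensity (mm/h)
    return {"D": duration_h, "R_max": r_max, "R_sum": r_sum, "R_int": r_int}


# Made-up four-hour event.
print(event_rainfall_variables([0.5, 2.0, 4.5, 1.0]))
```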

Rainfall threshold identification
Rainfall thresholds definition can be achieved by using different methodologies and approaches and by employing different indicators, and predictor variables. Four types of methods can be identified: (1) empirical/statistical rainfall threshold methods, (2) hydrological/hydrodynamic methods, (3) probabilistic methods, and (4) compound methods (Montesarchio et al. 2015;Henao Salgado and Zambrano Nájera 2022). In the specific study, the empirical method was adopted, which is also considered to be simple and the most widely used (Ramos Filho et al. 2021;Henao Salgado and Zambrano Nájera 2022). Additionally, empirical method is considered to be a cost effective and short-term solution (Bouwens et al. 2018;Young et al. 2021) and is preferable in case of data scarcity compared to the other approaches (Henao Salgado and Zambrano Nájera 2022). Empirical rainfall threshold methods can conform the complex physical processes in the study area (Reichenbach et al. 1998). This simplification can be considered an advantage of this approach since it removes the complexities of setting up a hydrological model (Ramos Filho et al. 2021; Henao Salgado and Zambrano Nájera 2022); nevertheless, they neglect other factors involved in flood generation, such as the lithological and morphological diversity, the different climate regimes and weather circumstances, and the heterogeneity and the incomplete dataset (rainfall data and floods) used to determine thresholds (Montesarchio et al. 2015;Santos and Fragoso 2016). Based on empirical/statistical rainfall threshold method, data from historical flood events are employed, and correlation analysis is performed between magnitude and duration of critical rainfall (Cannon et al. 2008;Norbiato et al. 2008;Diakakis 2012;Montesarchio et al. 2015). 
Here, after identifying the most effective variables responsible for flood triggering in the study area, thresholds were developed by defining the upper limit of conditions of storms that did not lead to flooding (Cannon et al. 2008;Diakakis 2012). These thresholds can be effectively incorporated into a system for issuing warnings for events that pose significant hazards to life and property (Cannon et al. 2008).
The identification of the rainfall threshold can be accomplished using the support vector machine (SVM) technique, one of the most popular and efficient supervised statistical machine learning algorithms, used mostly for classification (Cortes and Vapnik 1995). The transformation of data via a mathematical function is accomplished with a kernel function. SVM can estimate an n-dimensional hyperplane differentiating the two classes of the training dataset. For the case of linearly separable data, a hyperplane can be defined as follows:

$$y_i\left(\mathbf{w}\cdot\mathbf{x}_i + b\right) \ge 1 - \xi_i, \qquad \xi_i \ge 0$$

where x_i are the training samples with class labels y_i ∈ {−1, +1}, w is a coefficient vector that describes the direction of the hyperplane in the feature space, b is the offset of the hyperplane from the origin, and ξ_i is the positive slack variable (Cortes and Vapnik 1995). The optimal hyperplane is located where the margin between two classes of interest is maximized and the error is minimized. The maximization of this margin leads to the following constrained optimization problem:

$$\min_{\mathbf{w},\,b,\,\xi}\ \frac{1}{2}\lVert\mathbf{w}\rVert^{2} + C\sum_{i}\xi_i \quad\Longleftrightarrow\quad \max_{\alpha}\ \sum_{i}\alpha_i - \frac{1}{2}\sum_{i}\sum_{j}\alpha_i\alpha_j y_i y_j\left(\mathbf{x}_i\cdot\mathbf{x}_j\right), \quad \sum_{i}\alpha_i y_i = 0,\ \ 0 \le \alpha_i \le C$$

where α_i is the Lagrange multiplier, C is the penalty (cost), and the slack variable ξ_i allows penalized constraint violation. The decision function, which can be used for classifying new data, can then be written as:

$$f(\mathbf{x}) = \operatorname{sign}\left(\sum_{i}\alpha_i y_i\left(\mathbf{x}_i\cdot\mathbf{x}\right) + b\right)$$

In the specific application, SVM analysis was performed using package e1071 v 1.7-11 (Meyer et al. 2022) in R v.4.1.0 (R Core Team 2021). A linear kernel function was used. Additionally, tuning of the C factor was performed to enhance the model performance. For the specific task, the tune function of the e1071 package was used for cross-validation. By default, the tune function performs a k-fold cross validation (Geisser 1975) with k = 10. Nevertheless, in the specific study leave-one-out cross validation was performed (Stone 1974), which is a special case of k-fold cross validation with k = n, where n is the sample size of the dataset. Although leave-one-out cross validation is computationally expensive and therefore preferable in cases of small sample size datasets (Arlot and Celisse 2010) and may have high variance, it also has the lowest bias in estimating regression error (Hastie et al. 2009; Jiang and Wang 2017).
The dataset was transformed to a logarithmic scale prior to the analysis.
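Because a linear kernel is applied to log-transformed data, the separating hyperplane corresponds to a power law in the original units. The short derivation below is a sketch under the assumption that the two features are ln D and ln R_sum, with hypothetical weights w_1 and w_2 and offset b; the base of the logarithm is not stated in the text, and any other base gives the same power-law form.

```latex
\begin{aligned}
w_1 \ln D + w_2 \ln R_{sum} + b &= 0 \\
\ln R_{sum} &= -\frac{b}{w_2} - \frac{w_1}{w_2}\,\ln D \\
R_{sum} &= e^{-b/w_2}\, D^{-w_1/w_2}
\end{aligned}
```

The boundary therefore takes the form R_sum = a D^c with a = e^{-b/w_2} and c = -w_1/w_2, which is the same functional form as a duration-dependent rainfall threshold.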

Descriptive statistics
In Table 2, the statistical characteristics of the severe weather events per meteorological station investigated in the present study are presented. Based on the results, the severe weather events that demonstrated r_xy(k) ≥ 0.3 (at least moderate cross-correlation) between rainfall and water level during the period 01/09/2019-31/03/2022, and were further analyzed, were 27 in total.
As expected, the cumulative rainfall recorded in mountainous stations (Fylakti, Gardiki, and Elati stations) was higher than in lowland (Trikala station). Overall, the cumulative rainfall of all the weather events investigated for all four stations ranged between 1.5 and 246.4 mm.
Mean and median values were 78.3 and 60.7 mm respectively, while 75% of the measurements were higher than 38.6 mm and only 25% higher than 119.4 mm (Fig. 3a). Mean and median duration of all the severe weather events investigated was 52 and 38 h respectively, while 75% of the events had duration higher than 18.5 h (Fig. 3b). 30-50% of the weather events identified resulted in a corresponding water level rise of the river over the empirical water level alert threshold of 2.0 m. The time lag between the peaks of rainfall and water level at the Pinios river (for all severe weather events and meteorological stations investigated) ranged between 2 and 40 h, while the mean and median values were 22 and 20 h, respectively. 75% of the measurements were higher than 17 h, while only 25% of the measurements were higher than 25 h (Fig. 3c).

Correlation analysis
Based on the results (Fig. 4), the duration of the severe weather events was positively, moderately to strongly correlated (based on the criteria for correlation interpretation proposed by Hinkle et al. (2003)) with maximum discharge Q_max of the river in all cases. The duration of the severe weather event was positively correlated with maximum water level MaxWL of the river for stations Fylakti (correlation coefficient CC 0.48), Trikala (CC 0.46), and Elati (CC 0.54) (in all cases statistically significant at the 0.05 level).
Additionally, cumulative rainfall R_sum of the severe weather events investigated was positively, strongly correlated (Hinkle et al. 2003) with maximum discharge Q_max of the river for the mountainous stations only. Cumulative rainfall R_sum was positively, strongly correlated (Hinkle et al. 2003) with maximum water level.
Duration D and cumulative rainfall R_sum of the severe weather events investigated were positively, moderately to strongly correlated (Hinkle et al. 2003) for the mountainous stations and moderately correlated for the lowland station. The most important factor affecting the time lag of water level rise of the river after a severe weather event is the SWI of the upstream area.

Rainfall threshold
Further investigation of the inter-relation between the flood triggering variables that demonstrated the highest correlation (severe weather event duration and cumulative rainfall), for the stations that also showed the highest correlations (Fylakti, Gardiki, and Elati, the mountainous stations), resulted in the identification of the rainfall thresholds. The sample size of this dataset is 56. In Fig. 5i the SVM tuning results are presented, based on which the optimal C (cost) value was 20 (best performance value 0.16). In Fig. 5ii the duration-cumulative rainfall plot of the three mountainous meteorological stations and the threshold (hyperplane) resulting from the SVM analysis are presented. Based on the results, the threshold can be expressed as:

$$R_{sum} = 20.4\, D^{0.3}$$

In Fig. 6i the cumulative rainfall vs maximum water level at Nomi station vs duration of the mountainous stations is presented. In the cases where severe weather events did not lead to water level rise of the river over the empirical threshold of 2.0 m, cumulative rainfall ranged between 14.0 and 79.7 mm (median 40.8 mm) and the duration ranged between 4 and 61 h (median 19 h). In the cases where the threshold exceedance occurred, cumulative rainfall ranged between 24.9 and 246.4 mm (median 116.4 mm) and the duration ranged between 9 and 156 h (median 55 h) (Fig. 6ii). Finally, the relationship between cumulative rainfall (for the stations at the mountainous areas) and maximum water level can be expressed as (Fig. 6i):

$$\mathrm{MaxWL} = 1.55\,\ln(R_{sum}) - 3.70$$
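The two fitted relationships reported in this study, R_sum = 20.4 D^0.3 and MaxWL = 1.55 ln(R_sum) − 3.70, can be evaluated with a small Python sketch. The coefficients are those reported in the paper; the function names are hypothetical, and treating "cumulative rainfall above the threshold curve" as the warning condition is an interpretation of the threshold as the upper limit of storms that did not lead to flooding.

```python
import math


def rainfall_threshold_mm(duration_h: float) -> float:
    """Threshold cumulative rainfall (mm) for an event lasting duration_h hours."""
    return 20.4 * duration_h ** 0.3


def predicted_max_water_level_m(r_sum_mm: float) -> float:
    """Maximum water level (m) at the river site from the fitted log relationship."""
    return 1.55 * math.log(r_sum_mm) - 3.70


def exceeds_threshold(duration_h: float, r_sum_mm: float) -> bool:
    """True when the event's cumulative rainfall lies above the threshold curve."""
    return r_sum_mm > rainfall_threshold_mm(duration_h)


# Median event of each group reported in the text (mountainous stations).
for duration_h, r_sum_mm in [(19, 40.8), (55, 116.4)]:
    print(duration_h, r_sum_mm,
          round(rainfall_threshold_mm(duration_h), 1),
          exceeds_threshold(duration_h, r_sum_mm),
          round(predicted_max_water_level_m(r_sum_mm), 2))
```

The median non-exceedance event falls below the threshold curve and the median exceedance event falls above it, which is consistent with how the threshold was defined.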

Discussion
In the present study, an investigation of the main factors involved in the flood generating mechanisms of the Pinios upstream catchment area has been performed so as to test the effectiveness of the methodological approach proposed for the identification of flood generating mechanism and rainfall threshold identification.
Based on the results, during the period 01/09/2019-31/03/2022, 27 severe weather events that resulted in a corresponding significant rise or overflow of water at Nomi station at the Pinios river were identified. Based on the results presented in the "Descriptive statistics" section, water level rise at Nomi monitoring station at the Pinios river is affected mainly by weather events taking place in the mountainous areas of the watershed, rather than by rainfall occurring in the lowland. The severe weather events investigated that resulted in a corresponding water level rise at Nomi can be characterized by average cumulative rainfall (only 25% of the events investigated recorded values higher than 120 mm), with relatively long duration (75% of the events lasted longer than 18.5 h), that led to a rather rapid response of the river (time lag below 25 h for 75% of the cases), under relatively wet soil conditions (values higher than 58.3% for 75% of the events investigated). These results lead to the conclusion that the causative mechanisms responsible for most of the significant water level rise, overflow events and floods in the wider Nomi area are several-days-long, low-intensity rainfall events that saturate the catchment and cause high flow conditions. These "long-rain floods" (Merz and Bloeschl 2003) are very common in mountainous areas during autumn and winter (Raymond et al. 2019). Indeed, 74% of the 27 severe weather events investigated in the present study took place during winter and autumn (63% and 11%, respectively), and 7 in spring, while none was observed in summer.
The SWI of the catchment upstream of the water level monitoring station of the Pinios river prior to a severe weather event seems not to contribute to flood triggering directly, and there was no conclusive evidence that antecedent moisture conditions (AMC) were a decisive factor in flood generation. The main effect of SWI on the river's response to a severe weather event was the decrease of the time lag between rainfall and water level response. This can be attributed to the fact that in most cases SWI prior to the severe weather events was relatively high during the period investigated, and practically no flash flood event triggered by short, high-intensity rainfall under dry conditions was recorded during the investigated time period. In mountainous areas, high-magnitude "long-rain floods" are triggered mainly by rainfall (Raymond et al. 2019). Nevertheless, further research on the flood triggering mechanism of the area could provide better understanding of the soil moisture contribution, since moisture content is a well-established factor in the rainfall-runoff relationship (Diakakis 2012).
As expected, the duration of the severe weather events/cumulative rainfall and maximum water level/maximum discharge of the river were moderately to strongly correlated in all cases. Additionally, duration and cumulative rainfall R_sum of the severe weather events investigated were positively, moderately to strongly correlated.
The relationship between cumulative rainfall (for the three stations at the mountainous areas) and maximum water level MaxWL can be expressed as: MaxWL = 1.55 ln(R_sum) − 3.70. The rainfall threshold estimated for the mountainous stations can be expressed as: R_sum = 20.4 D^0.3. This threshold, together with the 17-25 h time lag between the peaks of rainfall and water level at the Pinios river that resulted from the present study, can be incorporated in a flood early warning system for the upstream area of the Pinios river. Together with a weather forecasting system, it can provide a valuable decision support tool. The limitations of the specific methodological approach are related to the uncertainties arising from the numerous, possibly secondary, variables contributing to the flood generating mechanisms that cannot be incorporated in the specific methodological approach (Ramos Filho et al. 2021). Such variables that may be related to flood triggering and have not been included in the present study are snowmelt and air temperature (Blöschl et al. 2017; Brunner and Fischer 2022), soil characteristics (Gaál et al. 2012), the contribution of tributaries (Pattison et al. 2014), and the inadequate maintenance of the river network that may cause debris accumulation (Mentzafou and Dimitriou 2015) and river bank failure (Viero et al. 2013).
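One way the 17-25 h lag could be used operationally is to translate the time of a rainfall peak into the window in which the water level peak would be expected. The Python sketch below illustrates this; the function name and the example timestamp are hypothetical, and the window simply reuses the lag range reported in the text rather than any calibrated forecast.

```python
from datetime import datetime, timedelta
from typing import Tuple


def expected_peak_window(rain_peak_time: datetime,
                         min_lag_h: int = 17,
                         max_lag_h: int = 25) -> Tuple[datetime, datetime]:
    """Earliest and latest expected time of the water level peak."""
    earliest = rain_peak_time + timedelta(hours=min_lag_h)
    latest = rain_peak_time + timedelta(hours=max_lag_h)
    return earliest, latest


# Hypothetical rainfall peak at 06:00 on 10 January 2021.
start, end = expected_peak_window(datetime(2021, 1, 10, 6, 0))
print(start.isoformat(), end.isoformat())
```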
Finally, it should be noted that the water level/discharge and meteorological monitoring stations were installed only recently, in 2019. The dataset is expected to be enriched with time and to provide the information required for testing the rainfall thresholds developed and for evaluating the ability of this approach to identify possible critical discharge/water level and flood events in the study area in time.

Conclusion
This study aims to propose a methodological approach for the identification of the main factors involved in flood generation mechanisms and the development of rainfall thresholds for incorporation in flood early warning systems at regional scale. The methodology proposed relies on high-frequency monitoring data and statistical tools, and is cost effective, easily applied and can be used in cases of data scarcity that preclude the development of complex hydrological or hydrodynamic models. The methodology proposed was tested at the flood-prone upper part of the Pinios river's catchment in Greece. Despite the acknowledged limitations, this approach managed to provide an insight into the main flood mechanism and to identify the main factors responsible for flood triggering in the area. Additionally, rainfall thresholds were developed that can be effectively incorporated into a system for issuing warnings at regional scale for events that pose significant hazards to life and property. Further investigation, especially regarding the soil moisture contribution to flood generation, and validation of the results would reduce the uncertainties that arise.
[Figure caption fragment: ... and boxplots of (ii) cumulative rainfall R_sum (mm) and (iii) duration D (h), for the three mountainous meteorological stations (Fylakti, Gardiki, and Elati)]
Author contributions All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Angeliki Mentzafou. The first draft of the manuscript was written by Angeliki Mentzafou, Anastasios Papadopoulos and Elias Dimitriou and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Funding Open access funding provided by HEAL-Link Greece.
Data availability Data subject to third party restrictions.

Conflict of interest
The authors declared that there is no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.