Assessment and ranking of flood events in a regulated river using information and complexity measures

A robust approach that describes the hidden features of flood events in regulated rivers is of great importance. The key goal of this research is to use information and complexity measures to assess and rank flood patterns within a regulated river system. To meet this goal, the Metric Entropy (ME) was employed as a measure of information content and the Rényi Complexity (CR) as a quantification of complexity content. To examine the role of river regulation in flood risk control, river stage records from two monitoring stations, each located downstream of a different dam, were considered. The findings show that the information and complexity metrics offer an image of the randomness embedded in the dataset and of the presence of internal patterns in the studied records. In general, this research shows that natural environmental risks and disasters can be assessed and ranked using a promising physical scheme based on information and complexity measures.


Introduction
Understanding the different characteristics of the rainfall-runoff process is of paramount importance for controlling the quantity and quality of water resources, mitigating potential flood risks, and providing best water management practices.
Past works [1,2] documented that the climate records of the past decades reveal recurrent extreme weather events and natural disasters across the world, with adverse socioeconomic consequences. As a result of climate change, the occurrence of natural environmental disasters such as tropical cyclones, typhoons, droughts, and floods has intensified [3].
Japan is located where several types of disaster occur repeatedly, including volcanic and seismic events, which stem from its position on the Ring of Fire, as well as typhoons and floods. These disasters impair progress toward the national Sustainable Development Goals (SDGs). Therefore, further efforts are needed to explore how ecosystems are altered by hydrological processes under both usual and severe climate conditions. The heavy seasonal rainfall that occurs in western Japan is among the most destructive disasters because it is accompanied by landslides and mudflows.
Recently, information-based theories have received increasing attention in hydrological studies as a means of detecting variability in hydrological datasets, including precipitation, temperature, and streamflow [5][6][7]. In particular, Al Sawaf and Kawanisi [7] investigated the patterns of a mountainous river in Hiroshima at low and high frequencies and demonstrated that river flow has two regime scales. They also indicated, with only a limited demonstration, that information and complexity measures can be used to assess flood events. The purpose of this study is therefore to assess the variability of annual flood events at two stations on a river in Hiroshima Prefecture, a region strongly affected by the torrential floods that hit western Japan.

Study site and data description
The Gono River, located in western Japan, was chosen for this study. The river originates in Kita-Hiroshima, flows through Hiroshima and Shimane Prefectures, and finally discharges into the Sea of Japan. The watershed of the Gono River has a cool temperate climate with four pronounced seasons. Precipitation occurs in winter; however, heavy rainfall occurs during the monsoon (June and July) as well as during the typhoon season (August and September). The catchment area of the Gono River is 3963 km². Furthermore, the Gono River has three tributaries, namely the Basen, Saijo, and Kannose Rivers (Fig. 1). The Gono River watershed is monitored by the Japanese Ministry of Land, Infrastructure, Transport and Tourism (MLIT) using real-time gauging stations (Fig. 1) that measure the water stage (H) directly. In this study, the Yoshida and Tono stations, located downstream of the Haji and Haizuka Dams, respectively, were included in the assessment.
Haji Dam is located in the upstream region of the Gono River, whereas Haizuka Dam is located in the upstream reach of the Basen River, which is one of the main tributaries of the Gono River (Fig. 1).
The studied stations were established by the MLIT and continuously measure water stages. The maximum daily water stage at each station was obtained from the MLIT database for 2002 to 2020. At each stage monitoring station, the local MLIT administration has defined a five-level warning protocol for flood events.

Methodology
To assess and rank the variability of annual flood events at the selected stations, the method comprised the following steps.
First, following the symbol-string approach [5], daily stage records were encoded with a Boolean conversion into the symbols S and I. In other words, each reading was mapped to S ("Secure") if it was below the threshold stage (secure river site condition), or to I ("Insecure") otherwise (insecure river site condition). The threshold stage is predetermined at each monitoring station by the local MLIT offices. In this study, the "Designated water level" was selected as the threshold stage. Once the record was encoded, a word length (L) of two characters was defined. Hence, the number of possible different words that could be detected in the studied records was $2^L$, and the possible words in the encoded dataset were SS, SI, IS, and II (see Fig. 2). The next step involved determining the following three sets of probabilities: i) $p_{L,i}$, the state probability of the $i$-th L-word, where $i = 1, 2, \dots, 2^L$; ii) $p_{L,ij}$, the probability of shifting directly from the $i$-th to the $j$-th L-word, where $i = 1, 2, \dots, 2^L$ and $j = 1, 2, \dots, 2^L$; and iii) $p_{L,i \to j}$, the conditional probability of the occurrence of the $j$-th L-word given that the $i$-th L-word has been observed before. Once these probabilities were determined, the information and complexity metrics could be estimated. In this study, we investigated the Metric Entropy (ME) as the information metric and the Rényi Complexity (CR) as a quantification of complexity content, as explained below.
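The encoding and probability steps described above can be sketched in a few lines of Python. This is only an illustration and not the SYMDYN implementation: the function names, the sample stage values, and the threshold are hypothetical, and the words are assumed to be read with an overlapping window that advances one day at a time.

```python
from collections import Counter

def encode_stages(stages, threshold):
    """Map each daily stage to 'S' (below threshold) or 'I' (otherwise)."""
    return "".join("S" if h < threshold else "I" for h in stages)

def build_words(symbols, L=2):
    """Overlapping words of length L (window step of one day is an assumption)."""
    return [symbols[k:k + L] for k in range(len(symbols) - L + 1)]

def state_probabilities(word_list):
    """Relative frequency of each observed word."""
    counts = Counter(word_list)
    total = len(word_list)
    return {w: c / total for w, c in counts.items()}

def transition_probabilities(word_list):
    """Joint probability of word i being followed directly by word j, and
    conditional probability of j given that i was observed just before."""
    pairs = Counter(zip(word_list[:-1], word_list[1:]))
    n_pairs = sum(pairs.values())
    first = Counter(word_list[:-1])
    joint = {ij: c / n_pairs for ij, c in pairs.items()}
    conditional = {ij: c / first[ij[0]] for ij, c in pairs.items()}
    return joint, conditional

# Hypothetical stage readings (m) and a hypothetical threshold stage (m).
stages = [1.2, 1.4, 3.1, 3.3, 1.0, 0.9, 3.0, 1.1]
symbols = encode_stages(stages, threshold=3.0)
word_list = build_words(symbols)
p_state = state_probabilities(word_list)
p_joint, p_cond = transition_probabilities(word_list)
print(symbols)  # SSIISSIS
```

A reading equal to the threshold is encoded as I here, which follows the rule that only values below the threshold count as secure.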
The Metric Entropy (ME) is the Shannon entropy (ES) divided by the word length (L). It is thus a normalization of the Shannon entropy that offers an image of the information contained in a dataset while being independent of the word length (L).
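As a minimal sketch of this definition, the Metric Entropy of an encoded record can be computed as follows. The base-2 logarithm and the overlapping word window are assumptions made for the illustration, and `metric_entropy` is a hypothetical helper name rather than part of SYMDYN.

```python
import math
from collections import Counter

def metric_entropy(symbols, L=2):
    """Shannon entropy of the L-word distribution divided by L.

    Assumes base-2 logarithms and overlapping words (window step of one).
    """
    word_list = [symbols[k:k + L] for k in range(len(symbols) - L + 1)]
    total = len(word_list)
    probs = [c / total for c in Counter(word_list).values()]
    shannon = sum(-p * math.log2(p) for p in probs)
    return shannon / L

# A record with a single word carries no information.
me_constant = metric_entropy("SSSSSS")
# A record in which SS, SI, II and IS are equally frequent.
me_uniform = metric_entropy("SSIIS")
```

Under these assumptions the value lies between 0, for a record with a single word, and 1, for a record in which all $2^L$ words are equally frequent.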
Rényi Complexity, on the other hand, is a measure of complexity that can be defined by the differences of Rényi entropies of conjugated orders [8]. The Rényi Complexity CR(α) of order α > 1 of a distribution of L-words includes a scaling factor that yields independence from the word length and eliminates the zero for α → 1. Investigations indicate that CR is specific, in the sense of a measure of complexity, only for α ≈ 1; the Rényi Complexity is therefore defined for α close to 1.
River stage records were examined year by year to investigate the annual variation of flood patterns. It is important to point out that the above analyses were carried out using the SYMDYN script developed by Wolf [5].
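The idea of a difference of Rényi entropies can be illustrated with the short sketch below. It is not the CR formula of [8] and not the SYMDYN code: it assumes that "conjugated orders" refers to the pair 1/α and α, it omits the scaling factor, and the value α = 1.5 and the function names are hypothetical choices made only for the demonstration.

```python
import math
from collections import Counter

def renyi_entropy(probs, alpha):
    """Rényi entropy of order alpha (alpha > 0, alpha != 1), in bits."""
    return math.log2(sum(p ** alpha for p in probs)) / (1.0 - alpha)

def renyi_difference(symbols, L=2, alpha=1.5):
    """Unscaled difference of Rényi entropies of orders 1/alpha and alpha.

    Assumption: the conjugated orders are 1/alpha and alpha; the scaling
    factor used in the Rényi Complexity is deliberately left out.
    """
    word_list = [symbols[k:k + L] for k in range(len(symbols) - L + 1)]
    total = len(word_list)
    probs = [c / total for c in Counter(word_list).values()]
    return renyi_entropy(probs, 1.0 / alpha) - renyi_entropy(probs, alpha)

d_constant = renyi_difference("SSSSSS")    # single word
d_uniform = renyi_difference("SSIIS")      # four equally frequent words
d_mixed = renyi_difference("SSSSISSS")     # unevenly distributed words
```

Because the Rényi entropy does not increase with its order, this difference is never negative. It vanishes both for a single-word record and for equally frequent words, and is positive only when the words are unevenly distributed, which is the behaviour expected of a complexity measure.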

Ranking and assessment using word number
The number of different words recorded each year is an important indicator of the annual flood pattern. In other words, the set of observed words gives an impression of the flood state in a given year. For example, if the observed number of different words was 1, the only pattern was SS (i.e., the water level at the studied station remained below the selected threshold and no significant flood was detected in that year). Conversely, if the observed number of different words was 4, the pattern II was present, which indicates that at least one significant flood occurred during that year and persisted for at least two consecutive days. Finally, if the number of different words was 3, the flood-related words were SI and IS only, which indicates that at least one significant flood occurred in that year but none persisted for more than one day. Table 1 reports the number of different words recorded each year at the Yoshida and Tono stations. At Yoshida Station, the years 2002, 2007, and 2008 yielded only one word (i.e., SS). In other words, the daily water level recorded at this station during those years was below the investigated threshold.
In general, the median number of different words recorded at Tono was greater than that observed at Yoshida Station.
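The word-count reading of a year can be sketched as follows. The helper names and the three sample records are hypothetical, and overlapping two-character words are assumed.

```python
def distinct_words(symbols, L=2):
    """Sorted list of the different L-words found in an encoded record."""
    return sorted({symbols[k:k + L] for k in range(len(symbols) - L + 1)})

def classify_year(symbols, L=2):
    """Return the number of different words and a plain-language reading."""
    found = distinct_words(symbols, L)
    if "II" in found:
        label = "at least one flood lasting two or more consecutive days"
    elif "SI" in found or "IS" in found:
        label = "only single-day exceedances"
    else:
        label = "no exceedance"
    return len(found), label

quiet_year = classify_year("SSSSSS")     # only SS
short_flood = classify_year("SSISSS")    # SS, SI, IS
long_flood = classify_year("SSIISS")     # SS, SI, II, IS
```

The three sample records give word counts of 1, 3, and 4, matching the three cases discussed in this section.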

Ranking and assessment using information and complexity measures
Assessment and ranking using information and complexity metrics is useful because it provides further understanding of the randomness and internal patterns embedded in flood events. First, the ME quantifies the randomness in a dataset and represents an image of the information it contains, while being independent of word length. Complexity metrics, on the other hand, capture the existence of internal patterns in the studied datasets, i.e., how a system transforms from one pattern to another. Hence, data that contain a high level of fluctuation have a larger CR value. Table 2 compares the values of ME (information) and CR (complexity) observed at the Yoshida and Tono Stations. When only one word was observed in a year (e.g., 2002, 2007, and 2008 at Yoshida Station; Table 1), both ME and CR were zero (Table 2). That is to say, the stage on every day of these years was below the specified threshold, and hence no disordered pattern could be noticed. Interestingly, at Yoshida Station, the highest values of information and complexity were recorded in 2006 and 2004, respectively. At Tono Station, however, the highest values of information and complexity were recorded in 2004 and 2010, respectively, suggesting that several patterns can be observed at Tono Station owing to different patterns of water release from Haizuka Dam (upstream of Tono Station).
To better understand the information and complexity patterns observed at the Yoshida and Tono Stations, Table 3 shows the number of days on which the water level at those stations was greater than the selected threshold. As can be seen, Yoshida Station in 2020 shares the same scores as Tono Station in 2011. Notably, in 2020, the flood pattern observed at Yoshida Station consisted of two single-day flood events and two events that persisted for two consecutive days. Likewise, in 2011, Tono Station showed the same annual pattern (i.e., two single-day flood events and two events that persisted for two consecutive days), and hence their ME and CR values were similar. In the same manner, the Yoshida and Tono Stations shared the same information and complexity values in 2015, together with the same annual flood pattern (i.e., one single-day flood event).
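The event pattern of a year (how many floods occurred and how long each one lasted) can be read from the encoded record as in the sketch below. The sample record is hypothetical; it was constructed to reproduce the pattern of two single-day events and two two-day events described above, and is not the observed data.

```python
from itertools import groupby

def flood_event_lengths(symbols):
    """Length in days of each run of consecutive 'I' symbols."""
    return [len(list(run)) for key, run in groupby(symbols) if key == "I"]

# Hypothetical encoded record, not taken from the MLIT database.
record = "SSISSIISSSISIISS"
lengths = flood_event_lengths(record)
days_above_threshold = sum(lengths)
single_day_events = lengths.count(1)
two_day_events = lengths.count(2)
```

For this record the event lengths are 1, 2, 1, and 2 days, giving six days above the threshold. Two records of equal length with the same set of event lengths contain the same numbers of SI, IS, II, and SS words, apart from events that touch the start or end of the record, which is consistent with such years sharing their ME and CR values.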

Annual ranking of flood events
The overall ranking of flood events was performed using an empirical overall score. A higher overall score indicates a higher occurrence of floods in a given year, as it reflects a larger number of disordered and complex patterns. The overall score is given in Table 2, and the annual events were then ranked starting from the most severe year. In general, 2004 and 2006 were among the years most affected by floods at both studied stations.

Conclusions
Understanding the different characteristics of floods and their hidden patterns is of paramount importance for controlling the quantity and quality of water resources, mitigating potential flood risks, and providing best water management practices.
In this study, we assessed flood patterns using daily water stage records obtained from two stations located downstream of dams. The main purpose was to investigate the hidden patterns of floods using information and complexity measures, namely the Metric Entropy as a measure of information content and the Rényi Complexity as a measure of complexity content. The findings show that the information metric offers an image of the randomness embedded in the dataset, whereas the complexity metric indicates the presence of internal patterns in the studied datasets. In general, this research shows that natural environmental risks and disasters can be assessed and ranked using a promising physical scheme based on information and complexity measures.