Flood Frequency Analysis Using Participatory GIS and Rainfall Data for Two Stations in Narok Town, Kenya

Flood management requires in-depth computational modelling through assessment of flood return periods and river flow data in order to effectively analyze catchment response. The participatory geographic information system (PGIS) is a tool that is increasingly used for collecting data and making decisions on environmental issues. This study sought to determine the return periods of major floods that occurred in Narok Town, Kenya, using rainfall frequency analysis and PGIS. For this purpose, a number of statistical distribution functions were applied to daily rainfall data from two stations: Narok water supply (WS) station and Narok meteorological station (MS). The first station has a dataset of thirty (30) years and the second a dataset of fifty-nine (59) years. The parameters obtained from the Kolmogorov–Smirnov (K–S) test and chi-square test helped to select the appropriate distribution. The best-fitted distributions for the WS station were the Gumbel L-moment, Pareto L-moment, and Weibull distributions for maximum one-day, two-day, and three-day rainfall, respectively. For the MS data, the best-fitted distributions were the generalized extreme value L-moment, Gumbel, and gamma distributions for maximum one-day, two-day, and three-day rainfall, respectively. Each of the selected best-fitted distributions was used to compute the corresponding rainfall intensity for the 5, 10, 25, 50, and 100 years return periods, as well as the return period of the significant floods that occurred in the town. The January 1993 flood was found to have a return period of six years, while the April 2013, March 2013, and April 2015 floods had a return period of one year each. This study helped to establish the return period of major flood events that occurred in Narok, and highlights the importance of involving the population in disaster management. The study’s results would be useful in developing flood hazard maps of Narok Town for different return periods.


Introduction
Return period is an essential tool in hydrology that is used to estimate the time interval between events of a similar size or intensity. However, estimating the return period of such events can become an arduous task for various reasons, such as missing data, short time series, or the

Study Area
The study area was Narok Town, the headquarters of Narok County, which is situated in the southwestern part of Kenya. Narok County borders Nakuru County to the North, Kajiado County to the East, the Republic of Tanzania to the South, and Bomet, Kisii, Migori, and Nyamira counties to the West (see Figure 1). The Narok catchment is formed by the Kakia and Esamburumbur subcatchments. The area of the watershed is 46.2 km², with the elevation varying from 1844 m to 2138 m. The main permanent river, Enkare Narok, passes through Narok Town. However, the Kakia and Esamburumbur dry valleys, which fall within the study area, often turn into rivers during heavy rains. The longest flow paths for Kakia and Esamburumbur measure 13.20 km and 10.01 km, respectively, with an average slope of 18%. The downstream area experiences frequent flash flooding, which has harmful consequences in the town, such as loss of life and destruction of property. Flooding also interferes with the local community's culture, threatens lives and livelihoods, and often results in a decline in people's economic fortunes and in poverty, among other negative effects. The topography of Narok Town gives it a basin-like formation through which floods drain during heavy rains. In the higher areas of the town, deforestation and inadequate drainage structures lead to flooding and cause the submersion of roads and buildings, while the catchment is made up of agricultural land, with maize and wheat being the dominant crops. For the purposes of this study, two time series of rainfall data were collected: one at the water supply station (WS) and the other at the meteorological station (MS). The location of the two stations is shown in Figure 1. The rainfall data ranged from 1959 to 2018 for the MS, and from 1968 to 2018 for the WS.
In the rainfall data, months with more than 30 days of missing values, especially for the rainy season (March to May; November, December) for a specific year, were not used in the analysis. Thus, the usable data series amounted to 30 years for the WS station and 59 years for the MS.
Hydrology 2019, 6, 90

PGIS
The National Centre of Geographic Information defines GIS as a system of hardware, software, and procedures that facilitates the management, manipulation, analysis, modelling, representation, and display of georeferenced data to solve complex problems regarding the planning and management of resources. Following this definition, five functions are assigned to GIS, namely, data entry, data display, data management, retrieval, and information analysis [16]. In recent decades, GIS has been used for environmental threat assessment. However, GIS hardware, software, and data are expensive and require a high level of technical expertise [14]. In addition, traditional GIS has been criticized for not adequately addressing and incorporating social issues [17], which gave rise to the term "Participatory GIS". Different researchers have examined the need to integrate the participation of the population in decision making, and adequate means of achieving it. McCall [18] emphasized the need for precision in PGIS and stated that the degree of accuracy depends on the purpose of the PGIS. Despite the questions that have surrounded participatory GIS, the method has been improved over the years and is increasingly used. Corbett et al. [19] applied PGIS to the assessment of social and ecological variation in Mpumalanga province, South Africa. The map obtained was based on people's knowledge. Tripathi and Bhattarya [20] evaluated the importance of integrating indigenous knowledge into the GIS approach and emphasized the importance of the participation of the local community in decision making. Rinner and Bird [21] used an online discussion forum to evaluate local community engagement in development projects. PGIS draws on the diversity of experiences associated with "participatory development" [14] by involving people in GIS data collection and analysis in their community.
For instance, a participatory approach to flood risk management requires the collection of information from the communities actually affected by the flooding [13]. Depending on the availability of data, researchers either engage directly with the community or use existing information on the community [13]. PGIS is a continuously evolving method, and researchers keep discovering new ways in which it can be applied to different problems. The method is used to add to current information, to uncover new and unknown information and alternative competing positions, and to discover and interpret people's "natural geography" [18]. The information collected from the population in Narok using PGIS, although incomplete (because the specific days of flooding could not be identified in some cases), proved beneficial for this study.

Flood Frequency Analysis
Generally, the steps followed in flood frequency analysis are as follows:
Step 1: Selection of the data
Here, the annual maxima of daily rainfall and of cumulative two-day and three-day rainfall are selected for the analysis. This research focused on two-day and three-day rainfall because three-day rain flood records form an accurate representation of the magnitude of the flood flows [22]. In addition, three days is the most critical rain flood duration for designing and evaluating flood mitigation [23].
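The data selection in Step 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure implemented in the software used for the study: the function name `annual_max_ndays`, the rule of assigning a multi-day window to the year of its last day, and the toy record are all assumptions made for the example.

```python
from datetime import date, timedelta

def annual_max_ndays(daily, n):
    """Return {year: largest rainfall total over n consecutive days}.

    `daily` maps datetime.date -> rainfall depth in mm. A window is skipped
    when any of its days is missing, and (by assumption) it is assigned to
    the year of its last day.
    """
    maxima = {}
    for end in daily:
        window = [end - timedelta(days=k) for k in range(n)]
        if all(d in daily for d in window):
            total = sum(daily[d] for d in window)
            if total > maxima.get(end.year, float("-inf")):
                maxima[end.year] = total
    return maxima

# Hypothetical toy record in mm/day, NOT the station data.
toy = {
    date(2015, 4, 25): 2.0,
    date(2015, 4, 26): 20.0,
    date(2015, 4, 27): 15.0,
    date(2015, 4, 28): 25.0,
    date(2015, 5, 6): 30.0,
}

one_day = annual_max_ndays(toy, 1)
two_day = annual_max_ndays(toy, 2)
three_day = annual_max_ndays(toy, 3)
print(one_day, two_day, three_day)
```

In this toy record the largest single-day value and the largest three-day total fall on different dates, which is the situation that motivates analysing multi-day accumulations alongside daily maxima.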
Step 2: Fitting the probability distribution
Software for statistical extreme value analysis has developed rapidly [24]. Nowadays, different program packages and predefined Excel sheets are used to perform frequency analysis. The most widely used include RAINBOW (developed by the Institute for Land and Water Management of the Katholieke Universiteit Leuven [25]); PeakFQ (which performs statistical flood-frequency analyses of annual-maximum peak flows [26]); CumFreq (initially developed for the analysis of hydrological measurements of variable magnitudes in space and time); Hydrognomon [27], developed in 1997 by the ITIA research team of the National Technical University of Athens ("Itia" is not an acronym; it is the Greek name of the willow tree); and Hyfran (developed in Canada by the team of Bernard Bobée, chairman of statistical hydrology from 1992 to 2004). The Hyfran software can be used for any dataset of extreme values, provided that the observations are independent and identically distributed [28]. All of these packages can perform statistical analysis using commonly used probability functions such as the normal, log-normal, Weibull, gamma, Gumbel, exponential, or Pareto distributions. This study used Hydrognomon, a software tool for the processing of hydrological data [27], for the rainfall frequency analysis. Although Hydrognomon is not commonly used in the reviewed literature for flood frequency analysis, and few researchers have used it for time-series analysis, it has been used successfully to simulate the hydrology of the Kaduna River in Nigeria [29] and to model future climatic variation [30]. The software is freely available and can perform frequency analysis among many other hydrological tasks. One of its advantages is that it supports several time steps, from the finest minute scales up to decades [27], as well as the filling of missing values. The software can also fit more than thirteen statistical distributions and perform statistical tests.
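As an illustration of what an L-moment fit involves, the sketch below estimates Gumbel (extreme value type 1) parameters from the first two sample L-moments. It is not Hydrognomon's implementation; it assumes the standard unbiased probability-weighted-moment estimators and the textbook Gumbel relations (scale = l2 / ln 2, location = l1 − γ·scale, with γ the Euler–Mascheroni constant). The function names and the sample values are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def sample_l_moments(values):
    """Return the first two sample L-moments (l1, l2).

    Uses the unbiased probability-weighted moments
    b0 = mean(x) and b1 = sum((i / (n - 1)) * x_(i)) / n over the
    ascending order statistics with zero-based rank i.
    """
    x = sorted(values)
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    b0 = sum(x) / n
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    return b0, 2.0 * b1 - b0

def fit_gumbel_lmom(values):
    """Estimate Gumbel (location, scale) by matching l1 and l2."""
    l1, l2 = sample_l_moments(values)
    scale = l2 / math.log(2.0)
    location = l1 - EULER_GAMMA * scale
    return location, scale

# Hypothetical annual maxima in mm, NOT the Narok record.
annual_maxima = [10.0, 20.0, 30.0, 40.0]
l1, l2 = sample_l_moments(annual_maxima)
location, scale = fit_gumbel_lmom(annual_maxima)
print(f"l1={l1:.2f} l2={l2:.2f} location={location:.2f} scale={scale:.2f}")
```

L-moments are linear combinations of the ordered observations, which makes them less sensitive to a single extreme value than conventional moments; this is one common reason for preferring them with short rainfall records.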
Step 3: Goodness-of-fit test to identify the best-fitting distribution
Goodness-of-fit test statistics are used to check the validity of a specified or assumed probability distribution model [31]. A goodness-of-fit test, in general, measures how well the observed data correspond to the fitted (assumed) model. The commonly used goodness-of-fit tests are the Kolmogorov-Smirnov (K-S) test, the root mean square error (RMSE) test, the Chi-square test, and the Anderson-Darling (A-D) test. The K-S test is an exact test: the distribution of the K-S test statistic itself does not depend on the underlying cumulative distribution function being tested. In addition, the Chi-square test is easy to interpret, so more detailed information can be derived from it than from many other statistical tests [32]. It also has the advantage that it can be applied to any univariate distribution. For those reasons, the K-S test and the Chi-square test were selected for identifying the best-fitted distribution. The Chi-square test requires a sufficiently large sample, which is not a problem in the current study given the size of the available sample.
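The two statistics used in Step 3 can be computed as in the following sketch. It is a simplified illustration, not the routine used by Hydrognomon: the chi-square statistic is computed over classes that are equiprobable under the fitted distribution (one common convention, assumed here), no p-values or critical values are derived, and the Gumbel parameters and sample are hypothetical.

```python
import math

def gumbel_cdf(x, location, scale):
    """Non-exceedance probability of the Gumbel (EV1) distribution."""
    return math.exp(-math.exp(-(x - location) / scale))

def ks_statistic(values, cdf):
    """Largest vertical gap between the empirical CDF and the fitted CDF."""
    x = sorted(values)
    n = len(x)
    d = 0.0
    for i, xi in enumerate(x):
        f = cdf(xi)
        # The empirical CDF jumps from i/n to (i+1)/n at xi.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def chi_square_statistic(values, cdf, n_bins):
    """Pearson chi-square over classes equiprobable under the fitted CDF."""
    n = len(values)
    expected = n / n_bins
    counts = [0] * n_bins
    for v in values:
        k = min(int(cdf(v) * n_bins), n_bins - 1)
        counts[k] += 1
    return sum((c - expected) ** 2 / expected for c in counts)

# Hypothetical annual maxima (mm) and hypothetical fitted parameters.
sample = [28.0, 35.5, 41.2, 44.0, 47.9, 52.3, 58.1, 63.4, 70.2, 81.7]
fitted = lambda x: gumbel_cdf(x, location=43.0, scale=13.0)
print(ks_statistic(sample, fitted), chi_square_statistic(sample, fitted, 5))
```

When several candidate distributions are fitted to the same series, the one with the smallest statistics is the natural choice, which matches the selection rule described for Tables 2 and 3.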

Results and Discussion
This study tested different distribution functions in order to compute the return period of significant flood events in Narok Town. Unfortunately, there is no meteorological station in the catchment. Although a Trans-African HydroMeteorological Observatory (TAHMO) station was installed in November 2018 in the northeastern part of the catchment, it lacked sufficient data for the analysis. However, the rainfall data from the MS and WS stations made it possible to confirm the information gathered during population interviews. Table 1 below presents the maximum daily precipitation for both WS and MS. Unfortunately, the WS record has some missing data (from 1996 to 2010). In hydrology, missing data can lead to misunderstanding of rainfall variability and historical patterns [33]. The handling of missing data in meteorological time series is a relevant issue for many climatological analyses [34]. Missing data can be filled using diverse techniques such as interpolation [35], correlation analysis among adjacent stations [22], the regression-based interval filling method [36], or inverse distance weighted techniques [37]. However, filling missing data can severely compromise the value of the series for specific purposes [38]. Therefore, Oosterbaan [35] recommends that additional information (information used to fill data) be omitted from statistical analysis [38,39]. Notably, the maximum daily rainfall per year varies among the stations (Figure 2). The values recorded at the two stations were at times very close or identical, as in 1975 and 2015, when the recorded maximum rainfall was 29.6 and 67 mm for the MS and 29.7 and 67 mm for the WS station. The MS recorded the highest amount of rainfall in more years because it is located further to the South of the WS station.

Selection of the Best-Fitted Distribution
The maximum daily rainfall, the maximum sum of two consecutive days' rainfall, and the maximum sum of three consecutive days' rainfall of each year were used for statistical analysis. Tables 2 and 3 summarize the goodness-of-fit test results for each station. In the tables, LP III, Par, GEV, EV1, Gal, Gam, Pear, and LN represent the log-Pearson type 3, Pareto, generalised extreme value, extreme value type 1, Galton, gamma, Pearson, and log-normal distributions, respectively.
Table 2. Summary of best-fitted distribution for water supply station.
In Tables 2 and 3, the selected distributions, which are those with the lowest K-S and Chi-square values, are highlighted.
The goodness-of-fit statistics were calculated together with the skewness, standard deviation, and mean of the data. The water supply station data were moderately skewed, with a skewness of 0.527, while the meteorological station data were highly skewed, with a skewness of 1.725, which is greater than 1. The statistical comparison by the K-S and Chi-square goodness-of-fit tests (values given in parentheses, in that order) showed that the best-fitted distributions for the water supply station were extreme value type 1 (Gumbel) L-moment (0.08, 0.33), Pareto L-moment (0.07, 1.7), and Weibull (0.07, 0.07) for the maximum daily, maximum two-day, and maximum three-day rainfall data, respectively. For the MS, the best-fitted distributions were generalized extreme value L-moment (0.04, 1.15), Gumbel (0.06, 4.71), and gamma (0.05, 2.1) for maximum one-day, two-day, and three-day rainfall, respectively. However, selecting these distributions does not necessarily exclude the use of others. In fact, for the WS station, GEV L-moment with specified kappa could be used for the maximum daily rainfall analysis, GEV L-moment for the maximum two-day rainfall, and normal L-moment or Pareto L-moment for the maximum three-day rainfall. In particular, the normal L-moment distribution was as good as the Weibull distribution for the maximum three-day rainfall analysis, since its results were very close to those of the Weibull distribution. For the MS, the log-normal distribution could also be appropriate for the maximum daily rainfall as well as the two-day and three-day maximum rainfall.
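For readers who wish to reproduce a skewness check of this kind, a minimal sketch follows. The paper does not state which skewness estimator was used, so the adjusted Fisher-Pearson coefficient is an assumption here, as is the 0.5 lower bound for "moderately skewed" (a common rule of thumb); only the "greater than 1" threshold for high skewness comes from the text. The sample values are hypothetical.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness coefficient (assumed estimator)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three observations")
    mean = sum(values) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined for constant data")
    return n / ((n - 1) * (n - 2)) * sum(((v - mean) / s) ** 3 for v in values)

def describe_skew(g):
    """Classify skewness; the 0.5 bound is a rule of thumb, not from the paper."""
    a = abs(g)
    if a > 1.0:
        return "highly skewed"
    if a >= 0.5:
        return "moderately skewed"
    return "approximately symmetric"

# Hypothetical right-skewed sample, NOT station data.
g = sample_skewness([1.0, 1.0, 1.0, 1.0, 10.0])
print(round(g, 3), describe_skew(g))
```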

5, 10, 25, 50, and 100 Years Return Period Calculation
For the best-fitted distribution, the Gumbel distribution was selected for the return period calculation. Due to the small size of the dataset, the 50 and 100 years return period intensities were computed using a 95% confidence interval. Table 4 presents a summary of the 5, 10, 25, 50, and 100 years return period rainfall amounts for each station, based on the maximum daily, maximum two-day, or maximum three-day rainfall.
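The calculation behind a table of this kind can be sketched with the Gumbel quantile function: for a return period of T years the non-exceedance probability is 1 − 1/T, and the corresponding depth is location − scale · ln(−ln(1 − 1/T)). The parameters below are hypothetical placeholders, not the fitted values from this study, and the sketch does not reproduce the 95% confidence interval mentioned above.

```python
import math

def gumbel_quantile(return_period, location, scale):
    """Rainfall depth exceeded on average once every `return_period` years."""
    if return_period <= 1:
        raise ValueError("return period must exceed one year")
    p = 1.0 - 1.0 / return_period  # annual non-exceedance probability
    return location - scale * math.log(-math.log(p))

# Hypothetical Gumbel parameters in mm, for illustration only.
LOCATION, SCALE = 43.0, 13.0

design_depths = {
    t: gumbel_quantile(t, LOCATION, SCALE) for t in (5, 10, 25, 50, 100)
}
for t, depth in design_depths.items():
    print(f"{t:>3}-year rainfall: {depth:.1f} mm")
```

The same function applies to the one-day, two-day, and three-day series; only the fitted parameters change.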

The Return Period of the Significant Flood Event
From the population interviews, reliable information was collected on significant flood events that occurred in the town. Additional information collected from the Water Resource Authority (WRA) provided support. Some inhabitants were able to indicate the level that the water reached on walls or trees. In spite of difficulties faced in some areas, such as the language barrier and the age of the inhabitants (too young to remember the event), it was possible to gather consistent information on three major past floods in the town. Most of the inhabitants confirmed that the significant flood events in the town occurred in January 1993, in April and May 2013, and on 28th April 2015.
Most of the information recorded from the population interviewed was confirmed by the amount of rainfall recorded. However, in some cases, the amount of daily rainfall seemed insufficient to explain the occurrence of a flood, necessitating the consideration of two or three days' rainfall. For example, in 2015 the maximum daily rainfall was recorded on 6th May. According to the interviews, this did not lead to flooding. However, the three-day rainfall total from 26th to 28th April 2015 (the maximum three-day rainfall), which was 63.2 mm (Table 5), led to a flood. It should be noted that rainfall was recorded almost daily in April. Thus, the accumulation of water in the ground and the excess surface runoff, compounded by the sparse vegetation in the town, could explain the occurrence of the flood on 28th April 2015.
Table 5. Rainfall distribution on the major flood events.
In the above table, the dates on which the daily maximum rainfall was recorded for each significant flood event are highlighted.

(Table column headings: MS rainfall in mm; WS rainfall in mm.)
The return periods of the significant flood events that occurred in the town were computed and are shown in Table 6 below. For example, the flood that occurred in January 1993 had a return period of six years, with a corresponding rainfall of 81.7 mm recorded on 20th January 1993.
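The return period of an observed event follows from the fitted distribution by inverting the relation between depth and probability: T = 1 / (1 − F(x)), where F is the fitted cumulative distribution function. The sketch below assumes a Gumbel fit with hypothetical parameters, so its output is illustrative and is not expected to reproduce the values in Table 6.

```python
import math

def gumbel_return_period(depth, location, scale):
    """Return period (years) of an annual-maximum depth under a Gumbel fit."""
    non_exceedance = math.exp(-math.exp(-(depth - location) / scale))
    return 1.0 / (1.0 - non_exceedance)

# Hypothetical Gumbel parameters in mm, NOT the values fitted in the study.
LOCATION, SCALE = 43.0, 13.0

for depth in (30.0, 63.2, 81.7):
    t = gumbel_return_period(depth, LOCATION, SCALE)
    print(f"{depth:5.1f} mm -> about {t:.1f} years")
```

Because the analysis is based on annual maxima, the computed return period is always greater than one year; values close to one correspond to rainfall that is equalled or exceeded in almost every year.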

Conclusions
Rainfall intensity can vary depending on the geographical location, and the study of rainfall variation over the years has emerged as an important aspect of flood management. An important consideration in flood frequency analysis is the determination of the best-fitted distribution in order to compute the appropriate return period. In this study, different probability distribution functions were applied to the time series data of two stations in Narok Town. The analysis covered not only the maximum daily data, but also the maximum two-day and three-day rainfall. This was done because flooding was reported on some days even when the daily rainfall was low: the soil had already absorbed rainfall over the preceding days, so even a small amount of additional rain led to flooding. The K-S test and Chi-square test were applied to each distribution, and the values of each parameter helped to determine the best-fitted distribution in each case. This was followed by the computation of the 5, 10, 25, 50, and 100 years return period rainfall. The results of the statistical analysis were used together with PGIS to compute the corresponding return periods of significant flood events that occurred in the town. Findings from the study show that integrating PGIS in flood management could be helpful in gathering information on past flood events. This opens up the potential for extending the application of PGIS to the analysis of flood extent and flood depth mapping. Going forward, the methods used in this study could be applied in developing flood hazard maps for each maximum rainfall intensity, while the results of the study could be used by the government in flood management in Kenya. More significantly, the method and results could be used by development planners at the county and national levels in identifying flood-prone areas in Narok and in developing mitigation strategies.
The outcome of the study could also be used in urban planning for Narok Town.