Value of quality controlled citizen science data for rainfall-runoff characterization in a rapidly urbanizing catchment

The major concern of applying citizen science in water resources is the quality of the data. However, there are limited scientific studies addressing this concern and showing the data value. In this study, we established a citizen science program in the Akaki catchment which hosts Addis Ababa, Ethiopia. Citizen scientists monitored river stage at multiple gauging sites for multiple years. We evaluated the quality of citizen science data through a systematic quality control. Reference data was obtained from neighboring stations of the citizen science program and professionals while the evaluation involved graphical inspections and statistical methods. The quality-controlled data were applied to evaluate the spatial and temporal variation of rainfall-runoff relationships. Initially, large numbers of suspicious data were detected using single station data but that was significantly reduced when the data of multiple sites were compared. Further comparison against professional data revealed excellent agreement with high correlation coefficient (r > 0.95), and low centered root mean square error (RMSE) < 0.03 – 0.08 mm. The citizen science data indicated a large difference in rainfall-runoff relationship over the dominantly urban and rural sub-catchments. The citizen science data allowed comparison of runoff coefficient and base flow index for recent and historical periods where recent streamflow data is unavailable from a formal data source. This study illustrates the immense value of (i) multiple data quality assessment steps for building confidence on the quality of citizen science data, and (ii) citizen science for enhancing our understanding of rainfall-runoff relationships and change in a rapidly urbanizing catchment.


Introduction
Citizen science (CS) refers to the participation of non-scientist volunteers in the generation of new scientific knowledge and the collection of scientific data in hydrological research (Buytaert et al., 2014). Data quality concerns remain one of the major barriers to realizing the full potential of CS data in the field of water resources research (Downs et al., 2021; Zheng et al., 2018). These concerns arise from expected observational errors and from data incompleteness and inconsistency (Walker et al., 2016), which are mainly caused by citizen scientists' varying skill and motivation levels (Buytaert et al., 2014; Njue et al., 2019). Insufficient mechanisms for CS data validation, especially in the absence of formal data, can further limit the use of such data in water resources research (Walker et al., 2016; Zheng et al., 2018). To improve CS data quality and extend CS programs, studies have suggested motivating participants through monetary and non-monetary incentives, including social recognition (Rutten et al., 2017). In addition, citizen scientists' motivation can be enhanced by offering training, providing timely feedback, actual use of the data, and process evaluation (Tedla et al., 2022; Walker et al., 2021b). Verbrugge et al. (2016) noted that the incentives for CS can vary, ranging from value-oriented to social capital. While addressing concerns about accuracy, it is important to demonstrate how CS data can help to advance research and operational water management.
Despite concerns about CS data quality, there is emerging evidence showing that citizen scientists can provide river stage data of high quality that is fit for research (Strobl et al., 2020). For instance, Weeser et al. (2018) evaluated river stage observations from CS and formal monitoring programs and reported insignificant differences. Similarly, Etter et al. (2020) reported a Kendall rank correlation coefficient (τ) of around 0.65 to 0.90 between CS and formally measured river stage. Lowry and Fienen (2013) found a low Root Mean Square Error (RMSE = 5 mm) between the CS data and river stage data measured by a pressure transducer. A relatively higher RMSE value (27 mm) was reported by Little et al. (2021), who compared river stage data from CS and formal monitoring programs. Most studies showed that CS programs can provide high-quality river stage data in catchments with diverse physical and social characteristics.
CS is a cost-effective hydrological data source that can be used in studying rainfall-runoff catchment response (Mazzoleni et al., 2017; Njue et al., 2019; Starkey, 2018). A few scholars used CS data to calibrate hydrological models, acknowledged their benefits, and suggested further study (Etter et al., 2018; Mazzoleni et al., 2017). Starkey et al. (2017) showed that community-based river stage data combined with formal measurements can be employed to successfully calibrate hydrological models. Etter et al. (2018) showed that river stage data can be employed to calibrate a bucket-type rainfall-runoff model (HBV) despite limited temporal resolution (observations not at regular time intervals throughout the year like formal sources) and some inaccuracies in the data. Weeser et al. (2019) used crowdsourced river stage to calibrate rainfall-runoff models, and subsequently, the model was used for ungauged catchments. Similarly, Mazzoleni et al. (2017) showed that streamflow obtained from crowdsourced river stage observations can enhance flood prediction if incorporated in hydrological models. Further, Walker et al. (2019) and Gowing et al. (2020) demonstrated that citizen science data can inform recharge estimation and rainfall-runoff modeling to enhance understanding of shallow groundwater resources in a previously ungauged catchment.
One of the potential hydrological contributions of CS is providing river stage data to investigate the spatial and temporal variation of rainfall-runoff (RR) relationships. Most studies suggested that CS data can fill spatio-temporal gaps in river stage data (Lowry and Fienen, 2013; Njue et al., 2019; Strobl et al., 2020). For example, Ilja Van Meerveld et al. (2017) demonstrated how CS river stage can be used to generate model-based river flow time series for ungauged basins. Helmrich et al. (2021) reviewed and suggested that CS data are increasingly available to support fine-grained urban flood modelling and prediction. Crowdsourced data have been shown to improve the prediction performance of hydrological-hydraulic models in near-real time (Annis and Nardi, 2019; Avellaneda et al., 2020; Mazzoleni et al., 2018). For instance, Avellaneda et al. (2020) assimilated crowdsourced CS data into a SWAT model and improved estimates of streamflow as compared to initial SWAT parameters. Dasgupta et al. (2022) employed crowdsourced water levels to calibrate hydraulic models and showed useful results in ungauged catchments. Assumpção et al. (2018) reviewed studies on citizen science data for flood modeling and found promising potential to enhance model performance. However, they noted the lack of clear methods for data collection (spatial and temporal) and validation.
There are only a few studies in the literature demonstrating the possible use of CS data to generate knowledge (Buytaert et al., 2014; Starkey et al., 2017). Most studies investigated how CS can provide useful information for flood-related studies (events), while studies that use CS data to enhance understanding of hydrological characteristics over multiple years are rare. Such studies require long-term observations of river stage at sub-daily time intervals, which may add burden to citizen scientists. The analysis by Davids et al. (2019) showed that daily to monthly observation intervals can be useful for minimum river flow studies, whereas maximum river flow can be captured by varying the observation frequency (e.g., recording river stage when it rains). This allows engagement of citizen scientists in intermittent monitoring of river flow, which is still useful to investigate rainfall-runoff relationships. Blöschl et al. (2019) identified twenty-three unsolved problems and noted that CS has not yet been fully exploited to fill the gaps of hydrological data. Further, CS programs are commonly set up in developed countries (Tiago et al., 2017; Walker et al., 2021a), though such programs are equally important in countries with scarce resources.
Despite the increasing number of studies on citizen science programs in hydrology, few studies explore CS engagement with wide spatial coverage, relatively long-term observation, and detailed data quality assessment to ensure reliability and accuracy. Additionally, the potential of CS to enhance understanding of rainfall-runoff relationships in various catchments, including those witnessing urbanization, has not been fully investigated. In this study, we addressed two research gaps: (i) the lack of a systematic approach to check citizen science data quality, and (ii) the lack of sufficient studies utilizing citizen science data to better understand rainfall-runoff relationships in urbanizing catchments. We provide a sequential approach to evaluate citizen science data quality.
The Akaki catchment is rapidly urbanizing because of Addis Ababa city and the surrounding small towns. We designed and established the CS program in 2020, and it has been operational since then. Citizen science engagement at multiple sites across the catchment allowed evaluation of the rainfall-runoff relationship across space. Studies of the spatiotemporal variation of rainfall-runoff response using CS data collected for multiple years (three years) over multiple urbanizing sub-catchments are rare in the literature. The benefits of our study include filling the data void of formal monitoring networks, providing empirical evidence on the value of CS data, reconnecting residents to the urbanizing catchment, and contributing to the acceleration of the Sustainable Development Goals.

Study area
Akaki is the headwater catchment of the Awash River Basin (Fig. 1). Geographically, the catchment is located between 8°46′-9°14′ N and 38°34′-39°04′ E, with a total surface area of about 1500 km² and elevations ranging from 2,028 to 3,370 m a.m.s.l. The catchment has an average terrain slope of 10.6 %, with steep slopes ranging from 8 to 15 % in 25 % of the catchment area (Bekele et al., 2022). Based on Ethiopia's traditional agro-ecological zones, the study area has two zones: Dega (2,300-3,200 m) and Woina Dega (1,500-2,300 m). Dega occupies nearly 73 % of the area, while Woina Dega covers 27 %. The mean annual rainfall in 33 years of recorded data from meteorological stations is around 1130 mm. The months of July and August have the highest average monthly rainfall (310-360 mm). The catchment's mean annual potential evapotranspiration is about 948 mm, and its mean annual temperature is 17.4 °C. The Akaki catchment is divided into the Big Akaki (which covers 71.2 % of the catchment and drains the eastern part) and the Little Akaki (which drains the western part) sub-catchments, which join at Aba Samuel reservoir. The catchment provides surface water (Gefersa, Dire, and Legedadi reservoirs) and groundwater supply for the city of Addis Ababa and its surroundings.
The Ministry of Water and Energy (MoWE) used to monitor the Akaki River at Big Akaki, Little Akaki, and Mutinicha stations. Except at the old bridge of Big Akaki River, river flow monitoring has been interrupted at these stations since 2004. Because of the absence of an updated rating curve, the river stage data has not been converted to river flow data at the old bridge of Big Akaki. Therefore, the Akaki can be considered an ungauged catchment since 2004.
A recent study by Negash et al. (2023) shows that the Land Use Land Cover (LULC) of Akaki catchment has six major classes: agriculture (rainfed and irrigated), urban, forest, water body, and bare land. Agriculture and urban are the dominant land cover classes, followed by forestland. Addis Ababa city is situated in the central part of the catchment. The city and its outskirts have experienced significant expansion because of rapid population growth and urbanization. The city's population increased from 443,728 in 1984 to 2,739,551 in 2007, with an annual growth rate of around 3 % (CSA, 2007). In 2023, the population is projected to be around 3.945 million for the city, but there is ongoing debate that it may exceed 5 million.
The urban areas expanded at different rates at sub-catchment level. Fig. 2 shows that the urban area of the Bulbula (BUL) sub-catchment increased from 3.88 % (28.65 km²) in 1990 to 14.67 % (110.08 km²) in 2020. Over the same period, the urban area of the Kebena (KE) catchment increased from 35 % (35.30 km²) to 42 % (43.2 km²). This shows a larger rate of urbanization in Bulbula (which hosts part of Addis Ababa and many small towns) than in Kebena (which mainly hosts the most developed part of Addis Ababa). The new built-up areas of both sub-catchments indicated in Fig. 2 are concentrated around their outlets, which can significantly alter rainfall-runoff response. In general, the urban area of Akaki has increased during the past three decades, expanding from 8.05 % in 1990 to 29.21 % in 2020 (Negash et al., 2023).

Secondary data
The datasets used in this study were obtained from the Ethiopian Meteorology Institute (EMI), Ethiopia's Ministry of Water and Energy (MoWE), global data sources, and our own citizen science program. The daily rainfall data was obtained from EMI for the years 1990 to 2022. Initially, the daily rainfall data were collected from 24 stations. The rainfall was recorded at a daily time step at 9:00 AM using manual rain gauges that have funnel-shaped tops and cylindrical containers.
The daily observed river flow data from 1990 to 2004 was obtained from the Ministry of Water and Energy (MoWE) for the Big Akaki flow gauging station. However, discharge data has not been made available to users since 2004, though river stage data is still recorded at this station. This is because the rating curve has not been updated for a long time despite the changing river morphology.
The river flow and climate data were screened using visual inspection and a simple plot of the data. Incorrect decimal places (unexpected higher and lower values in dry and wet seasons respectively) in numeric fields were corrected by shifting decimal places, and non-numeric data were considered missing records. After screening the data, the rainfall data of 11 stations which have below 20 % missing records were considered fit for further analysis. The inverse distance weighting method (IDW) was applied for filling missing rainfall records using rainfall values from the neighboring stations. The average areal rainfall was determined for each sub-catchment using the Thiessen polygon method. This method creates polygons surrounding each of the gauges and gives weight to station data based on the area coverage (considering site proximity) of the station.
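The gap-filling and areal-averaging steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the planar station coordinates, and the IDW exponent of 2 are assumptions made here for illustration, since the text does not state the exponent that was used.

```python
import math


def idw_fill(target_xy, neighbors, power=2.0):
    """Estimate a missing rainfall value by inverse distance weighting.

    target_xy: (x, y) of the station with the missing record.
    neighbors: iterable of (x, y, rainfall) for stations with data that day.
    power: distance exponent (assumed 2 here; not given in the text).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, rainfall in neighbors:
        distance = math.hypot(x - target_xy[0], y - target_xy[1])
        if distance == 0.0:
            return rainfall  # co-located station: use its value directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * rainfall
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no neighboring stations with data")
    return weighted_sum / weight_total


def thiessen_areal_rainfall(station_rain, polygon_area):
    """Area-weighted mean rainfall over a sub-catchment.

    station_rain: {station: rainfall depth in mm}.
    polygon_area: {station: Thiessen polygon area inside the sub-catchment}.
    """
    total_area = sum(polygon_area[s] for s in station_rain)
    if total_area <= 0.0:
        raise ValueError("polygon areas must sum to a positive value")
    return sum(station_rain[s] * polygon_area[s] for s in station_rain) / total_area
```

The polygon areas would normally come from a GIS overlay of the Thiessen polygons with each sub-catchment boundary; here they are simply passed in as numbers.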

Citizen science
Currently, the Akaki catchment can be considered ungauged despite the rapidly changing catchment conditions. Therefore, on 1 January 2020 we initiated a citizen science program to monitor the river flow of the Akaki River at key locations. Citizen scientists were recruited as volunteers from the local community at the beginning of January 2020. Initially, the objective of the citizen science program was described to people living around the gauging sites. Next, one-to-one meetings were held with volunteers who live around the selected stations to identify citizens who were interested in the program. Volunteers of both genders were given equal opportunities. Finally, six citizens who showed strong interest and live very close to the stations were selected for data recording. Their ability to read and write as well as their willingness to volunteer as citizen scientists were considered. Citizen scientists measured the river stage above or below a fixed reference surface using graduated markers on a concrete surface or a combination of rope, stick, and measuring tape, depending on the convenience of the site for measurement. They received training on data collection and recording before they started the monitoring. They also received refresher training after monitoring river stage for 6 months. The refresher training focused on data quality assessment and improvement. The citizen scientists collected the data with small monetary incentives (25 % of the salary of the MoWE observers). Considering the small amount of the monetary incentives, the citizen scientists are likely more intrinsically than extrinsically motivated, driven by their sense of belonging to the community and care for the river, though future study is needed to fully understand this.
The citizen scientists collected river stage data twice per day at 6:00 AM and 6:00 PM. This observation interval is similar to the practice by MoWE. A hydrologist visited the citizen scientists once or twice per month to provide feedback. The hydrologist also measured the river stages at irregular time intervals to provide reference data to evaluate the quality of the CS data. We also measured velocity and river stage during the low, medium, and high flow conditions of the river over three years (2020 to 2022). Velocity measurements during high flow were conducted in collaboration with hydrological technicians of MoWE. The paired measurements of river stage and velocity were used to establish rating curves as described in the methods section.
Fig. 1 shows the gauging stations monitored by the citizen scientists. The river flow of Legedadi River and Dire reservoir release are measured at the Legedadi (LE) station. The river flow through the LE station, releases from the Legedadi reservoir, and catchment runoff are all measured at the Bole Arabsa (BAR) station. Our citizen scientists capture the river flow of Kebena River at Kebena (KE) station, which is situated at the outlet of the river. The Bulbula&Kebena station (BK) is situated just downstream of the confluence of Bulbula and Kebena Rivers. The downstream station is the Big Akaki (BA) station. Kebena and Bulbula Rivers are the major tributaries of the Big Akaki, with the former rising from the catchment's north part (the old part of Addis Ababa) and the latter draining the catchment's northeast peri-urban portion. Little Akaki River flow is measured at the Little Akaki station (LA), which drains the northwest portion of the Akaki catchment. The Little Akaki River also conveys river flow from the Gefersa reservoir to the Aba Samuel reservoir.

Quality assessment of citizen science (CS) data
CS data quality assurance is critical (Buytaert et al., 2014) as data acquired at observation sites frequently contains suspicious data. Therefore, we followed data quality assessment steps that refer to the multiple procedures and stages involved in assessing CS data quality and ensuring the accuracy and reliability of citizen science data. In this paper, data is considered suspicious when it cannot be proved that it is correctly recorded at a certain data quality assessment stage and requires subsequent quality assessment. If none of the quality assessment stages prove that the data is correct, then it will remain identified as suspicious data. The data quality assessment was performed using (i) reference data obtained from weekly river stage monitoring by a professional hydrologist, and (ii) CS data of neighboring stations to evaluate data consistency.
Taylor's two-dimensional diagram (Taylor, 2001) was used to evaluate the statistical degree of similarity between professional (reference) and CS river stage data. It provides a summary of statistical information including correlation coefficient (CC), centered root mean square error (RMSE), and normalized standard deviations (SD) of the two river stage datasets displayed on a single diagram. For the Taylor diagram, CS data from Kebena (KE) and Bulbula&Kebena (BK) stations were considered in the quality assessment due to the lack of professional (reference) data for other stations. Comparison of the professionally obtained data and citizen science data is not adequate to conclude that the entire data collected by the citizen scientists has adequate quality throughout the observation period. This raises the need for multiple data quality assessment steps.
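The three statistics summarized on a Taylor diagram can be computed directly from paired observations. The following Python sketch is illustrative only; the function name and the returned dictionary keys are assumptions, and population (divide-by-n) standard deviations are used, as in Taylor (2001).

```python
import math


def taylor_statistics(reference, observed):
    """Return CC, centered RMSE, and normalized SD for paired stage readings.

    reference: professional (reference) river stage values.
    observed: citizen science river stage values at the same times.
    """
    n = len(reference)
    if n != len(observed) or n < 2:
        raise ValueError("need two equally long series with at least 2 values")
    mean_ref = sum(reference) / n
    mean_obs = sum(observed) / n
    # Anomalies: the means are removed, so a constant bias does not affect
    # the centered RMSE or the correlation.
    dev_ref = [r - mean_ref for r in reference]
    dev_obs = [o - mean_obs for o in observed]
    sd_ref = math.sqrt(sum(d * d for d in dev_ref) / n)
    sd_obs = math.sqrt(sum(d * d for d in dev_obs) / n)
    if sd_ref == 0.0 or sd_obs == 0.0:
        raise ValueError("series must not be constant")
    cc = sum(a * b for a, b in zip(dev_ref, dev_obs)) / (n * sd_ref * sd_obs)
    crmse = math.sqrt(sum((b - a) ** 2 for a, b in zip(dev_ref, dev_obs)) / n)
    return {"cc": cc, "crmse": crmse, "sd_ratio": sd_obs / sd_ref}
```

These quantities satisfy the relation crmse² = sd_ref² + sd_obs² - 2·sd_ref·sd_obs·cc, which is what allows all three to be shown as a single point on the diagram.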
Visual evaluation using boxplots and scatterplots of the stage data differences between morning and afternoon records was also used to evaluate any unexpected jump or drop in the stage data of each of the six stations (Moog et al., 1999). The data were normalized by the seasonal mean in the wet and dry seasons of 2020-2022 so that data of similar magnitude could be compared. The box plot was used to detect outliers, in this study referred to as suspicious data that deviated from the majority, while scatter plots were applied to identify inconsistencies or unexpected patterns in data across nearby stations. The number of suspicious data identified by the box plot is expected to be larger than those identified by the scatter plot. This is because the box plot is purely a statistical test while the scatter plot seeks some physical explanation for data inconsistency. The scatter plot can show the relationship between step changes in river stage at pairs of neighboring stations. If the magnitudes differ significantly, then the data are flagged as suspicious and will be checked by subsequent data quality assessment steps. Furthermore, rainfall records were employed as one of the detection mechanisms to confirm whether suspicious data obtained by boxplots and scatter plots were caused by observation errors or climatic conditions. These suspicious data were detected by visual inspection of the joint plots of the time series of estimated areal rainfall and streamflow data for each sub-catchment (Fowler et al., 2018). Moreover, suspicious data were identified by visual comparison of the characteristics (pattern, rising limb, falling limb, and peaks) of the hydrographs of the morning and evening stage data of each station.
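The box-plot screening can be sketched in Python as follows. This is a hypothetical illustration rather than the procedure used in the study: the function names are invented, the normalization divides the evening-minus-morning difference by the mean of all readings in the season (one possible reading of the text), and the quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, which may differ slightly from the plotting software used by the authors.

```python
from statistics import mean, quantiles


def normalized_stage_differences(morning, evening):
    """Evening-minus-morning stage differences divided by the seasonal mean.

    morning, evening: paired stage readings for one station and one season.
    """
    seasonal_mean = mean(list(morning) + list(evening))
    return [(e - m) / seasonal_mean for m, e in zip(morning, evening)]


def flag_boxplot_outliers(values, whisker=1.5):
    """Return indices of values outside the box-plot whiskers.

    A value is flagged when it lies more than `whisker` times the
    interquartile range below the first or above the third quartile.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - whisker * iqr
    upper = q3 + whisker * iqr
    return [i for i, v in enumerate(values) if v < lower or v > upper]
```

Flagged indices are only candidates for suspicious data; in the workflow described above they are passed on to the scatter-plot, rainfall, and hydrograph checks before any record is rejected.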

Developing stage-discharge rating curves
The rating curves were established using paired measurements of stage-discharge following a standard procedure (Domeneghetti et al., 2012). Flow velocity was measured at each gauging site by dividing the wetted cross-section into multiple segments. The width of these segments was determined by the size of the cross-section. The discharge of each segment was calculated using the area-velocity technique, and the total discharge was estimated as the sum of the discharges of the segments. The recorded streamflow was classified into low, medium, and high flows based on the climate seasons. Low flow measurements refer to data collected from October to May (dry season); medium flows were recorded from June to the second week of July and from middle-second week of September to the first week of October (during and immediately after the main rainy season); and high flows refer to the data recorded from middle-second week of July to mid-September (main rainy season). Table 1 shows the number of paired stage-discharge data per flow regime (low, medium, and high flow) at each of the gauging sites. The widely applied rating curve equation (i.e., power function) was selected in this study (Haile et al., 2023; Westerberg et al., 2011). The equation reads as follows:
Q = a (H - H_o)^b
where Q is discharge; H is river stage; H_o is the gauge reading corresponding to zero discharge; and a and b are rating curve parameters. The parameters were determined by minimizing the Root Mean Square Error (RMSE). A Simple Linear Regression Method (SLRM) was applied to estimate the parameters of the rating curve equation. The performance of the established rating curves was evaluated with the coefficient of determination and RMSE (Othman et al., 2019). The rating curves were used to generate time series river flow data at each of the gauging sites.
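A minimal Python sketch of the discharge calculation and rating curve fit is given below. It assumes the common power-law form Q = a (H - H_o)^b with a known H_o, and fits a and b by ordinary least squares on log-transformed values, which is one way to read the simple linear regression step; the function names are illustrative and how the authors combined regression with RMSE minimization is not specified in the text.

```python
import math


def velocity_area_discharge(widths, depths, velocities):
    """Total discharge as the sum of segment width x depth x mean velocity."""
    return sum(w * d * v for w, d, v in zip(widths, depths, velocities))


def fit_rating_curve(stages, discharges, h0):
    """Fit Q = a * (H - h0) ** b by linear regression in log space.

    Taking logarithms gives ln Q = ln a + b * ln(H - h0), a straight line.
    h0 (stage at zero discharge) is treated as known here.
    """
    xs = [math.log(h - h0) for h in stages]
    ys = [math.log(q) for q in discharges]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def rating_discharge(stage, a, b, h0):
    """Convert a stage reading to discharge with the fitted curve."""
    return a * (stage - h0) ** b


def rmse(observed, predicted):
    """Root mean square error between gauged and rating-curve discharge."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
```

If H_o is not known from the gauge survey, one option is to repeat the fit for several candidate H_o values and keep the one with the lowest RMSE in discharge space.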
At all stations except Legedadi (95 %), more than 98 % of the CS river stage data were converted to river flow without extrapolating the developed rating curves.

Baseflow separation
We used an online baseflow separation method (WHAT: Web Based Hydrograph Analysis Tool) to separate the streamflow hydrograph into two major components, which are baseflow and direct runoff (Lim et al., 2005). The tool has been used in different catchments (Bastola et al., 2018; Ferede et al., 2020) because it provides an automated hydrograph analysis and delivers results in less than a minute. After the hydrograph separation, the monthly baseflow index (BFI) was estimated according to the definition of the Institute of Hydrology (1980), in which the total monthly baseflow was divided by the total monthly river flow. Finally, for selected historical and CS observation periods, the monthly runoff coefficient (RC) was calculated as the ratio of monthly runoff depth to monthly rainfall depth (Liu et al., 2020).
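The two monthly indices follow directly from their definitions. The Python sketch below is illustrative: the function names are assumptions, daily mean flows are assumed to be in m³/s and the catchment area in km², and the baseflow series is taken as given (for example, exported from the separation tool) rather than computed here.

```python
SECONDS_PER_DAY = 86400


def baseflow_index(daily_baseflow, daily_streamflow):
    """Monthly BFI: total baseflow divided by total river flow for the month."""
    total_flow = sum(daily_streamflow)
    if total_flow <= 0.0:
        raise ValueError("total streamflow must be positive")
    return sum(daily_baseflow) / total_flow


def runoff_depth_mm(daily_flow_m3s, area_km2):
    """Convert daily mean flows (m3/s) to a runoff depth (mm) over the area."""
    volume_m3 = sum(q * SECONDS_PER_DAY for q in daily_flow_m3s)
    area_m2 = area_km2 * 1.0e6
    return volume_m3 / area_m2 * 1000.0  # metres to millimetres


def runoff_coefficient(daily_flow_m3s, area_km2, rainfall_mm):
    """Monthly RC: runoff depth divided by areal rainfall depth."""
    if rainfall_mm <= 0.0:
        raise ValueError("rainfall depth must be positive")
    return runoff_depth_mm(daily_flow_m3s, area_km2) / rainfall_mm
```

For example, a constant flow of 10 m³/s over 30 days from a 1000 km² catchment corresponds to a runoff depth of 25.92 mm, so 100 mm of rainfall in that month gives an RC of about 0.26.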

Accuracy of citizen science data using statistical measures
Fig. 3 shows the results of the CS river stage data quality assessments conducted with Taylor diagrams at the KE (a) and BK (b) stations. It summarizes the statistical similarity between professional (reference) and CS river stage data with correlation coefficient (CC), centered root mean square error (RMSE), and normalized standard deviations (SD).
The correlation (r) between stage data collected by citizen scientists and professionals (reference data) was 0.95 (BK station) and 0.99 (KE station). This indicates excellent agreement between CS and reference data. Since correlation-based measures indicate linear agreement between the two datasets, RMSE was also estimated as a measure of random error. The centered RMSE value was found to be less than 10 mm when evaluated against the reference data (Fig. 3). The RMSE of the CS data was 2 mm at KE and 0.08 mm at BK stations. These RMSE values are within the probable measurement error of common staff gauges used for measuring river stages.
At both stations, the difference in SD values between the CS and reference data was very small (around 0.02), indicating that the CS data have similar variability to the reference data. The CS data captured the reference river stage data well. Hence, the CS stage measurements at KE and at the confluence of Bulbula and Kebena rivers can be considered accurate at least in terms of correlation, RMSE, and SD. However, performing additional data quality assessments is essential for further checking the validity of CS data for all sites.

Data quality assessment using graphical techniques
Assessment of runoff data quality is not trivial due to the difficulty of distinguishing between erroneous measurements and genuine extreme measurements. In this study, we used multiple data quality assessment approaches to detect erroneous measurements. Fig. 4 shows the box plot of the difference between morning (6:00 AM) and afternoon (6:00 PM) stage measurements by citizen scientists over the observation period. It shows outliers that indicate the presence of suspicious data at all stations during the wet and dry seasons of 2021.
The larger the magnitude of the stage difference (outliers), the more suspicious the data. However, these outliers are not necessarily incorrect measurements since there could be high runoff due to an exceptionally intense rainfall event. In both seasons of 2021, BAR has the largest magnitude of suspicious data. There was some disturbance of river morphology upstream of BAR station due to construction of a bridge, which might have affected the data collection process at BAR. However, further systematic evaluation is needed to be conclusive. In the wet season of 2021, the values are also large at LA and LE stations. At most stations, the suspicious data occurred just before the main rainy season (May) and after the end (October) of the rainy season. Similarly, a boxplot analysis was performed for the citizen science data collected in 2020 and 2022. The results reveal that few suspicious data values appeared in 2020 and 2022. Scatter plots of the stage difference between the morning and afternoon records of pairs of stations are shown in Fig. 5 for the 2021 data. The most suspicious data are observed at BAR since the stage difference at this station is significantly larger than that of the other stations for a relatively large number of observation days. The LE and KE stations have moderately suspicious data. These suspicious data occurred mostly before and after the main rainy season of 2021. Similarly, the scatter plots show that more suspicious data were observed at the BAR station in both 2020 and 2022.
At several stations, the box plot detected a larger number of suspicious data than the scatter plot (Table 2). This is expected as the box plot is purely a statistical test, whereas the scatter plot seeks a physical explanation for data inconsistency. Similarly, in the wet seasons, the number of common suspicious data detected by both plots is higher in 2020 than in 2021 and 2022 at several stations (Table 2). At this stage, the suspicious data identified by both plots are considered suspicious data for the whole (2020-2022) observation period. However, at most stations, the boxplots overestimated the suspicious data whereas the scatter plots underestimated it. Therefore, additional data quality assessment is needed to confirm whether the suspicious data observed in these plots consistently occurred at the nearest stations. It is also possible that some runoff records are associated with short-duration peaks that can be captured by the citizen scientists at only a few stations.

Comparison of hydrographs for data quality assessment
In both the wet and dry seasons, comparisons of morning and evening hydrographs revealed that an increase in the stage in the morning is mostly associated with a decrease in the stage in the afternoon and vice versa at all stations. In Akaki catchment, most rainfall events occur in the late afternoon or overnight. The base flow is supported by groundwater, with a declining rate in the dry season and an increasing rate for days or weeks subsequent to rainfall events. In contrast, the Akaki River is significantly impacted by anthropogenic activity throughout the daytime in the dry season, including water abstraction for irrigation, building, and livestock consumption, which may contribute to reduced base flow during daytime. At most stations, more than half of the stage data identified as suspicious by the previously presented data quality assessments were found to be correctly recorded data. In the dry season, the observed suspicious data at LE and BK stations are reduced to one and three in 2020, respectively, after inspecting the hydrographs. Similarly, in the wet seasons of 2021 and 2022, the largest numbers of suspicious data, at BA, KE, and BAR, decreased.
Further data quality assessments were performed to confirm whether the suspicious data observed in these plots are due to observation errors or climatic conditions. Suspicious data were identified using time series comparisons of areal rainfall and river flow data at each site (Fig. 6). The rainfall hyetograph and river flow hydrograph revealed that most increments in the stations' runoff time series occurred in the presence of rainfall events. Except for two minor suspicious hydrographs that appeared at BK station on 05 to 11 April and 19 and 20 May 2021, no major runoff events occurred without corresponding rainfall events.

Rainfall-runoff relationship in Akaki sub-catchments (Bulbula and Kebena)
Fig. 7(a) shows the monthly rainfall distributions at Intoto station in the Kebena (KE) catchment for the 2020-2022 periods. In 2020, rainfall started in March and continued until September. The rainfall began to intensify in April, and peaked in July and August. Significant rainfall was recorded in September, followed by insignificant rainfall in the subsequent months. In 2021, it began in February but declined in March. For the remaining months, the rainfall of 2021 followed a similar pattern as that of 2020 but with relatively small magnitude. In 2022, the rainfall onset was in June, with the July and August rainfall amounts higher than in the same months of 2021. In 2020 and 2021, nearly equal peak rainfall amounts occurred in July and August, but in 2022 the peak rainfall occurred in July. Fig. 7(b) shows the monthly rainfall distributions at Sendafa station in the Bulbula (BUL) catchment for the 2020-2022 periods. In 2020, rainfall began in March and continued with intensified amount until September, peaking in July and August. September had significant rainfall, followed by a remarkable amount in October and November. In 2021, rainfall began in April, peaked in July and rapidly declined until October. In contrast to the other years, rainfall in 2021 had lower and larger amounts during the rainy and dry seasons respectively as compared to 2022. In 2022, the rain started in June with large amounts, peaking in July. After the peak, it gradually decreased until September and was followed by negligible rainfall in subsequent months.

Table 2
Summary of suspicious data identified using boxplots and scatter plots of the stage difference between morning and afternoon records in the dry (a) and wet (b) seasons of 2020, 2021, and 2022. Note that in box plots, suspicious data refer to CS stage data identified as outliers, i.e., values that fall outside the whiskers (1.5 × interquartile range), and in a scatter plot, suspicious data refer to any CS stage data that showed unusual patterns or deviated significantly from expected relationships among stations.

The monthly hydrographs of the KE catchment for the 2020-2022 observation period are displayed in Fig. 7(c). In 2020, the river flow was significantly high in July, August and September, with the peak flow in August. This agrees with the observed rainfall. In 2021, the peak flow was observed in September, a lag of one month relative to the timing of the rainfall peak. In contrast, in 2022, the river flow peaked in July and dropped abruptly afterwards. The slight difference between monthly rainfall and river flow patterns suggests that other rainfall characteristics also affect runoff generation. Overall, the hydrograph patterns of the three years differed significantly, with higher river flow in 2020 than in the other two years.
Fig. 7(d) shows the monthly hydrographs of the BUL catchment as observed for the three years (2020-2022). There are some noticeable differences between the hydrographs of the three years. The river flow of 2021 was higher than in the other years in April and May. This can be explained by the observed rainfall in these months combined with the wetting of the catchment by the extreme rainfall in 2020. The river flow was relatively higher in 2020, except from April to July. The presence of the two water supply reservoirs may have affected the intra-annual distribution of river flow in the BUL catchment. Rainfall in the BUL catchment mostly shows a pattern similar to that of the Kebena catchment, but with lower magnitudes and longer durations (until November). The rainfall-runoff response in Bulbula is influenced by both rainfall and reservoir operations, whereas in Kebena it primarily depends on the monthly rainfall amounts and catchment memory, among other factors.
Fig. 8 shows the difference in cumulative river flow and rainfall for the selected historical years and the current CS observation years at Big Akaki station. The cumulative difference (CD) is obtained by subtracting the cumulative river flow from the cumulative rainfall on each day of the year.
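The CD definition can be illustrated with a minimal sketch. It assumes that rainfall and river flow are both available as daily depths in mm over the catchment, which the text does not state explicitly. The function name and the sample values are hypothetical, and treating missing days as zero is an illustrative choice rather than the procedure used in the study.

```python
import numpy as np

def cumulative_difference(rain_mm, flow_mm):
    """Cumulative rainfall minus cumulative river flow for each day of the year.

    Both inputs are daily depths in mm (an assumption), ordered from 1 January.
    """
    rain = np.asarray(rain_mm, dtype=float)
    flow = np.asarray(flow_mm, dtype=float)
    if rain.shape != flow.shape:
        raise ValueError("rainfall and flow series must have the same length")
    # nancumsum treats missing days as zero, so a gap does not turn the
    # remainder of the curve into NaN (illustrative choice).
    return np.nancumsum(rain) - np.nancumsum(flow)

# Hypothetical four-day example
cd = cumulative_difference([0.0, 10.0, 5.0, 0.0], [1.0, 2.0, 4.0, 3.0])
print(cd)  # CD for each of the four days
```

A rising CD curve indicates days on which rainfall exceeds river flow, while a flattening or falling curve indicates that river flow keeps pace with or exceeds rainfall.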
We observed some similarities in the river flow of 1998 and 2020, 1995 and 2021, and 2001 and 2022. For this reason, Fig. 8 shows joint plots for these pairs of years. The CS observations made it possible to perform the CD analysis and to compare hydrological similarities between historical and current periods. The pattern of the CD plots provided valuable insights into differences and similarities in runoff initiation and recession times, and into patterns of rainfall-runoff response over time.
From January to March, the 2020 CD has a lower magnitude than that of 1998, suggesting lower dry season river flow in the recent year. In the subsequent months, CD increased rapidly in both years at almost the same rate and magnitude. CD peaked in August in both years, followed by a slight decline. Unlike in 1998, CD continued to increase beyond the rainy season in 2020, suggesting the presence of rainfall and river flow response up to the third week of October. In general, the 1998 and 2020 rainfall-runoff relationships show similar behavior in the wet season but differ outside the rainy season.
In 1995, the CD was considerably higher than in 2021, particularly before and after the main rainy season. This suggests slightly larger river flow throughout 2021 than in 1995. The CD increased at the same rate and magnitude during the rainy seasons of both years and reached its peak in September. In 2021, CD increased rapidly at the beginning of the rain events (April) and flattened at their end (May). In contrast, CD changed gradually in 1995. In general, the rainfall-runoff relationships of 1995 and 2021 reveal a similar response during the wet season but different behavior during the dry season.
Nearly throughout the entire year, the values of CD in 2001 were significantly higher than those of 2022. The CD increased at the same rate and magnitude during the rainy seasons of both years. The CD peaked in July in 2001 and in August in 2022. The overall average CD in the historical time series (1995-2001) was higher than in the current period (2020-2022) in most months. This suggests a relatively higher runoff response in the current years than in the selected past years. For both periods, the CD peaked in August. In the past, CD increased gradually before flattening at the end of October, suggesting the existence of rainfall and river flow for an extended period outside the rainy season. Generally, the rainfall-runoff relationships of the past and present exhibit a similar response during the wet season (except for differing response magnitudes) but different behavior in the dry season.

Characteristics of the runoff coefficient
Investigation of runoff coefficients (RCs) provides crucial information about catchment response. Table 3 presents the RCs of the Big Akaki catchment for the past and present periods. We did not repeat this comparison for the other stations because historical river flow data are unavailable. In the wet season (June-September), RCs are higher for the recent period (when citizen scientists were engaged) than for the historical period. These differences are highest in the wettest years (1998 and 2020). For instance, in August 2020, the RC value increased from 0.5 (1998) to 0.9 (2022). On average, wet season RCs were higher in 2020-2022 than in 1995-2001.
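As a worked illustration of a monthly RC, the sketch below assumes the common definition of the runoff coefficient as the ratio of runoff depth to rainfall depth over the same period. The definition, the function name and all input values are assumptions made for illustration; they are not taken from Table 3.

```python
def runoff_coefficient(mean_flow_m3s, rainfall_mm, area_km2, days):
    """Monthly runoff coefficient: runoff depth divided by rainfall depth."""
    if rainfall_mm <= 0 or area_km2 <= 0:
        raise ValueError("rainfall and catchment area must be positive")
    # Flow volume over the month in cubic metres
    volume_m3 = mean_flow_m3s * 86400.0 * days
    # Depth in mm: m3 / (km2 * 1e6 m2 per km2) * 1000 mm per m
    runoff_mm = volume_m3 / (area_km2 * 1000.0)
    return runoff_mm / rainfall_mm

# Hypothetical month: 5 m3/s mean flow, 100 mm rainfall, 432 km2 catchment, 30 days
rc = runoff_coefficient(mean_flow_m3s=5.0, rainfall_mm=100.0, area_km2=432.0, days=30)
print(round(rc, 2))
```

In this hypothetical month the flow volume corresponds to 30 mm of runoff depth, so 30 % of the rainfall leaves the catchment as river flow.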
Fig. 8. The difference in the cumulative river flow and rainfall during the historical and CS periods (paired years were selected based on similarity of the river flow and rainfall data). Note that the cumulative river flow is subtracted from the cumulative rainfall.

The baseflow index (BFI) was estimated using data from the CS program for recent years and historical data from the hydrological service; the results are shown in Table 4 for the Big Akaki station. The BFI of past years ranged from 0.65 to 0.73 with an average value of 0.69, indicating that 69 % of the river flow in the Big Akaki catchment is supported by shallow subsurface flow and groundwater discharge. The fact that we ignored the return flows from various sectors might have slightly elevated the estimated shallow groundwater contribution. Similarly, the estimated BFI values for the CS observations (recent period) varied from 0.45 to 0.77, which indicates that groundwater discharge contributes about 60 % of river flow. The result highlights that the CS data allow evaluation of a potential decline in BFI over time. However, conclusions on the observed changes can only be drawn once sufficiently long time series allow more robust statistical analysis. The minimum BFI value was observed in 2020. The highest BFI values were recorded in 1998 and 2021. These findings may be related to the high rainfall events in 1998 and 2020.
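The baseflow separation method behind these BFI values is not specified here. As an illustration only, the sketch below uses a single forward pass of the Lyne-Hollick recursive digital filter, a widely used technique, with a filter parameter of 0.925; the choice of filter, the parameter value and the sample flows are all assumptions and may differ from the procedure used in this study.

```python
import numpy as np

def lyne_hollick_baseflow(flow, alpha=0.925):
    """Baseflow from one forward pass of the Lyne-Hollick digital filter."""
    q = np.asarray(flow, dtype=float)
    quick = np.zeros_like(q)  # quickflow starts at zero (assumed initial state)
    for t in range(1, len(q)):
        f = alpha * quick[t - 1] + 0.5 * (1.0 + alpha) * (q[t] - q[t - 1])
        # Quickflow can be neither negative nor larger than the total flow
        quick[t] = min(max(f, 0.0), q[t])
    return q - quick

def baseflow_index(flow, alpha=0.925):
    """BFI: total baseflow volume divided by total flow volume."""
    q = np.asarray(flow, dtype=float)
    total = q.sum()
    if total <= 0:
        raise ValueError("total flow must be positive")
    return lyne_hollick_baseflow(q, alpha).sum() / total

# Hypothetical daily flows (m3/s) containing one runoff event
daily_flow = [1.0, 1.0, 5.0, 3.0, 2.0, 1.5, 1.2, 1.0, 1.0, 1.0]
bfi_event = baseflow_index(daily_flow)
bfi_steady = baseflow_index([2.0] * 10)
print(round(bfi_event, 2), bfi_steady)
```

A perfectly steady flow yields a BFI of 1, while runoff events lower the index. Multi-pass variants of the filter are common in practice and generally give lower BFI values than a single pass.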

Discussion
The value of CS data is immense for advancing hydrology research. Data quality concerns can be addressed through appropriate CS program design, rigorous supervision, and multiple steps of quality testing. However, adequate data quality testing to build trust in the reliability of CS river stage data, and utilization of the data to evaluate rainfall-runoff response in medium and urbanizing catchments, are rare. This study contributed to filling this gap by implementing multiple data quality assessment steps on river stage data collected by citizen scientists and demonstrating their value for evaluating the spatial-temporal variation of rainfall-runoff response in the urbanized Akaki catchment.
Formal river monitoring networks are insufficiently dense to provide data for understanding spatial and temporal variation of rainfall-runoff relationships, particularly in small and heterogeneous (urban-rural) catchments. Even in developed countries, formal monitoring networks do not have adequate density to provide data for understanding rainfall-runoff relationships in urbanized catchments, which are highly heterogeneous. Urban catchments require rainfall and river flow data of high temporal and spatial resolution for analysis of the spatiotemporal variability of rainfall-runoff relationships. Implementing a citizen science program at multiple sites enabled the stepwise quality testing of river stage data and the evaluation of the rainfall-runoff relationships over several sub-catchments of Akaki. It eliminated the need to rely only on formal secondary data for quality testing of citizen science data and for water research investigations, particularly in data-scarce catchments.
The results of this study demonstrated the benefits of the CS program for reconnecting residents with the urbanizing catchment. The connections were established through three successive training events from 2020 to 2022. The first year's training created awareness of the CS program and instructed citizen scientists on how to record river stage data. In the second year, the quality of the collected data was co-evaluated by researchers and citizen scientists. Finally, the data were co-interpreted with the citizen scientists, enabling them to better understand how the catchment behaves and how urbanization affects rainfall-runoff relationships. Our citizen scientists were involved in collecting and interpreting data, which can enhance their connection to their own water resources and, in turn, help to accelerate achievement of the Sustainable Development Goals (SDGs). Similar to this finding, Shulla et al. (2020) suggested that participation of citizen scientists and acknowledgement of their role in hydrological research are essential to attaining the SDGs.
Our findings revealed that many suspicious data were initially identified through quick visual inspection (boxplots) and that their number was significantly reduced through scatter plots. This suggests that suspicious data can be detected simply and quickly with graphical inspections, which helps to inform citizen scientists so that they can easily correct and enhance data quality. This is similar to the findings of Shinbrot et al. (2020) and Tedla et al. (2022) for CS rainfall data evaluation. Further validation through hydrograph comparison and against professional data revealed that the citizen scientists provided data of adequate quality. Similar to our finding, Walker et al. (2016) and Tedla et al. (2022) indicated citizen scientists' capability to provide high-quality rainfall data, obtaining good agreement with respect to statistical criteria (CC, SD and RMSE). In general, the multiple tests of data quality in this study indicated that most of the citizen scientists provided accurate river stage measurements. Therefore, the CS data can serve as an important input to hydrological investigations. These include studying rainfall-runoff responses and validating rainfall-runoff models for evaluating the availability of water resources.
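Two of the quality checks discussed above can be sketched in a few lines. The first function flags values outside the boxplot whiskers (1.5 × interquartile range) in the morning-afternoon stage differences; the second computes the statistics summarized in a Taylor diagram (CC, SD and centered RMSE) against reference data. Function names and sample values are hypothetical, and the percentile convention is NumPy's default, which may differ from the plotting software used in the study.

```python
import numpy as np

def iqr_outlier_mask(values, k=1.5):
    """True where a value lies outside the boxplot whiskers (k x IQR rule)."""
    x = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)

def taylor_statistics(cs, ref):
    """Correlation, standard deviations and centered RMSE of CS vs reference."""
    cs = np.asarray(cs, dtype=float)
    ref = np.asarray(ref, dtype=float)
    cc = np.corrcoef(cs, ref)[0, 1]
    # Centered RMSE removes the mean of each series before differencing
    crmse = np.sqrt(np.mean(((cs - cs.mean()) - (ref - ref.mean())) ** 2))
    return cc, cs.std(), ref.std(), crmse

# Hypothetical morning-minus-afternoon stage differences (m)
stage_diff = [0.01, 0.02, 0.00, -0.01, 0.02, 0.01, 0.50]
suspicious = iqr_outlier_mask(stage_diff)

# Hypothetical CS stages with a constant 0.05 m offset from the reference
reference = np.array([1.0, 1.2, 1.5, 1.1])
cc, sd_cs, sd_ref, crmse = taylor_statistics(reference + 0.05, reference)
print(suspicious, cc, crmse)
```

Because the centered RMSE ignores a constant offset, a gauge that reads consistently high or low can still score well on a Taylor diagram, so a separate bias check remains useful.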
Our findings showed that using only one type of data quality test suggested that most of the stations had a considerable amount of suspicious data. However, progressive application of multiple tests revealed that the data were of good quality. Tedla et al. (2022) demonstrated the same need for multiple quality assurance steps for rainfall data collected by citizen scientists. In this study, we noticed relationships between the data quality and the period in which training was offered to the citizen scientists. Similar to these findings, Rutten et al. (2017) and Walker et al. (2021) indicated that incentives such as training and feedback are needed to enhance the quality of CS data. Our findings suggest that CS can provide high-quality river stage data to supplement formal river stage data.
Our study revealed the potential of citizen science for collecting data over relatively long periods at multiple sites in the dynamic Akaki catchment. The rapidly urbanizing Akaki catchment experiences frequent LULC changes, which can make the rainfall-runoff relationship unstable over long periods. In the study area, the LULC changes rapidly due to the ongoing development in Addis Ababa and the surrounding towns. As a result, we selected a short time window of three years. The periods 1995-2001 and 2020-2022 were selected because the rainfall of these two periods has some similarity. Another limitation of this study is that our citizen science program provided only three years of data, although this is still longer than the data provided by many other citizen science programs. A review by Njue et al. (2019) revealed that approximately 70 % of the citizen science programs for water level and precipitation were located in North America and Europe, with a temporal coverage ranging from 3 weeks to 1 year. Similarly, Assumpção et al. (2018) reported that nearly 70 % of the citizen science research was conducted in developed countries, indicating limitations in the spatial and temporal coverage of citizen science studies. Likewise, the few studies in Ethiopia that applied citizen science in hydrology covered an observation period of around one year (Tedla et al., 2022; Walker et al., 2016).
Our findings revealed spatiotemporal rainfall-runoff variations across the Akaki catchment. The two small neighboring sub-catchments (BUL and KE) showed varied rainfall-runoff responses in terms of peak river flow and rising and recession limbs. The peak flow aligned with the highest rainfall amounts and distributions in the KE catchment over the three observation years. However, the timing of peak river flow varied across the years. In contrast, the river flow of the BUL catchment was influenced not only by rainfall distribution but also by reservoir storage and releases. Peak flows of the two rivers do not necessarily occur in the same months, which has implications for flood occurrence in the downstream parts of the catchment.
The runoff coefficient for recent years varied from 0.1 to 0.5 for the Big Akaki catchment, which is 29.3 % urbanized (Negash et al., 2023). Seidl (2020) indicated a runoff coefficient of 0.86 for a completely urbanized area and 0.43 for a fully vegetated zone. Similarly, Wang et al. (2019) revealed that Beijing's urban areas increased from 8.73 % in 1980 to 22.22 % in 2015, and that the value of the runoff coefficient doubled during the 1980-2010 period. For the Akaki catchment, we found an increased runoff coefficient over two decades, but the rate of increase is not as large as that reported for Beijing.
Over the past two decades, the monthly base flow index (BFI) decreased on average from 0.70 to 0.60 in the Big Akaki catchment, in parallel with urban expansion. This could be related to LULC change (expansion of built-up areas) that reduced infiltration and recharge. High groundwater abstraction in the study area from 1998 to 2022 may also be connected to the drop in BFI (Ayenew et al., 2008; Birhanu et al., 2018; Muleta and Abate, 2021), since groundwater has been constantly extracted from the study region to fill huge water demand gaps. Groundwater currently provides 63 % of Addis Ababa's water (Muleta and Abate, 2021). Further reduction of the baseflow would have been reported if we had considered the increase in return wastewater disposal to rivers (e.g. from wastewater treatment plants). Bhaskar et al. (2016) and Liu et al. (2013) indicated that baseflow in urbanized watersheds increased as a result of anthropogenic activities that cause wastewater to be discharged directly into streams. We are aware that the CS observations should extend over a long period to allow hydrological change detection with robust statistical techniques. However, our analysis shows the potential of CS programs to provide data that add value to hydrological change detection and similarity analysis.
The results of our study revealed that using multiple data quality assessment steps can increase confidence in CS data and the ability to evaluate spatio-temporal variation of rainfall-runoff responses at sub-catchment level. However, our study was based on daily CS observations; hence, there are limitations in capturing sub-daily rainfall-runoff variations of urbanizing catchments. In addition, monthly analyses of the runoff coefficient can hide short-term temporal dynamics. Future research should consider semi-daily or hourly data for exploring event-based rainfall-runoff responses in diverse catchments as well as for validating rainfall-runoff models using CS data.

Fig. 1. Location of Akaki catchment and distribution of hydro-meteorological stations in the catchment. Note that the streamflow gauges are monitored by citizen scientists and most of the sub-catchments are named after the gauging stations at their outlet.

Fig. 3. Taylor diagrams showing the performance of CS river stage data against reference river stage data. Note that plots (a) and (b) refer to Kebena (KE) and Bulbula&Kebena (BK) stations respectively. The purple and red dots refer to CS data collected at Kebena (KE) and Bulbula&Kebena (BK) stations, respectively. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 5. Scatter plots of the stage differences between morning and afternoon records after normalization by seasonal averages for 2021. Note: hollow circles indicate data of suspicious quality.

Fig. 6. Comparisons between areal rainfall and river flow data collected by citizen scientists at BK station.

Fig. 7. Monthly rainfall and river flow over KE catchment (a, c) and BUL catchment (b, d) for the observation periods (2020-2022).Note that plots (a) and (c) refer to monthly rainfall at Intoto station and monthly river flow over KE catchment respectively.Plots (b) and (d) refer to monthly rainfall at Sendafa station and monthly river flow over BUL catchment respectively.

Table 1
Numbers of stage-river flow data collected from each gauging site during the study period (2020-2022).
the factors that contributed to improving data quality over time need to be investigated in more detail in the future.

Table 4
Akaki base flow index for the CS observation and historical periods.