Assessment of GPM Satellite Precipitation Performance after Bias Correction, for Hydrological Modeling in a Semi-Arid Watershed (High Atlas Mountain, Morocco)

Owing to its high spatiotemporal variability, rainfall is one of the most difficult variables to monitor accurately in semi-arid mountainous environments. Moreover, because rain gauges are unevenly distributed, precipitation data are not always available or fully reliable at sub-daily time steps. Earth observation precipitation estimates could therefore offer an alternative for overcoming this restriction. The current study presents a framework for both the hydro-statistical evaluation and the bias correction of the Global Precipitation Measurement (GPM) Integrated Multi-SatellitE Retrievals (IMERG) version 06 Early (IMERG-E), Late (IMERG-L), and Final (IMERG-F) products at a sub-daily time step, using the Taferiat rain gauge station as a benchmark from 1 September 2014 to 31 August 2018. Statistical analysis was performed to examine each precipitation product's performance. The results showed that all post-real-time and real-time IMERG products had a high level of detection accuracy. The IMERG-L results were statistically similar to the gauge data, followed by IMERG-F and IMERG-E. The Cumulative Distribution Function (CDF) was employed to adjust the precipitation values of the three IMERG products in order to reduce their bias. The three products were then integrated into the "HEC-HMS" hydrological model to assess their reliability in flow modeling. Six flood events were calibrated and validated for each product at 30-minute time steps. With a mean Nash-Sutcliffe coefficient of NSE = 0.82, the calibration findings demonstrate that IMERG-F provides satisfactory hydrological performance. With NSE = 0.80, IMERG-L displayed good hydrological utility, slightly better than IMERG-E with NSE = 0.77. However, when the flood events were validated using the initial soil conditions, IMERG-F and IMERG-E overestimated the discharge by 13% and 10%, respectively, whereas IMERG-L passed the validation phase with an average NSE of 0.69.


Introduction
The global hydrological cycle and atmospheric energy exchange are highly dependent on precipitation [1,2]. Understanding the spatiotemporal variability and accuracy of rainfall information is critical for numerous applications, such as hydrological modeling, climate forecasting, and water management [3][4][5].
the IMERG V06 products with gauge precipitation. A further aim is to validate the use of these products in the hydrological modeling of extremes across complex topography, and to identify the products that best reproduce precipitation and runoff in semi-arid mountainous regions.
The remainder of this paper is structured as follows. Section 2 describes the study area and presents the precipitation data sets used. Section 3 assesses the spatial distribution of rainfall and its impact on surface water at the outlet, and then describes the data processing and the bias correction of the satellite precipitation products (SPPs) using the quantile mapping (QM) method and the CDF. Section 4 covers the calibration, in which the HEC-HMS model is forced with IMERG precipitation data and compared with the rain gauge data in order to determine its capacity to detect different precipitation events and to reproduce extreme flows. Finally, Section 5 contains the findings and suggestions for future work. This study provides a first comprehensive analysis of the viability of SPPs in the modern era. The results should provide researchers with useful information on the performance of the various runs, in order to validate the suitability of the newest IMERG V06 version as a source of rainfall data for forecasting and for providing early warning of potential hazards, such as extreme precipitation events, in Mediterranean mountainous zones.

Study Area
The study area is the Zat catchment, a tributary of the Tensift river basin. It is located southeast of the city of Marrakech, between latitudes 31°30′ and 31°45′ N and longitudes 7°30′ and 7°45′ W, as shown in Figure 1. Elevations range from 756 to 3777 m and the catchment covers 521 km² [7]. The mean annual precipitation is 395 mm, of which 30% falls as snow at altitudes above 2000 m [48,49]. The climate is intermediate between semi-arid and humid upstream and semi-arid downstream, and it is strongly influenced by altitude [50]. The Zat catchment is characterized by a dense and complex hydrographic network. The Zat River drains the basin and joins the Ourika River; together they are the two main tributaries of the Hadjar River. The upstream geology is defined by outcrops of terminal Precambrian igneous and metamorphic rocks and is considered impermeable, whereas the downstream part lies in the northern sub-Atlantic zone, where the principal outcrops belong to the mostly permeable post-Hercynian cover [51]. At the highest elevations, the average slope is 19% [52]. These environmental conditions favor the generation of runoff. The vegetation is divided into two zones: forests in the upstream mountain area and an agricultural zone downstream, which includes cultivated land along the river channel.

Gauge Precipitation Data
Precipitation varies considerably in space and time, especially in mountain basins. Ground observations are used to analyze precipitation and runoff and serve as a reference for rainfall data, since they provide a continuous record at a specific location. In this study, we used the 10-min precipitation from the Taferiat station, upscaled to 30 min to match the satellite precipitation (Figure 1). The Tensift Hydrological Agency (ABHT) supplied these observations. Rainfall in the Zat basin is very low, and there is only one station, located downstream. The study period was chosen according to the availability of rainfall records, from 1 September 2014 to 31 August 2018. All recorded data were subjected to rigorous quality control, which included checking for outliers and null values with a variety of approaches, such as the "letter-value" approach of the Python package "seaborn" and hoarding curves. This ensures that the ground-based rain gauge data are of excellent quality. This in situ data set was used as a baseline to validate the IMERG products.
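The upscaling of the gauge record from 10 min to 30 min can be sketched as follows. This is a minimal illustration and not the authors' processing chain: the function name `upscale_to_30min`, the sample values, and the choice to leave a 30-min total missing when any of its three 10-min readings is missing are all assumptions, as is the alignment of the 30-min windows with the IMERG time stamps.

```python
import numpy as np
import pandas as pd


def upscale_to_30min(rain_10min: pd.Series) -> pd.Series:
    """Sum 10-min rainfall depths (mm) into 30-min totals.

    min_count=3 keeps a 30-min total as NaN unless all three 10-min
    readings are present, so gaps are not silently counted as zero rain.
    """
    return rain_10min.resample("30min").sum(min_count=3)


# Hypothetical 10-min gauge record (mm per 10 min), for illustration only.
index = pd.date_range("2014-09-01 00:00", periods=6, freq="10min")
rain_10min = pd.Series([0.0, 0.2, 0.4, 1.0, np.nan, 0.5], index=index)
rain_30min = upscale_to_30min(rain_10min)
```

In this example the first half-hour sums to 0.6 mm, while the second stays missing because one of its readings is absent.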

Earth Observation of Precipitation Data
The GPM project is a joint collaboration between NASA and JAXA. GPM provides three distinct data processing levels. IMERG includes gridded precipitation and snowfall data at a spatial resolution of 0.1° × 0.1° and a 30-min frequency [53], based on a combination of passive microwave (PMW) and infrared (IR) data from GPM and gauge analysis by the GPCC. The principal modifications in the recent IMERG release are (1) the inclusion of SAPHIR and the IMT estimates; (2) passive microwave estimates are modified at high latitudes to reduce spatial gaps; and (3) the use of GEOS FP and MERRA-2 model data for time interpolation in place of the infrared data used in IMERG V05 [54].
The IMERG system is run twice in near real time: first to generate IMERG-E about 4 h after the nominal observation time, and then to generate IMERG-L with a latency of about 12 h. After the monthly gauge analysis has been received, a last run generates IMERG-F with a latency of about 3.5 months. The near-real-time (NRT) runs use climatological gauge observations for bias adjustment, whereas the Final run uses the monthly GPCC gauge analyses [54].
In this study, temporal, statistical, and hydrological analyses were performed to determine the ability of the GPM to identify significant rainfall events that occurred between 1 September 2014 and 31 August 2018.

Discharge Data
Flow data from the Taferiat gauging station were used for the calibration and validation of the HEC-HMS model. The Tensift Hydrological Agency (ABHT) provided the flow data from 2014 to 2018. Throughout these four years, only six flash floods were triggered by heavy rains; the details are presented in Table 1. When analyzing these flood events, the following points should be highlighted: Event 1 was not calibrated or validated owing to a lack of precipitation data; Event 3 has a higher peak discharge than the other events and a multi-modal temporal distribution, whereas the other events have a uni-modal distribution; and Event 4 has a significantly lower peak discharge than the other events. It should be emphasized that precipitation has a significant impact on flood events.

HEC-HMS Software
HEC-HMS is a deterministic, semi-distributed, event-based or continuous, mathematically based model. It can be applied in a wide variety of geographical regions and climatic contexts, including arid and semi-arid mountainous climates [15]. It supports a wide range of hydrological functions, such as loss computation, runoff transformation, open-channel routing, analysis of meteorological data, rainfall-runoff simulation, and parameter estimation [55].

Processing Data
To evaluate the IMERG products, satellite and gauge precipitation data were preprocessed to ensure data consistency and accuracy. This involved several steps (Figure 2): (1) examining the continuity of the rain gauge and IMERG data; (2) removing outliers and missing data so that the two series remain matched; (3) upscaling the rain gauge data from 10 min to 30 min to match the satellite data; (4) comparing the satellite data from the GPM IMERG V06 products with the rain gauge data (using the average precipitation of the pixels that are most closely correlated with runoff); (5) computing statistical metrics and applying bias correction using the CDF; and (6) hydrological modeling with the HEC-HMS model using observed and satellite data.

Satellite Monitoring of Precipitation Products
Satellite precipitation data from the IMERG products were downloaded as NetCDF grid files for the period from 1 September 2014 to 31 August 2018. A data visualization and extraction tool was used to extract the gridded precipitation from the NetCDF files [56]. In addition, various metrics, including the mean, the coefficient of variation, and the standard deviation, were used to assess the overall performance of the earth observation rainfall products. This provides insight into the behavior of the remotely sensed precipitation estimates selected for the research area. A method for analyzing the effect of the precipitation falling in each pixel of the research area on streamflow was also developed; it is described in detail in Section 4.1 (Rainfall Spatialization and Runoff Assessment).

Metrics Assessment
Multiple analytical methods were used to quantify the quality of the three SPPs in comparison with in situ precipitation and to investigate the adequacy of the IMERG products. Statistical metrics, including the Root Mean Square Error (RMSE), the Pearson Correlation Coefficient (CC), the Bias, the Mean Squared Error (MSE), and the Mean Absolute Error (MAE), are used to represent the agreement and the uncertainties between the IMERG products and the gauge measurements. The CC assesses the strength of the linear relationship between the IMERG data and the measurements. Its value ranges from −1 to 1, with 0 representing no correlation.
The RMSE is the square root of the MSE and measures the average magnitude of the differences between the estimated and the observed values; a lower value indicates better agreement. Positive Bias values denote an overestimate and negative values an underestimate. IMERG products are commonly considered credible if the CC is greater than 0.7 and the Bias is within ±10% [57]. The MSE is the average of the squared differences between the estimated and the observed values and is widely used as a loss function and performance metric for regression models. The MAE is defined as the average of the absolute differences between the estimated and the observed values.
The diagnostic indices are computed from the satellite estimates, the gauge observations X o, and their respective averages. Three contingency metrics were used to assess the precipitation detection capability of IMERG, namely the Probability of Detection (POD), the False Alarm Ratio (FAR), and the Critical Success Index (CSI). To differentiate precipitation days from days without precipitation, a threshold value of 0.1 mm/h was chosen.
POD measures the fraction of the precipitation events recorded by the gauge that are correctly detected by the SPPs. FAR is the fraction of the rainfall events detected by the SPPs that are not recorded by the measurement station. CSI describes the overall correspondence between the precipitation events recorded by the SPPs and those recorded by the gauge. A perfect score corresponds to POD = 1, CSI = 1, and FAR = 0. The categorical statistical measures are defined as:

$$\mathrm{POD} = \frac{H}{H + M}, \qquad \mathrm{FAR} = \frac{F}{H + F}, \qquad \mathrm{CSI} = \frac{H}{H + M + F}$$

with H the number of precipitation events recorded by the satellite and the ground station at the same time, M the number of rainfall events recorded by the rain station but not by the earth observation, and F the inverse of M (events recorded by the satellite but not by the rain station). The calculation of the CSI requires a rain/no-rain threshold. In this study, the time resolution of the rain gauge and the SPPs is 30 min, and the rainfall threshold was set at 0.1 mm/30 min.
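The metrics described in this section can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' code: the function names and sample values are hypothetical, the Bias is assumed to be a relative bias in percent (positive for overestimation, consistent with the ±10% criterion above), and a time step is assumed to be "wet" when its depth is at or above the 0.1 mm/30 min threshold.

```python
import numpy as np


def continuous_metrics(sat, gauge):
    """CC, MSE, RMSE, MAE and relative Bias (%) of satellite vs. gauge rainfall."""
    sat = np.asarray(sat, dtype=float)
    gauge = np.asarray(gauge, dtype=float)
    err = sat - gauge
    mse = np.mean(err ** 2)
    return {
        "CC": np.corrcoef(sat, gauge)[0, 1],
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
        # Assumed convention: relative bias in percent, positive = overestimate.
        "Bias": 100.0 * err.sum() / gauge.sum(),
    }


def _ratio(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den > 0 else float("nan")


def contingency_metrics(sat, gauge, threshold=0.1):
    """POD, FAR and CSI for a rain/no-rain threshold (mm per time step)."""
    sat_wet = np.asarray(sat, dtype=float) >= threshold
    gauge_wet = np.asarray(gauge, dtype=float) >= threshold
    hits = int(np.sum(sat_wet & gauge_wet))            # H
    misses = int(np.sum(~sat_wet & gauge_wet))         # M
    false_alarms = int(np.sum(sat_wet & ~gauge_wet))   # F
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, hits + false_alarms),
        "CSI": _ratio(hits, hits + misses + false_alarms),
    }


# Hypothetical 30-min depths (mm), for illustration only.
gauge = [0.0, 0.5, 1.0, 0.0, 2.0]
sat = [0.2, 0.4, 0.0, 0.0, 2.5]
cont = continuous_metrics(sat, gauge)
cat = contingency_metrics(sat, gauge)
```

With these sample values there are two hits, one miss, and one false alarm, so POD = 2/3, FAR = 1/3, and CSI = 0.5.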

Quantile Mapping Method
Using a non-parametric method such as QM, the Empirical Cumulative Distribution Function (ECDF) of the earth observation precipitation estimates can be adjusted to match the ECDF of the gauge observations by means of a transfer function for each quantile [58].
This strategy has already demonstrated promising results in removing systematic biases from climate models whose resolution is too coarse to explain the great variability of precipitation patterns. Similarly, satellite products from a given time period are modified to reproduce the statistical features of the precipitation records [59][60][61]. Wilks, 2011 [62] detailed the method for generating ECDFs, and Themeßl, 2012 [63] used it for the Quantile Mapping (QM) of precipitation from climate models.
ECDFs are calculated for all SPPs in this study. The matching process used here can be represented mathematically as follows:

$$P_{\mathrm{IMERG\text{-}Cor}} = \mathrm{ECDF}^{-1}_{\mathrm{Gauge}}\left(\mathrm{ECDF}_{\mathrm{IMERG}}\left(P_{\mathrm{IMERG}}\right)\right)$$

where P_IMERG-Cor is the corrected rainfall quantity, P_IMERG is the rainfall quantity to be corrected, ECDF⁻¹_Gauge is the inverse of the empirical CDF of the rain station data, and ECDF_IMERG is the empirical CDF of the IMERG precipitation. In this approach, each percentile of the IMERG precipitation series is replaced by the corresponding percentile of the rain gauge precipitation series. In other words, for the pixels in the Zat basin, the cumulative probability of each SPP precipitation quantity is computed using the empirical CDF generated from the SPPs. The empirical CDF from the in situ observations is then used to find the precipitation quantity that has the same probability. The sub-daily precipitation quantities for the most representative pixels of the Zat basin (P2, P5, P6, P7, P9, and P11) are adjusted through this CDF matching technique, based on the ECDFs generated for the SPPs and the ground data for the analyzed basin. This approach not only adjusts the mean and the standard deviation, but also preserves the magnitude of each quantile of the observations, including those in the upper tail of the distribution [63].
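The CDF matching step can be illustrated with a small empirical quantile mapping routine. This is a sketch under assumptions rather than the implementation used in the study: the function name `quantile_map`, the step-function definition of the ECDF and of its inverse, and the sample values are all hypothetical choices made for the example.

```python
import numpy as np


def quantile_map(sat, sat_ref, gauge_ref):
    """Empirical quantile mapping: P_cor = ECDF_gauge^-1(ECDF_sat(P)).

    sat       : satellite values to correct
    sat_ref   : satellite sample used to build ECDF_sat
    gauge_ref : gauge sample used to build ECDF_gauge
    """
    sat = np.asarray(sat, dtype=float)
    sat_sorted = np.sort(np.asarray(sat_ref, dtype=float))
    gauge_sorted = np.sort(np.asarray(gauge_ref, dtype=float))
    n_s, n_g = sat_sorted.size, gauge_sorted.size

    # ECDF_sat(P) = counts / n_s, where counts is the number of
    # reference satellite values that are <= P.
    counts = np.searchsorted(sat_sorted, sat, side="right")

    # Inverse gauge ECDF: smallest gauge value whose ECDF reaches
    # counts / n_s, i.e. index ceil(counts * n_g / n_s) - 1, computed
    # in integer arithmetic and clipped to the valid index range.
    idx = (counts * n_g + n_s - 1) // n_s - 1
    return gauge_sorted[np.clip(idx, 0, n_g - 1)]


# Hypothetical 30-min depths (mm): the satellite sample is half the gauge sample.
sat_ref = [0.0, 0.0, 1.0, 2.0, 4.0]
gauge_ref = [0.0, 0.0, 2.0, 4.0, 8.0]
corrected = quantile_map([0.0, 1.0, 2.0, 4.0, 3.0], sat_ref, gauge_ref)
```

Because the inverse ECDF is a step function, dry time steps stay dry in this example and the largest satellite value maps onto the largest gauge value. An interpolating inverse would be an equally valid design choice, but it can turn zeros into small positive amounts when the record contains many dry steps.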

Hydrological Process
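The IMERG products were hydrologically evaluated using the HEC-HMS model. For this purpose, two scenarios were carried out: the first concerns the calibration, in which the model is forced with observed and satellite rainfall data, and the second concerns the validation using the "leave-one-out" method. Table 2 lists the parameters and methodologies used in watershed modeling. The methods used are the SCS-CN (Soil Conservation Service Curve Number) method, the Clark Unit Hydrograph, and the Base Flow Recession, which are required to establish the hydrologic loss rates, the runoff transformation, and the base flow, respectively [7,65–70].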
After completion of the model calibration, the IMERG products were validated using the initial soil conditions, taken from the soil moisture (SM) of the ESA-CCI database for each flood event throughout the study period. The ESA-CCI soil moisture product, provided by the European Space Agency (ESA) (http://www.esa-soilmoisturecci.org/) (accessed on 2 February 2023), gives estimates of SM on the day(s) preceding a flood event, which are taken as the initial soil condition. This gives an idea of the soil saturation rate and therefore of the probability of flood occurrence. The product has a temporal sampling of 1 day and a spatial resolution of 0.25°. The SM from ESA-CCI was introduced into the model by applying a linear regression between the initial satellite soil moisture for each event and the curve numbers (CNs) obtained after the calibration of each event. New CN values (CNh) were derived from the linear equation relating the soil moisture data to the calibrated CNs, using a resampling approach without replacement. The model was then forced with the newly obtained CNh at the validation stage. This method has already been validated in the High Atlas watersheds.
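One possible reading of this validation step is a leave-one-out regression: for each event, the linear relation between initial soil moisture and calibrated CN is fitted on the other events and then used to predict the CN of the held-out event. The sketch below illustrates that idea only. The function name, the soil moisture values, and the CN values are hypothetical, and the exact resampling scheme used in the study may differ.

```python
import numpy as np


def loo_curve_numbers(soil_moisture, calibrated_cn):
    """Predict each event's CN from a linear SM-CN fit on the other events."""
    sm = np.asarray(soil_moisture, dtype=float)
    cn = np.asarray(calibrated_cn, dtype=float)
    predicted = np.empty_like(cn)
    for i in range(sm.size):
        keep = np.arange(sm.size) != i          # leave event i out
        slope, intercept = np.polyfit(sm[keep], cn[keep], deg=1)
        predicted[i] = slope * sm[i] + intercept
    return predicted


# Hypothetical initial soil moisture (m3/m3) and calibrated CNs for five events.
sm_events = np.array([0.10, 0.15, 0.20, 0.25, 0.30])
cn_events = 60.0 + 50.0 * sm_events   # perfectly linear, for illustration only
cn_validation = loo_curve_numbers(sm_events, cn_events)
```

With real events the relation is not perfectly linear, so the predicted CNs differ from the calibrated ones, and that difference is what the validation run tests.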
In addition, the IMERG products were incorporated into the HEC-HMS model by replacing the gauge precipitation with IMERG data. The Nash-Sutcliffe coefficient (NSE), the percent bias (PBIAS), and the RMSE were used to assess the goodness of fit between the observed and the simulated flows.
NSE is a commonly used metric for evaluating the performance of hydrologic models and is based on the comparison of simulated and observed flow values. In our case, the HEC-HMS model computes NSE, RMSE, and PBIAS directly within the platform, which led us to rely on NSE rather than on other metrics.
The performance of the HEC-HMS model is compared to the literature using the following criteria. The Nash-Sutcliffe Coefficient (NSE) ranges from −∞ to 1, with negative values indicating poor performance [68]  (NSE > 0.8). It can be used to assess the model's predictive power and is computed as follows:

$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}\left(Q_{oi} - Q_{si}\right)^{2}}{\sum_{i=1}^{n}\left(Q_{oi} - \overline{Q}_{o}\right)^{2}}$$

The PBIAS assesses the average tendency of the difference between observed and simulated flow [68]. It is formulated as follows:

$$\mathrm{PBIAS} = 100 \times \frac{\sum_{i=1}^{n}\left(Q_{oi} - Q_{si}\right)}{\sum_{i=1}^{n} Q_{oi}}$$

The RMSE quantifies the residuals between simulated and observed runoff values [68]. It is expressed mathematically as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(Q_{oi} - Q_{si}\right)^{2}}$$

where Q_oi is the observed data being evaluated, Q_si is the simulated data, Q̄_o is the mean of the observed data, and n is the total number of observations.
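For readers who want to reproduce these scores outside HEC-HMS, the three criteria can be computed as sketched below. The function name and the sample hydrographs are hypothetical. The PBIAS sign convention (observed minus simulated, so that positive values indicate underestimation) is an assumption, and the opposite convention is also in use.

```python
import numpy as np


def flow_metrics(q_obs, q_sim):
    """NSE, PBIAS (%) and RMSE between observed and simulated discharge."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_sim = np.asarray(q_sim, dtype=float)
    resid = q_obs - q_sim
    nse = 1.0 - np.sum(resid ** 2) / np.sum((q_obs - q_obs.mean()) ** 2)
    # Assumed sign convention: positive PBIAS means the model underestimates.
    pbias = 100.0 * resid.sum() / q_obs.sum()
    rmse = np.sqrt(np.mean(resid ** 2))
    return {"NSE": nse, "PBIAS": pbias, "RMSE": rmse}


# Hypothetical discharges (m3/s), for illustration only.
q_obs = [1.0, 2.0, 3.0, 4.0]
perfect = flow_metrics(q_obs, q_obs)               # simulation equals observation
mean_only = flow_metrics(q_obs, [2.5, 2.5, 2.5, 2.5])  # simulation equals the observed mean
```

A perfect simulation gives NSE = 1, whereas a simulation that only reproduces the observed mean gives NSE = 0, which is why negative values are read as poor performance.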

DEM
Terrain preprocessing begins with the preparation of a Digital Elevation Model (DEM), which was downloaded from the United States Geological Survey (USGS) as tiles with a resolution of about 30 m.
This DEM was cropped along the watershed boundaries using county polygon shapefiles downloaded from ESRI (Figure 1).

Rainfall Spatialization and Runoff Assessment
Precipitation and runoff have important spatiotemporal characteristics, especially in semi-arid mountainous areas. Therefore, an approach was developed to analyze the influence of the precipitation falling in each pixel on the streamflow at the outlet of the basin. The Zat watershed was divided into 15 pixels of 0.1° × 0.1°, which is the resolution of the satellite pixels. The grid of pixels was analyzed using two evaluation metrics, Pearson's Correlation Coefficient and the Root Mean Squared Error, to identify the pixels that have the greatest influence on watershed flow. The six most representative pixels, which cover more than 60% of the total pixel area and whose correlation coefficients are satisfactory, are P2, P5, P6, P7, P9, and P11 (Figure 3), with correlation coefficients of 0.5, 0.48, 0.73, 0.52, 0.94, and 0.67, respectively.
Subsequently, a multiple linear regression was applied to the pixels for each of the three products, which confirmed the previous results on the representative pixels. Each pixel is denoted by P_i (i = 1, ..., 15), representing the volume of precipitation falling on that pixel, as shown in Equations (13)-(15). Multi-linear regression factors are based on the relationship between multiple variables and are calculated by examining the correlation between them. The calculation of the regression factors can vary depending on the data set. In our case, the different precipitation products (Early, Late, and Final) were used to build the multi-linear regressions, which gives a different set of factors for each product:

$$Q = 1.18\,P_{2} - 3.99\,P_{5} + 6.04\,P_{6} + 3.33\,P_{7} - 3.79\,P_{9} + 0.03\,P_{11} \qquad (13)$$

$$Q = 2.9588\,P_{2} - 4.9272\,P_{5} + 4.9634\,P_{6} + 4.1293\,P_{7} - 1.3765\,P_{9} - 1.887\,P_{11} \qquad (14)$$

$$Q = 0.45622\,P_{2} - 2.832\,P_{5} + 7.4248\,P_{6} + 1.2513\,P_{7} + 0.45496\,P_{9} - 1.4794\,P_{11} \qquad (15)$$

Pixels P1, P3, P4, P8, P10, P12, P13, P14, and P15 are omitted from Equations (13)-(15) because the calculated factors multiplying these pixels are equal to zero. Their values therefore have no effect on the result of the equations.
In addition, each pixel is multiplied by a factor linked to the volume of precipitation and to its distance from the measuring station. The resulting equations can be used to forecast the average flows directly from IMERG precipitation data.
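Applying one of these regression equations amounts to a dot product between the published factors and the precipitation of the six representative pixels. The sketch below uses the factors of Equations (13)-(15) as given above. The function name `estimate_flow`, the dictionary layout, and the sample precipitation values are hypothetical, and the units follow whatever units were used to fit the regressions.

```python
import numpy as np

# Representative pixels, in the order used by Equations (13)-(15).
PIXELS = ("P2", "P5", "P6", "P7", "P9", "P11")

# Regression factors, keyed by equation number.
COEFFS = {
    13: (1.18, -3.99, 6.04, 3.33, -3.79, 0.03),
    14: (2.9588, -4.9272, 4.9634, 4.1293, -1.3765, -1.887),
    15: (0.45622, -2.832, 7.4248, 1.2513, 0.45496, -1.4794),
}


def estimate_flow(pixel_precip, equation=13):
    """Evaluate one regression equation for a dict of pixel precipitation."""
    p = np.array([pixel_precip[name] for name in PIXELS], dtype=float)
    return float(np.dot(COEFFS[equation], p))


# Hypothetical input: one unit of precipitation on every representative pixel.
unit_precip = {name: 1.0 for name in PIXELS}
q13 = estimate_flow(unit_precip, equation=13)
q14 = estimate_flow(unit_precip, equation=14)
```

With one unit of precipitation on every pixel, each result is simply the sum of the factors of the corresponding equation, which is a quick way to check that the factors were transcribed correctly.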

Performance of CDF Matching Method
Previous research, such as [58,69], shows that the CDF approach performs well. In those studies, it produced the most accurate representations and was less susceptible to the inaccurate adjustment of isolated anomalies in locations where gauges are sparse, as is the case in our situation. Although the data sets may be quite different locally, their relative frequencies may be similar, which limits the vulnerability to excessive fitting errors. The gamma distribution may not be acceptable at dry locations with many zero totals, since zero values must be removed in the adjustment. With a larger data set, this approach would almost certainly be more effective, since having only 4 years of data may limit the goodness of fit, particularly for the extremes. For example, the authors of [44] increased the quantity of data available for fitting by using daily observations and taking the adjacent grid cells into account, which tends to improve the precision of this technique.
The CDF technique, on the other hand, does not depend on the gauge analysis to provide accurate totals, but rather on the more accurate gauge frequency. The resulting representation is comparable to that obtained by linear correction models. The CDF matching approach is a simple algorithm, but it can be effective in some situations. It may be useful in locations where rain gauge networks are sparse and accurate analysis of rain gauges is more difficult, as is the case in the study watershed. Figure 4 shows a comparison of the 30-min SPPs during the whole study period using Quantile-Quantile (Q-Q) plots. The precipitation Q-Q plots demonstrate reasonable bias-corrected performance relative to observations, with data lines extremely close to the baseline for both the Late and Final runs. The Q-Q plots show that the correction generally reduces the bias of the satellite precipitation products.
The results reveal that, following bias reduction using the CDF matching method, all data products improved (Figure 4). The improvement was significant for IMERG_L and IMERG_F (Figure 4B,C), whereas the IMERG_E data showed only a slight improvement (Figure 4A). The QM results therefore indicate that the IMERG_E data may have higher uncertainty than the IMERG_L and IMERG_F data.

Statistical Indices Assessment
In order to emphasize the significance of bias correction for remotely sensed precipitation, a statistical comparison was conducted between the bias-corrected and uncorrected IMERG_E, IMERG_L, and IMERG_F products and the ground-based precipitation data at the Taferiat station for the six representative pixels in the Zat Basin. The study period was from 1 September 2014 to 31 August 2018.
The statistical assessment results at sub-daily (30 min) time steps are shown in Table 3. Before adjustment, the near-real-time products IMERG_E and IMERG_L achieved comparable Correlation Coefficient (CC) values, which were somewhat lower than those of IMERG_F and statistically insignificant. In contrast, after bias correction of all SPPs, the CC values between the ground-based precipitation observations and the SPPs improved dramatically and became statistically significant. The main reason is that, because of the SPP algorithm, short-term SPP estimates typically include significant biases, whereas aggregating SPPs from finer to coarser time steps can partially compensate for the precipitation bias at the finer time steps, resulting in higher CC values at the longer time steps. However, precipitation was significantly underestimated in IMERG_E and IMERG_L and slightly overestimated in IMERG_F before bias correction. In addition, although IMERG_L outperformed IMERG_F and IMERG_E in general, a small margin of error was noted. Table 3 reveals that IMERG_L had marginally lower RMSE values than IMERG_E and IMERG_F before correction, but the resulting RMSEs remain significant after correction. This result suggests that the IMERG SPPs contain a number of outliers. Based on Bias, IMERG_E and IMERG_L significantly overestimated precipitation before correction by 23.53% and 15.83%, respectively, while IMERG_F attained a significantly lower systematic Bias (6.87%) owing to the in situ adjustment using monthly precipitation from the Global Precipitation Climatology Centre (GPCC) gauge analysis. After correction, the Bias improved significantly.
Moreover, the Mean Square Error (MSE) and the Mean Absolute Error (MAE) range from 0.17 to 0.23 and from 0.40 to 0.60, respectively, before correction, and from 0.13 to 0.19 and from 0.04 to 0.06, respectively, after correction, which shows a clear improvement. Regarding the ability of the SPPs to detect rainfall events, the three IMERG SPPs performed similarly, with PODs and CSIs ranging from 0.18 to 0.28 and from 0.06 to 0.09, respectively, before correction, and from 0.93 to 0.99 and from 0.94 to 0.97, respectively, after correction, indicating a significant improvement. FAR values were below 0.90 before correction and improved markedly afterwards, with values extremely close to the optimum of 0. In all criteria, IMERG_L outperformed the IMERG_E real-time product and the IMERG_F post-real-time product.

Hydrological Calibration and Validation
In terms of calibration and validation, rainfall events have been chosen depending on the disposal of data records of discharge and precipitation for the most extreme events that have occurred recently. The selected rainfall events in Table 1 are used for calibration and validation. Both processes have been done manually using data from the Taferiat measuring station at 30 min time step for the period from 2014 to 2018.
The events studied in this research were classified by peak flow into three categories: heavy flash floods, with a flow of ≥50 m³/s; moderate flash floods, with a flow of ≥20 m³/s; and low flash floods, with a flow of ≥10 m³/s. Events 1 and 3, which occurred during summer and autumn, had peak flows (Q max) of 66.82 m³/s and 136.61 m³/s, respectively, ranking them as heavy flash flood events. Events 2, 5, and 6 took place between autumn and spring, the peak season for flash floods in the region, with peak flows of 34.30 m³/s, 38.86 m³/s, and 21.74 m³/s, respectively, classifying them as moderate flash floods. Event 4 was a small spring flood event that resulted from snowmelt, with a peak flow of 12.13 m³/s, and was categorized as a low flash flood event.
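The classification rule above can be written as a short sketch. The function name `classify_flash_flood` and the label for flows under 10 m³/s are hypothetical; the thresholds and peak flows are those quoted in the text.

```python
def classify_flash_flood(peak_flow_m3s):
    """Classify an event by its peak flow (m3/s) using the thresholds in the text."""
    if peak_flow_m3s >= 50:
        return "heavy"
    if peak_flow_m3s >= 20:
        return "moderate"
    if peak_flow_m3s >= 10:
        return "low"
    # Assumed label: the text defines no category below 10 m3/s
    return "below threshold"


# Peak flows (Q max) of the six events as reported in the text
peak_flows = {1: 66.82, 2: 34.30, 3: 136.61, 4: 12.13, 5: 38.86, 6: 21.74}
categories = {event: classify_flash_flood(q) for event, q in peak_flows.items()}
```

Checking the thresholds from highest to lowest keeps the three overlapping "≥" conditions mutually exclusive.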
Calibration process: A total of 24 events were simulated and calibrated starting from initial values. The calibration kept each parameter within the maximum and minimum ranges reported in the literature [7,15], and the averages of these intervals were taken as initial values in this paper. These values were then manually adjusted to obtain a good fit between the observed and estimated flows. The quality of the adjustment was judged by visual inspection of the hydrographs and by the calculated statistical values. In addition, a critical analysis of the watershed and stream characteristics was applied to ensure that the input parameter values are physically meaningful. Table 4 presents the calibration parameters of the 24 events studied, which are:
1-Curve number (CN): mainly depends on the land use land cover (LULC) maps and soil maps. It can be calculated using GIS methods. In this study, the CN was calculated as the weighted average of the curve numbers in the Zat basin, which resulted in an average value of CN = 72.
2-"Time of concentration": defined as the time required for water to flow from the farthest point in the watershed to the outlet. It was calculated using the Giandotti equation: $T_c = \frac{4\sqrt{S} + 1.5\,L}{0.8\sqrt{H_{avg} - H_{min}}}$, where Tc is the time of concentration (h), S is the basin area (km²), L is the length of the main stream (km), H avg is the average altitude (m), and H min is the minimum altitude (m).

3-"Recession constant": describes the rate of baseflow decay. The constant represents the ratio of the baseflow at the present time to the flow one day earlier and is therefore between 0 and 1. The actual value was further defined during the hydrologic modeling process by looking for the value that gives the best efficiency.
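The last two parameters can be sketched numerically. The example below assumes the standard form of the Giandotti formula (time of concentration in hours, area in km², stream length in km, altitudes in m) and an exponential recession in which baseflow is multiplied by the recession constant once per day. The function names and the input values are hypothetical and are not the Zat basin values.

```python
import math


def giandotti_tc(area_km2, length_km, h_avg_m, h_min_m):
    """Time of concentration (h) from the standard Giandotti formula."""
    relief_m = h_avg_m - h_min_m  # mean altitude above the outlet
    return (4.0 * math.sqrt(area_km2) + 1.5 * length_km) / (0.8 * math.sqrt(relief_m))


def recession_baseflow(q0, k, days):
    """Baseflow after `days`, assuming it decays by the ratio k each day (0 < k < 1)."""
    return q0 * k ** days


# Hypothetical basin: 100 km2, 20 km main stream, 900 m of relief
tc_hours = giandotti_tc(area_km2=100.0, length_km=20.0, h_avg_m=1100.0, h_min_m=200.0)

# Hypothetical recession: 10 m3/s initial baseflow, k = 0.5, after 2 days
q_after_two_days = recession_baseflow(q0=10.0, k=0.5, days=2)
```

With k defined as the ratio of today's baseflow to the flow one day earlier, values close to 1 describe a slowly draining basin and values close to 0 a rapidly draining one.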
Validation Process: Validation, which is used to verify the model's accuracy, required a resampling approach in this study because of the limited number of flood events identified and studied. Specifically, the "leave-one-out" resampling method was employed, whereby each of the n flood events (i) was removed in turn. In this way, the relationship between the root-zone soil moisture, measured using a time domain reflectometry (TDR) tool, and the model's two most sensitive calibration parameters (curve number (CN) and time of concentration (TC)) could be established.
The CN values obtained by this procedure are then used to model each flood event, and the simulated flow is compared with the observed flow. The validation results for the 24 events are presented in Table 5. They indicate better model performance when using the SCS-CN model and accounting for soil moisture, with Nash coefficients between 0.47 and 0.90 under the leave-one-out procedure [15]. This approach is a good alternative for hydrological modeling in poorly gauged or ungauged basins [15]. In addition, all hydrographs from the three products were compared using several statistical indicators, namely flood volume, flood peak, NSE, PBIAS, and RMSE, which are important performance metrics in event-based modeling (Tables 4 and 5).
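The leave-one-out procedure described above can be sketched as follows. The text does not state the form of the relationship between soil moisture and CN, so a simple least-squares line is assumed here purely for illustration; the function names and the numbers are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_cn(soil_moisture, calibrated_cn):
    """Predict CN for each event from a fit that excludes that event."""
    predictions = []
    for i in range(len(soil_moisture)):
        # Remove event i, fit on the remaining n - 1 events
        xs = soil_moisture[:i] + soil_moisture[i + 1:]
        ys = calibrated_cn[:i] + calibrated_cn[i + 1:]
        slope, intercept = fit_line(xs, ys)
        # Estimate CN for the held-out event from its own soil moisture
        predictions.append(slope * soil_moisture[i] + intercept)
    return predictions


# Hypothetical soil moisture (%) and calibrated CN for four events
moisture = [10.0, 20.0, 30.0, 40.0]
cn_calibrated = [60.0, 65.0, 70.0, 75.0]
cn_validation = leave_one_out_cn(moisture, cn_calibrated)
```

Each predicted CN would then be fed back into the event simulation, so that the validation run for an event never uses that event's own calibrated value.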
In general, the results demonstrate close agreement between the observed and estimated flows at the peak flow values and reasonable agreement in terms of discharge distribution. In addition, the evaluation criteria show satisfactory NSE values, between 0.60 and 0.91 for calibration and between 0.47 and 0.90 for validation. The results showed that IMERG Early is efficient at capturing intense precipitation time series in the high mountains, as its latency is about 4 h. The rise and recession curves were well reproduced overall, and the peak flow rate was generally reached for all events in both the calibration and validation portions.
However, IMERG Early accurately simulated the streamflow of events 1 and 3 in the heavy flash flood category but slightly overestimated the simulated streamflow volumes of events 2, 4, 5, and 6 in the moderate and low flash flood categories.
Nevertheless, the results indicate a good fit between the datasets, with an NSE between 0.60 and 0.91 in calibration and between 0.47 and 0.84 in validation. The decrease in the validation criteria arises because the IMERG Early product does not account for the initial soil moisture conditions, owing to its latency.
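For reference, the goodness-of-fit criteria used to compare observed and simulated hydrographs can be computed as in the sketch below. The function names are hypothetical, and the PBIAS sign convention (positive when the simulation overestimates) is an assumption, since conventions differ between authors.

```python
import math


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def pbias(observed, simulated):
    """Percent bias; positive means overestimation under the assumed convention."""
    return 100.0 * sum(s - o for o, s in zip(observed, simulated)) / sum(observed)


def rmse(observed, simulated):
    """Root mean square error, in the units of the flow series."""
    squared = [(o - s) ** 2 for o, s in zip(observed, simulated)]
    return math.sqrt(sum(squared) / len(squared))


# Illustrative discharge values (m3/s), not data from the study
q_obs = [1.0, 2.0, 3.0, 4.0]
q_sim = [1.1, 2.2, 3.3, 4.4]
scores = (nse(q_obs, q_sim), pbias(q_obs, q_sim), rmse(q_obs, q_sim))
```

Because NSE normalizes the squared error by the variance of the observations, events with small, flat hydrographs can score poorly even when the absolute errors are small.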

Calibration and Validation of IMERG Late Events
The hydrographs in Figure 6 illustrate the calibration and validation results for the IMERG Late precipitation data. The model accurately reproduced the general pattern of the observed flow during the calibration and validation events. Furthermore, the results showed that IMERG Late estimates intense precipitation time series better than its predecessor IMERG Early, owing to its latency of about 14 h, which allows some data adjustments.
The simulated flows fitted the observed flows well during calibration and validation, with NSEs between 0.59 and 0.90 and between 0.54 and 0.90, respectively. In the validation part, taking the initial soil moisture condition into account was more meaningful for IMERG Late and clearly improved its validation results.
The simulated flow volumes improved significantly for events 1 and 3 in the heavy flash flood category, with a slight underestimation of precipitation for events 2, 5, and 6 in the moderate flash flood category. The curve shape and the simulated flow volume of event 4, in the low flash flood category, also improved, reflecting the product's higher capacity to detect the peak discharge.

Calibration and Validation of IMERG Final Events
The results in Figure 7 show that the flows simulated with the IMERG Final run performed well in calibration and validation. The rise curves were generally well reproduced, and peak flows were reached for most events, while the recession curves were mostly overestimated in both the calibration and validation parts. It should be noted, however, that IMERG Final slightly overestimated the simulated event volumes for all flash flood categories.
Nevertheless, the evaluation criteria indicate good agreement between the datasets, with NSEs between 0.68 and 0.90 in calibration and between 0.51 and 0.80 in validation. The decrease in the validation criteria arises because the IMERG Final product is only moderately suited to flash flood simulations in mountainous regions compared with the IMERG Late product, although it can give better results at lower altitudes.
This research is one of the first works to evaluate the GPM IMERG products in this region [12,15,16]. Hence, the results obtained can be used to improve future IMERG strategies for arid mountainous zones, since poorly gauged basins in these regions urgently need accurate rainfall data for disaster management and flood forecasting.

Conclusions
This research assessed the effectiveness of the GPM IMERG V06 products in the semi-arid Zat basin, a poorly gauged mountainous watershed equipped with only one downstream measuring station, whose data were used as the benchmark over the four-year period from 2014 to 2018. The assessment of the IMERG products followed four approaches: (1) identification of the spatial distribution of precipitation and its influence on runoff at the outlet; (2) data pre-processing and bias correction using the CDF function and the quantile mapping (QM) method; (3) comparative statistical evaluation of the three IMERG products, including their ability to detect various precipitation types; and (4) calibration of the IMERG products in the HEC-HMS model by comparison with the gauge data, and validation of the SPP considering the initial soil conditions.
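To make approach (2) concrete, the sketch below shows one simple form of empirical, CDF-based quantile mapping: sorted satellite values are paired with sorted gauge values of the same rank, and a new satellite value is corrected by linear interpolation between the neighboring pairs. The paired reference samples of equal length, the interpolation scheme, and the function name `quantile_map` are assumptions for illustration; the exact implementation used in the study is not reproduced here.

```python
from bisect import bisect_right


def quantile_map(value, sat_reference, gauge_reference):
    """Map a satellite value onto the gauge distribution by matching ranks.

    Assumes both reference samples have the same length (paired time steps).
    """
    sat_sorted = sorted(sat_reference)
    gauge_sorted = sorted(gauge_reference)

    # Clamp values outside the range covered by the reference sample
    if value <= sat_sorted[0]:
        return gauge_sorted[0]
    if value >= sat_sorted[-1]:
        return gauge_sorted[-1]

    # Neighboring ranks such that sat_sorted[lo] <= value < sat_sorted[hi]
    hi = bisect_right(sat_sorted, value)
    lo = hi - 1
    frac = (value - sat_sorted[lo]) / (sat_sorted[hi] - sat_sorted[lo])

    # Gauge value at the same (interpolated) rank
    return gauge_sorted[lo] + frac * (gauge_sorted[hi] - gauge_sorted[lo])


# Illustrative reference samples (mm per 30 min), not data from the study
sat_sample = [0.0, 1.0, 2.0, 4.0]
gauge_sample = [0.0, 2.0, 4.0, 8.0]
corrected = quantile_map(3.0, sat_sample, gauge_sample)
```

Matching ranks rather than individual time steps corrects the distribution of intensities, which is why timing errors in the satellite product are not removed by this kind of adjustment.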
The principal findings of this research are:
(1) QM is an effective method for correcting the bias of satellite precipitation estimates when ground precipitation data are scarce. The statistical evaluation of the QM method indicated that IMERG_L showed a moderate improvement and performed slightly better than IMERG_E and IMERG_F. Overall, the lack of rain gauge stations prevents a proper evaluation of earth observation products and leads to an underestimation of product performance, as in our case.
(2) Regarding the effectiveness of the three SPP, IMERG Late surpassed the other two in the majority of statistical metrics. IMERG Final ranked second, ahead of IMERG Early, which slightly overestimated total precipitation.
(3) The results of the hydrological model indicate that the IMERG Early, Late, and Final products achieved satisfactory hydrological performance, with mean NSE values of 0.77, 0.80, and 0.82, respectively. However, during the validation of the flood events, which considered the initial soil conditions, IMERG_F and IMERG_E significantly overestimated the discharge, by 13% and 10%, respectively, while IMERG_L performed satisfactorily in the validation part, with an average NSE of 0.69.
(4) In summary, IMERG Early is quite reliable for capturing short-term, high-intensity extreme rainfall events and less suitable for low-intensity precipitation events of medium and long duration. Owing to its 4 h latency, this product is not sensitive to the initial soil moisture conditions applied during validation, which explains the decrease in its evaluation criteria, notably the 10% decrease in NSE.
(5) Furthermore, the IMERG Late precipitation product estimates the precipitation time series at different flood intensities and durations better than the IMERG Early and Final products. Its latency of about 14 h allows some data adjustments, e.g., taking into account the initial soil moisture condition, which clearly improved its validation results.
(6) The IMERG Final product, by contrast, is not well suited to short-duration flash flood simulations in mountainous regions, which further explains the 13% decrease in its validation performance criteria. This may be due to the rugged topography of the region, which consists mainly of high-altitude areas.
Altogether, this research can provide users of earth observation precipitation data with reliable guidance for choosing between the IMERG precipitation products in the context of flood forecasting. It offers a framework that comprehensively characterizes the IMERG precipitation products in a mountainous, semi-arid, and poorly gauged watershed.