Groundwater Augmentation through the Site Selection of Floodwater Spreading Using a Data Mining Approach (Case study: Mashhad Plain, Iran)

Sustainable development goals are difficult to achieve without a proper water resources management strategy. This study implements some state-of-the-art statistical and data mining models, i.e., weights-of-evidence (WoE), boosted regression trees (BRT), and classification and regression tree (CART), to identify suitable areas for artificial recharge through floodwater spreading (FWS). First, suitable areas for the FWS project were identified in a basin in north-eastern Iran based on the national guidelines and a literature survey. Using the same methodology, an identical number of FWS unsuitable areas were also determined. Afterward, a set of FWS conditioning factors was selected for modeling FWS suitability. The models were trained using 70% of the suitable and unsuitable locations and validated with the remaining 30% of the input data. Finally, receiver operating characteristic (ROC) curves were plotted to compare the produced FWS suitability maps. The findings showed acceptable performance of the BRT, CART, and WoE for FWS suitability mapping, with areas under the ROC curve of 92, 87.5, and 81.6%, respectively. Among the considered variables, transmissivity, distance from rivers, aquifer thickness, and electrical conductivity were the most important contributors to the modeling. The FWS suitability maps produced by the proposed method could be used as a guideline for water resource managers to control flood damage and obtain new sources of groundwater. This methodology could be easily replicated to produce FWS suitability maps in other regions with similar hydrogeological conditions.


Introduction
A major part of Iran falls under arid and semi-arid climates with a low amount of precipitation, high temperature, and high evapotranspiration rates [1]. Iran receives only one-third of the average annual rainfall in the world [2]. Spatial and temporal variations of rainfall in Iran do not have a [...] been successfully used in flood susceptibility modeling [28], landslide susceptibility mapping [29–34], and land use modeling [35,36].
In the above literature, it can be seen that the previous studies on FWS suitability mapping have mainly concentrated on using models which are totally based on expert judgment or at least to some extent are affected by them. By reviewing the published papers in different spatial issues, it can be observed that there is a shift from simple statistical procedures that suffer from expert judgment-related errors to more complex data mining approaches that are able to extract information from different input variables. However, these models have not yet been used in FWS suitability mapping. To fill the research gap, the current study tries to reduce the related errors of expert judgment through the application of data mining models. Thus, the main novelty of this research is the application of three state-of-the-art models in FWS suitability mapping: the BRT, CART, and WoE methods. Consequently, the objectives of this research are: (i) selecting suitable areas for FWS systems by the BRT, CART, and WoE algorithms, (ii) ranking the importance of FWS conditioning factors by the BRT and CART algorithms, and (iii) extracting the relationships between FWS suitability and its conditioning factors by the WoE model. Figure 1 represents the methodology used to detect FWS suitable and unsuitable locations, as well as preparing FWS conditioning factors and modeling procedures, in this research.

Materials and Methods
Water 2018, 10, x FOR PEER REVIEW


Study Area
The Mashhad Plain was considered in this study to investigate the efficacy of the BRT, CART, and WoE algorithms in determining FWS suitable areas. The Mashhad Plain is located between 35°59′18″ and 37°03′53″ N and 58°21′33″ and 60°00′38″ E, with an area of 9600 km² (Figure 2). The study area has a mean annual precipitation of around 240 mm according to the Khorasan Razavi Regional Water Authority (KRRWA) [37]. The elevation in this basin ranges between 850 and 3250 m. About 3,000,000 people live in the study basin and depend primarily on groundwater (GW) resources for drinking and crop cultivation. The inappropriate management of the water resources in this area has led to a 12-m water table depletion in a 20-year period [38].

Floodwater Spreading Dataset
FWS systems comprise several parts, including a diversion dam, a conveyance canal, a sedimentation basin, water gateways, several percolation surfaces, and a spillway at the end of the system [39,40]. The diversion dam diverts the floodwater from the ephemeral river into the conveyance canal [41], which delivers it to the first sedimentation basin (Figure 3). In the sedimentation basin, the floodwater deposits its sediment load. The first and second ponds in the system are constructed for sedimentation. Floodwater in the sedimentation basin can reach a height of 20-25 cm. The sedimentation basins are designed similarly to the infiltration ponds with respect to width, slope, and area. Once the water reaches a certain height, it enters the first percolation surface through the gateways at the end of the sedimentation basin [42]. In this part of the FWS system, water infiltrates and recharges GW. There are several infiltration ponds in an FWS system. The number and size of these infiltration ponds depend on the flood discharge, the volume of diverted floodwater, slope, and soil texture. The FWS system is designed to divert 30 to 40% of a 100-year flood of the basin. The width of FWS systems varies between 100 and 1500 m depending on the land availability and land surface characteristics. The floodwater height can reach 20-25 cm in the infiltration ponds. The conveyance canal normally has a slope of about 0.0003, and its width depends on the floodwater discharge rate. There is also a spillway, which returns the floodwater to the river in order to prevent damage to the FWS system.
For modeling FWS suitability in this research, first, data on the currently established FWS system, the Jamab system, were gathered [37]. Then, we defined the best locations for FWS establishment in the study area by considering the national guidelines and published documents [9,10,13,43]. The main conditioning factors used for this selection were slope percent, rainfall, aquifer thickness, and electrical conductivity (EC). Slopes of less than 5% are recommended for FWS systems by different authors [11,16,17]. However, slopes up to 8% could be conservatively regarded as moderately suitable for surface recharge [10,12,14]. Previous research has confirmed that slopes of more than 8% are unsuitable for artificial recharge through FWS systems. Thus, this study selected the suitable areas for FWS systems from slopes of less than 8%. Aquifer thickness has a limiting role in these artificial recharge methods, so this study did not consider values of less than 10 m. Rainfall had no limiting effect, as it is higher than 200 mm in the whole area. FWS systems not only improve GW conditions, but also decrease salinity in the aquifers. FWS systems are more efficient for aquifers with EC values lower than 6000 µmhos/cm [9]. Therefore, this research considered areas with EC <6000 µmhos/cm for FWS establishment. According to the stated criteria, the initial possible locations were selected for FWS construction. Field surveys were then conducted to check whether it is possible to divert floodwater from the main rivers over the surface. Finally, thirteen FWS suitable areas with a surface area of 1500 ha were selected for this purpose (Figure 4a). The Jamab FWS system has been used since 1995 [37,44] (Figure 4b,c). This area has an aquifer thickness of about 50 m, an average slope of 4.03%, and 16 infiltration ponds [37]. Since the establishment of the Jamab FWS system, the area has been flooded frequently [45].
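The screening thresholds stated above (slope below 8%, aquifer thickness of at least 10 m, EC below 6000 µmhos/cm) can be expressed as a simple filter. The following is a minimal sketch, not the authors' GIS workflow: the `Cell` class, the function name, and the candidate values are hypothetical, and rainfall is omitted because the text reports it as non-limiting in this area.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """One candidate location (hypothetical container, not from the paper)."""
    slope_percent: float
    aquifer_thickness_m: float
    ec_umhos_cm: float


def passes_screening(cell, max_slope=8.0, min_thickness=10.0, max_ec=6000.0):
    """Apply the three limiting criteria quoted in the text."""
    return (
        cell.slope_percent < max_slope
        and cell.aquifer_thickness_m >= min_thickness
        and cell.ec_umhos_cm < max_ec
    )


# Illustrative candidates; the numbers are invented for the example.
candidates = [
    Cell(4.0, 50.0, 1200.0),  # meets all three criteria
    Cell(9.5, 40.0, 900.0),   # too steep
    Cell(3.0, 6.0, 800.0),    # aquifer too thin
    Cell(2.0, 30.0, 7500.0),  # too saline
]
flags = [passes_screening(c) for c in candidates]
print(flags)
```

Locations that pass such a filter would still need the field check described in the text, since the thresholds say nothing about whether floodwater can actually be diverted to the site.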
The Jamab system has the capability of infiltrating a floodwater volume of about 1.96 million m³ at each event. It should be noted that 80-90% of the diverted floodwater can be infiltrated in the system; the rest is lost through evaporation and/or returns to the river via a spillway/outflow channel. The diversion dam is designed based on a 100-year return period peak discharge of the study area (calculated as 138 m³ s⁻¹) and it can transfer a discharge of about 8 m³ s⁻¹. In the case of a major flood, a discharge of up to 8 m³ s⁻¹ is diverted and the rest flows through the main river channel. Investigation of the two water wells around this system shows a 10-m water table increase between 1995 and 2011 as a result of the FWS system. Infiltration ponds have an average infiltration rate of about 1.8 cm min⁻¹.

Floodwater Spreading Conditioning Factors
In this section, FWS conditioning factors, as well as their classes, are explained. FWS conditioning factors include both categorical and continuous variables. Geological units in addition to land use classes were considered as categorical variables in all the models. It must be noted that the WoE algorithm needs only categorical inputs; hence, the other layers were classified into different classes to be used in this model. In contrast, BRT and CART deal with both categorical and continuous factors. Thus, these two models were applied to the raw dataset.

Slope percent
Slope percent affects flow speed over the surface while it is being flooded. On steeper slopes, the flow speed is higher and results in different kinds of erosion. This factor was produced using a 30 m DEM of the Mashhad Plain. The slope percent of the study region was classified into 0-2, 2-5, 5-8, and >8% categories (Figure 5a).
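Because WoE needs categorical inputs, a continuous layer such as slope has to be mapped onto classes like the four listed above. A minimal sketch of that mapping follows; the function name is hypothetical, and assigning a boundary value (e.g., exactly 2%) to the lower class is an assumption, since the text does not say which side the breaks belong to.

```python
import bisect

# Class breaks taken from the text: 0-2, 2-5, 5-8, and >8 percent.
SLOPE_BREAKS = [2.0, 5.0, 8.0]
SLOPE_LABELS = ["0-2", "2-5", "5-8", ">8"]


def slope_class(slope_percent):
    """Return the slope class label; break values fall in the lower class."""
    if slope_percent < 0:
        raise ValueError("slope percent cannot be negative")
    return SLOPE_LABELS[bisect.bisect_left(SLOPE_BREAKS, slope_percent)]


print(slope_class(4.03))  # average slope reported for the Jamab system
```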

Plan and profile curvature
Plan curvature defines how concave or convex the surface is [46]. Flow distribution on the surface depends on the topography, which can be represented by profile curvature. The profile curvature ranges from negative to positive values [47]. The plan and profile curvatures maps were generated from the DEM of the study area by SAGA software (Figure 5b,c).

Transmissivity
To prepare the transmissivity map, hydraulic conductivity data obtained from well-pumping investigations by KRRWA in 2017 [37] were used. The transmissivity was then calculated as the hydraulic conductivity multiplied by the saturated aquifer thickness of the study region. This FWS conditioning factor represents the horizontal movement of water through the saturated parts of the aquifer [17]. This factor was categorized into 4 classes (Figure 5d).
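The relation described above can be written compactly. The symbols T (transmissivity), K (hydraulic conductivity), and b (saturated aquifer thickness), as well as the numbers in the worked line, are illustrative assumptions and not values reported for the Mashhad Plain:

```latex
T = K \, b
% hypothetical example: K = 5 m day^{-1}, b = 40 m
T = 5 \times 40 = 200 \ \mathrm{m^{2}\,day^{-1}}
```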

Aquifer thickness
Aquifer thickness has an important role in defining the suitable places for FWS as low values cause saturation and are not ideal choices for this purpose [9]. This factor was obtained from the KRRWA project report [37] as a result of well-logging and geophysical studies. This FWS conditioning factor was categorized into 4 classes (Figure 5e).

Electrical conductivity
EC defines the ability of a material to conduct electricity. This ability depends on the total dissolved solids in the aquifer [17]. EC values of the exploration wells measured by KRRWA [37] were used for preparing the EC layer map. The EC layer ranges from 0 to 9085 µmhos cm⁻¹ (Figure 5f).

Rainfall
A rainfall map of the study region was also obtained from KRRWA [37] by interpolating the average annual rainfall of multiple stations in and around the Mashhad Plain. This map includes four rainfall classes of 200-275, 275-350, 350-425, and 425-500 mm year⁻¹ (Figure 5g).

Distance from rivers and river density
The distance from rivers layer was prepared with the Euclidean distance function in ArcGIS software. This factor ranges from 0 to 11,926 m (Figure 5h). River density was produced by the line density function and was categorized into four classes using the natural break method (Figure 5i). This method of classification was selected because there are jumps in the river density layer [48].

Soil infiltration
The soil infiltration layer was prepared based on the hydrologic soil groups and soil conservation service method [37,49]. The four categories of soil infiltration in the Mashhad Plain are 0-12.7, 12.7-38.1, 38.1-76.2, and >76.2 mm h⁻¹ (Figure 5j).

Land use map
The study region is covered by four classes of land use i.e., orchard, rangeland, residential areas, and agriculture (Figure 5k). Rangeland and agriculture are the main land use classes covering 64 and 24.1% of the study area, respectively. It is noted that the demand for agricultural activities has

Geological units
There are 35 different lithological units in the Mashhad Plain. These units range from sedimentary to igneous rocks (Table 1; Figure 5l). The main units are gravel fan, shale, dolomite, and terraces.


Modeling Approaches
To apply the binary classification models, i.e., BRT and CART, both presence and absence data are needed. Presence data were produced by converting the polygons of the FWS suitable areas to points (i.e., 223 points). Absence data, i.e., FWS unsuitable locations, were created using a random algorithm and equal in number to the presence data (i.e., 223 points). The training dataset contains 70% of both the FWS suitable and unsuitable points (i.e., 312 points), while the validation dataset contains the remaining 30% of both (i.e., 134 points).
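The 70/30 partition can be sketched as a per-class random split, which reproduces the reported totals of 312 training and 134 validation points. This is an illustrative sketch only: the function name, the seeds, and the placeholder point records are assumptions, and the paper does not state how its split was randomized.

```python
import random


def split_points(points, train_fraction=0.7, seed=42):
    """Shuffle a copy of the points and cut it into training and validation."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder records standing in for the 223 presence and 223 absence points.
presence = [("suitable", i) for i in range(223)]
absence = [("unsuitable", i) for i in range(223)]

# Splitting each class separately keeps both subsets balanced.
p_train, p_valid = split_points(presence, seed=42)
a_train, a_valid = split_points(absence, seed=43)
training = p_train + a_train
validation = p_valid + a_valid
print(len(training), len(validation))
```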

FWS Suitability Modeling by BRT
BRT can be regarded as a combination of two strong statistical methods: boosting and regression trees. Boosting is similar to model averaging, with the difference that it implements a forward stage-wise process in which each tree is fitted to a part of the training set [50]. The sub-dataset used in each iteration is chosen without replacement [50]. This process is known as stochastic gradient boosting, which increases the model accuracy and diminishes the overfitting problem [51].
Decision trees are created by recursive binary partitioning of the training subsets. Recursive binary partitioning is a technique for multivariate analysis that divides the cases in the population into sub-groups according to binary splits on the input factors [52]. In BRT, trees are created iteratively until the loss function is minimized [50]. The final value can be calculated as below:

$$\text{Final value} = \sum_{i=1}^{n} \text{learning rate} \times \text{tree}_i$$

where i = 1, 2, 3, ..., n indexes the fitted trees, n is the total number of trees, and the learning rate determines the contribution of each tree to the final value.
For running the BRT model in R, three main parameters must be defined or calibrated: the number of trees (iterations), the learning rate, and the maximum tree depth (interaction depth). Interaction depth defines the size of the individual trees. To apply BRT, the caret and gbm packages were used in the current work [53].
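To make the roles of the number of trees and the learning rate concrete, the following is a minimal sketch of stochastic gradient boosting in plain Python. It is not the gbm implementation used in the study: it assumes a squared-error loss on a single input, uses depth-1 trees (stumps), and all names, defaults, and the toy data are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Fit a depth-1 regression tree: one threshold, one mean per side."""
    best = None
    uniq = sorted(set(xs))
    for lo, hi in zip(uniq, uniq[1:]):
        thr = (lo + hi) / 2
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # every x identical: fall back to a constant
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def stump_value(stump, x):
    thr, left_mean, right_mean = stump
    return left_mean if x <= thr else right_mean


def boost(xs, ys, n_trees=200, learning_rate=0.05, bag_fraction=0.5, seed=0):
    """Forward stage-wise fitting of stumps to the current residuals."""
    rng = random.Random(seed)
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    n_bag = max(2, int(bag_fraction * len(xs)))
    for _ in range(n_trees):
        # Each tree sees a random subset drawn without replacement.
        idx = rng.sample(range(len(xs)), n_bag)
        stump = fit_stump([xs[i] for i in idx], [ys[i] - preds[i] for i in idx])
        stumps.append(stump)
        preds = [p + learning_rate * stump_value(stump, x)
                 for p, x in zip(preds, xs)]
    return {"base": base, "learning_rate": learning_rate, "stumps": stumps}


def predict(model, x):
    """Initial mean plus the learning-rate-weighted sum of all trees."""
    total = sum(stump_value(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * total


# Toy data: the response switches from 0 to 1 at x = 10.
xs = list(range(20))
ys = [1.0 if x >= 10 else 0.0 for x in xs]
model = boost(xs, ys)
mse = sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(round(mse, 4))
```

A smaller learning rate shrinks each tree's contribution, so more trees are needed to reach the same fit; this trade-off is why the two parameters are usually calibrated together.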

FWS Suitability Modeling by CART
Tree-based models such as CART are alternative techniques for classification and regression that do not assume normality [54]. Further, unlike discriminant analysis models, CART is very simple to interpret. CART can deal with a large number of cases and variables and is resistant to outliers [55]. CART is a good choice when the user is dealing with a large number of input variables, as it can recognize the main contributors and their interactions [54]. CART divides the data recursively until the classes become homogeneous or include fewer observations than a threshold set by the modeler [56]. To avoid overfitting, a process called pruning is applied. Pruning selects the best trade-off between the decrease of deviance and the number of terminal nodes [24,57]. This model was implemented using the rpart package in R.
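A single step of the recursive partitioning can be sketched as a search for the threshold that makes the two resulting groups most homogeneous. The sketch below assumes the Gini index as the impurity measure (the text does not state which criterion was used), and the function names and distance values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return the threshold with the lowest weighted impurity, and that impurity."""
    pairs = list(zip(values, labels))
    best_thr, best_imp = None, gini(labels)
    uniq = sorted(set(values))
    for lo, hi in zip(uniq, uniq[1:]):
        thr = (lo + hi) / 2
        left = [y for v, y in pairs if v <= thr]
        right = [y for v, y in pairs if v > thr]
        imp = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if imp < best_imp:
            best_thr, best_imp = thr, imp
    return best_thr, best_imp


# Invented example: distance from rivers (m) and suitability (1 = suitable).
distance_m = [200, 600, 900, 1400, 1700, 2500, 4000, 8000]
suitable = [1, 1, 1, 1, 0, 0, 0, 0]
threshold, impurity = best_split(distance_m, suitable)
print(threshold, impurity)
```

A full tree repeats this search over every factor within each resulting group; pruning then removes splits whose improvement does not justify the extra terminal nodes.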

FWS Suitability Modeling by WoE
WoE is a data-driven algorithm based on Bayesian probability [58]. The main idea of the WoE model is that several binary patterns can be considered together to define a new binary pattern [59]. This model computes the weights of the FWS conditioning factors on the basis of the existing FWS suitable areas in the study area [60]. First, the positive and negative weights of each FWS conditioning factor are calculated as follows:

$$W^{+} = \ln \frac{P(B \mid D)}{P(B \mid \bar{D})}$$

$$W^{-} = \ln \frac{P(\bar{B} \mid D)}{P(\bar{B} \mid \bar{D})}$$

where $P$ denotes the probability; $B$ and $\bar{B}$ show the presence and absence of the dichotomous pattern, respectively; and $D$ and $\bar{D}$ denote the presence and absence of FWS suitable areas, respectively. $W^{+}$ and $W^{-}$ represent the WoE for FWS suitable and unsuitable cases, respectively [61,62]. The contrast value is obtained as $W^{+} - W^{-}$. A contrast value of 0 indicates that the variable is insignificant in modeling FWS suitability. Positive values indicate a positive correlation, while negative values indicate a negative correlation between the variable and FWS suitability [59].
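For one class of one conditioning factor, the positive weight, negative weight, and contrast can be computed from four counts. The sketch below follows the standard weights-of-evidence definitions; the function name and the class counts are hypothetical, and estimating the conditional probabilities from point counts rather than pixel areas is an assumption.

```python
import math


def weights_of_evidence(n_class_presence, n_class_absence, n_presence, n_absence):
    """Return (W+, W-, contrast) for one class of a conditioning factor."""
    p_b_d = n_class_presence / n_presence   # P(B | D)
    p_b_nd = n_class_absence / n_absence    # P(B | not D)
    if not (0 < p_b_d < 1 and 0 < p_b_nd < 1):
        # Empty or all-inclusive classes make the logarithms undefined.
        raise ValueError("weights are undefined for empty or full classes")
    w_plus = math.log(p_b_d / p_b_nd)
    w_minus = math.log((1 - p_b_d) / (1 - p_b_nd))
    return w_plus, w_minus, w_plus - w_minus


# Invented counts: 90 of 156 suitable and 30 of 156 unsuitable training
# points fall inside the class being evaluated.
w_plus, w_minus, contrast = weights_of_evidence(90, 30, 156, 156)
print(round(w_plus, 3), round(w_minus, 3), round(contrast, 3))
```

A class that holds the same share of suitable and unsuitable points yields weights and contrast of zero, which matches the interpretation of a zero contrast given in the text.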


Application of BRT
Based on the calibration results, the final BRT model has 1400 trees, an interaction depth of 9, a minimum of 20 observations in the terminal nodes of the trees (n.minobsinnode), and a shrinkage of 0.05 (Figure 6). For the initial parameters, the number of trees varied between 500 and 1200, the maximum tree depth ranged from 5 to 11, and the shrinkage took values of 0.001, 0.010, 0.050, and 0.100. Figure 7 shows the FWS suitability map produced by the BRT model. As seen in the figure, most of the study region (89.3%) has been categorized as unsuitable for FWS construction. On the other hand, the high FWS suitable zone includes a very small portion of the watershed (6.7%).
Further, the importance of the FWS conditioning factors was determined by the BRT algorithm. Based on Table 2, transmissivity, distance from rivers, and EC are the most important FWS conditioning factors in the modeling procedure. In contrast, soil infiltration, lithology, plan curvature, and land use were identified as the least important FWS conditioning factors for the implemented dataset.

Application of CART
The CART algorithm was calibrated using the training dataset. Based on the results, the CART model was pruned with a complexity parameter (cp) of 0.03. The final classification tree obtained by the CART is presented in Figure 8. As seen, the CART used only four factors in its final pruned classification tree. Distance from rivers was selected as the root of the tree, which shows its high impact on the modeling procedure. The CART assigned 0 (i.e., FWS unsuitable location) to distances from rivers of >1514 m. Then, aquifer thickness and transmissivity were used to split the dataset into FWS unsuitable and suitable locations. Finally, distance from rivers and rainfall were taken into account. Figure 9 shows the FWS suitability map obtained using the CART algorithm. As seen, the low class of suitability occupies the largest area (73.7%). The high FWS suitable zone, on the other hand, includes only 15.7% of the watershed.

Application of WoE
Table 3 shows the results of WoE for the training dataset. In the case of the slope percent, it was seen that the classes of 0 [...] Figure 10 presents the FWS suitability zones obtained from the WoE algorithm. No single suitability zone dominates this map, and each zone occupies an area of around 20%. The unsuitable and high classes of suitability cover 25.2 and 16.2% of the watershed, respectively.

Validation of the FWS Suitability Maps by a ROC Curve
Validation is a critical step in modeling, as it shows whether the methods are applicable. This study used receiver operating characteristic (ROC) curves to validate the FWS suitability maps obtained by the BRT, CART, and WoE algorithms. The ROC curve is a well-known method for validating binary models and has been employed by many scholars in landslide, gully, flood, and GW studies [63–65]. This curve plots the true positive rate against the false positive rate [66,67]. The area under the ROC curve ranges between 0 and 1, where a value of 1 represents an ideal model [68–71]. In contrast, an area of 0.5 indicates a model that performs no better than random guessing [72–74]. The results of the ROC curve are shown in Figure 11. As seen, the BRT, CART, and WoE methods have areas under the ROC curve of 92, 87.5, and 81.6%, respectively.
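The area under the ROC curve equals the probability that a randomly chosen suitable point receives a higher score than a randomly chosen unsuitable point, with ties counted as one half. The sketch below computes it that way; the function name and the scores are invented for illustration and are not outputs of the models in this study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from pairwise comparisons of scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented suitability scores for eight validation points (1 = suitable).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(roc_auc(scores, labels))
```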


Discussion
Middle Eastern countries, including Iran, are known to be the most water-scarce areas worldwide [75]. The over-exploitation of groundwater in the Mashhad Plain has led to a decline in the water table. This has caused problems for the sustainable development of the region and has resulted in the migration of residents to other areas during the last decade. This study investigated the efficacy of the BRT, CART, and WoE algorithms for spatial modeling of FWS suitable areas. The findings showed that BRT had the best performance, followed by the CART and WoE algorithms. All the implemented algorithms had an area under the ROC curve higher than 80%, which represents acceptable efficiency for diagnostic algorithms. Regarding the contribution of the FWS conditioning factors to the modeling procedure, the CART used only four out of twelve factors in its final tree: distance from rivers, transmissivity, aquifer thickness, and rainfall. It thus employed the smallest number of factors. A simpler algorithm is believed to be more stable and easier to generalize [27,76], which makes such models appropriate for application to a wider area. The simpler decision tree obtained by the CART makes the model more applicable, as it is not over-fitted to many variables. Further, fewer variables are needed for running the CART, so the model can be applied easily in different areas.
The BRT, on the other hand, used all twelve FWS conditioning factors for creating trees. This algorithm had similar results to the CART, as it identified transmissivity, distance from rivers, and aquifer thickness as the most important factors. These are three of the four factors that were used in the final CART. The superior efficacy of the BRT algorithm in modeling FWS suitability in this study can be related to some of its strong features. BRT combines many simple trees, like those produced by CART, to predict the final response [51]. This feature is called boosting and is similar in nature to model averaging, which can reduce the overfitting problem in this model [24]. BRT produces better outputs than single trees such as CART because it grows many trees [51] and extracts information from any possible relationship between the response and input variables. Further, the suitability map by the WoE model showed acceptable performance. The weights-of-evidence of each class of the FWS conditioning factors show whether there is a direct correlation (for positive values) or reverse correlation (for negative values) with FWS suitability. The WoE produced more interpretable outputs that are easier to understand for managers as well as stakeholders. The importance of the FWS conditioning factors could be regarded as a guideline for modelers and water resources managers, as it shows the variables that contribute most to the modeling process. In other words, importance values represent the sensitivity of the model to each of its input variables. To obtain more accurate FWS suitability maps in the Mashhad Plain, modelers as well as water resources managers ought to focus on the most important factors in determining the suitable areas. For the Mashhad Plain, these factors are transmissivity, distance from rivers, and aquifer thickness.
A practical recommendation of this study is that water managers ensure the quality and accuracy of the above-mentioned factors. According to the results, the BRT model can be suggested for FWS suitability mapping in the Mashhad Plain due to its higher accuracy relative to the other models.
Artificial recharge is an efficient technique to augment GW resources, and it is related to both flood-related and GW-related factors. Hydrogeological phenomena are complex, and the relationship between response and input variables is non-linear. Additionally, flood-related factors are complex to model. These uncertainties obliged us to implement three more sophisticated algorithms to get more reliable outputs. The interpretation of the FWS suitability maps showed that there is a strong pattern in the vicinity of the rivers in the study area. This pattern exists in all the maps produced by BRT, CART, and WoE. The BRT model defined a large part of the study area as unsuitable. The moderately and highly suitable classes for FWS construction by the BRT cover a very small part of the studied region. In the FWS suitability maps by the CART and WoE algorithms, most of the watershed has been assigned to low suitability. Despite these differences between the outputs of the models, all the models defined similar areas as the high FWS suitable zone. This agreement diminishes the uncertainty in the results, as the three model outputs are quite similar in defining the high suitability zone for FWS establishment.
With the outputs of this study, suitable areas can be selected with less time and cost. Managers can double-check the locations suggested by the suitability maps for FWS construction through field surveys. Accordingly, the highly suitable areas for FWS projects produced by the models are suggested for the artificial recharge of groundwater. Construction of several FWS systems in suitable spots based on the FWS suitability maps in the Mashhad Plain can improve groundwater conditions and reduce flood damage in this populated area. Further, the BRT, CART, and WoE can also be used for defining suitable locations for other types of artificial recharge techniques, e.g., injection wells, in the Mashhad Plain, considering the proper conditioning factors. The methodology is also suggested as a guideline for water sector managers and stakeholders. The construction of FWS systems, as well as other artificial recharge methods, could change the critical situation of water resources in different countries.