Prediction of groundwater flow in shallow aquifers using artificial neural networks in the northern basins of Algeria

N. Guezgouz, D. Boutoutaou and A. Hani

Prediction of groundwater flow fluctuations is considered an important step in understanding groundwater systems at this scale and in facilitating sustainable groundwater management. The objective of this study is to determine the factors that influence and control groundwater flow fluctuations in a specific geomorphologic situation, by developing a forecasting model and examining its potential for predicting groundwater flow from limited data. The prediction models are based on artificial neural networks (ANNs). Networks with different numbers of hidden-layer neurons were developed using climatic and geomorphological characteristics as input variables and predicted groundwater flow as the output. To evaluate model performance, several regression statistics are compared. As an example, the relative mean square error of the ANN groundwater flow prediction and the correlation coefficient are 0.015 and 97%, respectively. The results show that ANNs can predict groundwater flow in the shallow aquifers of northern Algeria with reasonable accuracy, even when data are limited.


INTRODUCTION
Integrated water resources management is a systematic process for the sustainable development, allocation and monitoring of water resources, which are viewed here as subject to both a geomorphological influence and a climatic variation. This conceptual model interprets the two systems through three components that influence groundwater flows: the nature of the watershed, the stream characteristics and the rainfall.
To help water planners and managers gain adequate knowledge and understanding of the relationships between the response variables and water resources mobilization, a proper methodology is needed to identify the response variables that most affect the attractiveness of water resources mobilization.

Study area
The study basins are situated in the north of Algeria (Figure 1). They are bordered by Morocco to the west, the Algerian Sahara basin to the south, Tunisia to the east and the Mediterranean Sea to the north. The total area of the northern Algeria river basins is about 480,000 km², comprising 17 river basins and 224 sub-basins.
Water resources in the study area are vulnerable to fast-growing demand from urban and rural populations and from economic sectors including agriculture, industry and public institutions. Groundwater from shallow aquifers in northern Algeria is used primarily to irrigate vegetable crops, over an area exceeding 1 million hectares. It is also an important source of drinking water in rural areas through traditional wells.

Data description
Groundwater flow data and response variables were implemented in the ANN model using the software package of STATISTICA 8 (Serial: STA862D175437Q). The Arc-Hydro Toolbox was used to extract geomorphometric land surface variables and features from Digital Elevation Models (DEMs). It comprises a series of Python/NumPy processing functions, presented through an easy-to-use graphical menu in the widely used ArcGIS package. Climatic data were sourced from government agencies as independent datasets (each case is independent) for the observed sub-basins. The response variables were:
• groundwater velocity (GWV) (mm yr⁻¹);
• area of watershed (AWS) (km²);
• drainage density (Dd) (km km⁻²);
• order of the main stream in the basin (OMSB) (-);
• length of the main stream in the basin (LMSB) (km);
• slope of the main stream in the basin (SMSB) (m km⁻¹);
• length of hydrographic network in the basin (LHNB) (km);
• hydro-morphological coefficient of the basin (HMCB) (-);
• annual rainfall in the basin (ARB) (mm yr⁻¹);
• stream water flow (SWF) (mm yr⁻¹); and
• evapotranspiration in the basin (ETR) (mm yr⁻¹).
The variables representing the response category are considered as the possible input variables while the target output variable is GWV. All input variables will be compared with expert opinion and judgment ranking to assess the performance of the conceptual model.

ranges. X_i is expressed as follows:

Evaluation criteria
Each signal comes via a connection that has a weight (W_ij). The net incoming signal to a receiving hidden node (NET_j) is the weighted sum of the input signals, X_i, and the corresponding weights, W_ij, plus a constant reflecting the node threshold value (TH_j):

$$\mathrm{NET}_j = \sum_i W_{ij} X_i + \mathrm{TH}_j$$

The net incoming signal to a hidden node (NET_j) is transformed to an output (O_j) from the hidden node by using a non-linear transfer function (f) of the sigmoid type, given by the following equation form:

$$O_j = f(\mathrm{NET}_j) = \frac{1}{1 + e^{-\mathrm{NET}_j}}$$

O_j passes as a signal to the output node (k).
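The hidden-node computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the STATISTICA implementation used in the study: the logistic form of the sigmoid is assumed, and the function names (`sigmoid`, `hidden_layer`) and all numerical values are hypothetical.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, assumed here as the 'sigmoid type' transfer function f."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_layer(x, weights, thresholds):
    """Return (NET_j, O_j) for every hidden node j.

    x          : list of scaled input signals X_i
    weights    : weights[i][j] is W_ij, the weight from input i to hidden node j
    thresholds : thresholds[j] is TH_j, the threshold of hidden node j
    """
    nets, outs = [], []
    for j, th_j in enumerate(thresholds):
        # Weighted sum of the inputs plus the node threshold
        net_j = sum(x[i] * weights[i][j] for i in range(len(x))) + th_j
        nets.append(net_j)
        outs.append(sigmoid(net_j))
    return nets, outs

# Illustrative values only (3 inputs, 2 hidden nodes); not taken from the study
x = [0.2, 0.5, 0.9]
W = [[0.1, -0.4], [0.3, 0.2], [-0.5, 0.6]]
TH = [0.05, -0.1]
nets, outs = hidden_layer(x, W, TH)
```

Because the sigmoid maps any net signal into the interval (0, 1), every hidden-node output is bounded regardless of the magnitude of the weighted sum.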
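The net entering signals to an output node (NET_k) are given, by analogy with the hidden layer, by the weighted sum of the hidden-node outputs O_j, with connection weights W_jk and output-node threshold TH_k:

$$\mathrm{NET}_k = \sum_j W_{jk} O_j + \mathrm{TH}_k$$

The net incoming signals of an output node (NET_k) are transformed using the sigmoid type function to a standardized or scaled output (O_k), that is:

$$O_k = f(\mathrm{NET}_k) = \frac{1}{1 + e^{-\mathrm{NET}_k}}$$

Then, O_k is standardized to produce the target output:

cross-verification and testing. For the ANN models described in this paper, 50% of the available data were used for training, 25% were used for verification and 25% to test the validity of network prediction (Lallahem et al.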
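The 50/25/25 partition into training, verification and testing subsets can be illustrated as follows. This is a sketch under stated assumptions: the text does not say how cases were assigned to subsets, so a seeded random shuffle is assumed here, and the function name `split_50_25_25` and the placeholder case identifiers are hypothetical.

```python
import random

def split_50_25_25(cases, seed=0):
    """Split cases into training (50%), verification (25%) and testing (25%).

    Random assignment with a fixed seed is an assumption made for this
    sketch; the source does not describe the assignment procedure.
    """
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n // 2
    n_verify = n // 4
    train = shuffled[:n_train]
    verify = shuffled[n_train:n_train + n_verify]
    test = shuffled[n_train + n_verify:]  # remainder goes to testing
    return train, verify, test

# Placeholder case identifiers, not the study's sub-basins
cases = list(range(100))
train, verify, test = split_50_25_25(cases)
```

Keeping the three subsets disjoint is what allows the verification and testing errors to be read as estimates of performance on unseen cases.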

Setup of the model inputs
ANN models have the ability to determine which inputs are critical. They are useful mainly for complex problems where the number of potential inputs is large and where a priori knowledge is not available to determine appropriate inputs (Lachtermacher & Fuller ). In this study, a sensitivity analysis can be carried out to identify the importance of the input variables.
This indicates which variables are considered to be most
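One common way to carry out such a sensitivity analysis is to replace each input in turn by its mean and compare the resulting prediction error with the baseline error; a ratio well above 1 suggests that the network relies on that input. The sketch below illustrates this idea only. The source does not state which procedure was used, so the method, the use of root mean square error as the error measure, the names `rmse` and `sensitivity_ratios`, and the toy model and data are all assumptions.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def sensitivity_ratios(predict, rows, targets):
    """Error ratio per input: error with the input held at its mean / baseline error.

    predict : callable mapping one input row (list of floats) to a prediction
    rows    : list of input rows
    targets : observed values, one per row
    """
    baseline = rmse(targets, [predict(r) for r in rows])
    ratios = []
    for i in range(len(rows[0])):
        mean_i = sum(r[i] for r in rows) / len(rows)
        # Remove the information carried by input i by fixing it at its mean
        perturbed = [r[:i] + [mean_i] + r[i + 1:] for r in rows]
        ratios.append(rmse(targets, [predict(r) for r in perturbed]) / baseline)
    return ratios

# Toy stand-in for a trained network: depends on input 0 only (hypothetical)
def toy_model(row):
    return 2.0 * row[0] + 0.0 * row[1]

rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
targets = [2.1, 3.9, 6.1, 7.9]
ratios = sensitivity_ratios(toy_model, rows, targets)
```

In this toy case the first input receives a ratio far above 1 and the second a ratio of exactly 1, since the stand-in model ignores it. The baseline error must be non-zero for the ratio to be defined.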

RESULTS AND DISCUSSION
In the northern Algeria basins, groundwater flow is driven by stream flow, annual precipitation in the basin, drainage density and various other geomorphological variables.
Stream flow is a root source of the limited available groundwater flow and is itself sustained by precipitation.
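The network types considered are the multilayer perceptron (MLP), trained with two back-propagation algorithms (BFGS, Broyden-Fletcher-Goldfarb-Shanno, and SCG, scaled conjugate gradient), and the radial basis function (RBF) network.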
During the analysis, many other networks were tested. The best ANN model found is MLP (BFGS 137) with four hidden nodes, which has a smaller error (0.015) than the other types of ANN networks tested (Table 1). Verification of the model demonstrates a good fit to the available data. RMSE values for training, verification and testing are consistently small in magnitude, indicating that the data subsets are from the same population (Jalala et al.; Table 2). In addition, the correlation coefficient for each phase exceeds 97%, which shows a close agreement between the observed and predicted groundwater velocity
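The two statistics used to compare the networks can be computed as in the sketch below. It assumes that RMSE denotes the root mean square error (the abstract refers to a "relative mean square error", so this reading is an assumption) and that the correlation coefficient is Pearson's r. The function names and the observed and predicted values are illustrative only and are not the study's data.

```python
import math

def rmse(observed, predicted):
    """Root mean square error (assumed meaning of RMSE in the text)."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Illustrative groundwater velocities (mm/yr); hypothetical values
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

error = rmse(observed, predicted)
r = pearson_r(observed, predicted)
```

Computing both statistics separately for the training, verification and testing subsets, as reported in the text, makes it possible to check that the fit does not degrade on cases the network has not seen.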