Dataset of development of stochastic groundwater flow under uncertain recharge

Highlights • The methodology presented in this paper will be useful for groundwater researchers and modelers.


Specifications table Subject Area
Engineering More specific subject area: Water -Civil Engineering Method name: Groundwater Modelling Name and reference of original method: Steady -state Numerical model was developed using MODFLOW -20 0 0 [3] . The governing partial differential equation for a confined aquifer used in MODFLOW is; MODFLOW is a computer program that numerically solves the three -dimensional groundwater flow equation for a porous medium by using Finite difference method [3] .The code is based on the flow equation of Darcy and mass continuity equation.
In this study the groundwater flow was estimated using groundwater flow equation.
How data were acquired: Data was collected from different sectors listed below; -Department of water affairs (DWAF) -Wells data -National Groundwater Archive (NGA) -Wells data -South African weather services (SAWS) -Climatic data -Consultant's offices e.g (GRIP, S&W Limpopo, VSA, GPM and APHANE -CONSULTING) -Boreholes, Aquifers properties data -Geosciences department -Geological shapefiles, Landuse information Data format: Raw Software used for data analysis: ArcGIS and Microsoft Excel Software used for simulation: MODFLOW 20 0 0 Software used for programme coding:

MATLAB 2004a
Data accessibility: Data represented with the article Description of borehole data collection: Site was identified, site visit done for some of the boreholes and their geographic coordinates were recorded using a handheld global positioning system (GPS) device. Value of the data: • Generation of recharge field for stochastic analysis • The development of a distributed groundwater flow model for the study area aimed at providing information on the groundwater aquifer system as well as the water budget associated with it. • It also helped in understanding the behaviour of the groundwater system in response to stress due to groundwater abstractions. • The developed model can be used for management of groundwater through scenarios analysis and which will aid in decision making.
1 Methods details

Methodology on the development of a groundwater flow model
Development of steady -state, numerical model was done using [3] MODFLOW -20 0 0 [3] . The model was discretized into 117 columns and 164 rows. The grid cell size was 10 0 0 m in both the x and y-directions. Due to insufficient log data, a single model of confined aquifer layer 60 m thickness (obtained from Groundwater Resource Information Project (GRIP) dataset) was used to simulate flow and mass balance of the study area. The modelled domain covered an area of 19188 km 2 which was projected with UTM projection system. Fig. 1 a represent the discretization of the modelled area. During model development, the following data were used as input in MODFLOW; Elevation: The elevation data was gathered from elevation map ( Fig. 1 b) processed in ArcGIS. After importing the elevation, the model executed interpolation of the imported data through Neighbouring Interpolation technique.

Wells
Boreholes data was obtained from GRIP Limpopo and National Groundwater Archive (NGA). During this process, two types of wells data was prepared; (i) pumping wells and (ii) observation wells. A total of 203 boreholes were captured on the model ( Table 1 ) with only 100 randomly selected dataset of pumping wells shown in the Table 1 for clarity. Fig. 1 c shows the distribution of pumping wells in the study area. Observation wells were added in the model for the purpose of model calibration only. A total of 30 observation wells were used for this task ( Table 2 ). The choice of observation wells was based on (i) recent data availability (ii) completeness of the data and (iii) spatial distribution of the wells. Data processing for wells was done through ArcGIS using extraction and clipping on Geoprocessing tool and later imported in MODFLOW through the import function. Aquifer hydraulic conductivities: In MODFLOW, the hydrogeology properties of the aquifer use the following: (i) Conductivity (K x , K y , K z ) (ii) Storage (S y , S s ,P eff ,P tot ) (iii) Initial head and vadoze zone. The groundwater flow model requires property values of conductivity, storage and initial heads for each grid cell in order to run a flow simulation. The parameters of hydraulic conductivity, transmissivity and storage coefficient were obtained from the results of pumping test analysis. Hydraulic conductivity varied between 3.568 × 10 −3 to 30.78 m/d. The hydraulic conductivities were assigned in five distinct zones based on their geological formation and point hydraulic conductivities ( Fig. 1 d).
Static water levels records of the wells were used as initial heads. The initial heads were interpolated within the model to obtain the initial heads for the entire model ( Fig. 1 e). The observation heads were interpolated using the nearest Neighbour technique. Table 3 indicate the parameters used in the developed groundwater flow model.
Boundary conditions : Three types of boundary conditions were incorporated in this study which included; constant head, river and recharge. Dirichlet boundary conditions (fixed boundary condition) of the study area were used as a constant head. The constant head was allocated within a minimum distance of 2 km from pumping to boundary of the modelled aquifer. A boundary located at a distance of 2 km will ensure that effect of pumping from the nearest well will not be felt at such a distance.      River: MODFLOW simulated the interaction between the surface and groundwater via seepage layer separating the surface water from groundwater system. Four rivers were included in the model; Sandriver, Houtriver, Seepabana and Brak rivers ( Table 4 ). All rivers allocated within the model domain were imported from ArcGIS as shapefiles. Stage elevation, river depth and thickness data were obtained from Department of Water Affair (DWA). River bottom elevations were obtained through

Eqn 1 and 2
Where; C is Conductance value, L is length of a reach through a cell, W is Width of the river in the cell, M is thickness of the riverbed, K is the vertical hydraulic conductivity of the riverbed material.
Recharge Package (RCH): The study area recharge polygon shape file map was imported in MODFLOW under boundaries. At this point recharge was considered as certainty parameter for deterministic model. Fig. 1 f shows the boundary conditions used as input in to groundwater flow model (deterministic) where recharge is applicable.

Model simulation
After all the input were imported in the model, the model was then simulated using MODFLOW and Zone budget through the run engine ( Fig. 1 g). The simulated model aimed to produce results for zone budget and mass balance statistics. The model was run for one stress period for 360 days with maximum outer (MAXITER) iteration of 50 and maximum inner iterations (ITERI) of 25 ( Fig. 1 h). (This parameter provides an upper limit on the number of outer iterations to be performed. The maximum number of iterations will only be used if a convergent solution is not reached beforehand.)

Step 2: Model Calibration
A total of 30 observation wells from GRIP database were selected for the calibration of steady state model. In this study, the hydraulic conductivity was used as parameter of interest for calibration. The calibration was done using PEST module (the parameter estimation model developed by [2] Doherty  Fig. 1g. Display of Engine to run for MODFLOW and model inputs. [2] ) within MODFLOW. PEST applies the use of a nonlinear least-squares regression method for determination of model parameters while minimizing the following objective function ( Eq. 2 ): Where , is the objective function which in reality is sum of the squared weighted residuals, N is the number of measurements w i , is the weight for the i th measured quantity, and r i is the i th residual ( i th measured quantity -i th simulated quantity).
Five zones of hydraulic conductivities were used during the calibration process. PEST executed MODFLOW with the initial hydraulic conductivity's values of five zones and output of MODFLOW were compared with observed data. The evaluation of calibration was done by comparing predicted and observed heads through scatter plot and by applying the three common ways of error quantifying methods which are (i) Mean Error (ME) (ii) Mean absolute error (MAE) and (iii) Root Mean Square Error (RMSE) adopted from [ 1 ] Anderson and Woessner [1] .

Step 3: Model Validation
The model was validated using 30 boreholes ( Table 5 ). In order to validate the model, dataset of 2011 boreholes were selected. This is because most of the data needed for the model was available in this year. The preparation of validation wells was done using Excel and ArcGIS and later on imported to MODFLOW using Well import tool. The model performance statistics were evaluated using ME, MAE, RMSE and Normalised Root Square Mean (NRMS).
In summary the first section of the methodology was development of groundwater flow model (which is deterministic in nature). Thus in order to develop stochastic model, recharge uncertain parameter was used as input in deterministic model. The methodology on the development of a stochastic solution is presented below; PART 2: Methodology on the development of a stochastic solution   Sets of random numbers generated from this distribution were then used in groundwater flow simulation model (deterministic) to determine simulated hydraulic heads. The procedure was repeated by generating more sets of possible outputs up to 10 0 0 realizations. This process is a core procedure for Monte Carlo method. The randomly generated recharge grid cell was based on 117 × 164 rows and columns of the study area, resulting in 19188 model grid cells of recharge values. The expected mean recharge was based on 50, 10 0, 30 0, 50 0, 80 0 and 10 0 0 realizations. A series of Monte Carlo (MC) experiments was done in order to determine the number of realisations for which the residual mean was insensitive to increasing number of realisations used. A Monte Carlo (MC) technique was used to solve stochastic groundwater problem. The flow chart of stochastic solution methodology is shown in Fig. 2 a. For purposes of clarity on how uncertainty was evaluated in the simulation model, (see Eq. 5 ), assume recharge is a random variable denoted by μ, hence, by considering the source/sink term in Eq. (5) as being stochastic, Eq. (5) can be transformed into Eq. (6) which becomes stochastic because its solution depends on the outcomes of realisations of the random values of recharge (This study dealt with two dimensional therefore the K z dimension falls off). Thus, a large number of realisations of recharge were generated and Monte Carlo (MC) approach was used to solve the problem repeatedly using methodological steps shown in Fig. 2 a.
This study implies that the stochastic model can be used as an appropriate tool to manage the groundwater resources in the study area.

Declaration of Competing Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.