Reactive transport modelling of push-pull tests A versatile approach to quantify aquifer reactivity

Push-pull tests (PPTs) were evaluated with 1-D radially axisymmetric multi-component geochemical reactive transport modelling (RTM) to assess aquifer reactivity controlling groundwater quality. Nutrient fate and redox processes were investigated in an Aquifer Storage and Recovery (ASR) system, in which oxic tile drainage water (TDW; nitrate ~14 mg/L; phosphate ~17 mg/L) was stored in an anoxic aquifer for later re-use as irrigation water. During the PPTs, the ASR system did not operate. PPTs were performed in two monitoring wells (MW2, MW3), with 1 m well screens in contrasting geochemical formations at different depths. In these wells, 300 L TDW was injected, and consecutively 720 L was abstracted within 12 days, during which water quality changes were studied. The RTM simulated cation exchange, precipitation of Hydrous Ferric Oxides, Iron (III)-phosphate and Calcium-phosphate minerals, and surface complexation as equilibrium processes. Oxidation of Pyrite, soil organic matter, and dissolved ferrous iron were simulated with kinetic rate expressions. Oxygen (within 2 days) and nitrate (within 4 – 7 days) were fully reduced during the PPTs. The main reductants were ferrous iron (Monitoring Well (MW) 2: 2%, MW3: 13%), soil organic matter (MW2: 93%, MW3: 6%), and Pyrite (MW2: 5%, MW3: 81%). The intra aquifer differences in dominant reduction pathways are remarkable as higher reduction rates coincided with lower contents of soil organic matter and Pyrite, respectively. Phosphate was mostly re-abstracted (MW2: 73%, MW3: 64%) and partially immobilized due to precipitation of Iron-hydroxyphosphates (MW2: 4.6%, MW3: 35%), Hydroxyapatite (MW2: 23%, MW3: 0%), and to a lesser extent by surface complexation on various minerals (MW2-3: < 1%). The PPT-RTM approach enables a better understanding of reaction networks controlling water quality changes, and the reaction kinetics. PPT-RTM is a promising tool in exploratory studies or regular monitoring of water quality aspects of subsurface water technologies. This where in an aquifer during wet periods and abstracted during droughts for irrigation water use. PPTs were applied to 2 monitoring wells (MW2 and 3) with 1 m well screens in contrasting geochemical formations at different depths. The objective was to assess nutrient fate and redox processes in this aquifer in a period without ASR operation. PPT results showed relatively fast O 2 and NO 3 reduction and PO 4 immobilization in both monitoring wells. For each monitoring well, PPT results were simulated with a 1-D radial RTM using PHREEQC-3, to obtain information about the reaction networks related to the observed water quality changes. In MW2, 92% of injected O 2 and 34% of NO 3 was reduced. SOM reduced 93%, Pyrite 5% and Fe(II) oxidation 2% of O 2 and NO 3 . The aquifer was more reactive at the well screen depth of MW3, which resulted in 94% O 2 and 67% NO 3 reduction. Pyrite reduced 81% of O 2 and NO 3 , and SOM and Fe(II) oxidation contributed to 6% and 13% reduction, respectively. Reduction pathways vary remarkably in MW2 and 3. Higher SOM (MW2) and Pyrite oxidation (MW3) rates were observed where their contents were lower. PO 4 immobilization by Fe-hydroxyphosphate and Hydroxyapatite In MW2, 73% of the PO 4 abstracted during 23% immobilized by and MW3, 4


Introduction
Push-Pull Tests (PPTs) are performed to quantify in situ aquifer reactivity after introducing various reactants such as oxidants or organic pollutants (Istok et al., 1997). In the push-phase, a solution is injected comprising selected reactants and a conservative tracer in an aquifer by means of a groundwater well. Subsequently, injected water is gradually abstracted from the same well and water samples are frequently taken for chemical analysis during the pull-phase. Water composition changes during the PPT contain highly valuable information on reactive processes. PPTs are widely used to quantify aquifer reactivity in relation to oxidants (e.g., McGuire et al., 2002;Vandenbohede et al., 2008), nutrients (e.g., Boisson et al., 2013;Eschenbach et al., 2015), and trace metals (e.g., Radloff et al., 2017;Ziegler et al., 2015). PPTs have substantial advantages over laboratory batch or column experiments: (i) aquifer volume investigated is typically larger and therefore more representative, and (ii) there is less potential disturbance and contamination of aquifer materials (Istok, 2012).
In recent years, PPT results have been interpreted with various models. For example, in situ reaction rates, and retardation factors were estimated with simplified analytical models (e.g., Haggerty et al., 1998;Istok et al., 1997;Schroth et al., 2000). Furthermore, numerical models simulating groundwater flow and solute transport have been applied to fit reaction rate constants to observed data (Phanikumar and McGuire, 2010;Vandenbohede et al., 2008). Both model types calculate the best-fit of a single-process rate equation to the observed PPT results. Therefore, they cannot interpret coupled processes, such as aqueous equilibria, mineral dissolution/precipitation, cation exchange, surface complexation, and coupling specific oxidants to specific reductants.
Multi-component geochemical reactive transport models (RTMs) can simulate these coupled processes but have not yet been applied to simulate PPTs. They provide information about possible reaction networks, rates, and factors that control these rates. RTMs have successfully simulated aquifer reactivity in relation to, contamination by oxidants (Antoniou et al., 2013;Appelo et al., 1998), nutrients (Korom et al., 2012;Spiteri et al., 2007), and trace metals (Rahman et al., 2015;Wallis et al., 2011); subsurface water technologies such as Aquifer Thermal Energy Storage (ATES) (Bonte et al., 2014;Possemiers et al., 2016), Monitored Natural Attenuation (MNA) (Bessinger and Hennet, 2019;Thouement et al., 2019), and Managed Aquifer Recharge (MAR) (Antoniou et al., 2013;Zuurbier et al., 2016). For the current study, RTM advantages compared to the before mentioned models are, (i) the opportunity to couple and quantify O 2 and NO 3 consumption to their specific reductants, (ii) to assess PO 4 immobilization coupled to precipitation processes and surface complexation, and (iii) to address secondary effects of redox processes and mineral precipitation/dissolution on pH, surface complexation, cation exchange and solute concentrations.
In this study, we illustrate the advantages of interpretation and simulation of PPTs with RTM. The PPT-RTM approach was used to study nutrient fate in an agricultural ASR system in which tile drainage water (TDW) was injected, consisting of relatively high NO 3 and PO 4 concentrations. Nutrient fate during aquifer storage and recovery (ASR) needs to be understood: do they degrade/(im)mobilize, do they induce other desired/undesired processes, or can they be re-used in irrigation water? The objectives of this study are (i) the development of a RTM to simulate PPT data, and (ii) application of the PPT-RTM approach at two depths to assess and quantify nutrient fate within the aquifer. Finally, we evaluate the role of PPT-RTM in subsurface water technology research, such as MAR, ATES, and MNA.

Description of the ASR system
The research site (coordinates: 52.891224, 4.825781) is located approximately 1.3 km from the Wadden Sea, in a polder close to the village Breezand, in the Northwest of the Netherlands. The main land use in the polder is agriculture, more specificly flower bulb cultivation. The research set-up consists of an Aquifer Storage and Recovery (ASR) system and 5 monitoring wells (MW) (see Fig. 1).
The ASR system stores fresh tile drainage water (TDW) collected from a 2.3 ha parcel in a confined aquifer. Stored TDW can be abstracted for irrigation water use in dry summers. In February 2014, the system was taken into service, after which 5055-7455 m 3 TDW was injected per year for 2015-2016. This is 26-44% of local precipitation. The collected TDW origins from drainage pipes below the parcel. They all end up in a collection drain that discharges in a reservoir (volume is ~1 m 3 ), in which the Electrical Conductivity (EC) is continuously sensed as a measure of salinity. TDW with an EC > 1700 μS/cm is discharged to the surface water system. Suitable water is transported to a buffer tank (volume is ~32 m 3 ). As the waterlevel exceeds a threshold in the buffer tank, water is pumped to a slow sand filter (surface area = 32 m 2 , thickness = 1.59 m, grainsize = 1.0-1.8 mm, capacity = 240 m 3 /day), followed by a rapid sand filter (surface area = 0.30 m 2 , thickness = 0.69 m, capacity = 336 m 3 /day). Afterwards, the water is injected into the aquifer through all 4 ASR wells in equal proportions. These wells are constructed in one borehole, each with 4 m well screens, separated by a 1 m gap, together ranging from a depth of 11-30 m below ground surface (b.g.s.) (Fig. 2). Abstraction only occurred from the upper 3 wells, to prevent salinization. Injected and abstracted water volumes were monitored per well screen. At the time of the PPTs (June/July 2015), 11,200 m 3 TDW was injected and approximately 1100 m 3 abstracted.

Hydrogeology
Local hydrogeology has been derived from local bore hole sediment descriptions and the national database DINOloket (TNO-NITG). DINOloket estimates local hydrogeology by interpolating data from drilling descriptions and soundings around the target location (GEOTOP model). The top soil consists of an approximately 1 m sand layer, wherein the drainage pipes are situated. The deeper subsurface consists of a Holocene confining top layer till 7.5 m-b.g.s., consisting mostly of clay and small peat layers. Thereunder, a late Holocene and Pleistocene aquifer is situated, built up from various geological formations. With at the top the Boxtel formation reaching to about 20 m-b.g.s, in which MW1 is positioned. It is formed by mostly fine eolian and fluvial sands deposited from early Holocene till middle-Pleistocene (Schokker et al., 2005).

Fig. 1.
Site maps of the ASR system, showing the drained area of the agricultural field, the monitoring wells (MW 1-5) and the ASR system in the left panel. The two panels at the right show the ASR system region.
E. Kruisdijk and B.M. van Breukelen Below from 20 to 30 m-b.g.s, the Drenthe formation was formed by more coarse sands during the last glacial period of the middle-Pleistocene (Bakker et al., 2003). The well screen of MW2 is situated in this formation. MW3 is situated in the Urk formation which is observed underneath, consisting of mostly fluvial fine sands from about 30 until 45 m-b.g.s., which are deposited in the middle-Pleistocene (Bosch et al., 2003).

Groundwater and sediment sampling and analysis
Three monitoring wells (MW1, 2, and 3) were installed in a bailer drilled borehole at approximately 1 m from the ASR well (see Fig. 2). Before and during the PPTs, water samples were taken daily from these wells starting on 25 June 2015. Furthermore, water samples were taken outside the ASR system influence in MW4 and 5 at 07 July 2015, to characterize native original groundwater. These monitoring wells were unaffected by injected TDW, which can be concluded from stable and relatively high EC sensed by CTD divers (van Essen Instruments, the Netherlands) (see Appendix 1 (A.1)).

Deriving mineral contents from total element analysis
Various geochemical parameters were estimated from the total element analysis of the sediment samples. Total reactive iron (Fe TR ), Pyrite (FeS 2 ), Pyrite bound iron (Fe py ), and non-Pyrite reactive iron (Fe reac ) contents were obtained from S, Fe 2 O 3 , and Al 2 O 3 contents. This method has been succesfully used for Dutch aquifer sediments in previous studies (Bonte et al., 2013;Griffioen et al., 2012;Zuurbier et al., 2016).
Total reactive iron was calculated using the following empirical equation: where Fe TR is total reactive iron (% d.w.), M i is the molecular weight of i (g/mol), and Fe 2 O 3 and Al 2 O 3 are the total Fe and Al content, respectively (% d.w.). The Fe content is assumed to be partly fixed in low reactive silicate structures (Canfield et al., 1992). This is adressed in an empirical relation, where silicate-bound Fe 2 O 3 amounts to approximately 22.5% of total Al 2 O 3 content (Dellwig et al., 2001(Dellwig et al., , 2002Huisman and Kiden, 1998). Pyrite and Pyrite bound iron contents were calculated from the total S content: where S is the sulphur content (% d.w.). The total S content of the sediment samples is assumed to originate from the mineral Pyrite, and thus not from organic S, Gypsum, or other Fe-sulphide minerals. This is justified as: (i) observed reaction stoichiometry during incubation experiments indicate that Pyrite is prevalently present as reductant in various Dutch sediments (Hartog et al., 2002(Hartog et al., , 2005van Helvoort et al., 2005), and (ii) field studies on S speciation show that iron sulphide minerals other than Pyrite are minor in various kinds of sedimentary groundwater settings (Bates et al., 1998;Chambers and Pederson, 2006;Jakobsen and Cold, 2007;Massmann et al., 2004;Schwientek et al., 2008). Total non-Pyrite reactive Fe (Fe reac ) can be determined by subtracting the Pyrite-bound Fe (Fe py ) from the total reactive Fe (Fe TR ) (equation (4)). The calculated Fe reac mainly relates to Hydrous Ferric Oxides.

Cation exchange capacity
The cation exchange capacity (CEC) was calculated with an empirical formula (Appelo and Postma, 2005), which resulted in satisfying results for Dutch sediments in previous studies (Karlsen et al., 2012;Zuurbier et al., 2016). This formula relates CEC to clay and organic carbon content: where % clay is the fraction of the grain size distribution <2 μm and %C is the organic carbon content.

Push-pull tests (PPTs)
Push-Pull Tests (PPT) involve injecting water of known chemical composition through a well screen into an aquifer, followed by gradual abstraction of this water, during which water samples are taken to assess water quality (Fig. 3) (Istok et al., 1997). Initial groundwater samples were taken before the start of the PPTs. PPTs were conducted in MW1, 2, and 3, from which the top of the 1 m well screens are at − 11.0, − 22.5, and − 33.8 m-b.g.s, respectively. During the PPTs, the ASR system was not operating.
TDW was collected from the ASR buffertank (see Section 2.1) and was stored in a 500 L tank. After adding the conservative tracer Br (as NaBr; final concentration 35-40 mg/L), water was manually mixed with a pole. Storage time in the 500 L tank was under an hour, after which injection started. Approximately 300 L TDW was injected (push-phase) with a steady flow rate of approximately 2 L/min, monitored with a flow and volume meter. During injection, we took 4 water samples of the injection water at the exit of the 500 L tank, to verify that the tracer was well mixed within the TDW. In the abstraction phase (pull-phase), water samples were collected after abstracting 60 L water with a 1-2 L/min flow rate every 24 h during 12 days. The abstracted 60 L ensured that water residing in the aquifer was sampled as the maximum dead volume of the wells was ̴ 37 L. To recover most of the injected water, 2.5 times the injected volume was abstracted (720 L).
At the shallowest well screen (MW1), injected water was poorly retrieved during the abstraction phase, as demonstrated by the low tracer recovery (see A.6). We assume that injected water drifted away by unexpectedly high groundwater flow, which did not allow for further data interpretation and modelling. Therefore, the PPT at MW1 will not be discussed further in this article. applied to the well screen. RTMs were developed using PHREEQC (version 3.4.5;Parkhurst and Appelo (2013)). The WATEQ4F database was used for equilibrium constants for, acid-base, mineral dissolution and precipitation, cation exchange, and surface complexation reactions (Ball and Nordstrom, 1991). PPTs were conceptualized as axisymmetric one-dimensional flow paths (see A.2.1 for rationale). The radial axisymmetric one-dimensional flow was simulated by cells of varying length (Antoniou et al., 2013;Appelo and Postma, 2005;Bonte et al., 2014;Rahman et al., 2015). Each cell has the same volume, but due to radial flow cell lengths decrease further away from the injection/abstraction point. The flow path length is based on the maximum injected water radius during the PPT, which is approximately 1.0 m, assuming a 300 L injection volume, a porosity of 0.3, horizontal flow, a homogeneous aquifer, and neglecting dispersion. This flow path was divided in 50 cells. The length of the first model cell was calculated according to equation (6) (Appelo and Postma, 2005), and the successive cell lengths were calculated according to equation (7), where n tot is the total amount of cells and n is the cell number. A part of the solutes transports further into the aquifer due to dispersion. To enable simulation of dispersion, additional cells were added to the RTMs in three steps. First, the longitudinal dispersivity was determined per RTM, by automatic parameter optimization (see Section 2.5.4) in a conservative transport model version with 300 cells. Second, the obtained longitudinal dispersivity was utilized in the RTMs and its results were used to assess the influence of dispersion. Cells in which the influence of dispersion was smaller than 1‰ (meaning that less than 1‰ of the injected water reached those cells) were removed from the model, to optimize run time. Third, one extra cell was added at the flow path start, which simulates the non-reactive monitoring well and gravel pack. No chemical processes were set to occur in this cell. A small cell length (0.001 m) was appointed to this cell, so that the impact on dispersion during the simulation was negligible. The final RTM cell length of MW2 was 121 cells, and of MW3 148 cells.
In both RTMs, the push-phase was simulated with 50 forward shifts of 0.00208 days (51 together with the forward shift to the non-reactive first cell). During each shift, advection is simulated by moving the solution in each cell to the downstream neighbouring cell. Dispersion is simulated afterwards by mixing the solutions contained in neighbouring cells in certain proportions. For model simplicity, the 12 day pull-phase was simulated as a continuous abstraction phase with a low steady flow, instead of the actual abstraction of max. 1 h with a high flow rate followed by a stagnant phase for the remainder of each day. This simplification did not result in significantly different RTM outcomes (A.9). The pull-phase was simulated with 120 shifts in backward direction with a time step of 0.1 day.

Injection and initial groundwater composition adopted in RTM.
Injection water composition (as applied in PHREEQC) was determined by averaging the four samples taken during injection. Concentration deviations over time were less than 5% for all solutes compared to the mean concentration, except for the low Fe(II) and NH 4 concentrations (maximum deviation 20%) (see A.2.2). The last PPT sample composition (almost entirely initial groundwater) was selected to represent initial groundwater in the RTMs, instead of the initial groundwater composition for reasons explained in Appendix Section A.2.3. Fig. 4 presents a simplified conceptual reaction network of the RTMs. Cation-exchange reactions and pH effects are not visualized, and only a simplified version of surface complexation is presented.

Overall conceptual hydrogeochemical model and implemented reaction network
Oxic TDW (containing O 2 and NO 3 ) is injected in an anoxic aquifer containing various reductants that may subsequently oxidize. Aerobic respiration and denitrification are processes known to occur in aquifers by oxidation of organic matter, Pyrite, and dissolved Fe(II) (Antoniou et al., 2013;Griffioen et al., 2012). In the RTMs, oxidation of dissolved Fe(II) was only assumed by O 2 (see Section 2.5.3). Dissolved Fe(II) oxidation results in Fe(III), which will quickly form Fe(III)-precipitates under circumneutral pH conditions. Senn et al. (2015) investigated the interdependent effects of PO 4 , silicate, and Ca on the composition and structure of Fe(III)-precipitates. They proposed that Fe(III)-precipitates should be described as a mixture of three types, whose proportions depend on formation conditions and physicochemical properties of the precipitates: (i) amorphous (Ca-)Fe (III)-phosphate precipitates with varying compositions, which were simulated in the RTM as the minerals Fe-hydroxyphosphate (Fe 2.5 PO 4 (OH) 4.5 ) and Hydroxyapatite (HAP; Ca 5 (PO 4 ) 3 OH); (ii) Fe-hydroxides, which was modelled as amorphous Fe(OH) 3 ; and (iii) poorly-crystalline lepidocrocite (FeO(OH)) and Goethite (FeO(OH)), which were simulated in the RTM as Goethite. Ca-and Fe(III)-precipitate hydroxide groups function as surface complexation sites which sorb ions such as PO 4 (Dzombak and Morel, 1990).

Chemical processes
2.5.3.1. Redox reactions. Pyrite, SOM, and dissolved ferrous iron oxidation were simulated as kinetically controlled processes. Abiotic Pyrite oxidation by O 2 was simulated using the Williamson and Rimstidt (1994) rate equation. This equation was extended with the oxidation by NO 3 , using the modifications from Eckert and Appelo (2002): where A/V is the initial surface area to solution volume ratio (m 2 L − 1 ), m/m0 is a factor which accounts for initial surface area changes resulting from the progressing reaction, m i is the concentration of i (mol L − 1 ), and f is a factor that is assumed to be 1, but which could be decreased to fit lower denitrification rates. The term ( 1− ΩFeS 51 ) accounts for possible dissolution or precipitation in the absence of oxidants, where Ω FeS is the saturation ratio for Pyrite and the factor 51 is used to obtain a smooth transition. Biological SOM oxidation was simulated using a Monod type reaction from Van Cappellen and Gaillard (1996): where m i is the concentration of i (mol L − 1 ), ) SOM is the current SOM content divided by the initial content, r max(i) is the maximum rate constant of i (d − 1 ), k i is the half-saturation constant, corresponding to the concentration of i which is equivalent to 0.5 r max(i) (mol L − 1 ). The term kO 2 in kO 2 in +mO 2 was included in this reaction to prohibit NO 3 reduction if O 2 is available, where k O2 in = k O2 as suggested by Van Cappellen and Gaillard (1996). This reaction only simulates SOM oxidation related to aerobic respiration and denitrification. SO 4 and Fe(III) reduction were assumed to be insignificant concerning the relatively short time span of the PPTs, and as the oxidized conditions impeded their occurrence. A WATEQ4F database modification was required to simulate homogeneous ferrous iron oxidation by O 2 to ferric iron. Ferrous and ferric iron valence states were decoupled, as was successfully performed before by Antoniou et al. (2013) and Rahman et al. (2015). This process was simulated kinetically using the rate expression from Singer and Stumm (1970): where k Fe is the rate constant, [OH − ] represents the OH − activity, P O2 is the O 2 partial pressure, and m Fe 2+ is the ferrous iron concentration (mol L − 1 ). A rate constant (k Fe ) was used of 2 × 10 13 M − 2 atm − 1 min − 1 , obtained from Davison and Seed (1983). This universal rate constant can be used in natural freshwaters with a pH range between 6.5 and 7.5, and a temperature range between 5 and 35 • C. Smith et al. (2017) demonstrated that anoxic nitrate-dependent iron oxidation can occur in groundwater. Nevertheless, it was not simulated in this study. We obtained satisfying fits by only simulating ferrous iron oxidation by O 2 , which indicates that simulation of ferrous iron oxidation by NO 3 does not impact iron concentrations significantly. Heterogeneous oxidation of adsorbed Fe(II) was simulated kinetically in an exploratory run using the rate equation provided by Tamura et al. (1976). Effects on Fe(II) concentrations were negligible (see A.3) and therefore the process was excluded from the RTM.

Mineral precipitation.
Hydrous Ferric Oxide (HFO) precipitation is a fast reaction and was therefore simulated as an equilibrium process, as performed before by Antoniou et al. (2013) and Rahman et al. (2015). In the RTMs, HFO were divided in three groups according to Section 2.5.2.: freshly precipitated amorphous Fe(OH) 3 ; amorphous (Ca-)Fe(II)-phosphate precipitates, simulated as freshly precipitated Fe-hydroxyphosphate (Fe 2.5 PO 4 (OH) 4.5 ) and HAP (Ca 5 (PO 4 ) 3 OH); and initially present aged crystalline Goethite minerals (FeO(OH)). Amorphous Fe(OH) 3 precipitation/dissolution was simulated in equilibrium using the chemical reaction below from the WATEQ4F database, with a log K of 4.891: Fe-hydroxyphosphate precipitation/dissolution was simulated using the reaction proposed by Luedecke et al. (1989), with a log K of − 96.7: HAP precipitation shown in equation (13) with a log K of − 3.421, was modelled as a kinetic reaction: To our knowledge, rate equations for HAP precipitation in aquifers are not available. Therefore, the rate was modelled in the simplest way, as the product of the observed rate constant and the saturation state minus 1, as performed before by Nancollas (1979) and van Breukelen et al. (2004): where R is the precipitation rate, K obs is the observed rate constant and Ω is the saturation state. Crystalline iron oxides initially present in the aquifer were simulated as Goethite minerals. The Fe reac content calculated with equation (4) was kept constant during the simulations.

Surface complexation.
Surface complexation of Ca, Mg, Mn(II), Fe(II), HCO 3 , PO 4 , SO 4 , and F was modelled on amorphous Fe(OH) 3 , Goethite, Fe-hydroxyphosphate, and HAP. An extensive surface complexation model and associated database (electrostatic diffuse double layer model) is available on HFO (Dzombak and Morel, 1990), which is included in the WATEQ4F database. A surface area of 600 m 2 /g, site densities of 0.2 mol weak sites/mol, and 0.005 mol strong sites/mol were used for amorphous Fe(OH) 3 . The same model and database were also applied for surface complexation to Goethite, as executed before by Appelo et al. (2002); Bonte et al. (2014); Dixit and Hering (2003). Goethite is less reactive than HFO, therefore a lower surface area and site densities were used, which resulted in successful simulations in previous studies (Rahman et al., 2015;Stollenwerk et al., 2007). A surface area of 2.89 m 2 /g, and site densities of 1.02 × 10 − 4 mol weak sites per mol, and 2.55 × 10 − 6 mol strong sites per mol were adopted from Stollenwerk et al. (2007).
Unfortunately, such a database is not available for Fehydroxyphosphate and HAP. These minerals were simulated adopting the same surface area and site densities as for amorphous Fe(OH) 3 . Therefore, the WATEQ4F database needed slight adjustments. The surface complexation part in the WATEQ4F database was copied, and amorphous Fe(OH) 3 was replaced with the respective mineral names. Fe-hydroxyphosphate and HAP minerals were assumed not initially present.
In parameter optimization, weights appointed to observations play an important role. They were determined based on a method proposed by Hill (1998). A 5% accuracy was expected for the measured solutes. Therefore, the weights were calculated as: where w i is the weight of observation i, and C i is the concentration of observation i. Surface complexation, cation exchange, and Fe(II) oxidation parameters were not optimized, and instead adopted from the WATEQ4F database or literature. Parameter optimization was performed for each process in individual PEST runs. First, the dispersivity coefficient was optimized by fitting the RTMs to the observed tracer concentrations. Second, the initial surface area to solution volume ratio (A/V) for Pyrite oxidation was optimized by fitting the model to observed SO 4 concentrations. Pyrite oxidation produces SO 4 , which can be used to quantify the oxidation rate (Korom et al., 2012). SO 4 was assumed to behave conservatively after production, because (i) SO 4 reduction is unlikely in the short time span and the oxidized conditions, (ii) surface complexation effects on SO 4 concentrations are little (see A.5), and (iii) SO 4 minerals like gypsum were under saturated and therefore not present in the aquifer. Third, the r max(i) and k i (where i = O 2 and NO 3 ) terms of SOM oxidation were fit on observed O 2 and NO 3 concentrations.
For MW2, the last step was to optimize the HAP precipitation rate. Multiple RTM runs were performed manually (without PEST) with varying precipitation rate constants (K obs : 0-1 × 10 − 8 mol/year), after which the best fit was visually examined after plotting RTM results.

Nutrient and redox mass balances
Two mass balances were made to identify governing processes controlling O 2 , NO 3 , and PO 4 fate. The first, for O 2 and NO 3 , depends on various redox processes. The second, for PO 4 , depends mostly on precipitation and surface complexation processes. Compound masses going in, going out, and remaining in the aquifer were determined based on the RTM results. Masses going in the aquifer were determined by multiplying compound concentration with the injection water volume during the push-phase. Masses going out of the aquifer were obtained by multiplying compound concentrations departing the aquifer during the pull-phase with the cell volume, after which all masses of the time steps were summed. These solute concentrations were obtained from the first non-reactive model cell during simulation and adjusted for initial groundwater concentrations. During the pull-phase, 420 L initial groundwater was abstracted as part of the total volume of 720 L. This initial groundwater volume was multiplied with initial groundwater concentrations and subtracted of the total abstracted compound masses. Masses remaining in the aquifer were determined from geochemical content changes, e.g., of Pyrite, SOM, Fe-hydroxyphosphate, and the sorbed PO 4 on minerals by surface complexation. Initial content was deducted from the final content after RTM simulation for every cell, after which these content changes were summed.
An additional step was performed for the O 2 and NO 3 mass balance. Obtained masses were multiplied with the potential release or uptake of electrons, yielding electron mass balances. The number of electrons involved were obtained from the reaction equations in the WATEQ4F database (Table 1). Table 2 presents the composition of injected tile drainage water (TDW), (initial) groundwater at the start of the push-pull tests (PPTs), and native groundwater composition not influenced by the aquifer storage and recovery (ASR) system. TDW injected during the two PPTs was relatively fresh (EC = 1130-1180 μS/cm) and contained relatively high nutrient concentrations (NO 3 : 13.5-14.2 mg/L, PO 4 : 17.0-17.1 mg/L, SO 4 : 112-113 mg/L, K: 59.5-61.2 mg/L), which probably originate from agricultural fertilizers. The pH (=7.9) was relatively high compared to the native and initial groundwater. As and Ni concentrations were elevated compared to native groundwater. These compounds can originate from, (i) the natural Dutch subsurface, which contains these trace metals (Huisman et al., 1997) and TDW injection may have induced mobilization, and/or (ii) phosphate and organic fertilizers, which often contain trace metals (Atafar et al., 2008;Jiao et al., 2012). The redox state was oxic, as dissolved O 2 and NO 3 is present. Consequently, relatively low Fe(II) and Mn(II) concentrations were observed. Native groundwater was analysed from 2 monitoring wells (MW4 and 5) that were not influenced by the ASR system. Groundwater was relatively fresh at MW4 (EC = 2720 μS/cm) and more brackish deeper in the aquifer at MW5 (EC = 9360 μS/cm), which indicates a salinity stratification from fresh to more saline with depth. The groundwater redox state is deeply anoxic, with a combination of Fe(III) reducing, as shown by the presence of Fe(II); SO 4 reducing, as SO 4 concentrations are near zero, while the groundwater originates from sea water with higher Cl/SO 4 ratios; and possibly methanogenic conditions. ASR injected water influenced the groundwater composition at MW1-3 before the start of the PPTs. At MW1-2, groundwater had a TDW signature, with lower EC and higher pH and nutrient concentrations (NO 3 , PO 4 , SO 4 , and K) than the native groundwater. The redox state is anoxic, which implies that O 2 and NO 3 present in the injected TDW has been reduced. MW3 has a comparable EC to MW5, which are located roughly at the same depth. This indicates that ASR injected water did not influence the groundwater composition at this depth, because the top of the well screen is located approximately 3.5m below the ASR well screens. Nonetheless, slight TDW influences can be seen in MW3, especially by higher nutrient concentrations (NO 3 : 1.64 mg/L, PO 4 : 2.24 mg/L, K: 92.7 mg/L). Table 3 presents the geochemical aquifer characteristics at the well screen depths of MW2 and 3, and the mean regional contents determined by the Geological Survey of the Netherlands (TNO) (Klein et al., 2015). TNO investigated geochemical characteristics of the first tens of meters subsurface in the Western part of the Netherlands (provinces Noord-and Zuid-Holland), based on 47 drillings and 1191 soil samples. As the aquifer studied consists of sandy sediments, mean contents were determined based on 617 soil samples of the lithology sand (here referred to as "mean regional contents"). More detailed information on specific geological formations have not been published.

Aquifer geochemistry
Geochemical contents are the same order of magnitude as the mean regional contents for MW2-3. MW2 has a lower clay and SOM content, and therefore also a lower CEC compared to MW3 and the mean regional contents. Contrary, MW3 has a relative high CEC, compared to the mean regional content. The SOM content of MW3 is below the mean regional content. A ten-fold higher Pyrite content is observed in MW2 compared to MW3. The Pyrite content in MW2 is relatively high and in MW3 relatively low compared to the mean regional content. Carbonate contents of MW2 (5.4% d.w.), MW3 (6.7% d.w.), and the mean regional contents (4.0% d.w.) are in about the same range. Reactive Fe content in MW2 and MW3 is relatively high compared to the mean regional contents.

Results PPT-RTM
Figs. 5 and 7 present the PPT results and the associated RTM simulations for the well screen depths of MW2 and 3, respectively. The conservative tracer (Br) was fully recovered at both PPTs. Model results show generally an acceptable fit with the observed concentrations. Simulated and observed bromide concentrations match relatively well, which means that appropriate dispersivity coefficients are applied. The PPTs timescale is sufficiently long to study nutrient fate, as shown by the significant different trends of observed reactant concentrations compared to Br concentrations. Altogether, the reaction network for MW2 and MW3 seems well described, as for all solutes a sufficient fit was observed. An overview of all model parameters is shown in A.7.

Results PPT-RTM MW2
PPT results at MW2 show O 2 and NO 3 concentrations decreasing with time, compared to their conservative mixing concentrations. Injected O 2 is fully reduced after 2 days, and NO 3 after ~7 days. Simulated O 2 reduction fits observed concentrations relatively well. Observed and simulated NO 3 concentrations display a slower reduction rate in the first 3 days than afterwards, which reflects that O 2 reduction was more favourable during this period. Remarkably, observed concentrations show that NO 3 reduces simultaneously with O 2 . In previous studies, similar results were observed, which were attributed to, (i) grain-scale aquifer heterogeneity resulting in different redox regimes in pore spaces (Jakobsen, 2007;Jakobsen and Postma, 1999), and (ii) aerobic denitrification by bacterial communities (Marchant et al., 2017). Simultaneous O 2 and NO 3 reduction simulation was not possible with the PHREEQC database used. Therefore, simulated NO 3

E. Kruisdijk and B.M. van Breukelen
concentrations declined after O 2 consumption, somewhat overshooting the first observations. Note that NO 2 is formed and quickly reduced. O 2 and NO 3 reduction was mostly resulting from SOM oxidation. Pyrite oxidation and Fe(II) oxidation influences were only minor. Observed SO 4 concentrations indicate that mostly dispersion controls concentration variations, albeit the slight concentration increase observed in the first 3 water samples. This could not be explained by Pyrite oxidation, as this would have resulted in continuous elevated SO 4 concentrations instead of only at the first three days. Furthermore, higher Pyrite oxidation rates would have resulted in higher Fe(II) concentrations. Observed Fe(II) concentrations are lower than the conservative mixing concentrations and are increasing slowly, pointing to Fe (II) oxidation. An appropriate Fe(II) oxidation rate was used as shown by the satisfying fit between simulated and observed Fe(II) concentrations (except for the last sample). Subsurface iron removal (SIR) is suggested by the low Fe(II) concentrations during the abstraction of solely groundwater towards the end of the PPT. Introduced O 2 reacts with dissolved and desorbed Fe(II) whereby fresh Hydrous Ferric Oxides (HFO) is produced, which can sorb additional Fe(II) from initial groundwater during the pull-phase. The pH drops below the conservative pH in the first 3 days, which is mostly resulting from O 2 reduction. Simulated pH fits the observed well.
Observed PO 4 concentrations are lowered compared to conservative mixing concentrations. The largest PO 4 decrease occurs within the first day. PO 4 concentrations approach initial groundwater concentrations after approximately 4 days and stay stable afterwards. Notice that PO 4 concentrations never decrease below initial groundwater concentrations. Simulated PO 4 concentrations fit the observed concentrations well for the first 2 days, after which the simulated concentrations are higher than the observed till day 8. This may point to stronger PO 4 sorption than modelled. The equilibrium constants of surface complexation reactions are often calibrated in field studies by adjusting their values (e. g., Rahman et al., 2015). However, increasing surface complexation constants of PO 4 did not result in better fits (results not shown). Hydroxyapatite precipitation (HAP: Ca 5 (PO 4 ) 3 OH) is the main process immobilizing PO 4 . Observed Hydroxyapatite SIs are supersaturated during the whole PPT, with decreasing SIs from the start. They show a comparable trend as pH, which could be explained by the pH dependency of the reaction (equation (13)). An acceptable fit was obtained for HAP during the first 2 days, after which simulated are slightly higher than observed SIs till day 6.
Observed alkalinity concentrations show slightly lowered concentrations compared to conservative mixing concentrations during the first 5 days of the PPT, afterwards concentrations are slightly higher. Simulated alkalinity concentrations are in the same range, but do not perfectly follow this trend. Observed Ca and NH 4 concentrations show a similar trend. They show a mostly conservative behaviour till the bromide dispersion front, thereafter concentrations decrease below conservative mixing concentrations and increase later on to reach initial groundwater concentrations. Observed NH 4 and Ca trends could not be explained by cation-exchange processes in the RTM. Fig. 6 shows observed concentrations for four trace metals during the PPT in MW2. Mn concentrations are mostly below conservative mixing concentrations, which could indicate Mn(II) oxidation. Although, observed concentrations are scattered and show no clear trend. Pyrite oxidation can result in increasing As, Ni, Zn concentrations, as these trace metals may have co-precipitated during formation (e.g., Larsen and Postma, 1997;Stuyfzand, 1998;Zhang et al., 2009). During this PPT, only Ni shows unambiguously elevated concentrations. As concentrations show mostly lowered concentrations during the first six Table 2 Concentration ranges of tile drainage water (TDW) injected during the PPTs; the initial groundwater composition at the start of the PPTs in monitoring wells MW1, 2, and 3; and the native groundwater composition not influenced by the ASR system in MW4 and 5.  Table 3 Geochemical aquifer properties at the well screen depths of MW2-3. The last column shows the mean regional geochemical contents in Western Netherland (provinces Noord-and Zuid-Holland) for the lithology sand at the first tens of meters of the subsurface (cf. days, and afterwards slightly increased concentrations compared to the conservative mixing concentrations. Possibly, As could have been sorbed to freshly precipitated minerals during the PPT. Increased concentrations afterwards could have been caused by displacement from sorption sites by competing anions in initial groundwater (Wallis et al., 2011).

Results PPT-RTM MW3
The PPT at MW3 shows similar trends for most solutes compared to MW2. Concentration differences are larger between TDW and initial groundwater for some parameters (particularly for Ca, Fe, NH 4 , pH, and SI HAP), due to more saline conditions of the initial groundwater at MW3. Simulated and observed Br concentrations do fit slightly poorer compared to MW2. Contrarily, simulated SO 4 , Ca, Fe, and NH 4 concentrations show a remarkably better fit to the observed concentrations.
O 2 reduced within ~2 days, similar as in the PPT at MW2. Noteworthy, NO 3 reduction was faster at this depth. It reduced in ~4 days, while this took about 7 days at the PPT at MW2. The observed NO 3 trend contrasts compared to MW2, as a fast reduction rate is observed directly from the start of the PPT. As a result of this faster rate, NO 2 concentrations were twice higher compared to MW2. Simulated O 2 and NO 3 concentrations fit the observed concentrations well. O 2 and NO 3 reduction is mainly caused by Pyrite oxidation, and to lesser extent by SOM and Fe(II) oxidation. Pyrite oxidation was indicated by an increase of observed SO 4 concentrations, which showed a satisfying fit with the simulated concentrations. PO 4 concentrations show a similar trend as observed during the PPT at MW2. Observed concentrations approach initial groundwater concentrations after approximately 9 days, instead of about 4 days at MW2. Simulated PO 4 concentrations fit the first observations fit relatively well and the observations afterwards poorer. Similar to MW2, the fit did not improve by increasing surface complexation constants of PO 4 (results Fig. 5. PPT in MW 2 (shallow) -observed, conservative mixing, and simulated solute concentrations during the pull-phase of the PPT. Blue dots are the PPT observations, green dashed lines the simulated concentrations in the case only dispersion happened but reactions do not occur (here referred to as "conservative mixing concentrations"), and red lines are the final mode results. The dotted purple line shows the simulated concentrations simulated with the parameters used for the RTM of MW3. Concentrations of initial groundwater and TDW are indicated with horizontal dashed black and cyan lines, respectively. Concentrations are plotted against time after injection and the ratio between the volume abstracted and the total volume injected (Vabs/Vinj). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.) not shown). Observed HAP SIs were supersaturated before and during the mixing front but reach equilibrium afterwards. Simulated HAP SIs did fit observed concentrations well, without simulating HAP precipitation. Remarkably, Fe(III)-hydroxyphosphate precipitation is the main process causing the lowered PO 4 concentrations, instead of HAP precipitation at MW2.
Simulated alkalinity concentrations show a slightly better fit compared to MW2, although the trend is similar: before the mixing front simulated concentrations are often higher than observed, and after the mixing front lower. Simulated pH fits observed concentrations well, except of one outlier after ±4 days. Ca and NH 4 show a more conservative breakthrough curve compared to MW2, probably because of the larger concentration differences between the injected and initial groundwater.
Observed Mn(II) concentrations are slightly lower during the mixing front than the conservative mixing concentrations, indicating Mn(II) oxidation (Fig. 8). As, Ni, and Zn seem to respond similarly at MW3 and MW2. However, Ni concentrations increase less at MW3 compared to MW2 (MW2: max ±0.35 μmol/L (≈21 μg/L); MW3: max ±0.075 μmol/L (≈4.4 μg/L)). Table 4 shows PEST optimized and adopted values of various parameters of the RTMs, and a literature range of parameter values. The two RTMs showed notable contrasts between Pyrite and SOM oxidation parameters, which were optimized with PEST. The Pyrite oxidation term (A/V) was fit to SO 4 concentrations. In both PPTs, a slight increase (±0.1 mmol/L) of SO 4 concentrations was observed in the first 3 observations. Nevertheless, a relatively low (A/V) term was obtained for the RTM of MW2 compared to a high term for MW3. This contrast resulted from SO 4 concentrations continuously exceeding the conservative mixing concentrations in MW3, compared to only the first 3 observations in MW2. Contrarily, average SOM oxidation values were obtained compared to the literature range for the RTM of MW2, and low for MW3. Parameter optimization for HAP precipitation resulted in a K obs of 2.8 ×10 − 11 for the RTM of MW2. In MW3, adding HAP precipitation did not result in a better model fit, therefore this process was not further considered.

Model results and discussion: aerobic respiration and denitrification
Aerobic respiration and denitrification are coupled redox processes, which means that O 2 or NO 3 reduction will only occur when there is a reductant available. SOM, Pyrite, and Fe(II) were the simulated reductants in the RTMs. Electron mass balances were made to obtain quantitative insight in the most important reductants related to O 2 and NO 3 reduction. At MW2, 92% of O 2 and 34% of NO 3 was reduced, compared to 94% of O 2 and 67% of NO 3 at MW3. The part not reduced was retrieved in the abstracted water. Fig. 9 displays electron mass balances of O 2 and NO 3 coupled to Pyrite, SOM, and Fe(II). The accepted electrons by O 2 reduction are almost identical for MW2 and 3, as similar TDW was used and almost all O 2 was reduced. For NO 3 , less electrons were accepted at MW2, as less reduction occurred during this PPT.
At MW2, the most important electron donor for O 2 and NO 3 reduction was SOM. SOM reduced 93% of the O 2 and NO 3 , during this PPT. Pyrite and Fe(II) oxidation reduced only 5% and 2%, respectively. At MW3, pyrite oxidation reduced most of the O 2 and NO 3 (81%). SOM and Fe(II) oxidation were responsible for only 6% and 13% of the reduction, respectively. MW3 showed more Fe(II) oxidation than MW2, which resulted from higher Fe(II) concentrations in the initial groundwater and a higher Pyrite oxidation rate. Table 5 shows an overview of first-order aerobic respiration and denitrification rate constants obtained in this study compared to previous studies. First-order rate constants were calculated with the wellmixed reactor model by Haggerty et al. (1998). Rate constants were estimated based on observations with a maximum mixing ratio of 30% TDW and 70% groundwater. The reliability of aerobic respiration rates is less compared to denitrification rates, as only 2 or 3 observations could be used (A.8). Aerobic respiration rate constants were relatively low in comparison to previous studies. These studies were mostly determined in contaminated aquifers, except for the rate constant determined by Vandenbohede et al. (2008). Intermediate denitrification rate constants were obtained in comparison to previous studies. Literature studies show rate constants down to 10-1000x smaller and up to 10x larger. The large range of aerobic respiration and denitrification rates is probably caused by factors such as hydrogeological aquifer properties, pH, microbial activity, and the abundance and reactivity of electron donors (Einsiedl and Mayer, 2006;Korom, 1992).  (4.6%). PO 4 immobilization by surface complexation occurred only slightly on HAP and Fe-hydroxyphosphate precipitates, respectively 0.60% and 0.35%. In MW3, abstracted PO 4 is similarly the main PO 4 (out) component (64%). PO 4 immobilization processes are notably different than in MW2. Fe-hydroxyphosphate precipitation is the main cause of PO 4 immobilization (35%). Furthermore, Fe-hydroxyphosphate precipitates are also the main component of surface complexation (0.87%). On other minerals, surface complexation of PO 4 was smaller than 0.1%.

Model results and discussion: phosphate immobilization
Main processes sequestering PO 4 differ in both aquifer layers. HAP precipitation was only simulated in MW2, because adding this process to MW3 resulted in poorer HAP SI fits. It was ambiguous why HAP precipitation only occurred at MW2 as the HAP SIs at both locations were similar. In the RTM, we assumed that the initial HAP content was 0. Nevertheless, the initial groundwater SI at MW2 was supersaturated for HAP (SI = 3.2), which could indicate that HAP was initially present in the aquifer. This could explain HAP precipitation at MW2, as minerals do not often form by spontaneous formation from solution but mostly on pre-existing surfaces (Appelo and Postma, 2005). In MW3, the main process for PO 4 immobilization is Fe-hydroxyphosphate precipitation. This process occurred more strongly at MW3 as, Fe(II) was more available, due to the higher Fe(II) concentrations in initial groundwater and due to more Pyrite oxidation.

Contrasts between biogeochemical reactions at MW2 and MW3
PPTs were performed in two different geological formations, with different groundwater compositions (Table 2), geochemical characteristics (Table 3), and at different depths in relation to the ASR system (Fig. 2). Model parameter sets used in the RTMs of MW2 and 3 are therefore significantly different (see A.7). Two model runs were performed, where kinetic parameters of MW2 were used for the PPT Fig. 7. PPT in MW 3 (deep) -observed, conservative mixing, and simulated solute concentrations during the pull-phase of the PPT. Blue dots are the PPT observations, green dashed lines the simulated concentrations in the case only dispersion happened but reactions do not occur (here referred to as "conservative mixing concentrations"), and red lines are the final mode results. The dotted purple line shows the simulated concentrations simulated with the parameters used for the RTM of MW2. Concentrations of initial groundwater and TDW are indicated with horizontal dashed black and cyan lines, respectively. Concentrations are plotted against time after injection and the ratio between the volume abstracted and the total volume injected (Vabs/Vinj). (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.) simulation at MW3, and vice versa. Kinetic parameters of MW3 used for the PPT simulation at MW2 resulted in faster NO 3 reduction, due to higher Pyrite oxidation rates. Pyrite oxidation also resulted in an increase of SO 4 and Fe(II) concentrations, which did not correspond to the observed concentrations. Additionally, this parameter set resulted in poorer PO 4 , pH, and SI HAP fits. Using the MW2 parameters for the MW3 simulation (Fig. 7) resulted in acceptable fits for O 2 and NO 3 , but lower Pyrite oxidation rates resulted in poor fits for SO 4 and Fe(II) concentrations. Other solutes showed relatively sufficient fits. This shows that the different parameter sets are not exchangeable at the different well screen depths, and that intra aquifer variations require different parameter sets for an appropriate simulation.
At MW2, ten-fold higher Pyrite contents did remarkably not result in more Pyrite oxidation compared to MW3. Variations between the A/V terms at MW2 and MW3 illustrate that surface area is a more important factor than content in controlling Pyrite oxidation rates. Surface areas are difficult to estimate from sediment samples and can vary multiple orders of magnitude (Beckingham et al., 2016). Optimized A/V terms are within the literature range for each RTM, as shown in Table 4. Descourvieres et al. (2010b) deduced even larger A/V term variations within one aquifer, although the sediment samples were recovered from a wider range of depths (190-530 m-b.g.s.). They observed that higher A/V terms correlated with finer sediments. This corresponds with our findings, as higher A/V terms were observed in the finer sediments of MW3.  (9) 1.6 × 10 − 9 -1.2 × 10 − 4 ( 1,2,5,7,8,9) k O2 mol L − 1 1.0 × 10 − 5 2.9 × 10 − 4 (9) 1.0 × 10 − 6 -2.9 × 10 − 4 (1, 2,5,8,9) r max(NO3) s − 1 1.0 × 10 − 9 1.7 × 10 − 11 (9) 1.7 × 10 − 11 -1.2 × 10 − 4 (1, 2,5,7,8,9)   Similarly, SOM oxidation rates were lower in MW3, despite the 2.5x higher SOM contents compared to MW2. This implies that SOM content is not the most important parameter for SOM reactivity in this aquifer. Massmann et al. (2004) studied redox processes in an aquifer and similarly concluded that SOM oxidation rates are defined by its reactivity rather than its content. Middelburg (1989) observed 8 orders of magnitude variation for first-order SOM decay rate constants in marine sediments, which displays the large variation possible in reactivity. SOM reactivity in marine sediments has been widely studied, but less is known about aquifer sediments. Nevertheless, Postma et al. (1991) stated that similar variations can be expected. SOM origin and composition influences largely its reactivity (Kristensen and Holmer, 2001), but also the extent of past aerobic oxidation (Hartog et al., 2004). SOM depositional environments vary significantly at the well screen depths of MW2 (glacial deposits) and MW3 (fluvial deposits), which makes it probable that the past aerobic oxidation extent, origin, and composition of SOM vary significantly too. Another influence on SOM reactivity could be the extent of exposure to ASR injected water. As elaborated in Section 3.1, the aquifer at the well screen depth of MW2 is influenced more by the ASR system than at MW3. SOM oxidation is a biological process, which could be enhanced by ASR injected water consisting of nutrients and bacteria.

PPT-RTM for exploration and monitoring of subsurface water technologies
The PPT-RTM approach is useful to obtain insights in aquifer reactivity with respect to subsurface water technologies (SWTs). It can be used in support of, or as an alternative for, full-scale monitoring (e.g., Antoniou et al. (2013);Zuurbier et al. (2016)), laboratory incubation experiments (e.g., Descourvieres et al. (2010a); Hartog et al. (2002)), or surface area characterization of minerals (Beckingham et al., 2016). A PPT and a SWT differ in the spatial and temporal scales of application. As shown in this study, aquifer heterogeneity resulted in different PPT outcomes at different aquifer depths. Water quality insights at SWT scale can be obtained by performing PPTs at multiple depths, as performed in this study. PPT results can be extrapolated to SWT scale assuming limited heterogeneity in longitudinal direction. Furthermore, the temporal scale of a PPT is in the order of days-weeks, while an SWT system is constructed to operate for many years. The information gained from a PPT thus represents a snapshot of aquifer reactivity. PPTs are ideally repeated during SWT operation to obtain insights in evolution of chemical and biological processes.
Regular full-scale SWT monitoring data can be challenging to interpret as temporal water quality variations in observation wells can relate to spatial variations in groundwater chemistry, or the (highly) variable composition of infiltrated water. PPT-RTM simplifies interpretation, as the injected solution composition is known and mixing between the injected water and initial groundwater can be assessed using a conservative tracer.

Conclusion
We proposed a versatile approach to assess in-situ aquifer reactivity, which combines Push-Pull Tests (PPTs) with Reactive Transport Table 5 Overview of first-order degradation rate constants observed for aerobic respiration and denitrification in several studies (e.g. the review papers of McGuire et al. (2002) and Korom (1992)). Empty cells indicate that data was not available. A factor 2 was assumed to convert total organic carbon to SOM when needed (Pribyl, 2010). Furthermore, pyrite contents were calculated from total S by equation (2) Korom et al. (2012) 0.00049-0.0031 0.034-0.10 0.36-0.47 Sand and gravel Kölle et al. (1985) and Böttcher et al. (1989) 0.0013-0.0023 Sand and gravelly sand Cunningham et al. (2000) 0.1-0.6 Silty fine sand; Contaminated with hydrocarbons this study 2.5-3.8 0.26-0.63 0.4-1.0 0.05-0.53 Fine to coarse sands Schroth et al. (1998) 3.6-40 2.2-10.1 Clayey silt and silt; Petroleum contaminated McGuire et al. (2002) 14.4 5.0-7.4 Sand; Contaminated with BTEX and chlorinated solvents Vandenbohede et al. (2008) 8.8 18 Fine sand Fig. 10. PO 4 mass balance for the PPTs at MW2 and 3, where PO 4 (in) is the total injected PO 4 mass during the PPT, PO 4 (out) is the retrieved PO 4 during the pullphase and the precipitated or sorbed PO 4 within the aquifer. SC in the legend is an abbreviation for surface complexation, and (a) Fe(OH) 3 of amorphous Fe(OH) 3 .
Modelling (RTM). This method was performed at an agricultural Aquifer Storage and Recovery (ASR) site, where nutrient rich tile drainage water (TDW) is injected in an aquifer during wet periods and abstracted during droughts for irrigation water use. PPTs were applied to 2 monitoring wells (MW2 and 3) with 1 m well screens in contrasting geochemical formations at different depths. The objective was to assess nutrient fate and redox processes in this aquifer in a period without ASR operation. PPT results showed relatively fast O 2 and NO 3 reduction and PO 4 immobilization in both monitoring wells. For each monitoring well, PPT results were simulated with a 1-D radial RTM using PHREEQC-3, to obtain information about the reaction networks related to the observed water quality changes. In MW2, 92% of injected O 2 and 34% of NO 3 was reduced. SOM reduced 93%, Pyrite 5% and Fe(II) oxidation 2% of O 2 and NO 3 . The aquifer was more reactive at the well screen depth of MW3, which resulted in 94% O 2 and 67% NO 3 reduction. Pyrite reduced 81% of O 2 and NO 3 , and SOM and Fe(II) oxidation contributed to 6% and 13% reduction, respectively. Reduction pathways vary remarkably in MW2 and 3. Higher SOM (MW2) and Pyrite oxidation (MW3) rates were observed where their contents were lower. PO 4 immobilization was mainly induced by Fe-hydroxyphosphate and Hydroxyapatite precipitation. In MW2, 73% of the injected PO 4 was abstracted during the pull-phase, 23% was immobilized by HAP precipitation and 4.6% by Fehydroxyphosphate precipitation. In MW3, the main PO 4 immobilization process was Fe-hydroxyphosphate precipitation, which immobilized 35% of injected PO 4 . Surface complexation on Fe-hydroxyphosphates and Goethite contributed to less than 1% of PO 4 immobilization and 64% of injected PO 4 did not immobilize and was abstracted. The PPT-RTM approach resulted in a better fundamental understanding of geochemical processes that determine aquifer reactivity. Insights were gained about linking oxidants to specific reductants, and PO 4 immobilization to precipitation and surface complexation processes.

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.