An integrated framework for surface deformation modelling and induced seismicity forecasting due to reservoir operations

: Induced seismicity and surface deformation are common observable manifestations of the geome-chanical effect of reservoir operations whether related to geothermal energy production, gas extraction or the storage of carbon dioxide, gas, air or hydrogen. Modelling tools to quantitatively predict surface deformation and seismicity based on operation data could thus help manage such reservoirs. To that effect, we present an integrated and modular modelling framework which combines reservoir modelling, geomechanical modelling and earthquake forecasting. To allow effective computational cost, we assume vertical ﬂ ow equilibrium, semi-analytical Green ’ s functions to calculate surface deformation and poroelastic stresses and a simple earthquake nucleation model based on Coulomb stress changes. We use the test case of the Groningen gas ﬁ eld in the Netherlands to validate the modelling framework and assess its usefulness for reservoir management. For this application, given the relative simplicity of this sandstone reservoir, we assume homogeneous porosity and permeability and single-phase ﬂ ow. The model ﬁ ts the measured pressure well, yielding a root mean square error (RMSE) of 0.95 MPa, and the seismicity observations as well. The pressure residuals show, however, a systematic increase with time that probably re ﬂ ects groundwater ingression into the depleted reservoir. The interaction with groundwater could be accounted for by implementing a multiphase-ﬂ ow vertical ﬂ ow equilibrium (VFE) model.This is probablythe major factor that limits the generalapplicability of the modelling framework.Nevertheless,hemodelledsubsidenceandseismicity ﬁ tverywellthehistoricalobservationsinthecaseof the Groningen gas ﬁ eld.

The increasing demand for energy and the need to mitigate the impact on climate is driving various industry operations that involve either injecting or extracting fluids from the sub-surface. These operations include the storage of carbon dioxide, air, gas or hydrogen, gas extraction or geothermal energy production. They imply pressure changes and geomechanical deformation which can lead to measurable surface displacements and seismicity (Rutqvist et al. 2016). There is now abundant literature on this topic. For example, Vasco et al. (2018) report surface deformation and seismicity at sites of carbon dioxide injection in Algeria; Williams-Stroud et al. (2020) report seismicity triggered by CO 2 injection at the Illinois Basin -Decatur Project (IBDP); Li et al. (2021) document surface deformation and seismicity induced by groundwater extraction from a sedimentary aquifer at the Raft River geothermal field; and Shirzaei et al. (2016) document surface deformation and seismicity induced by wastewater injection in Texas. Seismicity is a concern because of the hazard posed to infrastructures and residents, but also because it could jeopardize the mechanical integrity of the reservoir in case of caprock fracturing. Surface deformation might or might not be a major liability, but it can be in any case a valuable source of information about pressure changes in the reservoir. For these reasons, there is most value in computationally effective methods to relate reservoir operations (well flow rates and pressures) to surface deformation and seismicity.
A number of studies have proposed computationally efficient methods for either reservoir modelling (Cowton et al. 2018;Jenkins et al. 2019), geomechanical modelling (Kuvshinov 2008;Bourne and Oates 2017;van Wees et al. 2019;Jansen and Meulenbroek 2022) or seismicity modelling (Dieterich 1994). Along these lines, we present here a computationally efficient modelling framework which consists of different modules: a simplified reservoir model based on the vertical flow equilibrium (VFE) approximation, a Green's function approach to calculate poroelastic stress changes and surface subsidence and a simple earthquake nucleation model to relate stress changes to seismicity. We use the well-documented example of the Groningen gas field in the Netherlands (Fig. 1), where gas extraction has caused measurable subsidence since the 1960s and induced seismicity since the 1990s (Bourne et al. 2014), to assess the modelling framework. The region experienced initially small magnitude events (M , 2.5) ( Fig. 1b) but stronger and more frequent seismic events after 2004, culminating with a M 3.6 event in 2012, which caused public alarm. The authorities then requested a reduction of production and a plan for complete shut down by 2030 (Fig. 1c). Production went from 53.8 billion cubic metres (BCM) in 2012 to about 20 BCM in 2018. The governmental plan was revised later and now demands a complete shutdown by 2023, and production levels not exceeding about 4 BCM per year. The seismic hazard concern prompted large efforts to monitor the seismicity and surface deformation, and public release of detailed information on the reservoir characteristics. There is therefore a wealth of information publicly available on this reservoir (Burkitov et al. 2016; de Jager and Visser 2017;Dost et al. 2017) and it has been used as a test case in a number of previous studies of surface subsidence and induced seismicity (Bourne et al. 2014(Bourne et al. , 2018Bourne and Oates 2017;Buijze et al. 2017Buijze et al. , 2019Dempsey and Suckale 2017;Candela et al. 2019;Smith et al. 2019;van Wees et al. 2019;Richter et al. 2020;Heimisson et al. 2021).
Here, we first present briefly the setting of the Groningen gas field. We then describe the reservoir and the geomechanical and seismicity models integrated in our workflow as different modules. We next demonstrate the performance of the integrated model in the case of the Groningen gas field. We use our workflow to forecast the geomechanical effects and induced seismicity for future production scenarios. We consider the cold winter production scenario (Nederlandse Aardolie Maatschappij (NAM) 2013) from the end of 2016 to 2030, a shut-in scenario with arrest of the production at the end of 2016 and, as a thought experiment, a cold winter scenario with pressure management. In the last section, we discuss the limitations of our modelling approach and the possible bias introduced by the simplifying assumptions made in our application to the Groningen gas field. We note that the Groningen example is a relatively favourable case because of the rather simple geometry and the relatively homogeneous properties of the sandstone unit that forms the gas reservoir. We discuss possible improvements that would make the framework more generally applicable.

Setting of the Groningen gas field
The Groningen gas field was discovered in 1959 and has been in production since 1962 (Bourne et al. 2014). It extends approximately 35 km east-west and 50 km north-south. The reservoir is located in the Upper Permian Rotlingend formation, a fluvial-aeolian sequence with interbedded sandstone, conglomerate and clay units. It was deposited in the Permian in a rift basin. The facies distribution shows a proximal to distal trend with dominantly conglomerate in the south, sandstones in the centre of the reservoir and clay in the north (Stäuble and Milius 1970;de Jager and Visser 2017).
The reservoir lies at a depth of about 3000 m and dips by about 3°northwards, corresponding to c. 600 m deepening over its 40 km extent. The thickness of the reservoir increases from 90 m in the SE to 300 m in the NW. The reservoir is sealed by an overlying thick and impermeable layer of evaporite and anhydrite. This caprock formation comprises a 50 m-thick basal anhydrite and 200 m to 1 km-thick evaporite with disconnected anhydrite lenses. The reservoir is structurally controlled by normal faults and closed by an aquifer in the north (de Jager and Visser 2017) (Fig. 1a).
The initial gas reserve was estimated to be 2913 BCM (Burkitov et al. 2016) and about 2200 BCM had been produced as of May 2017. The reservoir is layered (Burkitov et al. 2016), with the free gas layer on top of the water interface. Due to the northern dip of the reservoir, the water-gas contact is responsible for the north boundary (Burkitov et al. 2016). Because of its limited connection to groundwater, gas extraction has led to a significant pressure drop. This is concordant with the small amount of water extracted (Burkitov et al. 2016).
The reservoir has a permeability ranging from tens of millidarcies (1 mD = 9.869233 × 10 −16 m 2 ) up to a few darcies, with an average value of 260 mD (3.55 × 10 −13 m 2 ) and higher values in the centre of the reservoir (Burkitov et al. 2016). The porosity ranges from 10 to 25% with a mean value of 17%, and a similar spatial distribution with larger values near the centre of the reservoir (Burkitov et al. 2016). The initial pressure of the reservoir was about 34.68 MPa, close to hydrostatic (Burkitov et al. 2016). The geothermal gradient is estimated to 27°C km −1 so that the reservoir temperature ranges from 80 to 120°C with a mean value of 102°C (de Jager and Visser 2017). The gas is composed of 14% nitrogen, 1% carbon dioxide (CO 2 ) and the rest is mainly methane (CH 4 ) (Stäuble and Milius 1970;Burkitov et al. 2016). It can therefore be considered as a dry gas (Yang 2016), and was modelled that way in the 2012 Groningen Field Review. It was, however, modelled as a wet gas in the 2015 revision because of the condensed water dissolved in the gas (Burkitov et al. 2016).

Reservoir modelling
State-of-the-art reservoir models can simulate fluid flow in 3D and account for the two-way interaction between fluid flow and poroelastic deformation of the reservoir (Jha and Juanes 2014). These fully coupled reservoir models have been used in a number of studies to investigate the mechanics of induced seismicity (Juanes et al. 2016;Byrne et al. 2020;Kroll et al. 2020). However, they require substantial computations, which make it challenging to perform the large number of realizations needed to match observations or to make data-driven predictions with account for uncertainties. To circumvent this issue, analytical solutions of linear poroelasticity (Wang 2000) can be used in ideal cases where the flow can be approximated by diffusion in 2D or 3D in a unbounded, homogeneous and isotropic medium (Zhai and Shirzaei 2018;Zhai et al. 2019). This approach, however, is limited to single-phase flow and is not suited to model fluid flow within a reservoir of finite dimension with complex geometry. In this study, we adopt an alternative strategy that consists in reducing the model dimension by assuming VFE (Coats et al. 1971). In the sub-sections below, we first describe briefly the industry reservoir model, which we used to benchmark our model, and then provide details on our implementation of the VFE model and history matching.

Industry high-resolution pressure depletion model
The current reference reservoir model for Groningen is MoReS (modular reservoir simulator), which has been used for business purposes and risk assessment (NAM 2013). It accounts for the detailed geometry of the reservoir, which was determined based on seismic reflection and seismic refraction data (Burkitov et al. 2016): shape, faults, thickness and depth. The model ignores poroelastic coupling but can be used to predict poroelastic deformation. The watergas interaction is represented using a pressurevolume-temperature (PVT) two-phase fluid flow model. The model depends on 96 adjustable parameters. These parameters were optimized through history matching using the wells' flow rates and the borehole's pressure measurements (NAM 2013). However, this procedure is computationally expensive, requiring access to high performance computing facilities and hundreds of computational hours for a single history-matched model, only returning the optimal solution without quantification of uncertainties. We use MoReS to benchmark our simplified reservoir model.

Simplifying assumptions
We aim at a reservoir model that can be used to forecast seismicity, with quantification of the uncertainties resulting from matching both the reservoir data and the seismicity observations. This objective requires a computationally effective workflow so that methods such as the Markov chain Monte Carlo algorithm can be used for parameter estimation. This is challenging due to the non-linear physics describing fluid flow in a porous medium, especially in the case of multiple-phase flow or in presence of reservoir heterogeneities. We therefore make three major simplifications. First, we assume vertical flow equilibrium, which leads to a 2D instead of a 3D calculation (Coats et al. 1971). Second, we ignore spatial and temporal variations of the reservoir properties. Third, we assume only one phase. The single-phase approximation does not allow one to account for the response of the aquifers to the drop in gas pressure in the reservoir, or for flow from the over-and underburden.
The reservoir is considered planar with a spatial extent identical to that used in MoReS (Burkitov et al. 2016). This reservoir's footprint is clearly consistent with the observed pattern of surface subsidence (Smith et al. 2019) (Fig. 1a). We assume no flow at the boundaries. The trapping of gas over a geological time supports this hypothesis regarding possible leakage of gas from the reservoir as long as the mechanical integrity of the sealing formation is preserved. However, it ignores possible influx of groundwater in the depleted reservoir. This boundary condition is probably realistic for the eastern, western and southern boundaries, which are fault bounded. It is more questionable for the northern boundary, which is controlled by an aquifer. Given the small dip of the caprock, the reservoir is assumed horizontal.
We simulate pressure diffusion in the reservoir assuming vertical flow equilibrium (Coats et al. 1971). Pressure diffusion in 3D is approximated with a 2D calculation whereby only the verticallyintegrated pressure and flow are solved for. This method has been used to model pore pressure diffusion or gravity driven flow of CO 2 (Cowton et al. 2018) and can be extended to model multiphase flow (Jenkins et al. 2019). The VFE is valid if the ratio of the horizontal diffusion time over the vertical diffusion time is typically larger than 10 (Yortsos 1995). Expressed as a function of the thickness Δz, the horizontal extent Δx and the vertical and horizontal permeabilities k z and k x , this ratio is written as: In the Groningen reservoir case, permeability can be assumed to be isotropic due to the dominantly conglomeratic and sandstone lithology. With k z = k x , Δz of up 300 m and Δx between 35 and 50 km, we find R L . 117, so the condition for the validity of the VFE approximation is met.
The reservoir is assumed to be homogeneous and entirely connected. The assumption of homogeneous permeability and porosity could be relaxed, as we are using a finite element method, but seems a reasonable first-order approximation for the case of the Groningen gas reservoir given the dominantly conglomeratic and sandstone lithology. Some areas near the southern and western edges of the reservoir have, however, a pressure history that suggests poor hydraulic connection with the main part of the reservoir (Burkitov et al. 2016). In addition, comparison of the pressure history at the various wells suggests some heterogeneity of transport properties.
Finally, the gas might in fact occupy only a fraction of the total volume of the reservoir due to residual water trapped in wetting films and bridges in the pore network (Jenkins et al. 2019). Following Jenkins et al. (2019), we therefore include a parameter, the gas saturation, to account for this effect. Note that influx of groundwater or displacement of the boundary between the gas and the groundwater could bias the estimated gas saturation, which would then result in a possible incorrect, apparently time-dependent, estimate of the real volume of the gas reservoir.
With these simplifying assumptions, our model depends only on three adjustable parameters: permeability, porosity and gas saturation. They are assumed uniform in space and constant in time. We thus neglect the effect of sediment facies variation in space, that porosity and permeability probably decrease with time as the reservoir is compacting and that the gas saturation could be changing due to possible aquifer intrusion into the depleting reservoir. We therefore expect these factors to limit the ability of our model to match observations. The three unknowns are solved for through history matching.

Governing equations
The governing equations are derived from mass conservation and the balance of linear momentum for fluid flow in a porous medium (De Marsily 1986). Let us first ignore the effect of gas saturation. The mass conservation equation is written as: Where ϕ is the porosity, between 0 and 1, ρ is the density of the fluid, u is the fluid velocity and q is the source term representing injection or extraction of fluid. Darcy's law (Darcy 1856) is written as: Where μ is the fluid dynamic viscosity and p the pressure. Combining equations (2) and (3), and ignoring the gravity effect thanks to the vertical flow equilibrium assumption, yields: The development of equation (4) relating each term to the pressure gives: ϕ dρ dp dp dt + ρ dϕ dp dp dt Assuming that the compressibility of the solid grains is at least one order of magnitude lower than the compressibility of the bulk matrix (β s ,, β m ), and that the regional stress has been constant (Birdsell et al. 2018) during the exploitation of the reservoir, we get: We can now write equation (5) using equation (6): The matrix compressibility for the Groningen reservoir is estimated to β m ≈ 1 -10 × 10 −11 Pa −1 (Burkitov et al. 2016; van Eijs and van der Wal 2017). The fluid density is given by the equation of state (Yang 2016): Where P is pressure, M is molar weight of the gas, R is the gas constant, T is temperature and Z is compressibility factor, between 0 and 1. The compressibility factor also depends on the temperature, pressure and composition, and can either be calculated using a polynomial function or extracted from charts. The gas extracted from the Groningen field is composed of mainly methane (CH 4 , 85%), nitrogen (N, 14%) and carbon dioxide (CO 2 , 1%). The molar weight used in this study is the mean value over the six PVT zones considered in MoReS (Burkitov et al. 2016) M = 18.3815 g · mol −1 . For methane at a temperature of 385 K and pressure between 5 and 40 MPa, the Z-factor varies between 0.95 and 1. For simplicity, it is assumed to be constant and equal to 1. The term dρ / dp on the left-hand side of the equation is then a constant: A comparison of the time-dependent terms indicates that the second term of the left-hand side (1 − ϕ)ρβ m (dp/dt) can be neglected because of the compressibility term, which is extremely low and therefore (1 − ϕ)ρβ m (dp/dt) ≈ 8 · 10 −9 ,, ϕ(dρ/ dp)(dp/dt) ≈ 9 · 10 −7 This means that equation (7) can be simplified to: ϕ dρ dp dp dt The source term, representing the flow rates at the wells, is given in kg m −3 s −1 . It is converted to a two-dimensional source term in kg m −2 s −1 by dividing it by the local thickness of the reservoir, Δz: Where Q is the source term in kg m −3 s −1 and corresponds to the extracted flux. The wells are considered as point sources and the area is taken to be 1 m 2 . This implies that we assume an equal extraction rate from all depths within the thickness of the reservoir, an assumption consistent with the vertical flow equilibrium hypothesis. Taking the gas saturation into account, the differential equation governing pressure diffusion with the VFE assumption is then reduced to: Where ∇ = (∂/∂x) + (∂/∂y). The equation is solved using the open-source finite element solver FEniCS (Alnaes et al. 2015) with an implicit Euler method for time discretization using a time step of 1 month. The source terms are then monthly averaged extraction rates. The equation solves for the pressure at each time step given the extraction history. The viscosity and density are computed using the formulation given by Yang (2016): see equation (8) and μ = 10 −4 K exp (Xρ y g ), which is the empirical formula of Lee et al. (1966) and is also used in the MoRES model (Burkitov et al. 2016). Because our VFE model assumes only one fluid phase, the only two non-linearities left in the flow simulator are due to the pressure dependencies of the gas density and the gas viscosity. An alternative way to overcome the former would be to use a pseudo-pressure formulation (Al-Hussainy et al. 1966;Hagoort 1988). This approach requires just a single pre-simulation run to determine the pseudo properties, and thereafter allows for a linear near-steady-state simulation that exactly maintains the non-linear relationship between pressure and density. This approach has been applied to the Groningen gas reservoir with success (Postma and Jansen 2018). The non-linear effect due to the dependence of viscosity on pressure is probably not important for Groningen. This source of non-linearity is not a significant computational hurdle as we are using an empirical analytical expression. It is therefore included to keep the model applicable to settings where this effect might be more significant.

Pressure and extraction history matching
History matching consists of adjusting the three parameters characterizing the reservoir (porosity, permeability and gas saturation) to best fit the pressure measurements given the production flow rates. All the wells are fitted simultaneously so that we directly estimate the model parameters at the entire field scale. Bottom well pressure is assumed to differ from wellhead pressure only due to the hydrostatic effect caused by the weight of the fluid column.
We minimize the misfit between the modelled and the measured pressure at the boreholes, consisting of 1186 static measurements between 1957 and 2017 across 29 different locations. Given that the computation is very fast for a single forward model, we use a simple three-dimensional bruteforce grid search. This method makes it straightforward to estimate the uncertainties on the model parameters. An alternative method would be to use a Markov chain Monte Carlo method. The minimum, maximum and separation values between grid points describing the searched space of model parameters are given in Table 1. The grid search leads to a total of 12 400 simulations of pressure depletion models. The fit is quantified using a simple root mean square error (RMSE). The reported pressure measurements do not have uncertainties associated with them, so we give equal weight to all the measurements.
The best-fitting model yields pressure histories that are remarkably consistent with the observations (Fig. 2). Figure 3 shows the residuals from the bestfitting VFE model and from MoReS as a function of time. The best-fitting VFE model corresponds to a permeability of 3.1 + 0.68 × 10 −13 m −2 and a porosity of 18.5 + 6.5%. These values are consistent with the average permeability (3.55 × 10 −13 m −2 ) and porosity (17%) reported by Burkitov et al. (2016). Our best-fitting estimate of the gas saturation (27 + 2.4%) also falls in the range of 26 to 0.35% reported by Burkitov et al. (2016). The best VFE model yields a RMSE of 0.956 MPa compared to 0.548 MPa for MoReS (Fig. 3). This is a remarkably good fit given that the VFE model has only three adjustable parameters compared to 96 for MoReS. Not surprisingly, the distribution of residuals in space shows larger misfits obtained with the VFE model. We note, in particular, larger misfits and larger differences between the VFE and MoReS model predictions in the southwestern area of the reservoir. These misfits might reflect that this area might in fact be poorly connected to the main reservoir or might have properties different from the main part of the reservoir. We also note a north-south gradient in the comparison between the MoReS and the VFE model, which is probably due to the fact that our model ignores the interaction with the aquifer at the northern boundary of the reservoir. Another most obvious difference is the drift of the VFE residuals to larger values starting in the 1990s. By contrast, no such drift is visible in the MoReS residuals. The drift is probably due to the influx of groundwater from the aquifer bounding the reservoir to the north, or possibly also from the over-and underburden. This effect is modelled in MoReS but ignored in our VFE model. The VFE can be tweaked to account for this effect in an ad hoc way by allowing for variations of the gas saturation with time. In effect, the drift of the residuals is suppressed if the gas saturation increases from 0.31 to 0.36 between 1992 and 2016. We did not follow that route to keep the model as simple as possible, and also because it is not physical and would not be applicable moving forward beyond the historical period. Altogether, despite the drastic simplifications made (VFE, homogeneous properties, single-phase flow, no influx of groundwater), the best-fitting model yields a pressure depletion history remarkably close to the pressure evolution predicted by MoReS. We estimate the uncertainties on the VFE model parameters using chi-square statistics. We assume that the model is well specified and that the residuals are dominated by model errors, in particular because of the assumption of homogeneous reservoir properties. We assume that measurements from one single well have a correlated model error, and that measurements made at different wells are independent. We choose a confidence level of 95%. Given the number of model parameters (three) and the number of wells (29), the 95% confidence domain on the model parameters is given by all the parameter sets yielding a RMSE of less than 1.09 MPa. The uncertainties on each model parameter are derived from the corresponding marginal distributions (Table 1). We note, however, that the residuals are not normally distributed (Fig. 3). The choice of minimizing the RMSE could therefore be questioned. The framework could allow the implementation of a more sophisticated method of uncertainty quantification that would provide the complete probability distribution of model parameters taking into account a priori knowledge of the model parameters and the fit to the observations. This development is, however, deferred to future studies. We conducted out-of-sampling tests to assess the quality of the calibration and the sensitivity to the choice of the training period. We vary the training period and use the remaining data for validation ( Table 2). The RMSEs for the training period are consistently between 0.858 and 0.957 MPa. The RMSEs for the validation period are larger, between 1.08 and 1.84 MPa. Both the training and validation RMSEs degrade with time. We interpret this as the result of the assumption of constant gas saturation for the whole period. As explained above, a better fit can be obtained by allowing variations of the gas saturation to increase with time. This approach is not adopted here as it cannot be used to predict the future evolution. In that case, a multiphase-flow Prediction of pressure evolution for future production scenarios Our VFE model can then be used to forecast the pressure evolution in time and space for various hypothetical production scenarios. We consider three scenarios: (1) the shut-in scenario assumes a sudden arrest of production at the end of 2016; (2) the cold  winter production scenario (NAM 2013) is based on the estimated gas demand in the case of cold winters starting from January 2017, and assumes shut-in by 2030; (3) the cold winter production scenario with pressure management. This third scenario is also based on the cold winter production scenario (NAM 2013) but we consider in addition that production is compensated by injection so that the net volume of fluid in the reservoir is kept constant from the beginning of 2017. In practice, pressure management would imply the injection of a fluid with different properties than the gas present in the reservoir. We ignore this and therefore cannot solve for the relative distribution of the gas and the injected fluid as this would require a multiphase-flow model. The single-phase assumption is, however, a valid approximation to predict the pressure evolution (Oldenburg et al. 2001). For consistency with a preexisting pressure management scenario (Overleggroep Groningen 2022), we assume gas extraction at an arbitrarily chosen set of wells located in the portion of the gas field compensated by injection at the other wells (see locations of extraction and injection wells in the inset of Fig. 4). We acknowledge that this simulation is not very realistic as the injected fluid is assumed to have the same properties as the extracted fluids given that our VFE model considers only one phase. Using the history-matched vertical flow equilibrium model, we can forecast the pressure depletion for each of these scenarios taking into account the uncertainties on the model parameters.
For each scenario, we store the time-dependent pressure field of all models with parameters lying within the 95% confidence domain derived from the history matching. We use this ensemble of pressure field predictions to forecast subsidence and seismicity as described in the next two sections.

Geomechanical modelling and surface subsidence
Surface subsidence over the Groningen gas field has been well documented with different geodetic and remote sensing techniques including optical levelling, persistent scatterer interferometric synthetic aperture radar (PS-InSAR) and continuous global positioning system (cGPS). Smith et al. (2019) combined all these data to describe the evolution of surface subsidence and the related reservoir compaction from the start of gas production until 2017. Here, we show that the VFE model predicts a reservoir compaction consistent with the measured surface subsidence (Fig. 5). For a given distribution of pressure change within the reservoir since the onset of production, the surface displacement can be estimated assuming poroelastic compaction of the reservoir. Given the relatively shallow depth of the reservoir compared to its lateral extent, strain can be assumed uniaxial and vertical. The uniaxial compaction due to pressure depletion then only depends on the uniaxial compressibility of the reservoir (Geertsma 1973) according to: Where Δh is the compaction of the reservoir, C m the uniaxial compressibility, ΔP the pressure drop and h the reservoir thickness. Fig. 4. Mean reservoir pressure evolution predicted with the vertical flow equilibrium (VFE) model for the shut-in scenario, the cold winter production scenario and the cold winter production with pressure management scenario. Inset shows the distribution of extraction and injection wells in the pressure management scenario. The lines and shaded areas show the prediction from the best VFE fitting model obtained from history matching and the associated 88% confidence interval assuming shut-in (red), cold winter (blue) and cold winter with pressure management (green) scenarios. The purple line shows the mean reservoir pressure from modular reservoir simulator (MoReS). The vertical dashed line marks the transition from history matching to forecasting in 2016.

Surface deformation and induced seismicity forecasting
The deforming reservoir might be represented as a series of point sources of strain (Candela et al. 2019;van Wees et al. 2019). This approach is efficient as the Green's functions are analytical. It allows one to calculate strain and stress changes in the 3D volume and can feed a seismicity forecasting scheme easily. The method is, however, very sensitive to the location of the point sources representing the reservoir and to the location of the receiver points where stress changes are evaluated. This is due to the singularity of the analytical function at the source location (the stress increases to infinity at the location of each point source). Here, to alleviate that issue, the deforming reservoir is represented as a series of polyhedral volumes which are deforming poroelastically and assumed to be isotropic and homogeneous. This assumption is probably appropriate for the sandstone unit that forms the reservoir at Groningen. It is an efficient way to represent, to the first order, spatial variations of the reservoir geometry, due in particular to the faults offsetting the reservoir. Because the faults offsetting the reservoir are steep, with dip angles typically larger than 70°, we can in fact use simple cuboids. The displacement and stress Green's functions for polyhedral volumes are semi-analytical and can be obtained by integration of the point source solution (Geertsma 1973) over the volume of each cuboid (Kuvshinov 2008). The distribution of uniaxial compressibility over the reservoir was estimated (Smith et al. 2019) based on the pressure depletion The root mean square (RMS) difference between the compaction derived from geodesy and from the reservoir models is reported on the panels. The bottom panel shows the time evolution of the mean subsidence derived from the geodetic measurements and predicted by the VFE and MoReS models. The prediction for the period beyond 2017 is based on the cold winter scenario. InSAR, interferometric synthetic aperture radar; VFE, vertical flow equilibrium; MoReS, modular reservoir simulator. predicted by MoReS, the reservoir thickness and the reservoir compaction derived from the linear inversion of the surface displacements measured from InSAR and GPS.
The bottom panel of Figure 5 compares the time evolution of the spatial pattern of compaction predicted by our VFE model and MoReS with the compaction derived from the inversion of the geodetic (GPS and optical levelling) and remote sensing measurements of surface subsidence (Smith et al. 2019). It shows that both the VFE and MoReS predict a compaction quite consistent with the measured surface subsidence. MoReS does, however, better at fitting the spatial distribution of compaction. In fact, the VFE model fits the time evolution of the compaction derived from the surface displacements slightly better than MoReS. The quality of the fit obtained with the VFE model is remarkable as the compressibility distribution was optimized to fit MoReS. The surface measurement of displacement could therefore be included in the dataset used for reservoir history matching (van Oeveren et al. 2017), and our framework would allow us to introduce spatial variations of reservoir properties to improve the fit to both the pressure measurements and the surface subsidence. This could help refine the spatial distribution of the reservoir characteristics, including its geometry. This approach could be interesting to constrain reservoirs less well known than Groningen where injection or production would produce a measurable surface displacement signal. In any case, this comparison suggests that the strain and stress changes predicted by the VFE and MoReS models are valid to first order and can provide a good basis to estimate strain and stress changes due to pressure changes in the reservoir. The modelled stress changes are thus used as an input in the seismicity modelling described in the next section.

Stress-based seismicity forecasting
We describe here how induced seismicity is predicted based on the temporal evolution of the fluid pressure distribution estimated using the reservoir model. Rock failure is commonly assessed using the Mohr-Coulomb failure criterion (Handin 1969). A number of studies have demonstrated that this criterion applies effectively to assess earthquake triggering by stress changes (King et al. 1994). According to this criterion, failure occurs when the shear stress τ exceeds the shear strength of the material τ f , represented by: Where τ f is shear-stress, σ n is the normal stress, positive in compression, P is pore pressure, μ is internal friction and C 0 is the cohesive strength. If the material is not at failure, the strength excess is τ f − τ.
Fluid pressure changes play an important role in preventing or promoting fault failure. Assuming the total stresses do not change, a greater pore pressure acts to lower the effective normal stress and promotes failure. By contrast, a pressure decrease should inhibit failure. It is therefore customary to assess jointly the effect of stress changes and pore pressure changes using the Coulomb stress change defined as: Where ΔCFF is the change in Coulomb stress (the notation is customary and refers to the 'Coulomb failure function'; an alternative common notation is ΔCFS for 'Coulomb failure stress'), Δτ is the shearstress change, μ is the internal friction, Δσ n is the change in normal stress and ΔP is the change in pore pressure. Detailed studies of the seismicity of the Groningen field show hypocentres with a depth distribution peaking within the reservoir Spetzler and Dost 2017;Willacy et al. 2019), or in the caprock (Smith et al. 2020). We thus need to model the stress redistribution due to the reservoir compaction and pore pressure variations within and outside the reservoir with account for poroelastic effects (Wang 2000). A number of previous studies have explored different approaches. Bourne and Oates (2017) developed the Elastic Thin Sheet model (ETS), a semi-analytical reservoir depth-integrated model. The ETS formulation approximates the reservoir deformation as a uniaxial vertical strain field. It allows estimating stress changes within the reservoir and was designed to account for stress concentrations at the faults offsetting the reservoir. The faults' characteristics are not explicitly represented but accounted for indirectly from the smoothed spatial gradient of the reservoir thickness. The cuboids representation of the reservoir, which we use to model surface subsidence, provides an alternative computationally more costly approach to calculate stress changes within and outside the reservoir (Kuvshinov 2008;Smith et al. 2020). The faults' geometry can be approximately represented via the cuboid mesh. The two approaches were implemented in our framework and compared in the previous study based on MoReS (Smith et al. 2022). The cuboid method is, however, more general and does not require assuming that the earthquakes occur within or close to the reservoir. The two methods make equivalent predictions when MoReS is used as an input (Smith et al. 2022). So, for the purpose of this study, we use only the ETS model, which is computationally more effective.
The next element needed to forecast seismicity is a method to relate stress changes to seismicity. All stress-based methods assume that earthquakes result from frictional unstable slip, and therefore require a friction law. The reader is referred, for example, to Ader et al. (2014) and references therein for details. A common approach, often referred to as the standard Coulomb model, assumes that friction drops instantaneously from a static value, which is the ratio of the shear stress to the normal stress at the onset of slip, to a dynamic value, the ratio of the shear stress over the normal stress during slip. Both quantities are generally assumed constant in the standard Coulomb model. This model results in instantaneous nucleation, and the seismicity rate is simply proportional to the Coulomb stress rate as long as it increases monotonically; seismicity ceases when the Coulomb stress rate decreases. This model yields satisfying results at the annual time-scale at Groningen (Smith et al. 2022), but it is probably not valid at shorter time-scales because it ignores the fact that earthquake nucleation cannot be instantaneous in reality. The observation that earthquakes show little to no correlation with solid Earth tides is indeed compelling evidence that the nucleation process is long enough that earthquakes are not sensitive to stress changes at periods equal to or lower than the dominant 12 hour period of tides (Beeler and Lockner 2003). This is also consistent with the observation in the laboratory that friction drops gradually during unstable slip.
Earthquake nucleation on a particular fault depends on the initial stress and on the friction law relating stress to fault slip. Theoretical nucleation models can be derived using rate and state friction laws that were determined in laboratory studies (Dieterich 1994;Ampuero and Rubin 2008). Such models explain relatively well the characteristics of aftershock sequences (Dieterich 1994) and the insensitivity of earthquakes to solid Earth tides (Beeler and Lockner 2003;Ader et al. 2014). In this formalism, friction depends on the slip rate and on the state variable that allows for healing ('restrengthening') after a slip event. Dieterich (1994) derived an approach to directly relate stress changes to seismicity rate. The original formulation (Dieterich 1994) assumes that all faults are initially 'above steady state', meaning that they are assumed to have been on their way to failure from the start of perturbation of the stress field when gas production started. This approach has been applied with some success to induced seismicity at Groningen (Candela et al. 2019;Richter et al. 2020), but the fact that seismicity lags the onset of gas production by decades requires a very long nucleation process. A significant improvement was made recently by relaxing the 'above steady state' hypothesis (Heimisson et al. 2021). This revised formulation assumes that the faults were initially in a relaxed state (away from failure). This assumption is probably adequate in the case of the stable tectonic context of the Groningen gas field. The revision introduces a stress threshold needed to be exceeded for earthquake nucleation. The threshold is equivalent to the minimum stress change, the 'initial strength excess', needed for failure in the case of the simple static Coulomb failure model. With the introduction of this threshold, it turns out that the duration of the nucleation process, the time needed to reach failure, derived from matching the observed seismicity is greatly reduced. Assuming instantaneous failure therefore provides a good approximation of seismicity rate at the annual timescale (Smith et al. 2022). As a result, we adopt here the simple assumption of instantaneous nucleation once the stress change equates an initial strength excess, which is treated as a stochastic quantity (Smith et al. 2022). The stochastic representation provides a way to account for stress heterogeneity and the diversity of fault orientations.
The distribution of strength excess depends on the probability distributions describing the faults' orientation, stress and strength. We assume that these factors result in a Gaussian distribution strength excess (Smith et al. 2022). If we assume that the initial Coulomb stress values on different fault patches are independent and identically distributed random values, the probability of failure of a fault at a location with a maximum Coulomb stress changes ΔC is derived from integration of the Gaussian function (Smith et al. 2022), yielding: Where θ 1 , θ 2 represent the mean and standard deviation of the Gaussian distribution, representing the fault strength distribution. A third parameter, θ 0 , has to be introduced to represent the density of nucleation points per unit area yielding induced earthquakes with magnitude above the magnitude of completeness of the catalogue used for calibration. This model is similar to the extreme threshold model of Bourne and Oates (2017), which assumes that the seismicity only reflects the tail of the failure probability function, meaning the failure of the faults with the smallest strength excess. The extreme value theory implies an exponential rise of seismicity for a constant stress rate (Bourne and Oates 2017). The Gaussian model also produces an initially exponential rise, but the seismicity rate gradually transitions to a steady regime where the seismicity rate is proportional to the stress rate. The transition occurs when the stress increase is of the order of the mean initial strength excess (θ 1 ). The formulation thus allows, in principle, the system to move out of the initial exponential rise of seismicity.
The model parameters are estimated by history matching using the seismicity catalogue. Hypocentral depths are not considered since, with the ETS formulation, earthquakes are assumed to occur within or at the boundary of the reservoir. For consistency with the study of Heimisson et al. (2021), we quantify the misfit between the predicted and observed seismicity using a Gaussian log-likelihood function: Where R(m, i) is the model predicted rate density in year i, where m is the vector of model parameters. R o i is the observed rate in year i. In that calculation, we use yearly average stress rates. Integration in easting x, and northing y, is carried over the area Σ, corresponding to the outline of the reservoir in map view. During the training, we sample the probability distribution function (PDF) (equation 17) using a Metropolis-Hastings sampler. After a sufficient number of samples, hindcasts are obtained by selecting 1000 random samples of m = m 1 , m 2 , … at random and computing R p (m, t) for t . y e + 1. For calibration of the model, we use the catalogue of Dost et al. (2017), which reports earthquake locations since 1990, with a completeness of M LN . 1.5 since 1993. The model parameters and their uncertainties derived using the best-fitting (maximum a posteriori, MAP) history-matched VFE model are listed in Table 3. We also list the mean and the range of model parameters obtained from the ensemble of models within the 95% confidence domain determined during the history-matching procedure. The model parameters derived when using MoReS as input to the stress calculation are listed in Table 4. They are close to those obtained with VFE models.
The spatial and temporal variations of seismicity rate predicted with either the VFE models or the MoReS model are very similar and consistent with the observations (Fig. 6). One noticeable difference is that the VFE models predict more seismicity than MoReS in the southwestern area of the reservoir. This is due to the fact that the VFE models predict a smaller pore pressure drop than MoReS in that area presumably poorly connected to the main part of the reservoir (Burkitov et al. 2016). By contrast, the VFE models predict a lesser seismicity rate than MoReS in the central part of the reservoir. Both models are consistent to first order with the observed distribution of earthquakes. Figure 6d shows the mean expected annual seismicity rate (blue line), and the range of expected seismicity rate for the ensemble of VFE models within the 95% confidence domain derived from history matching. The two models also predict a seismicity rate consistent with the observations over the validation period with comparable success (Fig. 6). A slightly better validation fit is actually obtained with the VFE model. It should be noted that the plot does not account for the variability of seismicity rate expected from the stochastic nature of seismicity. This is the main cause of inter-annual variability in the observed seismicity. This term could be included assuming a non-homogeneous Table 3. Mean and standard deviation of the parameters of the Gaussian stress threshold model used to relate stress changes to seismicity for the best-fitting (MAP) VFE model and across all the pressure history-match models within the 95% confidence domain

Mean
Standard deviation  Poisson process and some model of aftershock statistics such as ETAS (epidemic-type aftershock sequence; Ogata 1998). It is not included here as it would obscure the contribution of the uncertainties on the reservoir model parameters. Figure 6 also shows the expected maximum magnitude, and associated uncertainties, based on the VFE and MoReS models. This calculation assumes that the frequency-magnitude distribution of earthquakes follows the Gutenberg-Richter law for a b-value of 1. For simplicity, we did not include any consideration for the uncertainty and possible temporal variations of the b-value (Bourne and Oates 2020).
Once the history-matched seismicity production values are determined, we can forecast the earthquake rate for the different hypothetical production scenarios described above. Figure 7 shows the seismicity forecast for the shut-in scenario, the cold winter scenario and the cold winter scenario with pressure management. The shut-in scenario leads to the most abrupt drop in seismicity. Seismicity does not completely shut down, however, because of pressure readjustment in the gas field after production is stopped. It should be noted that the model does not account for the lag in the seismicity response that would result from the earthquake nucleation process, which is not instantaneous, as assumed in our model, but time-dependent. Comparison with Heimisson et al. (2021, fig . 6), which accounts for timedependent nucleation but assumes no further stress changes after shut-in, shows that the induced lag is in fact quite short. It is, however, probable that our model predicts a too abrupt drop in seismicity at the time of shut-in because post-shut-in pressure equilibration is neglected. In addition, the model presented here ignores sub-annual variations of extraction rates. In reality, production rates vary significantly seasonally (Fig. 1). Because the model is non-linear, the decision not to consider sub-annual variations introduces a bias. The model is therefore probably not adequate to forecast the evolution of seismicity at the sub-annual time-scale. This limitation can be alleviated by refining the temporal resolution of the VFE model and accounting for time-dependent nucleation using the threshold rate-and-state approach (Heimisson et al. 2021). This improvement comes at the computational cost of introducing one additional parameter. The code developed for this study, which is available online, includes the threshold rate-and-state model of earthquake nucleation, but the results presented here only assume instantaneous nucleation.

Discussion
The modelling framework presented in this study was designed to be computationally effective and as general as possible. A number of simplifications were, however, made for the purpose of the specific application to the Groningen gas field. Thanks to these assumptions, our VFE model is computationally very effective. It takes 60 s to run a forward simulation for the 1956-2030 period with a monthly time resolution on a MacBook Air M1. For comparison, it takes 120 h for a similar simulation using MoReS on Shell's supercomputer (NAM 2013). These simplifications are, however, a source of limitations, and some could be relaxed to improve the quality of the forecast or ensure the applicability of our modelling framework to other setting. We revisit these assumptions here.
The reservoir model assumes vertical flow equilibrium (VFE), uniform transport properties, a single fluid phase and no fluid flow into the reservoir from over-or underburden. The VFE assumption is probably well justified given the geometry of the Groningen reservoir. This assumption is the major factor of computational efficiency, and it can probably apply to other reservoirs and other applications than gas production, carbon dioxide storage or conventional geothermal reservoir operations, for example. A major limitation is that our reservoir model considers only one fluid phase. As a result, the response from the aquifers bounding the reservoir or the influx of groundwater from the over-or underburden cannot be accounted for. Even though the model yields a rather good fit to the observation, it is certainly a major limitation of the model as it stands. The gradual increase in the residuals suggests Surface deformation and induced seismicity forecasting gradual influx of groundwater in the depleted reservoir. While the bias is modest over the historical period, it might become more significant in the future. Adjusting the gas saturation with time during the history matching would be an ad hoc way to correct the bias, but it would not be applicable for forecasting. The forecast for the various future scenarios presented in the study should therefore be considered with caution. This limitation could be addressed with the implementation of a multiphase VFE model (Jenkins et al. 2019) as it would allow including explicitly the interaction with groundwater.
Another major assumption regarding the reservoir model is that of homogeneous permeability and porosity. These assumptions seem valid to first order, although the large residuals observed at some particular wells (ZW1 and ZW2, for example; Fig. 2) could probably be absorbed by allowing for heterogeneities. This is doable within our modelling framework since we are using a finite element model. Correctly resolving a large contrast in permeability would, however, require appropriate space and time meshing and would come at a computational cost. Most importantly, the calibration of the model parameters would be a challenge.
The geomechanical model assumes strictly poroelastic reservoir deformation, with homogeneous and isotropic properties. This assumption allows fitting the geodetic and inSAR observations well (Smith et al. 2019). However, laboratory experiments simulating compaction of the Groningen reservoir (Hol et al. 2015(Hol et al. , 2018Pijnenburg et al. 2018Pijnenburg et al. , 2019 suggest significant (30-50%) rate-sensitive inelastic deformation. So, although our model reproduces the amount and distribution of reservoir compaction, the assumed poroelastic behaviour is questionable. The assumption of strictly poroelastic deformation could, in particular, bias the model prediction beyond the historical period. The poroelastic model could in fact be substituted in our modelling framework with a more sophisticated model allowing for inelastic compaction, using, for example, the rate type compaction model RTIC of Pruiksma et al. (2015). It might affect the prediction of the time evolution of subsidence and help fit the geodetic an inSAR data better. The impact on the seismicity forecast for the historical period would probably be very small.
Regarding the seismicity model, the results presented here assume homogeneously distributed point sources of earthquake initiation and instantaneous nucleation. The fact that earthquakes are probably triggered at stress concentration where the reservoir is offset by faults is accounted for using the 2D Elastic Thin Sheet (ETS) reservoir model of Bourne and Oates (2017). This approach does not take into account the detailed reservoir geometry, though. As mentioned earlier, our modelling framework also includes the possibility of using cuboids to better represent the reservoir geometry and to calculate stress changes in 3D (Kuvshinov 2008). This approach is also efficient because it makes use of Green's functions. It is, in that regard, similar to the distributed point source of strain representation of van Wees et al. (2019), but allows us to represent the reservoir geometry in more detail at a lower computational cost and with a lower sensitivity to the meshing (Smith et al. 2022). The cuboids can be designed to match relatively precisely the faults offsetting the reservoir. We have compared the two methods in a previous study using MoReS as an input, and obtained very similar results (Smith et al. 2022). We therefore opted here for the less costly 2D ETS method.
Finally, our seismicity model does not model the dynamic evolution of fault slip on the faults offsetting the reservoir. More sophisticated models have been developed to that effect, which can solve for the initiation and propagation of fault slip (Buijze et al. 2017(Buijze et al. , 2019van Wees et al. 2017;Jansen and Meulenbroek 2022). Such models require a detailed description of the distribution of initial stress and are deterministic and therefore less adapted for hazard assessment over a large area and long period of time. We opted for a simplified representation, which is much less costly and directly yields seismicity rate variations in time and space, while magnitudes are drawn at random from the Gutenberg-Richter distribution. In the results presented here, the nucleation process is assumed instantaneous because the nucleation process does not affect the seismicity rate prediction at the annual time-scale (Heimisson et al. 2021). Time-dependent nucleation is probably needed to forecast seismicity variations at the sub-annual time-scale. No such simulations were presented in this study but can be run with our modelling framework.

Conclusions
When combined with semi-analytical formulations to calculate poroelastic stress changes and a simple model of earthquake nucleation, the vertical flow equilibrium (VFE) assumption allows calculating reservoir fluid pressure, compaction, surface subsidence and induced seismicity at a low computational cost (60 s on a MacBook Air M1 for a forward simulation with a monthly resolution over 74 years). This integrated modelling framework, tested in the context of the Groningen gas field, yields predictions of well pressure, surface subsidence and seismicity consistent with observations and close to the predictions obtained with the more sophisticated MoReS model, which was developed by the operator and represents the industry standard. Our modelling framework thus provides a tool to assess the expected subsidence and seismicity response to production scenarios with account for uncertainties. The simplifications made in the simulations shown in this study are drastic: VFE, one single fluid phase, homogeneous reservoir transport and mechanical properties, strictly poroelastic deformation of the reservoir and earthquakes represented by point sources with instantaneous nucleation. Our results show that they are adapted to model subsidence and seismicity in the case of Groningen field. They can, however, be a source of bias and they certainly limit the general applicability of the model. Predictions at the sub-annual time-scale would, for example, require accounting for sub-annual variations of stress changes due to the seasonality of production and for time-dependent earthquake nucleation. Prediction of subsidence and seismicity for future scenarios can be biased by the decision not to consider inelastic rate-dependent compaction of the reservoir and possible influx of groundwater into the reservoir. The first issue could be addressed by incorporating a more sophisticated rheology to describe compaction (Pruiksma et al. 2015). The other issue is probably the reason for the drift in the residuals obtained from history matching with the VFE model. A multiphase VFE flow model could be implemented (Jenkins et al. 2019) to alleviate that limitation. This would be beneficial also for the application of this framework to CO 2 storage where multiphase flow would be required to estimate the extent of the CO 2 plume. This and the other improvements discussed above, such as the introduction of a rate-sensitive compaction rheology, could easily be incorporated in our modelling framework given its modular organization. Finally, the modelling framework presented here could in principle also be used to design and optimize pressure management in reservoir operations. The location and flow rates of the injection wells could, for example, be adjusted so that seismicity and subsidence would be minimized.