Bayesian seismic inversion for stratigraphic horizon, lithology, and fluid prediction

We have developed an efficient methodology for Bayesian prediction of lithology and pore fluid, and layer-bounding horizons, in which we include and use spatial geologic prior knowledge such as vertical ordering of stratigraphic layers, possible lithologies and fluids within each stratigraphic layer, and layer thicknesses. The solution includes probabilities for lithologies and fluids and horizons and their associated uncertainties. The computational cost related to the inversion of large-scale, spatially coupled models is a severe challenge. Our approach is to evaluate all possible lithology and fluid configurations within a local neighborhood around each sample point and combine these into a consistent result for the complete trace. We use a one-step nonstationary Markov prior model for lithology and fluid probabilities. This enables prediction of horizon times, which we couple laterally to decrease the uncertainty. We have tested the algorithm on a synthetic case, in which we compare the inverted lithology and fluid probabilities to results from other algorithms. We have also run the algorithm on a real case, in which we find that we can make high-resolution predictions of horizons, even for horizons within tuning distance from each other. The methodology gives accurate predictions and has a performance making it suitable for full-field inversions.


INTRODUCTION
In a lithostratigraphic framework, the underground is divided into layers containing one or more lithofacies originating from the same depositional process (Ringrose and Bentley, 2015). Changes in the depositional process due to, e.g., changes in climate or sea level give rise to new layers containing a different mix of lithofacies. The interfaces between the layers define horizons. In geomodeling, the stratigraphic model is routinely built based on the interpretation of seismic reflectors, and by using well observations and expected layer thicknesses. Often, multiple combinations of lithofacies are possible across a horizon, either due to the complexity of the depositional environment or because the horizon defines an erosional event. When this occurs, we observe changes in the seismic ampli-tude and polarity along the horizon, which make interpretation of the stratigraphic horizon difficult.
Quantitative lithology and fluid prediction from seismic is historically defined as a two-step approach. Seismic data are first inverted to elastic properties, and then discrete lithology and fluid classes are predicted from the elastic properties using rock physical relations (Goodway et al., 1997;Avseth et al., 2001). The rock-physics models define absolute levels and possible ranges for the elastic properties for different lithology and fluid classes and provide links from the estimated elastic parameters to possible lithology and fluid classes (Avseth et al., 2005;Bosch et al., 2010). In practical use of two-step inversions, the uncertainty of the elastic parameters is often treated without rigor, giving results that tend to underestimate the uncertainty.
In the following, we define facies as classes of specific lithology and pore fluid within a specific stratigraphic layer, for example, overburden shale, and gas-, oil-, and brine-filled sandstone reservoir.
Bayesian methodology defines a general framework useful for a wide range of different applications, in which a prior probability model is updated to a posterior probability distribution based on data (Tarantola and Valette, 1982;Gouveia and Scales, 1998;Buland and Omre, 2003b;Tarantola, 2005;Grana and Rossa, 2010;Buland et al., 2011;Buland and Kolbjørnsen, 2012). In seismic facies prediction, prior geologic knowledge can be quantified by a joint spatially coupled prior probability distribution. This prior distribution can describe multiparameter relations between different rock properties and elastic parameters as well as the spatial structure of the problem. Buland et al. (2008) describe a fast Bayesian algorithm, but due to the pointwise approach, the elastic parameters are mapped to facies classes without considering the surrounding locations, and the algorithm does not ensure correct vertical ordering of the stratigraphy or pore fluids. Kjønsberg et al. (2010) present a brute force Markov chain Monte Carlo (MCMC) simulation algorithm in which new layers could appear, be removed, or be changed, but the algorithm is very time consuming even for a single vertical profile.
A practical and useful way to model spatial geologic dependencies is by the use of Markov chain models (Krumbein and Dacey, 1969). Markov chains are defined by simple dependencies between neighbor locations and can be used to model spatial continuity and transitions between the facies classes. Larsen et al. (2006) use Markov chains to define the prior model for a vertical profile of facies classes. Assuming a hidden Markov model and that seismic data can be considered as independent observations at each data point, an approximate posterior distribution can be obtained using the efficient recursive forward-backward algorithm defined by Scott (2002).  and  extend the Markov chain prior model in Larsen et al. (2006) to three dimensions by also including lateral dependencies, and Rimstad and Omre (2010) and Rimstad et al. (2012) also include rock-physics depth trends. However, the assumption of a localized likelihood, i.e., that seismic samples are independent observations, is not realistic, and the results from these methods indicate that the solutions tend to be too conclusive. Rimstad and Omre (2013) evaluate various possible improvements to this MCMC algorithm. Hammer et al. (2012) correct for the independence assumption above and present an MCMC algorithm for models with vertical dependencies of the rock properties, but a proper handling of these models is computationally complex. Fjeldstad and Grana (2017) estimate the facies probabilities using a 2D Markov random field with a stationary Markov prior, using an MCMC approach. Connolly and Hughes (2016) solve the same problem using a rejection sampling algorithm. Nawaz and Curtis (2018) relax the localized likelihood assumption to a quasilocalized likelihood and solve the problem using a variational approximation in two dimensions; however, the parameterization sets limit on the geophysical formulation. Kemper and Gunning (2014) and Gunning and Sams (2018) adopt the spatial approach by Rimstad et al. (2012), but did only search for the maximum posterior solution using optimization algorithms such as expectation-maximization and global annealing. The weakness of this approach is that there is no guarantee that it converges to the correct solution, and it does not address the question of inversion nonuniqueness. The general problem with the sampling-based algorithms described above is that the number of facies combinations is very large. To ensure convergence of the algorithm, either the algorithm needs to run for a long time, or approximations needs to be done, like limiting the number of facies classes.
We have developed an efficient approach that addresses the issue of nonuniqueness while preserving a vertical structure. Our approach is to evaluate all possible solutions in a moving local neighborhood rather than trying to solve the full joint posterior distribution. For many problems, it is possible to obtain adequate approximations of the posterior distribution by solving a small inverse problem on a local neighborhood Kolbjørnsen et al., 2016). The prior model is given as a nonstationary Markov chain, encompassing a prior for the horizons and for the facies configurations within the stratigraphic layers. The definition of the stratigraphic model is flexible, and may include many finelayered subseismic horizons, or only a single layer defining the complete inversion interval.
The use of this prior model makes joint inversion for facies and horizons possible. The solution is represented by the posterior model in which laterally consistent horizons with posterior uncertainty are a key part of the solution. The information defined in the prior model combined with the seismic amplitudes allow highresolution prediction of the horizons, even for layers with thickness below seismic tuning. The prior model allows multiple facies within a stratigraphic layer, which enables the prediction of layer boundaries, even if the layer transition is not linked to a unique facies transition. The Markov prior also incorporates knowledge about the order of facies within the layer. This might help with constraining the possible facies transitions at layer tops and bottoms. The solution also includes a posterior probability model for the facies with uncertainties. The methodology is presented below and illustrated on a synthetic example and a real case.

THE BAYESIAN INVERSION FRAMEWORK
The seismic time-angle gather at a specific lateral location, denoted as d, is assumed to be the response of an elastic vertical profile with elastic parameters m. The elastic parameters define the seismic reflection coefficients and the amplitudes of the seismic data depending on the reflection angle. For isotropic materials, the elastic properties can be described by three parameters, for example, the P-and S-wave velocity and density. Hence, the vector m has dimension 3L with L being the number of discrete, regularly sampled time samples in a vertical profile.
The vertical earth profile is also characterized by a vector of categorical facies classes f of length L. The objective of this work is to make inferences about f based on the seismic data d (see Figure 1). In Bayesian inversion, all inferences are completely based on the posterior distribution. The posterior distribution for the facies vector f given the seismic data d can be written (1) defining a direct one-step inversion, where pðfÞ is the facies prior model. Including the intermediate elastic model in Figure 1, we can write the likelihood for the seismic data given a facies configuration pðdjfÞ as where pðdjmÞ is the seismic likelihood and pðmjfÞ is the rockphysics likelihood.

Seismic likelihood model
The seismic likelihood model is based on the seismic forward model defined in Buland and Omre (2003a). A seismic time-angle gather can be written in linear form, (3) where the elastic properties m are represented by the respective logarithms of the P-wave velocity, S-wave velocity, and the density along the vertical profile, G is a modeling matrix, and e is an additive colored Gaussian noise term with covariance matrix Σ e . The modeling matrix is defined by where W is a block-diagonal matrix representing one wavelet for each reflection angle, A is the matrix of angle-dependent weak contrast coefficients defined by Aki and Richards (1980) for either PP or PS reflections or both if properly aligned, and D is a differential matrix giving the contrasts of the elastic properties. This defines the colored Gaussian seismic likelihood function pðdjmÞ.

Rock-physics prior model
The rock-physics prior model pðmjfÞ is built up of the pointwise distribution of the elastic parameters pðm i jfÞ and a vertical dependency structure within and between facies.
A local Gaussian rock physics prior can be estimated from well-log samples, or alternatively from samples simulated with a stochastic rockphysics model, defining the link from rock properties to effective elastic parameters for each facies (Avseth et al., 2005). Typical rock properties are mineral composition, fluid saturation, porosity, and texture characteristics. The rock-physics models are generally nonlinear and local in the sense that the rock properties of a facies in one location i only affect the elastic parameters in that specific location, such that pðm i jfÞ ¼ pðm i jf i Þ; (5) where pðm i jf i Þ is the rock physics prior at one specific location i.
In this paper, we will assume that the rock physics prior for each facies can be estimated by a Gaussian distribution. The prior distributions for the elastic parameters, pðmÞ, will thus be mixed Gaussian with peaks related to the different facies configurations, and they are given by pðmÞ ¼ where the sum runs over all possible modes. Figure 2 shows the rock physics priors for four different facies used in the examples in the remainder of this section. The facies are two different shales (shale 1 and shale 2) and a sand that can be either gas or brine filled (gas sand and brine sand).

Facies prior model
The aim of the prior model used in this paper is not only to capture the pointwise prior facies probability but also to incorporate information about stratigraphic layering and facies ordering within the layers. To accomplish this, the prior model has two levels. First, a stratigraphic prior model pðsÞ, where s is the vector of stratigraphic layer identifiers with same length as f. Second, a prior model for facies class distribution given a stratigraphic layer pðfjsÞ. The final prior model pðfÞ is then approximated as a one-step Markov model, meaning that the facies class probabilities pðf iþ1 Þ at sample location i þ 1 are given by the facies class probabilities pðf i Þ in the sample location directly above, together with the facies transition probabilities pðf iþ1 jf i Þ (Krumbein and Dacey, 1969).
A conceptual model of a reservoir with gas-or brine-filled sand with a shale layer above and another shale below is shown in Figure 3. Here, the two horizons top 2 and top 3 split the reservoir into three stratigraphic layers, 1-3. Four facies classes are defined, a shale in the overburden layer 1, a gas-filled sand and a brine-filled sand in the reservoir layer 2, and another shale in layer 3 below the reservoir. The horizon uncertainties are illustrated by the dashed green lines. In addition to the horizon times, the facies content of the reservoir layer at a given sample point is uncertain.

Stratigraphic prior model
To develop a Markov model for the actual facies classes, we first set up a Markov model for the stratigraphic layers containing the facies. Corresponding to f, the vertical earth profile can also be given as a vector s of stratigraphic layer identifiers k. The stratigraphic prior model pðsÞ can then be determined from the stratigraphic layer probability distribution at the top of the trace pðs 0 Þ together with the layer transition probabilities pðs i js i−1 Þ. The probability for a layer transition from layer k to k þ 1 between two samples is the same as the probability for the time of the corresponding horizon h k being between the time of these two samples.
Layer transition probabilities for three sample locations in the highlighted trace in Figure 3 are shown in Figure 4. The first sample is above the uncertainty band of the top horizon, and the probability for being in layer 1 is equal to 1 in the marked cell and the cell above, which gives pðs i ¼ 1js i−1 ¼ 1Þ ¼ 1, whereas all of the other transition probabilities are equal to 0. The two remaining sample locations are within the uncertainty band of the two horizons. Here, the horizon can be above and below the sample, giving a nonzero probability for a layer transition.

Facies prior model within stratigraphic layer
We assume that within each individual stratigraphic layer, a set of possible facies classes is given and that the facies prior probabilities given the stratigraphic layer are defined by a Markov chain model with stationary conditional transition probabilities. The Markov chain model is then defined by the facies probabilities at the top of the stratigraphic layer pðf i js i ≠ s i−1 Þ and the conditional transition probabilities pðf i jf i−1 ; s i ¼ s i−1 Þ. The conditional transition probabilities control the facies transitions within a single stratigraphic layer, and they can control expected facies layer thicknesses and prevent unphysical combinations like brine-filled sand directly above gas-filled sand. Figure 5 shows the facies prior given the stratigraphic layer for each of the layers in Figure 3.
The final facies transition probabilities pðf i jf i−1 Þ can then be calculated from the conditional facies transition probabilities and the layer transition probabilities as Here, we have made use of the fact that a given facies belongs to one specific layer; thus, s i is contained within the definition of f i . Furthermore, note that because the layer transition probabilities pðs i js i−1 Þ are nonstationary, the facies prior model will also be nonstationary. Figure 6 shows the final facies transition probabilities at the trace highlighted in Figure 3.

LOCAL NEIGHBORHOOD APPROXIMATION
The goal of the seismic inversion algorithm described here is to gain information about the facies and stratigraphic layering. This is done by finding the probability distribution for facies (with containing layer) in a given sample location pðf i jdÞ. The corresponding cumulative probability distribution for the location of the horizons pðh k < t i jdÞ is also found, where t i is the time of sample i and h k is the time of horizon k. A brute-force approach to finding these is to evaluate all possible facies configurations f using equation 1, but the number of possible configurations is far too high for this to be realistic. Our approach is to limit the problem to evaluate all facies configurations in a neighborhood around a location of interest. This reduces the number of configurations to a manageable amount, while still allowing inference on the local sequence of facies.
Around a sample location i, we define the local neighborhood as a small vertical window denoted as w i (see Figure 7). Within this window, we evaluate all permissible facies configurations. Because the permissible facies configurations are common for all of the locations in the inversion volume and all calculations within this section are assumed to be done around this location, we suppress the location identifier for the remainder of this section, and we just denote the window w.
In general, the window w can be any preselected set of sample locations close to the location i. Within this neighborhood, all facies configurations f w are considered. All possible facies configurations for the example shown in Figure 3 for a window length of three are shown in Figure 8. With four facies and a window length of three, there are a total of 4 3 ¼ 64 possible configurations. However, after enforcing a strict ordering of the stratigraphic layers, disallowing brine sand directly above gas sand and enforcing the presence of the reservoir between the overburden and the underlying shale, there are only 18 permissible configurations.
The posterior probability for a facies configuration in the window is given as The prior probability function pðf w Þ represents the prior probabilities of facies configurations in the window w, and they can be calculated directly from the facies prior model given above. The challenge is to approximate the likelihood pðdjf w Þ. The method for doing this is described below as a modeling sequence. The algorithm and hence computational complexity grows exponentially with window length (see the "Discussion" section), and  we usually use a vertical window that is three or five sample locations long. With a typical seismic sampling rate of 4 ms, this corresponds to 12-20 ms, which is generally too short to capture the characteristics of the seismic signal. To counteract this, the distributions of the elastic parameters and the seismic amplitudes given the facies configuration need to be calculated for a larger interval. In the remainder of this section, m and d have a length equal to this "data matching window" length, stretching half a wavelet out in each direction. This corresponds to the interval where the seismic signal is directly affected by the facies configuration at the window location.

Approximate likelihood model
To be able to calculate the likelihood pðdjf w Þ, the rock-physics likelihood pðmjf w Þ is needed. Finding the exact distribution is generally not feasible, and the key idea to solve this is to approximate pðmjf w Þ by a Gaussian distribution p Ã ðmjf w Þ, completely defined by the expectation vector μ mjf w and the covariance matrix Σ mjf w . These quantities can be computed from pðm; fÞ, the known joint distribution of elastic parameters and facies configurations. The elastic parameters are assumed to be correlated vertically between sample locations containing the same facies.
The approximate distributions of acoustic impedance (AI) for a subset of the different facies configurations within the window from Figure 8 are shown in Figure 9. The expected elastic values are shown along the trace as a thick black line, and the associated uncertainty is shown in gray. The values at the center of the plot correspond to the expected value and uncertainty for the facies in the facies configuration shown in the top right corner. The values away from the center are calculated based on the possible transitions upward and downward from the facies at, respectively, the top and bottom of the window. Note that the uncertainty is low outside the window in the cases in which there is only one possible facies above or below the window, i.e., windows with either shale 1 in the top or shale 2 in the bottom, whereas the uncertainty increases significantly if multiple different facies are possible.
Under the Gaussian approximation p Ã ðmjf w Þ, we can apply the seismic likelihood model defined above in equations 3 and 4 (Buland and Omre, 2003a) for each facies configuration f w . The resulting approximate likelihood function for the seismic data in the window, p Ã ðdjf w Þ is Gaussian with expectation vector and covariance matrix given as (9) Figure 10 shows the probability distributions for the seismic amplitudes corresponding to the elastic distributions shown in Figure 9. The seismic signal is mainly covered by contrasts in the elastic properties for the facies within the window. Note that there is a significant expected seismic signal for the window with only gas-filled sand, due to the high probability for a facies transition with a high elastic contrast not far above and below the window.
The approximate likelihood p Ã ðdjf w Þ can thus be found by standard methodology,   Figure 8. The distance to the window center is given on the vertical axis. The boxes in the upper right corner represent the facies configurations. The uncertainty represented by the area between the 2.5th percentile and the 97.5th percentile is given in gray.
The covariance matrix Σ djf w is independent of data, and we can thus precompute the inverse covariance for each window for fast likelihood computations.

Approximate posterior model
The approximate posterior probability function p Ã ðf w jdÞ for the facies configuration in the window can be computed by substituting the likelihood pðdjf w Þ in equation 8 with the approximate likelihood p Ã ðdjf w Þ. The approximate posterior distribution p Ã ðf w jdÞ is then given as Note that replacing equation 1 with equation 11 simplifies the computations significantly. This allows us to approximate the posterior probability for the facies configuration in the window by considering only the facies configurations in the window rather than all facies configurations in the trace. There is a trade-off between the computational complexity and accuracy of the result by adjusting the window size. A large window gives large accuracy but has a large computational cost; a short window decreases the accuracy, but it is also faster to compute. The algorithm has the following key features: 1) It can be used to honor physical and geologic relations inside each window. This is done by including only geologically possible facies configurations inside the window as shown in Figure 8. 2) Inside the window, there is no approximation of the distribution; hence, a larger window gives a better approximation. With a window size equal to the trace length, there is no longer any approximation.
Note that facies transitions within the window f w force reflections into the solution. That is, when computing the approximate likelihood p Ã ðdjf w Þ, we are checking whether the reflections implied by the facies configuration fits the data. This also means that we explicitly check on the existence of thin layers that are normally considered to be below tuning thickness.

Posterior facies probability
Equation 11 gives an approximate posterior probability for facies configurations within a window, and it can be marginalized to find the approximate posterior probability for facies in a cell, by summing over the relevant windows. However, if windows are not very long, this probability does not fully take into account the joint probability for facies along the entire trace. For instance, a facies configuration with only gas would be about as likely as a facies configuration with only shale at a location with no significant seismic reflections. But if there are no major reflections anywhere in the trace above this window, the gas case should be unlikely because we would expect a reflection at the transition from shale to gas. Thus, a facies should only have probability locally if there are facies traces that are consistent with geology and seismic data, which give this facies at the location.
To incorporate full trace consistency, we use the probabilities for facies configurations to compute a higher order Markov chain for local transitions, starting from the top of the trace. This restricts the solutions to those that are consistent in a full trace sequence starting from the top. Similarly, Markov chains starting from the bottom are created. Marginal posterior facies probability at any location i, pðf i jdÞ can then be found as an average of the downward and upward marginals. Details are given in Appendix A.
Because this is done trace by trace, no lateral constraints on the facies sequences are included in the inversion, just the vertical ordering. The inversion results will still have lateral continuity induced by the lateral continuity of seismic data. In theory, accounting for lateral continuity in geologic features can help improve the inversion results. Facies structures within a layer are, however, generally complicated with abrupt facies transitions, and they are better modeled with either object models or pixel-based models conditioned to marginal facies probabilities (Holden et al., 1998;Stien and Kolbjørnsen, 2011).

Posterior horizon distribution
The layer probability pðsjdÞ is found by marginalizing on the layer instead of facies. The probability for a sample i being below Figure 10. Probability distributions for seismic signals for three different angles corresponding to the elastic properties shown in Figure 9. The distance to the window center is given on the vertical axis. The boxes in the upper right corner represent the facies configurations. The uncertainty represented by the area between the 2.5th percentile and the 97.5th percentile is given in gray.
horizon h k is then the same as the probability for the sample belonging to a layer below h k . This probability is then given by the cumulative distribution pðh k < t i jdÞ ¼ X j>k pðs i ¼ jjdÞ; where t i is the time of sample i. The horizons are generally smooth, but with potential local discontinuities due to faulting. This can be modeled by introducing lateral correlation directly into the posterior distribution. The inversion results h Ã k ðx l Þ for a trace with lateral position x l can be seen as estimates of the true horizon time h k ðx l Þ that we want to predict where εðx l Þ is the estimation error. The distribution of h Ã k ðx l Þ does not have a parametric form, but it can be approximated by a Gaussian distribution with mean and variance estimated from equation 12. Incorporating data of the form of equation 13 is a common spatial problem and can be solved using local kriging. Because there is a lateral dependency in the seismic amplitude errors, there is also lateral dependency in the corresponding observation errors εðx l Þ. The estimation of this lateral dependency is given in Appendix B.

SYNTHETIC CASE
We define a synthetic case to compare the posterior facies probabilities from the inversion algorithm with two other Bayesian algorithms. We use a reference facies model similar to the concept model in Figure 3 and corresponding synthetic seismic data are shown in Figure 11.
The procedure for generating the synthetic seismic data is as follows. Each facies in the reference model is populated with elastic properties by sampling a Gaussian field with mean, variance, and covariance according to the rock-physics distributions in Figure 2. The variogram has a range of 100 traces laterally and 10 ms vertically, and the field is sampled in a stratigraphic grid so the correlation structure follows the stratigraphy. The stratigraphic grid is then resampled to a regular grid using a nearest-node approach. Finally, the synthetic seismic is obtained by applying the seismic forward model in equation 3 to each trace in the grid.
The posterior facies probabilities from the inversion algorithm are compared with a pointwise inversion and an inversion using an importance sampling algorithm. The pointwise algorithm estimates the facies probabilities at each sample point from the inverted local elastic properties according to Buland et al. (2008). The importance sampler estimates the facies probabilities based on calculated likelihoods for facies traces sampled from the prior model. Because of ambiguity in the seismic forward model, we do not compare the inversions with the reference facies model. Instead, we consider the importance sampler (Ripley, 1987) as the baseline because it represents the target distribution that we aim to approximate in our computations.
The importance sampler is very computationally intensive and therefore generally not practical, but due to the simplicity of this particular case it can be run to convergence. The window length of our inversion algorithm is set to five samples.
The aim of the inversion is to detect gas sand and predict the gasbrine interface as illustrated in Figure 3. Therefore, a small prior probability (<0.2) for gas sand is assigned to the sand layer between the top 2 and top 3 horizons. The total probability of having sand is approaching 1 in the middle of the layer and zero outside the layer. The initial uncertainty of the time of top 2 and top 3 is set to 20 ms. The resulting prior probability model for shale 1, sand, and gas sand is shown in Figure 12a-12c. Figure 12d-12f, 12g-12i, and 12j-12l shows the corresponding posterior probabilities of the importance sampler, our algorithm, and the pointwise inversion, respectively. All three algorithms perform equally well with respect to the detection of shale 1 and sand. However, note that the estimated uncertainty of the horizon between shale 1 and sand is slightly higher with the pointwise inversion than with the other two.
The importance sampler gives a sharp interface between gas sand and the surrounding sediments (Figure 12f). Our algorithm returns a similar pattern ( Figure 12i); however, there is some variability in the estimated probabilities seen as vertical stripes in the gas-sand probability. For the pointwise inversion, only about half of the thickness of the gas sand is identified (Figure 12l), corresponding to the major reflection at the top reservoir. Away from the reservoir top, the algorithm tends to return the initial prior model because the seismic data there do not contain any clear signal that could push the posterior away from the prior.
Increasing the number of samples in the window of our algorithm improves the approximation. The average Kullback-Leibler (K-L) divergence between the results of our algorithm and the importance sampler is shown in Figure 13 for window lengths of one to five samples. We observe a systematic reduction in the divergence as the number of samples in the window is increasing. For reference,  Figure 11. Synthetic seismic at near (5°), mid (15°), and far (25°). The vertical axis shows the TWT. The vertical noisy stripes at, e.g., crossline position 40 is a result of the resampling from a stratigraphic to a regular grid. Figure 13 also shows the K-L divergence of the prior model and the importance sampler. The latter represents a minimum divergence due to Monte Carlo effects and is obtained by comparing the posteriors of importance samplers with different random seeds. We conclude that the posterior facies probabilities from our algorithm are preserving most of the features of the importance sampler while keeping the computational complexity manageable by only evaluating permissible facies configurations within a small window. The gas reservoir and the gas-brine interface are better illuminated than in the pointwise inversion, which suffers from the local nature of the algorithm.

EXAMPLE FROM THE EDVARD GRIEG FIELD
The Edvard Grieg oil field is located on the southwest flank of the Utsira High in the North Sea approximately 180 km west of Stavanger, Norway, with a water depth of approximately 110 m. The reservoir at approximately 1900 m depth was deposited in a continental half-graben setting of Triassic age. It constitutes of sandstones, conglomerates, mudstones, fractured, and weathered basement with a complex and interfingering structure. Hence, tuning effects and significant variability in geologic facies and elastic properties are observed over relatively short distances within the reservoir.
The seismic inversion is based on angle-limited stacks from a 2016 Ocean bottom seismic survey. An amplitude-variation-withoffset (AVO) friendly processing sequence including mild spectral whitening is performed on the data, and three stacks corresponding to 5°, 15°, and 25°have been extracted.
The reservoir consists of three main units. We will here focus on unit III that consists of homogeneous sandstones with excellent reservoir qualities. A full-field inversion study using the methodology presented in this paper can be found in Ndingwan et al. (2018). The underlying unit II consists of fluvial sandstones, alluvial facies, floodplain, and lacustrine mudstones. In parts of the field, the interface between unit III and unit II does not represent a distinct seismic reflector. The top of unit III is obscured by an overlying chalk layer in which the top is a significant seismic reflector, and these reflectors are generally within tuning of each other. A facies prior model is defined consisting of six layers (overburden; chalk; units III, II, I; and basement) and 11 facies classes each with elastic parameters following Gaussian distributions. The prior model is set up based on facies interpretations from 11 wells together with regional geo-logic knowledge, and it is shown for one trace in Figure 14 coinciding with a well. The well's facies log and petrophysical interpretation given as bulk volume of clay (VCL), brine (BVW), and hydrocarbon (BVH) are also shown. The inversion is performed with window length of five samples resulting in approximately 3.7 × 10 4 permissible facies configurations. The inversion focusing on estimating the top and base of unit III in the southwestern part of the field is presented below. The six (of the 11) facies classes that are most relevant are displayed at the bottom of Figure 14, and the distribution of the elastic properties for sand, shale, and conglomerates is shown as a scatterplot in Figure 15. The southwestern part corresponds to a region where unit II is mostly dominated by floodplain and lacustrine mudstones with similar elastic properties as the main sandstones of unit III. Note that due to the vertical coupling, the content of unit II will influence the results in unit III; thus, separate facies classes for the floodplain-dominated section are required (Ndingwan et al., 2018).
The conditional facies transition probabilities within unit III are defined in Figure 16. The transition probabilities between unit III and surrounding strata are estimated from the top and base of unit III and their uncertainties. The final facies transition probabilities are then calculated according to equation 7. Figure 17 illustrates permissible facies transitions into, out of, and within unit III for the six most relevant facies classes.
Using the methodology in this paper, we have found the posterior facies probabilities for unit III. Figure 14 shows the prior and posterior probabilities of these along one trace at a specific well location. Note that all facies have nonzero prior probabilities in units II and III and above unit III due to the initial large uncertainty of the top and base of unit III. The posterior result is clearly identifying the top of unit III and indicating a thinner unit than the prior as seen by the abrupt reduction in sand probability and the corresponding reduction in location uncertainty. At approximately 1940At approximately -1950At approximately and 1975At approximately -1990 ms, the seismic data have increased the probability of shale by a factor larger than five. This indicates that in these two locations, the seismic data contain strong evidence of shale, which is confirmed by the well data.   The facies above the transition is displayed in the first column, and the facies below is displayed along the first row. Figure 18 shows the prior and posterior result in a crossline in which only the most probable facies is shown. The prior surfaces of unit III are obtained from seismic, and the expected uncertainty is indicated by the shaded areas. From the posterior probabilities, we have used the previously described laterally coupled surface prediction approach to generate surfaces (referred to as correlated surfaces). For lateral continuity, we have used an exponential variogram with a range of 2.5 km. The resulting correlated top and base surfaces with uncertainty are overlaying the posterior result in Figure 18. The prior and posterior surfaces with uncertainties are also overlaying the seismic AVO data in Figure 19 for the same crossline as shown in Figure 18. Figure 20 shows the estimated thickness of unit III as well as the thickness from predicted surfaces without lateral continuity, in which we only predict surface locations in each trace independently of each other (referred to as uncorrelated surfaces). We see that the lateral dependency generates a smoother surface, consistent with the prior assumption of the spatial dependence. As we would expect from Figure 19, we also see a major change from prior to posterior, with a decrease in thickness, and a clearly defined pinchout of the layer.
Our surface prediction approach also provides uncertainties for the surfaces. These are shown in Figure 21 for the top and base surface along the cross section of Figure 19. Figure 21 plots the standard deviations (i.e., the uncertainties) of the prior and the correlated and uncorrelated posterior surfaces. We see that the central part of the top correlated surface is well-defined with a standard deviation of 2.0 ms, corresponding to a position accuracy down to seismic sample resolution. The uncertainty reduction for the base correlated surface is also significant. Introducing the lateral dependency pulls the uncertainty at a location toward the minimum uncertainty in the region around the location, giving a further reduction in uncertainty over what is achieved for independent traces.
The results of the above example are in accordance with our Bayesian model. The top surface is at a very clear reflector in the central part, and it does not change much because the prior is already at this reflector, so the main effect here is a reduction of uncertainty, confirming that this reflector matches the prior rock-physics model for transition into unit III. The base surface is much more uncertain, and the posterior here differs from the prior. This update reflects where the seismic signal gives the best match the prior model for facies and rock physics at the base of unit III. The posterior distribution for this surface is also much more uncertain than the top because the reflector used here is not as obvious as the top reflector. Just as in the synthetic case, the pinchout is also well-defined in this real case because our algorithm is robust for tuning effects within the window length.
We have also compared the inversion result with wells around the field and generally found a good match. Figure 22 shows the well indicated in the cross sections and in the thickness maps. The comparison is done in the V P , V S , and density domain in which the inverted properties are a probability weighted estimate from the facies posterior probabilities and the rock-physics likelihoods. The most visual discrepancy is perhaps V S above the reservoir in the overburden for two-way traveltimes (TWTs) less than approximately 1860 ms. A closer analysis shows that the well has an anomalous high V P ∕V S relationship compared to other overburden wells Figure 17. The transition matrix into, out of, and within unit III. Matrix elements that are filled with the pink and light-pink colors indicate permissible transitions, respectively, within unit III and going into or out of unit III. The gray-filled elements correspond to permissible transitions within the chalk and unit II as well as permissible transitions directly from the chalk to unit II if unit III is missing. The white elements represent impermissible transitions. explaining most of the discrepancy. There also appears to be a time alignment issue, which might come from the time depth curve or a lateral shift from the migration. There is a clear decrease in elastic properties at the top of unit III (approximately 1880 ms), which corresponds to the location of the softer sand compared to the surrounding harder conglomerate B as shown in the scatterplot of Figure 15. The mapping of the main sand boundaries, which was the primary concern in the inversion, also aligns well with observed 4D seismic effects from a more recent 2018 survey.

Inversion for rock properties
Posterior estimates for elastic properties are found by performing an elastic inversion given the facies configuration for each facies configuration within the window, weighting the inverted elastic properties by the posterior probability for the facies configuration, and integrating upward and downward. The algorithm presented here can also be extended with estimation of rock properties such as saturations and clay content using a local linear relationship between rock properties and elastic properties to estimate the rock properties given a facies, similar to Grana et al. (2017). The benefit of the proposed approach is that the reservoir parameters are interpreted in the context of a layer and a facies. Taking the Edvard Grieg case as an example, there is a large difference between the seismic response of a sandstone and conglomerate B (sandy conglomerate) even if the porosities are identical. For cases in which a linear relationship between elastic properties and reservoir parameters is a poor approximation, one might consider extensions along the lines of Jullum and Kolbjørnsen (2016).

Computational complexity
The main factors affecting the computation time are the number of layers, the number of facies within each layer, and the length of the window. Similar to Rimstad and Omre (2013), there is a clear trade-off between the precision in the inversion results and the computing demands with respect to window length. The computing demands are generally proportional with the number of permissible facies configurations within the window, which goes as Oðn w l f Þ, where n f is the number of facies and w l is the window length. However, adding layering and a fixed facies ordering can lead to a great reduction in the computational demands. The computations performed for each location further scale with wavelet length giving a computational complexity of OððN Ã n 2 l þ n 3 l Þ Ã n m Þ, where N is

Prior model assessment
There are two sources of error in our approach. We have already discussed the error due to the approximation of the algorithm. The other is the error introduced by assuming that the proposed prior model covers the true earth profile. This latter problem is not unique to our methodology, but it is common for all model-based (Bayesian) approaches. Methodologically, we can resolve this issue by adding an additional class for an undefined response, which will be preferred if the likelihoods of all facies configurations are small, as is done in Buland et al. (2008). However, there are methodological challenges for integrating this approach in the setting of vertical coupling. Thus, it remains a relevant research topic for the future.
In practical applications, we find that prior models that do not cover the true profile sufficiently well tend to give lateral instabilities in the inversion. The upside of solving the equations trace by trace is that we automatically test the robustness of the result toward small data perturbations because neighboring traces tend to be similar. If the results fluctuate laterally, it is likely due to either model errors or overconfidence in the data. We also find it useful to test the sensitivity of the results by running more than one prior model. Plotting a map of the trace-wise root mean square of the residuals is an additional quality control. If there are regions with large residuals, this can be a sign of the model being inadequate. Ndingwan et al. (2018) combine the two latter types of quality control by comparing the root mean square of the residuals for several prior models to select the model that is optimal for a given region.

CONCLUSION
We have presented a probabilistic algorithm that predicts facies and horizons using AVO data and geologic information. The Markov prior model allows us to impose restrictions on facies ordering and to incorporate facies layer thickness information. The introduction of layer information in the nonstationary transition probabilities enables update of the position of the layer-defining horizons based on the seismic amplitudes. These positions can be further refined by imposing lateral continuity. We use a local neighborhood approximation in which we find the local probability for all facies configurations within a small window. Using Markov chains, we combine these local probabilities into a consistent set of lithology and fluid probabilities for the entire trace. The results show that the algorithm is capable of detecting facies transitions, even below tuning thickness. Others have used similar prior models for lithology and fluid predictions, using different approximations to compute the predictions. They also get good results, indicating that this is a good prior   Figure 14.
model for this problem. Our local neighborhood approximation allows us to apply this even to complex cases, in which MCMC-based approaches have convergence problems. We are also able to incorporate temporal continuity in elastic parameters, which if ignored tend to push probabilities toward the extremes. The algorithm performs better than a pointwise facies prediction, and the efficiency of the algorithm is such that it is feasible to invert full-field data sets with complex prior models containing a rich set of facies classes.

ACKNOWLEDGMENTS
We thank the Edvard Grieg license partners Lundin, OMV, and Wintershall Dea for permission to publish the data from the Edvard Grieg field. We also thank the sponsors of the GIG consortium, AkerBP, ConocoPhillips, Equinor, Lundin, Total, Vår Energi, and Wintershall Dea for financing the development of the algorithm presented in this paper, and especially Equinor for financing the early versions of the algorithm and for releasing it to the consortium. Finally, we thank D. Barker at the Norwegian Computing Center for assistance generating some of the figures in this paper.

DATA AND MATERIALS AVAILABILITY
The data associated with the synthetic test case are available and can be obtained by contacting the corresponding author. Data associated with the Edvard Grieg field example are confidential and cannot be released.

APPENDIX A USE OF HIGHER DIMENSIONAL MARKOV CHAIN
The appendix describes the algorithm used to ensure full-trace consistency when calculating the local marginal posterior facies probabilities pðfjdÞ. The final result is found by building upward from the bottom of the trace and building downward from the top of the trace. When building upward, the marginal facies probabilities at the bottom of the trace are initialized with a standard marginalization of pðf w jdÞ. The probability of the facies in a sample location can be computed from f j>i , all the cells below, by a standard relation: pðf j≥i jdÞ ¼ pðf i jf j>i ; dÞpðf j>i jdÞ: (A-1) The approximation through a k order Markov property is now introduced by inserting the approximation pðf i jf j>i ; dÞ ≈ pðf i jf iþk≥j>i ; dÞ; (A-2) into equation A-1. The right side can be derived directly from the probabilities for facies configurations pðf w jdÞ in the window centered at sample i. Similarly, it is possible to compute marginal facies probabilities downward from the top of the trace by reversing the ordering of the computations. Either of these approximations is equally good, but will in general differ. The probability of a facies in a location is then finally approximated by combining these probability calculations. We find the final likelihood as the geometric average of the two likelihoods. Let the upward posterior probability approximation be p u ðf i jdÞ, and the downward being p d ðf i jdÞ. The corresponding likelihoods can be found by dividing the posterior with the prior, giving p Ã ðf i jdÞ ∝ pðf i Þ p u ðf i jdÞ pðf i Þ Thus, the approximated probability is proportional to the geometric mean of up and down probabilities. The algorithm for computing the approximation posterior facies probabilities pðf i ¼ kjd i Þ then becomes 1) For each sample location i in each trace: • extract seismic data d i • for all facies configurations f w centered on i, compute p Ã ðf w jd i Þ from equation 11.
2) Compute the approximations p u ðf i jd i Þ and p d ðf i jd i Þ from running Markov chains upward and downward as described above. 3) Compute the final approximation from equation A-3.

APPENDIX B DETAILS RELATED TO LATERAL COUPLING FOR HORIZONS
This appendix describes the details related to the estimation of the local uncertainties defined in equation 13, which are needed in the spatial approach used to introduce lateral coupling for horizon observations in the different traces. These local uncertainties are interpreted in terms of Gaussian error models. We assume that the error locally at trace i can be mapped into two components where ε is the common component, whereas ε R ðx i Þ is a correlated component. These two components are independent and have unit variance. The uncertainty in the inverted horizon time for a trace is then a sum of the scale of the common and correlated components σ 2 obs ðx i Þ ¼ σ 2 R ðx i Þ þ σ 2 c ðx i Þ: (B-2) According to equations 13 and B-2, there are three terms that contribute to the variability in h Ã ðx i Þ. The common component σ c ðx i Þ and the horizon time itself vary slowly, so we assume that these quantities are constant in a small region around the location. The variance component σ R ðx i Þ can then be estimated by using a standard variance estimate. Because the variance σ 2 obs ðx i Þ depends on location x i , it is natural to use the local variance as a weight in this estimation. Let Nðx i Þ be a neighborhood of x i . The weighted averaged in the neighborhood can be found as The correlated component is now the weighted standard deviation from this average for the observations in the neighborhood, given by From this, we can find the common component The correlation structure of the colored component can also be estimated from the residuals within the neighborhood. Because this model is valid for the local region, only data from this neighborhood are used in the kriging when computing the smoothed result. The size of the neighborhood is limited by the local constant assumption.