Fast approximation of the intensity of Gibbs point processes

: The intensity of a Gibbs point process model is usually an intractable function of the model parameters. This is a severe restriction on the practical application of such models. We develop a new approximation for the intensity of a stationary Gibbs point process on R d . For pairwise interaction processes, the approximation can be computed rapidly and is surprisingly accurate. The new approximation is qualitatively similar to the mean ﬁeld approximation, but is far more accurate, and does not exhibit the same pathologies. It may be regarded as a counterpart of the Percus-Yevick approximation.


Introduction
In the statistical analysis of spatial point pattern data, an important role is played by Gibbs point process models, especially pairwise interaction processes [12,45,31]. However, many important properties of these models are intractable, including the intensity (expected number of points per unit volume), higher moments, and the partition function (normalising constant of the likelihood). This intractability is a severe impediment to the use of Gibbs models in applied probability and statistics, and indeed it motivated the invention of Markov Chain Monte Carlo methods [29].
Many approximations to the moments and partition function of Gibbs point process models have been developed in statistical physics [41,14] and in spatial statistics [46,34,39,28,13]. Two main categories are the simplifying approximations, such as the Mean Field approximation, derived from heuristic arguments and designed to be tractable; and convergent sequences of approximations, typically truncated Taylor series in one of the model parameters θ, expanded about a value θ 0 which corresponds to a Poisson point process. Well-known difficulties with series approximations include slow convergence, restricted radius of convergence, and intensive computation requirements.
This paper describes a simplifying approximation, which we believe is new, to the intensity of a Gibbs point process. The fundamental Georgii-Nguyen-Zessin identity [17,33,23] is viewed as a self-consistency equation for the process. Poisson approximation is applied to one side of the identity, in a manner reminiscent of saddlepoint approximation. The resulting approximate self-consistency equation can be solved numerically to yield an approximation to the intensity function of the process.
The rationale for our approximation is reminiscent of the Mean Field approximation to the intensity, and of the Ornstein-Zernike [35] equation for the pair correlation function. The functional form of our approximation is also quite similar to that of the Mean Field approximation. However our approximation is far more accurate, in the cases studied. (solid lines) with the mean field approximation (dashed lines) and true values estimated by a large Monte Carlo experiment (circles) for the popular Strauss point process model [44,24]. Our new approximation is quite accurate over the full range of values of the interaction parameter γ. Importantly it does not need the 'sparseness' conditions required for Poisson limit theorems [42,20,22]. Possible uses for this approximation include prediction for a fitted model (i.e. calculating the intensity for a Gibbs point process model that has been fitted to point pattern data), residual analysis [3] and other diagnostics for a fitted model, approximations to maximum likelihood estimation (since the likelihood score involves the point process intensity), and stability improvements to Markov Chain Monte Carlo methods. Our approximation may also be used for qualitative study of model behaviour, whereas the mean field approximation cannot (as shown by the behaviour as γ → 0 in Figure 1).
For simplicity of exposition, we concentrate on stationary, pairwise-interaction Gibbs processes in the present paper. Extensions to other processes, including non-stationary processes, and improvements to the approximation, are outlined at the end of the paper. Section 2 gives basic background. Our approximation is defined in Section 3. In Section 4 we derive elementary properties of our approximation and of the classical mean field approximation. Section 5 gives numerical examples. We conclude with a discussion.

GNZ formula
The Georgii-Nguyen-Zessin (GNZ) formula [17,32,23] is a fundamental relationship between the probability distribution of a point process and its reduced Palm distribution. For simplicity, assume X is a stationary point process on R d , with intensity λ > 0. The reduced Palm distribution P !0 at the origin 0 is, roughly speaking, the conditional probability distribution of X \ {0} given that there is a point of X at 0. For accessible explanations, see [23,43]. Under suitable conditions, the GNZ formula states that P !0 is absolutely continuous with respect to the distribution P of X, with Radon-Nikodým density λ(0; X)/λ, where λ(0; X) is the Papangelou conditional intensity [36] of X at 0. That is, for any statistic h(X) that is absolutely integrable with respect to P !0 , where E and E !0 denote expectation with respect to P and P !0 , respectively. In the special case h ≡ 1, we obtain an expression for the intensity: For many popular Gibbs models, the conditional intensity λ(0, X) has an explicit, analytic form in terms of the canonical parameters of the model. However, the expectation on the right side of (2) is with respect to the distribution of X, and is generally intractable. Thus, (2) does not provide a simple route to an exact expression for λ. Indeed λ is usually an intractable function of the canonical parameters.

Pairwise interaction processes
We shall focus on stationary pairwise interaction point processes [11,18,25]. The conditional intensity at a location u is assumed to be of the form a countable product over all points x i in the point process realisation x, where β > 0 is a parameter and g : Note again that the right hand side is the expectation of a random product over all points of the point process. This can be formally defined as the exponential of the random integral of log g(u) with respect to the point process counting measure.

Strauss process
A special case of pairwise interaction is the Strauss process [44] where where 0 ≤ γ ≤ 1 is the interaction strength parameter and 0 < r < ∞ the interaction distance. Then where is the number of points of x within distance r of the location u. Equation (2) becomes The right side of (7) is an expectation with respect to the Strauss process. The distribution of T = t r (0, X) under the Strauss process is not known, and λ is not known analytically as a function of β and γ.
A special case of particular interest is the hard core process obtained by setting γ = 0 and interpreting 0 0 = 1, so that (5) becomes λ(u; x) = β1{t r (u, X) = 0} (8) and (2) is Again the right hand side is an (intractable) probability with respect to the hard core process.

New approximation
Our approach is to view (1) as analogous to the tilting property of an exponential family, and use it to derive approximations analogous to the saddlepoint approximation. To explain the motivation, consider any model where the conditional intensity is loglinear in the parameter θ, say λ(0; X) = exp(θ V (0, X)) where V (0, X) is a known function. Then the identity (1) may be regarded as an exponential tilting relationship between P !0 and P . By analogy with saddlepoint methods, we should approximate the true distribution P on the right hand side of (2) by a simpler distribution Q with matching first moment. We choose Q to be the distribution of the Poisson point process with the same intensity λ. Thus we replace the identity (2) by the approximate relation where the right hand side is the expectation with respect to a Poisson point process of intensity λ. Solving this equation for λ, if possible, yields a value λ ps which is an approximation to the true value of λ.
Definition 1. Consider any stationary Gibbs process with intensity λ and Papangelou conditional intensity λ(u; X). The Poisson-saddlepoint approximation of λ is the solution λ ps of provided a solution exists.
This approach is less restrictive than approximating the moments of a point process X by those of a Poisson process, based on a Poisson limit theorem for X in the 'sparse' limit (low intensity or weak interaction) [42,20,22]. Our proposed approach is a rationale for applying the same type of approximation to processes that are not well approximated by Poisson processes. The Palm distribution of X is approximated by that of a Poisson process of the same (but unknown) intensity, yielding a self-consistency equation for the intensity of X.
In the special case of a pairwise interaction process X, the conditional intensity (3) is a product over points of X. Under a Poisson process of intensity λ we have [26, eq. (1.12), p. 20] where assuming G > −∞. The quantity G is the second Mayer cluster integral [14, p. 108]. Note that g(x) ≤ 1, with strict inequality on a set of positive measure, will guarantee that G > 0. Then (11) becomes Solving this equation yields the following result.
Theorem 1. Consider a stationary pairwise interaction process X with conditional intensity (3) such that G > 0 where G is the second Mayer cluster integral (13). The Poisson-saddlepoint approximation of the intensity of X is where W is the inverse function of x → xe x .
The increasing function W is (the principal branch of) Lambert's W function [10,9,40]. Efficient software exists to compute values of W by iterative rootfinding. The

Elementary properties and comparison with mean field
Here we give some elementary properties of our approximation, and for comparison, discuss the corresponding properties of the classical mean field approximation.
There is an extensive literature on "mean field-like" approximations to properties of Markov Random Fields on finite graphs, including applications to image processing [47,8,15]. This approach "consists [in] neglecting fluctuations from the mean in the environment of each pixel" [8], effectively replacing the Markov dependence by independence between pixels. In the context of point processes, the corresponding approach is to approximate a Gibbs point process by a Poisson process.
The mean field approximation can be derived by a variational argument which we do not recapitulate here. In brief, the true distribution P of the point process is to be approximated by the distribution Q of a stationary Poisson process with some intensity λ. In the mean field approximation, λ is chosen to minimise the Kullback-Leibler divergence K(Q P ) = E Q log(dQ/dP ). After some calculation this yields the following. Definition 2. Consider any stationary Gibbs process with intensity λ and Papangelou conditional intensity λ(u; X). The mean field approximation of λ is the solution λ mf of λ = exp(E Pois(λ) log λ(0; X)) provided a solution exists.
This yields a result of similar form to (15), For the Strauss process, Γ = −ω d r d log γ. In two dimensions, Γ = −πr 2 log γ giving λ mf as the dashed curve in Figure 1.
For the true value of λ we have by Jensen's inequality by Campbell's formula [11, p. 163]. This expectation is the same for all processes with intensity λ, including the Poisson process, so we have so that λ mf ≤ λ.
Note: We conjecture that λ ≤ λ ps under the assumptions of Theorem 2, which is strongly suggested by the plots in Section 5.
Next consider the behaviour of these approximations in the limiting case that corresponds to a hard core.

Theorem 3. Consider a stationary pairwise interaction process in R d with conditional intensity (3) with pairwise interaction of the form
where θ ≥ 0 is a parameter and V a nonnegative real-valued function. Define Proof. The assumptions imply that G(θ) and Γ(θ) are finite for 0 ≤ θ < ∞.
Since G(θ) and Γ(θ) are increasing functions of θ, λ ps and λ mf are decreasing functions of θ.

Numerical examples
The accuracy of the approximations was assessed in the case of the stationary Strauss process. All computations were performed in the R language [38]. The Lambert function W was evaluated by the function lambert W0 in the R package gsl, an interface to the GNU Scientific Library [16]. Computing one value of λ ps or λ mf took about 10 microseconds on a 3GHz dual core laptop. For comparison, about 2 milliseconds were required to compute the same values using a generic root-finding algorithm such as the R function uniroot, which is based on the Netlib algorithm zeroin [7,6].
The true intensity λ was estimated by Monte Carlo methods. For each parameter value, 10, 000 independent realisations of the model were generated by a coupling-from-the-past (CFTP) algorithm [4] run with the same parameters on the square [−2r, 1 + 2r] 2 and then clipped to the unit square. The number of points in each realisation was averaged to obtain the estimated intensity and its standard error. Source code for the CFTP algorithm was kindly provided by Kasper Klitgaard Berthelsen. It is available as a function rStrauss in the contributed R package spatstat [2]. Figure 2 shows a numerical example of the intensity approximation. The Strauss process with β = 100, r = 0.1 was studied for different values of γ. Solid lines show the analytic approximation λ ps of (15). Circles show Monte Carlo estimates of the exact value λ computed for 19 values of γ. Each Monte Carlo estimate in Figure 2 was the sample mean of 10,000 simulated values of the number of points in the unit square. Figure 2 shows that the approximation can be surprisingly accurate. The maximum relative error is about 15% (and of course this occurs close to γ = 0). Figure 3 shows two more examples. Computation times for the CFTP algorithm depend greatly on the parameter values. In our examples the computing time per realisation (including evaluation of the mean number of points) ranged from about 2 milliseconds per realisation when γ = 1 to about 170 milliseconds per realisation for a Strauss process with β = 100, γ = 0 and r = 0.1. These times must be multiplied by the number of simulated realisations, typically at least 1000. Thus, the Poisson-saddlepoint approximation is 5 to 7 orders of magnitude faster than MCMC estimation using an exact simulation algorithm, in these examples.
The CFTP algorithm is not practicable for all parameter values. Its computation time increases rapidly with β and r: for example, when β = 100, γ = 0 and r = 0.15, the mean computation time is already about 60 seconds per realisation. The tail of the distribution of computation time also becomes heavier. In such cases, one must fall back on non-exact simulation algorithms such as birth-death-shift Metropolis-Hastings [19,31] for which the computation time is controlled, but which exhibit a small bias.
Note that our CFTP Monte Carlo estimates of λ may also be slightly biased, since they were not obtained by simulating the stationary Strauss process, but by simulating the finite Strauss process on [−2r, 1 + 2r] 2 and restricting the realisations to [0, 1] 2 . As a check, we simulated the hard core process on a larger window [−m, 1 + m] 2 where m > 2r, again restricting the realisations to [0, 1] 2 . The resulting estimates of λ, shown in Table 1, suggest that the bias is small.

Discussion and Conclusions
We proposed an approximation to the intensity of a Gibbs point process. We believe this approximation is new. It may already be known in the statistical physics community, but we are unaware of any relevant literature. The idea was described briefly by the first author in [1, p. 360] and independently in unpublished work by Hahn [21]. An analogous approximation for the mean vacancy of a hard core disc model is described by Bondesson and Fahlén [5].
The self-consistency equation (11) is analogous to the self-consistency (or 'closure') equation which leads to the Percus-Yevick [37] approximation to the pair correlation function. In a sense our approximation is the counterpart, for first moments, of the Percus-Yevick approximation for second moments.
The new approximation was only evaluated for pairwise interaction processes of purely inhibitory type, i.e. those for which g(·) ≤ 1. It would be of interest to explore other pairwise interaction models such as the Lennard-Jones [27] interaction. Simulation of these models is more computationally-intensive, and they will be studied elsewhere.
The new approximation was defined for a general Gibbs process, but computed only for pairwise-interaction processes. For other Gibbs processes, the existence and uniqueness of the solution to (11) needs to be verified. At least for models with conditional intensity of the form λ(u, x) = exp(−θV (u, x)) this is relatively straightforward, as it reduces to determining certain properties of the moment generating function of V (0, X) for a Poisson process of intensity λ, as a function of λ. However, algorithms for numerical solution of (11) may in general require Monte Carlo simulation of V (0, X).
Possible uses for the new approximation (15) were mentioned in the Introduction. These will be canvassed in a sequel paper. Most of the interesting examples involve an extension of the approximation to the non-stationary case. The non-stationary analogue of (14) is a straightforward integral equation. However, numerical solution of the integral equation requires additional techniques, which we discuss elsewhere.
In statistical physics the mean field approximation is known to be inadequate for studying phase transitions of molecular gas models, because its asymptotic behaviour is qualitatively different from the true asymptotic behaviour of the models. This is reflected in Figure 1 in which, as the interaction parameter γ tends to zero, corresponding to a hard core process, the mean field approximation to the intensity λ converges to 0, while the true intensity is manifestly non-zero. Our approximation to the intensity shares the same qualitative behaviour as the true intensity, and so could be useful for qualitative study of Gibbs models.
The mean field approximation can be justified by a variational argument, outlined at the start of Section 4. We have not been able to find a corresponding justification for our new approximation. Perhaps one could be found using the specialised variational calculus for Poisson processes [30].