Large-scale modelling and forecasting of ambulance calls in northern Sweden using spatio-temporal log-Gaussian Cox processes

To optimally utilise the resources of a country's prehospital care system, it is crucial to spatiotemporally forecast spatial regions and periods with an increased risk of seeing a call to the emergency number 112. Such forecasts allow the dispatcher to make strategic decisions to direct unoccupied ambulances. In addition, simulations based on forecasts may serve as the starting point for different optimal routing strategies. In this paper, we study a unique set of Swedish spatiotemporal ambulance call data. The spatial study region is the four northernmost regions of Sweden and the study period is January 1, 2014, to December 31, 2018. The non-infectious disease nature of the data motivated us to employ log-Gaussian Cox processes (LGCPs) for the spatiotemporal modelling and forecasting of the calls. To this end, we propose a K-means based bandwidth selection method for the kernel estimation of the spatial component of the separable spatiotemporal intensity function. The temporal component of the intensity function is modelled by Poisson regression and the spatiotemporal random field component of the random intensity of the LGCP is fitted using Metropolis-adjusted Langevin algorithm. A study of the spatiotemporal dynamics of the data shows that a hot-spot can be found in the southeastern part of the study region, where most people in the region live and our fitted model captured this behaviour quite well. The fitted temporal component of the intensity functions reveals that there is a significant association between the expected number of calls and the day of the week as well as the season of the year. In addition, non-parametric second-order spatiotemporal summary statistic estimates indicate that LGCPs seem to be reasonable models for the data. Finally, we find that the fitted forecasts generate simulated future spatial event patterns which quite well resemble the actual future data.


Introduction
To obtain a desired outcome of the prehospital care in a given country, which includes keeping the mortality low, the ambulance response times should be as small as possible (Blackwell and Kaufman, 2002;O'keeffe et al., 2011;Pell et al., 2001). Ideally, any given country would have the financial means to let the number of active ambulances, at any given time of the day and in any given region, be so large that the higher quantiles of the empirical response time distribution would be very small. Unfortunately, this is not/rarely the case and many countries struggle to make the most of their existing resources. Even Sweden, for example, which is a prime example of a well-fare state with large resources at hand, struggles with making its state financed prehospital care meet different desired response time goals; in 2018, the Swedish prehospital had about 660 ambulances that responded to approximately 1.2 million calls per year, and cost more than 4 billion SEK annually. Some of the challenges that the Swedish prehospital care is facing include an ageing population, an increasing population and urbanisation. As an effect, in particular in northern Sweden, there are large remote rural areas with few inhabitants, mainly elderly people, which is also one of the groups of the Swedish society with the highest medical care consumption, who need to have access to prehospital care quite quickly. Since the (specialised/academic) hospitals of Sweden mostly are located in larger towns, which often are far from the rural areas in question, given a set of ambulances within a region, one should optimally place the ambulance dispatch centres such that the response times will not be too large for any inhabitant of the region in question. To be able find solutions to these issues, it is crucial to understand the evolution of the spatio-temporal risk of the occurrence of an ambulance call, at any given time and place. However, given the discussion above, it should be clear that the ambulance/emergency alarm call frequency throughout Sweden is both quite complex and dynamical, with ever changing conditions in both space and time.
The data at hand, which consists of the emergency alarm calls in the four (northernmost) Swedish regions Västerbotten, Norrbotten, Västernorrland and Jämtland-Härjedalen, represents a typical case of a spatio-temporal point pattern (Baddeley et al., 2015;Diggle and Gabriel, 2010;Diggle, 2013;González et al., 2016), i.e. it is of the form {(s i , t i )} n i=1 where s i is the spatial (gps) location of the ith call and t i is the associated call time (here obtained as the date). Although there is a vast literature on prehospital care optimisation (see e.g. Aringhieri et al. (2017) and the references therein), surprisingly very little effort has been spent on actually modelling the calls by means of point processes, which are the models used for point patterns. Some attempts have been made (Zhou et al., 2015(Zhou et al., , 2016 but, based on our understanding, the proposed approaches did not justify their model choices on the basis of non-parametric point pattern analyses so it remains unclear how well the proposed models are generally suited for this type of data. The primary goal of this work is to describe the spatio-temporal dynamics of the calls with the main focus of the study being to identify hotspot-regions within the spatial study region as well as generating a short-term forecast-model that allows us to simulate realistic future emergency alarm call locations, which e.g. may be used to optimally allocate prehospital resources and design prehospital strategies, such as ambulances dispatching strategies. Since the location and time of each emergency alarm call are known, a spatio-temporal point process provides the right framework to model the space-time dynamics in the emergency alarm call data. As will be verified by our non-parametric analysis, due to the data's non-infectious epidemiological nature, inhomogeneous log-Gaussian Cox processes (LGCPs) (Møller et al., 1998) tend to be particularly suited to model the space-time dynamics of the emergency alarm calls -LGCPs take spatial and/or temporal clustering/aggregation into account. Moreover, through the LGCP framework we may do spatial forecasting for future time periods and, consequently, simulate "realistic" future emergency alarm scenarios. Following Diggle (2013), the space-time intensity function of the LGCP considered is expressed as a product of two deterministic baseline functions and one stochastic log-Gaussian random field component. The deterministic components represent purely spatial and purely temporal variations while the stochastic component deals with the unexplained space-time variation; the employed approach is semi-parametric in the sense that the spatial baseline function is taken to be a spatial density estimate whereas the temporal one is taken to be a temporal intensity estimate which is based on temporal covariates. The spatial density estimate, which is a normalised spatial intensity estimate (Cronie and Van Lieshout, 2018;Moradi et al., 2019;van Lieshout, 2012), is obtained through kernel intensity estimation and we here propose a new clustering approach to select the associated bandwidth. In the case of the temporal intensity estimation, we employ a Poisson regression model which is based on temporal covariates such as "day of the week". To fit the stochastic component of the model, minimum contrast estimation (Baddeley et al., 2015;Diggle, 2013;Moller and Waagepetersen, 2003) is the natural way to proceed. Finally, the forecasting is carried out by exploiting the Ornstein-Uhlenbeck approximation of Brix and Diggle (2001).
The structure of the article is as follows. Section 2 briefly presents the ambulance call data and Section 4 provides an overview of the statistical methods used in this work. We present the results of the study in section 5 and discuss the implication of the results as well as provide a precise summary of the work in section 6.

Data
The spatio-temporal point pattern {(s i , t i )} n i=1 under study consists of n = 444283 events, i.e. ambulance/emergency alarm calls, and for each event we have recorded its spatial (gps) location s i ∈ R ⊆ R 2 as well as its date of occurrence, t i ∈ T . The temporal domain, T , of the spatio-temporal data ranges from January 1, 2014 to December 31, 2018.
The left-hand panel in Figure 1 displays the spatial locations of the calls while the right-hand panel presents the call count over time. From the left-hand panel of the figure, we can see that the calls are unevenly distributed over the four northern regions (Norrbotten, Västerbotten, Västernorrland, and Jämtland-Härjedalen) of Sweden. In addition, as expetced, the calls tend to be located in populated areas, which in turn lie along the road network -this will make the semi-parametric statistical inference quite challenging. For data privacy issues, the axis labels of the left-hand panel in the figure have been obtained by dividing the xy-coordinates of the alarm call locations by 1000. Note that since we will use spatio-temporal point processes for the modelling, we will model daily accumulated data as if they were in fact observed on a temporal continuum, i.e. on the interval between midnight and midnight on these two dates, we will in effect treat the calls occurring during a given day as though they were uniformly distributed over that day.

Point process preliminaries
A (simple) spatio-temporal point process (STPP) (Diggle, 2013;Diggle and Gabriel, 2010;González et al., 2016) is essentially a random countable subset of R 2 × R such that, with probability one, i) the number of points of within any bounded (Borel) subset A of R 2 × R, denoted N(A), is finite, ii) the cardinality N({(s, t)}) ∈ {0, 1} for any (s, t) ∈ R 2 × R, and iii) we equip the temporal domain R with the natural complete ordering of the real numbers. Moreover, throughout we will consider the STPP obtained as the restriction of an STPP in R 2 × R, where R ⊆ R 2 is a bounded spatial domain and [0, T ] ⊆ R is a bounded temporal domain. We will interchangeably use the term event for both a point (s i , t i ) of and the actual event that took place in the application to give rise to a space-time measurement (s i , t i ). Hence, in the case of our ambulance call data, an event is (the location and the time of) a call made to the dispatcher/emergency centre, which resulted in the dispatch of an ambulance. The events are indexed in accordance with their time of occurrence.

Intensity functions
Provided that it exists, the intensity function of an STPP governs its univariate properties and is defined as where s ∈ δs ⊆ R 2 and t ∈ δt ⊆ R, and |δs × δt| denotes the size of δs × δt. Heuristically, since our STPPs can have at most one event at any location, we interpret λ(s, t) at the density of the probability that the STPP has an event in an infinitesimal neighbourhood of (s, t). Whenever λ is a non-constant function we say that the STPP is inhomogeneous, and otherwise we say that it is homogeneous. Similarly, we may study joint occurrences of points of the STPP by considering the intensity function λ 2 of the point process formed by distinct pairs ((s i , t i ), (s j , t j )) ⊆ (R 2 × R) 2 , i j, of points of the STPP: which is referred to as the second-order intensity/product density function. Here |δs × δt| and |δs × δt | denote the volumes of neighbourhoods δs × δt and δs × δt of the spatio-temporal locations (s, t) and (s , t ), respectively. Note that λ 2 ((s, t), (s, t)) = 0 for any (s, t) and λ 2 ((s, t), (s , t )) is interpreted as the density of the joint probability that the STPP has events in infinitesimal neighbourhoods of (s, t) and (s , t ).

Poisson processes
A Poisson process, often playing the role of a benchmark which represents the case of no spatiotemporal interaction, is such that the event counts in disjoint subsets of the spatio-temporal domain are independent and the random number of events found in any subset of the spatio-temporal domain is Poisson distributed. If is a Poisson process with intensity function λ, then N(A) follows a Poisson distribution with mean A λ (s, t) dsdt for any subset A of the spatio-temporal domain. Due to the independence property of a Poisson process, we have that its second-order intensity function satisfies λ 2 ((s, t), (s , t )) = λ(s, t)λ(s , t ).

Intensity reweighted second-order moments
One can easily make the mistake of thinking that λ 2 ((s, t), (s , t )) being large means that there is (strong) dependence between points of the underlying STPP located around (s, t) and (s , t ). For instance, in the case of a Poisson process, if λ(s, t) and/or λ(s , t ) are/is large enough then we also obtain that λ 2 ((s, t), (s , t )) is large, despite the fact that there is independence between all points. Hence, to study actual dependence, we have to scale away the individual intensity contributions and this results in the pair correlation function (Diggle, 2013;Gabriel and Diggle, 2009;González et al., 2016), which is one of the most commonly used summary statistics for studying second-order dependence structures. It is defined as and we see that for a Poisson process with intensity function λ we have that g((s, t), (s , t )) = 1. Hence, if the pair correlation function of an arbitrary STPP satisfies g((s, t), (s , t )) > 1 then there is clustering/aggregation between points located around (s, t) and (s , t ). Reversely, g((s, t), (s , t )) < 1 instead indicates inhibition. Note that the so-called covariance density (Diggle, 2013) is given by c((s, t), (s , t )) = λ 2 ((s, t), (s , t )) − λ(s, t)λ(s , t ) = λ(s, t)λ(s , t )(g((s, t), (s , t )) − 1), which is 0 for a Poisson process. It is further common to consider cumulative versions of pair correlation functions, so-called Kfunctions. Originally defined by Ripley for the spatial homogeneous case, it has been extended in a variety of directions. Baddeley et al. (2000) considered the spatial inhomogeneous case and this was extended to the spatio-temporal case by Gabriel and Diggle (2009), which in turn was further studied by Møller and Ghorbani (2012); Gabriel (2014). Versions for marked (spatio-temporal) point processes have been treated by Cronie and van Lieshout (2016); Iftimi et al. (2019). Inhomogeneous K-functions, both spatial and spatio-temporal, have been used in several application areas, see e.g. Diggle et al. (2007); Diggle and Gabriel (2010); Law et al. (2009);Arbia et al. (2012); Gabriel and Diggle (2009). Inhomogeneous K-functions are defined under the assumption of second-order intensity reweighted stationarity (SOIRS), which may defined as i.e. the pair correlation function only depends on the separation vectors, in combination with the intensity function being positive. Note that SOIRS is only well-defined for STPPs on R 2 × R, i.e. not for STPPs on bounded regions, since SOIRS is based on Euclidean shifts and this is the reason why we consider restrictions of the kind (1) rather than defining our STPPs directly on R × [0, T ]. Gabriel and Diggle (2009) further defined SOIRS (with isotropy) as (additionally) having g((s, t), (s , t )) = g( s − s , |t − t |), where · denotes the Euclidean norm, and considered the spatio-temporal K-function where r ≥ 0 and t ≥ 0 are the spatial and temporal lags, respectively. We may think of (3) as reflecting the expected number of further points within spatial distance r and temporal distance t of an arbitrary point, having scaled for the varying intensity (if the intensity is high in a region then we scale down the contribution from that region). For an inhomogeneous spatio-temporal Poisson process with intensity function λ(s, t) we have that K (r, t) = 2πr 2 t and we may use this as a benchmark: if K (b, d) − 2πb 2 d > 0 we have indications of clustering for spatial lags less than r and temporal lags less than t, and K (b, d) − 2πb 2 d < 0 instead indicates inhibition/regularity. We finally point out that there also exist higher order spatiotemporal summary statistics Cronie and Van Lieshout (2015) which may capture interactions beyond pairwise interactions, which is the case for K-functions.

Marginal point processes and separability
We say that an STPP is (first-order) separable if where it is customary to let either λ 0 or λ 1 be a density function. In other words, we assume that the univariate spatio-temporal properties of the STPP are obtained as a spatial scaling of the temporal univariate properties, or vice versa. Although this assumption mainly is suited for data which exhibit a seasonal behaviour (the spatial mass is not redistributed over time), it is often a needed pragmatic working assumption.
According to the definition of a point process, since we consider a restriction in a bounded domain R × [0, T ] (as is always the case in practice), we may also consider the well-defined projected point and it is tempting to believe that they correspond to λ 0 and λ 1 when is separable. However, we have that (Møller and Ghorbani, 2012;González et al., 2016) so only one of the integrals on the right hand sides will vanish if we assume that either λ 0 or λ 1 is a density function. E.g., if λ 0 (s) is a density on R, then In addition, when the event times are discrete and integer encoded, as in the case of our ambulance data, we may also consider the spatial point process time series (t) = {s i : (s i , t i ) ∈ , t i = t} ⊆ S , t = 0, 1, . . . , T . Moreover, denoting the counting function of (t) by N t (B), B ⊆ R, it follows that (t) has intensity functions and, clearly, if is the restriction of a SOIRS STPP then (t) is the restriction of a spatial SOIRS point process in R 2 (in the sense of Baddeley et al. (2000)) and we may consider i.e. its (spatial) pair correlation function and K-function.

Log-Gaussian Cox processes
If we see signs of clustering and the data are such that the spatio-temporal interactions intuitively seem to come from some latent risk, which is the case with our ambulance call data (underlying factors such as weather and time of the day are possible drivers for the variations in the call risk), a reasonable family of models to employ is spatio-temporal log-Gaussian Cox processes (LGCPs) (Diggle, 2013).
Let Λ = {Λ(s, t) ∈ R + | (s, t) ∈ R 2 × R} be a spatio-temporal random field, referred to as the random intensity process, such that (with probability one) A Λ(s, t)dsdt < ∞ for any bounded subset A of R 2 ×R. Its first-and second-order intensity functions are given by λ( . A spatio-temporal Cox process with random intensity Λ is generated by conditioning on Λ = ρ = {ρ(s, t) : (s, t) ∈ R 2 × R} and then, in turn, generating a Poisson with intensity function ρ: A spatio-temporal LGCP is a spatio-temporal Cox process with the property that the logarithm of its stochastic intensity function is a Gaussian process (Møller et al., 1998). Following e.g. Diggle et al. (2005), in the case of an LGCP we specifically have that T ]} is an (unobservable) spatio-temporal Gaussian random field/process with mean and the covariance functions The distribution of Z is completely determined by the functions µ(s, t) and C((s i , t i ), (s i , t i )), and for any ((s 1 , t 1 ), . . . , (s n , t n )) ∈ (R × [0, T ]) n , n ≥ 1, by definition, (Z (s 1 , t 1 ) , . . . , Z (s n , t n )) follows a multivariate Gaussian distribution with mean vector (µ(s 1 , t 1 ), . . . , µ(s n , t n )) and covariance matrix given by the n- ..,n . The spatio-temporal random function Z represents unobservable space-time Gaussian noise that affects the occurrence of events and its inference provides information about spatial clustering of the events. The properties of Z determine the properties of its corresponding LGCP. For instance, if Z is stationary and/or isotropic, then its corresponding LGCP is also stationary and/or isotropic (Møller et al., 1998). Defining σ 2 (s, t) = Var(Z(s, t)) = C((s, t), (s, t)), the first-and second-order intensity functions are given by (Møller et al., 1998) = λ 0 (s)λ 0 (s )λ 1 (t)λ 1 (t ) exp{µ(s, t) + µ(s , t ) + σ 2 (s, t)/2 + σ 2 (s , t )/2 + C((s, t), (s , t ))}.
In this work, we assume that µ(s, t) = −0.5σ 2 and σ 2 (s, t) = σ 2 > 0 so that E[exp{Z(s, t)}] = 1 and we note that the constant variance σ 2 determines the point-wise variability of Z and it scales the log-intensity. This assumption implies that λ(s, t) = E [Λ(s, t)] = λ 0 (s)λ 1 (t), and we obtain that the LGCP is separable (recall Section 3.4). We further have that the deterministic temporal component λ 1 (t) ≥ 0 represents the expected number of events that occur in an infinitesimal neighbourhood of t and when we consider a restriction of the type (1), i.e. when our LGCP is given on R × [0, T ], we let the fixed spatial component λ 0 (s) ≥ 0 be a density, i.e. R λ 0 (s)ds = 1. The above (separable) construction helps to provide separate spatial and temporal meaning to the expectation of the stochastic intensity. We further assume that Z is stationary and isotropic, whereby the covariance function depends only on the separation lags between the evaluation points.
In addition, we assume a separable covariance structure, i.e. the covariance between two points in space-time can be factored into purely spatial and purely temporal components: where · represents the Euclidean norm in R 2 , |·| denotes absolute value, and φ, σ 2 and θ ∈ R + are fixed parameters (to be estimated). The scale parameters φ and θ control the rate of decay of the spatial and temporal correlation functions. We interpret temporal dependence as the covariance between the number of events in the study region at two points in time. Several forms for isotropic spatial correlation functions exist, e.g. the spatial correlation function in exponential and Matèrn covariance functions.
Consulting the first-and second-order intensity functions in (6), we find that the pair correlation function and the K-function of the LGCP with driving covariance function (7) are given by the assumed LGCP is SOIRS, provided that the intensity is positive on R 2 × R.

Statistical methodology
We next turn to the statistical inference based on , which is the restriction of the kind (1) of the LGCPs proposed in Section 3.5.

Spatial kernel intensity estimation
To non-parametrically deal with inhomogeneity of point patterns, there are at least two approaches that can be utilised. The first approach is to use methods of homogeneous point processes (constant intensity function) on subregions of the study region (Allard et al., 2001; while the second approach is to directly deal with the point patterns' inhomogeneity by employing some functional approach which smooths over the whole domain (Cronie and Van Lieshout, 2018;Moradi et al., 2019;van Lieshout, 2012), e.g. kernel estimation.
Here we choose to use Quartic kernel smoothing methods to non-parametrically describe the purely spatial intensity function λ 0 (s), which generates a spatially smooth estimate of the local intensity of events. More precisely, we consider the kernel 2} and the estimator of the purely spatial intensity function is be given bŷ To incorporate edge correction into the spatial intensity estimate, the methods described in Diggle (1985) and Berman and Diggle (1989) may be applied. Note, however, that in the case of our ambulance data (recall Figure 1 and a map of Sweden and its border), edge effects may not actually be very present.
Although it is often claimed that the choice of kernel used is irrelevant (in comparison to the bandwidth choice), this is not entirely true (Cronie and Van Lieshout, 2018) and we note, in particular, that kernels with bounded supports, such as the Quartic kernel, may generate large areas in the study region where the intensity estimate is 0, as opposed to using e.g. the Gaussian kernel which has unbounded support. The upside is that we, as in the case of our ambulance data, may better control where we place the mass of the estimated intensity. The issue that we are facing in the context of SOIRS models, however, is that our estimated intensity will not be positive everywhere, which is one of the conditions of SOIRS.
Here we choose to ignore that latter fact and simply proceed with the Qaurtic kernel.

Bandwidth selection
As previously indicated, selection of an appropriate bandwidth, h, which controls the smoothness of a kernel estimate, is the main challenge in kernel smoothing. In optimal bandwidth selection, we need to optimise the balance between under and over-smoothing of the kernel smoothing method. In the context of point processes, we are also faced with the additional challenge that the points of the underlying point process may be dependent. A range of different methods have been proposed in the literature (see e.g. Baddeley et al. (2015); Davies et al. (2018) for overviews), and most noteworthy are perhaps the recent method of Cronie and Van Lieshout (2018) and the Poisson processes likelihood cross-validation method Loader (1999).
We here take on an approach which is different than the point process derived methods mentioned above, namely to exploit the idea of K-means clustering. K-means clustering is a method commonly employed in machine learning contexts, which is used to identify groups or clusters of data points in a multidimensional space. Let {s 1 , . . . , s n } be a spatial data set consisting of n observations in a 2-dimensional Euclidean domain R. The data set is partitioned into K ≥ 1 clusters and each cluster is composed of data points whose inter-point distances are smaller than their distances to points outside of the cluster. After taking clustering information into account, the average of the variances of the clusters can be taken as an optimal estimate of the bandwidth. We have summarised this approach in the algorithm below.
3: Update the centroids of the clusters (the mean locations of the clusters): 4: Repeat steps 2 and 3 for v = 1, 2, . . ., 5: Denote the obtained centroids of the clusters byˆ k , k = 1, . . . , K. 6: Compute an optimal selection of the bandwidth: A few things need to be highlighted here. The Cronie and Van Lieshout (2018) method and the Poisson processes likelihood cross-validation method are both based on integral expressions, where the integrals run over the whole spatial domain R. Since essentially all of the ambulance calls have spatial locations which are close to the Swedish road network (see Figure 1), we have big holes in R where there are no events. This is exactly the scenario where the aforementioned methods tend to not work well (Cronie and Van Lieshout, 2018). The K-means clustering idea, on the other hand, operates on a more local scale and thus works better for our ambulance data. Note that the K-means clustering idea, similarly to the Poisson processes likelihood cross-validation method, does not fully compensate for dependence and, consequently, densely packed groups of events are interpreted as regions with high intensity function values, when they in fact may be the results of aggregation; e.g., in the case of an LGCP we are essentially trying to reconstruct the random intensity function rather than estimating the actual intensity function. Moreover, we have to choose the the smoothing parameter K by means of visual inspection but this we found to be doable in the case of the ambulance data (see Section 5 for details).

Poisson regression based temporal intensity modelling
Recall from Section 3.5 that, by assumption, R λ 0 (s)ds = 1 so the temporal component λ 1 (t) coincides with λ T (t), which reflects the expected number of events in the spatial study region at time t. Following Diggle et al. (2005), it seems to make sense to use a Poisson regression approach to model the temporal intensity component of the stochastic intensity function Λ(s, t). Day of the week, season of the year, and the periodic nature of the data can be used as covariate information for the modelling of the expected number of events, i.e. the emergency alarm calls, over time. For instance, the effect of seasons on the expected number of calls over the study region can be understood from the regression model. Let N R (t) be the expected number of emergency calls in R at time t; under a Poisson distribution assumption, the mean and variance of N R (t) are equal and here given by λ 1 (t). The obtained Poisson regression model is given by γ j 1{sn (t) = j}+β 1 sin (τt)+β 2 cos (τt)+β 3 sin (2τt)+β 4 cos (2τt)+δt, where {α 0 , α 1 , · · · , α 7 , γ 1 , · · · , γ 4 , β 1 , · · · , β 4 , δ} are the set of parameters, d (t) is the day of the week at time t, sn (t) is the season of the year at time t, δ is the trend parameter, τ = 2π 365 , and 1{d (t) = i} and 1{sn (t) = i} are indicators of the day of the week and the season of the year at time t. The days of a week, from Monday to Sunday, have been numbered ranging from 1 to 7 while the four seasons of a year, Spring, Summer, Fall, and Winter, have been numbered ranging from 1 to 4, respectively. The annual periodicity of the events can be taken into account in the model setting through the parameters β i , i = 1, . . . , 4.

Non-parametric second-order summary statistic estimation
In the current context of LGCPs and their statistical inference, second-order summary statistics and their non-parametric estimators play a crucial role. Not only do they help in the non-parametric estimation of the underlying spatio-temporal dependence structures of the data, but they also play a crucial role in the minimum contrast-based parameter estimation for the latent Gaussian field Z in an LGCP's random intensity function.
There are some issues that may arise in the estimation of spatio-temporal inhomogeneous K-functions or pair correlation functions. First off, they contain the true, unknown intensity function, but this we may replace by an intensity estimate (recall the previous sections). However, such a replacement should be done with caution, as it often adds bias to the estimation (Baddeley et al., 2000). This may for instance be the case when we estimate both a varying intensity function and a second-order characteristics using the same point pattern. We can overcome such problem by modelling the intensity using explanatory variables (Diggle et al., 2007;Mrkvička et al., 2014). Another way of dealing with the problem is to model the intensity parametrically as suggested by Arbia et al. (2012) and Diggle et al. (2007); e.g., Liu et al. (2007) and Moller and Waagepetersen (2003) have used parametric models to model intensity functions.
A further problem is related to edge effects, i.e. the influence of unobserved events outside the study region on the estimation of spatio-temporal interaction structures. In comparison to the purely spatial setting, edge effects are more difficult to overcome here where the dimension is greater than two (Baddeley, 1999). Various edge-correction methods have been proposed in the spatial setting (see e.g. Baddeley (1999); Cressie (1993); Diggle (2003); Illian et al. (2008); Law et al. (2009);Li and Zhang (2007); Ripley (1988); Goreaud and Pélissier (1999); Haase (1995); Pommerening and Stoyan (2006); Yamada and Rogerson (2003)) and in the spatio-temporal context, Gabriel (2014) has extended three classical spatial edge-correction factors to the spatio-temporal setting while Cronie and Särkkä (2011) proposed methods for parametric models.
Following Gabriel and Diggle (2009);Gabriel (2014) and assuming SOIRS with isotropy as in Gabriel and Diggle (2009), we employ the unbiased non-parametric estimator of the K-function in equation (3). Here the notation 1{·} is used for indicator functions, ω R (s i , s j ) is Ripley's edge correction factor in R (Ripley, 1977) and ω T (t i , t j ) is the one-dimensional analogue of Ripley's edge correction factor in [0, T ]. Next, recall the point process time series (t), t = 1, . . . , T , in Section 3.4. Assuming separability and following Davies and Hazelton (2013), we may consider the purely spatial and temporally averaged non-parametric pair correlation function and K-function estimatorŝ where ω R is Ripley's isotropic edge correction with respect to the region R, κ s is a univariate smoothing kernel with bandwidth h s , and |R| is the area of the spatial region R. Note that the above forms are derived from the fact that the expected number of events with (integer) event time t is given by λ 1 (t) R λ 0 (s)ds.
For the bandwidth selection in the pair correlation function estimation in (10), we use Stoyan's rule of thumb (Stoyan and Stoyan, 1994). Note that, in practice, in each of the estimators above we have to plug in an estimate of the intensity function.

Parameter estimation for the Gaussian random field parts
We next turn to the statistical inference for the whole LGCP model in Section 3.5. The key here is to note that we estimate λ 0 = {λ 0 (s) | s ∈ R} and λ 1 = {λ 1 (t) | t ∈ [0, T ]} (recall Section 3.4) separately, non-parametrically and parametrically, respectively, so we are only left with estimating the covariance functions of the assumed LGCP, which are governed by the parameters σ 2 , φ, and θ. In other words, we are here only concerned with estimating the parameters of the latent Gaussian random field Z in the formulation of the random intensity of the LGCP in Section 3.5.
It is common practice to exploit a regular grid discretisation {(s i , t i )} m i=1 ⊆ R × [0, T ] of the study region (Møller et al., 1998;Brix and Diggle, 2001;Diggle et al., 2005), which yields a discretised version z = {z(s i , t i )} m i=1 of the Gaussian process Z over the grid (recall Section 3.5). Conditionally on Z, the number of events located within the grid cell centred at (s i , t i ) may be treated as an observation of a Poisson random variable x(s i , t i ). Assuming that the stochastic intensity function Λ is constant over the cells (in practice this may approximately be achieved by assuming that the cells are small), conditionally on the collection follows a multivariate Poisson distribution. Note that vector-equivalents may be obtained by ordering the cells. The event times of our ambulance call data are recorded daily and we have thereby integer encoded them. Hence, we may consider three-dimensional time slices x t = lexo{{x(s i , t)} m i=1 }, t ≥ 1, which are column vectors obtained by lexicographical ordering (lexo) of the cell-aggregated data at time t. Here the short notation x 1:t represents the cell-counts at time slices 1, . . . , t. The notations z t and z 1:t , which pertain to the discretisation of the random field Z, are interpreted in a similar manner as the notations x t and x 1:t .
The natural starting-point for parameter estimation is maximum likelihood estimation. However, in general, the likelihood of an LGCP is analytically intractable since it is expressed as a high-dimensional integral with respect to the distribution of the unobserved Gaussian process Z. Under the current parametrisation, the likelihood of the LGCP can be expressed as T ] is an observed spatio-temporal point pattern. In principle, importance sampling can be used to carry out maximum likelihood estimation of LGCP (Geyer, 1994;Møller et al., 1998). However, in order to use importance sampling for maximum likelihood estimation of the LGCP, we need to sample from the conditional distribution of the whole of z 1:m , given all the observations x 1:m , and this is unfortunately infeasible in big data settings such as the current one (Brix and Diggle, 2001).
Instead, the common approach is to carry out minimum contrast estimation, based on summary statistics. Here, the essential idea is to minimise a functional distance between the parametric form of a functional summary statistic, which has been derived from the assumed model (here the LGCP-structure in Section 3.5), and a non-parametric estimate of the summary statistic in question. Within the current set-up, the spatial and temporal components are estimated separately.

Minimum contrast estimation of the spatial components
We start by dealing with (σ 2 , φ), i.e. with the variance and the parameter(s) of the spatial part of the separable covariance function (7). More specifically, they are estimated according to (σ 2 ,φ) = arg min where U s constitutes a fine and evenly spaced sequence 0 ≤ r min < r 1 < · · · < r max with ∆r = r i+1 − r i , ϑ(·) is an optional lag-dependent weight, S is the model-based parametric form of the summary statistic used, S is its non-parametrically estimated counterpart and  [·] is a transformation applied to the functional summary statistics. For the summary statistics we may here use the non-parametric estimators in (10) and their theoretical LGCP-based counterparts (recall Section 3.5 and Section 3.4): note that these are constant in t. In the case of our ambulance data, we specifically employ an exponential correlation function r φ (u) = exp {−u/φ}. Moreover, we have here chosen to use the pair correlation for our minimum contrast estimation.

Minimum contrast estimation of the temporal components
The parameter θ describes temporal dependence between different (spatial point process) components of the series (t), t = 1, . . . , T . More specifically, it determines the covariance between the number of events in the spatial domain R at two points in time. Recalling the count function N t (B), B ⊆ R, of (t), t = 1, . . . , T (see Section 3.4), by Diggle et al. (2005); Davies and Hazelton (2013), and an empirical auto-covariance function is a natural non-parametric estimator for the temporal covariance (12) and it is given byĈ We may now obtain a minimum contrast estimate of θ by findinĝ where v is a temporal lag and v max is a maximum temporal lag. We refer to Davies and Hazelton (2013) for a complete overview of a minimum contrast estimation of covariance function parameters in spatial and spatio-temporal LGCPs. Hence, using expressions (11) and (13), the parameters of the covariance function in equation (7) may be estimated. The centroids of a regular grid cells of the discretised study region and the estimated parameters can be exploited to approximate the covariance matrix Σ by computing its elements using equation (7). By definition, the discretisation z of the unobservable Gaussian process Z on the study region follows a multivariate Gaussian distribution. It follows that simulation can be used to do inference for the unobservable random vector z, by sampling from the conditional distribution of the unobservable random vector given its corresponding observable random vector (Brix and Diggle, 2001). Letting [·] denote the probability distribution of a random quantity, the conditional distribution of the unobserved time series of random vectors z 1:t , given its corresponding observed time series of random vectors x 1:t , satisfies Note that here [x 1:t | z 1:t ] essentially describes the distribution of a (discretised) Poisson process, given that its intensity is obtained through a realisation of a (discretised) random field, and [z 1:t ] represents the unconditional distribution of the (discretised) field. Hence, to infer about z, we need to be able to sample z 1:t from the conditional distribution. We may approximate the right hand side of (14) by where t − v is a small integer, since remote past observations tend to have a negligible effect on the current state. The distribution [z v:t ] can be evaluated using the estimates of the parameters σ 2 , φ, and θ obtained through equations (11) and (13). Finally, recall that in the case of the ambulance data we assume that the spatial correlation function is given by the exponential form r φ (u) = exp{−u/φ}. We here also assume that that the temporal correlation function has an exponential form, i.e. r θ (v) = exp{−v/θ}. Note that the scale parameters controls the smoothness of the underlying (marginal) fields and indicate the predictability, i.e. essentially how close to each other two events have to be "to influence each other".
In the Supplementary material we provide details on the Fast Fourier transform for covariance matrix computation and the Metropolis-adjusted Langevin algorithm for simulation.

Forecasting
In this work, the emergency alarm call locations at the last time point have been reserved and the emergency alarm call locations at the last time point have been forecasted to visually evaluate the practicability of the spatio-temporal log-Gaussian Cox process model. Furthermore, we are aimed at forecasting the emergency alarm call locations beyond the last time point T of data observation.
Using the last time point T , the random intensity function in equation (5) can be rewritten as where ∆ is the number of time units (days in the case of the ambulance data) beyond the last time point T . Let q = MN, where M and N are the number of rows and columns of a discretisation of the spatial domain; see expression (.3) in the Supplementary material. On the extended lattice (with dimension q×q), the time series z ext t , t = 1, . . . , T , is assumed to be a stationary Gauss-Markov process. It follows that this is a discrete-time sample of an Ornstein-Uhlenbeck process (Brix and Diggle, 2001), which solves the stochastic differential equation where θ is q × q invertible real matrix, µ is q-dimensional long-run mean, σ is q × q positive semidefinite matrix, and t , t ≥ 0, is a q-dimensional Wiener process (Jacobsen et al., 1993). The diffusion coefficient σ measures the size of the noise, µ is the asymptotic mean level for the process and θ determines how fast the process reacts to perturbations. Solving (16) and using the stationary Gauss-Markov process property of the process, we have that where ϕ (∆) = exp{−θ∆} and I is the q × q identity matrix. Given the observed data x ν:T , the forecast distribution of z T +∆ can be formulated as and based on equation (17), the expected value of the forecast distribution z ext T +∆ | x ν:T can be obtained as follows: Using the expected values in equations (17) and (18), the variance of the forecast distribution z ext T +∆ | x ν:T may be obtained as A simplifying assumption should be used to estimate the forecast distribution. One of the key assumptions is to assume the parameter matrices θ and σ to be Iθ and Iσ. Taylor et al. (2013) have implemented the estimation of the forecasting distribution under these simplified assumptions. Following this assumption, equations (17), (18), and (19) can be simplified and Monte Carlo estimation can be used to estimate the forecast expectation in equation (18). Let z ext T (i) denote the i th sample obtained during the simulation of the Gaussian random field, which can be taken as a draw from [z ext T | x ν:T ] once the chain has reached stationarity. Using equation (17), we can then generate a draw z ext T +∆ (i) from [z ext T +∆ | z ext T ] given the sample z ext T (i). Considering a burn-in time b, the mean of the samples z ext T +∆ (i), i = b, b + 1, . . . , q, is an unbiased estimator of the expectation E[z ext T +∆ | x ν:T ]. Once we estimate the expected value in equation (18) and predict the expected number of emergency alarm calls by the estimated Poisson regression model, equation (15) can be computed to simulate spatio-temporal inhomogeneous Poisson process.

Spatio-temporal ambulance call data analysis
We have finally reached the point where we apply the above LGCP framework to the spatio-temporal ambulance call data in Section 2. Recall that the main task here is to simulate realistic future emergency alarm call locations, given a future time point (here, a day). A large part of the analysis has been carried out using the R package lgcp (Taylor et al., 2013).

Non-parametric estimation of the spatial component of the intensity
The first thing that we do is to obtain a non-parametric estimate of the spatial intensity component λ 0 (s), s ∈ R, which we do in accordance with Section 4.1. More specifically, we employ the Quartic kernel estimator (8), where we select the bandwidth using the K-means method outlined in Section 4.1.1; the outcome is visually illustrated in Figure 2. To select the number of clusters, K, to use, we have turned to visual inspection; we found that estimates with K > 5 tended to accentuate regions which (visually) have a low spatial intensity too much. In addition, K = 5 was also the choice which gave rise to the simulated forecasted point patterns that most resembled the true data (see the part on forecasting below).

Non-parametric spatio-temporal interaction analysis
Before choosing a particular model to fit to a given point pattern, it is paramount to carry out a nonparametric analysis of spatio-temporal interaction, to better understand the data and to see what type of model may be appropriate for the data. Here the idea is to quantify the interaction (clustering/aggregation or inhibition/regularity) present at/within different inter-point distances. As previously indicated, we here employ the inhomogeneous K-function estimatorK(r, t) in (9), which is compared with the theoretical K-function K (r, t) = 2πr 2 t obtained for an inhomogeneous spatio-temporal Poisson process (with any intensity function), to see whether there is excess spatio-temporal interaction in the underlying STPP. Recall that we plug an intensity estimateλ(s, t) into expression (9), which we here let be given by a product of the spatial intensity estimate obtained in Section 5.1 and a temporal kernel estimate with an Epahnechnikov kernel. Given the scale of the data, the effects of the miss-specification that our data are discrete in time while (9) is, in fact, defined for temporally continuous data are negligible. Figure 2 shows a plot ofK(r, t) − 2πr 2 t, where all values have been scaled by the maximum value of the estimator K(r, t) in order to obtain values between 0 and 1. We see that there is light aggregation present which, we argue, advocates for employing an LGCP-model (a special case of an LGCP is a Poisson process). There is a caveat, however, which is that since we assume first-order separability,K(r, t) reflects both spatio-temporal aggregation and non-separable first-order effects. It should further be noted that there may be stronger temporal interactions present, which operate on time lags smaller than one time unit, i.e. one day. Spatial kernel intensity estimation plot (left) and inhomogeneous K-function estimation plot (right) for the ambulance call data. In the right plot, the x-axis label denotes spatial lags in meters while the y-axis label represents temporal lags in days.

Estimation of the temporal intensity component
We next turn to modelling the temporal evolution of the calls, which is done in accordance with Section 4.2. To fit the Poisson regression model for the temporal component λ 1 (t), we use the iteratively reweighted least squares method and the obtained results can be found in Table 1. We see that our model is neither overestimating nor underestimating as the median deviance residual is close to zero, see Table 1. By looking at the residual deviance we see that the inclusion of covariates in the Poisson regression model generates a better fit as compared to how well we would have performed with a model that only includes the intercept (reflected by the null deviance). This suggests that the fitted model is indeed appropriate. The table also reflects that the daily expected number of calls has a strong relationship with the covariate day-of-the-week and season-of-the-year, e.g. summer and fall. This may indicate that summer and fall, as we have defined them here, stand out while the rest of a year has a similar behaviour.
A plot of the fitted Poisson regression model and the observed number of emergency alarm calls are shown in Figure 3. We can here visually verify that the fitted model has captured the pattern in the observed data quite well. The significance of summer and fall can be visually detected by the small offsets observed in the harmonic behaviour in Figure 3.

Covariance fitting, forecasting and simulation of future ambulance call locations
One of the main aims of forecasting is to obtain a dynamical model which stepwisely takes in new data and outputs updated forecast, i.e. on-line spatio-temporal risk-mapping. Hence, we here leave out the data at the last time point in the fitting of the model (training data) and then generate the forecast model for that time point to simulate realistic "future" spatial event locations.
By employing the minimum contrast approach outlined in Section 4.4, which deals with the estimation of the variance parameter σ 2 , the spatial correlation function r φ (u) = exp{−u/φ} and the temporal correlation function r θ (v) = exp{−v/θ}, we obtain the estimates (σ 2 ,φ,θ) = (4. 933, 3494.705, 0.182). Hence, given the current separable setting, we have indications that there are intraday dependencies at play, given the fitted spatial dependence structure (recall from (13) that the temporal covariance component is fitted conditionally on the estimated spatial one), and the variance parameter estimate indicates that we have more structure than in a Poisson process. At the same time, the fairly large spatial scale parameter estimate indicates that there is quite weak dependence in space.
Next, the fitted/trained model has been used to forecast the Gaussian random field to obtain forecast intensities, which have been exploited to simulate Poisson process realisations. Figure 4 illustrates the observed ambulance call locations (spatial test data) together with a simulated realisation, i.e. simulated call locations. Visually, it seems that the forecasted model quite well picks up on many of the different hot-spots found in the actual data. It does, however, not fully capture a few of the different regions where there are just a few calls, which most likely is an effect of the spatial kernel intensity estimate (see Figure  2). Also, the counts in the simulated realisations are slightly smaller than the what we see in the true data, which may be due to the discretisation used for the random field. On the other hand, it seems to catch the different hot-spots quite well. Taking things a step further, Figure 5 shows simulated realisations beyond the last time point, for the days January 01 (a Tuesday) until January 07 (a Monday), 2019. We see that the spatial event distribution tends to change over time.

Discussion
The main focus of this paper is to study whether spatio-temporal log-Gaussian Cox process (LGCP) models can succesfully model ambulance call data and, in addition, whether the associated spatio-temporal forecasts have the ability of generating realistic simulated future ambulance call scenarios. This is relevant e.g. to design optimal ambulance dispatching rules/strategies. Our rather unique spatio-temporal point pattern dataset under consideration consists of 444 283 ambulance/emergency alarm calls, with associated dispatch addresses and event day recordings, in the four northernmost regions of Sweden over the years 2014 to 2018.
In accordance with e.g. Diggle et al. (2005), we consider an LGCP which consists of three different components: a spatial component, λ 0 (s), a temporal component, λ 1 (t), and a separable log-Gaussian random intensity field, exp{Z(s, t)}. The former two are purely deterministic components which are designed to take purely spatial and purely temporal variation of the events into account. The stochastic component exp{Z(s, t)}, on the other hand, models dependence and spatio-temporal variation of the events. Moreover, computational merit can be obtained using separable covariance structure and exponential form of correlation function for the temporal component (Diggle et al., 2005). In addition, assuming an exponential covariance function for the temporal part of the Gaussian random field Z makes it Markovian in time.
The parameters of the spatial and temporal parts of the covariance function of Z have been estimated using minimum contrast estimation (Møller et al., 1998;Davies and Hazelton, 2013), whereas Quartic kernel smoothing has been utilised to estimate λ 0 (s) for the ambulance data. The bandwidth in the Quartic kernel smoothing has been selected using K-means clustering, which requires selecting the number of clusters to be used during the estimation of the bandwidth. The proposed K-means approach better managed to capture the high concentration of events along the relatively highly populated east coast and the main roads of the region, than did many of the existing (adaptive) intensity estimation approaches (Davies et al., 2018;Moradi et al., 2019;Ogata et al., 2003;Ogata, 2004). The number of clusters has been selected by visually evaluating whether the spatial intensity generated by the Quartic kernel smoothing method shows the spatial variation of the ambulance calls. A Poisson regression model, which incorporates different calendar covariates such as day-of-the week and season-of-the-year, has been adopted to model the temporal variation of the calls and the fitting of the parameters has been carried out using the iteratively reweighted least squares method. We find that the fit is good, and a few covariates are significantly governing the behaviour of the model. An estimate of the inhomogeneous spatio-temporal K-function of Gabriel and Diggle (2009) has been used to verify that an LGCP is indeed a sensible choice of model -there seems to be low/moderate spatio-temporal clustering present in the ambulance data but we hypothesise that there likely is within-day-dependence present (which we cannot quantify here since the temporal sampling of our data is per day). Regarding the computational aspects, a two-dimensional discrete Fourier transform and a fast-Fourier transform have been adopted to reduce the computational cost in simulating the Gaussian random field Z. In addition, the Gaussian random field has been simulated using the Metropolis-adjusted Langevin algorithm. The fitted LGCP model can be utilised to obtain a forecast distribution of the Gaussian random field and, consequently, a Poisson intensity from which we can simulate realistic future event locations. To carry out the forecasting, we have excluded the last time point in the data and carried out forecasting of the spatial structure for this time point. We find that the patterns of the observed and the simulated spatial locations are very similar. However, the number of spatial locations in the simulations are slightly smaller, which may be due to the discretisation used for the random field.
The core challenges here stem from the nature of the current set of spatio-temporal (big) data. First of all, the geographic region is vast and, consequently, the changes in the data vary substantially over the spatial region. This may partly be remedied by splitting up the data set into samples over smaller sub-regions; this will, in effect, have the consequence that the final model will have a spatially varying covariance structure (which seems quite reasonable, given the scope of the data). Additionally, since most of the events are located in regions with a higher population density, having access to relevant spatial, e.g. demographic, covariates (something we are currently working on obtaining) may also improve the fit substantially. A further idea, which likely requires a substantial amount of methodological development, is to transition to linear network point processes (Ang et al., 2012;Baddeley et al., 2015Baddeley et al., , 2020Moradi and Mateu, 2019).
In conclusion, this preliminary study has shown that spatio-temporal log-Gaussian Cox processes are practically viable for the spatio-temporal modelling and forecasting of ambulance calls. Simulations from such forecasts may, in turn, be exploited in optimising prehospital care resources, e.g. by designing ambulance dispatching strategies in an optimal way.

Fast Fourier transform for covariance matrix computation
To implement methods for log-Gaussian Cox processes, it is necessary to discretise the spatial study region R. The most common such discretisation strategy is to construct an approximation grid over an encapsulating rectangle W of the study region since R can be an irregular region in R 2 . Here, the cellcentroids are given by C = s x min + (i − 0.5) ∆ x , s y min + ( j − 0.5) ∆ y | i = 1, 2, · · · m; j = 1, 2, · · · , p , ( where ∆ x is the maximum range spanning R along the x-axis divided by the number of disjoint cells m along the x-axis, ∆ y is the maximum range spanning R along the y-axis divided by the number of disjoint cells p on the y-axes, s x min and s y min are the minimum x-coordinate and y-coordinate of W and the index (i, j) represents the cell-centroids of the regular spatial grid over W. For any given time, the spatially continuous Gaussian process Z can be approximated by a collection of random variables at cell-centroids of the regular spatial grid over W covering the observation window R. This collection of random variables can be ordered lexicographically to obtain a long column vector of random variables z = lexo{{z i, j } m,p i=1, j=1 }. In this way, the simulation of the spatially continuous Gaussian process Z can be approximately translated to the simulation of a high-dimensional Gaussian random vector z. Under this assumption, the stationary Gaussian random field z has mean µ and covariance structure Σ with dimensions mp × 1 and mp × mp, respectively. The (i, j) th entry of the covariance matrix Σ may be obtained as follows: where i, j ∈ I = {1, 2, · · · , mp}, d i j = C z(i) −C z( j) , and r φ is the spatial correlation function with parameter φ. The quantities C z(i) and C z( j) ∈ C are the cell-centroids corresponding to the i th and j th element of z in the lexicographic ordering. A massive computation is required to simulate the Gaussian random field directly even for a coarse regular grid approximation of the study area. Given the discretization, the size of the covariance structure grows dramatically and it is necessary to consider an eigendecomposition of the covariance structure to simulate a correlated multivariate Gaussian random field. A method using a two-dimensional discrete Fourier transform and a fast-Fourier transform to reduce the computational cost in simulating a Gaussian random field has been proposed (Møller et al., 1998;Wood and Chan, 1994). A fast-Fourier transform is a quick evaluation algorithm that helps to obtain the eigenvalues of block circulant matrices at a dramatically reduced computational cost. In its current form, the covariance structure Σ is not a block circulant matrix. However, a fast-Fourier transform requires the matrix to be block circulant. Therefore, circulant embedding has to be performed using an extended encapsulating rectangular grid, which must be at least twice the original size along both axes of the study region. The cell-centroids of the extended encapsulating regular spatial grid of the study region can be given by C ext = s x min + (i − 0.5) ∆ x , s y min + ( j − 0.5) ∆ y | i = 1, 2, · · · M; j = 1, 2, · · · , N , ( where M ≥ 2m and N ≥ 2p. The number of equally-spaced disjoint cells of the extended encapsulating rectangle of the study region is MN. The extended cell-centroid based Euclidean distance definition must correspond to wrapping the extended lattice on a torus and then computing the shortest Euclidean distances to satisfy the block circulant requirement of the covariance structure. Let (u, v) = C ext z ext (i) be an index in C ext corresponding to the i th entry in z ext , which is a column vector obtained by a lexicographic ordering of the Gaussian random fields at the cell-centroids in C ext . Denote the shortest distance on the torus between (u 1 , v 1 ) = C ext z ext (i) and (u 2 , v 2 ) = C ext z ext ( j) by