Bayesian estimation of heterogeneous environments from animal movement data

We describe a flexible class of stochastic models that aim to capture key features of realistic patterns of animal movements observed in radio-tracking and global positioning system telemetry studies. In the model, movements are represented as a diffusion-based process evolving differently in heterogeneous regions. In this article, we extend the process of inference for heterogeneous movement models to the case in which boundaries of habitat regions are unknown and need to be estimated. Data augmentation is used in reconstructing the partition of the heterogeneous environment. The augmentation helps to diminish the impact of uncertainty about when and where the animal crosses habitat boundaries, and allows the extraction of additional information from the given observations. The approach to inference is Bayesian, using Markov chain Monte Carlo methods, allowing us to estimate both the parameters of the diffusion processes and the unknown boundaries. The suggested methodology is illustrated on simulated data and applied to real movement data from a radio-tracking experiment on ibex. Some model checking and model choice issues are also discussed.


INTRODUCTION
based on tractable processes (Blackwell, 1997;Dunn & Gipson, 1977; see Section 2.1 for details) or using more flexible stochastic differential equation models based on potential functions (Brillinger et al., 2001;Preisler et al., 2004). Models and methods developed for the analysis of animal movement may also have other environmental applications, as in for example, the monitoring of fishing vessels to assess their movement and activity (Gloaguen et al., 2015). The goal of this article is to develop a continuous-time framework for analysis of individual animal movements combined with estimating the structure of the spatial heterogeneity underlying those movements, using GPS-based or similar location data.
Animal ecology is fundamentally spatiotemporal, with ecological processes occurring on heterogeneous landscapes . Spatial heterogeneity may occur because of external physical differences across a landscape, for example, different habitat types, or because of internal behavioral responses to location that reflect memory, social interactions, or some kind of strategy in the use of space, for example, a long-term home range or territory. Environmental heterogeneity is built into our movement model by assuming that the landscape consists of a finite set of regions and that the parameters of the movement model are specific to each region. A region may represent an actual "patch" of a habitat type, the presence or extent of which is not known directly, or an area that the animal regards as meaningful but which cannot be observed directly, such as its territory or the core area of its home range. We propose a modeling approach in which location data over time are used to estimate not only the movement parameters but also the unobserved boundaries between regions. Inference for all parameters of interest is performed within the Bayesian approach, because the model used has a complicated structure and because some prior information from ecologists is often available.
Our approach estimates unobserved boundaries from movement data, and so is much more general than the typical case where regions are known, for example, from independent mapping of habitat features, and only the movement parameters associated with them need be estimated (see Blackwell et al., 2016;Harris & Blackwell, 2013). We do however assume that this structured heterogeneity is constant over time; in contrast, Scharf et al. (2019) look at the case where the heterogeneity takes the form of attraction to a physical feature that varies over time but can be directly observed, while Wang et al. (2019) consider the interaction of spatial patchiness and time-varying resources due to depletion by the observed animals. Our approach imposes a more specific spatial structure than is found in, for example, the potential-based models mentioned above; however, our aim is that this structure has biological meaning, increasing the interpretability of the models.
In the case where the heterogeneity is an internal behavioral feature, our approach shares its aims with methods for identifying the core area of an animal's home range that are based on independent observations of location. For example, Wilson et al. (2010) describe a method for constructing the "core area" based on observations forming a Poisson process with a higher intensity inside the core than outside. Their boundary construction is based on kernel density estimation, rather than the parametric form that we use, but they limit their scope to locations that are independent over time, rather than modeling autocorrelated movement.
Our article is organized as follows. Section 2 describes a diffusion model of movement incorporating spatial heterogeneity. Given unknown boundaries, it is natural to consider the observed data as incomplete, lacking observations on the times and locations of transitions across boundaries. Section 3 presents a method for reconstructing the animal paths between observed locations using data augmentation. In Section 4 we describe in more detail the implementation of spatial heterogeneity used in the article. Markov chain Monte Carlo methods are used to fit the model to data and Section 5 briefly examines some aspects of our implementation. We illustrate the methodology with two examples, with simulated data in Section 6 and with GPS data in Section 7, using telemetry data on animal movements to classify the landscape into two regions and estimate the movement parameters within each. The article concludes with a discussion of possible extensions of our method in Section 8.

Modeling movements in a heterogeneous environment
A useful and popular conceptual model for animal movements is a diffusion, a process in continuous time with continuous sample paths that is the solution to a stochastic differential equation. Let X(s) = (X 1 (s), X 2 (s)) be a diffusion describing the location of animal at the time s (see, e.g., Patterson et al., 2016). We consider a flexible class of stochastic models constructed from two-dimensional Ornstein-Uhlenbeck (OU) processes. Recall that X(s) follows an OU process with The OU process serves as a model for animal movement paths in its own right, taking into account high dependency between successive observations. It was first proposed by Dunn and Gipson (1977) and is now a well-established tool for conducting statistical analyses of animal movement data at the individual level Patterson et al., 2016). The OU process has proved to be a workable model for describing movement with physically interpretable parameters; see, for example, Worton (1995) and the review paper of . This model describes a movement pattern having a stationary distribution with elliptical contours, with the locations concentrated in the center. The parameter represents this central attraction point which can have various biological interpretations, such as food or conspecific attraction. The parameter Λ is a positive-definite matrix that determines the spread of locations at which the animal is found. The matrix B controls the strength and form of the centralizing tendency. The definition of the process implies that the matrix B must be stable; Blackwell (2003) notes that the case in which B = I is a diagonal matrix for some scalar < 0 is usually sufficient in practice. An important feature of a diffusion model is that it can incorporate observations that are irregular in time. Though many telemetry observations are made at equally spaced time points, the ability of the model to handle arbitrary observation times is useful in many situations, including accounting for missing observations without need for imputation, allowing comparison or synthesis of experiments on different regular timescales, or dealing with studies where sampling intervals vary by design. See Harris and Blackwell (2013), Blackwell et al. (2016), and Eisenhauer and Hanks (2020) for examples and further discussion.
While the limitations of this model are well recognized, in many cases it is a useful approximation under which inferences can be made about the parameters of the home range. Dunn and Gipson's model was extended by Blackwell (1997). In Blackwell's model, for which inference was described in detail in Blackwell (2003), an individual switches between different OU processes, representing different behaviors. A process in which this switching occurs is called a mixed diffusion.
Animals move differently in different regions, and complex patterns of behavior are driven not only by biological mechanisms but also by spatial heterogeneity. Harris (2007) and Harris and Blackwell (2013) considered an extended mixed diffusion model that incorporates spatial heterogeneity, with the processes that describe movement changing according to location as well as behavioral state. We consider this heterogeneous case; we restrict our attention to the single-behavior model, but the extension to the multibehavioral setting is straightforward. We will focus on the study of the spatial heterogeneity, in particular, how it can influence the movement process, and how in turn the movement can help us learn about the heterogeneity, whether internal or external in the sense described in Section 1. Note that this takes us well beyond the class of "separable" models (Harris & Blackwell, 2013) with known spatial structure, for which inference is addressed by Blackwell et al. (2016). We consider the heterogeneous environment  as a partition containing a finite set of nonintersecting regions R i , with  = ⋃ k i=1 R i . A single pattern of movement is associated with each region, so that the overall diffusion evolves as an OU process with parameters Since we are considering only a finite set of regions, and therefore of movement patterns, any model of this kind is directly related to a model that switches randomly (as in Blackwell, 2003) between those same movement patterns, differing only in that the "switches" are explained by the spatial heterogeneity rather than forming a completely random realization of some underlying Markov chain of "behaviors" that is homogeneous in both space and time. The spatially heterogeneous model, where applicable, will therefore have more explanatory and predictive power than the homogeneous one.
The goal of this article is to develop a method for inference from animal movement that allows use of the model described above in the case where the boundaries between the regions are unknown and have to be estimated from the data on movements. In terms of implementation, this extension adds some extra unknown parameters to those that govern movement. If Ψ i is a suitable parametric representation of the boundaries for region R i , then altogether the following set of parameters is to be estimated:

Likelihood
Let x t = (x 1t , x 2t ) be the location in two-dimensional space R 2 of an animal at time index t. Then all observations of the animal's locations are represented by X = (x 0 , x 1 , … , x n ). For most of the paper we assume that observations are made at equally spaced time points so that in expression (1) is constant and observation times are an integer multiple of the constant sampling interval. However, it is important that the model can straightforwardly handle arbitrary observation times.
The inference for parameter Θ is performed within a Bayesian framework, using the Markov chain Monte Carlo (MCMC) method. Implementation of the MCMC algorithm is described in Section 5. Here we discuss some assumptions and approximations involved in calculation of the likelihood (2) Firstly, the likelihood depends on the assumptions made about the initial observation x 0 . There are several practical approaches to evaluating the marginal density function for the first location p(x 0 |Θ).  suggest that, if the movement model p(x t |x t−1 , Θ) is stationary, then p(x 0 |Θ) can be approximated by p ∞ (x 0 , Θ), the temporal limiting distribution of the movement model, or its weighted version w(x 0 )p ∞ (x 0 , Θ) where the weighting function w(x) depends on the selection of resources available for the animal. Since the OU process is stationary, one can use its limiting (equilibrium) distribution N( , Λ) for calculating p(x 0 |Θ). In the mixed case, Blackwell (2003) proposes to extract the information contained in x 0 by generating the part of the trajectory before time t = 0. In this reverse simulation, the path x −s , … , x −1 is reconstructed and the extra term p(x 0 |x −1 , Θ) can be used in the likelihood. Another approach is to condition on x 0 and simply remove the p(x 0 |Θ) term in (2) (Christ, ver Hoef, & Zimmerman, 2008;. If the time series is long, the loss of information in this approach is small but the complexity of the likelihood is greatly reduced. We use the latter approach because the number of observations in modern studies is usually large. The conditional distribution of the animal's position x t given its position x t−1 , when both locations are in the same region R l , is given by the OU process with parameters Ω l = ( l , B l , Λ l ), making the assumption that the process does not exit the region and return to it between observations, in other words, that there are no unobserved sojourns to different regions. While in some cases this may seem like an oversimplification, it is likely to be accurate when observations are close together in time, either because the actual data are collected at high frequency or given the data augmentation described below.
When x t−1 and x t lie in different regions R l 1 and R l 2 , we need to make some assumptions to calculate the transition density for the boundary-crossing move from x t−1 to x t . Since the movement processes at the start and end of the interval are known, it seems natural to approximate the conditional distribution p(x t |x t−1 , Θ) by a mixture of these two processes. That is, if p OU (x t |x t−1 , Ω l i ) denotes the conditional density of the OU process in the region R l i , then the transition density on an interval including a boundary crossing is approximated by where l 1 , l 2 are defined by x t−1 ∈ R l 1 and x t ∈ R l 2 . The adequacy of this approximation depends crucially on the choice of the coefficient w l 1 l 2 for weighted distributions in (3) and on the length of the time interval between observations. The movement process near the boundary may be complicated; the different habitat types in the vicinity of an animal may influence where and how that animal will move. This means that the optimal choice of weight w l 1 l 2 would depend both on the regions adjoining the boundary and also on the animal's precise location. Choosing this coefficient is not straightforward in the case of unknown boundaries between regions. On the other hand, the contributions of both of the component OU processes to the likelihood can be very important when number of transitions across the boundaries is large.
Here we ensure that the limitations of this approximation do not have a large effect, using an approach based on reconstructing intermediate points on the animal's path. When the animal crosses a boundary, the term p(x t |x t−1 , Θ) will contribute to the overall likelihood some amount of uncertainty due to the approximation (3), and our reconstruction process aims to reduce this effect by ensuring that only a short time interval is affected. As a result, the approximation for the transition density is less influenced by the coefficients for the weighted distributions in (3), and the weight w l 1 l 2 can be chosen flexibly. This approach also reduces the effect of our assumption of 'no unobserved sojourns' which means that all visits to regions are represented by locations; we need only assume that this holds for the augmented paths, not for the actual data. The details of the animal path reconstruction are given in the next section.

RECONSTRUCTING TRANSITIONS BETWEEN REGIONS
Information about the times and locations of transitions across boundaries is generally not available. So when the animal's locations at successive time points fall into different regions, calculation of the conditional distribution p(x t |x t−1 , Θ) necessarily involves some approximation. Our approach is to reconstruct the animal paths at instants between the observed points. Assume x(s) and x(s + ) are two successive animal locations measured at times s and s + . Using the proposed movement model and information contained in the observed locations, we generate locations at intermediate time points s < s 1 , … , s p < s + . These time intervals may be equally spaced or they may become more fine in the vicinity of a boundary. These simulated points z(s 1 ), … , z(s p ) are used as a reconstruction of the animal path between the two measurements. If the animal crosses a boundary between points z(s k ) and z(s k+1 ), say, then in the reconstructed trajectory the approximation of the transition on interval (s, s + ) is replaced by the approximation of transition on the much shorter interval (s k , s k + ). As a result, it helps to diminish the impact of uncertainty about boundary crossings on the likelihood and potentially to extract some additional information from the observations. Unknown animal locations and times of boundary transitions can be considered within the context of measurement imprecision. Frair et al. (2010) in their review note that there are different approaches to reduce the effects of location imprecision including differential corrections, interpolating and smoothing animal paths. Our approach is to interpolate the collected locations by reconstructing animal movements between fixes, sampling repeatedly to allow propagation of uncertainty.

Data augmentation
In reconstructing animal paths between the regions of a heterogeneous environment, we use the data augmentation technique. The observed data X are treated as incomplete and are augmented by latent (unobserved) data Z using a method similar to that suggested by Tanner and Wong (1987). This approach is common in movement modeling, either to add the times (and perhaps locations) of behavioral switches in between observations, for exact fitting of continuous-time models (e.g., see Blackwell, 2003;Blackwell et al., 2016), or to add "true" locations at the times of the existing observations when allowing for measurement error (e.g., . In those examples, and in the present work, the augmentation of the data takes place in conjunction with inference about the model parameters, with each informing the other. Thus the sampled parameters directly incorporate all the information from the data. This contrasts with a two-stage approach where refinement of the trajectories is carried out first, using a simpler movement model, and then inference from the model of interest is carried out separately based on those refined paths. A two-stage approach may be appropriate in some circumstances; see McClintock (2017) and Scharf et al. (2017). for details and discussion. In the present case, a two-stage approach is unlikely to perform well because uncertainty in the boundaries has a large effect on reconstructed paths. Our approach of simultaneous augmentation and parameter inference allows us to extract additional information about the movement of the animal between regions; this could be combined with augmentation to address behavior or observation error, but we omit those factors here for simplicity. Mathematically, the augmentation method makes it possible to express the required posterior density in the form To evaluate this integral, we draw posterior simulations of the joint vector of unknowns (Z, Θ) and then focus on the estimands of interest. The process is implemented in the following iterative scheme where the augmented data Z are treated as variables.
Step 2. Using Θ from Step 1 simulate Z ∼ p(z|x, Θ). Accept or reject the simulated Z.
Accept-reject steps are done by a single-component Metropolis-Hastings algorithm for Θ and an independence sampler for Z. Realization of these steps will be described in Section 5. Iterative sampling results in the set of observed data X being expanded by adding augmented data Z and the full data set is now Y aug = X ∪ Z.
For the imputation Step 2 above we need a method for proposing the latent variable Z. We propose a path between two successive locations x t−1 and x t from a Wiener process (Brownian motion) tied to the observations at the endpoints, that is a Brownian bridge. This process appears to be a sufficiently good representation of the animal movement paths for this purpose, over short intervals. In practice we use a time-discretization approach, proposing values of the path at specific intermediate times using this Brownian bridge.
Samples from the Brownian bridge could be generated sequentially or simultaneously, but, for convenience, the augmentation is carried out recursively. For each pair of observations x t and x t+1 we form a Brownian bridge  t (s), t < s < t + 1 with  t (t) = x t ,  t (t + 1) = x t+1 , and generate its value at the mid-point of the time interval, where the variance 2 is derived from the variances of OU processes in the corresponding regions. This bisection of the time intervals is then repeated by constructing a Brownian bridge { t j (s)}, s ∈ (tj ′ , tj ′′ ) on each of the intervals (tj ′ , tj ′′ ) formed, and generating the value for each one at its mid-point, ) .
This refining algorithm fills in the p t required intermediate points between x t and x t+1 ; we generally take each p t to be of the form 2 i − 1 (i = 1, 2, 3, … ). The (time-ordered) points generated in this way form the augmented data z tj , j = 1, … , p t . For computational stability, unobserved trajectories are reconstructed between all animal locations in X. For each observed pair x t and x t+1 the values of latent data z t,1 , z t,2 , … , z t,p t are generated. We then have the extended data set arrayed in the form

Augmented likelihood
The above scheme provides us with augmented data Y aug = (y 1 , … , y T ) which includes both observed X and unobserved Z components as described in Section 3.1. The "complete-data" augmented likelihood in this notation is where the "complete-data" conditional movement density is given by In the above mixture weights w are taken to be equal across all regions. The finer time-discretization, resulting from the augmentation, makes this approximation acceptable. In practice, we use w = 0.5, assuming that the two OU processes make equal contributions to the movement density in the vicinity of a boundary crossing.

IMPLEMENTING THE MODEL
While the described method is applicable to any finite partition of space, for illustrative purposes we shall restrict ourselves to a simple example. We consider a simplistic study area that comprises two regions: an inner region with a circular boundary where the animal spends most of its time and an outer zone of relatively less intense use. Such a circular model can be seen as an approximation for a patch of habitat that forms a convex region and is treated as homogeneous due to the uniformity of the movement pattern when compared with the whole study area. More generally, it may be a simplified representation of the animal's internal idea of its territory or the core of its home range. This illustrative case is supported by considerations of efficiency due to shape, in a number of contexts. Summarizing properties of different types of landscape patches, Barnes (2000) points out that a circular patch minimizes the amount of edge compared, for example, to a thin, rectangular strip, which has only a narrow band of interior habitat. One example of circular habitat usage is provided by red squirrels. These animals are considered to be territorial in coniferous forests. The article by Gurnell (1984) examining home range and territoriality in red squirrels has indicated that all the observed squirrels patrolled and defended areas of circular or oval shape, centered on large caches and a nest site. Similarly, biological studies suggest (see, e.g., Linn et al., 2007) that the home ranges of elephant-shrews are compact and quite symmetrical, with an inner zone of intense use surrounded by a zone of relatively less intense use and a variable outer zone of sparse use. The symmetrical perimeter of the intensively used area can be approximated by a circle. Space use within this perimeter is strongly symmetrical, with the most intense use at the center. These analyses of range use provide motivation for use of the two-region OU model with a circular boundary to model both movement and use of space. A simplistic circular model is also used as an approximation of the home range for some large carnivores, for example mountain lions (cougars), that have large territories, usually oval or circular in shape (Russell et al., 2012). Finally, if the movement process in the outer region is taken to be Brownian motion (available as a limiting case of the OU process), the radius of the circle can represent the (unknown) maximum range of attraction of the central food source, and so on; such a model will be transient rather than stationary.
Modeling examples of an idealized animal's territory or home range represented by a circle are given in Harris and Blackwell (2013). They provide two simulated examples of a circular region: uniform environment with a central point of attraction within the circle and unused patch of habitat within a home range with a central point of repulsion inside the circle.

IMPLEMENTATION OF THE MCMC ALGORITHM
Often the time interval between collected radio-tracking data is a constant, and thus it is more convenient to use an alternative parameterization of the OU process where is fixed. We use the same reparameterization for diffusion parameters as in Blackwell (2003) and Harris (2007) with Λ and B matrices replaced by where exp(⋅) is the matrix exponential and parameter Φ is the covariance matrix of an observation from an OU process conditional on an observation at a time earlier. To incorporate irregular data, can be taken to correspond to the most frequently occurring interval between observations. This parameterization is more convenient for MCMC simulation and gives us the following vector of unknown parameters for region R i : In the remainder of the article we use both parameterizations depending on the context. As outlined in Section 3.1, after appropriate initialization of Z and Θ the MCMC algorithm alternates between the following steps.
Step 1. Sampling Θ|Z, X (random-walk Metropolis-Hastings) For each parameter ∈ Θ: propose ′ from a proposal distribution centered on ; accept or reject ′ using a Hastings ratio based on the augmented likelihood (5).

Step 2. Sampling Z|Θ, X (independence sampler)
For t in 0, … , n − 1: propose x t+1 from the Brownian bridge of Section 3.1, independently of z t,1 , … , z t,p t ; accept or reject z ′ t,1 , … , z ′ t,p t using a Hastings ratio based on appropriate terms of the augmented likelihood (5).
For our examples, we define the parameterization of boundaries Ψ i by fixing a circular boundary between two regions so that the study area consists of a circular "core" R 1 and a region R 2 =  ⧵ R 1 , outside that core. Since our examples are intended to model cases with the most intense use at the center, we assume that two regions have the same attraction point and both OU processes share a common center: 1 = 2 = .
Blackwell (2003) notes that in practice the isotropic case is usually sufficient, with the matrix B given by I where < 0 is some scalar and I is the identity matrix. The isotropic assumption gives us Γ = I with 0 < < 1. Also, since we model the circular home range, it is biologically natural to assume that matrix Λ = I and an attraction point inside the circle implies 1 > 2 . Parameter Φ, in turn, is reduced to a scalar = (1 − 2 ) > 0 with Φ = I. This parameterization gives us the vector of unknown parameters Θ = ( , 1 , 2 , 1 , 2 , ) where is radius of the circular boundary and ∈ R 2 is the position of the attraction point.
In the examples, we use vague proper priors for all parameters, assuming that there is little prior information about parameters. Details of the priors are given in Web Appendix A, with prior and proposal distributions for each parameter used in the MCMC simulation summarized in Web Table 1. The parameters of the proposal distributions are chosen to give Metropolis-Hastings steps with an acceptance rate that is not too far from optimal values.
To illustrate the described approach and its performance, in the next sections we will consider two examples, one with simulated data (Section 6) and the other with radio-tracking data on movements of an ibex (Section 7).

SIMULATION EXPERIMENT
To demonstrate the benefits of using the augmentation for estimation of parameters, we provide the results of a simulation experiment. In the experiment we compared three computational schemes: estimation of the parameter Θ from the original observed points and from augmented data sets with either one or three points imputed between each pair of observations (schemes A 0 , A 1 , and A 3 , respectively).
To illustrate the possible gains from using augmentation, we conducted the simulation experiment for a small sample size for which parameter estimation seems more challenging. Animal paths of n = 100 observations were simulated from mixed OU processes, representing movement patterns in a study area consisting of two regions, a circular core and an outer zone. The animal has an attraction toward a central core area but also makes excursions from this area. The same parameters were used throughout all simulations, and we set Θ T = ( = 2, = (0, 0), 1 = −3, 2 = −0.5, 1 = 1, 2 = 0.8). Structurally, the model is the same as that fitted to the ibex data; the parameter values are broadly comparable, but are set to represent a somewhat different case, where the function of the movement parameters that is most easily estimated directly from the data, i , varies between regions much less than is the case for the ibex, making the estimation more challenging. (In the parameterization used in the MCMC algorithm, the simulations have 1 = 0.050, 2 = 0.61, 1 = 1.00, 2 = 0.506.) While these parameter choices are not meant to mimic a particular species, they plausibly represent observations on a similar time scale to the ibex data, around every 4 h.
The parameter estimates for each path were calculated using MCMC sampling. The number of MCMC iterations in simulations A 0 , A 1 was taken as N = 10, 000; for scheme A 3 we used a slightly greater number of iterations, N = 12, 000. These chains, after the first burn-in steps were discarded, were used for calculating posterior estimatesΘ. Chains obtained with augmented datasets tend to converge more slowly and the number of burn-in iterations was adjusted for each scheme to take into account convergence rate.
For each simulated path we therefore had three sets of parameter estimates, for A 0 , A 1 , and A 3 . Parameter estimates were compared using a root mean square error (RMSE) statistic with the distance d Θ = d(Θ, Θ T ) between the parameter estimateΘ and the true value of parameter Θ T measured as Euclidean distance: In the simulation experiment we were interested to examine to what degree the algorithm with data augmentation allows us to make better inference concerning unknown boundaries and movement parameters. To explore this, we considered separately the distances d Ψ and d Ω for boundary parameters Ψ and diffusion parameters Ω accordingly. Distances at the level of individual parameters are not given, because of the correlations between parameters within each of Ψ and Ω.
The results of the parameter estimate comparisons for 50 simulated animal paths are presented in samples. In brackets we show the relative reduction in RMSE due to augmentation,  Figures 1, 2, and 3, respectively, for each computational algorithm. It should be noted that the outliers in Web Figures 1 and 3 for the A 3 scheme reflect the fact that runs with more augmentation required more iterations to achieve convergence for some of randomly generated 'animal paths', in particular for diffusion parameters. This would have to be taken into account when implementing the method on practice.
The simulation studies show definite changes in estimating both the boundary and OU movement parameters. In terms of RMSE, even a small amount of augmentation gave clear improvements in the parameter estimation. Table 1 shows that for RMSE Θ , there is a 34% decrease with one augmented point and a 42% decrease with three augmented points between each observation, compared with no augmentation scheme. The experiment demonstrated that augmentation is beneficial when an animal path includes a large number of boundary crossings. The effect of using augmentation is apparent even with a small number of added points between each animal location, and is particularly pronounced for boundary parameters Ψ; RMSE statistic RMSE A 1 Ψ represents a 53% improvement in boundary estimates compared with the A 0 scheme, though the improvement does not progress with further increasing number of augmented points which is, probably, related to the difference in the number of potential boundary crossing between A 1 and A 3 .
In the web-based Supplementary Materials for the article, Web Figure 4 provides a visual illustration of the estimation of Θ for the A 0 and A 3 schemes for one arbitrarily chosen simulated path. The example demonstrates that the augmentation gave a visible improvement in the selection of the boundary between regions.

Data and model
The Ibex dataset contains the GPS relocations of an ibex during 15 days in the Belledonne mountain range (French Alps). Data on relocations were collected roughly every 4 h. During the study period 71 observations were recorded. The data can be found in the data repository of the package adehabitat Calenge et al. (2009); the original source of the data is "Office national de la chasse et de la faune sauvage," France. Ibex locations are shown in Figure 1, with the initial observation marked by a triangle. The figure suggests that the animal movements have a centralizing tendency toward a core area at the upper part of the plot, which can be interpreted as a foraging home range, but the animal also made some short excursions from this area. The foraging area has a center of attraction which might be, for example, a concentrated food source. Animal movements around the attraction can also be interpreted as "area-restricted." We fit the model to the data with two OU processes sharing an attraction point. One process will describe the animal relocations from starting point toward the core region and outside it ("exploratory" movements) and the second will represent short animal movements within circular foraging habitat ("resident" movements). We interpolated each pair of observations with three augmented points. Figure 1 shows the estimated boundary between the two regions together with the observations on ibex movements. Table 2 shows the posterior mean estimates of all parameters, together with their standard deviations (SD) and highest posterior density (HPD) intervals. The posterior distribution of each parameter was estimated from a sample of 30,000

TA B L E 2
Posterior parameter estimates with their SD and corresponding HPD intervals for ibex data MCMC iterations, after a burn-in period of 10,000 iterations and meeting convergence criteria. The convergence was inspected using a visual graphical assessment. The diagnostic was performed for each parameter in the model using multiple parallel chains starting with different values. The diagnostic trace plots and posterior density plots were displayed and visually analyzed to check that chains are mixing well and that a consistent estimate of the posterior distribution is reached. Adding augmentation results in placing a wider boundary than the results obtained from nonaugmented data, with the boundary estimates for nonaugmented sample being = 0.66 (0.06), 1 = 0.42 (0.06), 2 = 1.94 (0.06) (presented as mean (SD)). It could be noted that applying the augmentation leads to reduction in the posterior SDs indicating an improvement in the parameter estimation. The wider boundary from augmented data means that more points located near the boundary are included in the foraging area. Visual inspection of sample paths suggests that it makes sense since these "near-boundary" points correspond to rather short animal moves. Thus augmenting transitions across the boundary helps to identify the type of location, in particular, whether it is exploration or resident movement.
We conducted a small additional simulation to explore how augmentation helps to learn about unobserved visits to a region. In the ibex dataset there are 21 observed boundary crossings for the estimated with three augmented points boundary (30% of all registered movements). For a sample of simulated augmentation paths we had 57 (20%) crossings on average (median) that is average of 36 unobserved animal visits to a region.
The means of the estimators in Table 2 show that a fitted OU process outside the central patch tends to be more dispersed with a weaker attraction toward the core area than the process governing animal movements within the foraging region, as would be expected. Note that the parameter 2 , which controls the strength of drift toward the center for exploratory movements, is close to 1, implying that the corresponding diffusion process is close to Brownian motion.

Model checking and comparison
Although the two-region model seems to be natural for the ibex data and to perform well, obviously there are other options in setting up a model that can fit the data. As part of our model checking, we compared the two-region model with a model consisting of one region; these two competing models will be labeled as M 2 and M 1 , respectively. In the one-region model, the study area is not partitioned into subregions and it is assumed that animal movements come from a single underlying OU process with parameters Θ M 1 = ( , B, Λ). Correspondingly, for the two-region model we have To compare the fit of the different models we use two approaches, both based on posterior predictive model comparison.
As noted in Section 2.1, we could also fit, say, a spatially homogeneous two-state switching OU model to these data. Such a model may well do a good job of capturing the difference between the initial observations that are relatively far from the center and those that are more clustered, but is less likely to adequately represent the later data which include shorter excursions beyond the core. More importantly, a nonspatial model will be less able to explain these variations in movement pattern. So, while we acknowledge that a two-state model would probably be selected over a one-region/one-state model, we do not pursue that case formally.

Deviance information criterion
The deviance information criterion (

Posterior predictive probability
Another natural approach is to compare models on the basis of the posterior predictive distributions of some specifically chosen test quantity T(X, Θ). The approach is related to the numerical posterior predictive check method described in Gelman et al. (2004), but here we use it with a focus on comparing discrepancies between models rather than checking model fit; we compare T(X obs , Θ M i ) under two different models M 1 and M 2 . Let 1 , 2 , and 3 be successive data points, ordered by their times, say s 1 , s 2 , and s 3 , respectively. With fixed end points, the middle point 2 follows the OU bridge Q s = Q(s; 1 , 3 ), that is an OU process conditioned to go from 1 at time s 1 to 3 at time s 3 . The conditional density corresponding to this process can be written as The test statistic T(X obs , Θ M j ) for model comparison is based on the conditional probability (7). We consider the log ratio of conditional probabilities for two models: Values of the probability density (7) were calculated under both models for each observed ibex location. Logarithms of these values are plotted on Figure 2. The figure also shows discrepancies between log conditional probability densities for each observation (lower plot). The sum of these discrepancies gives the value of our test statistic, T(X obs , Θ M 1 , Θ M 2 ) = 41.23. For most observations, the conditional probability density under the two-region model is higher than under the model with just one region.
To further illustrate this approach, we compared the predictive distribution of a particular point from the dataset under two models. Figure 3 shows part of the observed ibex movement path which we use for comparing models. Three successive points (x 17 , x 18 and x 19 in the dataset), connected by arrows in time order, represent animal movement in a backward direction, which is unusual with an attraction point model. We compare the two models by examining the question: which model explains this atypical movement path better?
We now take as our test statistic For the one-region model, we calculate p Q ( 2 | 1 , 3 , Θ M 1 ) by substituting the corresponding conditional densities of the OU process with parameters Θ M 1 = ( , B, Λ) into formula (7). For two regions, p( 2 | 1 , Θ M 2 ) = p OU ( 2 | 1 , Ω 2 ), p( 3 | 1 , Θ M 2 ) = p OU ( 3 | 1 , Ω 2 ). Since with fixed 1 and 3 the intermediate point can belong to either region R 1 or R 2 , the remaining term is: F I G U R E 3 Three successive points from ibex dataset chosen for model comparison The probabilities that 2 belongs to R 1 or R 2 in (8) Figure 4 also shows the estimated posterior densities of T(X obs , Θ M i ). The estimated posterior means for one-and two-region models, respectively are p Q (x(s 2 )|x(s 1 ), x(s 3 ), Θ M 1 ) = 1.635 ⋅ 10 −6 and p Q (x(s 2 )|x(s 1 ), x(s 3 ), Θ M 2 ) = 1.682 ⋅ 10 −4 . Thus it follows from the conducted posterior predictive check that the combination of the observed animal locations shown in Figure 3 is more plausible under model M 2 , which suggests that specific patterns in the data are better captured by the two-region model.

DISCUSSION
In this article, we have presented a method for modeling individual animal movement in a heterogeneous environment, while simultaneously learning about that environment. The approach involves applying diffusion models to account for observed movement and using a data augmentation technique to reconstruct animal locations between collected observations. The novelty in the present approach is that boundaries between spatial heterogeneous regions are included into the model as unknown parameters. The example analyses presented herein were concerned with animal relocations between just two regions, one of which was circular. These are relatively simple models but their analysis gives an indication of how we might expect the algorithm to perform in more complex applications. Even with that simplification in the partitioning of the study area, obtaining parameter estimates was computationally demanding. As the number of observed locations and the number of augmented points per pair of observations increase, this estimation procedure quickly becomes very expensive. In particular, runs with more augmentation require more iterations as well as increased cost per iteration, and the total computational cost increase faster than linearly in the number of augmented points. Estimating parameters for large study areas with sophisticated heterogeneous regions may become problematic, although our method could benefit from a parallel computing framework which would keep the algorithm computationally feasible. However, our results suggest F I G U R E 4 Chains (on log scale) and posterior densities of conditional probabilities p Q at observed point x(s 2 ) = x 18 from ibex dataset for two models. Dashed lines on the plot correspond to model M 1 that even low levels of augmentation can give appreciable benefits. Even a single augmented point between observations allows for the range of possible times at which a given boundary crossing may take place, while adding more points allows for unobserved visits to a region to be accommodated, capturing the key omissions from the data as naïvely interpreted.
The general framework presented here can be extended in a number of directions. The first practically important generalization would be to consider models with greater behavioral complexity. In addition to estimating movement and boundary parameters for multiple regions, Bayesian inference could be used to account for different behavioral states in each region. Incorporating movement states into our model would be straightforward; the general approach in simpler settings with spatial homogeneity or with fixed boundaries is described in Blackwell (2003) and Blackwell et al. (2016), respectively.
Another extension is directly concerned with modeling spatial heterogeneity. Although the approach developed in this study can be used to model the movements of animals in environments with various spatial structures, describing complex heterogeneous environments which can influence the movement patterns of organisms will require more sophisticated models for partitioning of the study area. One promising means for developing realistic and flexible models for habitat fragmentation or, say, complex patterns of territoriality is to represent the partition parametrically in the form of a random tessellation, as applied in ecological, environmental and other contexts by Blackwell and Macdonald (2000), Blackwell and Møller (2003), and Pope et al. (2019).
More ambitiously, this approach could be combined with the representation by Niu et al. (2016) and Milner et al. (2021) of collective movement as a point following an OU process in a higher-dimensional space. Different movement in different regions then represents the dependence of behavioral interactions on the geometry of the group at any instant, for example, two animals being within a certain distance of each other is equivalent to the point representing their joint locations being within a particular infinite circular cylinder. Our approach could thus help learn about the range and geometry of such collective movement.
We have focused here on models built from the OU process, because of their tractability and flexibility. Some of their limitations, in terms of realism of movement modeling, could be addressed by instead applying our approach to the integrated OU process, often known as the continuous-time correlated random walk (CTCRW) in an animal movement context, in which it is the velocity of the animal and not its position that follows an OU process.  introduced a spatially and behaviorally homogeneous CTCRW, and Michelot and Blackwell (2019) showed how behavioral switching could be incorporated without time-discretization. Russell et al. (2018) included spatial heterogeneity in the form of a spline-based "motility surface" and applied it to high-frequency laboratory data on ant movement, using Euler-Maruyama numerical approximation. Applying our approach, where data are not sufficiently high frequency that changes between fixes can be ignored, would require augmentation with velocities as well as positions, but since our results show that fairly low levels of augmentation have appreciable benefits, it should be readily achievable. For our particular example, we need the animal's location-not just its velocity-to be stationary, and so the CTCRW is not directly applicable without some additional mechanism being included.
In principle, in the Bayesian framework the number of regions k can be treated as a variable and included into the model as unknown parameter. Technically, Bayesian inference in this case can be done using the reversible jump MCMC method or similar, since the number of parameters in Θ = (Θ 1 , … , Θ k ) can change at each iteration. Pope et al. (2019) shows how this might work in practice, using a random tessellation, as discussed above, with a varying number of regions.
In this study the number of heterogeneous regions k was chosen in Section 7.3 using model comparison tools. It should be noted that the interpretation of DIC in this setting requires caution and further investigation, though the results in our example were sufficiently clear-cut for this not to be an immediate concern. Apart from taking into account the unobserved data Z, the model with unknown boundaries can be interpreted as a missing data model problem in the sense that we do not know which region each observation belongs to. We can formalize identifying the region R j that observed location x i falls in by introducing the indicator variable r = (r 0 , … , r n ) where r i = (r i1 , … , r ik ) so that r ij = 1 if x i ∈ R j and r ij = 0 if x i ∉ R j . Under the model with unknown boundary parameters Ψ i the variable r should be treated as missing data.
We have shown that is feasible to use autocorrelated location data, such as is increasingly available from GPS tagging, to learn about the spatial form of heterogeneity that influences an animal's movement behavior. We have also indicated a number of ways in which our approach could extend naturally to accommodate more complex models and ecological questions. The method as it stands is computationally expensive, but our results show that even low levels of data augmentation can result in appreciable improvements in the precision of the model fitting.