Nonlinear Processes in Geophysics Kernel estimation and display of a five-dimensional conditional intensity function

The aim of this paper is to find a convenient and effective method of displaying some second order properties in a neighbourhood of a selected point of the process. The used techniques are based on very general high-dimensional nonparametric smoothing developed to define a more general version of the conditional intensity function introduced in earlier earthquake studies by Vere-Jones(1978).


Introduction
This paper is concerned with the second order properties of a multidimensional point process in contexts where some features of a given point (e.g.location, depth, magnitude) play a dominant role in determining the local behavior of the process in a neighbourhood of the selected point.The aim of the paper is to describe a convenient and effective method of displaying second order properties of counts in a neighbourhood of a selected point of an observed point process and to examine how those properties are affected by the features of the fixed point.In particular we would like to display second order properties of counts in a neighbourhood of the initial event in an aftershock sequence or swarm in a seismic active area.For instance, the way these properties change with the magnitude of the initial event tells us something about the physical processes governing the numbers and distributions of the aftershocks respond to the size of the initial event.Similar issues arise in the discussion of medical epidemic data, where the size and severity of the epidemic overall may be related to characteristics of the initial recorded infection.
To look at second order properties, the counts need to be averaged over both the choice of a selected point and over the events in its neighbourhood.Ripley's K-function (Ripley, Correspondence to: G. Adelfio (adelfio@unipa.it)1976) is commonly used for such a purpose in discussing the cumulative behavior of interpoint distances about an initial point.It is defined as the expected number of events falling within a given distance δ of the initial event, divided by the overall density (rate in 2-dimensions) of the process, say λ.Since it is defined as an average over many initial points, the K-function cannot be used to distinguish processes with the same (average) second order properties.As an alternative, Getis and Franklin (1987) suggested examining the behavior of the occurrence patterns in the neighbourhood of selected initial points developing a second order neighbor analysis of mapped point patterns.However, this method is not useful for determining whether a given pattern is random, clustered or regular (Doguwa, 1989).Adelfio and Schoenberg (2009) suggested using a weighted version of some second order statistics to provide diagnostic tests.In Adelfio and Chiodi (2009) weighted second order statistics are used to assess the fitting of seismic models to real catalogs.Grillenzoni (2006) focussed on the conditional intensity function of a space-time process, where conditioning is made on the basis of past events only.
Second order statistics, such as the Ripley's K-function (Ripley, 1976), are useful to describe observed point patterns characterized by high correlation structures both in space and time and are also designed to test the randomness hypothesis often based on the Poisson distribution.For this reason second order statistics are crucial to study and comprehend seismic process and its realization, since description of seismic events often requires the definition of more complex models than stationary Poisson process and the relaxation of any assumption about statistical independence of earthquakes.Indeed, a more realistic description of seismicity often needs the study and the interpretation of features like selfsimilarity, long-range dependence and fractal dimension.
Published by Copernicus Publications on behalf of the European Geosciences Union and the American Geophysical Union.
G. Adelfio: Kernel estimation of a five-dimension conditional intensity function In this paper a nonparametric estimation of the second order conditional intensity function (CIF) introduced by Vere-Jones (1978) is provided, by making use of kernel intensity estimators.The nonparametric second order CIF is here introduced to analyze the influence in a neighbourhood of a multidimensional point to some properties of the observed point pattern, by using a procedure that does not require any constraining assumption to characterize the generating process.
In Sect. 2 a brief introduction of spatial-temporal point processes and their second order characteristics is provided.The proposed nonparametric approach is introduced in Sect.3, showing some application in Sect. 4. Section 5 provides some concluding remarks and directions for future study.

Point processes and conditional intensity function
A spatial-temporal point process is a random point pattern defined by time and location of every single event.Point processes are here introduced by a mathematical approach that uses the definition of a counting measure on a set X ⊆ R d ,d ≥ 1, with positive values in Z: for each Borel set B this Z + -valued random measure gives the number of events falling in B.
This section reviews some basic definitions related to point processes, reported to introduce the notation used throughout the paper.For further elaboration and references, please see Daley and Vere-Jones (2003).

Definition 1 Point process
Let ( ,A,P ) be a probability space and a collection of locally finite counting measures on X ⊂ R d .Define X as the Borel σ -algebra of X and let N be the smallest σ -algebra on , generated by sets of the form {φ ∈ : φ(B) = n} for all B ∈ X .A point process N on X is a measurable mapping of ( ,X ) into ( ,N ).A point process defined over ( ,A,P ) induces a probability measure N (Y ) = P (N ∈ Y ),∀Y ∈ N (Cressie, 1991).
Given a point process N defined on the space (X,X ) and a Borel set B, the number of points N(B) in B is a random variable with first moment defined by: that is a measure on (X,X ).The measure µ N is called the mean measure or first moment measure of N (Cressie, 1991).The second moment measure of N is given by: 2) the process is second order.
Let ds and du be small regions located at s and u ∈ X, and let (x) be the Lebesgue measure of x.The first order intensity is defined by: the second order intensity is defined by: .
The second-order counting properties of such a process can be summarized by a covariance density: (1) The covariance measure also has a singular component concentrated along the line s = u, as illustrated by the formula: Let N be a point process on a spatial-temporal domain is the intensity function of the process conditioned to H t , that is the space-time occurrence history of the process up to time t, or in other words, the σ -algebra of events occurring at times up to but not including t; dt,dz are time and space increments respectively, and The conditional intensity function is a function of the point history and it is itself a stochastic process depending on the past up to time t.Assuming such a limit exists for each point (t,z) in the space-time domain and the point process is simple, the conditional intensity process uniquely characterizes the finite-dimensional distributions of N (Daley and Vere-Jones, 2003).According to the used notation, the star in λ * (•) is used to indicate that the intensity is a function of the past history H t .
If the conditional intensity function is independent of the past history and dependent only on the current time and spatial location, Eq. (2) determines λ(t,z) and identifies an inhomogeneous Poisson process.A constant conditional intensity provides a stationary Poisson process.
In a time-stationary but spatially inhomogeneous process, the expression in (2) is: with λ the overall rate occurrence for a given region and f (•) a time-invariant space density.A more general form for (2) is provided assuming a separable form, in which the spatial term is assumed to be univariate in time and temporal density is not constant, where the both terms are allowed to depend on the past history, such that: It simplifies to the product of constants for homogeneous Poisson processes.
In this paper a nonparametric estimation of a second order measure like (3) is provided to describe dependency structures of a multidimensional observed seismic process by using a flexible procedure based on kernel intensity estimators.

Nonparametric estimation
For an adequate description of the seismic activity of a fixed area and to suggest useful ideas on the mechanism of a such complex process, the definition of a valid and effective model is required.When a complete definition of a parametric model is not reliable, nonparametric approach could be useful.Indeed, in seismic modelling contexts, parametric models are not always useful since the definition of a reliable mathematical model from the geophysical theory may not be available.
In general, some disadvantages of the parametric modelling can be avoided by using flexible procedures (nonparametric techniques), based on kernel intensity methods (Silverman, 1986).Given n observed events s 1 ,s 2 ...,s n in a d-dimensional given region, the kernel estimator of an unknown density f in R d is defined as: where K(s 1 ,...,s d ) denotes a multivariate kernel density (usually the standard Normal density function) operating on d arguments centered at (s i1 ,...,s id ) and h=(h s 1 ,...,h s d ) is the vector of the smoothing parameters of the kernel functions.If s i = {t i ,x i ,y i ,z i ,M i }, the space-time-magnitude kernel intensity estimator of (2) is defined by the superposition of the separable kernel densities: where K t , K s and K M are temporal, spatial and magnitude kernel density functions, as in (4), respectively.
Introducing the estimator defined in (5), the estimation of a complex intensity function dependent on the past history of the process as in (2) now reduces to the estimation of the intensity function of an inhomogeneous Poisson process, independent of the past history and identified by a space-time Gaussian kernel intensity (Adelfio and Ogata, 2010); this result provides useful directions for a simpler estimation approach in describing very complex phenomena such as the seismic one.Separability of time and space kernel densities is here assumed for computational convenience, because of the high dimensional issue, although tests to assess this assumption could be used (Schoenberg, 2004).It might be useful to note that this assumption is not directly extended to the intensity function of the process since it is obtained by the superposition of these densities.
In this context the problem of choosing the amount of smoothing is of crucial importance, since smoothing parameter regularizes the trade-off between variance and bias of the estimator, that is between random and systematic error.In Adelfio et al. (2006) the seismicity of the Southern Tyrrhenian Sea is described by Gaussian kernels and the optimum value of h is chosen such as to minimize the mean integrated square error (MISE) of the estimator f (•).In particular the authors used the value h opt that Silverman (1986) obtained minimizing the MISE of f (•) assuming multivariate normality.
In Adelfio (2010) a variable bandwidth procedure is introduced, choosing h j = (h j x ,h j y ,h j t ), i.e. the bandwidth for the j -th event, j = 1,...,n, as the radius of the smallest circle centered at the location of the j -th event (x j ,y j ,t j ) that includes at least a fixed number of further events.
In Adelfio and Ogata (2010) a naive likelihood crossvalidation function is optimized to obtain the bandwidth of the smoothing kernel used to estimate the intensity for earthquake occurrence of northern Japan.
Although the use of variable bandwidth may be preferable to reflect local occurrence rates instead of using fixed bandwidth, in this paper constant smoothing is considered as a convenient approach to deal with the high dimensionality of the analyzed problem.

Conditional intensity estimation
Conditional intensity estimation can be considered as a generalization of regression, focussing on the estimation of the full conditioned distribution and not just on the expectation value.
A discrete estimate of the second order conditional intensity for a point process is given by Vere-Jones (1978).Now we are looking for a smoothed version of the conditional intensity, that is the local intensity of the process at p, given the occurrence of a point of the process at s. Thus: where s and p are points in R d , with d = 5, since we are considering space (3-D), time and magnitude dimensions.This function can be related to the covariance density (1) by the equation Moreover the formula above can be considered as another way of looking at the Ripley's K-function, useful when the emphasis is on the physical interpretation of the dependence.
In this paper nonparametric kernel estimators are used for estimating h(p|s) in high-dimensional domain; in this context the conditioning puts a different complexion on the problem, as the smoothing parameters have to be adjusted to the conditioning event.
Here we use a version of the conditional intensity function introduced in earlier earthquake studies by Vere-Jones (1978), where the second order properties are classified according to the magnitude of the initial event.In the early paper the analysis was based on a crude discretization of the process, but in the present paper we make use of kernel density estimates applied jointly to both the conditioning and the conditioned events.More precisely, to evaluate a smoothed version of the conditional intensity (CIF) in (6), we use the ratio estimate that is, the ratio between the joint intensity of the conditioning and the conditioned event (i.e.p and s), and the marginal intensity of the conditioning event (event s).This function is therefore estimated as the ratio of the kernel intensity estimators, defined in Eq. ( 5), for λ(p,s) and λ(s), respectively.The kernel estimators consider Gaussian density with zero mean and variance selected in a such way that its standard deviation h (the kernel bandwidth) minimizes the mean integrated square error (MISE) of the estimate λ(•).
This approach simplifies the complex estimation issue of the second-order measure in (6), although it implies the use of high dimensional kernel functions.Indeed, the quantity in (7) has been computed considering the ratio of fivedimensional kernel estimators, providing, on one hand, a very computer intensive procedure, but on the other hand, some advantages related to the possibility of describing the main features of the process in multiple domains without constraining data to binding assumptions.3-D space plot, with depth discretized into four groups with equispaced quantiles as breakpoints, so that all groups have roughly the same number of points.Different colors are used for different ranges of magnitude: black for 4.5 ≤ M < 5.4, red for 5.4 ≤ M < 6.3 and green for 6.3 ≤ M < 7.2.

Application to New Zealand data and discussion
Space-time modelling seems one sensible direction, especially if depth is also involved.Some pictures could suggest something about the evolution of spatial clusters at various depth (when they merge or separate).Deep earthquakes generally lack fully evolved sequences which decay according to the Omori's law Utsu (1961).Therefore clustering described from ETAS model (Ogata, 1988) could be valid just for shallower events, that may have a classical aftershocks behavior.
On the other hand, aftershocks for deep events have a different behavior and for this case a different modelling could be useful, mostly to check their features that are still unknown in some sense.
Here a direct approach to analyze second order properties of deep earthquakes is considered, based on the use of the two point correlation function, that is of the conditional intensity function defined in (7).
We selected a subset of the GeoNet catalog of New Zealand earthquakes.Completeness issues of this catalog are discussed in Harte and Vere-Jones (1999).The data consist of n=3097 earthquakes of local magnitude M L =4.5 and larger that are chosen from the wide region −43    concentrated in the area extending from Taranaki and Taupo region to northeast Bay of Plenty, persisting over different time periods and different depth ranges.From Fig. 3 we can observe still deep events for high magnitude, (for instance, an event with M = 7.2 has been recorded at 273 km of depth), although very deep earthquakes are recorded for M<5.9.
Fig.1.3-D space plot, with depth discretized into four groups with equispaced quantiles as breakpoints, so that all groups have roughly the same number of points.Different colors are used for different ranges of magnitude: black for 4.5 ≤ M < 5.4, red for 5.4 ≤ M < 6.3 and green for 6.3 ≤ M < 7.2.

Fig. 2 .
Fig. 2.3-D scatterplot of earthquakes epicenters in terms of latitude, longitude and time.Depth is used as the conditioning variable.Different colors are used for different ranges of magnitude: black for 4.5 ≤ M < 5.4, red for 5.4 ≤ M < 6.3 and green for 6.3 ≤ M < 7.2.

Fig. 3 .
Fig. 3. 3-D scatterplot of earthquakes epicenters in terms of latitude, longitude and depth.Magnitude is used as the conditioning variable.

Fig. 4 .
Fig. 4. Temporal conditional kernel intensity for all magnitude events (on the top left), with M ≥ 5 (on the top right), with M ≥ 5.5 (on the bottom left), with M ≥ 6 (on the bottom right).
of the process are h x = 0.34, h y = 0.27, h t = 1000.86(in days), h z = 0.06 and h M = 17.54.Some features of the observed events are now investigated.Space-time scatterplots of events are provided in Figs. 1, 2 and 3.Both in Figs. 1 and 2 the depth variable is used like a conditioning variable, showing significant deep events Nonlin.Processes Geophys., 17, 237-244, 2010 www.nonlin-processes-geophys.net/17/237/2010/