Testing for a generalized Pareto process

We investigate two models for the following setup: we consider a stochastic process $X \in C[0,1]$ whose distribution belongs to a parametric family indexed by $\vartheta \in \Theta \subset \mathbb{R}$. In case $\vartheta = 0$, $X$ is a generalized Pareto process. Based on $n$ independent copies $X^{(1)},\dots,X^{(n)}$ of $X$, we establish local asymptotic normality (LAN) of the point process of exceedances among $X^{(1)},\dots,X^{(n)}$ above an increasing threshold line in each model. The corresponding central sequences provide asymptotically optimal sequences of tests for testing $H_0: \vartheta = 0$ against a sequence of alternatives $H_n: \vartheta = \vartheta_n$ converging to zero as $n$ increases. In one model, with an underlying exponential family, the central sequence is provided by the number of exceedances only, whereas in the other one the exceedances themselves contribute, too. However, it turns out that, in both cases, the test statistics also depend on some additional and usually unknown model parameters. We therefore consider an omnibus test statistic sequence as well and compute its asymptotic relative efficiency with respect to the optimal test sequence.


Introduction
Over the past three decades, the focus of univariate extreme value theory has shifted from the investigation of maxima (minima) in a sample to the investigation of exceedances above a high threshold. This approach to large observations has made extreme value theory more accessible and has become a crucial tool in various applied disciplines, such as dyke construction.
Since the articles by Balkema and de Haan (1974) and Pickands (1975), it has been known that exceedances above a high threshold can reasonably be modeled only by (univariate) generalized Pareto distributions (GPD), resulting in the peaks-over-threshold approach (POT).
Due to practical necessity, the focus of extreme value theory has in recent years moved to multivariate observations as well. Accordingly, the investigation of multivariate exceedances has necessitated the definition and investigation of multivariate GPD.
This investigation is still ongoing, as even the definition of multivariate GPD is under debate; see, for instance, Falk et al. (2010, Chapter 5).
As already mentioned by de Haan and Ferreira (2006, p. 293): "Infinite-dimensional extreme value theory is not just a theoretical extension of multivariate extreme value theory to a more abstract context. It serves to solve concrete problems as well." Such concrete problems are, e.g., observing dykes and tides along their whole width and not only at a finite set of observation points. There is, consequently, the need for a POT approach for functional data and for generalized Pareto processes as well. Again, the data exceeding some kind of high threshold are modeled by a functional counterpart of a GPD; see Aulbach et al. (2012b). The current paper deals with optimal tests that check, for particular models, whether those exceedances do in fact arise from such a process.
1.1. Previous work. Following Buishand et al. (2008), a standard generalized Pareto process, i.e., a generalized Pareto process with ultimately uniform tails in the margins, is defined as follows. For convenience, we use bold font such as V for stochastic processes and default font such as f for nonstochastic functions. All operations on functions, such as f ≤ 0, are meant pointwise.
Definition 1.1. Let $U$ be a random variable (rv) that is uniformly distributed on $(0,1)$, and let $Z = (Z_t)_{t\in[0,1]} \in C[0,1]$ be a stochastic process on the interval $[0,1]$ having continuous sample paths. We require that $U$ and $Z$ are independent and choose an arbitrary constant $M < 0$. Then (1) $V := (V_t)_{t\in[0,1]} := (\max(-U/Z_t, M))_{t\in[0,1]}$ defines a standard generalized Pareto process (GPP) if $0 \le Z_t \le m$, $E(Z_t) = 1$, $t \in [0,1]$, hold for some constant $m \ge 1$. A stochastic process $Z \in C[0,1]$ with these two properties will be called a generator.
The constant $M$ is incorporated in the above definition to ensure that $V_t > -\infty$ for each $t \in [0,1]$. Note that the finite-dimensional marginal distributions of $V$ provide multivariate GPD with ultimately uniform tails; see, e.g., Aulbach et al. (2012a).
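As a rough illustration of Definition 1.1, the following Python sketch simulates a single margin of a process of the form $V_t = \max(-U/Z_t, M)$ and checks the ultimately uniform tail, i.e., $P(V_t > x) = |x|$ for $x < 0$ close to zero. The form of $V_t$ used here, the toy generator $Z_t = 1 + R(2t-1)$ with $R$ uniform on $(-1,1)$, the value $M = -5$ and all function names are assumptions made for this example only; they are not taken from the paper.

```python
import random

M = -5.0  # arbitrary negative constant, playing the role of M in Definition 1.1


def sample_margin(t, rng):
    """Draw one value of V_t = max(-U / Z_t, M) for the toy generator."""
    u = rng.random()                  # U uniform on [0, 1)
    r = rng.uniform(-1.0, 1.0)        # randomness of the (hypothetical) generator
    z = 1.0 + r * (2.0 * t - 1.0)     # 0 <= Z_t <= 2 and E(Z_t) = 1
    if z <= 0.0:                      # guard against division by zero
        return M
    return max(-u / z, M)


def tail_probability(t, x, n_draws=100_000, seed=1):
    """Monte Carlo estimate of P(V_t > x)."""
    rng = random.Random(seed)
    hits = sum(sample_margin(t, rng) > x for _ in range(n_draws))
    return hits / n_draws


estimate = tail_probability(t=0.3, x=-0.1)
print(estimate)  # should be close to |x| = 0.1
```

For $|x| \le 1/m$ the event $\{V_t > x\}$ equals $\{U < |x| Z_t\}$, whose probability is $|x| E(Z_t) = |x|$ by independence of $U$ and $Z$; the simulation merely reproduces this computation numerically.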
The process V is characterized by the fact that its functional distribution function (df) is given by for some x 0 > 0. We setĒ − defines a D-norm on E[0, 1] with generator Z; see Aulbach et al. (2012c). This representation of the df of V in terms of a D-norm is in complete analogy with the multivariate case of a GPD. We refer to Falk et al. (2010, Section 5.1 uniformly distributed on (0, 1).
For each standard GPP there is a corresponding standard MSP, i.e., a stochastic On the other hand, the df of each max-stable process η having standard negative exponential margins has a representation as in Equation (2); we refer to Aulbach et al. (2012c) for details.
1.2. Overview of the current paper. We replace the rv $U$ in Equation (1) by a rv $W \ge 0$ which is also independent of $Z$. However, the distribution of $W$ is different from the uniform one and, thus, the process (3) $X := (X_t)_{t\in[0,1]} := (\max(-W/Z_t, M))_{t\in[0,1]}$ is no longer a standard GPP.
This gives rise to the following problem: Based on the exceedances in a sample of $n$ independent copies $X^{(1)},\dots,X^{(n)}$ of $X$ above a high threshold line, how close can the df of $W$ get to that of $U$ with the difference still being detected? As we consider exceedances above a high threshold, only the lower end of the df of $W$ matters. In other words, the problem suggests itself to define parametric models $\{H_\vartheta : \vartheta \in \Theta\}$ for the df $H_\vartheta$ of $W$, such that we can derive optimal tests detecting the deviation of the distribution of the upper tail of $X$ from that of $V$, i.e., the deviation of $\vartheta$ from zero. This is the content of the present paper, which is organized as follows.
In Section 2 we require that the df $H_\vartheta$ of $W$ has a density $h_\vartheta$ near zero, which satisfies for some $\delta > 0$ the expansion (4) $h_\vartheta(u) = 1 + \vartheta u^\delta + o(u^\delta)$ as $u \downarrow 0$ with some parameter $\vartheta \in \Theta$, where zero is an inner point of $\Theta \subset \mathbb{R}$. The standard exponential df, for instance, satisfies this condition with $\delta = 1$ and $\vartheta = -1$. The null hypothesis $\vartheta = 0$ is understood to be the uniform distribution on $(0,1)$.
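The claim about the standard exponential df can be checked numerically. The sketch below, with function names of our own choosing, evaluates $(h(u) - 1)/u^\delta$ for $h(u) = \exp(-u)$ and $\delta = 1$; if expansion (4) holds, this ratio should approach $\vartheta = -1$ as $u \downarrow 0$.

```python
import math


def implied_theta(h, u, delta):
    """Return (h(u) - 1) / u**delta, which tends to theta under expansion (4)."""
    return (h(u) - 1.0) / u ** delta


def exp_density(u):
    """Density of the standard exponential distribution."""
    return math.exp(-u)


# The printed ratios should move towards -1 as u decreases.
for u in (1e-1, 1e-2, 1e-3, 1e-4):
    print(u, implied_theta(exp_density, u, delta=1.0))
```

This is only a sanity check of the first-order behaviour near zero; it says nothing about the size of the remainder term for a general density.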
In Section 3 we assume that the distribution of $W$ belongs to an exponential family of probability densities on the interval $[0,1]$, referred to below as model (5).
In both models we establish local asymptotic normality (LAN) of the point process of exceedances among $X^{(1)},\dots,X^{(n)}$ above an increasing threshold line.
The results, which are stated in Theorem 2.2 and Theorem 3.2, provide in each model the corresponding central sequence and, thus, optimal tests for testing ϑ = 0 against a sequence of alternatives ϑ n converging to zero as the sample size increases.
It turns out that the particular values of the exceedances contribute to the central sequence only in model (4), whereas in the exponential model (5) the number of exceedances alone yields the central sequence. The fact that just the number of realizations in shrinking sets provides the central sequence was characterized for truncated processes in quite a general framework in Falk (1998) and Falk and Liese (1998).
As the central sequences and, thus, the asymptotically optimal tests also depend on further parameters of the generator process Z, which might be unknown in practice, we consider an omnibus test for testing ϑ = 0 as well. We compute its asymptotic relative efficiency (ARE) with respect to the optimal test in each model.
While the ARE is positive in model (4), it is zero in model (5).
Moreover, we assume a standard MSP. This is implied by the fact that in this case $P\left(\inf_{t\in[0,1]} U_t > 0\right) = 1$.
Note that (8) and Hölder's inequality also give
2.1. Local asymptotic normality. In order to derive asymptotically optimal tests in this model, we first establish local asymptotic normality (LAN) of the point process of exceedances where $X^{(i)}$, $i \le n$, are independent copies of $X$ in (3) and $c < 0$. $\mathcal{B}$ denotes the σ-field of Borel sets of $\mathbb{R}$ and $\varepsilon_x$ is the point measure with mass one at $x$. Note that i.e., the random point measure $N_{n,c}$ actually represents those observations among $X^{(1)},\dots,X^{(n)}$ which exceed the constant threshold function $c$.

Denote those observations among sup
By Theorem 1.4.1 in Reiss (1993) we may assume without loss of generality that under parameter ϑ, and that they are independent of the total number τ (n), which is binomial B (n, P ϑ (X ≥ c))-distributed.
In the next lemma we provide the density $f_{\vartheta,c}$ of $\sup_{t\in[0,1]}(X_t/c)$ and, thus, the density of $Y$ under $\vartheta$, which is $f_{\vartheta,c}/P_\vartheta(X \ge c)$. By $P * Z$ we denote the distribution of a rv $Z$.
Lemma 2.1. Suppose that the distribution of the rv W in (3) belongs to the family , and it is given by Furthermore there exists ε 0 > 0 such that Proof. Let m be given as in Definition 1.1 and u 0 , ε 0 , δ be given as in Equation (6).
The following result finally proves the desired LAN property of N n,c ; it is a crucial tool for deriving asymptotically optimal tests in the subsequent subsection.
Theorem 2.2. Suppose that $c_n \uparrow 0$, $n|c_n|^{1+2\delta} \to \infty$ as $n \to \infty$. Then we obtain for $\vartheta_n$ as in (11) the expansion
Proof. First we compile several facts that will be used in the proof. From Lemma 2.1 we obtain and, thus, a suitable version of the central limit theorem implies Moreover, we conclude from Lemma 2.1 for $|\vartheta| \le \varepsilon_0$ and, thus, where $\sim$ denotes asymptotic equivalence. The preceding convergence to zero follows from the condition $n|c_n|^{1+2\delta} \to_{n\to\infty} \infty$ and the equivalence From the law of large numbers we obtain Altogether we have shown so far that Next we show that We have by Lemma 2.1 where $r_0(Y_k,\vartheta_n,c_n) = o\left((n|c_n|)^{-1/2}\right)$ uniformly for $k$ and $n$ with $E_0(r_0(Y_1,\vartheta_n,c_n))$ and $\mathrm{Var}_0(r_0(Y_1,\vartheta_n,c_n)) \le E_0\left(r_0^2(Y_1,\vartheta_n,c_n)\right) = o(1/(n|c_n|))$.

Using again the Taylor expansion log
by the law of large numbers and the central limit theorem. This completes the proof.
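To see what the rate condition of Theorem 2.2 demands, consider as a purely illustrative choice (an assumption of this example, not taken from the paper) the polynomial threshold sequence $c_n = -n^{-a}$ with $a > 0$. A direct computation gives:

```latex
c_n = -n^{-a} \uparrow 0, \qquad
n\,|c_n|^{1+2\delta} = n^{\,1 - a(1+2\delta)} \xrightarrow[n\to\infty]{} \infty
\iff a < \frac{1}{1+2\delta}.
```

For $\delta = 1$, for instance, any $0 < a < 1/3$ is admissible under this choice: the threshold may approach zero, but not too fast.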
The corresponding uniformly asymptotically optimal test for $H_0$ against $\vartheta_n(\xi)$ with $\xi < 0$ is The asymptotic power functions of these tests are provided by Theorem 2.2 as well. By Le Cam's third lemma we obtain that under $\vartheta_n = \vartheta_n(\xi)$ The asymptotic power functions of $\phi_i$ are, consequently, given by A disadvantage of the optimal test statistics $\phi_i(N_{n,c_n})$ is the fact that they require explicit knowledge of the constants $A$ and $\delta$. To overcome this disadvantage, we consider in the following an alternative test.
Recall that the observations $Y_1, Y_2, \dots$ are independent and, under $\vartheta = 0$, uniformly distributed on $(0,1)$ if the threshold $c$ is close to zero. Conditional on there being at least one exceedance, i.e., conditionally on $\tau(n) > 0$, the test statistic $T_{n,c} := \tau(n)^{-1/2} \sum_{i=1}^{\tau(n)} \Phi^{-1}(Y_i)$ is under $H_0$ exactly $N(0,1)$-distributed. By $\Phi$ we denote the standard normal df.
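The exact null distribution can be illustrated by simulation. The following Python sketch assumes that the omnibus statistic has the form $\tau^{-1/2} \sum_{i \le \tau} \Phi^{-1}(Y_i)$, which is how we read the description above; the function names, the fixed number of exceedances $\tau = 25$ and the number of replications are hypothetical choices for this example.

```python
import random
from statistics import NormalDist, fmean, pvariance

STD_NORMAL = NormalDist()  # standard normal distribution, provides Phi^{-1}


def omnibus_statistic(ys):
    """Return tau**(-1/2) * sum(Phi^{-1}(y)) for exceedance values ys in (0, 1)."""
    tau = len(ys)
    if tau == 0:
        raise ValueError("the statistic is only defined for tau > 0")
    return sum(STD_NORMAL.inv_cdf(y) for y in ys) / tau ** 0.5


def simulate_null(tau, n_rep=2000, seed=7):
    """Simulate the statistic n_rep times with tau uniform exceedance values."""
    rng = random.Random(seed)
    eps = 1e-12  # keep arguments strictly inside (0, 1) for inv_cdf
    stats = []
    for _ in range(n_rep):
        ys = [min(max(rng.random(), eps), 1.0 - eps) for _ in range(tau)]
        stats.append(omnibus_statistic(ys))
    return stats


stats = simulate_null(tau=25)
print(fmean(stats), pvariance(stats))  # should be near 0 and 1
```

Since $\Phi^{-1}(Y_i)$ is standard normal for uniform $Y_i$, the scaled sum is standard normal for every fixed $\tau > 0$; no constants such as $A$ or $\delta$ enter the computation, which is the practical appeal of the omnibus test.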
This test statistic is analogous to that in Falk and Michel (2009) for testing for a multivariate generalized Pareto distribution.
The next result provides the asymptotic distribution of T n,cn under the alternative ϑ n = ϑ n (ξ) as n → ∞.
Proposition 2.3. Under the assumptions of Theorem 2.2 we have
Proof. First we compute the asymptotic mean and variance of $\Phi^{-1}(Y)$ under $\vartheta_n$ and $c_n$ for $n \to \infty$. From Lemma 2.1 we obtain that the density of $Y$ under $\vartheta_n$ is for $0 \le u \le 1$ and $c_n \ge c_0$ given by From Fubini's theorem and the substitution $u \mapsto \Phi(x)$ we, therefore, obtain where $\varphi(x) = \Phi'(x) = (2\pi)^{-1/2}\exp(-x^2/2)$, $x \in \mathbb{R}$, is the density of the standard normal df $\Phi$.
Now we can compute the asymptotic distribution of T n,cn under ϑ n . We have where the first term is by a suitable version of the central limit theorem asymptotically standard normal distributed, and which completes the proof.
Denote by $k_n := \min\left\{k \in \mathbb{N} : E_{\vartheta_n(\xi)}\left(\phi_i^*(N_{k,c_k})\right) \ge E_{\vartheta_n(\xi)}\left(\phi_i(N_{n,c_n})\right)\right\}$ the least sample size for which $\phi_i^*(N_{k_n,c_{k_n}})$ is, at $\vartheta_n(\xi)$, at least as good as $\phi_i(N_{n,c_n})$. The relative efficiency of $\phi_i^*(N_{k_n,c_{k_n}})$ with respect to $\phi_i(N_{n,c_n})$ is then defined as $n/k_n$. From (12) and (13) we obtain that (14) lim n→∞ n |c n | 1+2δ k n |c kn | 1+2δ = (2δ + 1) see Section 10.2 in Pfanzagl (1994) for the underlying reasoning. This explains the significance of the asymptotic relative efficiency defined above.

Testing in an exponential family model
In this section we assume that the distribution of W belongs to an exponential family given by the probability densities on the interval [0, 1]  (14).
Remark 3.1. From the arguments in the proof of Lemma 2.1 we obtain that the rv $\sup_{0\le t\le 1}(X_t/c)$ has for $c < 0$ close to zero and each $\vartheta \in \mathbb{R}$ on $[0,1]$ the Lebesgue-
In what follows we put with arbitrary $\xi \in \mathbb{R}$ ϑ n := ϑ n (ξ) := ξ (n |c n |)
Theorem 3.2. Suppose that $|c_n| \to 0$, $n|c_n| \to \infty$ as $n \to \infty$. Then we obtain for the loglikelihood ratio in (10) the expansion
Proof. Again we compile several facts first.
This follows from the expansion exp(x) = 1 + x + o(x) as x → 0: This can be seen as follows. From Remark 3.1 and Fact 5 we obtain which is Fact 6.
Fact 6 together with Fact 1 yields Repeating the arguments in the proof of Theorem 2.2 one shows that It remains to show that Repeating the arguments in the proof of Fact 6 we obtain uniformly for u ∈ [0, 1] and n ∈ N. The expansion log(1 + ε) = ε − ε 2 /2 + O ε 2 for ε → 0 together with Fact 7, thus, yields,
We, thus, obtain . From Fact 7 we obtain (16) III n ∼ n |c n | A 2 1 (n |c n |) 1/2 Next we show that I n = o P0 (1). This assertion follows, if we show that By elementary arguments we obtain which is (17).
We, thus, have established (15), which completes the proof of Theorem 3.2.
As $T_{n,c_n} \to_{D_{\vartheta_n,c_n}} N(0,1)$, Lemma 3.3 implies that the test statistic $T_{n,c_n}$ is not capable of detecting the alternative $\vartheta_n = \vartheta_n(\xi)$.