Estimating self-similarity through complex variations

We estimate the self-similarity index of an $H$-sssi process through complex variations. The advantage of complex variations is that they do not require the existence of moments and can therefore be used for infinite-variance processes.

A process $X = (X(t))_{t \in \mathbb{R}}$ is $H$-self-similar ($H > 0$) if, for all $c > 0$,
$$(X(ct))_{t \in \mathbb{R}} \overset{(d)}{=} (c^H X(t))_{t \in \mathbb{R}},$$
where $\overset{(d)}{=}$ stands for equality of the finite-dimensional distributions. Process $X$ has stationary increments if, for all $s \in \mathbb{R}$,
$$(X(t+s) - X(s))_{t \in \mathbb{R}} \overset{(d)}{=} (X(t) - X(0))_{t \in \mathbb{R}}.$$
A process that is $H$-self-similar with stationary increments will be called $H$-sssi. The aim of this paper is to estimate $H$ from a discrete sample of $X$ over the time interval $[0, 1]$. More precisely, one observes an $H$-sssi process $X$ at the discrete sampling times $k/n$, $k = 0, \dots, n$. Let $a = (a_0, a_1, \dots, a_K)$ be a finite sequence with $L + 1$ vanishing moments:
$$\sum_{k=0}^{K} a_k k^\ell = 0, \quad \ell = 0, \dots, L, \qquad \sum_{k=0}^{K} a_k k^{L+1} \neq 0,$$
with the convention $0^0 = 1$. The increments of $X$ with respect to the sequence $a$ are defined by
$$\Delta_{p,n} X = \sum_{k=0}^{K} a_k X\Bigl(\frac{p+k}{n}\Bigr).$$
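As a concrete check of the vanishing-moment condition, the second-order difference sequence $a = (1, -2, 1)$ has $L + 1 = 2$ vanishing moments ($L = 1$). The following short sketch (the function name `vanishing_moments` is ours, not from the paper) verifies this numerically, using the convention $0^0 = 1$:

```python
import numpy as np

def vanishing_moments(a):
    """Return L such that sum_k a_k * k**l = 0 for l = 0..L and != 0 for
    l = L + 1. Uses the convention 0**0 = 1 (numpy follows it)."""
    a = np.asarray(a, dtype=float)
    k = np.arange(len(a), dtype=float)
    ell = 0
    while abs(np.sum(a * k**ell)) < 1e-12:
        ell += 1
    return ell - 1  # last exponent with a vanishing moment

print(vanishing_moments([1.0, -2.0, 1.0]))        # second-order differences: 1
print(vanishing_moments([1.0, -1.0]))             # first-order differences: 0
print(vanishing_moments([1.0, -3.0, 3.0, -1.0]))  # third-order differences: 2
```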
A usual statistical tool is the $\varphi$-variation
$$V_n(\varphi, X) = \frac{1}{n - K + 1} \sum_{p=0}^{n-K} \varphi(|\Delta_{p,n} X|), \tag{1.1}$$
where $\varphi$ is a given function. When $X$ is a fractional Brownian motion, generalized quadratic variations ($\varphi(x) = x^2$, $L \geq 1$) provide a consistent estimate of $H$, asymptotically normal with rate $\sqrt{n}$ [10]. These generalized quadratic variations can also be used for Rosenblatt or, more generally, Hermite processes [4,5,17], which have finite variance, but the $\sqrt{n}$ rate of convergence is not always obtained. A similar tool is wavelet analysis: roughly speaking, the increments of the process $X$ are replaced by wavelet coefficients [1,11]. For $H$-sssi processes with infinite variance, for instance $\alpha$-stable processes, the use of (generalized) quadratic variations is hopeless. One could try the functions $\varphi(x) = x^p$, $0 < p < \alpha$. Indeed, $p$-variations still yield an estimate for fractional Brownian motions [12]. But, for stable processes, this requires an a priori accurate estimate of the stability index $\alpha$. One can try log-variations $\varphi(x) = \log|x|$ as in [7,9]; this estimate requires only the existence of logarithmic moments. Unfortunately, the rate of convergence of the resulting self-similarity index estimate is very slow (logarithmic [9, sec. 6.2]). [14] proposes an estimate based on weighted sums of logarithms of wavelet coefficients, but on a time interval $[0, T]$ with $T \to +\infty$.
At present, in the statistical literature, there is no consistent estimate of the self-similarity index with a reasonable rate of convergence. One only knows that the self-similarity index is identifiable from the observation of one sample path over a bounded interval. Let us come back to the $\varphi$-variations with $\varphi(x) = x^p$. These power functions are used because, for all $x, y > 0$, $\varphi(x/y) = \varphi(x)/\varphi(y)$. But, for any $p \in \mathbb{C}$, we still have $(x/y)^p = x^p/y^p$. Among these complex powers, the purely imaginary powers ($p = iM$, $M \in \mathbb{R}$) are particularly interesting, since $|x^{iM}| = 1$. Indeed, for any positive random variable $U$, the expectation of $U^{iM}$ always exists. In this paper, we therefore work with the complex variations
$$V_n(X) = \frac{1}{n - K + 1} \sum_{p=0}^{n-K} |\Delta_{p,n} X|^{iM}.$$
These complex variations provide a self-similarity index estimate without any assumption on the existence of moments of $X$. Under suitable conditions, we prove consistency and obtain the rate of convergence.
We then consider a family of examples. $H$-sssi second-order processes exist for $0 < H \leq 1$. Stable $H$-sssi processes exist for $0 < H \leq \max(1, 1/\alpha)$. Therefore, we consider examples that cover the range of admissible parameters $H$: $H$-sssi processes with independent increments, fractional Brownian motions, well-balanced linear fractional stable motions and Takenaka's processes. In the case of fractional Brownian motion, we provide a central limit theorem and a numerical comparison between quadratic variations, log-variations and complex variations. Finally, we prove that the distribution of $X(1)$ is identifiable, even though we do not build a tractable estimate.

Settings and assumptions
For $r > 0$ and $M \in \mathbb{R}$, set $r^{iM} = e^{iM \log(r)}$.
We will make some of the following assumptions.
Trivial cases, like $X(t) = Ut$ ($U \sim \mathcal{N}(0,1)$) when $L \geq 1$, are excluded by the first condition. For the second condition, the function $M \mapsto E|\Delta_{0,1} X|^{iM}$ is continuous and equal to 1 at $M = 0$, so there always exists a neighborhood of 0 on which $E|\Delta_{0,1} X|^{iM} \neq 0$. Moreover, we prove that $E|\Delta_{0,1} X|^{iM}$ never vanishes for Gaussian and stable variables. We show in the section "Examples" how to check the third condition on several $H$-sssi processes. Note that the limit case $\gamma = 0$ corresponds to the case where the series $\sum_{k \in \mathbb{Z}} |\operatorname{cov}(|\Delta_{k,1} X|^{iM}, |\Delta_{0,1} X|^{iM})|$ is convergent.
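For a standard Gaussian variable $Z$, the second condition can in fact be made explicit. Using the moment formula $E|Z|^p = 2^{p/2}\,\Gamma((p+1)/2)/\sqrt{\pi}$ (valid for $\operatorname{Re} p > -1$, hence for $p = iM$) together with the classical identity $|\Gamma(1/2 + iy)|^2 = \pi/\cosh(\pi y)$, one gets a closed form for the modulus; a short derivation (ours, consistent with the claim above):

```latex
\mathbb{E}\,|Z|^{iM} = \frac{2^{iM/2}\,\Gamma\!\left(\tfrac{1+iM}{2}\right)}{\sqrt{\pi}},
\qquad
\bigl|\mathbb{E}\,|Z|^{iM}\bigr|
  = \frac{1}{\sqrt{\pi}}\left|\Gamma\!\left(\tfrac12 + \tfrac{iM}{2}\right)\right|
  = \frac{1}{\sqrt{\cosh(\pi M/2)}} > 0 .
```

Hence $E|Z|^{iM}$ never vanishes, for any $M \in \mathbb{R}$.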
Let us define the estimate of the self-similarity index $H$:
$$\widehat{H}_n = \frac{1}{M \log 2}\,\arg\bigl(V_{n/2}(X)/V_n(X)\bigr).$$
Let us give a heuristic explanation of this estimator $\widehat{H}_n$. As proved later, the expectation of $V_n(X)$ equals $n^{-iHM} E|\Delta_{0,1} X|^{iM}$. If $V_n(X)$ converges, as $n \to +\infty$, to its expectation, the ratio $V_{n/2}/V_n$ converges to $2^{iHM}$. Since the modulus of $V_{n/2}/V_n$ need not equal 1, we take the argument of $V_{n/2}/V_n$ to build the estimator $\widehat{H}_n$. The key steps of the proof are as follows. Since the process $X$ is $H$-sssi, the increments $(\Delta_{p,n} X)_p$ have the same distribution as $n^{-H}(\Delta_{p,1} X)_p$; taking expectations proves (2.1). Using (2.3), assumption 3)b) and the dominated convergence theorem then yield the convergence of $V_n(X)$.
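As an illustration of the estimator, the following sketch (hypothetical code, not from the paper; the names `fbm_sample`, `complex_variation` and `estimate_H` are ours) simulates a fractional Brownian motion on $[0, 1]$ and applies $\widehat{H}_n$ with the second-order increment sequence $a = (1, -2, 1)$ and $M = 1$:

```python
import numpy as np

def fbm_sample(n, H, rng):
    """Sample X(k/n), k = 0..n, of fractional Brownian motion: exact
    simulation via Cholesky factorization of the fGn covariance."""
    k = np.arange(n)
    # autocovariance of fractional Gaussian noise at integer lags
    r = 0.5 * ((k + 1)**(2*H) - 2*k**(2*H) + np.abs(k - 1)**(2*H))
    cov = r[np.abs(k[:, None] - k[None, :])]
    incr = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    # self-similarity: X(k/n) equals n^{-H} B(k) in distribution
    return np.concatenate([[0.0], np.cumsum(incr)]) * n**(-H)

def complex_variation(x_vals, a, M):
    """V_n(X): mean over p of |Delta_{p,n} X|^{iM}."""
    K = len(a) - 1
    d = sum(a[k] * x_vals[k:len(x_vals) - K + k] for k in range(K + 1))
    return np.mean(np.exp(1j * M * np.log(np.abs(d))))

def estimate_H(x, M=1.0, a=(1.0, -2.0, 1.0)):
    """H_n = arg(V_{n/2}/V_n) / (M log 2); x sampled at k/n, n a power of 2."""
    v_n = complex_variation(x, a, M)
    v_half = complex_variation(x[::2], a, M)  # subsample onto the grid k/(n/2)
    return np.angle(v_half / v_n) / (M * np.log(2))

rng = np.random.default_rng(0)
x = fbm_sample(2048, H=0.7, rng=rng)
print(estimate_H(x))  # close to the true H = 0.7
```

The scaling $n^{-H}$ in `fbm_sample` is actually immaterial here: multiplying $X$ by a constant $c$ multiplies both $V_{n/2}$ and $V_n$ by $c^{iM}$, which cancels in the ratio.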

Theoretical results
Fractional Brownian motion is the unique Gaussian $H$-sssi process, up to a multiplicative constant, with $H \in (0, 1]$. The case $H = 1/2$ has just been treated. Our assumptions exclude the trivial case $H = 1$. Let us consider the other cases. We prove that Theorem 2.1 holds with $L \geq 1$ and any $M < 2\pi/\log(2)$. Moreover, we prove a central limit theorem.
Let us first prove that Assumption 2) is satisfied for any $M$.

Proof. We start with the following lemma.
One then checks that $P_\rho(x, y)$ is a fourth-degree polynomial that depends continuously on $\rho$. By a Taylor expansion of order 2 of the function $\rho \mapsto B_\rho(x, y)$, there exists $\tilde{\rho} \in (0, \rho)$ controlling the remainder. On the compact set $|\rho| \leq 1/2$, the polynomials $P_\rho(x, y)$ can be bounded by a fourth-degree polynomial $P(x, y)$; more precisely, for all $x, y \in \mathbb{R}$, $|P_\rho(x, y)| \leq |P(|x|, |y|)|$. The dominated convergence theorem is then applied to (3.2) as $\rho \to 0$. Since $|\operatorname{cov}(|U|^{iM}, |V|^{iM})|$ is bounded by 1, (3.1) is proved for any $|\rho| \leq 1/2$.
Let us now prove Proposition 3.1. We next compute the covariance between $\Delta_{k,1} X$ and $\Delta_{0,1} X$. A Taylor expansion up to order 3, combined with Lemma 3.2, ensures the convergence of the series $\sum_{k \in \mathbb{Z}} |\operatorname{cov}(|\Delta_{k,1} X|^{iM}, |\Delta_{0,1} X|^{iM})|$ for any $H \in (0, 1)$.
By self-similarity, $W_n(X)$ has the same distribution as the corresponding variation of the unit-scale increments. Let us recall Breuer-Major's theorem [2,13]. The Hermite rank $d$ of a function $f$ is the order of the first non-trivial coefficient in the Hermite expansion of $f$. Let $(X_k)_{k \in \mathbb{Z}}$ be a centered stationary Gaussian sequence and let $\rho(k) = E(X_0 X_k)$. Assume $d \geq 1$ and $\sum_{k \in \mathbb{Z}} |\rho(k)|^d < \infty$. Then $\frac{1}{\sqrt{n}} \sum_{k=1}^{n} f(X_k)$ converges in distribution, as $n \to +\infty$, to a centered Gaussian variable. Since $\int_{\mathbb{R}} |x|^{iM} x e^{-x^2/2}\,dx = 0$, the Hermite rank of any linear combination $\alpha\,\mathrm{Re}(|x|^{iM}) + \beta\,\mathrm{Im}(|x|^{iM})$ is at least 2. Breuer-Major's theorem can thus be applied to any real function $f(x) = \alpha\,\mathrm{Re}(|x|^{iM}) + \beta\,\mathrm{Im}(|x|^{iM})$; by the Cramér-Wold device, this proves the joint convergence of the real and imaginary parts of $W_n(X)$. Therefore, $\sqrt{n}\,(W_n(X) - E|\Delta_{0,1} X|^{iM})$ converges in distribution, as $n \to +\infty$, to a centered Gaussian variable. For the convergence of the couple $(W_{n/2}, W_n)$, one needs a bivariate Breuer-Major theorem, whose proof is similar to that of the classical one. We sketch it, following [13, ch. 6 & 7]. Let $f$ and $g$ be two real functions whose Hermite ranks are at least $d \geq 1$, and let $(f_q)$ and $(g_q)$ be the coefficients of their Hermite expansions, where $(H_q)$ are the Hermite polynomials. The classical Breuer-Major theorem ensures that $\frac{1}{\sqrt{n}} \sum_{k=1}^{n} H_q(X_k)$ converges in distribution, as $n \to +\infty$, to a centered Gaussian variable $\sqrt{q! \sum_{k \in \mathbb{Z}} \rho(k)^q}\, N_q$, where $N_q$ is a centered standard Gaussian variable. Since the Hermite polynomials are orthogonal, condition (6.2.11) of Theorem 6.2.3 of [13] is straightforwardly satisfied. It follows, for any given $m \geq d$, that the vector of these normalized sums for $d \leq q \leq m$ converges in distribution, as $n \to +\infty$, to a centered Gaussian vector in which the centered standard Gaussian variables $(N_q)$ are independent. We then mimic the proof of Theorem 7.2.4 of [13]: condition (d) is satisfied, and one can let $m \to +\infty$ in (3.3).
In other words, we have proved that the couple converges in distribution, as $n \to +\infty$, to a centered Gaussian vector. The proof of the convergence of the couple $(W_{n/2}, W_n)$ is similar. One proves the convergence of the vector with components $H_q\bigl(\Delta_{k,n/2} X / \sqrt{\operatorname{var}(\Delta_{k,n/2} X)}\bigr)$, first for a given $q$ with $d \leq q$, using the fourth moment theorem [13, Th. 5.2.7], then for any given $m$ with $d \leq q \leq m$, using the orthogonality of the Hermite polynomials, and finally for any $q \geq d$. Since the estimator $\widehat{H}_n$ is a smooth function of the couple $(W_{n/2}, W_n)$, the central limit theorem of Proposition 3.1 is proved.
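The Hermite-rank claim above can be checked numerically: with $M = 1$, the first Hermite coefficient of $\mathrm{Re}(|x|^{iM})$ vanishes by symmetry, while the second does not. A Monte Carlo sketch (ours; the tolerances are ad hoc):

```python
import numpy as np

# Hermite coefficients c_q = E[f(Z) He_q(Z)] / q! for f(x) = Re(|x|^{iM}),
# with He_1(z) = z and He_2(z) = z^2 - 1 (probabilists' Hermite polynomials).
rng = np.random.default_rng(42)
z = rng.standard_normal(1_000_000)
M = 1.0
f = np.cos(M * np.log(np.abs(z)))  # Re(|z|^{iM}) = cos(M log|z|)

c1 = np.mean(f * z)                # ~ 0: the integrand is odd
c2 = np.mean(f * (z**2 - 1)) / 2   # clearly non-zero: Hermite rank is 2

print(c1, c2)
```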

Numerical comparisons
Let us now numerically compare the estimate $\widehat{H}_n$ of Theorem 2.1 with the quadratic and log variation estimates. More precisely, we use $\varphi(x) = x^2$ and $\varphi(x) = \log(x)$ in (1.1). We compute the mean estimate and the mean square error over $SP = 200$ sample paths, for different values of $H$ and $n = 2^N$. At first sight, quadratic variations lead to the best results, which is consistent with [6]; logarithmic variations lead to the worst results, and complex variations are in between.

Well-balanced linear fractional stable motions
Let $M$ be a symmetric $\alpha$-stable random measure ($0 < \alpha < 2$) with Lebesgue control measure. We refer to [15] for basic facts on integration with respect to a stable random measure. The well-balanced linear fractional stable motions are defined by [15, p. 140], where $0 < H < 1$ and $H \neq 1/\alpha$. Process $X$ is $H$-sssi. We prove, in the following series of Lemmas 3.3, 3.4, 3.5 and 3.6, that, for all $L \geq 0$, $\lim_{n \to +\infty} \widehat{H}_n = H$ (P), and that there exists $C > 0$ such that the stated rate of convergence holds. Therefore, Theorem 2.1 holds with $L > 2/\alpha + H - 1$ and any $M < 2\pi/\log(2)$. In other words, to obtain the optimal rate of convergence, one needs an a priori bound on the stability index $\alpha$.
Let us now prove that Assumption 2) is satisfied, for a standard stable variable, for any $M$: for all $M \in \mathbb{R}$, $E|X|^{iM} \neq 0$.
Proof. From Lemma 3.3, with $0 < \operatorname{Re}(p) < \min(\alpha, 1)$, let us perform the change of variable $x^\alpha = y^2$. Let $Z$ be a centered standard Gaussian variable; the same computation can be written for $Z$. Following Lemma 3.1, (3.4), (3.5) and (3.6) lead to the claimed identity. The dominated convergence theorem then applies as $p \to iM$, and Lemma 3.4 is proved.
Lemma 3.5. Let $(S, \mu)$ be a measure space, $f, g \in L^\alpha(S, \mu)$ and $M$ be a symmetric $\alpha$-stable random measure on $S$ with control measure $\mu$. Set $U = \int_S f \, dM$ and $V = \int_S g \, dM$. There exists a constant $C(\eta)$ such that the covariance $\operatorname{cov}(|U|^{iM}, |V|^{iM})$ satisfies the stated bound.

Proof. First note that, since the function $x \mapsto |x|^{iM}$ does not belong to $L^1$, [14, Th. 2.1] cannot be applied. Let $0 < \varepsilon < \alpha/2$ and let us apply Lemma 3.3 with $p = iM + \varepsilon$, and the same for $V$. Fubini's theorem then leads to the result.
Lemma 3.6. There exists a constant $C > 0$ such that the stated bound holds.

Proof. Using a change of variable, a Taylor expansion proves that there exists $C > 0$ such that the bound holds for $|t| \geq 2K$. By the Cauchy-Schwarz inequality, we use again (3.7) for large $p$; we then show that the bound holds for any $p$.
The Takenaka process $X$ is defined as a symmetric $\alpha$-stable integral; process $X$ is $\beta/\alpha$-sssi. Let us estimate $\operatorname{cov}(|\Delta_{k,1} X|^{iM}, |\Delta_{0,1} X|^{iM})$ by applying Lemma 3.5. We then have to estimate the resulting integral as $|k| \to +\infty$. When $k > 2r + K$, $|f_k(x, r) f_0(x, r)| = 0$; when $k < 2r + K$, $|f_k(x, r) f_0(x, r)|$ can only be bounded by a constant. In particular, increasing the number of vanishing moments of the sequence $(a_i)$ has no effect. Indeed, we have proved that there exists $C > 0$ such that the covariance bound holds. Take $M$ such that $0 < \beta/\alpha < 2\pi/(M \log(2))$. Then Theorem 2.1 applies.

Identification of the distribution of X(1)
We have just seen that the self-similarity index $H$ is identifiable. The aim of this section is to prove that the distribution of $X(1)$ is identifiable. Let us assume in this section the following:
• The distribution of $X(1)$ is symmetric.

Remarks
The following questions then arise. Are there other ways of estimating self-similarity than complex variations? Is one way optimal? There are clearly other estimators. For instance, one can use logarithmic variations (note that, in the following equation, logarithmic variations are used in another way than in [7,9]):
$$\widehat{H}_{\log,n} = \frac{V_{n/2}(\log, X) - V_n(\log, X)}{\log 2}.$$
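The heuristic behind $\widehat{H}_{\log,n}$ parallels the complex-variation one: by self-similarity, $\Delta_{p,n} X \overset{(d)}{=} n^{-H} \Delta_{p,1} X$, so (a sketch, assuming the logarithmic moment exists):

```latex
\mathbb{E}\,V_n(\log, X)
  = \mathbb{E}\log\bigl|n^{-H}\Delta_{0,1}X\bigr|
  = -H\log n + \mathbb{E}\log\bigl|\Delta_{0,1}X\bigr|,
\\[4pt]
\mathbb{E}\,V_{n/2}(\log, X) - \mathbb{E}\,V_n(\log, X)
  = -H\log(n/2) + H\log n
  = H\log 2 ,
```

which explains the normalization by $\log 2$.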
An inspection of the proof indicates that Theorem 2.1 holds assuming the existence of logarithmic moments for $X$. Now let $f$ be an arbitrary function from $\mathbb{R}^2$ to $\mathbb{R}$ and $g$ from $\mathbb{R}$ to $\mathbb{C}$. Define the following estimate:
$$\widehat{H}_{f,g,n} = f\bigl(V_{n/2}(g, X), V_n(g, X)\bigr).$$
What is the class $\mathcal{F}$ of admissible functions $f$ and $g$, that is, of functions $f$ and $g$ leading to a consistent estimate under weak assumptions on $X$? Among this class $\mathcal{F}$, which functions are best, say with respect to the mean square error? These questions remain open, and this paper offers no answer.