Variations and Hurst index estimation for a Rosenblatt process using longer filters

The Rosenblatt process is a self-similar non-Gaussian process which lives in the second Wiener chaos, and occurs as the limit of correlated random sequences in so-called "non-central limit theorems". It shares the same covariance as fractional Brownian motion. We study the asymptotic distribution of the quadratic variations of the Rosenblatt process based on long filters, including filters based on high-order finite-difference and wavelet-based schemes. We find exact formulas for the limiting distributions, which we then use to devise strongly consistent estimators of the self-similarity parameter $H$. Unlike the case of fractional Brownian motion, no matter how high the filter orders are, the estimators are never asymptotically normal, converging instead in the mean square to the observed value of the Rosenblatt process at time 1.


Introduction
Self-similar stochastic processes are of practical interest in various applications, including econometrics, internet traffic, and hydrology. These are processes $X = \{X(t) : t \geq 0\}$ whose dependence on the time parameter $t$ is self-similar, in the sense that there exists a (self-similarity) parameter $H \in (0,1)$ such that for any constant $c \geq 0$, the processes $\{X(ct) : t \geq 0\}$ and $\{c^H X(t) : t \geq 0\}$ have the same distribution. These processes are often endowed with other distinctive properties.
The fractional Brownian motion (fBm) is the usual candidate to model phenomena in which the self-similarity property can be observed from the empirical data. This fBm $B^H$ is the continuous centered Gaussian process with covariance function
$$R_H(t,s) := E\left[B^H(t)B^H(s)\right] = \frac{1}{2}\left(t^{2H} + s^{2H} - |t-s|^{2H}\right). \qquad (1)$$
The parameter $H$ characterizes all the important properties of the process. In addition to being self-similar with parameter $H$, which is evident from the covariance function, fBm has correlated increments: in fact, from (1) we get, as $n \to \infty$,
$$E\left[\left(B^H(n+1) - B^H(n)\right)\left(B^H(1) - B^H(0)\right)\right] = H(2H-1)\,n^{2H-2} + o\left(n^{2H-2}\right); \qquad (2)$$
when $H < 1/2$, the increments are negatively correlated and the correlation decays more slowly than quadratically; when $H > 1/2$, the increments are positively correlated and the correlations decay so slowly that they are not summable, a situation which is commonly known as the long memory property. The covariance structure (1) also implies
$$E\left[\left(B^H(t) - B^H(s)\right)^2\right] = |t-s|^{2H}; \qquad (3)$$
this property shows that the increments of fBm are stationary and self-similar; its immediate consequence for higher moments can be used, via the so-called Kolmogorov continuity criterion, to imply that $B^H$ has paths which are almost surely $(H-\varepsilon)$-Hölder-continuous for any $\varepsilon > 0$. It turns out that fBm is the only continuous Gaussian process which is self-similar with stationary increments. However, there are many more stochastic processes which, except for the Gaussian character, share all the other properties above for $H > 1/2$ (i.e. (1) which implies (2), the long-memory property, (3), and in many cases the Hölder-continuity). In some models the Gaussian assumption may be implausible and in this case one needs to use a different self-similar process with stationary increments to model the phenomenon. Natural candidates are the Hermite processes: these non-Gaussian stochastic processes appear as limits in the so-called Non-Central Limit Theorem (see [5], [8], [25]) and do indeed have all the properties listed above. While fBm can be expressed as a Wiener integral with respect to the standard Wiener process, i.e. the integral of a deterministic kernel with respect to a standard Brownian motion, the Hermite process of order $q \geq 2$ is a $q$th iterated integral of a deterministic function with $q$ variables with respect to a standard Brownian motion. When $q = 2$, the Hermite process is called the Rosenblatt process. This stochastic process typically appears as a limiting model in various applications such as the unit root testing problem (see [31]), the semiparametric approach to hypothesis testing (see [13]), or long-range dependence estimation (see [15]). On the other hand, since it is non-Gaussian and self-similar with stationary increments, the Rosenblatt process can also be an input in models where self-similarity is observed in empirical data which appears to be non-Gaussian. The need for non-Gaussian self-similar processes in practice (for example in hydrology) is mentioned in the paper [26] based on the study of stochastic modeling for river-flow time series in [16]. Recent interest in the Rosenblatt and other Hermite processes, due in part to their non-Gaussian character, and in part to their independent mathematical value, is evidenced by the following references: [4], [6], [10], [18], [19], [20], [27], [28].
The results in these articles, and in the previous references on the non-central limit theorem, have one point in common: of all the Hermite processes, the most important one in terms of limit theorem, apart from fBm, is the Rosenblatt process. As such, it should be the first non-Gaussian self-similar process for which to develop a full statistical estimation theory. This is one motivation for writing this article.
Since the Hurst parameter H, thus called in reference to the hydrologist who discovered its original practical importance (see [14]), characterizes all the important properties of a Hermite process, its proper statistical estimation is of the utmost importance. Several statistics have been introduced to this end in the case of fBm, such as variograms, maximum likelihood estimators, or spectral methods, k-variations and wavelets. Information on these various approaches, apart from wavelets, for fBm and other long-memory processes, can be found in the book of Beran [3]. More details about the wavelet-based approach can be found in [2], [11] and [30].
In this article, we will concentrate on one of the more popular methods to estimate $H$: the study of power variations; it is particularly well-adapted to the non-Gaussian Hermite processes, because explicit calculations can be performed via Wiener chaos analysis. In its simplest form, the $k$th power variation statistic of a process $\{X_t : t \in [0,1]\}$, calculated using $N$ data points, is defined as the following quantity (the absolute value of the increment may be used in the definition for non-even powers):
$$V_N := \frac{1}{N}\sum_{i=0}^{N-1}\left[\frac{\left(X_{(i+1)/N} - X_{i/N}\right)^k}{E\left[\left(X_{(i+1)/N} - X_{i/N}\right)^k\right]} - 1\right].$$
There exists a direct connection between the behavior of the variations and the convergence of an estimator for the self-similarity order based on these variations (see [7], [28]): if the renormalized variation satisfies a central limit theorem then so does the estimator, a desirable fact for statistical purposes.
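As a rough numerical illustration of the power variation statistic, the following sketch computes a quadratic variation ($k = 2$) for a simulated standard Brownian motion, i.e. the Gaussian case $H = 1/2$, for which the increments are independent and $E[(X_{(i+1)/N} - X_{i/N})^2] = N^{-2H}$. The normalized form used in `power_variation` (average of increment powers divided by their expectation, minus one) is assumed here to be the usual form of the statistic, and all function names are hypothetical.

```python
import random

def power_variation(path, k, expected_moment):
    """Normalized k-th power variation of a path sampled at 0, 1/N, ..., 1.

    expected_moment is E[(X_{(i+1)/N} - X_{i/N})^k], assumed known and
    identical for every i (stationary increments).
    """
    n = len(path) - 1
    total = 0.0
    for i in range(n):
        increment = path[i + 1] - path[i]
        total += increment ** k / expected_moment - 1.0
    return total / n

def brownian_path(n, rng):
    """Standard Brownian motion (fBm with H = 1/2) on the grid i/n."""
    path = [0.0]
    step_sd = (1.0 / n) ** 0.5
    for _ in range(n):
        path.append(path[-1] + rng.gauss(0.0, step_sd))
    return path

rng = random.Random(0)          # fixed seed, for reproducibility only
N = 10_000
H = 0.5
path = brownian_path(N, rng)
v_n = power_variation(path, k=2, expected_moment=N ** (-2 * H))
print(v_n)  # small in absolute value; its standard deviation is sqrt(2/N)
```

In this Gaussian case with independent increments the statistic is centered with variance $2/N$, which is why it is expected to be close to zero. For the Rosenblatt process the fluctuations are of the larger order $N^{H-1}$, which is the phenomenon studied in this article.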
The recent paper [28] studies the quadratic variation of the Rosenblatt process $Z$ (the $V_N$ above with $k = 2$), exhibiting the following facts: the normalized sequence $N^{1-H}V_N$ satisfies a non-central limit theorem, namely it converges in the mean square to the Rosenblatt random variable $Z(1)$ (the value of the process $Z$ at time 1); from this, we can construct an estimator for $H$ whose behavior is still non-normal. The same result is also obtained in the case of the estimators based on the wavelet coefficients (see [2]). In the simpler case of fBm, this situation still occurs when $H > 3/4$ (see for instance [29]). For statistical applications, a situation in which asymptotic normality holds might be preferable. To achieve this for fBm, it has been known for some time that one may use "longer filters" (that is, replacing the increments $X_{(i+1)/N} - X_{i/N}$ by the second-order increments $X_{(i+1)/N} - 2X_{i/N} + X_{(i-1)/N}$, or higher-order increments, for instance; see [7]). To have asymptotic normality in the case of the Rosenblatt process, it was shown in [28] that one may perform a compensation of the non-normal component of the quadratic variation. In fact, this is possible only in the case of the Rosenblatt process; it is not possible for higher-order Hermite processes, and is not possible for fBm with $H > 3/4$ (recall that the case of fBm with $H \leq 3/4$ does not require any compensation). The compensation technique for the Rosenblatt process yields asymptotic variances which are difficult to calculate and may be very high.
The question then arises to find out whether using longer filters for the Rosenblatt process might yield asymptotically normal estimators, and/or might result in low asymptotic variances. In this article, using recent results on limit theorems for multiple stochastic integrals based on the Malliavin calculus (see [22], [23]), we will see that the answer to the first question is negative, while the answer to the second question is affirmative. We will use quadratic variations (k = 2) for simplicity. A summary of our results is as follows.
Here $\Omega$ denotes the underlying probability space, and $L^1(\Omega)$ and $L^2(\Omega)$ are the usual spaces of integrable and square-integrable random variables.
• $\frac{\sqrt{N}}{\sqrt{c_{1,H}}}\,T_4$ converges in distribution to a standard normal (Theorem 2), where $c_{1,H}$ is given in Proposition 4.
• There exists a strongly consistent estimator $\hat H_N$ for $H$ based on $V_N$ (Theorem 5), and $2c_{2,H}^{-1/2}(\log N)\,N^{1-\hat H_N}\left(\hat H_N - H\right)$ converges in $L^1(\Omega)$ to a Rosenblatt random variable (Theorem 7). Here $c_{2,H}$ is given in (16). Note that while the rate of convergence of the estimator, of order $N^{-1+H}\log^{-1}N$, depends on $H$, the convergence result above can be used without knowledge of $H$ since one may plug in $\hat H_N$ instead of $H$ in the convergence rate.
• The asymptotic variance $c_{2,H}$ in the above convergence decreases as the length of the filter increases; this decrease is much faster for wavelet-based filters than for finite-difference-based filters: for values of $H < 0.95$, $c_{2,H}$ reaches values below 5% for wavelet filters of length less than 6, but for finite-difference filters of length no less than 16.
• When $H \in (1/2, 2/3)$, the adjusted variation of Section 6, suitably normalized, converges in distribution to a standard normal, where $c_{2,H}$ is given explicitly in formula (16) and $c_{3,H}$ in formula (19).
Similarly, the correspondingly adjusted and normalized estimator converges in distribution to the same standard normal. However, no matter how much we increase the order and/or the length of the filter, we cannot improve the threshold of 2/3 for $H$.
What prevents the normalization of V N from converging to a Gaussian, no matter how long the filter is, is the distinction between the two terms T 2 and T 4 . In the case of fractional Brownian motion, V N contains only one "T 2 "-type term (second chaos), but this term has a behavior similar to our term T 4 , and does converge to a normal when the filter is long enough; this fact has been noted before (see [7]). In our case, the normalized T 2 always converges (in L 2 (Ω)) to a Rosenblatt random variable; the piece that sometimes has normal asymptotics is T 4 , but since T 2 always dominates it, V N 's behavior is always that of T 2 . This sort of phenomenon was already noted in [6] with the order-one filter for all non-Gaussian Hermite processes, but now we know it occurs also for the simplest Hermite process that is not fBm, for filters of all orders.
The organization of our paper is as follows. Section 2 summarizes the stochastic analytic tools we will use, and gives the definitions of the Rosenblatt process and the filter variations. Therein we also establish a specific representation of the 2-power variation as the sum of two terms, one in the second Wiener chaos, which we call T 2 , and another, T 4 , in the fourth Wiener chaos. Section 3 establishes the correct normalizing factors for the variations, by computing second moments, showing in particular that T 2 is the dominant term. Section 4 proves that the renormalized T 4 is asymptotically normal. Section 5 proves that T 2 converges in L 2 (Ω) to the value Z (1) of the Rosenblatt process at time 1. In Section 6 it is shown that the variation obtained by subtracting this observed limit of T 2 leads to a correction term which is asymptotically normal. Section 7 establishes the strong consistency of the estimatorĤ for H based on the variations, and proves that the renormalized estimator converges to a Rosenblatt random variable in L 1 (Ω). Its asymptotic variance is given explicitly for any filter, thanks to the calculations in Section 3. In Section 8, we compare the numerical values of the asymptotic variances for various choices of filters, including finite-difference filters and wavelet-based filters, concluding that the latter are more efficient.

Basic tools on multiple Wiener-Itô integrals
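Let $\{W_t : t \in [0,1]\}$ be a classical Wiener process on a standard Wiener space $(\Omega, \mathcal{F}, P)$. If a symmetric function $f \in L^2([0,1]^n)$ is given, the multiple Wiener-Itô integral $I_n(f)$ of $f$ with respect to $W$ is constructed and studied in detail in [21, Chapter 1]. Here we collect the results we will need. For the most part, the results in this subsection will be used in the technical portions of our proofs, which are in the Appendix.
One can construct the multiple integral starting from simple functions of the form $f := \sum_{i_1,\ldots,i_n} c_{i_1,\ldots,i_n} 1_{A_{i_1}\times\cdots\times A_{i_n}}$, where the coefficient $c_{i_1,\ldots,i_n}$ is zero if two indices are equal and the sets $A_{i_j}$ are disjoint intervals, by setting
$$I_n(f) := \sum_{i_1,\ldots,i_n} c_{i_1,\ldots,i_n}\,W(A_{i_1})\cdots W(A_{i_n}), \qquad W([a,b]) := W_b - W_a;$$
then the integral is extended to all symmetric functions in $L^2([0,1]^n)$ by a density argument. It is also convenient to note that this construction coincides with the iterated Itô stochastic integral
$$I_n(f) = n!\int_0^1\int_0^{t_n}\cdots\int_0^{t_2} f(t_1,\ldots,t_n)\,dW_{t_1}\cdots dW_{t_n}.$$
The application $I_n$ is extended to non-symmetric functions $f$ via $I_n(f) = I_n(\tilde f)$, where $\tilde f$ denotes the symmetrization of $f$ defined by $\tilde f(x_1,\ldots,x_n) = \frac{1}{n!}\sum_{\sigma\in S_n} f(x_{\sigma(1)},\ldots,x_{\sigma(n)})$. The map $(n!)^{-1/2}I_n$ can then be seen to be an isometry from the symmetric functions of $L^2([0,1]^n)$ to $L^2(\Omega)$. The $n$th Wiener chaos is the set of all integrals $\{I_n(f) : f \in L^2([0,1]^n)\}$; the Wiener chaoses form orthogonal sets in $L^2(\Omega)$. Summarizing, we have
$$E\left[I_n(f)I_m(g)\right] = n!\,\langle \tilde f, \tilde g\rangle_{L^2([0,1]^n)} \ \text{ if } m = n, \qquad E\left[I_n(f)I_m(g)\right] = 0 \ \text{ if } m \neq n. \qquad (6)$$
The product of two multiple integrals can be expanded explicitly (see [21]): if $f \in L^2([0,1]^n)$ and $g \in L^2([0,1]^m)$ are symmetric, then it holds that
$$I_n(f)I_m(g) = \sum_{\ell=0}^{m\wedge n} \ell!\binom{m}{\ell}\binom{n}{\ell} I_{m+n-2\ell}(f\otimes_\ell g), \qquad (7)$$
where the contraction $f\otimes_\ell g$ belongs to $L^2([0,1]^{m+n-2\ell})$ for $\ell = 0, 1, \ldots, m\wedge n$ and is given by
$$(f\otimes_\ell g)(s_1,\ldots,s_{n-\ell},t_1,\ldots,t_{m-\ell}) = \int_{[0,1]^\ell} f(s_1,\ldots,s_{n-\ell},u_1,\ldots,u_\ell)\,g(t_1,\ldots,t_{m-\ell},u_1,\ldots,u_\ell)\,du_1\cdots du_\ell.$$
Note that the contraction $f\otimes_\ell g$ is not necessarily symmetric. We will denote by $f\tilde\otimes_\ell g$ its symmetrization.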
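As a worked instance of the product formula, consider the case $n = m = 2$ with $g = f$ symmetric, which is the case relevant to squares of second-chaos random variables such as the filtered Rosenblatt process. The display below is a direct specialization, assuming the standard form of the product formula with coefficients $\ell!\binom{m}{\ell}\binom{n}{\ell}$:

```latex
\begin{align*}
I_2(f)^2
  &= \sum_{\ell=0}^{2} \ell!\binom{2}{\ell}^{2} I_{4-2\ell}(f\otimes_\ell f) \\
  &= I_4(f\otimes f) + 4\,I_2(f\otimes_1 f) + 2\,\|f\|^2_{L^2([0,1]^2)},
\qquad
(f\otimes_1 f)(s,t) = \int_0^1 f(s,u)\,f(t,u)\,du .
\end{align*}
```

The three terms lie in the fourth chaos, the second chaos, and the constants, respectively. Taking expectations gives $E[I_2(f)^2] = 2\|f\|^2_{L^2([0,1]^2)}$, in agreement with the isometry property, and the centered square $I_2(f)^2 - E[I_2(f)^2]$ is a sum of a fourth-chaos and a second-chaos term.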
Our analysis will be based on the following result, due to Nualart and Peccati (see Theorem 1 in [22]).
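Proposition 1 Let $n$ be a fixed integer. Let $I_n(f_N)$ be a sequence of square integrable random variables in the $n$th Wiener chaos, with symmetric kernels $f_N$, such that $\lim_{N\to\infty} E\left[I_n(f_N)^2\right] = 1$. Then the following are equivalent:
(i) As $N \to \infty$, the sequence $I_n(f_N)$ converges in distribution to a standard Gaussian random variable.
(ii) For every $\tau = 1, \ldots, n-1$, $\lim_{N\to\infty}\left\|f_N\otimes_\tau f_N\right\|_{L^2([0,1]^{2n-2\tau})} = 0$.

2.2 Rosenblatt process and filters: definitions, notation, and chaos representation

The Rosenblatt process is the (non-Gaussian) Hermite process of order 2 with Hurst index $H \in (\frac{1}{2}, 1)$. It is self-similar with stationary increments, lives in the second Wiener chaos and can be represented as a double Wiener-Itô integral of the form
$$Z(t) := \int_0^t\int_0^t L_t(y_1,y_2)\,dW_{y_1}dW_{y_2}.$$
Here $\{W_t, t \in [0,1]\}$ is a standard Brownian motion and $L_t(y_1,y_2)$ is the kernel of the Rosenblatt process
$$L_t(y_1,y_2) := d(H)\,1_{[0,t]}(y_1)\,1_{[0,t]}(y_2)\int_{y_1\vee y_2}^{t}\frac{\partial K^{H'}}{\partial u}(u,y_1)\,\frac{\partial K^{H'}}{\partial u}(u,y_2)\,du,$$
where $H' := \frac{H+1}{2}$ and $d(H) := \frac{1}{H+1}\left(\frac{H}{2(2H-1)}\right)^{-1/2}$, and $K^H$ is the standard kernel of fBm, defined for $s < t$ and $H \in (\frac{1}{2}, 1)$ by
$$K^H(t,s) := c_H\,s^{\frac{1}{2}-H}\int_s^t (u-s)^{H-\frac{3}{2}}\,u^{H-\frac{1}{2}}\,du,$$
where $c_H = \left(\frac{H(2H-1)}{\beta(2-2H,\,H-\frac{1}{2})}\right)^{1/2}$ and $\beta(\cdot,\cdot)$ is the beta function. For $t > s$, we have the following expression for the derivative of $K^H$ with respect to its first variable:
$$\frac{\partial K^H}{\partial t}(t,s) = c_H\left(\frac{s}{t}\right)^{\frac{1}{2}-H}(t-s)^{H-\frac{3}{2}}.$$
The term Rosenblatt random variable denotes any random variable which has the same distribution as $Z(1)$. Note that this distribution depends on $H$.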
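If we associate such a filter $\alpha = (\alpha_0, \ldots, \alpha_\ell)$ with the Rosenblatt process we get the filtered process $V^\alpha$ according to the following scheme:
$$V^\alpha\left(\frac{i}{N}\right) := \sum_{q=0}^{\ell}\alpha_q\,Z\left(\frac{i-q}{N}\right), \qquad i = \ell, \ldots, N-1.$$
Some examples are the following:
1. For $\alpha = \{1, -1\}$, $V^\alpha(i/N) = Z(i/N) - Z((i-1)/N)$. This is a filter of length 1 and order 1.
2. For $\alpha = \{1, -2, 1\}$, $V^\alpha(i/N) = Z(i/N) - 2Z((i-1)/N) + Z((i-2)/N)$. This is a filter of length 2 and order 2.
3. More generally, longer filters produced by finite-differencing are such that the coefficients of the filter $\alpha$ are the binomial coefficients with alternating signs. Therefore, borrowing the notation $\nabla$ from time series analysis, $\nabla Z(i/N) = Z(i/N) - Z((i-1)/N)$, we define $\nabla^j = \nabla\nabla^{j-1}$ and we may write the $j$th-order finite-difference-filtered process as $V^{\alpha_j}(i/N) := (\nabla^j Z)(i/N)$.
From now on we assume the filter order is strictly greater than 1 ($p \geq 2$).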
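The finite-difference filters and the filtering scheme can be sketched in a few lines. The sketch assumes the usual definition of the order $p$ of a filter, namely $\sum_q \alpha_q q^r = 0$ for $r < p$ and $\sum_q \alpha_q q^p \neq 0$; this definition and all function names below are assumptions made for illustration. The sign convention follows $\nabla^j$, so the leading coefficient is $+1$; an overall sign change does not affect quadratic variations.

```python
from math import comb

def finite_difference_filter(j):
    """Coefficients of the j-th order backward difference: (-1)^q * C(j, q)."""
    return [(-1) ** q * comb(j, q) for q in range(j + 1)]

def filter_order(alpha):
    """Smallest r with sum_q alpha_q * q^r != 0 (assumed definition of the order).

    Exact only for integer coefficients; float coefficients would need a tolerance.
    """
    for r in range(len(alpha)):
        if sum(a * q ** r for q, a in enumerate(alpha)) != 0:
            return r
    raise ValueError("an all-zero filter has no finite order")

def apply_filter(alpha, samples):
    """Filtered values sum_q alpha_q * samples[i - q] for i = l, ..., len(samples) - 1."""
    l = len(alpha) - 1
    return [sum(a * samples[i - q] for q, a in enumerate(alpha))
            for i in range(l, len(samples))]

second = finite_difference_filter(2)
print(second)                                  # coefficients of the second difference
print(filter_order(finite_difference_filter(3)))
print(apply_filter(second, [0, 1, 4, 9, 16]))  # second differences of i^2
```

For finite-difference filters the length and the order coincide, which the `filter_order` check reproduces for small `j`.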
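For such a filter $\alpha$ the quadratic variation statistic is defined as
$$V_N := \frac{1}{N-\ell}\sum_{i=\ell}^{N-1}\left[\frac{\left|V^\alpha(i/N)\right|^2}{E\left|V^\alpha(i/N)\right|^2} - 1\right].$$
Using the definition of the filter, we can compute the covariance of the filtered process $V^\alpha(i/N)$:
$$E\left[V^\alpha\left(\frac{i}{N}\right)^2\right] = \sum_{q,r=0}^{\ell}\alpha_q\alpha_r\,E\left[Z\left(\frac{i-q}{N}\right)Z\left(\frac{i-r}{N}\right)\right] = -\frac{N^{-2H}}{2}\sum_{q,r=0}^{\ell}\alpha_q\alpha_r|q-r|^{2H}. \qquad (12)$$
Therefore, we can rewrite the variation statistic as follows:
$$V_N = \frac{2N^{2H}}{c(H)}\,\frac{1}{N-\ell}\sum_{i=\ell}^{N-1}\left|V^\alpha\left(\frac{i}{N}\right)\right|^2 - 1, \qquad c(H) := -\sum_{q,r=0}^{\ell}\alpha_q\alpha_r|q-r|^{2H}. \qquad (13)$$
The next lemma is informative, and will be useful in the sequel.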
Proof. For H < 1, we may rewrite c (H) by using the representation of the function |q − r| 2H via fBm B H , as its canonical metric given in (3), and its covariance function R H given in (1). Indeed we have where in the second-to-last line we used the filter property which implies ℓ q=0 α q = 0, and the last inequality follows from the fact that ℓ q=0 α q B H (q) is Gaussian and non-constant. When H = 1, the same argument as above holds because the Gaussian process X such that The assertion that c(0) = 0 comes from the filter property.
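The constant $c(H)$ is easy to evaluate numerically. The sketch below assumes $c(H) = -\sum_{q,r=0}^{\ell}\alpha_q\alpha_r|q-r|^{2H}$, the form suggested by the expression for the expected squared filtered process, with the convention $0^0 = 1$ at $H = 0$; the function name `c` mirrors the text, and the filters used are the first and second differences. For an fBm, $c(H)/2$ is the variance of $\sum_q \alpha_q B^H(q)$, which is the quantity used in the proof above.

```python
def c(alpha, H):
    """c(H) = - sum_{q,r} alpha_q * alpha_r * |q - r|^(2H), with 0^0 read as 1."""
    total = 0.0
    for q, a_q in enumerate(alpha):
        for r, a_r in enumerate(alpha):
            total += a_q * a_r * abs(q - r) ** (2 * H)
    return -total

first_difference = [1, -1]
second_difference = [1, -2, 1]

for H in (0.55, 0.7, 0.95):
    # For the second difference the double sum collapses to 8 - 2 * 4^H.
    print(H, c(first_difference, H), c(second_difference, H), 8 - 2 * 4 ** H)

print(c(second_difference, 0))  # vanishes because the coefficients sum to zero
```

For the second-difference filter the closed form $8 - 2\cdot 4^{H}$ makes the positivity on $(0,1)$ and the value $c(0) = 0$ visible directly.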
Observe that we can write the filtered process as an integral belonging to the second Wiener chaos Using the product formula (7) for multiple stochastic integrals now results in the Wiener chaos expansion of V N .
Proposition 2 With C i as in (14), the variation statistic V N is given by where T 4 is a term belonging to the 4th Wiener chaos and T 2 a term living in the 2nd Wiener chaos.
In order to prove that a variation statistic has a normal limit we may use the characterization of N (0, 1) by Nualart and Ortiz-Latorre (Proposition 1). Thus, we need to start by calculating E |V N | 2 so that we can then scale appropriately, in an attempt to apply the said proposition.
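3 Scale constants for $T_2$ and $T_4$

In order to determine the convergence of $V_N$, using the orthogonality of the integrals belonging to different chaoses, we will study each term separately. This section begins by calculating the second moments of $T_2$ and $T_4$.
In this section we use an alternative expression for the filtered process. More specifically, recalling that $C_i = \sum_{q=0}^{\ell}\alpha_q L_{(i-q)/N}$ is the kernel of $V^\alpha(i/N)$ and denoting $b_q := \sum_{r=0}^{q}\alpha_r$, a summation by parts lets us rewrite $C_i$ as follows, for any $i = \ell, \ldots, N-1$:
$$C_i = \sum_{q=0}^{\ell-1} b_q\left(L_{\frac{i-q}{N}} - L_{\frac{i-q-1}{N}}\right). \qquad (15)$$
Recall that the filter properties imply $\sum_{q=0}^{\ell}\alpha_q = 0$, that is $b_\ell = 0$ and $\alpha_\ell = -\sum_{q=0}^{\ell-1}\alpha_q$.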

Term T 2
By Proposition 2, we can express E(T 2 2 ) as: where This proposition is proved in the Appendix.

Term T 4
In this paragraph we estimate the second moment of $T_4$, the fourth chaos term appearing in the decomposition of the variation $V_N$. Here the function $\sum_{i=\ell}^{N-1}(C_i\otimes C_i)$ is no longer symmetric and we need to symmetrize this kernel to calculate $T_4$'s second moment. In other words, by Proposition 2, the second moment of $T_4$ is expressed through the kernels $C_i\tilde\otimes C_i$, where $C_i\tilde\otimes C_i$ denotes the symmetrization of $C_i\otimes C_i$. Thus, we can use the following combinatorial formula: if $f$ and $g$ are two symmetric functions in $L^2([0,1]^2)$, then
$$4!\,\langle f\tilde\otimes f, g\tilde\otimes g\rangle_{L^2([0,1]^4)} = 8\,\langle f, g\rangle^2_{L^2([0,1]^2)} + 16\,\langle f\otimes_1 g, g\otimes_1 f\rangle_{L^2([0,1]^2)}.$$
It implies a decomposition of $E[T_4^2]$ into two terms, associated with $T_{4,(1)}$ and $T_{4,(2)}$. The proof of the next proposition, in the Appendix, shows that these two terms have the same order of magnitude, with only the normalizing constant being different.

Proposition 4 Recall the constant $c(H)$ defined in (13). With the constant $c_{1,H}$ given by a series-integral representation, we have the following asymptotic variance for $\sqrt{N}\,T_4$:
$$\lim_{N\to\infty} E\left[\left(\sqrt{N}\,T_4\right)^2\right] = c_{1,H}.$$
This proposition is proved in the Appendix. Observe that in the Wiener chaos decomposition of $V_N$ the leading term is the term in the second Wiener chaos (i.e. $T_2$) since it is of order $N^{H-1}$, while $T_4$ is of the smaller order $N^{-1/2}$. We note that, in contrast to the case of filters of length 1 and power 1, the barrier $H = 3/4$ does not appear anymore in the estimation of the magnitude of $T_4$. Thus, the asymptotic behavior of $V_N$ is determined by the behavior of $T_2$. In other words, the previous three propositions imply the following.
where c 2,H is defined in (16).
From the practical point of view, one only needs to compute the constant $c_{2,H}$ to find the first order asymptotics of $V_N$. This constant is easily computed exactly from its formula (16), unlike the constant $c_{1,H}$ in Proposition 4, which can only be approximated via its unwieldy series-integral representation given therein.

Normality of the term T 4
We study in this section the limit of the renormalized term T 4 which lives in the fourth Wiener chaos and appears in the expression of the variation V N . Of course, due to Theorem 1 above, this term does not affect the first order behavior of V N but it is interesting from the mathematical point of view because its limit is similar to those of the variation based on the fractional Brownian motion ( [29]). In addition, in Section 6, we will show that the asymptotics of T 4 , and indeed the value of c 1,H , are not purely academic. They are needed in order to calculate the asymptotic variance of the adjusted variations, those which have a normal limit when H ∈ (1/2, 2/3).
Define the quantity
$$G_N := \frac{\sqrt{N}}{\sqrt{c_{1,H}}}\,T_4. \qquad (18)$$
From the calculations above we proved that $\lim_{N\to\infty} E(G_N^2) = 1$. Using the Nualart-Peccati criterion in Proposition 1, we can now prove that $G_N$ is asymptotically standard normal.
Theorem 2 The sequence $G_N$ defined in (18) converges in distribution to the standard normal.
Setup of proof of Theorem 2. To prove this theorem, by Proposition 4 and Proposition 1, it is sufficient to show that for all τ = 1, 2, 3, For τ = 1, 2, 3, this quantity can be written as The Appendix can now be consulted for proof that for each τ = 1, 2, 3 this quantity converges to 0, establishing the theorem.

Anormality of the T 2 term and Asymptotic Distribution of the 2-Variation
For the asymptotic distribution of the variation statistic we have the following proposition.
Setup of proof of Theorem 3. The strategy for proving this theorem is simple. First of all, Proposition 4 implies immediately that $N^{1-H}T_4$ converges to zero in $L^2(\Omega)$. Thus if we can show the theorem's statement about $T_2$, the statement about $V_N$ will follow immediately from Proposition 2.
Next, to show N 1−H √ c 2,H T 2 converges to the random variable Z (1) in L 2 (Ω), recall that T 2 is a second-chaos random variable of the form I 2 (f N ), where f N (y 1 , y 2 ) is a symmetric function in L 2 ([0, 1] 2 ), and that this double Wiener-Itô integral is with respect to the Brownian motion W used to define Z (1), i.e. that Z (1) = I 2 (L 1 ) where L 1 is the kernel of the Rosenblatt process at time 1, as defined in (9). Therefore, by the isometry property of Wiener-Itô integrals (see (6)), it is necessary and sufficient to show that This is proved in the Appendix.

Normality of the adjusted variations
In the previous section we proved that the distribution of the variation statistic $V_N$ is never normal, irrespective of the order of the filter. However, in the decomposition of $V_N$, there is a normal part, $T_4$, which implies that if we subtract $T_2$ from $V_N$ the remaining part will converge to a normal law. But $T_2$ is not observed in practice. Following the idea of the adjusted variations in [28], instead of $T_2$ we subtract $Z(1)$, which is observed. $Z(1)$ is the value of the Rosenblatt process at time 1. Thus, we study the convergence of the adjusted variation: In Section 4 we showed that $\frac{\sqrt{N}}{\sqrt{c_{1,H}}}T_4$ converges to a normal law. For the quantity $U_2$ we prove the following proposition.
Proposition 5 For $H \in \left(\frac{1}{2}, \frac{2}{3}\right)$, $\sqrt{N}\,U_2$ converges in distribution to a normal law with mean zero and variance given by where $c_{2,H}$ is defined as in (16) and $F$ is defined as follows
Proof. The proof follows the proof of [28, Proposition 5] and is omitted here.
Proof. The proof follows the steps of the proof of [28, Theorem 6] and is omitted.
7 Estimators for the self-similarity index

We construct estimators for the self-similarity index of a Rosenblatt process $Z$ based on the discrete observations at times $0, \frac{1}{N}, \frac{2}{N}, \ldots, 1$. Their strong consistency and asymptotic distribution will be consequences of the theorems above.

Setup of the estimation problem
Consider the quadratic variation statistic for a filter $\alpha$ of order $p$ based on the observations of our Rosenblatt process $Z$:
$$S_N := \frac{1}{N-\ell}\sum_{i=\ell}^{N-1}\left(\sum_{q=0}^{\ell}\alpha_q\,Z\left(\frac{i-q}{N}\right)\right)^2.$$
We have already established that $E[S_N] = -\frac{N^{-2H}}{2}\sum_{q,r=0}^{\ell}\alpha_q\alpha_r|q-r|^{2H}$ (see expression (12)). By considering that $E[S_N]$ can be estimated by the empirical value $S_N$, we can construct an estimator $\hat H_N$ for $H$ by solving the following equation:
$$S_N = -\frac{N^{-2\hat H_N}}{2}\sum_{q,r=0}^{\ell}\alpha_q\alpha_r|q-r|^{2\hat H_N}.$$
In this case, unlike the case of a filter of length 1 which was studied in [28], we cannot compute an analytical expression for the estimator. Nonetheless, the estimator $\hat H_N$ can be easily computed numerically by solving the following non-linear equation for fixed $N$, with unknown $x \in [1/2, 1]$:
$$-\frac{N^{-2x}}{2}\sum_{q,r=0}^{\ell}\alpha_q\alpha_r|q-r|^{2x} = S_N. \qquad (21)$$
This equation is not entirely trivial, in the sense that one must determine whether it has a solution in $[1/2, 1]$, and whether this solution is unique. As it turns out, the answer to both questions is affirmative for large $N$, as seen in the next Proposition, proved further below.

Definition 2
We define the estimatorĤ N of H to be the unique solution of (21).
Note that Equation (21) can be rewritten as S N = c(x)N −2x /2 where the function c was defined in (13). The proposition is established via the following lemma. Proof. Firstly, we show that V N converges to zero almost surely as N → ∞. We already know that this is true in L 2 (Ω). Consider the following If we choose β < 1 − H and q large enough so that (1 − H − β)q > 1. This implies that Therefore, the Borel-Cantelli lemma implies |V N | → 0 a.s., with speed of convergence equal to The almost-sure convergence of V N to 0 yields the statement of the lemma.
To guarantee existence of a solution, we use Lemma 2. This lemma implies the existence of a sequence $\varepsilon_N$ such that $2N^{2H}S_N = c(H) + \varepsilon_N$ and $\lim_{N\to\infty}\varepsilon_N = 0$ almost surely. Since in addition $c$ is continuous, almost surely we can choose $N$ large enough so that $2N^{2H}S_N$ is in the image of $[\frac{1}{2}, 1]$ by the function $c$. Thus the equation $c(x) = 2N^{2H}S_N$ has at least one solution in $[\frac{1}{2}, 1]$. Since this equation is equivalent to (21), the proof of the proposition is complete.
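The numerical solution of the estimating equation, in the rewritten form $S_N = c(x)N^{-2x}/2$, can be sketched as follows. Bisection on $[1/2, 1]$ is a choice made here for illustration, not necessarily the root-finding method used by the authors; it relies on the existence and uniqueness of the root discussed above. The function names are hypothetical, and the value of $S_N$ in the example is synthetic: it is set to its expectation for a chosen $H_0$ rather than computed from data.

```python
def c(alpha, x):
    """c(x) = - sum_{q,r} alpha_q * alpha_r * |q - r|^(2x)."""
    return -sum(a_q * a_r * abs(q - r) ** (2 * x)
                for q, a_q in enumerate(alpha)
                for r, a_r in enumerate(alpha))

def estimate_hurst(alpha, s_n, n, tol=1e-12):
    """Solve c(x) * n^(-2x) / 2 = s_n for x in [1/2, 1] by bisection."""
    def g(x):
        return c(alpha, x) * n ** (-2.0 * x) / 2.0 - s_n

    lo, hi = 0.5, 1.0
    if (g(lo) > 0) == (g(hi) > 0):
        raise ValueError("no sign change on [1/2, 1]; the equation may have no root")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (g(mid) > 0) == (g(lo) > 0):
            lo = mid   # the root lies in the upper half
        else:
            hi = mid   # the root lies in the lower half
    return 0.5 * (lo + hi)

alpha = [1, -2, 1]   # second-difference filter
N = 10_000
H0 = 0.7
synthetic_s_n = c(alpha, H0) * N ** (-2 * H0) / 2   # E[S_N] when H = H0
h_hat = estimate_hurst(alpha, synthetic_s_n, N)
print(h_hat)
```

With real observations, `synthetic_s_n` would be replaced by the empirical average of the squared filtered increments. The sign-change check mirrors the existence argument above: for a given sample the equation need not have a root in $[1/2, 1]$, although it does almost surely for $N$ large enough.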

Properties of the estimator
Now, it remains to prove that any suchĤ N is consistent and to determine its asymptotic distribution. We may now write Since in addition lim N →∞ log (1 + V N ) = 0 a.s., we get that almost surely, This implies the first statement of the proposition. The second statement, which is more precise, is now obtained as follows. SinceĤ N → H almost surely, and c is continuous, log c(Ĥ N )/c (H) converges to 0. The second statement now follows immediately.
The asymptotic distribution of the estimatorĤ N is stated in the next result. Its proof uses Theorem 3 and Theorem 1, plus the expression (23). While novel and interesting, this proof is more technical than the proofs of the proposition and theorem above, and is therefore relegated to the Appendix.

Theorem 6 For any
As can be seen from Theorem 3 and Theorem 6, the renormalization of the statistic $V_N$, as well as the renormalization of the difference $\hat H_N - H$, depend on $H$: it is of order $N^{1-H}$. The quantities $N^{1-H}V_N$ and $N^{1-H}\hat H_N$ cannot be computed numerically from the empirical data, thereby compromising the use of the asymptotic distributions for statistical purposes such as model validation. Therefore one would like to have other quantities with known asymptotic distribution which can be calculated using only the data. The next theorem addresses this issue by showing that one can replace $H$ by $\hat H_N$ in the term $N^{1-H}$, and still obtain a convergence as in Theorem 6, this time in $L^1(\Omega)$. Its proof is in the Appendix.

Numerical Computation of the Asymptotic Variance
In practice certain issues may occur when we compute the asymptotic variance. The most crucial question is what order of filter we should choose. Indeed, from (16) withĤ N instead of H, it follows that the constants of the variance not only depend on the filter length/order (ℓ, p), but also on the number of observations (N ). We measure the "accuracy" of the estimator H N by its standard error which is the following quantity: There are several types of filters that we can use. In this paper, we choose to work with finite-difference and wavelet-type filters.
• The finite-difference filters are produced by finite-differencing the process. In this case the filter length is the same as the order of the filter. The coefficients of the order-$\ell$ finite-difference filter are given by $\alpha_k = (-1)^{k+1}\binom{\ell}{k}$, $k = 0, \ldots, \ell$.
• The wavelet filters we are using are the Daubechies filters with $k$ vanishing moments. (By vanishing moments we mean that all moments of the wavelet filter are zero up to a given power.) The Daubechies wavelets form a family of orthonormal wavelets with compact support and the maximum number of vanishing moments. In this scenario, the number of vanishing moments determines the order of the filter and the filter length is twice the order. For more details, the reader can refer to [17].
We computed the standard error for $N = 10{,}000$ observations, filters of order varying from 2 to 20 and Hurst parameters varying from 0.55 to 0.95. This means that the corresponding lengths of the finite-difference filters were 2 to 20 and for the wavelets 4 to 40. The code we use to simulate the Rosenblatt process is based on a Donsker-type limit theorem and was provided to us by J.M. Bardet [1]. The results are illustrated in Figures 1, 2, and 3; these are graphs of the asymptotic standard error for various fixed values of $H$ as the order of the filters increases. We observe that the standard error decreases with the order of the filter. Furthermore, we observe that the wavelet filters are more effective than the finite-difference ones, since they produce a larger decrease of the standard error as the filter order increases. Specifically, the graph in Fig. 1, with the finite-difference filters, shows that for fixed $H$, there is no advantage to using a filter beyond a certain order $p$, since the standard error tends to a constant as $p \to \infty$. This does not occur for the wavelet filters, where the standard error continues to decrease as $p \to \infty$ in all cases, as seen in the graph in Fig. 2. On the other hand, the finite-difference filters have lower errors than the wavelet filters for low filter lengths; only after a certain order $p^*$ do the latter become more effective; this comparison is seen in the graph in Fig. 3, where $p^*$ is roughly 9.
In addition, since the order of convergence depends on the true value of the Hurst parameter H, we investigated the behavior of the error with respect to H. It seems that the higher H is, the more we lose in terms of accuracy; this is visible in all three graphs.
In general, the choice of a longer filter might lead to a smaller error, but at the same time it increases the computational time needed to compute $\hat H_N$ and its standard error. In future work, we will study this trade-off and other consequences of using longer filters extensively.

9 Appendix: proofs.

Proof of Proposition 3
We start by computing the contraction term C i ⊗ 1 C i : Now, the inner product computes as We make the following change of variables and the second moment of T 2 becomes . We study first the diagonal terms of the above double sum We conclude that Observe that the term (24) can be calculated as follows: We may now use a Riemann sum approximation and the fact that 4H ′ − 4 = 2H − 2 > −1. Since ℓ is fixed and q 1 and q 2 are less than ℓ, we get that the term in (24) is asymptotically equivalent to We conclude that Using the fact that We have The scalar product computes as Using the fact that α(H) 2 d(H) 2 H(2H−1) = 1 2 and (25) the scalar product becomes The last equality is true since ℓ q=0 α q = 0 by the filter definition. Therefore, we have At this point we need the next lemma to estimate the behavior of the above quantity. This lemma is the key point which implies the fact that the longer variation statistics has, in the case when the observed process is the fractional Brownian motion, a Gaussian limit without any restriction on H (see [12]).
• Lemma 3 For all H ∈ (0, 1), we have that Proof. Proof of (i). Let f (x) = ℓ q,r=0 α q α r (1 + (q − r)x) 2H , so the summand can be written as ℓ q,r=0 Using a Taylor expansion at x 0 = 0 for the function f (x) we get that For small x we observe that the function f (x) is asymptotically equivalent to where p is the order of the filter. Therefore, the general term of the series is equivalent to Therefore for all H < p − 1 4 the series converges to a constant depending only on H. Due to our choice for the order of the filter p ≥ 2, we obtain the desired result.
Proof of (ii). Similarly as before, we can write the general term of the series as k ℓ q,r=0 Therefore for all H < p the series converges to a constant depending only on H.
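The Taylor step in the proof of (i) can be made explicit as follows. This is a sketch under the standard assumption (not restated in this appendix) that a filter of order $p$ satisfies $\sum_{q=0}^{\ell} q^j \alpha_q = 0$ for $0 \le j < p$ and $\sum_{q=0}^{\ell} q^p \alpha_q \neq 0$; the notation $\binom{2H}{k}$ denotes the generalized binomial coefficient $2H(2H-1)\cdots(2H-k+1)/k!$.

\[
f(x) = \sum_{k \ge 0} \binom{2H}{k} x^k \sum_{q,r=0}^{\ell} \alpha_q \alpha_r (q-r)^k, \qquad |x| < 1/\ell,
\]
\[
\sum_{q,r=0}^{\ell} \alpha_q \alpha_r (q-r)^k = \sum_{j=0}^{k} \binom{k}{j} (-1)^{k-j} \Big( \sum_{q=0}^{\ell} q^j \alpha_q \Big) \Big( \sum_{r=0}^{\ell} r^{k-j} \alpha_r \Big).
\]

Each product on the right vanishes unless both $j \ge p$ and $k - j \ge p$, so the first non-zero coefficient occurs at $k = 2p$ with $j = p$, which would give

\[
f(x) = (-1)^p \binom{2H}{2p} \binom{2p}{p} \Big( \sum_{q=0}^{\ell} q^p \alpha_q \Big)^2 x^{2p} + O(x^{2p+1}) \quad \text{as } x \to 0.
\]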
Combining all the above we have Since the leading term is of order $N^{-1}$ we have that If we define the correlation function of the filtered process as we can express the asymptotic variance $\lim_{N \to \infty} N\, E[T_{4,(1)}^2]$ in terms of a series involving $\rho^{\alpha}_H(k)$.
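The role of the filter order in Lemma 3 can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the second-difference filter (1, -2, 1), the value H = 0.75, and in particular the form of the filtered covariance, taken here to be $-\frac{1}{2} \sum_{q,r} \alpha_q \alpha_r |k + q - r|^{2H}$ as for any process sharing the covariance of fBm, are choices made for this example, and the normalization of $\rho^{\alpha}_H$ used in the text may differ. The function names are hypothetical.

```python
import math

def f(x, alpha, H):
    """f(x) = sum_{q,r} alpha_q alpha_r (1 + (q - r) x)^(2H), as in the proof of Lemma 3."""
    return sum(aq * ar * (1.0 + (q - r) * x) ** (2 * H)
               for q, aq in enumerate(alpha)
               for r, ar in enumerate(alpha))

def filter_order(alpha, max_order=10, tol=1e-12):
    """Smallest p such that sum_q q^p alpha_q is non-zero (all lower moments vanish)."""
    for p in range(max_order + 1):
        if abs(sum(q ** p * a for q, a in enumerate(alpha))) > tol:
            return p
    raise ValueError("no non-zero moment found up to max_order")

def filtered_cov(k, alpha, H):
    """ASSUMED unnormalized lag-k covariance of the filtered process."""
    return -0.5 * sum(aq * ar * abs(k + q - r) ** (2 * H)
                      for q, aq in enumerate(alpha)
                      for r, ar in enumerate(alpha))

def filtered_corr(k, alpha, H):
    """ASSUMED correlation: lag-k covariance divided by the lag-0 covariance."""
    return filtered_cov(k, alpha, H) / filtered_cov(0, alpha, H)

alpha = (1.0, -2.0, 1.0)   # illustrative second-difference filter
H = 0.75                   # illustrative Hurst parameter
p = filter_order(alpha)

# f(x) / x^(2p) should stabilize as x -> 0 if f(x) ~ C x^(2p).
ratios = [f(x, alpha, H) / x ** (2 * p) for x in (0.04, 0.02, 0.01)]

# Log-log slope of |rho(k)| between k = 50 and k = 100; expected near 2H - 2p.
slope = math.log(abs(filtered_corr(100, alpha, H)) /
                 abs(filtered_corr(50, alpha, H))) / math.log(2.0)

# Partial sums of a series of the type sum_k rho(k)^2 should stabilize quickly.
partial_200 = sum(filtered_corr(k, alpha, H) ** 2 for k in range(1, 201))
partial_400 = sum(filtered_corr(k, alpha, H) ** 2 for k in range(1, 401))

print("filter order p:", p)
print("f(x)/x^(2p):", ratios)
print("decay exponent:", slope, "vs 2H - 2p =", 2 * H - 2 * p)
print("partial sums:", partial_200, partial_400)
```

With these choices the ratio $f(x)/x^{2p}$ settles near a constant and the correlation decays roughly like $k^{2H-2p}$, which is the mechanism that makes series of squared correlations summable once $p \geq 2$.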

The term $E[T_{4,(2)}^2]$
In order to handle this term we use the alternate expression (15) of $C_i$. Therefore, following calculations similar to those in the $T_2$ case, we get that We study the convergence of the above series as $N \to \infty$ Therefore the general term of the series is asymptotically equivalent to which converges for all $H \in (\frac{1}{2}, 1)$. We treat the second series (II) in the same way and we get that it is asymptotically equivalent to $\mathrm{cst.}\, k^{4H-4-8p}$. Combining all the above we have The leading term in $E[T_{4,(2)}^2]$ is of order $N^{-1}$ and the constant computes as Therefore, combining the two terms we get the statement of the proposition.
Thus, we have As in the computations for $T_{4,(2)}$ we can show that the above series converges and thus $J_1 = O(N^{-2})$, which implies that for all $H \in (\frac{1}{2}, 1)$, $\lim_{N \to \infty} J_1 = 0$.
• Term for $\tau = 2$ The series converges for all $H \in (1/2, 1)$, so the whole term is of order $O(N^{-2})$, which means that it goes to zero as $N \to \infty$.
With similar computations as in the case of $T_4$ we conclude that $J_3 = O(N^{-2})$.

Proof of Theorem 3
According to our previous computations we can write Let us show first that we can reduce this function to the interval $y_1 \in [0, \frac{i-q}{N}]$ and $y_2 \in [0, \frac{i-r}{N}]$. We will show that if $y_1 \in I_{iq}$, $y_2 \in [0, \frac{i-r}{N}]$ (and similarly for the situations $y_1 \in [0, \frac{i-q}{N}]$, $y_2 \in I_{ir}$ and $y_1 \in I_{iq}$, $y_2 \in I_{ir}$) the corresponding terms go to zero as $N \to \infty$. We have, due to the fact that the intervals $I_{iq}$ are disjoint, which tends to zero because $2H > 1$. This proves the following asymptotic equivalence in $L^2([0, 1]^2)$: We will show that the above term, normalized by $N^{1-H}$ and $\sqrt{c_{2,H}}$, converges pointwise for $y_1, y_2 \in [0, 1]$ to the kernel of the Rosenblatt random variable.
On the interval $I_{iq} \times I_{ir}$ we may attempt to replace the evaluation of $\partial_1 K^{H'}$ at $u$ and $v$ by setting $u = (i - q)/N$ and $v = (i - r)/N$. More precisely, we can write and all the summands above can be treated in the same manner. For the first one, using the definition of the derivative of $K^{H'}$ with respect to the first variable, we get for any As a consequence of the above estimates, The quantity 1 This means we have proved the following pointwise asymptotic equivalence for $f_N(y_1, y_2)$: Recall that Thus we get Further, we can ignore the terms $q/N$ and $r/N$ in comparison with $i/N$ in the last line above, and thus invoke a Riemann sum approximation, which proves that, for every $(y_1, y_2) \in (0, 1)^2$ To finish the proof it suffices to check that $N^{1-H} f_N$ is a Cauchy sequence in $L^2([0, 1]^2)$. Up to a constant depending on $H$ we have that for all $M$, $N$, The first two terms have already been studied and will converge to the same constant as $M, N \to \infty$. Concerning the inner product, by making the usual change of variable we have For large $i, j$ we can ignore the terms $\frac{u}{N}$, $\frac{u'}{N}$, $\frac{q_1}{N}$, etc., compared to $\frac{i}{N}$ and $\frac{j}{N}$. Therefore, the above quantity is a Riemann sum that converges to the same constant as the squared terms, as $M, N \to \infty$. This finishes the proof of the theorem.

Proof of Theorem 6
We wish to show that, as $N \to \infty$, A minor technical difficulty occurs when $V_N$ is not small. We deal with this as follows. We decompose the above expectation $E$ according to whether or not $|V_N| \le 1/2$: we have $E = E_1 + E_2$ where Dealing with this term first, Schwarz's and Minkowski's inequalities yield Since $\hat{H}_N$ is bounded, the sum of the two rooted expectation terms above is bounded above by a constant multiple of $N^{2-2H}$. Therefore to deal with $E_1$, one only needs to show that $P[|V_N| > 1/2] \ll N^{-4+4H}$. It is well known that any random variable $X$ which can be written as a finite sum of Wiener chaos terms up to order $q$ satisfies, for any integer $n$, $E[X^{2n}] \le K_{n,q} (E[X^2])^n$, where $K_{n,q}$ depends only on $n$ and $q$. This can be proved iteratively by using formula (7), for instance. Therefore, since $V_N$ is a sum of terms in the second and fourth chaos ($q = 4$), by Chebyshev's inequality, and using Theorem 1, with $N$ large enough, $P[|V_N| > 1/2] \le 4^n E[|V_N|^{2n}] \le 4^n K_{n,4} (E[|V_N|^2])^n \le 8^n K_{n,4}\, c_{2,H}^n\, N^{2Hn-2n}$.
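It is thus sufficient to choose $n = 3$ to guarantee that $E_1 \to 0$: the bound on $P[|V_N| > 1/2]$ is then of order $N^{6H-6}$, which is indeed $\ll N^{-4+4H}$ since $H < 1$. We now only need to study $E_2$. We invoke the mean value theorem to express $(\hat{H}_N - H) \log N$ more explicitly. For any $x, y \in [1/2, 1]$, there exists $\xi$ between $x$ and $y$ such that $\log \frac{c(x)}{c(y)} = (x - y) (\log c)'(\xi)$.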
Here the function $(\log c)'$ is bounded on $[1/2, 1]$, because $c'$ is bounded and $c$ is bounded below. Therefore, denoting by $\xi_N \in [1/2, 1]$ the value corresponding to $x = H$ and $y = \hat{H}_N$, and using line (23) Since $(\log c)'(\xi_N)$ is bounded (by a non-random value), by choosing $N$ large enough, an upper bound for the last fraction above, in absolute value, is $2 V_N / \log N$. Therefore (using Minkowski's inequality), By Theorem 1, the term in line (28) is bounded above by $1/\log^2 N$, and thus converges to 0. For the term in line (27), because of the indicator $\mathbf{1}_{|V_N| \le 1/2}$, we use the fact that when $|x| \le 1/2$, we have $|x - \log(1 + x)| \le x^2$. Thus this line is bounded above by The term in line (29) converges to 0 by Theorem 3. Finally, by Theorem 1 again, and the earlier statement about higher powers of random variables with finite chaos expansions, the term in line (30) is of order $N^{2H-2}$, and therefore converges to 0 as well. This proves that $E_2$ converges to 0, finishing the proof of the theorem.
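The elementary inequality $|x - \log(1 + x)| \le x^2$ for $|x| \le 1/2$, used above for the term in line (27), can be checked numerically. The following sketch evaluates both sides on a grid; the grid size and the function name are illustrative choices, and a grid check is of course not a proof. It also shows that the bound fails at a point outside $[-1/2, 1/2]$, which is why the indicator of $\{|V_N| \le 1/2\}$ is needed.

```python
import math

def log_gap(x):
    """|x - log(1 + x)|, using log1p for accuracy near x = 0."""
    return abs(x - math.log1p(x))

# Grid of 1001 points covering [-1/2, 1/2]; it contains both endpoints and 0.
grid = [-0.5 + i / 1000 for i in range(1001)]

# The inequality on the whole grid (tiny slack for floating-point rounding).
holds = all(log_gap(x) <= x * x + 1e-15 for x in grid)

# Largest observed value of |x - log(1 + x)| / x^2 on the grid.
worst_ratio = max(log_gap(x) / (x * x) for x in grid if x != 0.0)

# Outside the interval the bound can fail, e.g. at x = -0.9.
fails_outside = log_gap(-0.9) > (-0.9) ** 2

print("inequality holds on grid:", holds)
print("worst ratio on grid:", worst_ratio)
print("bound fails at x = -0.9:", fails_outside)
```

On this grid the worst ratio is attained at the left endpoint $x = -1/2$ and stays below 1, so the constant 1 in the inequality has some room to spare.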

Proof of Theorem 7
It is sufficient to prove that We decompose the probability space depending on whether or not $\hat{H}_N$ is far from its mean. For a fixed value $\varepsilon > 0$ it is convenient to define the event $D = \{\hat{H}_N > \varepsilon + 2H - 1\}$.
We have Proof.
Term A: Introduce the notation $x = \max(1 - H, 1 - \hat{H}_N)$ and $y = \min(1 - H, 1 - \hat{H}_N)$. Finally, B goes to 0 as $N \to \infty$. This finishes the proof of the theorem.