Abstract
We study the problem of parameter estimation in indirect observability contexts, where \(X_t \in {\mathbb {R}}^r\) is an unobservable stationary process parametrized by a vector of unknown parameters and all observable data are generated by an approximating process \(Y^{\varepsilon }_t\) which is close to \(X_t\) in \(L^4\) norm. We construct consistent parameter estimators which are smooth functions of the sub-sampled empirical mean and empirical lagged covariance matrices computed from the observable data. We derive explicit optimal sub-sampling schemes specifying the best paired choices of sub-sampling time-step and number of observations. We show that these choices ensure that our parameter estimators reach optimized asymptotic \(L^2\)-convergence rates, which are constant multiples of the \(L^4\) norm \(\Vert Y^{\varepsilon }_t - X_t \Vert \).
Acknowledgments
I.T. and R.A. were supported in part by the NSF Grant DMS-1109582.
Appendix: \(L^2\)-consistency for Unobservable Estimators of Lagged 2nd Order Moments
In this appendix we present a detailed proof of Theorem 2, which establishes the \(L^2\)-consistency of the unobservable sub-sampled empirical estimators \(\bar{X}^\varepsilon \) and \(\hat{K}_X^\varepsilon (u)\) of means and lagged covariances. The hypotheses and notation are those of Theorem 2. Replacing \(X_t\) by the centered process \(X_t - \mu \) changes the proof only trivially, so we only need to consider the case where all \(X_t\) are centered and \(\mu = 0\).
Step 1: Sums of decorrelation values. For all \(D>0\) and \(j \ge 1\) one has \(D f(jD) \le \int _{(j-1)D}^{j D} f(T)\, dT \), since the decorrelation rate \(f(T)\) is decreasing. This implies
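To spell out the summation step, suppose (as the constants later in this appendix suggest, though it is an assumption here) that \(f \ge 0\) and that \(I(f)\) denotes \(\int _0^\infty f(T)\, dT\). Summing the per-interval inequality over \(j = 1, \ldots , q\) makes the integrals on the right tile \([0, qD]\), which gives a bound that is uniform in \(q\):

\[
D \sum _{j=1}^{q} f(jD) \;\le \; \sum _{j=1}^{q} \int _{(j-1)D}^{jD} f(T)\, dT \;=\; \int _{0}^{qD} f(T)\, dT \;\le \; \int _{0}^{\infty } f(T)\, dT \;=\; I(f).
\]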
Define the function \(g(q, D)\) for all integers \(q \ge 2\) and all \(D>0\) by
Due to (37), the following inequality holds for all \(D>0\) and \(q \ge 2 \)
Step 2: Sub-sampled empirical means converge in \(L^2\). Fix an integer \(j \in [ 1 \ldots r]\). Denote the \(j\)-th coordinates of \(X_{n \Delta }\) and of the empirical mean estimator \(\bar{X}^\varepsilon \) by
With the notation \(s_j ^2 = {\mathbb {E}}(U_n^2)\), this implies
Applying the decorrelation hypothesis (22) and the relations (38), (39), we obtain
The definition of the \(L^q\)-norm also implies \(s_j \le \Vert X_t \Vert _2 \le \Vert X_t \Vert _4 =\nu \). Hence, (40) implies
For any \(( r_1 \times r_2)\) random matrix \(M\) and any \(q \ge 1\), our definition of the norm \(\Vert M\Vert _q\) implies
The inequality \(\Vert \bar{X}^\varepsilon \Vert _2 \le \sqrt{r} \max _{j} \Vert \bar{X}^\varepsilon (j) \Vert _2\) then yields, due to (41),
Since \(\Delta (\varepsilon ) \rightarrow 0 \) this proves the \(L^2\)-bound in (24) when \(X_t\) is centered and hence in general.
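The mechanism behind this bound can be checked numerically in a toy case. The sketch below assumes a hypothetical scalar stationary sequence with covariance \(s^2 e^{-|m-n| \Delta }\), that is, an exponential decorrelation rate chosen only for illustration, for which \(I(f) = 1\). The function names are not from the paper. It compares the exact variance of the sub-sampled mean with the Riemann-sum bound \(s^2/N + 2 s^2 I(f)/(N \Delta )\), which is an illustrative bound and not necessarily the paper's inequality (40).

```python
import math


def exact_variance_of_mean(n_obs, step, variance=1.0, rate=1.0):
    """Exact Var((1/N) * sum U_n) when Cov(U_m, U_n) = variance * exp(-rate * |m - n| * step)."""
    rho = math.exp(-rate * step)
    total = n_obs * variance  # diagonal terms m == n
    for lag in range(1, n_obs):
        # there are 2 * (N - lag) ordered pairs (m, n) with |m - n| == lag
        total += 2.0 * (n_obs - lag) * variance * rho ** lag
    return total / n_obs ** 2


def riemann_sum_bound(n_obs, step, variance=1.0, rate=1.0):
    """Upper bound s^2/N + 2 s^2 I(f) / (N * step), with I(f) = 1/rate for f(T) = exp(-rate * T)."""
    integral_of_rate = 1.0 / rate  # assumed exponential rate, so the integral is explicit
    return variance / n_obs + 2.0 * variance * integral_of_rate / (n_obs * step)


if __name__ == "__main__":
    # hypothetical (N, step) pairs with step shrinking while N * step grows
    for n_obs, step in [(100, 0.5), (1000, 0.1), (10000, 0.05)]:
        exact = exact_variance_of_mean(n_obs, step)
        bound = riemann_sum_bound(n_obs, step)
        print(f"N={n_obs:6d}  step={step:5.2f}  N*step={n_obs * step:7.1f}  "
              f"exact={exact:.5f}  bound={bound:.5f}")
```

Both quantities shrink as \(N \Delta \) grows, which illustrates why the sub-sampling scheme needs \(N(\varepsilon ) \Delta (\varepsilon ) \rightarrow \infty \) even though \(\Delta (\varepsilon ) \rightarrow 0\).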
Step 3: Sub-sampled empirical means converge in \(L^4\). Basic algebra yields the identities
where the sums \(S_1\), \(S_2\), \(S_3\), and \(S_4\) are defined by
Due to the assumption that the \(L^4\) norm of \(X_t\) is bounded uniformly by \(\nu \), one clearly has \(|{\mathbb {E}}( S_1) | \le N \nu ^4\) and \( | {\mathbb {E}}(S_2) | \le 4 N^2 \nu ^4\).
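The uniform moment bound enters through the generalized Hölder inequality. As a sketch, assume that each term of \(S_1\) and \(S_2\) is a product of four coordinates \(U_a U_b U_m U_n\) (with indices possibly repeated) and that \(\Vert U_n \Vert _4 \le \Vert X_t \Vert _4 = \nu \) for every coordinate. Then:

\[
\left| {\mathbb {E}}(U_a U_b U_m U_n) \right| \;\le \; {\mathbb {E}}\left| U_a U_b U_m U_n \right| \;\le \; \Vert U_a \Vert _4 \, \Vert U_b \Vert _4 \, \Vert U_m \Vert _4 \, \Vert U_n \Vert _4 \;\le \; \nu ^4 .
\]

Each expectation is therefore at most \(\nu ^4\) in absolute value, so each sum is controlled by \(\nu ^4\) times the weighted number of its terms.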
Since we are considering the centered process \(X_t\), we have \({\mathbb {E}}(U_n) \equiv 0\), and for \(a < m < n\) the decorrelation hypothesis implies
Similarly, one shows that
These bounds and definition (38) yield (for \(N \ge 3\))
which implies, due to the bound (39),
As above, one also has \( | {\mathbb {E}}(U_a U_b U_m U_n) | \le f( (b-a) \Delta ) \) for \(a < b < m < n \). The expressions of \(S_4\) and \(g(m, \Delta )\) then yield (for \(N \ge 4\))
Therefore, due to (39) we obtain for \(N \ge 4\)
Finally, the bounds on \(| {\mathbb {E}}(S_k) |\), and Eq. (42) entail
for some explicit constant C, since \(N(\varepsilon ) \rightarrow \infty \) and \(\Delta (\varepsilon ) \rightarrow 0 \) with \(N(\varepsilon ) \Delta (\varepsilon ) \rightarrow \infty \). In particular for \(\varepsilon \) small enough, one can clearly take \(C = 7 I(f)\). Therefore, Eqs. (41) and (43) imply
which proves the \(L^4\)-bound in (24).
Step 4: Convergence of empirical lagged covariance matrix estimators. Introduce the shorthand notations \(V_n = X_{n \Delta }\) and
From Definition (12), the covariance matrix estimators \(\hat{K}^\varepsilon _X(u) \) can be rewritten as
First, we evaluate the term \(\bar{V}_N (\tau \bar{V}_N)^* \) in the equation above. Impose \(0 \le u \le A\) for some fixed A. Thus, by construction
and applying the inequality (18) one arrives at the following relation
Since \(\mu = 0\), we also have
as proven in Step 3. This implies, by inequality (18),
which yields, due to Eq. (48),
By the construction of \(\kappa (u,\varepsilon )\), the “discrete” lag \(\kappa \Delta \) is close to the continuous lag \(u\), with \(| \kappa \Delta - u | \le \Delta \). Since the true lagged covariance matrices \(K(u)\) are locally Lipschitz, there is a constant \(\lambda = \lambda (A)\) such that for all \(0 \le u \le A\) and all \(\varepsilon >0\) the following deterministic inequality holds
Next, we compare the term \(W_N\) in the expression of the covariance estimator (47) with the true covariance matrix \(K(\kappa \Delta )\) evaluated at the “discretized” time lag \(\kappa \Delta \). Since \(X_t\) is stationary, we have \(K(\kappa \Delta ) = {\mathbb {E}}( V_n V_{n+ \kappa }^*) \) for all \(n\), and formula (46) implies that
For any two coordinates \(i, j \in [1 \ldots r]\) denote \(T_n= V_n(i)\) and \(U_n= V_n(j)\) as the i-th and j-th coordinates of \(V_n\), respectively. In addition, we also define
Clearly \({\mathbb {E}}[H_n] = 0\), and the \((i, j)\) coefficient of the matrix \( M = W_N - K(\kappa \Delta )\) is then
and
Next, we partition the summation interval in the expression above into two complementary sets, \(Q^+\) and \(Q^-\), as follows
- \((m,n) \in Q^+\) whenever \( | n - m | > \kappa \) and \(m, n \in [1 \ldots N] \),
- \((m,n) \in Q^-\) whenever \( | n - m | \le \kappa \) and \(m, n \in [1 \ldots N]\).
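The point of the partition is that \(Q^-\) is small: it contains at most \((2\kappa + 1) N\) of the \(N^2\) index pairs, so a crude uniform bound on \({\mathbb {E}}[H_m H_n]\) is affordable there, while the decorrelation hypothesis is reserved for \(Q^+\). A minimal sketch that enumerates both sets for hypothetical values of \(N\) and \(\kappa \) (the function name is illustrative, not from the paper):

```python
def partition_index_pairs(n_obs, kappa):
    """Split all pairs (m, n) in [1..N]^2 into Q_plus (|n - m| > kappa) and Q_minus (|n - m| <= kappa)."""
    q_plus, q_minus = [], []
    for m in range(1, n_obs + 1):
        for n in range(1, n_obs + 1):
            if abs(n - m) > kappa:
                q_plus.append((m, n))
            else:
                q_minus.append((m, n))
    return q_plus, q_minus


if __name__ == "__main__":
    n_obs, kappa = 50, 3  # hypothetical sample size and discrete lag
    q_plus, q_minus = partition_index_pairs(n_obs, kappa)
    # size of Q_minus next to the crude bound (2 * kappa + 1) * N
    print(len(q_minus), (2 * kappa + 1) * n_obs)
    # the two sets together cover every pair exactly once
    print(len(q_plus) + len(q_minus) == n_obs ** 2)
```

For \(\kappa \le N - 1\) the exact count is \(|Q^-| = (2\kappa + 1) N - \kappa (\kappa + 1)\), which grows only linearly in \(N\) when \(\kappa \) stays bounded relative to \(N\).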
Due to bounded fourth moments of the process \(X_t\) we have \(\left| {\mathbb {E}}[H_m H_n] \right| \le 2 \nu ^4 \). Moreover,
and, therefore,
For \((m,n) \in Q^+\), the decorrelation rate of the 2nd order moments yields
so that
Thus, relation (51) and inequalities (52), (53) imply
Easy algebra transforms the double sum above into
where Eq. (37) was used in the last inequality. Recall that for \(0 \le u \le A\), by the construction of \(\kappa \), one also has
Substituting the last two expressions into (54) we obtain
By Eq. (41) we further obtain
Using the expression for \(\hat{K}^\varepsilon _X(u)\) in (47) and the triangle inequality we can write
Combining the three bounds in (49), (50), and (55), we obtain, for all \(\varepsilon > 0\) and \(0 \le u \le A\),
where
Moreover, for \(\varepsilon \) small enough, we can take \(C^2 = \sqrt{7r I(f)}\) as discussed in Step 2, and \({4 (1+A) \nu ^2}{(N \Delta )^{-1/2}}\) will become much smaller than \(\sqrt{r I(f)}\). Therefore, for \(\varepsilon \) small enough, one has (using \(\sqrt{a+b} \le \sqrt{a} + \sqrt{b}\)) a simplified expression for the constant \(\Gamma \)
This concludes the proof of Theorem 2.
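For concreteness, the estimators analyzed in this appendix can be sketched in a few lines. The sketch below assumes the form \(W_N - \bar{V}_N (\tau \bar{V}_N)^*\) used in Step 4, with \(W_N\) taken to be the average of \(V_n V_{n+\kappa }^*\), \(\tau \) taken to be a shift by \(\kappa \) steps, and \(\kappa = \lfloor u / \Delta \rfloor \). These choices, the data layout, and the function name are assumptions made for illustration and may differ in detail from Definition (12).

```python
import math


def subsampled_mean_and_lagged_cov(samples, step, lag):
    """Sub-sampled empirical mean and lagged covariance for a lag u >= 0.

    samples: list of r-dimensional observations V_n = X_{n * step} (lists of floats).
    Assumed layout: the first N = len(samples) - kappa points are paired with
    their kappa-step shifts, where kappa = floor(lag / step).
    """
    kappa = int(math.floor(lag / step))
    n_obs = len(samples) - kappa
    if n_obs < 1:
        raise ValueError("not enough samples for the requested lag")
    dim = len(samples[0])
    # empirical mean of V_n and of the shifted points V_{n + kappa}
    mean = [sum(samples[n][i] for n in range(n_obs)) / n_obs for i in range(dim)]
    shifted_mean = [sum(samples[n + kappa][i] for n in range(n_obs)) / n_obs
                    for i in range(dim)]
    # (i, j) entry: average of V_n(i) * V_{n + kappa}(j) minus the product of the two means
    cov = [
        [
            sum(samples[n][i] * samples[n + kappa][j] for n in range(n_obs)) / n_obs
            - mean[i] * shifted_mean[j]
            for j in range(dim)
        ]
        for i in range(dim)
    ]
    return kappa, mean, cov


if __name__ == "__main__":
    toy = [[1.0], [2.0], [3.0], [4.0]]  # hypothetical scalar observations
    print(subsampled_mean_and_lagged_cov(toy, step=0.5, lag=0.5))
```

With \(\kappa = \lfloor u / \Delta \rfloor \) the discrete lag satisfies \(| \kappa \Delta - u | \le \Delta \), which is the property of \(\kappa \) that Step 4 relies on.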
Cite this article
Azencott, R., Ren, P. & Timofeyev, I. Parametric Estimation from Approximate Data: Non-Gaussian Diffusions. J Stat Phys 161, 1276–1298 (2015). https://doi.org/10.1007/s10955-015-1379-6
DOI: https://doi.org/10.1007/s10955-015-1379-6