Probabilistic models and statistics for electronic financial markets in the digital age

This manuscript reviews some recent developments in statistics for discretely observed semimartingales which are motivated by applications to financial markets. Our journey through this area stops to take closer looks at a few selected topics, discussing recent literature. We moreover highlight and explain the important role played by some classical concepts of probability and statistics. We focus on three main aspects: testing for jumps; rough fractional stochastic volatility; and limit order microstructure noise. We review jump tests based on extreme value theory and complement the literature by proposing new statistical methods. They are based on the asymptotic theory of order statistics and the Rényi representation. The second stage of our journey visits a recent strand of research showing that volatility is rough. We further investigate this and establish a minimax lower bound, exploring to what extent the regularity of latent volatility can be recovered in a more general framework. Finally, we discuss a stochastic boundary model with one-sided microstructure noise for high-frequency limit order prices and its probabilistic and statistical foundation.


Introduction
The evolution of stock prices is subject to market risk. Foundations of price models in continuous time typically refer back to Louis Bachelier. He did his PhD supervised by Henri Poincaré in Paris and defended his thesis "Théorie de la spéculation" in 1900. He is considered to be the first researcher who discovered Brownian motion. I would see this as a great success, although the moral that price changes cannot be forecasted (Efficient Market Hypothesis) is rather negative for speculators. Naturally, forecasting of future prices is a worthwhile pursuit. The first Bachelor student I supervised asked me about methods to do that, but I rather pointed him to risk forecasting. Looking at data from the DAX he concluded that "One can see clearly that there is a much higher autocorrelation for squared returns than for returns. This is an indication that it might be easier to forecast variance of returns than the returns themselves". While forecasting price changes (=returns) would be desirable to make money, forecasting risk is more successful and an integral instrument of risk management. This is a main application of statistics for financial markets and the analysis of financial time series.
The Brownian motion, or Wiener process, $(W_t)$ with $W_0 = 0$ is defined by the properties:
• Its increments are stationary, such that $(W_{t+s} - W_t)$ is distributed as $W_s$;
• The expectation is $E[W_t] = 0$, for all $t$;
• The paths $t \mapsto W_t$ (=realizations) are continuous.
All random objects throughout the manuscript are defined on some probability space $(\Omega, \mathcal{F}, P)$, with σ-field $\mathcal{F}$ and measure $P$. We follow the standard convention not to write arguments $\omega \in \Omega$ for random objects. Brownian motion can be motivated as integrated continuous-time white noise, as well as the limit of a discrete-time random walk $X_T = \sum_{t=1}^{T} \epsilon_t$, $T \in \mathbb{N}$, where $(\epsilon_t)$ are independent, identically distributed (i.i.d.) with $P(\epsilon_t = 1) = 1/2 = P(\epsilon_t = -1)$. It holds that $T^{-1/2} X_{\lfloor Ts \rfloor} \to W_s$ in distribution, as $T \to \infty$, with the floor function $\lfloor \cdot \rfloor$. Note that continuous-time white noise does not exist in the sense of a measurable stochastic process, which is related to the fact that the paths of $(W_t)$ are continuous but nowhere differentiable. So each realization of a Brownian motion shares this fascinating property with the Weierstrass function. The existence was proved by Wiener in 1923 and we refer to the textbook Schilling and Partzsch (2014) for an overview of different constructions and various properties. Brownian motion is really at the heart of the theory of continuous-time stochastic processes. From my point of view, the theory of stochastic processes is mainly the study of classes of processes which share one or some of the fundamental properties of Brownian motion:
• It is a Gaussian process. That means that all finite-dimensional distributions of $W_{t_1}, \ldots, W_{t_n}$ are normal for all $n \in \mathbb{N}$ and arbitrary times. A Gaussian process is uniquely determined by its expectation and covariance function. A Brownian motion is hence uniquely characterized as a continuous Gaussian process with $W_0 = 0$, $E[W_t] = 0$, for all $t$, and the covariance function $\mathrm{Cov}(W_s, W_t) = \min(s, t)$.
• It is a Lévy process, that is, a process $(X_t)$ with independent stationary increments, $X_0 = 0$, which satisfies for all $\epsilon > 0$: $P(|X_{t+h} - X_t| > \epsilon) \to 0$, as $h \to 0$. Writing $X_t = \sum_{j=1}^{n} (X_{t_j} - X_{t_{j-1}})$, $0 = t_0 < t_1 \le t_2 \le \cdots \le t_n = t$, the study of Lévy processes is related to studying sums of i.i.d. random variables. The second fundamental example of a Lévy process is a Poisson (jump) process.
• It is a martingale whose conditional expectation satisfies $E[W_t \mid W_s] = W_s$, almost surely for all $t \ge s$.
• It is a Markov process, for which $(W_{t+s} - W_t)_{s \ge 0}$ is another Brownian motion independent of $\{W_u, 0 \le u \le t\}$.
• It is self-similar such that $a^{-1/2} W_{at}$ is distributed as $W_t$, for all $a > 0$. Looking at a path in a plot with axes that have no labels, we could hence not say anything about the scaling.
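The random walk approximation and the covariance function $\min(s, t)$ can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names, the walk length `T`, the number of paths and the seed are illustrative assumptions and not part of the original text.

```python
import math
import random


def scaled_random_walk(T, times, rng):
    """Return T^{-1/2} X_{floor(T s)} for each s in `times` (sorted ascending)."""
    targets = [math.floor(T * s) for s in times]
    values, x, step = [], 0, 0
    for target in targets:
        while step < target:
            x += 1 if rng.random() < 0.5 else -1  # symmetric +/-1 step
            step += 1
        values.append(x / math.sqrt(T))
    return values


def empirical_covariance(s, t, T=400, n_paths=4000, seed=1):
    """Monte Carlo estimate of Cov(W_s, W_t) for s <= t via scaled random walks."""
    rng = random.Random(seed)
    pairs = [scaled_random_walk(T, [s, t], rng) for _ in range(n_paths)]
    mean_s = sum(p[0] for p in pairs) / n_paths
    mean_t = sum(p[1] for p in pairs) / n_paths
    return sum((p[0] - mean_s) * (p[1] - mean_t) for p in pairs) / n_paths
```

Calling `empirical_covariance(0.3, 0.8)` should give a value close to $\min(0.3, 0.8) = 0.3$, up to Monte Carlo error.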
The famous Black-Scholes price model follows Bachelier's principles and describes the stock price $S_t$ at time $t$ by the stochastic differential equation
$$dS_t = a S_t\,dt + \sigma S_t\,dW_t, \qquad (1)$$
which is solved by a geometric Brownian motion. This can be proved by a simple application of Itô's lemma. Bachelier's mantra, that there is no (short-term) expected profit for traders without insider information, culminates in the "Fundamental Theorem of Asset Pricing": In an arbitrage-free market, prices follow martingales.
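The geometric Brownian motion solving (1) has the closed form $S_t = S_0 \exp\big((a - \sigma^2/2)t + \sigma W_t\big)$. As a sanity check, one can compare it with an Euler discretization of (1) driven by the same Brownian increments. The sketch below is only an illustration; the function name, the parameter values and the step number are assumptions.

```python
import math
import random


def simulate_gbm(s0, a, sigma, t, n_steps, seed=0):
    """Return (euler, exact) terminal values driven by the same Brownian path."""
    rng = random.Random(seed)
    dt = t / n_steps
    s_euler = s0
    w = 0.0  # running value of the Brownian motion W_t
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        # One Euler step for dS = a S dt + sigma S dW.
        s_euler += a * s_euler * dt + sigma * s_euler * dw
        w += dw
    # Closed-form solution obtained from Ito's lemma applied to log(S_t).
    s_exact = s0 * math.exp((a - 0.5 * sigma ** 2) * t + sigma * w)
    return s_euler, s_exact
```

For a fine grid, e.g. `simulate_gbm(100.0, 0.05, 0.2, 1.0, 10000)`, both values should agree up to a small discretization error.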
An extension to general no-arbitrage conditions by Delbaen and Schachermayer (1994) implies that log-prices should be semimartingales. These are processes that can be expressed on compact time intervals as sums of a martingale and a process of finite variation. Of course, the Brownian motion is a semimartingale. Interestingly, these processes do not only occur in this modern fundamental theorem of asset pricing, but are also the class of "good integrators" for stochastic integrals according to the Bichteler-Dellacherie theorem. We work with a continuous-time log-price modelled as an Itô semimartingale:
$$X_t = C_t + J_t, \qquad C_t = X_0 + \int_0^t \mu_s\,ds + \int_0^t \sigma_s\,dW_s, \qquad (2)$$
with a continuous part $(C_t)$ and a jump part $(J_t)$. Throughout this work we focus on real-valued, one-dimensional processes modelling the price evolution of only one asset. Since the multivariate setting has some surprises in store, it is well worth a visit, too. Most of my own work is in fact devoted to multivariate phenomena, and we do not live in a one-dimensional world. However, for my selection of topics in this manuscript and for simplicity it is sufficient to stick to a one-dimensional image space at a fixed time. Some recent aspects of a multidimensional analysis and its applications to portfolios are briefly mentioned in the outlook. In (2), $(\sigma_s)$ is the volatility process. Volatility is the prevalent concept to describe market risk. It will therefore be our main target of statistical inference.
The first important advancement compared to the Black-Scholes model (1) is to include time-varying volatility which may be a stochastic process itself. A second important advancement is to consider price jumps which can describe rapid price adjustments in response to new information provided, e.g., by economic shocks or central bank announcements.
The realm of big data in financial markets can be viewed as a stroke of luck for statisticians and data analysts, giving us huge data sets at hand, in particular when looking at intra-daily tick data. In electronic financial markets almost 70% of the volume is nowadays attributed to high-frequency trading. This should be very useful for risk quantification but, similarly as in other fields, the picture of how to efficiently exploit big data is not yet complete as the data sets are complex and noisy. Nevertheless, or maybe for this very reason, high-frequency statistics for semimartingales became highly relevant for applications to intra-day financial data in econometrics and also in macroeconomic studies. The use of high-frequency financial data was promoted by Engle (2000) and Andersen (2000), among others, around the turn of the millennium. Engle (2000) called the situation when prices from all transactions are recorded "ultra-high frequency data". Such data samples are rather small compared to the ones we now have available from limit order books, see Figure 5 for a snapshot. The advent of ultra-high frequency data motivated bridging several strands of research between statistics for stochastic processes, time series analysis and econometrics. Since then high-frequency statistics for semimartingales has evolved into a huge field of study. Many brilliant researchers made substantial contributions to this field, including Yacine Aït-Sahalia, Ole Barndorff-Nielsen, Jean Jacod, Per Mykland and younger ones such as Mark Podolskij and Viktor Todorov, to name just a few.
Having self-similar processes in price models, it might be a bit disappointing to learn that different models are nevertheless used when considering data over different time scales. For instance, under a very high time resolution over a short interval, discreteness of prices in the image space becomes relevant, which is not in line with a Brownian motion, while at low time resolutions, e.g., daily data over a year, discrete time series might be adequate models. Depending on the time resolution and the concrete application, micro-, meso- or macroscopic models are used as devices to coherently describe price dynamics and to allow at the same time a good calibration of the model. Often the applied focus is on risk forecasting, portfolio allocation or asset pricing. Since the seminal works on AutoRegressive Conditional Heteroskedastic (ARCH) time series and its generalizations (GARCH), see Engle and Bollerslev (1986),¹ in the late 1980s, time series models are the main workhorse to perform volatility forecasting, typically on a daily basis. GARCH models take into account several stylized facts such as volatility clustering. In the era of intra-day high-frequency price recordings, one common approach is to infer volatility based on a continuous-time model and to plug the estimates into time series models used over daily time scales, see, e.g., Hansen et al. (2012). Although it might appear odd to a mathematician that the continuous-time model is then not used over all time scales, many econometricians are satisfied with the good empirical performance. In recent years the research on forecasting shifted more towards fractional time series and continuous-time fractional models. The rough volatility literature and the data example in Section 4 follow a similar philosophy of combining estimates from different models for different time resolutions.
Stylized facts of ultra-high-frequency data contradict a pure semimartingale model due to so-called market microstructure. An early paper which reported these empirical facts and foreshadowed the very successful semimartingale with additive noise model to describe tick data was Zhou (1996). Various related estimation methods for the volatility in this model have been proposed, including two-scale and multi-scale realized volatility by Zhang et al. (2005) and Zhang (2006), pre-averaging by Jacod et al. (2009), the kernel estimator by Barndorff-Nielsen et al. (2008), the Quasi-Maximum-Likelihood approach by Xiu (2010) and many more. A lower bound for the asymptotic variance of integrated volatility estimators at optimal rate for this problem was established by Reiß (2011). In a related vein, the presented model for limit order book quotes in Section 5 preserves the idea of an underlying semimartingale efficient price and describes market microstructure effects by additive noise. The only difference, which is however crucial, is that we proceed from regular noise with expectation zero to irregular, non-negative one-sided noise.
¹ Robert F. Engle received the Nobel prize in 2003 with Granger. Scholes and Merton won it in 1997 when Black's key role was pointed out, but the prize is not given posthumously.
Deciding whether there are jumps in a price process or if it is continuous is, beyond volatility estimation, one of the most important problems in the literature on high-frequency data. Just as "Cat or dog?" nowadays seems to be one of the big questions based on inputs from images in research on AI and machine learning, "Jump or no jump?" poses one of the main testing problems for statistics of financial markets. One reason is that it is crucial to select and work with an adequate price model. Moreover, volatility of continuous price movements and jumps are used to describe different kinds of market risks and it is important to distinguish between the two in economic studies. Based on continuous-time paths it would simply be possible to see jumps, while based on discrete recordings with a fixed time $\Delta$ between subsequent observations it is impossible to decide the question. In a high-frequency asymptotic regime with distance $\Delta_n \to 0$ between discrete recordings, the question yields an interesting problem. In Section 3 we address this problem based on methods from extreme value theory and the beautiful theory of order statistics, placing Rényi's representation at the forefront.
Extreme value theory concerns outliers in stationary sequences of random variables or time series, in particular the asymptotic distribution of maxima of i.i.d. random variables. An outlier refers to a realized value which is far away from the mean level of a time series, e.g., a year with a once-in-a-century flood in a sequence of yearly rainfall data. The mean level before and after an outlier remains the same. The picture of a jump is different. For instance, if a stock price evolves around level $a$ before a jump of size $b$ in response to the communication of quarterly results of a company, the price will continue to move at the new level $a + b$ after the event. Jumps and outliers are nevertheless closely related. In particular, as Figure 1 reveals, jumps in a process can be detected as outliers in the sequence of corresponding increments. Applying methods from extreme value theory to increments is hence one promising starting point to construct jump tests. Figure 1 shows a simulated price based on a popular Heston model with constant drift and its specific stochastic volatility $(\sigma_t)$, to which we add compound Poisson jumps with jump sizes drawn from a Laplace distribution. Even though the five realized jumps are highlighted in the path of the process on the left-hand side of Figure 1, it might be difficult to spot all of them, while it is easier to find the increments with jumps as outliers in the increments on the right-hand side.
In view of numerous references on the broader topic,² we only strive to sketch a picture of three selected recent developments. Concerning deeper, open questions, parts of the more recent literature in the field have become rather technical and sophisticated. It is the goal here to shed light on some key ideas and insights, and to point out their relation to concepts from probability and mathematical statistics. Our journey shall not be a random walk. We begin in Section 2 as a prologue with fundamental concepts and cornerstones of statistics for semimartingales based on high-frequency data. We highlight the impact of Jean Jacod. Section 3 reviews statistical tests for price jumps in high-frequency data based on extreme value theory. We complement the existing methods by proposing alternative, original methods. Gumbel is the second researcher whose contribution is emphasized. The newly developed Rényi test in Theorem 1 is built upon a maximal difference between order statistics motivated by the Rényi representation. This is the first classical, elementary yet ingenious, concept from probability which we highlight in a colourbox. A comment on the genius behind this is given in another box. In Section 4 we provide a brief review of rough fractional stochastic volatility with a data example. The review mentions recent results on the identifiability of the Hurst exponent under high-frequency asymptotics. Our Theorem 2 adds a new result which points out that the regularity of stochastic volatility in a more general sense is identifiable only in some cases and can be estimated only with a slower rate of convergence. This reveals that, even based on high-frequency data, there are frontiers in recovering path properties of a latent volatility from price recordings. The third part of the journey in Section 5 is admittedly used to summarize and reflect on some of my own research. The colourboxes place the spotlight on the taxi problem and the reflection principle for Brownian motion. The taxi problem is a popular example for estimating a boundary parameter. It is even contained in some school-books which develop the main estimation ideas in an intuitive manner. A nice feature of this example is that it is nevertheless understood more deeply with concepts such as complete and sufficient statistics. While the taxi problem is used to motivate the estimation based on local minima of best ask quotes, the reflection principle paves the way to determine distributions of functionals of Brownian motion, which is an important ingredient of the asymptotic analysis of our boundary model.

Elements of high-frequency statistics for semimartingales
For some first simple but useful insights consider the parametric model with log-price
$$X_t = X_0 + \mu t + \sigma W_t,$$
with a standard Brownian motion $(W_t)$ and $\mu \in \mathbb{R}$, $\sigma > 0$ unknown parameters. An obvious problem for statistics is to estimate the two parameters. Assume that we have discrete observations $X_0, X_{t_1}, \ldots, X_{t_n}$ on an equidistant grid $t_j = j\Delta_n$, $0 \le j \le n$, available. In statistics for stochastic processes there are different asymptotic regimes: either $T = n\Delta_n$ is fixed and $\Delta_n \to 0$ (high-frequency), or $\Delta_n = \Delta$ is fixed and $T = n\Delta \to \infty$ (low-frequency), or both hold and we have high-frequency data over an asymptotically large time interval. In our setting we have many observations of only one single path of the process. The inference and testing problems, e.g., whether there are jumps or not, are in this field formulated and addressed pathwise, i.e., we want to know if the realized path has jumps or not. Statistical inference is based on the increments
$$\Delta_j^n X = X_{t_j} - X_{t_{j-1}}, \quad 1 \le j \le n. \qquad (3)$$
In the parametric, equidistant setting it holds that $\Delta_j^n X = \mu\Delta_n + \sigma\sqrt{\Delta_n}\,Z_j$, with $(Z_j)_{1 \le j \le n}$ i.i.d. standard normal, denoted $N(0, 1)$, random variables, such that the increments have expectation $\mu\Delta_n$ and variance $\sigma^2\Delta_n$. In this standard model, the maximum likelihood estimator
$$\hat\mu_{ML} = \frac{X_T - X_0}{T}, \qquad \hat\sigma^2_{ML} = \frac{1}{T}\sum_{j=1}^{n} \big(\Delta_j^n X - \hat\mu_{ML}\Delta_n\big)^2 = \frac{1}{T}\sum_{j=1}^{n} (\Delta_j^n X)^2 - \hat\mu_{ML}^2\,\Delta_n,$$
has the smallest possible variance. The quadratic risk of the estimated drift parameter,
$$E\big[(\hat\mu_{ML} - \mu)^2\big] = \frac{\sigma^2}{T}, \qquad (5)$$
does not depend on $\Delta_n$. It tends to zero only if $T \to \infty$, and not under high-frequency asymptotics. Modelling intra-daily financial data, we usually have discrete observations available at very high frequencies, e.g., once per second. Naturally, we work in a high-frequency asymptotic regime. We learn from (5) that we cannot consistently estimate the drift in this situation. Looking at $\hat\sigma^2_{ML}$, we see that the term with $\hat\mu_{ML}$ tends to zero. The standard estimator for $\sigma^2$ is therefore the realized volatility
$$\hat\sigma^2_{RV} = \frac{1}{T}\sum_{j=1}^{n} (\Delta_j^n X)^2. \qquad (6)$$
It is an elementary exercise using moments of a normal distribution to compute its squared risk
$$E\big[(\hat\sigma^2_{RV} - \sigma^2)^2\big] = \frac{2\sigma^4}{n} + \frac{4\mu^2\sigma^2\Delta_n}{n} + \mu^4\Delta_n^2,$$
which tends to zero under high-frequency asymptotics. We should not use it in a low-frequency regime, in which its variance tends to zero but the bias $\mu^2\Delta$ does not. Throughout the remainder of this manuscript we set $T = 1$, without loss of generality. It is not surprising that a central limit theorem (clt) holds true:
$$\sqrt{n}\,\big(\hat\sigma^2_{RV} - \sigma^2\big) \xrightarrow{d} N(0, 2\sigma^4). \qquad (8)$$
We write $\xrightarrow{d}$ for convergence in distribution (weak convergence). Note, however, that we have a triangular array of random variables here and not a sequence, that is, going from $n$ to $n + 1$ is not just adding one observation, but all observation times depend on $n$. For this reason, clts for triangular arrays need to be used in the high-frequency framework. The main benefit of a clt is to facilitate asymptotic confidence statements. These are feasible when we standardize the left-hand side in (8) with a consistent estimator of the asymptotic standard deviation. A more elegant method (which might give a slightly better approximation for finite samples) is a variance stabilization applying the ∆-method with the logarithm:
$$\sqrt{n}\,\big(\log(\hat\sigma^2_{RV}) - \log(\sigma^2)\big) \xrightarrow{d} N(0, 2).$$
Since log is strictly increasing, confidence intervals for $\log(\sigma^2)$ readily translate into confidence intervals for $\sigma^2$. A current discussion comparing four approaches to construct asymptotic confidence intervals based on a clt, which all work in this example, is given in Politis (2024). Beyond parameter estimation, the analogy to the standard statistical model allows to transfer more methods, e.g., likelihood ratio tests.
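The discussion of drift and volatility estimation in the parametric model can be illustrated with a short simulation. This is a hedged sketch for $X_t = X_0 + \mu t + \sigma W_t$ on an equidistant grid with $T = 1$ by default; the function names, parameter values and the 95% normal quantile `z` are illustrative assumptions. The confidence interval is obtained by exponentiating a normal interval for $\log(\sigma^2)$ with asymptotic variance $2/n$.

```python
import math
import random


def simulate_increments(mu, sigma, n, T=1.0, seed=0):
    """Increments of X_t = mu*t + sigma*W_t observed at t_j = j*T/n."""
    rng = random.Random(seed)
    dt = T / n
    return [mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]


def drift_estimate(increments, T=1.0):
    """(X_T - X_0)/T; its variance sigma^2/T does not shrink as n grows."""
    return sum(increments) / T


def realized_volatility(increments, T=1.0):
    """Sum of squared increments divided by T, an estimator of sigma^2."""
    return sum(d * d for d in increments) / T


def log_confidence_interval(rv, n, z=1.959963984540054):
    """Interval for sigma^2 from the variance-stabilized clt on the log scale."""
    half_width = z * math.sqrt(2.0 / n)
    return rv * math.exp(-half_width), rv * math.exp(half_width)
```

With, say, `simulate_increments(0.05, 0.2, 20000)`, the realized volatility should be close to $\sigma^2 = 0.04$, whereas `drift_estimate` typically remains far from $\mu$ no matter how large `n` is chosen.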
In the more general model with high-frequency observations of a continuous semimartingale $(C_t)$ from (2) with time-varying drift $(\mu_s)$ and volatility $(\sigma_s)$, the first goal is estimation of the integrated volatility $\int_0^1 \sigma_s^2\,ds$, e.g., integrated over trading days as a daily measure of risk. Since Itô's isometry yields that
$$E\Big[\Big(\int_{t_{j-1}}^{t_j} \sigma_s\,dW_s\Big)^2\Big] = E\Big[\int_{t_{j-1}}^{t_j} \sigma_s^2\,ds\Big],$$
the realized volatility (6) is still a suitable (and in fact optimal) estimator.
A very strong asymptotic result under mild regularity conditions is the functional stable central limit theorem for realized volatility by Jacod (1997):
$$\sqrt{n}\,\Big(\sum_{j=1}^{\lfloor nt \rfloor} (\Delta_j^n C)^2 - \int_0^t \sigma_s^2\,ds\Big) \xrightarrow{st} \sqrt{2}\int_0^t \sigma_s^2\,dB_s,$$
with $(B_s)$ a Brownian motion independent of $(W_s)$ defined on an extension of the original probability space, and $\xrightarrow{st}$ denoting stable convergence. This implies the marginal clt
$$\sqrt{n}\,\Big(\sum_{j=1}^{n} (\Delta_j^n C)^2 - \int_0^1 \sigma_s^2\,ds\Big) \xrightarrow{st} MN\Big(0, 2\int_0^1 \sigma_s^4\,ds\Big). \qquad (11)$$
For stochastic volatility, the variance of the limit distribution is random; the limit is then called mixed normal, denoted $MN$. For this reason it is important that the convergence is stable. This is a stronger mode of weak convergence equivalent to joint weak convergence with every measurable bounded random variable on the same space.
Since it allows for a ∆-method and weak convergence after standardization, known as Slutsky's lemma for weak convergence, it is a crucial ingredient to construct asymptotic confidence intervals. Beyond inference on the integrated volatility, the functional clt allows for various other statistical applications, for instance, a volatility change-point test of cusum-type as explained in Section 2 of Bibinger et al. (2017).
It is not only relevant to infer the integrated volatility; the nonparametric estimation of the spot volatility process $(\sigma_s^2)$ is another central problem in high-frequency statistics. At this point, it is beneficial to introduce some rigorous assumptions on the characteristics of the continuous semimartingale log-price process $(C_t)$.
Jean Jacod is certainly a spiritus rector of high-frequency statistics for semimartingales.He was head of the probability group at Paris VI (Pierre et Marie Curie) from 1987 until 2000.He was well known for his textbooks and research on limit theorems, jump processes and Malliavin calculus when he established the main groundwork in this field.He pointed out that stable convergence is the right concept for asymptotic statements on realized volatility and related functionals.Proofs of stable clts for functionals of semimartingale increments usually rely on his results.He moreover provided techniques to separate jumps from continuous movements which are exploited by many authors.The textbooks Jacod and Protter (2012) and Aït-Sahalia and Jacod (2014) summarize main aspects of high-frequency statistics.
Assumption 1. The drift $(\mu_t)_{t \ge 0}$ is locally bounded and the volatility is strictly positive, $\inf_{t \in [0,1]} \sigma_t > 0$, almost surely. For all $0 \le t + s \le 1$, $t \ge 0$, $s \ge 0$, with some constants $C_\sigma > 0$ and $\alpha > 0$, it holds that
$$E\big[(\sigma_{t+s} - \sigma_t)^2\big] \le C_\sigma\, s^{2\alpha}. \qquad (12)$$
Condition (12) imposes a certain regularity $\alpha$ of the volatility process. Due to the expectation, it is not Hölder continuity and (12) does not rule out volatility jumps. The expected squared increments of a compound Poisson jump process, for instance, over a time interval of length $s$, are bounded by a constant times $(s^2 + s)$ if the jump size distribution has a second moment. Therefore, it satisfies (12) with $\alpha = 1/2$. This is true for much more general jump processes. A continuous semimartingale, and in particular Brownian motion, satisfies (12) with $\alpha = 1/2$. The Hölder condition (12) in quadratic mean is a convenient concept to describe the variability of a stochastic process. It is also used in other fields of probability, for instance, for functional data in Golovkine et al. (2022). In particular, the rate of convergence with which $(\sigma_s^2)$ at some time $s \in (0, 1)$ can be estimated hinges on the regularity parameter $\alpha$. Using a local average of rescaled squared increments as estimator,
$$\hat\sigma_s^2 = \frac{n}{k_n}\sum_{j=\lfloor sn \rfloor - k_n + 1}^{\lfloor sn \rfloor} (\Delta_j^n C)^2, \qquad (13)$$
yields with $k_n \propto n^{2\alpha/(2\alpha+1)}$, for which the order of the squared bias $(k_n\Delta_n)^{2\alpha}$ is the same as that of the variance $k_n^{-1}$, the minimal root mean squared error of order $n^{-\alpha/(2\alpha+1)}$. Given that the regularity parameter determines optimal spot volatility estimation, inference on an unknown $\alpha$ is certainly of interest. This, however, is an intricate problem not yet solved in general, which we visit in Section 4. In the standard case $\alpha = 1/2$, spot volatility can be estimated with rate $n^{-1/4}$. In the best (non-constant) case $\alpha = 1$, the rate is $n^{-1/3}$.
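The bias-variance trade-off behind the choice $k_n \propto n^{2\alpha/(2\alpha+1)}$ can be tried out in a small simulation. The sketch below rests on several assumptions made only for illustration: a smooth deterministic volatility path `sigma_fn` (so that $\alpha = 1$), no drift, a backward-looking window of $k_n$ increments ending at $\lfloor sn \rfloor$, and the proportionality constant set to one.

```python
import math
import random


def sigma_fn(t):
    """Hypothetical smooth deterministic volatility path (regularity alpha = 1)."""
    return 0.2 + 0.1 * math.sin(2.0 * math.pi * t)


def simulate_increments(n, seed=0):
    """Increments of the integral of sigma_fn against W on the grid j/n, no drift."""
    rng = random.Random(seed)
    dt = 1.0 / n
    return [sigma_fn(j * dt) * math.sqrt(dt) * rng.gauss(0.0, 1.0) for j in range(n)]


def spot_volatility(increments, s, alpha=1.0):
    """Local average of rescaled squared increments over the k_n increments before s."""
    n = len(increments)
    k_n = max(1, round(n ** (2.0 * alpha / (2.0 * alpha + 1.0))))
    end = min(n, max(k_n, math.floor(s * n)))  # keep the window inside the sample
    window = increments[end - k_n:end]
    return (n / k_n) * sum(d * d for d in window)
```

For example, `spot_volatility(simulate_increments(20000), 0.25)` estimates $\sigma_{0.25}^2 = 0.09$; a larger window reduces the variance but increases the smoothing bias.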
If there are jumps, the realized volatility converges in probability, denoted $\xrightarrow{P}$, to the entire quadratic variation:
$$\sum_{j=1}^{n} (\Delta_j^n X)^2 \xrightarrow{P} \int_0^1 \sigma_s^2\,ds + \sum_{s \le 1} (\Delta X_s)^2, \qquad \Delta X_s = X_s - X_{s-}.$$
It is common notation to write jumps with sums over uncountable index sets, since the processes will always have only countably many random jump times. Most relevant is first to estimate the integrated volatility in presence of the nuisance jumps. To get rid of jumps, we need to discard in particular large jumps as in Figure 1, which are contained in the large absolute increments. A natural approach is hence to truncate increments whose absolute values are above a certain threshold. For this purpose, define a sequence of thresholds $u_n \propto \Delta_n^{\tau}$, $\tau \in (0, 1/2)$, and the truncated realized volatility
$$\sum_{j=1}^{n} (\Delta_j^n X)^2\, 1(|\Delta_j^n X| \le u_n),$$
where the indicator $1(A)$ equals 1 if $A$ is true and 0, else. The idea is now to work out under which restrictions on the jumps the truncated realized volatility satisfies the same clt as for the realized volatility without jumps in (11). Sufficient is
$$\sqrt{n}\,\Big(\sum_{j=1}^{n} (\Delta_j^n X)^2\, 1(|\Delta_j^n X| \le u_n) - \sum_{j=1}^{n} (\Delta_j^n C)^2\Big) \xrightarrow{P} 0. \qquad (14)$$
The truncation method was pioneered by Mancini (2009), is summarized in Chapter 13 of Jacod and Protter (2012), and the community is still working on refinements, see, for instance, Figueroa-López and Mancini (2019) and Amorino and Gloter (2020). We can decompose the difference in (14) into three terms, $I_n$, $II_n$ and $III_n$, using some $\kappa \in (0, 1)$, e.g., $\kappa = 1/2$. For the term $I_n$, knowing the order of $\max_{1 \le j \le n} |\Delta_j^n C|$, which is contained in the next section, and that it is even almost surely smaller than $u_n$ by the law of the iterated logarithm, is sufficient. The argument by Jacod is more elementary: using moments of $(\Delta_j^n C)$ of arbitrary order $N \in \mathbb{N}$ and choosing $N$ sufficiently large yields that $\sqrt{n}\, I_n \xrightarrow{P} 0$. The two other terms require a closer look at the jump component $(J_t)$ of the semimartingale $(X_t)$, $X_t = C_t + J_t$. Most literature in the area imposes the general structure of Itô semimartingales in the sense of Section 2.1.4 of Jacod and Protter (2012) which admit a "Grigelionis representation". The jumps are independent of $(C_t)$ and separated into a martingale of compensated small jumps and larger jumps,
$$J_t = \int_0^t\!\int_{\mathbb{R}} \delta(s, z)\, 1(|\delta(s, z)| \le 1)\,(\mu - \nu)(ds, dz) + \int_0^t\!\int_{\mathbb{R}} \delta(s, z)\, 1(|\delta(s, z)| > 1)\,\mu(ds, dz),$$
with a Poisson random measure $\mu$ compensated by $\nu(ds, dz) = \lambda(dz) \otimes ds$, with a σ-finite measure $\lambda$. The function $\delta$, for which the third argument $\omega$ is consequently also not written, is a predictable function, for which we assume that a non-negative deterministic function $\gamma$ exists, such that $\sup_{\omega, z} |\delta(s, z)|/\gamma(z)$ is locally bounded. The benefit of such a meticulous definition of $(J_t)$ is to preserve generality. For instance, the large class of Lévy jump processes is contained as a special case with $\delta(s, z) = z$. The main assumption on the jumps is captured in a jump activity index $r \in [0, 2]$, for which
$$\int_{\mathbb{R}} \big(\gamma^r(z) \wedge 1\big)\,\lambda(dz) < \infty.$$
Basically, this means summability of $\sum_{s \le 1} |\Delta J_s|^r$. For $r = 0$ this is a strong restriction with at most finitely many jumps on $[0, 1]$ (finite-activity), and for $r = 1$ we assume the jump process to be of finite variation. The larger $r$, the less restrictive is the condition. Now we have the toolbox to handle the terms $II_n$ and $III_n$. In the sequel, let $K$ be a generic constant. With Markov's inequality and Cauchy-Schwarz, we obtain an upper bound for $II_n$. The term $II_n$ can be thought of as an error by truncating also the continuous components whenever the threshold is exceeded. Therefore, the order gets smaller for smaller $\tau$, moving $u_n$ farther away from $\sqrt{\Delta_n}$, when only very large absolute increments are truncated. If $r\tau < 1/2$, we conclude that $\sqrt{n}\, II_n \xrightarrow{P} 0$.
Most difficult is the term $III_n$ due to small jumps in non-truncated increments. We exploit the martingale structure of the small jumps to apply Burkholder's inequality and to deduce an upper bound. This bound gets smaller for larger $\tau$, moving $u_n$ closer to $\sqrt{\Delta_n}$. Both terms decrease for smaller $r$. To ensure that $\sqrt{n}\, III_n \xrightarrow{P} 0$, we need that $r < 1$ and $\tau(2 - r) > 1/2$. This is the main result about truncated realized volatility: if $r < 1$ and $\tau(2 - r) > 1/2$, which can be ensured by selecting $\tau$ close to 1/2, it satisfies (14). While we emphasize some key steps of the proof, the bounds for $II_n$ and $III_n$ admittedly lack some details. Most of them are elementary, such as carefully using the triangle inequality, but a few are deeper. A less pedagogic but rigorous proof can be found in Chapter 13 of Jacod and Protter (2012). In particular, in the bound for $II_n$ we work under the event with at most one larger jump contained in one increment, for which the Poisson nature of the jumps yields precise estimates, see Step 5 in the proof of Thm. 13.1.1 in Jacod and Protter (2012). The restrictions on $r$ for spot volatility estimation with truncation are less strict, $r < 4/3$, see Section 13.4.1 of Jacod and Protter (2012), mainly since the rate at which such a difference needs to tend to 0 is slower.
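The effect of truncation can be illustrated in the simplest finite-activity case ($r = 0$). The following is a minimal sketch under simplifying assumptions that are not taken from the text: constant volatility, no drift, jumps placed at fixed increment indices, and a threshold $u_n = c\,\Delta_n^{\tau}$ with the hypothetical tuning constants `c = 1` and `tau = 0.45`.

```python
import math
import random


def simulate_jump_diffusion(n, sigma, jumps, seed=0):
    """Increments of sigma*W on [0, 1]; `jumps` maps increment index -> jump size."""
    rng = random.Random(seed)
    dt = 1.0 / n
    inc = [sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0) for _ in range(n)]
    for j, size in jumps.items():
        inc[j] += size
    return inc


def realized_volatility(inc):
    """Sum of squared increments; also picks up the squared jumps."""
    return sum(d * d for d in inc)


def truncated_realized_volatility(inc, tau=0.45, c=1.0):
    """Sum of squared increments whose absolute value is at most u_n = c * dt**tau."""
    n = len(inc)
    u_n = c * (1.0 / n) ** tau
    return sum(d * d for d in inc if abs(d) <= u_n)
```

With `inc = simulate_jump_diffusion(10000, 0.2, {2500: 0.05, 7000: -0.1})`, the plain realized volatility is inflated by roughly the sum of squared jumps, while the truncated version should stay close to $\sigma^2 = 0.04$.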

Jump detection in high-frequency data based on extreme value theory
There are several different constructions of tests for jumps in high-frequency data.
Let me focus here only on the most prominent one in financial economics by Lee and Mykland (2008). It is based on the maximal (absolute) normalized increment and exploits its asymptotic Gumbel distribution under the null hypothesis of no jumps. The test is sometimes called the Lee-Mykland test, or the Gumbel test in the literature. The asymptotic Gumbel distribution is traced back to the one of the maximum of i.i.d. $N(0, 1)$ random variables and thus classical extreme value theory.
In the sequel, we consider real-valued random variables on some probability space with measure $P$. For random variables $(X_j)_{1\le j\le n}$, we denote the order statistics by
$$X_{(1)} \le X_{(2)} \le \dots \le X_{(n)}.$$
In particular, $X_{(1)} = \min_{1\le j\le n} X_j$ refers to the minimum and $X_{(n)} = \max_{1\le j\le n} X_j$ to the maximum. These are unique with probability 1 for random variables whose distributions are absolutely continuous with respect to the Lebesgue measure. It is a standard example in extreme value theory, see, e.g., Example 1.1.7 in de Haan and Ferreira (2006), that for $(X_j)_{1\le j\le n}$ i.i.d. $N(0,1)$, the maximum satisfies
$$\frac{X_{(n)} - b_n}{a_n} \xrightarrow{d} \Lambda, \qquad b_n = \big(2\log n - \log\log n - \log(4\pi)\big)^{1/2}, \quad a_n = b_n^{-1}, \tag{16}$$
with the Gumbel limit distribution $\Lambda$, i.e., it holds for all $x \in \mathbb{R}$ that
$$\lim_{n\to\infty} P\Big(\frac{X_{(n)} - b_n}{a_n} \le x\Big) = \Lambda(x) = \exp(-\exp(-x)).$$
Lee and Mykland (2008) proved in their Thm. 2 that, under the null hypothesis of no jumps,
$$\frac{\max_{1\le j\le n} n^{1/2}\,|\Delta_j^n X|/\hat\sigma_{j/n} - b_{2n}}{a_{2n}} \xrightarrow{d} \Lambda, \tag{17}$$
with a suitable estimator $(\hat\sigma_{j/n})$ of the volatility, e.g., from (13). The proof is carried out under some assumptions on $(\mu_t)$ and $(\sigma_t)$, which can be generalized to rather weak regularity conditions. The similarity to (16) is striking. Indeed, the proof traces the convergence back to (16) by showing that the normalized increments can be approximated by i.i.d. $N(0,1)$ observations. The factor 2 in the logarithm, which enters through $a_{2n}$ and $b_{2n}$, is due to the absolute value in the statistic and exploits the symmetry of $N(0,1)$. The normalizing sequences given in Lee and Mykland (2008) are in fact slightly different, but asymptotically equivalent. Rejecting the null hypothesis when the statistic on the left-hand side of (17) exceeds $-\log(-\log(1-\alpha))$, $\alpha \in (0,1)$, hence yields a test with asymptotic level $\alpha$, i.e., the probability of a false rejection converges to $\alpha$. Under the alternative hypothesis $H_1: \sup_{\tau\in(0,1)} |X_\tau - X_{\tau-}| > 0$, the test rejects correctly with asymptotic probability 1. There is moreover a rate of convergence. We can state equivalently that the test rejects correctly with asymptotic probability 1 under local alternatives in which the jump sizes may shrink as $n$ grows. This means that we cannot detect arbitrarily small jumps based on a fixed number of $(n+1)$ discrete high-frequency recordings, but only jumps which are larger than of order $n^{-1/2}$. The test has several appealing properties. Critical values based on quantiles of the Gumbel distribution can be determined to test at a chosen level $\alpha$. Moreover, the associated argmaximum consistently estimates the time of the largest jump under $H_1$. Based on sequential testing and the largest absolute increments, jump times and jump sizes can thus be inferred.
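The decision rule of the Gumbel test can be illustrated with a minimal numerical sketch. It is not taken from Lee and Mykland (2008): it assumes a known, constant volatility `sigma` instead of an estimator, uses the normalizing sequences of de Haan and Ferreira's Example 1.1.7 evaluated at 2n, and all function names are hypothetical.

```python
import numpy as np

def gumbel_norming(m):
    """Normalizing sequences (a_m, b_m) for the maximum of m i.i.d. N(0,1) variables."""
    b = np.sqrt(2 * np.log(m) - np.log(np.log(m)) - np.log(4 * np.pi))
    return 1.0 / b, b

def gumbel_jump_test(increments, sigma, alpha=0.05):
    """Return (statistic, critical value, reject?, index of the largest |increment|)."""
    n = len(increments)
    z = np.sqrt(n) * np.abs(increments) / sigma   # normalized absolute increments
    a, b = gumbel_norming(2 * n)                  # 2n accounts for the absolute value
    stat = (z.max() - b) / a
    crit = -np.log(-np.log(1 - alpha))            # (1 - alpha) quantile of the Gumbel law
    return stat, crit, bool(stat > crit), int(np.argmax(z))

rng = np.random.default_rng(0)
n = 3600
increments = rng.normal(0.0, 1.0 / np.sqrt(n), size=n)  # Brownian increments, sigma = 1
increments[1234] += 0.5                                  # one artificial jump (assumption)
stat, crit, reject, idx = gumbel_jump_test(increments, sigma=1.0)
print(reject, idx)
```

The returned index is the argmaximum, which serves as the estimate of the jump time in case of a rejection.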
The next paragraph advances research on high-frequency jump tests, contributing alternative methods based on extreme value theory which have some advantages compared to the Gumbel test. For the construction, we make an excursion to the nice, classical theory of order statistics. The exponential distribution, Exp(λ), with rate λ > 0 plays a central role here, and Rényi's representation is a key result about its order statistics.
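Lemma 1. Let $(E_j)_{1\le j\le n}$ be i.i.d. Exp(1). The equality in distribution
$$\big(E_{(1)}, E_{(2)}, \dots, E_{(n)}\big) \overset{d}{=} \Big(\frac{E_1}{n},\ \frac{E_1}{n} + \frac{E_2}{n-1},\ \dots,\ \sum_{j=1}^{n} \frac{E_j}{n-j+1}\Big)$$
holds. This shows that differences of subsequent order statistics of i.i.d. Exp(1) random variables are independent and $E_{(j)} - E_{(j-1)} \sim \mathrm{Exp}(n-j+1)$, $1 \le j \le n$, with $E_{(0)} := 0$. While the proof is typically based on an elementary change of variables, the result is deeply rooted in the characteristic memorylessness property of the exponential distribution: Conditional on an event $\{E_1 > t\}$, $t > 0$, the tail probability
$$P(E_1 > t + s \mid E_1 > t) = \frac{e^{-(t+s)}}{e^{-t}} = e^{-s}, \quad s \ge 0,$$
shows that the conditional distribution of $E_1 - t$ is again Exp(1). The exponential distribution is moreover min-stable, i.e., $E_{(1)} \sim \mathrm{Exp}(n)$, since
$$P(E_{(1)} > x) = P(E_1 > x)^n = e^{-nx}, \quad x \ge 0.$$
Due to the memorylessness, the difference $(E_{(2)} - E_{(1)})$ is $\mathrm{Exp}(n-1)$-distributed as the minimum of $(n-1)$ independent Exp(1) random variables, and Lemma 1 follows by induction. Working with order statistics of general i.i.d. random variables, we typically apply transformations to the exponential distribution to exploit Rényi's representation.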
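Rényi's representation is easy to check by simulation. The following sketch (sample sizes and seed are arbitrary choices) rescales the spacings of sorted Exp(1) samples by n, n-1, ..., 1; according to Lemma 1 the rescaled spacings should behave like i.i.d. Exp(1) variables, so their means should be close to 1 and their empirical correlations close to 0.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 5, 200_000

# Each row holds the order statistics E_(1) <= ... <= E_(n) of one sample.
ordered = np.sort(rng.exponential(1.0, size=(reps, n)), axis=1)

# Spacings E_(1), E_(2) - E_(1), ..., E_(n) - E_(n-1).
spacings = np.diff(ordered, axis=1, prepend=0.0)

# Multiply the j-th spacing by (n - j + 1); by Lemma 1 these are i.i.d. Exp(1).
weights = n - np.arange(n)
normalized = spacings * weights

print(normalized.mean(axis=0))                         # all entries near 1
print(np.corrcoef(normalized, rowvar=False).round(3))  # near the identity matrix
```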
We build up our methods on the joint asymptotic distribution of the extreme order statistics.
This is a special case of Thm. 2.1.1 from de Haan and Ferreira (2006) for distributions in the MDA of a Gumbel distribution. If $X_j \sim N(0,1)$, the sequences $(a_n)$ and $(b_n)$ coincide with the ones from (16). In particular, $-\log(E_1)$ has a Gumbel distribution. The main ingredient of the proof of Proposition 3.1 is the convergence in distribution for $(\tilde E_j)$ i.i.d. Exp(1), which is directly implied by Rényi's representation. With a standard analytical condition for extreme value convergence, a change of variables and an (extended) continuous mapping theorem, this yields the result.

Footnote: I expect that Alfréd Rényi is well known to many readers. A list of his various contributions to number theory, probability, analysis and to many more mathematical fields, in memoriam of him, is given in Revesz and Vincze (1972). Let me highlight the eminent importance of Rényi's representation from Rényi (1953). It is often exploited, e.g., for the asymptotic analysis of estimators of the extreme value index, various statistical methods and for the presented test in the summarized recent area of research.
Based on Proposition 3.1, we derive the convergence (18). This motivates an interesting alternative to the Gumbel test: a test based on differences of ordered normalized increments, related to the distribution of $(X_{(n)} - X_{(n-r)})$, e.g., for $r = 1$. Under jumps from a distribution with a Lebesgue density, such a test attains analogous asymptotic properties. The asymptotic distribution under the null hypothesis is, however, simpler, as it does not require the sequence $(b_n)$ any more. In view of an arduous discussion about the finite-sample fit of different, asymptotically equivalent variants of $(b_n)$, and since incorrect normalizations of the Gumbel test led to some problems in the applied literature, cf. Nunes and Ruas (2024), the advantage of getting rid of $(b_n)$ in determining critical values of a jump test should not be underestimated. More reasons to explore this path lie in the beauty of the joint limit distribution of differences of extreme order statistics, again related to Rényi's representation, and in an improved, simplified detection of jumps under the alternative hypothesis. The latter implies a practical improvement compared to a sequential application of the Gumbel test, which I expect to be of relevance for the current analysis of high-frequency data. We establish the main result along three auxiliary lemmas on the limit distributions in Proposition 3.1 and (18). They are suitable as exercises in courses on probability and analysis.
Our first auxiliary result shows that interesting transformations of exponential random variables yield again exponential distributions.
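Lemma 2. For $(E_j)_{1\le j\le N}$ i.i.d. Exp(1), it holds that
$$U := \log\Big(1 + \frac{E_{r+1}}{\sum_{j=1}^{r} E_j}\Big) \sim \mathrm{Exp}(r), \tag{19}$$
for any $r$, $1 \le r \le N-1$.
Proof. For some non-negative, independent random variables $X$ and $Y$, with Lebesgue densities $f_X$ and $f_Y$, the change of variables yields the Lebesgue density
$$f_{X/Y}(z) = \int_0^\infty y\, f_X(zy)\, f_Y(y)\, dy, \quad z \ge 0,$$
of the ratio $X/Y$. With the density of the Gamma(r,1) distribution of $\sum_{j=1}^{r} E_j$, i.e., the $r$th convolution of Exp(1), and independence, we obtain for the density $g$ of $E_{r+1}/\sum_{j=1}^{r} E_j$:
$$g(z) = \int_0^\infty y\, e^{-zy}\, \frac{y^{r-1} e^{-y}}{(r-1)!}\, dy = \frac{1}{(r-1)!} \int_0^\infty y^{r} e^{-(z+1)y}\, dy = \frac{r}{(1+z)^{r+1}}, \quad z \ge 0.$$
The last identity is implied by the known moments of an exponential distribution with $\lambda = (z+1)$. Since $z \mapsto \log(1+z)$ has inverse $u \mapsto \exp(u) - 1$, with derivative $\exp(u)$, a change of variables yields that $U$ has the density
$$f_U(u) = g(e^{u} - 1)\, e^{u} = r e^{-ru}, \quad u \ge 0.$$
Hence, $U \sim \mathrm{Exp}(r)$.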
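A quick Monte Carlo check of this lemma can be sketched as follows. It assumes that the transformation in (19) is $U = \log(1 + E_{r+1}/\sum_{j\le r} E_j)$, as the proof suggests; the function name, seed and sample sizes are arbitrary. For an Exp(r) variable, the mean is 1/r and the tail probability at 0.5 is exp(-0.5 r).

```python
import numpy as np

def simulate_u(r, reps, rng):
    """Draw reps copies of U = log(1 + E_{r+1} / (E_1 + ... + E_r)) with E_j i.i.d. Exp(1)."""
    e = rng.exponential(1.0, size=(reps, r + 1))
    return np.log1p(e[:, r] / e[:, :r].sum(axis=1))

rng = np.random.default_rng(2)
for r in (1, 2, 5):
    u = simulate_u(r, 200_000, rng)
    # Columns: r, empirical mean, 1/r, empirical P(U > 0.5), exp(-0.5 r)
    print(r, u.mean(), 1 / r, np.mean(u > 0.5), np.exp(-0.5 * r))
```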
I find it even more interesting that, although the same random variables enter the transformation (19) for different r, we have an independence based on the next two auxiliary lemmas.
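Lemma 3. For $E_1, E_2, E_3$ i.i.d. Exp(1), the random variables $U = E_1/(E_1+E_2)$, $V = (E_1+E_2)/(E_1+E_2+E_3)$ and $W = E_1+E_2+E_3$ are independent.
Proof. We show that the joint density equals the product of the marginal densities. Elementary computations yield the Jacobian of the inverse map $(u,v,w) \mapsto (uvw,\ (1-u)vw,\ (1-v)w)$ and its determinant $vw^2$. Based on a (multivariate) change of variables and with the product exponential density of $(E_1, E_2, E_3)$, we obtain the joint density
$$f_{U,V,W}(u,v,w) = e^{-w}\, v\, w^{2}, \quad u, v \in (0,1),\ w > 0.$$
Since this equals the product of the marginal densities $f_W(w) = w^2 e^{-w}/2$, $w > 0$, of the Gamma(3,1) distribution of $(E_1+E_2+E_3)$, $f_V(v) = 2v$, $v \in (0,1)$, and $f_U(u) = 1$, $u \in (0,1)$, we conclude the independence.
Transformations of independent random variables remain independent. For the general conclusion, we only need to extend Lemma 3, which can be done by induction.
Lemma 4. For $E_1, \dots, E_{r+1}$, $r \in \mathbb{N}$, i.i.d. Exp(1), the random variables $E_1/(E_1+E_2), \dots, (\sum_{j=1}^{r} E_j)/(\sum_{j=1}^{r+1} E_j)$, and $(\sum_{j=1}^{r+1} E_j)$ are independent.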
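Proof. Write $U_j = (\sum_{i=1}^{j} E_i)/(\sum_{i=1}^{j+1} E_i)$, $1 \le j \le r$, and $U_{r+1} = \sum_{i=1}^{r+1} E_i$. From the inverse map
$$e_1 = \prod_{i=1}^{r+1} u_i, \qquad e_j = (1 - u_{j-1}) \prod_{i=j}^{r+1} u_i, \quad 2 \le j \le r+1,$$
we infer by induction the Jacobian.
We write $A_{ij}$ for the entry in the $i$th row and $j$th column of some matrix $A$. Based on a Laplace expansion with respect to the last row, we obtain the determinant $\prod_{j=2}^{r+1} u_j^{\,j-1}$. With a telescoping sum in the exponent, similar as in the proof of Lemma 3, we obtain the joint density $g$ of $U_1, \dots, U_{r+1}$:
$$g(u_1, \dots, u_{r+1}) = e^{-u_{r+1}} \prod_{j=2}^{r+1} u_j^{\,j-1}, \quad u_1, \dots, u_r \in (0,1),\ u_{r+1} > 0,$$
which equals the product of the marginal densities $f_{U_{r+1}}(u_{r+1}) = e^{-u_{r+1}} u_{r+1}^{r}/r!$, and $f_{U_j}(u_j) = j \cdot u_j^{\,j-1}$, $j \in \{1, \dots, r\}$.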
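The independence stated in Lemma 4 can be probed numerically. The sketch below (the function name and the choice r = 3 are illustrative) computes the ratios of consecutive partial sums together with the total sum, and compares the empirical means of the ratios with j/(j+1), the mean of the density $j u^{j-1}$ on (0,1). Vanishing empirical correlations are of course only a necessary condition for independence, not a proof.

```python
import numpy as np

def ratio_transform(e):
    """Map rows (E_1, ..., E_{r+1}) to (S_1/S_2, ..., S_r/S_{r+1}, S_{r+1}), S_j = E_1 + ... + E_j."""
    s = np.cumsum(e, axis=1)
    return np.column_stack([s[:, :-1] / s[:, 1:], s[:, -1]])

rng = np.random.default_rng(3)
r = 3
u = ratio_transform(rng.exponential(1.0, size=(200_000, r + 1)))

print(u[:, :r].mean(axis=0))                  # compare with 1/2, 2/3, 3/4
print(u[:, r].mean())                         # Gamma(r+1, 1) has mean r + 1
print(np.corrcoef(u, rowvar=False).round(3))  # off-diagonal entries near 0
```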
Since the right and left tail behavior of the normal distribution are symmetric and since the differences between subsequent extreme order statistics dominate the ones of intermediate order statistics, the auxiliary lemmas and Proposition 3.1 suffice to conclude the main result.
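Theorem 1. For $(X_j)_{1\le j\le n}$ i.i.d. $N(0,1)$, it holds that
$$\lim_{n\to\infty} P\Big(a_n^{-1} \max_{1\le j\le n-1} \big(X_{(j+1)} - X_{(j)}\big) \le x\Big) = \prod_{j=1}^{\infty} \big(1 - e^{-jx}\big)^{2}, \quad x > 0. \tag{21}$$

Figure 2: Histograms of the statistics for n = 3,600 from 1,000,000 Monte Carlo iterations and the densities of their asymptotic Gumbel, exponential and Deheuvels distributions.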
The result combines (18) with the non-obvious, asymptotic independence of the differences. Note that the cdf of the maximum of independent random variables equals the product of their cdfs. Since the differences are not identically distributed, the limit distribution does not belong to the class of standard extreme value distributions for maxima of i.i.d. sequences. Nevertheless, the limit cdf is remarkably simple and intuitive, which I was not aware of before exploring this path. After (re-)discovering this result, I expected that it had been discussed in the literature, and it was not difficult to find it as Thm. 1 in Deheuvels (1985). With the main focus on a related law of the iterated logarithm, Deheuvels (1985) provides a rigorous, more technical and less intuitive proof of the convergence (21). I will therefore call the limit the Deheuvels distribution. The square on the right-hand side of (21) is due to the symmetry of the tails. Looking only at one of the tails, we obtain the limit cdf without the square. This is useful when testing for positive and negative jumps separately. In order to compute quantiles based on (21), one can approximate the infinite product by a finite one up to some cut-off, or, even simpler, approximate it by $1 - \exp(-x) - \exp(-2x)$, for $x$ not too small. This approximation exploits a telescoping sum and is very precise for all relevant quantiles.
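Both approximations of the quantiles can be sketched in a few lines. The code assumes, in line with the discussion above, that the one-sided limit cdf is the infinite product $\prod_{j\ge 1}(1 - e^{-jx})$ and that the cdf in (21) is its square; the cut-off of 200 factors, the bisection routine and all names are illustrative choices rather than the author's implementation.

```python
import numpy as np

def product_cdf(x, terms=200):
    """Truncated product prod_{j=1}^{terms} (1 - exp(-j x)) for x > 0 (one tail)."""
    j = np.arange(1, terms + 1)
    return float(np.prod(1.0 - np.exp(-j * x)))

def deheuvels_cdf(x, terms=200):
    """Two-sided limit cdf, assumed to be the square of the one-sided product."""
    return product_cdf(x, terms) ** 2

def simple_approx(x):
    """Approximation 1 - exp(-x) - exp(-2x) of the one-sided product."""
    return 1.0 - np.exp(-x) - np.exp(-2.0 * x)

def quantile(p, cdf, lo=1e-6, hi=50.0, tol=1e-10):
    """Invert a continuous, increasing cdf by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for p in (0.90, 0.95, 0.99):
    q = quantile(p, deheuvels_cdf)
    print(p, q, product_cdf(q), simple_approx(q))
```

The last two columns compare the truncated product with the simple approximation at the computed quantiles.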
Figure 2 compares for $(X_j)_{1\le j\le n}$ i.i.d. $N(0,1)$ histograms of the statistics for finite sample size n = 3,600, corresponding to one price observation per second over one hour, based on a Monte Carlo simulation with 1,000,000 iterations, to the densities of the limit standard Gumbel, standard exponential and Deheuvels distributions. The derivative of the infinite product not having a nice closed form, I use a numerical approximation with Richardson's extrapolation to evaluate the density.
Crucial for the test is the precision of the fit in the high quantiles. We illustrate it based on our Monte Carlo simulation in Figure 3, plotting empirical (90 + j)% percentiles, 0 ≤ j ≤ 9, against their theoretical asymptotic counterparts. As common in quantile-quantile (q-q) plots, we draw a diagonal line, and the closer the points are to the diagonal, the better the fit by the limit distribution. We see that all three limit distributions fit the empirical, finite-sample distributions reasonably well. In fact, the fit for the differences of order statistics is better than that of the Gumbel distribution. I did, however, not try different variants of $(b_n)$ here, which could further improve the Gumbel approximation, cf. Nunes and Ruas (2024).

Figure 3: Q-q plots with (90 + j)% percentiles, 0 ≤ j ≤ 9, of the statistics for n = 3,600 from 1,000,000 Monte Carlo iterations compared to the asymptotic Gumbel, exponential and Deheuvels distributions.
We finish this section with our new Rényi test for jumps.Based on where D is the Deheuvels distribution and ∆ n X = n 1/2 (∆ n 1 X/σ 1/n , . . ., ∆ n n X/σ 1 ) the vector of normalized increments, we reject the null if the statistic left-hand side exceeds the (1 − α) quantile of the Deheuvels distribution.The test has asymptotic level α and achieves the same rate of convergence as the Gumbel test.
In order to detect several jumps, the Gumbel test can be performed sequentially. In case of rejection, the time of the largest jump is estimated with the argmaximum. After discarding the largest absolute increment, the test is applied again. In case of another rejection, the next jump time is estimated. This is iterated until the test does not reject any more. For the Rényi test, there is a similar sequential application. In case of rejection, however, we can readily ascribe all increments above or below the maximal difference of the order statistics to jumps. Since the maximum can be taken between several increments which contain jumps, we nevertheless apply another test, which may be based on (18).

Is volatility rough?
A fractional Brownian motion (fBm), $(B_t^H)_{t\ge 0}$, with Hurst exponent $H \in (0,1)$, is a Gaussian process with continuous paths uniquely determined by $E[B_t^H] = 0$ for all $t$, and
$$E[B_t^H B_s^H] = \tfrac{1}{2}\big(t^{2H} + s^{2H} - |t-s|^{2H}\big), \quad s, t \ge 0.$$
$(B_t^H)$ has stationary Gaussian increments $(B_t^H - B_s^H \sim N(0, |t-s|^{2H}))$, which are positively correlated for $H > 1/2$, and negatively correlated for $H < 1/2$. Except for the case of a standard Brownian motion when $H = 1/2$, increments are thus not independent and $(B_t^H)$ is not a Markov process and also not a semimartingale. The fBm is self-similar with index $H$ given by the Hurst exponent, such that $a^{-H} B_{at}^H$ is distributed as $B_t^H$ for all $a > 0$. Interested readers find a nice survey about fBm in Nourdin (2012). Harold Edwin Hurst was in fact not a mathematician, but a British hydrologist who empirically found long-range dependence in a time series of his measurements of the water level in the Nile river. Long-range dependence refers to a high degree of persistence in the data and, after fBm was introduced by Mandelbrot and Van Ness (1968), it can be modelled by a fBm with large Hurst exponents. Such long memory was attributed in finance to volatility processes, and Comte and Renault (1998) suggested a fractional Ornstein-Uhlenbeck process, with $H > 1/2$, as a model for the log-volatility. The Hurst exponent determines at the same time the regularity of the process in Assumption 1, and by the Kolmogorov-Chentsov continuity theorem the paths are Hölder continuous for any index strictly smaller than $H$. A recent strand of literature considers a rough fractional stochastic volatility model built on the same kind of processes but with small Hurst exponents $H < 1/2$. This development was initiated by Gatheral et al. (2018) and is mainly motivated by empirical evidence. It is important to point out that related literature is looking at volatility processes over longer time periods and not on an intra-daily basis over, e.g., just one single day. The strategy of Gatheral et al.
(2018) is to consider a time series of realized volatilities based on high-frequency, intra-daily data over some longer period.Modelling integrated volatilities, or realized volatilities directly, by a fractional process, the latent volatility becomes observable, either directly or with negligible noise from the estimation.Based on σ j∆ , 0 ≤ j ≤ n, they study the statistics The idea is to perform linear regressions what we motivate here differently than in Gatheral et al. (2018).Based on the defining properties of fBm above, we see for some time step ∆ and l, k with Z ∼ N (0, 1).This already resembles the model equation of a linear model, i.e., a linear function of log(l∆) with slope H. Having observations σ j∆ , 0 ≤ j ≤ n, we compute ( 22) over different coarser grids, or equivalently with log(σ k∆ ) − log(σ (k−l)∆ ), 1 ≤ l ≤ L, up to some L ∈ N, and regress m(q, l∆) on log(l∆) to estimate intercept and slope with a simple linear regression.If (log(σ t )) was a fBM, or as well if it was a more general fractional process, we expect to find q • H as the slope in these regressions.This and also more refined estimators of the Hurst exponent yield in several empirical studies of financial data similar results with Hurst exponents smaller than 0.2.The data sets from the Oxford-Man Institute used for illustrations in Gatheral et al. (2018) are unfortunately not available any more.We replicate the same behavior of statistics m(q, ∆) as in Figures 5-7 of Gatheral et al. (2018) for a time series of 7021 quasi maximum likelihood daily volatility estimates from 1996 to 2023, based on the method by Xiu (2010), inferred from intra-day high-frequency trade prices of the S&P 500 market ETF.The data is constructed from the Risk Lab on Dacheng Xiu's website4 and the S&P 500 is certainly a very relevant financial index.It is not important if we insert σ 2 j∆ , or a square root σ j∆ , in ( 22).The definition without square is taken from Gatheral et al. 
(2018), but we insert the estimates of squared volatility.The left plot in Figure 4 illustrates the linear regressions for q = j/2, 1 ≤ j ≤ 4. The points give the computed statistics.The statistics m(q, l∆), 1 ≤ l ≤ 100, look as a function of l indeed almost perfectly logarithmic.This is confirmed by the good fit of the linear functions in the plot.The right-hand side of Figure 4 compares the estimated slope, called ζ q in Gatheral et al. (2018), along different values of q.From this illustration, we see the estimate Ĥ ≈ 0.16 for this data.Again, we find empirical evidence for a small Hurst exponent fitting a fBm to the log-volatilities.Moreover, our data shows pronounced negative empirical autocorrelations which further indicates small Hurst exponents and would contradict large ones.
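The regression approach can be tried out on simulated data. The sketch below does not use the S&P 500 estimates; instead it simulates an fBm path as a stand-in for the log-volatility (exact simulation via a Cholesky factor of the covariance of the increments), computes averages of $|\cdot|^q$ over lagged differences and regresses their logarithms on the log-lags. The function names, the path length, H = 0.15 and the maximal lag are assumptions chosen for illustration.

```python
import numpy as np

def simulate_fbm(n, hurst, rng):
    """Exact fBm on the grid 0, 1, ..., n via Cholesky of the increment covariance."""
    k = np.arange(n)
    gamma = 0.5 * (np.abs(k + 1) ** (2 * hurst) - 2 * np.abs(k) ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))       # autocovariance of increments
    cov = gamma[np.abs(k[:, None] - k[None, :])]          # Toeplitz covariance matrix
    increments = np.linalg.cholesky(cov) @ rng.standard_normal(n)
    return np.concatenate(([0.0], np.cumsum(increments)))

def m_stat(path, q, lag):
    """Average of |path[k + lag] - path[k]|^q over all admissible k."""
    return np.mean(np.abs(path[lag:] - path[:-lag]) ** q)

def estimate_slope(path, q, max_lag):
    """Slope of the regression of log m(q, lag) on log(lag); expected to be near q * H."""
    lags = np.arange(1, max_lag + 1)
    log_m = np.log([m_stat(path, q, lag) for lag in lags])
    slope, _intercept = np.polyfit(np.log(lags), log_m, 1)
    return slope

rng = np.random.default_rng(11)
log_vol = simulate_fbm(2000, 0.15, rng)        # stand-in for log-volatilities
for q in (0.5, 1.0, 1.5, 2.0):
    print(q, estimate_slope(log_vol, q, 20))   # slopes roughly proportional to q
h_hat = estimate_slope(log_vol, 2.0, 20) / 2.0
print(h_hat)
```

Dividing the fitted slope by q gives an estimate of the Hurst exponent; plotting the slopes against q corresponds to the right-hand plot in Figure 4.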
This new rough volatility paradigm already stimulated a considerable body of research, beyond the high-frequency literature, for instance, on financial implications, in Bayer et al. (2019) and Horvath et al. (2020).The main motivation from econometrics to use this model is that it facilitates improved volatility forecasting, see Wang et al. (2024), among others.Having a Gaussian process, optimal prediction is feasible and given by conditional expectation.While the puzzle of rough volatility vs. volatility persistence is now to a large extent -but not yet fully -understood, forecasting mainly exploits a correlation structure.From this point of view, large Hurst exponents and very small ones could both favour a similar good performance of prediction, while the opposite is the case for values close to 1/2.The application of rough volatility for forecasting uses the continuous-time model rather as a substitute of time series models over longer periods, where the latency of volatility is less crucial than within the framework of intra-daily high-frequency observations.The question if we can infer the Hurst exponent, or more general the regularity α from Assumption 1, based on observations of the log-price (X j∆n ) is nevertheless of great theoretical interest.Given its crucial role in spot volatility estimation in Section 2, it is moreover practically relevant.
The important question for rough volatility, if this is true also in case that α < 1/2, is confirmed in the recent work Chong et al. (2024b).Estimation methods and asymptotic confidence are furthermore established in the companion work Chong et al. (2024a).This is shown for models in that the log-volatility follows a fractional process of similar nature as fBm.The Hurst exponent α is in this case not only the regularity from Assumption 1, but determines also the inter-dependence structure (persistence) and more.In joint work with Moritz Jirak, we are interested in the question, if the regularity α can be identified from high-frequency log-prices (X j∆n ) in the more general case.Since for direct observations, the rates are the same, and most estimators for the Hurst exponent in this framework are in fact constructed to assess the regularity, this could be expected.However, we obtain a rather negative result with the following lower bound.We impose regularity α in ( 23) and that the process exploits this regularity in the sense of a lower and an upper bound.It is clear that only the upper bound from Assumption 1 is not a suitable condition when we aim to estimate α, since, e.g., constant functions satisfy this for any α.
Theorem 2. Suppose that positive constants c σ and C σ exist, such that for s, t ≥ 0: The minimax lower bound for estimation of α is determined by That is, for any sequence of estimators αn of the true parameter α 0 , r n gives a lower bound on the rate with that the minimax risk decreases in n.
The proof is provided in Section 7. Lower bounds for minimax rates typically rely on statistical groundwork by Tsybakov (2008). We exploit techniques and results from Tsybakov (2008) for the proof of Theorem 2, and our construction mimics one used in Bibinger et al. (2017) for a related, different lower bound pertaining to change-points of α. In our model, different from the one of Chong et al. (2024b), α determines only the regularity. The proof of the lower bound utilizes a sub-model which does not have the dependence structure of fBm. Since lower bounds extend to supersets but not to subsets, it does not apply to the more specific model with a fBm and is hence not in conflict with the result from Chong et al. (2024b). In particular, we obtain $n^{-1/2+2\alpha}$, for α < 1/4, as a lower bound. This shows that a consistent estimator only exists for $\alpha_0 < 1/4$! For $\alpha_0$ close to zero we get close to the standard parametric rate $n^{-1/2}$. Both rates from Chong et al. (2024b) and Theorem 2 have in common that the rate hinges on the parameter and is better for smaller values. The comparison reveals that estimating the regularity of a latent volatility and estimating the Hurst exponent under a model with a fractional process are in general different problems. We conclude that estimating the regularity is statistically more difficult.
Let me finish the section with a positive result.We use the stochastic Landau symbols.Assume that α is consistent .Not knowing α, we replace it by α, and call the resulting estimator σ2,ad s .The elementary identities then yield with the results from Section 3 that for two random variables Z 1 and Z 2 : We conclude that σ2,ad s attains the same optimal rate of convergence as the estimator which exploits known α.

Limit order microstructure noise
While we modelled high-frequency log-prices so far as discretizations of continuoustime stochastic processes, when having available data from a limit order book, there is not only one price at some given time.Figure 5 gives a snapshot of price dynamics of the Apple asset traded at Nasdaq over a 10 minutes time interval.We use Nasdaq data from Lobster.5 A blue line shows the evolution of the best ask price, that is, the lowest price at which someone is willing to sell the asset.A red line shows the best bid price, that is, the highest price someone is offering to buy the asset.In between there is a bid-ask spread.The many points above the best ask illustrate many other active ask-limit orders and below the best bid active bid-limit orders.Trading usually takes places when market orders arrive with that someone buys or sells the asset at the best available price.These are executed against the available limit orders.For this reason trade prices bounce between the best ask and best bid what makes the illustration of all three in the same plot a bit overfraught.Trade prices are plotted in Figure 5 as black dots.A prevalent concept for market microstructure in financial econometrics is to assume some underlying efficient, semimartingale log-price process in an arbitrage-free market modelling longer-term price dynamics, while high-frequency observations are diluted by an additive market microstructure noise.Therefore, the observation model to account for market microstructure is with an Itô semimartingale (X t ) and noise (ϵ i ).Such a model was proposed in Andersen et al. 
(2000), among others, for trade prices with regular noise and there If we model the prices of (best) ask quotes directly, a natural assumption is that they all lie above the efficient, semimartingale log-price (X t ).Reasons are that ask orders will typically be submitted at prices above the level that is seen as current fair price to make money and they also lie above the trade prices.This leads us to a stochastic boundary model with observations in the epigraph of a semimartingale boundary process.We hence use model ( 25) with Limit Order Microstructure Noise (LOMN) which satisfies that is, Lower-bounded, One-sided Microstructure Noise.The model was introduced in Bibinger et al. (2016).We assume that (ϵ i ) 0≤i≤n is exogenous with a cdf Bid prices are analogously modelled with noise that is upper bounded and both combined in practice.Although Figure 5 shows prices in a discrete image space under a very high time resolution, it is standard to work with the real-valued process (X t ), to perform estimation of the volatility, or other daily quantities.It is then natural to consider continuous noise distributions also.Since our methods use differences between local minima or maxima of the data only, it is not crucial that the boundary of the noise is exactly zero.It can be some unknown constant instead, or even a regular function over time, what is meaningful to include compensation of market processing costs.This possible generalization is one reason why we model ask and bid prices separately in boundary models, e.g., instead of considering noise on a bounded interval.Moreover, a model with noise on an interval would not simplify the statistical problem but rather complicate the situation.Condition ( 27) does not impose a parametric form of the noise.The assumed standard behaviour of the cdf close to the boundary is satisfied by many common distributions, as a uniform distribution on some interval [0, A], A > 0, an exponential distribution as we know from Section 3, 
and a heavy-tailed (shifted) Pareto distribution.Nevertheless, we currently work on generalizations of the model to allow for some general tail index which is 1 in ( 27).The irregular, non-negative noise leads to statistical inference based on local minima instead of local averages which are used under regular noise in the literature.This is motivated by the problem of estimating boundary parameters in parametric statistics.We explain the key idea looking at the prominent example of the taxi problem.An important advantage of LOMN and using order statistics compared to MMN is that no conditions on the right tail of the noise distribution or on the existence of moments of the noise are required.
In our stochastic boundary model we do of course not have a constant boundary to estimate as in the taxi problem, but want to recover a latent semimartingale boundary process. This situation is intricate, but, although the approach appears venturous, we approximate the boundary process as locally constant over small time blocks. From the analogy to the taxi problem, it is then natural to estimate the efficient log-price locally by block-wise minima of the observations $Y_i = X_{t_i^n} + \epsilon_i$,
$$m_{k,n} = \min_{i:\ t_i^n \in (kh_n, (k+1)h_n]} Y_i, \quad 0 \le k \le h_n^{-1} - 1.$$
Let us assume for simplicity equidistant observations again, $t_i^n = i/n$, with $h_n^{-1} \in \mathbb{N}$ the number of blocks and $nh_n \in \mathbb{N}$ the number of observations per block. In our asymptotic high-frequency regime, $h_n \to 0$ and $nh_n \to \infty$, as $n \to \infty$. There is a balanced regime, $h_n \propto n^{-2/3}$, in which the stochastic order of the minimal error over a block and the movement of the boundary process over a block are the same, as
$$\min_{i:\ t_i^n \in (kh_n, (k+1)h_n]} \epsilon_i = O_P\big((nh_n)^{-1}\big) \quad \text{and} \quad \sup_{t \in [kh_n, (k+1)h_n]} |X_t - X_{kh_n}| = O_P\big(h_n^{1/2}\big).$$
Based on local minima in this balanced regime, a rate-optimal estimator of the integrated volatility has been established in Bibinger et al. (2016).
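To make the block-wise minima concrete, here is a minimal simulation sketch in the balanced regime. It assumes a Brownian motion with constant volatility as efficient log-price and exponentially distributed one-sided noise; the values of `sigma` and `eta` and the sample size are hypothetical choices, and the code only computes the local minima, not the volatility estimator of Bibinger et al. (2016).

```python
import numpy as np

rng = np.random.default_rng(4)
blocks, per_block = 900, 30      # n = 27000 observations, h_n = 1/900 = n^(-2/3)
n = blocks * per_block
sigma, eta = 1.0, 50.0           # hypothetical volatility and noise level

# Efficient log-price on the grid i/n and one-sided (non-negative) noise.
x = np.cumsum(sigma * rng.normal(0.0, np.sqrt(1.0 / n), size=n))
eps = rng.exponential(1.0 / eta, size=n)   # cdf behaves like eta * x near zero
y = x + eps                                # observed ask quotes above the boundary

block_min_y = y.reshape(blocks, per_block).min(axis=1)   # local minima of the data
block_min_x = x.reshape(blocks, per_block).min(axis=1)   # infeasible benchmark

# The local minima of the observations never fall below those of the boundary.
print(np.mean(block_min_y - block_min_x))
```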

Taxi problem
Imagine you go for a walk in New York City and notice a lot of the famous yellow cab taxis. You're wondering how many there are in total. Fortunately, the yellow cabs are labeled with consecutive integers on their engine covers. So you can note the numbers you see during your walk and then estimate the unknown maximal number based on your sample.⁶ Consider the similar problem of estimating the upper boundary θ from i.i.d. uniformly $U([0,\theta])$-distributed random variables $X_1, \dots, X_n$ on the interval $[0,\theta]$. A very successful (though not in this example) and convenient construction for a point estimator is the method of moments: Since $X_1$ has expectation θ/2, and the sample average $\bar X_n$ is a good estimator for an expectation, set $\hat\theta_{MM} = 2\bar X_n$. This estimator is unbiased, converges almost surely to θ, and satisfies a central limit theorem based on which asymptotic confidence intervals can be obtained. Looking at the likelihood $L(\theta; x_1, \dots, x_n) = \theta^{-n} 1\{\theta \in [x_{(n)}, \infty)\}$, i.e., the product density as a function of θ, however, tells statisticians that the maximum $X_{(n)}$ is a sufficient statistic. That means $X_{(n)}$ preserves all information about θ we have from $X_1, \dots, X_n$. Therefore, during your walk you do not need to take notes with all observed numbers, but only remember the largest number that you observed. Based on the likelihood, we obtain the maximum likelihood estimator: $\hat\theta_{ML} = X_{(n)}$. We obtain that
$$n\big(\theta - \hat\theta_{ML}\big) \xrightarrow{d} \mathrm{Exp}(1/\theta), \quad \text{since} \quad P\big(n(\theta - X_{(n)}) > x\big) = \Big(1 - \frac{x}{n\theta}\Big)^{n} \to e^{-x/\theta}, \quad x \ge 0.$$
Once more in this article we get an exponential limit distribution! The rate is much faster than for $\hat\theta_{MM}$, but $\hat\theta_{ML} < \theta$ is obviously biased. Since $X_{(n)}$ is not only sufficient, but moreover complete, statisticians know that the associated Rao-Blackwell improvement of some unbiased $L^2$-estimator yields the unbiased estimator with uniformly smallest variance (umvu) by Lehmann-Scheffé. Since $\hat\theta_{MM}$ is unbiased, let us determine its Rao-Blackwell improvement: With probability 1/n we have $X_1 = X_{(n)}$, and otherwise, conditional on $X_{(n)}$, $X_1 \sim U([0, X_{(n)}])$. We hence obtain the umvu estimator
$$\hat\theta_{umvu} = E\big[2\bar X_n \mid X_{(n)}\big] = 2\Big(\frac{1}{n} X_{(n)} + \frac{n-1}{n} \cdot \frac{X_{(n)}}{2}\Big) = \frac{n+1}{n} X_{(n)}.$$
Just as the estimators based on the maximum in the taxi problem converge faster than with the standard rate, Bibinger et al. (2016) proved that their estimator attains an optimal rate $n^{-1/3}$, at which the root mean squared error tends to zero, which improves upon the well-known standard rate $n^{-1/4}$ for regular noise.
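The three estimators of the taxi problem are easily compared in a small simulation. The sketch below uses the continuous uniform model on [0, θ]; the values θ = 1000 and n = 100, the number of repetitions and the function name are arbitrary illustrative choices.

```python
import numpy as np

def taxi_estimators(samples):
    """Row-wise method of moments, maximum likelihood and umvu estimates of theta."""
    n = samples.shape[1]
    maximum = samples.max(axis=1)
    return 2.0 * samples.mean(axis=1), maximum, (n + 1) / n * maximum

rng = np.random.default_rng(6)
theta, n, reps = 1000.0, 100, 20_000
samples = rng.uniform(0.0, theta, size=(reps, n))

mm, ml, umvu = taxi_estimators(samples)
rmse = {name: float(np.sqrt(np.mean((est - theta) ** 2)))
        for name, est in (("MM", mm), ("ML", ml), ("umvu", umvu))}
bias = {name: float(np.mean(est) - theta)
        for name, est in (("MM", mm), ("ML", ml), ("umvu", umvu))}
print(rmse)
print(bias)
```

In such a run, the root mean squared errors of the two maximum-based estimators should be several times smaller than that of the method of moments, and the bias of the maximum likelihood estimator should be close to -θ/(n+1).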
Lower bounds for the rate and the asymptotic variance under regular noise in the parametric case were established by Gloter and Jacod (2001).The distribution of local minima in the balanced regime is, however, involved which yet limited available results.In particular, in Bibinger et al. (2016) we could not provide asymptotic confidence for the integrated volatility.The article Bibinger (2024) contributes a step forward in this direction and extends the probabilistic theory required to work with the boundary model.For the tail function of local minima, we conclude with conditioning, ( 25), ( 27), a Taylor expansion and dominated convergence that for all x < 0, with a standard Brownian motion (B t ).To work with the integrated negative part of a Brownian motion in the last expression, we exploit and extend results about local time of Brownian motion.One main ingredient of the asymptotic analysis in Bibinger ( 2024) is an expansion of this tail function based on a generalized arcsine law.Here, we focus on a simpler idea which is nevertheless the most important step to approximate the distribution of the local minima.Selecting blocks slightly larger than in the balanced regime, we have nh 3/2 n → ∞ in the exponent, such that the probability tends to zero unless the integral yields zero.This is the case if and only if the event {min 0≤t≤1 B t ≥ x} occurs.In this regime, we hence obtain that The distribution of the minimum of a Brownian motion over the interval [0, 1] is remarkably simple.This is due to the reflection principle connected with the strong Markov property of (B t ).We derive with the reflection principle from the above approximation that for x < 0, since P min 0≤t≤1 B The distribution of |Z|, for Z ∼ N (0, 1), is called half-normal distribution.Since our limit is distributed as the product σ khn |Z| then, we call it mixed half-normal.
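The half-normal law of the minimum can be checked by simulation. By the reflection principle, $P(\min_{0\le t\le 1} B_t \ge x) = 1 - 2P(B_1 < x) = P(|Z| \le -x)$ for $x < 0$ and $Z \sim N(0,1)$. The sketch below approximates the Brownian motion by a random walk on a grid, so the simulated minimum is slightly too large and the empirical probabilities are slightly biased upwards; the grid size and the number of paths are arbitrary choices.

```python
import math
import numpy as np

rng = np.random.default_rng(5)
paths, steps = 5000, 1000

# Discretized Brownian paths on [0, 1]; B_0 = 0 is included in the minimum.
incr = rng.normal(0.0, math.sqrt(1.0 / steps), size=(paths, steps))
running_min = np.minimum(np.cumsum(incr, axis=1).min(axis=1), 0.0)

def half_normal_cdf(a):
    """P(|Z| <= a) for Z ~ N(0,1) and a >= 0."""
    return math.erf(a / math.sqrt(2.0))

for x in (-0.5, -1.0, -2.0):
    # Columns: x, simulated P(min B >= x), half-normal probability P(|Z| <= -x)
    print(x, np.mean(running_min >= x), half_normal_cdf(-x))
```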
For volatility estimation, with (B t ) and ( Bt ) two independent standard Brownian motions, define Not having any lower bound for the integrated negative part in the above expression, the remainder decays however slowly in n, and we require an asymptotic expansion and a numerical approximation of Ψ n for an estimator with desirable properties.Nevertheless, with K n → ∞, we first consider the simple estimator in case without price jumps and a truncated version which is robust to nuisance jumps.A main result of Bibinger ( 2024) is that under Assumption 1 for , with some constants C K and δ, 0 < δ < 2α/(1 + 2α), the estimator satisfies the stable clt In fact, we use the (approximated) function Ψ n for a bias correction to obtain a clt at optimal rate.The asymptotic variance is derived with the expansion of the tail function based on the joint distribution of minimum and terminal value of a Brownian motion over [0, 1] concluded from the reflection principle.To this end we use one of the most important examples for applications of Fubini-Tonelli in probability that relates moments and the tail function: For some non-negative random variable Z, with distribution P Z , and k ∈ N, it holds true that Integration with respect to the σ-finite probability and Lebesgue measures is exchanged here.Extensions to covariances and real-valued random variables are available and allow us to use the form of the tail function from above.In a recent preprint Bibinger et al. 
(2024), we develop jump detection methods under LOMN including a Gumbel test for jumps.It is based on Based on extreme value theory, we show that under the null hypothesis and for (σ t ) ∈ C α , i.e., Hölder continuous with regularity α, it holds with with Λ the standard Gumbel distribution.Under local alternatives The subscript of the measure is to indicate that we are under H 1 , and the path has at least one jump.Considering local alternatives, the question about which probability space(s) to work on is justified, but not particularly important here, since we can simply consider the distributions of the statistics directly to avoid an arduous construction.
The main insight of this result is that under LOMN smaller jumps can be identified compared to MMN. While we can detect jumps of size larger than $n^{-1/3}$, only jumps of size larger than $n^{-1/4}$ can be found under MMN. Moreover, working with order statistics to infer jumps has some nice advantages compared to local averages under MMN, where averaging over jump times creates huge problems described as "pulverisation of jumps by pre-averages" by Mykland and Zhang (2016). This is illustrated in Section 2 of Bibinger et al. (2024). One ingredient to show (32) is uniform consistency of the spot volatility estimation, for which we require the continuity of $(\sigma_t)$ under $H_0$. Furthermore, the precise Gumbel convergence for differences between half-normal random variables is determined, since we cannot trace this one back to a standard example of extreme value theory. Our sequence is furthermore not i.i.d., but it is known that Gumbel convergences of maxima hold analogously more generally under weak dependence conditions.

Outlook
In the multi-dimensional framework with a portfolio of d stocks, the volatility process becomes (d × d) matrix-valued. The key role for risk diversification is taken by the covariances rather than by the idiosyncratic volatilities. A co-jump pattern is of interest to separate idiosyncratic and systemic effects, see e.g., Caporin et al. (2017). Since the estimation uncertainty of a (d × d) matrix increases proportionally to d^4 in the dimension, we have our very own curse of dimensionality.
Multivariate ultra high-frequency data are not only subject to market microstructure noise, but the discrete observations moreover arrive at non-synchronous times. Volatility matrix estimation under these peculiarities motivated another strand of research. In Bibinger et al. (2014) we contributed two main insights:
1. In contrast to the case of non-noisy observations, non-synchronicity effects are asymptotically negligible at first order. In combination with noise, the noise prevails.
2. A lower bound for the asymptotic variance-covariance structure of volatility matrix estimation reveals that the multivariate model allows improved estimates, also of idiosyncratic volatilities.
The first result is based on an asymptotic equivalence between a continuous-time and the discrete-time observation model. Asymptotically equivalent experiments provide the same amount of information about unknown quantities, which can hence be estimated with the same precision in both situations. If one model is simpler than the other or already well explored, this is very useful, also because statistical methods can be transferred. The efficiency gains from a multivariate model for the estimation of a single volatility arise when assets are correlated and observed with uncorrelated noise.

The picture on boundary estimation sketched by the taxi problem is still incomplete, since the rate of convergence heavily depends on the behaviour of the cdf close to the boundary. For instance, for a triangle distribution, which is the convolution square of the uniform distribution, the rate is only √n instead of n. Extending the model with a general tail index and its estimation allows for a better calibration of boundary models to limit order quotes. Our first empirical trials indicate that different assets might show different tail behaviours, which is particularly interesting in view of their strong correlations and since a small tail index results in a higher accuracy of volatility estimation. This is a strong motivation to develop a multivariate observation model with limit order microstructure noise. When the estimation uncertainty varies across different stocks, a risk analysis for one stock, e.g., Apple, could be improved using data also from another stock, e.g., Google. Compared to multivariate regular noise, efficiency gains become even more relevant, since they affect the rates of convergence and not only the minimal asymptotic variances at optimal rate.

Forecasts of financial risk can improve considerably when going from a model for a single stock price to a multivariate model; e.g., this was the case for the multivariate GARCH model proposed in Bollerslev et al. (1988). Consequently, if rough volatility provides accurate forecasts, it should be further extended to a multivariate model. This should include possibly different Hurst exponents, which is mathematically challenging and therefore also of theoretical interest for mathematicians.
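Returning to the remark that the rate of boundary estimation depends on the cdf close to the boundary, the following minimal sketch compares the two cases named in the text in a simplified setting of our own choosing: a known boundary at 0, estimated by the minimum of n i.i.d. observations. For uniform observations on [0, 1] the survival function of the minimum is (1 − x)^n, whereas for the triangle distribution on [0, 2] (the sum of two independent uniforms, with cdf x²/2 on [0, 1]) it is (1 − x²/2)^n. The medians of the minimum are available in closed form, so no simulation is needed. The function names are hypothetical.

```python
import math


def median_min_uniform(n: int) -> float:
    """Median of the minimum of n i.i.d. U(0,1) draws: solves (1 - m)^n = 1/2."""
    # 1 - 2^(-1/n), computed in a numerically stable way via expm1.
    return -math.expm1(-math.log(2.0) / n)


def median_min_triangle(n: int) -> float:
    """Median of the minimum of n i.i.d. draws from the triangle law on [0, 2].

    Near the boundary the cdf is x^2 / 2, so the median solves (1 - m^2/2)^n = 1/2.
    """
    return math.sqrt(2.0 * median_min_uniform(n))


for n in (1000, 4000, 16000):
    print(n, median_min_uniform(n), median_min_triangle(n))

# Quadrupling n divides the typical error by about 4 (rate n) for the uniform law,
# but only by about 2 (rate sqrt(n)) for the triangle law.
ratio_uniform = median_min_uniform(1000) / median_min_uniform(4000)
ratio_triangle = median_min_triangle(1000) / median_min_triangle(4000)
print(ratio_uniform, ratio_triangle)
```

The same calculation with a cdf behaving like x^β near the boundary would give an error of order n^{−1/β}, which is one way to see why a general tail index matters for calibrating boundary models.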
Currently, the analysis of high-dimensional high-frequency data is a vibrant research area. This refers to an asymptotic regime in which not only n → ∞, but moreover d → ∞ is considered. In this area, high-frequency statistics is combined with methods from high-dimensional statistics, e.g., LASSO, penalization in general, shrinkage estimation, thresholding of eigenvalues, principal component analysis and sparsity, see e.g., Aït-Sahalia and Xiu (2019), Pelger (2019), Chen et al. (2020), Ledoit and Wolf (2020) and Christensen et al. (2023). In view of strong correlations between most financial assets, factor models appear to be very attractive. These are of the form

\[ dX_t = B^q_t \, dF_t + dZ_t , \qquad [F, Z] \equiv 0 , \qquad \Sigma_t = B^q_t \, \Sigma^S_t \, (B^q_t)^\top + \Sigma^I_t , \]

where the q factors F_t affect all stocks, with B^q_t ∈ R^{d×q}, Σ^S_t ∈ R^{q×q}. The dimension q is kept fixed as d → ∞. The estimation of all components of the model is challenging. Moreover, the rank q has a crucial role and we are interested in testing for a constant rank and detecting changes of q over time. The precision matrix, the inverse of the (integrated) volatility matrix, is the most important object for optimal portfolio allocation. Cai et al. (2020) focus on its estimation from high-dimensional high-frequency data. Since they assume equidistant observations with regular microstructure noise, however, there is room for improvement in future research.
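As a small illustration of how the factor structure helps with the precision matrix, consider the simplest static case: a single factor (q = 1) with variance s2, a loading vector b, and a diagonal idiosyncratic matrix Σ^I = diag(psi). The diagonal form of Σ^I and all names and numbers below are assumptions made for this sketch only. The Sherman-Morrison formula then gives the inverse of Σ = s2 · b bᵀ + diag(psi) in closed form, without a general (d × d) inversion.

```python
def factor_covariance(b, s2, psi):
    """Sigma = s2 * b b^T + diag(psi) for one factor with variance s2."""
    d = len(b)
    return [[s2 * b[i] * b[j] + (psi[i] if i == j else 0.0) for j in range(d)]
            for i in range(d)]


def factor_precision(b, s2, psi):
    """Inverse of Sigma via Sherman-Morrison, using only the diagonal inverse."""
    d = len(b)
    w = [b[i] / psi[i] for i in range(d)]                   # diag(psi)^{-1} b
    denom = 1.0 + s2 * sum(b[i] * w[i] for i in range(d))   # 1 + s2 * b^T diag(psi)^{-1} b
    return [[(1.0 / psi[i] if i == j else 0.0) - s2 * w[i] * w[j] / denom
             for j in range(d)] for i in range(d)]


def matmul(a, c):
    """Plain matrix product of two square lists of lists."""
    d = len(a)
    return [[sum(a[i][k] * c[k][j] for k in range(d)) for j in range(d)]
            for i in range(d)]


# Hypothetical loadings, factor variance and idiosyncratic variances for d = 3 assets.
b = [1.0, 0.8, 1.2]
s2 = 0.04
psi = [0.01, 0.02, 0.015]

sigma = factor_covariance(b, s2, psi)
precision = factor_precision(b, s2, psi)
for row in matmul(sigma, precision):
    print([round(v, 10) for v in row])  # rows of (numerically) the identity matrix
```

For q > 1 the Woodbury identity plays the same role, reducing the problem to the inversion of a (q × q) matrix, which is one reason why a fixed q is convenient as d → ∞.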

Proof of Theorem 2
It suffices to prove that r_n is a lower bound for a specific sub-model contained in our general model, since the lower bound then extends to the general model. A simplified sub-model that preserves the main structure of the estimation problem is the estimation of α ∈ (0, 1] from observations

(34)

where (U_j) are i.i.d. real-valued random variables with a symmetric centered law, E[U_1^{2k−1}] = 0, k ∈ N, independent of (Z_j)_{j≥1}, for which all moments exist. Assuming that the law of U_1 has a Lebesgue density g_U, we obtain by conditioning the following density of P_α with respect to the Lebesgue measure λ:

where addends for odd k vanish by the symmetry of the law of U_1, and the coefficients of the power series are degree 2k polynomials in x, with

x^8/24 − (11/12) x^6 + (41/8) x^4 − 7x^2 + 1 .
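As a quick sanity check on the degree-8 polynomial displayed above, one can verify with exact rational arithmetic that it has mean zero under the standard normal law, using the moments E[Z^2] = 1, E[Z^4] = 3, E[Z^6] = 15 and E[Z^8] = 105. This is what one would expect of a coefficient in an expansion of a probability density around the standard normal density. The helper name is hypothetical.

```python
from fractions import Fraction


def normal_moment(p: int) -> Fraction:
    """E[Z^p] for a standard normal Z: 0 for odd p, (p - 1)!! for even p."""
    if p % 2 == 1:
        return Fraction(0)
    result = Fraction(1)
    for m in range(p - 1, 0, -2):
        result *= m
    return result


# Coefficients of x^8/24 - (11/12) x^6 + (41/8) x^4 - 7 x^2 + 1, keyed by the power of x.
coeffs = {
    8: Fraction(1, 24),
    6: Fraction(-11, 12),
    4: Fraction(41, 8),
    2: Fraction(-7),
    0: Fraction(1),
}

# Expectation of the polynomial under the standard normal law, term by term.
integral = sum(c * normal_moment(p) for p, c in coeffs.items())
print(integral)  # prints 0
```

In eighths, the five terms are 35/8 − 110/8 + 123/8 − 56/8 + 8/8, which indeed sum to zero.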
Naturally, the first addend of dP_α/dλ(x) yields the standard normal density. We see that it holds that

which is one common measure of the distance between the probability measures P_α and P_α̃. χ²(dP_α ∥ dP_α̃) tends to zero when ∆ → 0. In a high-frequency asymptotic regime, ∆ → 0, the last equation yields that .

Figure 1: Log-price (left) in Heston model with 5 jumps and its increments (right).

Figure 5: Bid and ask quotes and trade prices (black dots) for the Apple asset over a 10-minute time interval.

is a vast area of research on this model. Classical regular Market Microstructure Noise (MMN) (ϵ_i)_{0≤i≤n} is i.i.d. with E[ϵ_i] = 0. If a full limit order book is available, it has recently been applied to mid quotes, i.e., averages of best bid and best ask quotes. If we model the prices of (best) ask quotes directly, a natural assumption is that they all lie above the efficient, semimartingale log-price (X_t). Reasons are that ask orders will typically be submitted at prices above the level that is seen as the current fair price in order to make money, and that they also lie above the trade prices. This leads us to a stochastic boundary model with observations in the epigraph of a semimartingale boundary process. We hence use model (25) with Limit Order Microstructure Noise (LOMN) which satisfies

Note that ∫_R C_{2k}(x) e^{−x²/2} dx = 0, for the coefficients k = 2, 4, can be seen by inserting the moments of the standard normal distribution. For two parameters α, α̃ ∈ (0, 1], consider the χ²-divergence

\[ \chi^2\big(dP_\alpha \,\|\, dP_{\tilde\alpha}\big) = \int_{\mathbb{R}} \frac{e^{-x^2/2}}{\sqrt{2\pi}} \, \frac{\Big(\sum_{k\ge 1} C_{2k}(x)\big((\Delta^{\alpha})^{2k} - (\Delta^{\tilde\alpha})^{2k}\big)\Big)^2}{1 + \sum_{k\ge 1} C_{2k}(x)\,(\Delta^{\tilde\alpha})^{2k}} \, dx , \]

Gumbel is an interesting personality for at least two reasons: his work as a researcher on extreme value theory, and his political commitment. Despite his academic contributions, his pacifism, his statistical research which exposed the leniency towards political murders committed by right-wing extremists, and his political activity in general led to his dismissal from the University of Heidelberg in 1932. This troubling chapter in the university's history has recently been critically reflected upon. Aspects of his legacy arouse interest in the recent literature on the history of science, see, e.g., Vogt (2021) and Rendtel et al. (2021) (in German).

Limit distributions of maxima of i.i.d. random variables can only be of Gumbel, Weibull or Fréchet type. If sequences (a_n) and (b_n) exist, such that for a cumulative distribution function (cdf) F the maxima of i.i.d. random variables with cdf F, affinely normalized with (a_n) and (b_n), converge in distribution to Λ, write F ∈ MDA(Λ) (maximum domain of attraction).

On the null hypothesis H_0