1 Introduction

In this article we study the question of whether a data set of univariate real numbers stems from the family of Cauchy distributions. The Cauchy distribution is the standard example of a distribution without an existing mean value; it has been studied for more than 300 years and finds wide applicability in fields ranging from the modeling of resonances in physics (where it is often called the Lorentz or Breit–Wigner distribution) to cryptocurrencies in finance, see Szczygielski et al. (2020). For an extensive historical overview, see Stigler (1974). To be precise, we write C\((\alpha ,\beta )\), \(\alpha \in \mathbb {R}\), \(\beta > 0\), shorthand for the Cauchy distribution with location parameter \(\alpha \) and scale parameter \(\beta \), having density

$$\begin{aligned} f(x,\alpha ,\beta ) = \frac{1}{\pi }\frac{\beta }{\beta ^2+(x-\alpha )^2},\quad x\in \mathbb {R}. \end{aligned}$$
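As a quick sanity check, this density coincides with the one implemented in R's built-in dcauchy; the following minimal R sketch (the function name fCauchy is ours) verifies the agreement numerically.

```r
# Density of C(alpha, beta) as displayed above
fCauchy <- function(x, alpha = 0, beta = 1) {
  beta / (pi * (beta^2 + (x - alpha)^2))
}

# Agreement with R's built-in dcauchy up to floating point error
x <- seq(-10, 10, by = 0.5)
max(abs(fCauchy(x, alpha = 1, beta = 2) - dcauchy(x, location = 1, scale = 2)))
```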

For a detailed discussion on this family, see Johnson et al. (1994), Chapter 16. The Cauchy distribution is a so called heavy tailed distribution and is a member of the stable distributions, see Nolan (2020). Note that \(X \sim \) C\((\alpha , \beta )\) if, and only if, \((X-\alpha )/\beta \sim \) C(0, 1) and hence the Cauchy distribution belongs to the location-scale family of distributions. In the following we denote the family of Cauchy distributions by \(\mathcal {C}:=\{\text{ C }(\alpha ,\beta ):\,\alpha \in \mathbb {R},\beta >0\}\), a family of distributions which is closed under translation and rescaling. We test the composite hypothesis

$$\begin{aligned} H_0:\;\mathbb {P}^X\in \mathcal {C} \end{aligned}$$
(1)

against general alternatives on the basis of independent and identically distributed copies \(X_1,\ldots ,X_n\) of X defined on an underlying probability space \((\Omega ,\mathcal {A},\mathbb {P})\). This testing problem has been considered in the literature: Gürtler and Henze (2000) propose a test procedure based on the empirical characteristic function, and Matsui and Takemura (2005) extend this test by considering alternative estimation methods. More recently, Mahdizadeh and Zamanzade (2017) propose to use the likelihood ratio as in Zhang (2002) as well as the Kullback–Leibler (KL) distance, an idea that is extended in Mahdizadeh and Zamanzade (2019). A quantile based method is proposed in Rublik (2001) and compared to the classical omnibus procedures. An empirical power study of goodness-of-fit tests for the Cauchy model based on the empirical distribution function, such as the Kolmogorov–Smirnov test, the Cramér–von Mises test, the Kuiper test, the Anderson–Darling test and the Watson test, is found in Onen et al. (2001). In Litvinova (2005), two tests for the standard Cauchy distribution based on characterizations are given; however, they are not designed for the composite hypothesis (1).

The novel procedure is based on the following new characterization of the standard Cauchy distribution.

Theorem 1.1

Let X be a random variable with absolutely continuous density p and \(\mathbb {E}\left[ \frac{|X|}{1+X^2}\right] < \infty \). Then X has a Cauchy distribution C(0, 1) if and only if

$$\begin{aligned} \mathbb {E}\left[ \left( it-\frac{2X}{1+X^2}\right) \exp (itX)\right] = 0 \end{aligned}$$
(2)

holds for all \(t\in \mathbb {R}\), where i denotes the imaginary unit.

Proof

For \(X\sim \text{ C }(0,1)\), a direct calculation shows that (2) holds. Conversely, let X be a random variable with absolutely continuous density function p(x) such that

$$\begin{aligned} \mathbb {E}\left[ \left( it-\frac{2X}{1+X^2}\right) \exp (itX)\right] = 0 \end{aligned}$$

holds for all \(t\in \mathbb {R}\). Since \(-it\mathbb {E}[\exp (itX)]\) is the Fourier–Stieltjes transform of the derivative \(p'(x)\) of p(x), we have

$$\begin{aligned} 0=\mathbb {E}\left[ \left( it-\frac{2X}{1+X^2}\right) \exp (itX)\right] =\int _{-\infty }^\infty \left( -p'(x)-\frac{2x}{1+x^2}p(x)\right) \exp (itx)\,\text{ d }x \end{aligned}$$

for all \(t\in \mathbb {R}\). By the uniqueness properties of the Fourier–Stieltjes transform, p(x) must hence satisfy the ordinary differential equation

$$\begin{aligned} p'(x)+\frac{2x}{1+x^2}p(x)=0 \end{aligned}$$

for almost all \(x\in \mathbb {R}\). The only solution satisfying \(\int _{-\infty }^\infty p(x)\text{ d }x=1\) is \(p(x)=f(x,0,1)\), \(x\in \mathbb {R}\), and \(X\sim \text{ C }(0,1)\) follows. \(\square \)
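Before proceeding, a small Monte Carlo illustration of (2) may be instructive (a sketch, not part of the formal development): for Cauchy data, the empirical counterpart of the left-hand side of (2) is close to zero for every t, while for, say, normal data it is clearly bounded away from zero.

```r
set.seed(1)
# Empirical version of |E[(it - 2X/(1+X^2)) exp(itX)]| at a fixed t
emp_lhs <- function(t, x) abs(mean((1i * t - 2 * x / (1 + x^2)) * exp(1i * t * x)))

x <- rcauchy(1e6)                              # X ~ C(0,1)
sapply(c(0.5, 1, 2), emp_lhs, x = x)           # all of order 1e-3 (Monte Carlo error)
sapply(c(0.5, 1, 2), emp_lhs, x = rnorm(1e6))  # visibly nonzero for N(0,1) data
```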

Remark 1.2

The characterization in Theorem 1.1 is related to the spectral representation of the Stein operator of the so-called density approach pioneered in Stein et al. (2004). Note that the quotient in (2) is \(f'(x,0,1)/f(x,0,1)=-2x/(1+x^2)\), \(x\in \mathbb {R}\), and the set of so-called test functions is given by \(\{\exp (itx):\, t\in \mathbb {R}\}\). For details on this approach see Anastasiou et al. (2022), Sect. 5.4.2.

This paper is organized as follows. In Sect. 2, we introduce a family of test statistics, denoted by \(T_{n,a}\), which is based on the characterization in Theorem 1.1. In Sect. 3, the limit distribution of \(T_{n,a}\) is derived in a Hilbert space framework under the null hypothesis. Furthermore, we derive the limit distribution of the statistic \({\widetilde{T}}_{n,0}=\lim _{a\rightarrow 0} a\left( T_{n,a}-4/a^3\right) \), see (9). Consistency of the new tests against a large class of alternatives is shown in Sect. 4. An extensive Monte Carlo simulation study in Sect. 5 shows that the test is a strong competitor to state-of-the-art procedures. In Sect. 6, the tests are applied to log-returns of cryptocurrencies. Finally, conclusions are given in Sect. 7.

2 A new class of goodness of fit tests for the Cauchy distribution

The testing problem under discussion is invariant with respect to transformations of the kind \(x\rightarrow ax+b, \, x\in \mathbb {R}\), where \(a>0\) and \(b\in \mathbb {R}\). Consequently, a decision in favor of or against \(H_0\) should be the same for \(X_1,\ldots ,X_n\) and \(aX_1+b,\ldots ,aX_n+b\). This goal is achieved if the test statistic \(T_n\), say, is based on the standardized data \(Y_{n,1},\ldots ,Y_{n,n}\), given by

$$\begin{aligned} Y_{n,j}=\frac{X_j-\widehat{\alpha }_n}{\widehat{\beta }_n}, \quad j=1,\ldots ,n. \end{aligned}$$
(3)

Here, \(\widehat{\alpha }_n=\widehat{\alpha }_n(X_1,\ldots ,X_n)\) and \(\widehat{\beta }_n=\widehat{\beta }_n(X_1,\ldots ,X_n)\) denote consistent estimators of \(\alpha \in \mathbb {R}\) and \(\beta >0\) such that

$$\begin{aligned} \widehat{\alpha }_n(aX_1+b,\ldots ,aX_n+b)= & {} a\widehat{\alpha }_n(X_1,\ldots ,X_n)+b, \end{aligned}$$
(4)
$$\begin{aligned} \widehat{\beta }_n(aX_1+b,\ldots ,aX_n+b)= & {} a\widehat{\beta }_n(X_1,\ldots ,X_n), \end{aligned}$$
(5)

holds for each \(a>0\) and \(b\in \mathbb {R}\). By (4) and (5) it is easy to see that \(Y_{n,j}\), \(j=1,\ldots ,n,\) depend neither on the location nor on the scale parameter. Hence, \(T_n\) has the property \(T_n(aX_1+b,\ldots ,aX_n+b)=T_n(X_1,\ldots ,X_n)\), and we may and do assume \(\alpha =0\) and \(\beta =1\) in the following. Motivated by Theorem 1.1, we choose the test statistic

$$\begin{aligned} T_{n}=n\int _{-\infty }^{\infty }\Big \vert \frac{1}{n}\sum _{j=1}^n \Big (it-\frac{2Y_{n,j}}{1+Y_{n,j}^2}\Big )e^{itY_{n,j}}\Big \vert ^2\omega (t)\;\text{ d }t, \end{aligned}$$
(6)

which is the weighted \(L^2\)-distance from (2) to 0. Here, \(|\cdot |\) is the complex absolute value and \(\omega :\mathbb {R}\rightarrow (0,\infty )\) denotes a positive weight function that is given by \(\omega (t)=\omega _a(t)=\exp (-a|t|)\), \(t\in \mathbb {R}\). Note that \(\omega \) is symmetric around the origin, i.e. \(\omega (t)=\omega (-t)\) holds for all \(t\in \mathbb {R}\), and that

$$\begin{aligned} \int _{-\infty }^{\infty }t^6\omega (t)\;\text{ d }t<\infty \end{aligned}$$
(7)

holds. We have the integration-free, numerically stable formula

$$\begin{aligned} T_n=T_{n,a}&=\frac{1}{n}\sum _{j,k=1}^n\bigg (\frac{8aY_{n,j}Y_{n,k}}{(1+Y_{n,j}^2) (1+Y_{n,k}^2)((Y_{n,j}-Y_{n,k})^2+a^2)}\nonumber \\&\quad -\frac{16aY_{n,j}(Y_{n,j}-Y_{n,k})}{(1+Y_{n,j}^2)((Y_{n,j}-Y_{n,k})^2+a^2)^2} +\frac{4a^3-12a(Y_{n,j}-Y_{n,k})^2}{((Y_{n,j}-Y_{n,k})^2+a^2)^3}\bigg ), \end{aligned}$$
(8)

and hence a whole family of tests depending on the so-called tuning parameter \(a>0\); a minimal implementation is sketched below.
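The following R sketch (function names are ours) evaluates (8), using the sample median and half of the interquartile range as equivariant estimates of location and scale; this is one admissible choice of estimators, detailed in Remark 3.2, and R's default quantile convention is used, which differs slightly from (16) and (17) for finite n.

```r
# Scaled residuals Y_{n,j} based on the sample median and the half-IQR
scale_residuals <- function(x) {
  alpha_hat <- median(x)
  beta_hat  <- as.numeric(quantile(x, 0.75) - quantile(x, 0.25)) / 2
  (x - alpha_hat) / beta_hat
}

# Test statistic T_{n,a} via the integration-free formula (8)
Tna <- function(x, a) {
  y <- scale_residuals(x)
  n <- length(y)
  u <- y / (1 + y^2)                  # Y / (1 + Y^2)
  d <- outer(y, y, "-")               # Y_j - Y_k
  q <- d^2 + a^2                      # (Y_j - Y_k)^2 + a^2
  s1 <- 8 * a * outer(u, u) / q
  s2 <- -16 * a * outer(u, rep(1, n)) * d / q^2
  s3 <- (4 * a^3 - 12 * a * d^2) / q^3
  sum(s1 + s2 + s3) / n
}

set.seed(1)
Tna(rcauchy(50), a = 1)   # small values are compatible with H_0
Tna(rnorm(50),  a = 1)    # alternatives yield larger values
```

The next result deals with the limit behavior of \(T_{n,a}\) for \(a\rightarrow 0\) and \(a\rightarrow \infty \).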

Theorem 2.1

For fixed n, we have

$$\begin{aligned} \lim _{a\rightarrow 0} a \bigg ( T_{n,a}-\frac{4}{a^3} \bigg )&= \frac{8}{n} \sum _{j=1}^n \frac{Y_{n,j}^2}{(1+Y_{n,j}^2)^2} ={\widetilde{T}}_{n,0}, \end{aligned}$$
(9)

and

$$\begin{aligned} \lim _{a\rightarrow \infty } aT_{n,a}&= \frac{8}{n} \Bigg ( \sum _{j=1}^n \frac{Y_{n,j}}{1+Y_{n,j}^2} \Bigg )^2. \end{aligned}$$
(10)

Proof

Splitting the sum in (8) into a diagonal and a non-diagonal part results in

$$\begin{aligned} T_{n,a}= \frac{1}{n}\sum _{j,k=1}^n R_{j,k,a} =\frac{1}{n}\sum _{j=1}^n R_{j,j,a} +\frac{1}{n}\sum _{j\ne k} R_{j,k,a} =T_{n,a}^{d}+T_{n,a}^{nd}, \end{aligned}$$

say. Since

$$\begin{aligned} T_{n,a}^{d} = \frac{8}{na} \sum _{j=1}^n Y_{n,j}^2/(1+Y_{n,j}^2)^2 + \frac{4}{a^3} \end{aligned}$$

and \(\lim _{a\rightarrow 0}T_{n,a}^{nd}=0\), we obtain

$$\begin{aligned} \lim _{a\rightarrow 0} a \bigg ( T_{n,a}-\frac{4}{a^3} \bigg )&= \lim _{a\rightarrow 0} a \bigg ( T_{n,a}^{d} - \frac{4}{a^3} \bigg ) = \frac{8}{n} \sum _{j=1}^n \frac{Y_{n,j}^2}{(1+Y_{n,j}^2)^2}. \end{aligned}$$

By the symmetry of the weight function \(\omega _a(\cdot )\), straightforward calculations show

$$\begin{aligned} T_{n,a}&=\int _{-\infty }^\infty Z_n^2(t)\,\omega _a(t)\text{ d }t, \end{aligned}$$

where

$$\begin{aligned} Z_n(t)=\frac{1}{\sqrt{n}}\sum _{j=1}^n\bigg [\Big (\frac{2Y_{n,j}}{1+Y_{n,j}^2} +t\Big )\cos (tY_{n,j})+\Big (t-\frac{2Y_{n,j}}{1+Y_{n,j}^2}\Big )\sin (tY_{n,j})\bigg ],\quad t\in \mathbb {R}. \end{aligned}$$
(11)

Then, \(aT_{n,a}=\int _{-\infty }^\infty Z_n^2(s/a)\,\exp (-|s|)\text{ d }s\), where \(Z_n\) satisfies \(Z_n^2(s/a)\le 4n(1+|s|)^2\), for \(a>1\), and

$$\begin{aligned} \lim _{t\rightarrow 0} Z_n(t) = \frac{1}{\sqrt{n}} \sum _{j=1}^n \frac{2Y_{n,j}}{1+Y_{n,j}^2}. \end{aligned}$$

Now, the second assertion follows from an application of the dominated convergence theorem. \(\square \)
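Both limits can be checked numerically with the sketch from Sect. 2 (reusing Tna and scale_residuals from above; the extreme values of a are our ad hoc choices):

```r
set.seed(1)
x <- rcauchy(20)
y <- scale_residuals(x)

a <- 1e-3
c(a * (Tna(x, a) - 4 / a^3),              # left-hand side of (9)
  8 * mean(y^2 / (1 + y^2)^2))            # limit, i.e. T~_{n,0}

a <- 1e3
c(a * Tna(x, a),                          # left-hand side of (10)
  8 * length(y) * mean(y / (1 + y^2))^2)  # limit
```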

3 Limit distribution under the null hypothesis

The asymptotic theory is derived in the Hilbert space \(\mathbb {H}=L^2(\mathbb {R},\mathcal {B}, \omega (t)\text{ d }t)\) of measurable, square integrable functions, where \(\mathcal {B}\) is the Borel \(\sigma \)-field of \(\mathbb {R}\) and \(\omega (t)\) is the weight function defined in the introduction. Notice that \(T_n=\int _{-\infty }^\infty Z_n^2(t)\,\omega (t)\text{ d }t\), where \(Z_n\), defined in (11), is a real-valued, \((\mathcal {A} \otimes \mathcal {B}, \mathcal {B})\)-measurable random element of \(\mathbb {H}\). We denote by

$$\begin{aligned} \Vert f\Vert _{\mathbb {H}} = \left( \int _{-\infty }^\infty \big |f(t)\big |^2 \, \omega (t) \, \textrm{d}t \right) ^{1/2}, \qquad \langle f, g \rangle _{\mathbb {H}}=\int _{-\infty }^\infty f(t)g(t) \, \omega (t) \, \textrm{d}t \end{aligned}$$

the usual norm and inner product in \(\mathbb {H}\). In the following, we assume that the estimators \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\) allow linear representations

$$\begin{aligned} \sqrt{n} \widehat{\alpha }_n&= \frac{1}{\sqrt{n}} \sum _{j = 1}^n \psi _1 (X_{j}) + o_{\mathbb {P}}(1), \end{aligned}$$
(12)
$$\begin{aligned} \sqrt{n} (\widehat{\beta }_n - 1)&= \frac{1}{\sqrt{n}} \sum _{j = 1}^n \psi _2 (X_{j}) + o_{\mathbb {P}}(1), \end{aligned}$$
(13)

where \(X_1,\ldots ,X_n \sim C(0,1)\) are independent random variables, \(o_\mathbb {P}(1)\) denotes a term that converges to 0 in probability, and \(\psi _1\) and \(\psi _2\) are measurable functions with

$$\begin{aligned} \mathbb {E}[\psi _1 (X_{1})]=\mathbb {E}[\psi _2 (X_{1})] = 0 \quad \text{ and }\quad \mathbb {E}\left[ \left( \psi _1(X_{1}),\psi _2(X_{1})\right) ^\top \left( \psi _1(X_{1}),\psi _2(X_{1})\right) \right] = C \cdot I_2; \end{aligned}$$

here, C is a positive constant, \(^\top \) stands for the transpose of a vector and \(I_2\) is the \(2\times 2\)-identity matrix. See Remark 3.2 for examples of estimation procedures satisfying these assumptions.

Theorem 3.1

Let \(X_1,\ldots ,X_n\) be i.i.d. \(\text{ C }(\alpha ,\beta )\) distributed random variables. Then there exists a centred Gaussian random process \(\mathcal {Z}\) in \(\mathbb {H}\) with covariance kernel

$$\begin{aligned}&K(s,t) =\frac{1}{2}\big (s^2+t^2+\vert s-t\vert +1\big )e^{-\vert s-t\vert }\\&\quad -\frac{1}{2}(t^2+\vert t\vert +1)e^{-\vert t\vert } \mathbb {E}\bigg [\bigg (\bigg (\frac{2X_1}{1+X_1^2}+s\bigg )\cos (sX_1)+\bigg (s-\frac{2X_1}{1+X_1^2}\bigg )\sin (sX_1)\bigg )\psi _1(X_1)\bigg ] \\&\quad +\frac{1}{2}t(\vert t\vert +1)e^{-\vert t\vert }\mathbb {E}\bigg [\bigg (\bigg (\frac{2X_1}{1+X_1^2}+s\bigg )\cos (sX_1)+\bigg (s-\frac{2X_1}{1+X_1^2}\bigg )\sin (sX_1)\bigg )\psi _2(X_1)\bigg ]\\&\quad -\frac{1}{2}(s^2+\vert s\vert +1)e^{-\vert s\vert }\mathbb {E}\bigg [\bigg (\bigg (\frac{2X_1}{1+X_1^2}+t\bigg )\cos (tX_1)+\bigg (t-\frac{2X_1}{1+X_1^2}\bigg )\sin (tX_1)\bigg )\psi _1(X_1)\bigg ]\\&\quad +\frac{1}{2}s(\vert s\vert +1)e^{-\vert s\vert }\mathbb {E}\bigg [\bigg (\bigg (\frac{2X_1}{1+X_1^2}+t\bigg )\cos (tX_1)+\bigg (t-\frac{2X_1}{1+X_1^2}\bigg )\sin (tX_1)\bigg )\psi _2(X_1)\bigg ]\\&\quad +\frac{1}{4}(s^2+\vert s\vert +1)(t^2+\vert t\vert +1)e^{-\vert s\vert -\vert t\vert }\mathbb {E}[\psi _1^2(X_1)] +\frac{1}{4}s(\vert s\vert +1)t(\vert t\vert +1)e^{-\vert s\vert -\vert t\vert }\mathbb {E}[\psi _2^2(X_1)] \end{aligned}$$

for \(s,t\in \mathbb {R},\) such that \(T_n\overset{D}{\longrightarrow }\Vert \mathcal {Z}\Vert _\mathbb {H}^2\) as \(n\rightarrow \infty \).

A proof of Theorem 3.1 is found in Appendix A.1. It is well known that the distribution of \(\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2\) is that of \(\sum _{j=1}^\infty \lambda _jN_j^2\), where \(N_1,N_2,\ldots \) are i.i.d. standard normal random variables and \((\lambda _j)\) is a decreasing sequence of positive eigenvalues of the integral operator

$$\begin{aligned} \mathcal {K}g(s)=\int _{-\infty }^\infty K(s,t)g(t)\omega (t)\,\text{ d }t. \end{aligned}$$

Due to the complexity of the covariance kernel K as given in Theorem 3.1, it seems hopeless to solve the integral equation \(\mathcal {K}g(s)=\lambda g(s)\) and find explicit values of \(\lambda _j\), \(j\ge 1\). For a numerical approximation method we refer to Subsect. 3.3 in Matsui and Takemura (2005). Note that

$$\begin{aligned} \mathbb {E}\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2=\int _{-\infty }^\infty K(t,t)\omega (t)\,\text{ d }t \end{aligned}$$
(14)

and

$$\begin{aligned} \text{ Var }\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2=2\int _{-\infty }^\infty \int _{-\infty }^\infty K^2(s,t)\omega (t)\omega (s)\,\text{ d }t\text{ d }s \end{aligned}$$
(15)

can be derived for specific estimation procedures. Such results in the theory of goodness-of-fit tests for the Cauchy family are sparse; for some explicit formulae for mean values, see Gürtler and Henze (2000) and Matsui and Takemura (2005).

Remark 3.2

Since Theorem 3.1 is stated in terms of the linear representations (12) and (13), and hence of the functions \(\psi _1\) and \(\psi _2\), it yields explicit covariance kernels for different parameter estimation procedures. To estimate \(\alpha \) and \(\beta \) we consider the following location and scale estimators \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\), all of which satisfy (4) and (5). For a compact notation, we write \(\psi (x)=(\psi _1(x),\psi _2(x))^\top \). Some derivations were partially carried out with the computer algebra system Maple, see Maplesoft (2019).

  1.

    Median and interquartile-distance estimators: Let \(\xi _p\), \(p\in (0,1)\), denote the p-quantile of the underlying distribution F, \(\widehat{\xi }_{p,n}\) the sample p-quantile, and \(X_{(1)}\le \cdots \le X_{(n)}\) the order statistics of \(X_1,\ldots ,X_n\). With \(\lfloor \cdot \rfloor \) denoting the floor function, let

    $$\begin{aligned} \widehat{\alpha }_n={\left\{ \begin{array}{ll} \frac{1}{2}(X_{(\frac{n}{2})}+X_{(\frac{n}{2}+1)}), &{} \text {if}\,n\,\text{ even, }\\ X_{(\lfloor \frac{n}{2}\rfloor +1)}, &{} \text {otherwise,} \end{array}\right. } \end{aligned}$$
    (16)

    be the unbiased empirical median and

    $$\begin{aligned} \widehat{\beta }_n=\frac{1}{2}(\widehat{\xi }_{\frac{3}{4},n}-\widehat{\xi }_{\frac{1}{4},n}) \end{aligned}$$
    (17)

    the half-interquartile range (iqr) of the sample. Under mild regularity conditions \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\) are consistent estimators of \(\alpha \) and \(\beta \). Display (3.3) of Gürtler and Henze (2000) then gives the so-called Bahadur representations (see Theorem 2.5.1 in Serfling (1980)) with

    $$\begin{aligned} \psi _1(x)=\pi \left( \frac{1}{2}-{\textbf{1}}\lbrace x\le 0\rbrace \right) ,\quad \text{ and }\quad \psi _2(x)=\pi \left( \frac{1}{2}-{\textbf{1}}\lbrace -1\le x\le 1\rbrace \right) ,\ x\in \mathbb {R}. \end{aligned}$$

    It is easy to see that \(\mathbb {E}[\psi _1(X_1)]=\mathbb {E}[\psi _2(X_1)]=0\) and \(\mathbb {E}[\psi (X_1)\psi (X_1)^\top ]=\frac{\pi ^2}{4}I_{2}\) holds. With these representations, we get the covariance kernel of \(\mathcal {Z}\) in Theorem 3.1

    $$\begin{aligned} K_{MIQ}(s,t)&=\frac{1}{2}\big (s^2+t^2+\vert s-t\vert +1\big )e^{-\vert s-t\vert }\\&\quad +\Big [t\big (\vert t\vert +1\big )\big (2J_1(s)-sJ_2(s)\big )- \big (t^2+\vert t\vert +1\big )\big (\frac{s}{2}J_3(s)+J_4(s)\big )\Big ]e^{-\vert t\vert }\\&\quad +\Big [s\big (\vert s\vert +1\big )\big (2J_1(t)-tJ_2(t)\big )-\big (s^2+\vert s\vert +1\big ) \big (\frac{t}{2}J_3(t)+J_4(t)\big )\Big ]e^{-\vert s\vert }\\&\quad +\frac{\pi ^2}{16}\Big [\big (s^2+\vert s\vert +1\big )\big (t^2+\vert t\vert +1\big )+st\big (\vert s\vert +1\big )\big (\vert t\vert +1\big )\Big ]e^{-\vert s\vert -\vert t\vert }, \end{aligned}$$

    for \(s,t\in \mathbb {R},\) where

    $$\begin{aligned} J_1(t)&=\int _0^1\frac{x\sin (tx)}{(1+x^2)^2}\,dx,\quad J_2(t)=\int _0^1\frac{\cos (tx)}{1+x^2}\,dx, \\ J_3(t)&=\int _0^\infty \frac{\sin (tx)}{1+x^2}\,dx,\quad J_4(t)=\int _0^\infty \frac{x\cos (tx)}{(1+x^2)^2}\,dx. \end{aligned}$$

    Direct calculations of integrals lead to

    $$\begin{aligned} \mathbb {E}\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2= & {} \int _{-\infty }^\infty K_{MIQ}(t,t)\omega _a(t)\,\text{ d }t \\= & {} (8 \left( a+2 \right) ^{5} \left( 1+a \right) ^{3}{a}^{3} \left( {a}^{2}+2\,a+2 \right) ^{3})^{-1}\left[ \left( {\pi }^{2}-8 \right) {a}^{16} + \left( 19\,{\pi }^{2}-152 \right) {a}^{15} \right. \\{} & {} + \left( 173\,{\pi }^{2}-1368 \right) {a}^{14}+ \left( 1003\,{\pi }^{2}-7720 \right) {a}^{13} + \left( 4126\,{\pi }^{2}-30192 \right) {a}^{12}\\{} & {} + \left( 12594\,{\pi }^{2}-84304 \right) {a}^{11}+ \left( 29128\,{\pi }^{2}-163520 \right) {a}^{10} \\{} & {} + \left( 51460\,{\pi }^{2}-188832 \right) {a}^{9}+ \left( 69320\,{\pi }^{2}-8256 \right) {a}^{8}\\{} & {} + \left( 70296\,{\pi }^{2}+457664 \right) {a}^{7}+ \left( 52176\,{\pi }^{2}+1025920 \right) {a}^{6}\\{} & {} + \left( 26848\,{\pi }^{2}+1323264 \right) {a}^{5}+ \left( 8576\,{\pi }^{2}+1151488 \right) {a}^{4} \\{} & {} \left. + \left( 1280\,{\pi }^{2}+693248 \right) {a}^{3}+280576\,{a}^{2}+69632\,a+8192\right] . \end{aligned}$$
  2.

    Maximum likelihood estimators: Matsui and Takemura (2005) show in Lemma A.1 that for the maximum-likelihood estimator \(\widehat{\alpha }_n\) and \(\widehat{\beta }_n\) in the Cauchy family the linear representations are given by

    $$\begin{aligned} \psi _1(x)=\frac{4x}{1+x^2},\quad \text{ and }\quad \psi _2(x)=\frac{2(x^2-1)}{1+x^2},\;x\in \mathbb {R}. \end{aligned}$$

    Again, straightforward calculations show \(\mathbb {E}[\psi _1(X_1)]=\mathbb {E}[\psi _2(X_1)]=0\) as well as \(\mathbb {E}[\psi (X_1)\psi (X_1)^\top ]=2 I_{2}\). Note that there are no closed form expressions for the estimators, such that the log-likelihood equations have to be solved numerically. This leads to the covariance kernel of \(\mathcal {Z}\) in Theorem 3.1 given by

    $$\begin{aligned} K_{ML}(s,t)&= \frac{1}{2}\big (s^2+t^2+\vert s-t\vert +1\big )e^{-\vert s-t\vert } \\&\quad -\frac{1}{2}(t^2+\vert t\vert +1)(s^2+\vert s\vert +1)e^{-\vert t\vert -\vert s\vert } \\&\quad -\frac{1}{2}t(\vert t\vert +1)s(\vert s\vert +1)e^{-\vert t\vert -\vert s\vert },\quad s,t\in \mathbb {R}. \end{aligned}$$

    With this explicit formula for the covariance kernel, we compute (a numerical cross-check of the following expression is sketched after this list)

    $$\begin{aligned} \mathbb {E}\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2=\int _{-\infty }^\infty K_{ML}(t,t)\omega _a(t)\,\text{ d }t={\frac{8\,{a}^{4}+80\,{a}^{3}+352\,{a}^{2}+320\,a+128}{{a}^{3} \left( a+2 \right) ^{5}}} \end{aligned}$$

    and

    $$\begin{aligned} \text{ Var }\Vert \mathcal {Z}\Vert _{\mathbb {H}}^2= & {} 2\int _{-\infty }^\infty \int _{-\infty }^ \infty K_{ML}^2(s,t)\omega _a(s)\omega _a(t)\,\text{ d }t\text{ d }s\\{} & {} = \left( \frac{1}{2}{a}^{5} \left( a+1 \right) ^{7} \left( a+2 \right) ^{10}\right) ^{-1} \left[ 10\,{a}^{14}+270\,{a}^{13}+3521\,{a}^{12}\right. \\{} & {} +27987\,{a}^{11}+146819\,{a}^{10}+510582\,{a}^{9}+1194078\,{a}^{8}+1914216\,{a}^{7}\\{} & {} +2134432\,{a}^{6}+1671456\,{a}^{5}+928192\,{a}^{4}+369792\,{a}^{3}+104192\,{a}^{2}\\{} & {} \left. +18432\,a+1536\right] . \end{aligned}$$
  3.

    Equivariant integrated squared error estimator: In Matsui and Takemura (2005) the authors propose an equivariant version of the minimal integrated squared error estimator introduced by Besbeas and Morgan (2001) for the Cauchy distribution. The estimators are derived by minimization of the weighted \(L^2\)-distance

    $$\begin{aligned} I(\alpha ,\beta )=\int _{-\infty }^{\infty }\left| \varphi _n(t;\alpha ,\beta )-e^{-\vert t\vert }\right| ^2\omega (t)dt, \end{aligned}$$

    where \(\varphi _n(t;\alpha ,\beta )=\frac{1}{n}\sum _{j=1}^n \exp \left( it(X_j-\alpha )/\beta \right) \), \(t\in \mathbb {R}\), is the empirical characteristic function of the data standardized by the parameters of the location-scale family. As weight function the authors choose \(\omega _\nu (t)=\exp (-\nu |t|)\), \(t\in \mathbb {R}\), \(\nu >0\), of the same exponential type, and hence obtain families of estimators \(\widehat{\alpha }_{n,\nu }\) and \(\widehat{\beta }_{n,\nu }\) depending on \(\nu \). Note that the optimization problem has to be solved numerically, and no closed-form formula is known for the estimators. Hereinafter, we denote this class of estimators by EISE. Lemma A.1 in Matsui and Takemura (2005) provides the linear asymptotic representations

    $$\begin{aligned} \psi _1(x,\nu )&= (\nu +1)(\nu +2)^3\frac{x}{((\nu +1)^2+x^2)^2}, \\ \psi _2(x,\nu )&= \frac{1}{2}(\nu +2)-\frac{1}{2}(\nu +2)^3\frac{(\nu +1)^2-x^2}{((\nu +1)^2+x^2)^2}, \end{aligned}$$

    for \(x\in \mathbb {R}\); for every \(\nu >0\), these lead to an explicit expression for the covariance kernel of \(\mathcal {Z}\) in Theorem 3.1, which can be found in Appendix A.3.
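As announced in the second item, the closed-form mean formula for the maximum likelihood case can be cross-checked numerically; the following R sketch (the function name K_ML and the choice a = 1 are ours) integrates the kernel in (14) and compares with the rational expression above.

```r
# Covariance kernel K_ML from the maximum likelihood case above
K_ML <- function(s, t) {
  0.5 * (s^2 + t^2 + abs(s - t) + 1) * exp(-abs(s - t)) -
    0.5 * (t^2 + abs(t) + 1) * (s^2 + abs(s) + 1) * exp(-abs(t) - abs(s)) -
    0.5 * t * (abs(t) + 1) * s * (abs(s) + 1) * exp(-abs(t) - abs(s))
}

a <- 1
# E||Z||^2 via numerical integration of (14) ...
integrate(function(t) K_ML(t, t) * exp(-a * abs(t)), -Inf, Inf)$value
# ... agrees with the closed-form expression (= 888/243 for a = 1)
(8 * a^4 + 80 * a^3 + 352 * a^2 + 320 * a + 128) / (a^3 * (a + 2)^5)
```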

As demonstrated in Remark 3.2, the estimation procedure has some influence on the limit null distribution of \(T_n\). We refer to Chen (2011) and Fegyverneki (2013) for alternative estimators of the parameters of the Cauchy distribution, and to Cohen Freue (2007) for a Pitman estimator of the location parameter \(\alpha \) when \(\beta \) is known.

3.1 Asymptotic normality of the limit statistic under the null hypothesis

The next result gives the limiting distribution of \({\widetilde{T}}_{n,0}\), the scaled limit of \(T_{n,a}\) for \(a\rightarrow 0\) given in (9).

Theorem 3.3

  (a)

    Let \({{\,\textrm{arcsine}\,}}(a,b)\) denote the arcsine distribution on \([a,b]\) with distribution function

    $$\begin{aligned} F(x)=\frac{2}{\pi }\arcsin \sqrt{\frac{x-a}{b-a}},\quad a\le x\le b. \end{aligned}$$

    If \(X\sim C(0,1)\), then

    $$\begin{aligned} \frac{4X^2}{(1+X^2)^2}&\sim {{\,\textrm{arcsine}\,}}(0,1). \end{aligned}$$
  (b)

    Let \(X_1,\ldots ,X_n\) be i.i.d. \(\text{ C }(0,1)\) distributed random variables, and let \(Y_{n,j}\) be the corresponding scaled residuals, using any estimators \(({\widehat{\alpha }}_n,{\widehat{\beta }}_n)\) satisfying the linear representations (12) and (13). Then,

    $$\begin{aligned} \sqrt{2n}\left( {\widetilde{T}}_{n,0}-1\right) = \sqrt{2n} \left( \frac{8}{n} \sum _{j=1}^n \frac{Y_{n,j}^2}{(1+Y_{n,j}^2)^2} - 1\right) \ {\mathop {\longrightarrow }\limits ^{\mathcal {D}}}\ N(0,1), \end{aligned}$$
    (18)

    where N(0, 1) denotes the standard normal distribution.

Proof

  (a)

    Let U have a uniform distribution on \((-\pi /2,\pi /2)\), and put \(X=\tan U\). Then, \(X\sim C(0,1)\). Further, \(X^2/(1+X^2)=\sin ^2 U\) and \(1/(1+X^2)=\cos ^2 U\), and both \(\sin U\) and \(2\sin U \cos U=\sin (2U)\) have an \({{\,\textrm{arcsine}\,}}(-1,1)\) distribution (see, e.g., Norton (1983)). If \(Z \sim {{\,\textrm{arcsine}\,}}(-1,1)\), then \(Z^2 \sim {{\,\textrm{arcsine}\,}}(0,1)\). Hence,

    $$\begin{aligned} \frac{4X^2}{(1+X^2)^2}&\sim 4 \sin ^2 U\cos ^2 U \sim {{\,\textrm{arcsine}\,}}(0,1). \end{aligned}$$
  (b)

    A Taylor expansion around \((\alpha _0,\beta _0)=(0,1)\) yields

    $$\begin{aligned}&\frac{1}{\sqrt{n}} \sum _{j=1}^n \frac{(X_j-\alpha )^2/\beta ^2}{(1+(X_j-\alpha )^2/\beta ^2)^2} \\&\quad = \frac{1}{\sqrt{n}} \sum _{j=1}^n \left\{ \frac{X_j^2}{(1+X_j^2)^2} + \frac{2X_j(X_j^2-1)\alpha }{(1+X_j^2)^3} + \frac{2X_j^2(X_j^2-1)(\beta -1)}{(1+X_j^2)^3} \right. \\&\quad \left. + f_1(X_j,{\tilde{\alpha }},{\tilde{\beta }}) \alpha ^2 + f_2(X_j,{\tilde{\alpha }},{\tilde{\beta }}) \alpha (\beta -1) + f_3(X_j,{\tilde{\alpha }},{\tilde{\beta }})(\beta -1)^2 \right\} , \end{aligned}$$

    where \(({\tilde{\alpha }},{\tilde{\beta }})\) lies between \((\alpha ,\beta )\) and (0, 1), and

    $$\begin{aligned} f_1(x,\alpha ,\beta )&= \frac{2}{\beta ^2} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-2} - \frac{20(x-\alpha )^2}{\beta ^4} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-3} \\&\quad + \frac{24(x-\alpha )^4}{\beta ^6} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-4}, \\ f_2(x,\alpha ,\beta )&= \frac{4(x-\alpha )}{\beta ^3} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-2} - \frac{24(x-\alpha )^3}{\beta ^5} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-3} \\&\quad + \frac{24(x-\alpha )^5}{\beta ^7} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-4}, \\ f_3(x,\alpha ,\beta )&= \frac{6(x-\alpha )^2}{\beta ^4} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-2} - \frac{28(x-\alpha )^4}{\beta ^6} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-3} \\&\quad + \frac{24(x-\alpha )^6}{\beta ^8} \left( 1 + \frac{(x-\alpha )^2}{\beta ^2} \right) ^{-4}. \end{aligned}$$

    If we replace \((\alpha ,\beta )\) by any estimator \(({\widehat{\alpha }}_n,{\widehat{\beta }}_n)\) with linear representation, the second and third terms converge to zero in probability, since, by the results given in the proof of part (a),

    $$\begin{aligned}&\frac{1}{n} \sum _{j=1}^n \frac{X_j(X_j^2-1)}{(1+X_j^2)^3} \overset{a.s.}{\longrightarrow }\mathbb {E}\frac{X_1(X_1^2-1)}{(1+X_1^2)^3} \ = \ 0, \end{aligned}$$
    (19)
    $$\begin{aligned}&\frac{1}{n} \sum _{j=1}^n \frac{X_j^2 (X_j^2-1)}{(1+X_j^2)^3} \overset{a.s.}{\longrightarrow }\mathbb {E}\frac{X_1^2(X_1^2-1)}{(1+X_1^2)^3} \ = \ 0, \end{aligned}$$
    (20)

    and \(\sqrt{n}{\widehat{\alpha }}_n\) and \(\sqrt{n}({\widehat{\beta }}_n-1)\) have limiting normal distributions. Putting \(z=(x-\alpha )/\beta \), the functions \(f_j, j=1,2,3,\) involve terms of the form \(z^k (1+z^2)^{-l}/\beta ^2\), where \(k<2l\). Hence,

    $$\begin{aligned} \frac{1}{n} \sum _{j=1}^n |f_j(X_j,{\tilde{\alpha }}_n,{\tilde{\beta }}_n)|&\le \frac{c_j}{{\tilde{\beta }}_n^2}, \quad j=1,2,3, \end{aligned}$$

    where \(c_j>0\), and \(({\tilde{\alpha }}_n,{\tilde{\beta }}_n)\) lies between \(({\widehat{\alpha }}_n,{\widehat{\beta }}_n)\) and (0, 1). Since \(\sqrt{n}{\widehat{\alpha }}_n^2\), \(\sqrt{n}{\widehat{\alpha }}_n({\widehat{\beta }}_n-1)\) and \(\sqrt{n}({\widehat{\beta }}_n-1)^2\) converge to zero in probability, and \({\tilde{\beta }}_n {\mathop {\longrightarrow }\limits ^{\mathbb {P}}}1\), we obtain

    $$\begin{aligned} \frac{8}{\sqrt{n}} \sum _{j=1}^n \frac{Y_{n,j}^2}{(1+Y_{n,j}^2)^2} = \frac{8}{\sqrt{n}} \sum _{j=1}^n \frac{X_j^2}{(1+X_j^2)^2} + R_n, \end{aligned}$$

    where \(R_n{\mathop {\longrightarrow }\limits ^{\mathbb {P}}}0\) for \(n\rightarrow \infty .\) By part (a), \(8X_1^2/(1+X_1^2)^2\) has expected value 1 and variance 1/2, and the assertion follows by the central limit theorem. \(\square \)
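The normal approximation in (18) can be illustrated by simulation under the median/half-IQR estimators (a sketch reusing scale_residuals from Sect. 2; the sample size and the number of replications are our choices):

```r
# Standardized statistic sqrt(2n) * (T~_{n,0} - 1) from (18)
std_Tn0 <- function(x) {
  y <- scale_residuals(x)
  sqrt(2 * length(y)) * (8 * mean(y^2 / (1 + y^2)^2) - 1)
}

set.seed(1)
z <- replicate(5000, std_Tn0(rcauchy(200)))
round(c(mean(z), sd(z)), 2)   # close to 0 and 1
# qqnorm(z); abline(0, 1)     # visual check of approximate normality
```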

Remark 3.4

  (i)

    Let \(X\sim F\), and assume that \(8X^2/(1+X^2)^2\) has expected value 1 and finite variance. Then, the test based on \({\widetilde{T}}_{n,0}\) is not consistent against F, i.e. the test is not an omnibus test.

  (ii)

    Typically, the estimation method enters the asymptotic distribution via the functions \(\psi _1\) and \(\psi _2\) in (12) and (13) and the first order terms in the Taylor expansion. Here, this is not the case, because the corresponding expected values in (19) and (20) vanish.

Remark 3.5

Note that the sum appearing on the right hand side of (10) involves essentially the first component of the score function of the Cauchy distribution (see the second part of Remark 3.2). Hence, when using the maximum-likelihood estimator for \((\alpha ,\beta )\), \(\lim _{a\rightarrow \infty } aT_{n,a}\equiv 0\). For other estimators, the limit does not vanish, but a goodness-of-fit test based on it has very low power. Hence, we don’t give further results for this statistic.

4 Consistency

In this section we show that the new tests are consistent against all alternatives satisfying a weak moment condition. Let \(X,X_1,X_2,\ldots \) be i.i.d. random variables with cumulative distribution function F having a unique median as well as unique upper and lower quartiles, and satisfying \(\mathbb {E}[X^4/(1+X^2)^2]<\infty \). Since the tests \(T_n\) are affine invariant, we assume w.l.o.g. that the true median is \(\alpha =0\) and the half-interquartile range is \(\beta =1\). Under these assumptions we clearly have \((\widehat{\alpha }_n,\widehat{\beta }_n)\overset{\mathbb {P}}{\longrightarrow }(0,1)\) as \(n\rightarrow \infty \) for the MIQ estimation method in Remark 3.2 (for details see Gürtler and Henze (2000)), and we assume this convergence in all other cases.

Theorem 4.1

Under the standing assumptions, we have

$$\begin{aligned} \frac{T_n}{n}\overset{\mathbb {P}}{\longrightarrow }\int _{-\infty }^{\infty }\bigg \vert \mathbb {E}\bigg [\bigg (it-\frac{2X}{1+X^2}\bigg )e^{itX}\bigg ]\bigg \vert ^2\omega (t)dt=\Delta _F,\quad \text{ as }\;n\longrightarrow \infty . \end{aligned}$$

A proof of Theorem 4.1 is deferred to Appendix A.2. In view of the characterization in Theorem 1.1, \(\Delta _F=0\) holds if and only if \(F=\text{ C }(0,1)\). Hence, we conclude that the tests \(T_n\) are able to detect all alternatives satisfying the assumptions of this section.

Remark 4.2

In this remark we fix the weight function \(\omega _1(t)=\exp (-|t|)\), \(t\in \mathbb {R}\). By numerical integration we calculate that if \(X\sim \mathcal {U}(-\sqrt{3},\sqrt{3})\), then \(\Delta _{\mathcal {U}}\approx 2.332019\); if \(X\sim \text{ N }(0,1)\), then \(\Delta _{N}\approx 0.021839\); and if \(X\sim \text{ Log }(0,1)\) (the standard logistic distribution), then \(\Delta _{L}\approx 0.041495\).
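The value \(\Delta _N\) can be reproduced along the following lines (a numerically crude R sketch: by symmetry, the expectation inside \(\Delta _F\) is purely imaginary for \(F=\text{ N }(0,1)\), and the integration ranges are truncated at \(\pm 10\), which is harmless given the Gaussian and exponential decay; both truncations are our choices).

```r
# g(t) = imaginary part of E[(it - 2X/(1+X^2)) exp(itX)] for X ~ N(0,1)
g <- function(t) {
  f <- function(x) (t * cos(t * x) - 2 * x * sin(t * x) / (1 + x^2)) * dnorm(x)
  integrate(f, -10, 10, subdivisions = 500)$value
}

# Delta_N = int g(t)^2 exp(-|t|) dt, cf. Theorem 4.1
integrate(function(t) sapply(t, g)^2 * exp(-abs(t)), -10, 10,
          subdivisions = 500)$value   # approx 0.0218
```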

5 Simulation results

In this section, we describe an extensive simulation study to compare the power of the new tests with established ones. For all computations in this and the following section, we used the statistical software R (R Core Team, 2021). All simulations have been done at nominal level \(\alpha =0.05\) and for sample sizes \(n=20\) and \(n=50\). The unknown parameters are estimated either by the median and half-IQR estimators or by the method of maximum likelihood, see Remark 3.2. The convergence of the critical values of the new test statistics to their asymptotic values is generally quite slow. Table 1 shows a few selected cases. For \(T_{n,a}\) in (8) with tuning parameter \(a=1\), asymptotic critical values could be used for sample sizes larger than 100; however, for \(a=5\) and the median and half-IQR estimators, sample sizes of 500 or more are necessary. Hence, in a first step, critical values are determined for each test statistic by a Monte Carlo simulation with \(10^5\) replications; this step is sketched below. In a second step, the empirical power of the different tests is calculated based on \(10^4\) replications.
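The first step can be sketched as follows (reusing Tna from Sect. 2; by affine invariance, samples may be drawn from C(0, 1) regardless of the true parameters):

```r
# Monte Carlo critical value of T_{n,a} under H_0 (median/half-IQR version)
crit_value <- function(n, a, level = 0.05, reps = 1e5) {
  stats <- replicate(reps, Tna(rcauchy(n), a))
  quantile(stats, 1 - level)
}

set.seed(1)
crit_value(20, a = 1, reps = 1e4)   # fewer replications for a quick run
```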

Table 1 Empirical 0.95-quantiles, based on 10,000 MC samples, for \(T_{n,1}\) and \(T_{n,5}\) using the median/half-IQR and ML estimators

Besides the new tests based on \(T_{n,a}\) with tuning parameters \(a=1,2,3,4,5,6\), we use the limiting test \({\widetilde{T}}_{n,0}\) in (18). Power decreased for larger tuning parameters for all distributions used in the study; hence, we omit the results for values larger than 6, as well as for the test based on the limiting statistic \(T_{n,\infty }\) in (10), see also Remark 3.5. Moreover, we use the test based on the KL distance described in Mahdizadeh and Zamanzade (2017). For the nonparametric estimation of the entropy, the authors recommend the window size \(m=4\) for \(n=20\) and \(m=20\) for \(n=50\) (Mahdizadeh and Zamanzade 2017, p. 1109), and we followed this suggestion. For computing the Vasicek estimate of differential Shannon entropy, we used the function entropy.estimate in the R package vsgoftest (Lequesne and Regnault 2020). From the classical tests that utilize the empirical distribution function (so-called edf tests), we choose the following: the Kolmogorov–Smirnov test (KS), the Cramér–von Mises test (CM), the Anderson–Darling test (AD) and the Watson test (W). Finally, we apply the tests based on the statistics \(D_{n,\lambda }\) considered in Gürtler and Henze (2000) and Matsui and Takemura (2005), with weights \(\lambda =1,2,3,4,5,6\).

Of course, there are further goodness of fit tests for the Cauchy distribution. For example, one can use the procedure proposed by Chen and Balakrishnan (2012) which consists in transforming the observations to approximately normally distributed random variables, and then applying any edf test for normality with estimated parameters. For the latter, there exist tables and software, so there is no need for, say, a parametric bootstrap procedure. This is a nice feature for distributions with shape parameters or in more complicated situations (Klar and Meintanis 2012; Goldmann et al. 2015). However, for location-scale families and parameter-free test statistics, where the null distribution can be obtained by Monte Carlo simulation, this is only a minor advantage. On the other hand, these tests don’t maintain exactly the theoretical level. Further, besides the results derived in Goldmann et al. (2015), not much is known about asymptotic properties of the tests.

Similarly as in Chen and Balakrishnan (2012), Villaseñor and González-Estrada (2021) suggest transforming the observations to approximately exponentially distributed random variables. Then, the authors apply the Anderson–Darling test for exponentiality with estimated parameters. They use this procedure to test for the Cauchy distribution, but recommend to obtain critical values by Monte Carlo simulation. Their results show that none of the three tests (the classical Anderson–Darling test and the Anderson–Darling tests based on the two transformations) outperforms the other two in all cases. Clearly, one could construct a multitude of further tests by (approximate) transformations to an arbitrary continuous distribution, and using any of the well-known edf test statistics. For the above reasons, and since we don’t aim at a complete overview of goodness-of-fit tests for the Cauchy distribution, we decided not to include such transformation based tests in our power study.

Besides the standard Cauchy and normal distribution (C(0, 1) and N(0, 1), for short), we use Cauchy-normal mixtures CN\((p)=(1-p)\)C\((0,1)+p\)N(0, 1) with \(p=0.5\) and \(p=0.8\). Further, we employ Student’s t-distribution (Student(k)) with \(k=2,3,5,10\) degrees of freedom, the (symmetric) Tukey g-h distribution with \(g=0\) and \(h=0.2, 0.1, 0.05\), denoted by Tukey(h), and the Tukey lambda distribution (Tukey-L\((\lambda )\)) with \(\lambda =-3,-2,-0.5,0.5\). Note that the Tukey lambda distribution approximates the Cauchy distribution for \(\lambda =-1\); it coincides with the logistic distribution for \(\lambda =0\), is close to the normal for \(\lambda =0.14\), and coincides with the uniform distribution for \(\lambda =1\). In the family of stable distributions S\((\alpha ,\beta )\), we choose symmetric distributions S\((\alpha ,0)\) with \(\alpha =0.4,0.7,1.2,1.5,1.8\), and skewed distributions S\((\alpha ,1)\) with \(\alpha =0.5,1,1.5,2\). Moreover, we use the uniform, logistic, Laplace, Gumbel and exponential distribution, and the Mittag-Leffler distribution ML\((\alpha )\) with \(\alpha =0.25,0.5,0.75\). The ML\((\alpha )\) distribution has Laplace transform \((1+t^\alpha )^{-1}\), and coincides with the exponential distribution for \(\alpha =1\).

Table 2 Percentage of 10,000 MC samples declared significant by various tests for the Cauchy distribution using median and half-IQR estimator (\(\alpha =0.05, n=20\))

Power estimates of the tests under discussion are given in Tables 2, 3, 4 and 5. All entries are the percentage of rejection of \(H_0\), rounded to the nearest integer. In Tables 2 and 3, the results using the median and interquartile-distance estimators are given, with sample size \(n=20\) and \(n=50\), respectively. Tables 4 and 5 show the corresponding results for the maximum likelihood estimator.

Table 3 Percentage of 10,000 MC samples declared significant by various tests for the Cauchy distribution using median and half-IQR estimator (\(\alpha =0.05, n=50\))
Table 4 Percentage of 10,000 MC samples declared significant by various tests for the Cauchy distribution using ML estimation (\(\alpha =0.05, n=20\))
Table 5 Percentage of 10,000 MC samples declared significant by various tests for the Cauchy distribution using ML estimation (\(\alpha =0.05, n=50\))

The main conclusions that can be drawn from the simulation results are the following:

  • As always in similar situations, there exists no uniformly most powerful test, which is in accordance with the results in Janssen (2000).

  • As expected, the power of nearly all test statistics increases for increasing sample size for all alternative distributions. An exception is the KL distance based test. Its power decreases for certain alternatives, in particular for the Mittag-Leffler distribution. This behavior has been reported previously, see Tables 3 to 6 in Mahdizadeh and Zamanzade (2019), and may be due to the choice of the window size.

  • The new tests \(T_{n,a}, a=1,\ldots ,6\), perform better using the maximum likelihood estimator than with median and half-IQR. Hence, the latter estimators should not be used. For all other test statistics, including \({\widetilde{T}}_{n,0}\), performance is comparable, or even better when using median and half-IQR. In any case, the choice of the estimation method can have a pronounced influence on the performance of the tests.

  • The power of \({\widetilde{T}}_{n,0}\) is comparable under both estimation methods.

  • For alternatives with finite first and second moments, the KL distance based test has the highest power among all competitors. On the other hand, its power breaks down completely for some alternatives without an existing first moment, and it is low for some alternatives with infinite second moment. Hence, the test cannot really be regarded as an omnibus test.

  • Among the new tests, values of the tuning parameter around \(a=3\) result in a quite homogeneous power against all alternatives. For the tests based on \(D_{n,\lambda }\), \(\lambda =5\) seems to be a good choice. Both classes of tests perform similarly for the ML estimator; for the median and half-IQR estimator, the latter is preferable.

  • Among the group of edf tests, the Watson test (W) outperforms the other tests in most cases.

  • For symmetric alternatives without an existing first moment, such as S(0.4, 0), S(0.7, 0), Tukey-L(-3) and Tukey-L(-2), the tests based on \({\widetilde{T}}_{n,0}\), AD and \(D_{n,6}\) perform best.

6 Real data example: log-returns of cryptocurrencies

In this section, we apply the tests for the Cauchy distribution to log-returns of various cryptocurrencies, namely Bitcoin (BTC), Ethereum (ETH), Ripple (XRP), Litecoin (LTC), BitcoinCash (BCH), EOS (EOS), BinanceCoin (BNB) and Tron (TRX). The Cauchy distribution is found to be a comparatively good model for such data sets in Szczygielski et al. (2020). There, 58 hypothetical distributions have been fitted to 15 major cryptocurrencies, and the Cauchy model turned out to be the best fitting distribution for 10 cryptocurrencies (including all currencies considered here). In Szczygielski et al. (2020), the number of observations of the various data sets varied widely, from 638 to 2255. Further, with very large data sets, any model will eventually be rejected. Hence, we decided to consider shorter time series: a series with daily observations from January 01, 2020 to June 10, 2021, with sample size 527 (526 for ETH), and an even shorter series from January 01, 2021 to June 10, 2021, with sample size 161 (160 for ETH). All prices are closing values in U.S. dollars, freely accessible from CryptoDataDownload via the link www.cryptodatadownload.com/data. Returns are estimated by taking logarithmic differences. Days with zero trading volume are omitted. Figure 1 shows histograms of the data sets for the longer time span, together with the densities of fitted Cauchy distributions. Visually, the Cauchy model seems to be a reasonable approximation.
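A sketch of the data handling for a single currency is given below; the file name and column layout are hypothetical (they follow the usual structure of CryptoDataDownload CSV files), so details may need adjustment.

```r
# Hypothetical file name; CryptoDataDownload daily files carry a descriptive
# first line (hence skip = 1), a 'close' column, and are ordered newest first
btc <- read.csv("Bitfinex_BTCUSD_d.csv", skip = 1)
p <- rev(btc$close)                       # closing prices, oldest first
r <- diff(log(p))                         # daily log-returns

alpha_hat <- median(r)                    # location: sample median
beta_hat  <- as.numeric(quantile(r, 0.75) - quantile(r, 0.25)) / 2  # half-IQR

hist(r, breaks = 60, freq = FALSE, main = "BTC log-returns")
curve(dcauchy(x, alpha_hat, beta_hat), add = TRUE)
# Alternatively, MASS::fitdistr(r, "cauchy") yields the maximum likelihood fit
```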

Fig. 1 Histograms of cryptocurrency log-returns from January 01, 2020 to June 10, 2021, together with fitted Cauchy densities

As in the above cited literature, we assume in the following the independence of daily log-return data. This is justified in view of the very weak serial correlations of all data sets. Figure 2 shows the autocorrelations of the first four data sets for the longer time span, which are all quite small and not significant in most cases. The findings are similar for the other four cryptocurrencies and the shorter time span.

Fig. 2 Autocorrelations of cryptocurrency log-returns from January 01, 2020 to June 10, 2021

Tables 6 and 7 report the results of the different tests for the Cauchy model for the two time series. The p-values are based on Monte Carlo simulation under the Cauchy model with \(10^4\) replications. For the test based on the KL distance, we choose the window length \(m=100\) for the longer time series, and \(m=50\) for the shorter one. The results show that even if the Cauchy distribution fits better than many other distributional models, it is still not an acceptable fit in any of the cases, possibly with the exception of the EOS data. The tests based on \(T_{n,5}\), KL and \(D_{n,5}\) result in p-values of 0.000 for all currencies for the longer time series, and p-values smaller than 0.01 for all currencies for the shorter one. The edf based tests have generally larger p-values, with the Watson test having the smallest and the Cramér–von Mises test the largest p-values.

Overall, the EOS data seem to be most compatible with the hypothesis of a Cauchy distribution. For the shorter series, the p-values of all edf based tests are larger than 0.1, and the tests based on \(T_{n,1}, {\tilde{T}}_{n,0}, KS\) and CM don’t reject \(H_0\) at the 5% level even for the larger data set. These tests also show relatively large p-values for Ripple (XRP) and BitcoinCash (BCH). On the whole, the results confirm the findings of the simulation study concerning the comparative power of the tests.
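The Monte Carlo p-values can be obtained along the following lines (a sketch reusing Tna from Sect. 2; by affine invariance, the null distribution does not depend on the fitted parameters):

```r
mc_pvalue <- function(x, a, reps = 1e4) {
  t_obs <- Tna(x, a)                                      # statistic on the data
  t_sim <- replicate(reps, Tna(rcauchy(length(x)), a))    # null distribution
  mean(t_sim >= t_obs)
}
# Example call: mc_pvalue(r, a = 5) for the log-returns r of one currency
```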

Table 6 p-values of cryptocurrency log-returns (Jan. 01, 2020–June 10, 2021) for various tests for the Cauchy distribution
Table 7 p-values of cryptocurrency log-returns (Jan. 01, 2021–June 10, 2021) for various tests for the Cauchy distribution

7 Conclusion

We have proposed a family of tests for the Cauchy family of distributions with desirable theoretical and computational features, such as consistency and competitiveness with existing procedures. In view of the results of the simulation study, we suggest using the tuning parameter \(a=4\) in applications for good power against most alternatives.

We conclude the paper by pointing out some open problems for further research. The explicit formulae for the covariance kernels in Remark 3.2 open the door to a numerical approximation of the eigenvalues of the integral operator \(\mathcal {K}\); a crude sketch of such an approximation is given below. Especially an approximation of the largest eigenvalue would offer more theoretical insight, since, as is well known, it is useful for efficiency statements in the sense of Bahadur, see Bahadur (1960) and Nikitin (1995). A profound knowledge of the eigenvalues leads to explicit values of the cumulants of the null distribution, such that a fit of the null distribution to the Pearson distribution system is possible. This usually gives a very accurate approximation of the quantiles of the distribution, and hence of the critical values of the test; see, e.g., Ebner (2020) and Henze (1990) in the context of normality testing. In view of the results of Baringhaus et al. (2017), we conjecture that for a large class of fixed alternatives \(\sqrt{n}(T_n/n-\Delta _F)\) converges weakly to a centred normal distribution, where the variance is positive and may be consistently estimated. Since the derivation involves heavy algebra, we postpone results of this kind to future work. Another desirable extension for the family of tests is a data dependent choice of the tuning parameter a, see Tenreiro (2019) for a procedure which might be applicable. Finally, it is clear that the choice of the weight function \(\omega _a(t)\) has an impact on the power of the tests, so other choices may lead to better performance.
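To illustrate the first of these points, the eigenvalues \(\lambda _j\) can be approximated by a crude Nyström-type discretization of \(\mathcal {K}\). The following R sketch uses the kernel \(K_{ML}\) of Remark 3.2 (the function K_ML from the sketch in Sect. 3); the truncation interval \([-L,L]\) and the grid size m are ad hoc choices of ours.

```r
a <- 1; L <- 30; m <- 1000
t <- seq(-L, L, length.out = m)
h <- t[2] - t[1]
d <- sqrt(exp(-a * abs(t)) * h)        # sqrt(weight function x quadrature weight)
B <- outer(t, t, K_ML) * (d %o% d)     # symmetrized discretization of K
ev <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
head(ev, 5)    # approximations of the largest eigenvalues lambda_1, lambda_2, ...
sum(ev)        # approx E||Z||^2 = 888/243 for a = 1, cf. Remark 3.2
```

In particular, the largest approximate eigenvalue could feed into Bahadur efficiency considerations as indicated above.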