A note on asymptotics of the risk function under confidence region estimation in case of large samples of random size

Zaigraev, Alexander

doi:10.1007/s00184-023-00910-8

Open access
Published: 24 May 2023

Volume 87, pages 201–209, (2024)
Metrika

Alexander Zaigraev ORCID: orcid.org/0000-0002-6261-031X¹

Abstract

The problem of optimal estimation of location and scale parameters of absolutely continuous distributions, by means of two-dimensional confidence regions based on L-statistics, is considered. The case, when the sample size is random and tends to infinity, is studied. The paper can be considered as a supplement to Zaigraev and Alama-Bućko (Metrika 81:283–305, 2018) in case of samples of random size.

1 Introduction

The problem of optimal confidence interval/region estimation is one of the classical problems of Mathematical Statistics. Although it has mainly been investigated for a single unknown parameter, the multivariate case also has quite a rich history, starting with confidence regions of rectangular shape (see, e.g., Šidák 1967) and ending with those of arbitrary shape (see, e.g., Czarnowska and Nagaev 2001; Alama-Bućko et al. 2006; Zaigraev and Alama-Bućko 2013, 2017, 2018). In all these cases, however, only samples of non-random size have been taken into account.

Here, samples of random size are studied and the case of two unknown parameters is considered, where the first parameter $\vartheta _1\in \mathbb {R}$ is location and the second parameter $\vartheta _2>0$ is scale. Therefore, two-dimensional confidence regions are considered as estimators of $\vartheta =(\vartheta _1, \vartheta _2).$ A solution to the problem of constructing the optimal confidence region for large samples was obtained, e.g., in Zaigraev and Alama-Bućko (2013, 2018). Thus, this note can be considered as a supplement to Zaigraev and Alama-Bućko (2018) in the case of samples of random size.

Traditionally, the size of the available sample is assumed to be deterministic, but in practice the size of the data available for statistical analysis can often be treated as random. Indeed, quite often the number of observations is unknown until the end of the recording process and can itself be treated as a (random) observation. This is the case, for example, in insurance statistics, where the numbers of insurance claims differ between accounting periods, or in medical statistics, where the number of patients with a particular disease varies from month to month due, e.g., to seasonal factors, or from year to year due, e.g., to an epidemic. In these cases, the number of observations, as well as the observations themselves, is unknown in advance and should be treated as random.

In this note, we discuss the problem described above for samples of random size and compare the quality of estimators constructed from samples of random and non-random size.

We start with notation, assumptions and a review of the construction of the optimal confidence region given in Zaigraev and Alama-Bućko (2018) for samples of non-random size (Sect. 2), while our main result is given in Sect. 3. In that section we essentially use the results from Korolev (2000) and Bening and Korolev (2005).

2 Notation and recollections

Let $\textbf{x}=(x_1,x_2,\ldots ,x_n)$ be a sample from a distribution $\mathbb {P}_{\vartheta },$ that is $\{x_i\}^n_{i=1}$ are assumed to be independent real-valued random variables having the distribution $\mathbb {P}_{\vartheta }.$ Let $F=F_{(0,1)}$ be the continuous distribution function corresponding to $\mathbb {P}_{(0,1)}$ and let

$$\begin{aligned} F^{-1}(p)=\inf \{v\in \mathbb {R}: F(v) \geqslant p \},\quad 0<p<1 \end{aligned}$$

be the quantile function. The distribution $\mathbb {P}_{(0,1)}$ is assumed to be absolutely continuous with the density function f. Let $x_{m_1:n}, x_{m_2:n},\ldots ,x_{m_k:n}$ be the order statistics corresponding to the sample $\textbf{x},\ 1\leqslant m_1<m_2<\ldots <m_k\leqslant n,$ and let $0<p_1<p_2<\ldots<p_k<1$ be numbers such that $m_i/n-p_i=o(n^{-1/2}),$ as $n\rightarrow \infty ,$ and $f(F^{-1}(p_i))>0,\ i=1,\ldots ,k.$ In what follows we confine ourselves to the case of central order statistics (the more general case for samples of deterministic size is treated in detail in Zaigraev and Alama-Bućko (2018)).

We propose to base the construction of the optimal confidence region for $\vartheta $ on the L-statistics $t_1(\cdot )$ and $t_2(\cdot ),$ which are the asymptotically best linear unbiased estimators of $\vartheta _1$ and $\vartheta _2,$ respectively. As is well known (see, e.g., Masoom Ali and Umbach (1998) or Zaigraev and Alama-Bućko (2018)), these estimators have the following form:

$$\begin{aligned} t_1(\textbf{x})=\sum _{j=1}^ka_jx_{m_j:n}, \quad \quad t_2(\textbf{x})=\sum _{j=1}^kb_jx_{m_j:n}, \end{aligned}$$

where

$$\begin{aligned} \textbf{a}&=(a_1,\ldots ,a_k)=\frac{\textbf{u}V^{-1}\textbf{u}^T\,\textbf{1}_kV^{-1}-\textbf{1}_kV^{-1}\textbf{u}^T\,\textbf{u}V^{-1}}{\textbf{u}V^{-1}\textbf{u}^T\,\textbf{1}_kV^{-1}\textbf{1}^T_k-(\textbf{1}_kV^{-1}\textbf{u}^T)^2},\\ \textbf{b}&=(b_1,\ldots ,b_k)=\frac{\textbf{1}_kV^{-1}\textbf{1}^T_k\,\textbf{u}V^{-1}-\textbf{1}_kV^{-1}\textbf{u}^T\,\textbf{1}_kV^{-1}}{\textbf{u}V^{-1}\textbf{u}^T\,\textbf{1}_kV^{-1}\textbf{1}^T_k-(\textbf{1}_kV^{-1}\textbf{u}^T)^2}, \end{aligned}$$

$\textbf{1}_k=(1,\ldots ,1)\in \mathbb {R}^k,\ \textbf{u}=(F^{-1}(p_1),\ldots ,F^{-1}(p_k)),\ V=(V_{ij})^k_{i,j=1}$ with

$$\begin{aligned} V_{ij}=\frac{p_i(1-p_j)}{f(F^{-1}(p_i))f(F^{-1}(p_j)) }, \quad \quad 1\leqslant i\leqslant j\leqslant k. \end{aligned}$$

Note that the well-known result on the form of the asymptotic distribution of selected central order statistics, established by Mosteller (see Mosteller (1946) or David and Nagaraja (2003), Subsection 10.3), is used here.
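The coefficient formulas above can be evaluated numerically. The following is a minimal sketch in Python (standard library only), not part of the original construction: the standard normal base distribution, the levels $p=(0.25, 0.5, 0.75)$ and the helper names `solve`, `dot` and `abl_coefficients` are illustrative assumptions. It also forms the matrix with entries $\textbf{a}V\textbf{a}^T,$ $\textbf{a}V\textbf{b}^T,$ $\textbf{b}V\textbf{b}^T.$

```python
from statistics import NormalDist


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def dot(x, y):
    """Euclidean inner product of two equally long sequences."""
    return sum(xi * yi for xi, yi in zip(x, y))


def abl_coefficients(p, dist=NormalDist()):
    """Return (a, b, W) for increasing levels p under the base law `dist` (an assumption)."""
    k = len(p)
    u = [dist.inv_cdf(pi) for pi in p]   # u_i = F^{-1}(p_i)
    f = [dist.pdf(ui) for ui in u]       # f(F^{-1}(p_i))
    # V_ij = p_i (1 - p_j) / (f_i f_j) for i <= j, extended symmetrically
    V = [[min(p[i], p[j]) * (1.0 - max(p[i], p[j])) / (f[i] * f[j])
          for j in range(k)] for i in range(k)]
    ones = [1.0] * k
    x1 = solve(V, ones)                  # row vector 1_k V^{-1} (V is symmetric)
    xu = solve(V, u)                     # row vector u V^{-1}
    A, B, C = dot(xu, u), dot(x1, u), dot(x1, ones)
    D = A * C - B * B                    # common denominator of a and b
    a = [(A * x1[j] - B * xu[j]) / D for j in range(k)]
    b = [(C * xu[j] - B * x1[j]) / D for j in range(k)]
    Va = [dot(row, a) for row in V]
    Vb = [dot(row, b) for row in V]
    W = [[dot(a, Va), dot(a, Vb)], [dot(a, Vb), dot(b, Vb)]]
    return a, b, W


a, b, W = abl_coefficients([0.25, 0.5, 0.75])
print("a =", a)
print("b =", b)
print("W =", W)
```

By direct substitution into the formulas, the coefficients satisfy $\textbf{a}\textbf{1}_k^T=1,$ $\textbf{a}\textbf{u}^T=0,$ $\textbf{b}\textbf{1}_k^T=0$ and $\textbf{b}\textbf{u}^T=1,$ which gives a simple sanity check of the output.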

Let $\alpha \in (0,1)$ be a given small number and let $\textbf{y}=(y_1,y_2,\ldots ,y_n)$ be a sample from the distribution $\mathbb {P}_{(0,1)}.$ Denote $T_n(\textbf{y})=(-t_1(\textbf{y})/t_2(\textbf{y}),1/t_2(\textbf{y})-1).$ Taking a set $A_n\in \mathcal {B}^2$ such that $\mathbb {P}_{(0,1)}\left( \sqrt{n}T_n(\textbf{y})\in A_n\right) =1-\alpha ,$ one obtains

$$\begin{aligned} \mathbb {P}_{\vartheta }\bigg (\sqrt{n}\bigg (\frac{\vartheta _1-t_1(\textbf{x})}{t_2(\textbf{x})},\frac{\vartheta _2-t_2(\textbf{x})}{t_2(\textbf{x})}\bigg ) \in A_n\bigg )=1-\alpha \end{aligned}$$

(here $\mathcal {B}^2$ denotes the $\sigma $-algebra of Borel subsets of $\mathbb {R}^2$).

Thus, the set

$$\begin{aligned} B_{A_n}(\textbf{x})=(t_1(\textbf{x}),t_2(\textbf{x}))+(t_2(\textbf{x})/\sqrt{n})A_n \end{aligned}$$

(1)

is a confidence region of level $1-\alpha $ for $\vartheta ;$ its quality can be characterized by the risk function defined as

$$\begin{aligned} R(\vartheta ,B_{A_n})=\textrm{E}_{\vartheta }\lambda _2(B_{A_n}(\textbf{x}))=\lambda _2(A_n)\textrm{E}_{\vartheta }t_2^2(\textbf{x})/n, \end{aligned}$$

(2)

where $\lambda _2$ is the Lebesgue measure on $\mathcal {B}^2.$ Under the assumption that the density function $g_n$ of $T_n(\textbf{y})$ is continuous and such that

$$\begin{aligned} \lambda _2(\{\textbf{v}\in \mathbb {R}^2: g_n(\textbf{v})=z\})=0 \ \ \ \ \forall z>0, \end{aligned}$$

the confidence region $B_{A^*_n}$ with

$$\begin{aligned} A^*_n=\{\textbf{v}\in \mathbb {R}^2: \bar{g}_n(\textbf{v})\geqslant z_{\alpha }\} \end{aligned}$$

is optimal among all the confidence regions of the form (1), that is, it has the smallest value of (2) (see Einmahl and Mason (1992)), where $z_{\alpha }$ is defined by the equation

$$\begin{aligned} \int _{A^*_n}\bar{g}_n(\textbf{v})\textrm{d}{} \textbf{v}=1-\alpha , \end{aligned}$$

and $\bar{g}_n(\textbf{v})=(1/n)g_n(\textbf{v}/\sqrt{n})$ is the density function of $\sqrt{n}T_n(\textbf{y}).$ Moreover, Zaigraev and Alama-Bućko (2017, 2018) have established that, as $n\rightarrow \infty ,$ $\sqrt{n}T_n(\textbf{y})$ is asymptotically distributed according to the two-dimensional normal law $\mathcal{N}_2(0,W)$ and the set $A^*_n$ tends to the ellipse $A_0=\{\textbf{v}\in \mathbb {R}^2\!: \textbf{v}W^{-1}\textbf{v}^T\leqslant 2\ln {\alpha ^{-1}}\},$ where

$$\begin{aligned} W=\left[ \begin{array}{c} \textbf{a}\\ \textbf{b}\end{array}\right] V\left[ \begin{array}{c} \textbf{a}\\ \textbf{b}\end{array}\right] ^T=\left[ \begin{array}{cc} \textbf{a}V\textbf{a}^T &{} \textbf{a}V\textbf{b}^T\\ \textbf{a}V\textbf{b}^T &{} \textbf{b}V\textbf{b}^T\end{array}\right] . \end{aligned}$$

Therefore,

$$\begin{aligned} \lim _{n\rightarrow \infty }\lambda _2(A^*_n)=\lambda _2(A_0)=2\pi \sqrt{\det W}\ln {\alpha ^{-1}}. \end{aligned}$$
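For the reader's convenience, the constant $2\ln \alpha ^{-1}$ and the area formula can be recovered by a standard argument (a sketch added here, not taken from the cited papers). If $Z$ has the $\mathcal{N}_2(0,W)$ distribution, then $ZW^{-1}Z^T$ has the $\chi ^2_2$ distribution, that is, the exponential distribution with mean 2, while an ellipse $\{\textbf{v}: \textbf{v}W^{-1}\textbf{v}^T\leqslant c\}$ has area $\pi c\sqrt{\det W}.$ Hence

$$\begin{aligned} \mathbb {P}(ZW^{-1}Z^T\leqslant c)=1-e^{-c/2}=1-\alpha \ \Longleftrightarrow \ c=2\ln \alpha ^{-1},\qquad \lambda _2(A_0)=\pi \cdot 2\ln \alpha ^{-1}\cdot \sqrt{\det W}. \end{aligned}$$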

Since $\lim _{n\rightarrow \infty }\mathbb {E}_{\vartheta }t_2^2(\textbf{x})=\vartheta _2^2\lim _{n\rightarrow \infty }\mathbb {E}_{(0,1)}t_2^2(\textbf{y})=\vartheta _2^2,$ the risk function $R(\vartheta ,B_{A^*_n})$ is of order $1/n,$ as $n\rightarrow \infty ,$ whenever the limit value of $\det W$ is positive and finite.
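To illustrate how the region (1) is used in practice, the sketch below replaces $A_n$ by the limiting ellipse $A_0$ and checks whether a candidate point belongs to the resulting region; it also evaluates the leading term $\vartheta _2^2\lambda _2(A_0)/n$ of the risk (2). The function names and the numerical values are hypothetical, and the estimates `t1`, `t2` (with `t2 > 0`) and the matrix `W` are assumed to have been computed beforehand.

```python
import math


def in_asymptotic_region(theta, t1, t2, n, W, alpha):
    """Check whether theta = (theta1, theta2) lies in (t1, t2) + (t2 / sqrt(n)) * A_0."""
    # Pivot from the text: v = sqrt(n) * ((theta1 - t1) / t2, (theta2 - t2) / t2)
    v1 = math.sqrt(n) * (theta[0] - t1) / t2
    v2 = math.sqrt(n) * (theta[1] - t2) / t2
    (w11, w12), (_, w22) = W
    det = w11 * w22 - w12 * w12
    # Quadratic form v W^{-1} v^T for a symmetric 2x2 matrix W
    q = (w22 * v1 * v1 - 2.0 * w12 * v1 * v2 + w11 * v2 * v2) / det
    return q <= 2.0 * math.log(1.0 / alpha)


def asymptotic_risk(theta2, n, W, alpha):
    """Leading term theta2^2 * lambda_2(A_0) / n of the risk function."""
    det = W[0][0] * W[1][1] - W[0][1] ** 2
    area = 2.0 * math.pi * math.sqrt(det) * math.log(1.0 / alpha)
    return theta2 ** 2 * area / n


W_demo = [[1.5, 0.2], [0.2, 0.8]]   # illustrative matrix, not from the paper
print(in_asymptotic_region((0.1, 1.05), t1=0.0, t2=1.0, n=100, W=W_demo, alpha=0.05))
print(asymptotic_risk(theta2=1.0, n=100, W=W_demo, alpha=0.05))
```

The membership test only needs the quadratic form $\textbf{v}W^{-1}\textbf{v}^T,$ so no explicit description of the boundary of the region is required.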

3 Main result

From now on the notation ${\mathop {\Longrightarrow }\limits ^{D}}$ means the convergence of random variables or random vectors in distribution. Let $(N_n)_{n\geqslant 1}$ be a sequence of integer-valued non-negative random variables such that $N_n$ and $x_1, x_2,\ldots , x_n$ are independent for any n. We assume that

(A) $\ N_n\rightarrow \infty $ in probability, $N_n/n{\mathop {\Longrightarrow }\limits ^{D}}Y,$ as $n\rightarrow \infty ,$ and $\mathbb {E}N_n=n\ \forall n\in {\mathbb N},$ where Y is a non-degenerate random variable having an absolutely continuous distribution with a distribution function G and a density function g.

The next example contains a sequence $(N_n)_{n\geqslant 1}$, for which condition (A) is fulfilled.

Example

Assume that a random variable X has the negative binomial distribution with parameters $r>0$ and $p\in (0,1)$ (denoted by $\mathcal{N}\mathcal{B}(r,p)$), i.e.

$$\begin{aligned} \mathbb {P}(X=k)={r+k-1\atopwithdelims ()k}p^r(1-p)^k,\quad k=0, 1, 2, \ldots \end{aligned}$$

with $\mathbb {E}X=r(1-p)/p.$ If r is non-integer, then ${r+k-1\atopwithdelims ()k}$ is interpreted as

$$\begin{aligned} {r+k-1\atopwithdelims ()k}=\frac{\Gamma (r+k)}{k!\Gamma (r)}. \end{aligned}$$

Note that $\mathcal{N}\mathcal{B}(1,p)$ is the geometric distribution (denoted further as Geo(p)).

Bening and Korolev (2005) show that if, for given $m>0$ and $n\in {\mathbb N},$ the random variable $N_n$ has the $\mathcal{N}\mathcal{B}(m/2, m/(m+2n))$ distribution, then $N_n/n{\mathop {\Longrightarrow }\limits ^{D}}U_{m/2},$ as $n\rightarrow \infty ,$ where $U_{m/2}$ is a random variable having the gamma distribution G(m/2, m/2) (a scaled $\chi ^2_m$ distribution) with the density

$$\begin{aligned} g(t)=\frac{(m/2)^{m/2}t^{m/2-1}e^{-mt/2}}{\Gamma (m/2)},\quad t>0. \end{aligned}$$

(3)

In particular, if the random variable $N_n$ has ${ Geo}(1/(n+1))$ distribution, $n\in {\mathbb N},$ then $N_n/n{\mathop {\Longrightarrow }\limits ^{D}}U_1$ ($U_1$ has the standard exponential distribution), as $n\rightarrow \infty .$
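Two claims of this example can be checked by direct computation: the normalization $\mathbb {E}N_n=n$ required in (A), and the convergence of $N_n/n$ to the standard exponential law in the geometric case. The sketch below is only illustrative; the function names are hypothetical and the geometric distribution is taken on $\{0,1,2,\ldots \}$ as in the text.

```python
import math


def nb_mean(m, n):
    """Mean r (1 - p) / p of NB(r, p) with r = m/2 and p = m / (m + 2n)."""
    r, p = m / 2.0, m / (m + 2.0 * n)
    return r * (1.0 - p) / p


def geo_scaled_cdf(n, t):
    """P(N_n / n <= t) for N_n ~ Geo(1/(n+1)) supported on {0, 1, 2, ...}."""
    if t < 0:
        return 0.0
    j = math.floor(n * t)                     # largest integer j with j <= n t
    return 1.0 - (n / (n + 1.0)) ** (j + 1)   # 1 - (1 - p)^(j + 1)


for n in (10, 1000, 100000):
    # Columns: n, E N_n for m = 3, P(N_n / n <= 1), limiting value 1 - exp(-1)
    print(n, nb_mean(3.0, n), geo_scaled_cdf(n, 1.0), 1.0 - math.exp(-1.0))
```

The exact geometric distribution function is used instead of simulation, so the comparison with $1-e^{-t}$ is free of Monte Carlo error.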

The next result is due to Korolev (2000) and describes the asymptotic distribution of selected central order statistics in case of samples of random size. It can be considered as a generalization of Mosteller’s result.

Theorem 1

Assume that the assumptions made in Sect. 2 and assumption (A) hold. Then, as $n\rightarrow \infty ,$

$$\begin{aligned} \sqrt{n}(y_{m_1:N_n}-F^{-1}(p_1),\ldots ,y_{m_k:N_n}-F^{-1}(p_k)){\mathop {\Longrightarrow }\limits ^{D}}Z_{k,V}/\sqrt{Y}, \end{aligned}$$

where $Z_{k,V}$ stands for the random vector having the k-variate normal distribution $\mathcal{N}_k(0,V).$ The distribution function and the density of the limit distribution can be written as

$$\begin{aligned} H(\textbf{v})&=\int ^{\infty }_0\Phi _{k,V}(u^{1/2}\textbf{v})dG(u),\quad \textbf{v}\in \mathbb {R}^k,\\ h(\textbf{v})&=\int ^{\infty }_0u^{k/2}\varphi _{k,V}(u^{1/2}\textbf{v})g(u)du,\quad \textbf{v}\in \mathbb {R}^k, \end{aligned}$$

respectively, where $\Phi _{k,V}$ and $\varphi _{k,V}$ denote the distribution function and the density corresponding to $\mathcal{N}_k(0,V),$ respectively.

It is worth noting that the limit distribution, established in Theorem 1, belongs to the class of elliptical distributions.

The adaptation of Theorem 1 to the random variables $(N_n)_{n\geqslant 1},$ having negative binomial distributions, gives the following result.

Corollary

If the random variable $N_n$ has $\mathcal{N}\mathcal{B}(m/2, m/(m+2n))$ distribution, $n\in {\mathbb N},$ then under the conditions of Theorem 1, as $n\rightarrow \infty ,$

$$\begin{aligned} \sqrt{n}(y_{m_1:N_n}-F^{-1}(p_1),\ldots ,y_{m_k:N_n}-F^{-1}(p_k)){\mathop {\Longrightarrow }\limits ^{D}} T_k(m,V), \end{aligned}$$

where $T_k(m,V)$ is a random vector having the k-dimensional Student distribution with the density

$$\begin{aligned} h(\textbf{v})=\frac{\Gamma ((m+k)/2)}{\Gamma (m/2)m^{k/2}\pi ^{k/2}(\textrm{det} V)^{1/2}}\left( 1+\frac{1}{m}\textbf{v}V^{-1}{} \textbf{v}^T\right) ^ {-(m+k)/2}\!\!,\ \textbf{v}\in \mathbb {R}^k. \end{aligned}$$

(4)

If the random variable $N_n$ has ${ Geo}(1/(n+1))$ distribution, $n\in {\mathbb N},$ then the above limit distribution is $T_k(2,V).$
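The passage from the mixture density of Theorem 1 to the Student density (4) can be checked numerically. The sketch below does this in the simplest case $k=1,$ $V=1,$ which is an illustrative restriction; the midpoint rule, the truncation point `upper` and all function names are assumptions of the sketch rather than part of the result.

```python
import math


def gamma_density(u, m):
    """Gamma density with shape m/2 and rate m/2, evaluated at u > 0."""
    h = m / 2.0
    return math.exp(h * math.log(h) + (h - 1.0) * math.log(u) - h * u - math.lgamma(h))


def mixture_density(v, m, upper=40.0, steps=50_000):
    """Midpoint-rule value of int_0^upper u^{1/2} phi(u^{1/2} v) g(u) du (k = 1, V = 1)."""
    du = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * du
        normal = math.exp(-0.5 * u * v * v) / math.sqrt(2.0 * math.pi)
        total += math.sqrt(u) * normal * gamma_density(u, m)
    return total * du


def student_density(v, m):
    """Student density with m degrees of freedom (k = 1, V = 1)."""
    log_c = math.lgamma((m + 1) / 2.0) - math.lgamma(m / 2.0) - 0.5 * math.log(m * math.pi)
    return math.exp(log_c) * (1.0 + v * v / m) ** (-(m + 1) / 2.0)


# Geometric case m = 2: the two columns should agree up to quadrature error.
for v in (0.0, 1.0):
    print(v, mixture_density(v, 2), student_density(v, 2))
```

The same comparison can be repeated for other values of $m;$ for $k>1$ the integral is unchanged in form, with $u^{k/2}$ in place of $u^{1/2}$ and the $k$-variate normal density.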

In what follows, let $\mathcal{G}(s), s>0,$ be the Laplace transform of the function g, and let $\mathcal{G}^{-1}$ be the inverse function of $\mathcal{G},$ that is,

$$\begin{aligned} s=\mathcal{G}^{-1}(\alpha )\ \Longleftrightarrow \ \mathcal{G}(s)=\int ^{\infty }_0e^{-su}g(u)du=\alpha . \end{aligned}$$

The inverse function exists since the derivative

$$\begin{aligned} \mathcal{G}'(s)=-\int ^{\infty }_0e^{-us}ug(u)du \end{aligned}$$

is always negative.
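Since $\mathcal{G}$ is continuous and strictly decreasing, the value $\mathcal{G}^{-1}(\alpha )$ can be found numerically when no closed form is available. A minimal bisection sketch follows; the function name `laplace_inverse` and the tolerance are assumptions, and the standard exponential law (the distribution of $U_1$ above), whose Laplace transform is $1/(1+s),$ serves only as a test case.

```python
def laplace_inverse(G, alpha, tol=1e-12):
    """Solve G(s) = alpha for s > 0, with G continuous and strictly decreasing from 1 to 0."""
    lo, hi = 0.0, 1.0
    while G(hi) > alpha:          # expand the bracket until G(hi) <= alpha
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if G(mid) > alpha:        # root lies to the right of mid
            lo = mid
        else:                     # root lies in [lo, mid]
            hi = mid
    return 0.5 * (lo + hi)


def laplace_exponential(s):
    """Laplace transform of the standard exponential density exp(-u), u > 0."""
    return 1.0 / (1.0 + s)


# For the exponential law the inverse is available in closed form: 1/alpha - 1.
print(laplace_inverse(laplace_exponential, 0.05), 1.0 / 0.05 - 1.0)
```

Bisection is chosen here because it relies only on the monotonicity established above and needs no derivative of $\mathcal{G}.$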

The next result determines the asymptotics of the optimal confidence region in the case of samples of random size.

Theorem 2

Under the conditions of Theorem 1,

1. $\sqrt{n}T_{N_n}(\textbf{y})\ {\mathop {\Longrightarrow }\limits ^{D}} Z_{2,W}/\sqrt{Y},$ as $n\rightarrow \infty ,$ and the density of the limit distribution has the form

$$\begin{aligned} h(\textbf{v})=\int ^{\infty }_0u\varphi _{2,W}(\sqrt{u}\textbf{v})g(u)du,\quad \textbf{v}\in \mathbb {R}^2. \end{aligned}$$

(5)

In particular, if the random variable $N_n$ has $\mathcal{N}\mathcal{B}(m/2,m/(m+2n))$ distribution, $n\in {\mathbb N},$ then the random vector $Z_{2,W}/\sqrt{Y}$ has $T_2(m,W)$ distribution; if the random variable $N_n$ has ${ Geo}(1/(n+1))$ distribution, $n\in {\mathbb N},$ then the random vector $Z_{2,W}/\sqrt{Y}$ has $T_2(2,W)$ distribution.

2. The set $A^*_n,$ as $n\rightarrow \infty ,$ tends to the ellipse $A'_0=\{\textbf{v}\in \mathbb {R}^2: \textbf{v}W^{-1}\textbf{v}^T\leqslant 2\mathcal{G}^{-1}(\alpha )\},$ and

$$\begin{aligned} \lim _{n\rightarrow \infty } \lambda _2(A^*_n)=\lambda _2(A'_0)=2\pi \sqrt{\det W}\mathcal{G}^{-1}(\alpha ). \end{aligned}$$

Proof

The proof of the first part is based on Theorem 1; the limit distribution of $\sqrt{n}T_{N_n}(\textbf{y})$ is established similarly to the limit distribution of the corresponding statistic for samples of non-random size (see Corollary 1 of Zaigraev and Alama-Bućko (2018)). As to the proof of the second part, note that the function h, defined by (5), can be written as

$$\begin{aligned} h(\textbf{v})&=\frac{1}{2\pi \sqrt{\det W}}\int ^{\infty }_0 \exp \left( -\frac{u}{2}\textbf{v}W^{-1}\textbf{v}^T\right) ug(u)du\\ &=-\frac{1}{2\pi \sqrt{\det W}}\mathcal{G}'\left( \frac{\textbf{v}W^{-1}\textbf{v}^T}{2}\right) ,\quad \textbf{v}\in \mathbb {R}^2. \end{aligned}$$

The set $A^*_n,$ as $n\rightarrow \infty ,$ approximates the set $A'_0=\{\textbf{v}\in \mathbb {R}^2: \ \ h(\textbf{v})\geqslant z'_{\alpha }\},$ where $z'_{\alpha }$ is defined by the equality

$$\begin{aligned} \int _{A'_0}h(\textbf{v})\textrm{d}{} \textbf{v}=1-\alpha . \end{aligned}$$

(6)

Since

$$\begin{aligned} \mathcal{G}''(s)=\int ^{\infty }_0e^{-us}u^2g(u)du>0\quad \forall s>0, \end{aligned}$$

the function $-\mathcal{G}'$ is monotonically decreasing, so that

$$\begin{aligned} h(\textbf{v})\geqslant z'_{\alpha }\ \Longleftrightarrow \ \frac{\textbf{v}W^{-1}{} \textbf{v}^T}{2}\leqslant a \end{aligned}$$

(7)

for some $a>0$ and, moreover, $\lambda _2(A'_0)=2a\pi \sqrt{\textrm{det} W}.$

Changing the variables in (6): $v_1=r\cos \beta , v_2=r\sin \beta , r\geqslant 0,\beta \in [0,2\pi ),$ we obtain

$$\begin{aligned} 1-\alpha &=\int _{\{\textbf{v}: h(\textbf{v})\geqslant z'_{\alpha }\}}h(\textbf{v})d\textbf{v}\\ &=\frac{1}{2\pi \sqrt{\det W}}\int ^{2\pi }_0d\beta \int ^{\frac{\sqrt{2a}}{\sigma (\beta )}}_0rdr\int ^{\infty }_0\exp \left( -\frac{u}{2}r^2\sigma ^2(\beta )\right) ug(u)du\\ &=\frac{2}{2\pi \sqrt{\det W}}\int ^{2\pi }_0\frac{d\beta }{\sigma ^2(\beta )}\int ^{\sqrt{a}}_0zdz\int ^{\infty }_0e^{-uz^2}ug(u)du, \end{aligned}$$

where $\sigma ^2(\beta )=[\cos \beta \ \sin \beta ]W^{-1}[\cos \beta \ \sin \beta ]^T.$ Since

$$\begin{aligned} \int ^{\sqrt{a}}_02uze^{-uz^2}dz=1-e^{-ua}, \end{aligned}$$

we obtain

$$\begin{aligned} 1-\alpha =\frac{1}{2\pi \sqrt{\det W}}\int ^{2\pi }_0\frac{d\beta }{\sigma ^2(\beta )}\int ^{\infty }_0(1-e^{-ua})g(u)du. \end{aligned}$$

(8)

From (7) it follows that taking $z'_{\alpha }=0$ we get $a=+\infty ,$ and the same reasoning as above leads us to the formula

$$\begin{aligned} \frac{1}{2\pi \sqrt{\det W}}\int ^{2\pi }_0\frac{d\beta }{\sigma ^2(\beta )}=1. \end{aligned}$$

(9)

Substituting (9) in (8), we have

$$\begin{aligned} 1-\alpha =1-\int ^{\infty }_0e^{-ua}g(u)du=1-\mathcal{G}(a)\ \Longleftrightarrow \ a=\mathcal{G}^{-1}(\alpha ) \end{aligned}$$

and $\lambda _2(A'_0)=2\pi \sqrt{\det W}\mathcal{G}^{-1}(\alpha ).$ $\square $
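The identity (9), on which the last step of the proof rests, is easy to confirm numerically for a concrete matrix. The sketch below uses the midpoint rule on $[0,2\pi );$ the matrix `W_demo`, the number of nodes and the function name are illustrative assumptions.

```python
import math


def identity_nine(W, steps=2000):
    """Midpoint-rule value of (1 / (2 pi sqrt(det W))) * int_0^{2 pi} d beta / sigma^2(beta)."""
    (w11, w12), (_, w22) = W
    det = w11 * w22 - w12 * w12
    total = 0.0
    for i in range(steps):
        beta = 2.0 * math.pi * (i + 0.5) / steps
        c, s = math.cos(beta), math.sin(beta)
        # sigma^2(beta) = [c s] W^{-1} [c s]^T for a symmetric 2x2 matrix W
        sigma2 = (w22 * c * c - 2.0 * w12 * c * s + w11 * s * s) / det
        total += 1.0 / sigma2
    integral = total * 2.0 * math.pi / steps
    return integral / (2.0 * math.pi * math.sqrt(det))


W_demo = [[2.0, 0.5], [0.5, 1.0]]   # any symmetric positive definite matrix will do
print(identity_nine(W_demo))         # should be close to 1
```

Because the integrand is smooth and periodic, a modest number of equally spaced nodes already gives high accuracy.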

Remark 1

The value $\lambda _2(A'_0)=2\pi \sqrt{\det W}\mathcal{G}^{-1}(\alpha )$ from Theorem 2 is larger than $\lambda _2(A_0)=2\pi \sqrt{\det W}\ln \alpha ^{-1}$ from Sect. 2. Indeed, since $\mathbb {E}Y=1,$ from Jensen’s inequality it follows that $\mathbb {E}\alpha ^Y>\alpha ^{\mathbb {E}Y}=\alpha ,$ i.e.

$$\begin{aligned} \int ^{\infty }_0\alpha ^ug(u)du=\mathcal{G}(\ln \alpha ^{-1})>\alpha \ \Longleftrightarrow \ \ln \alpha ^{-1}<\mathcal{G}^{-1}(\alpha ). \end{aligned}$$

Example (continuation)

If the random variable $N_n$ has $\mathcal{N}\mathcal{B}(m/2, m/(m+2n))$ distribution, $n\in {\mathbb N},$ then the function h has the form (see (4) for $k=2$):

$$\begin{aligned} h(\textbf{v})=\frac{\Gamma (m/2+1)}{\Gamma (m/2)m\pi \sqrt{\textrm{det} W}}\left( 1+\frac{1}{m}\textbf{v}W^{-1}{} \textbf{v}^T\right) ^ {-m/2-1}\!\!,\ \textbf{v}\in \mathbb {R}^2. \end{aligned}$$

In this case the function g is defined by (3) and

$$\begin{aligned} \mathcal{G}(s)=\int ^{\infty }_0e^{-su}g(u)du=\Bigl (1+\frac{2s}{m}\Bigr )^{-m/2},\quad s>0, \end{aligned}$$

while

$$\begin{aligned} \mathcal{G}^{-1}(\alpha )=\frac{m}{2}\left( \alpha ^{-2/m}-1\right) ,\quad \alpha \in (0,1). \end{aligned}$$

Therefore, $A'_0=\{\textbf{v}\in \mathbb {R}^2: \textbf{v}W^{-1}{} \textbf{v}^T\leqslant m\left( \alpha ^{-2/m}-1\right) \},$ and

$$\begin{aligned} \lim _{n\rightarrow \infty } \lambda _2(A^*_n)=\lambda _2(A'_0)=\pi m\left( \alpha ^{-2/m}-1\right) \sqrt{\det W}. \end{aligned}$$

Evidently, the inequality $\lambda _2(A'_0)>\lambda _2(A_0)$ is equivalent to

$$\begin{aligned} m(\alpha ^{-2/m}-1)>2\ln {\alpha }^{-1}\ \ \forall m>0\ \ \forall \alpha \in (0,1). \end{aligned}$$

The last inequality is a consequence of the known inequality $x-1>\ln {x}$ $\forall x>1,$ applied to $x=\alpha ^{-2/m}.$ Moreover, note that $m(\alpha ^{-2/m}-1)\rightarrow 2\ln {\alpha }^{-1}$, as $m\rightarrow \infty .$
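The size of the effect can be seen from the ratio $\lambda _2(A'_0)/\lambda _2(A_0)=m(\alpha ^{-2/m}-1)/(2\ln \alpha ^{-1}),$ which does not depend on $W.$ The short sketch below tabulates it for several values of $m;$ the function name and the choice $\alpha =0.05$ are illustrative assumptions.

```python
import math


def area_ratio(m, alpha):
    """Ratio lambda_2(A'_0) / lambda_2(A_0) = m (alpha^{-2/m} - 1) / (2 ln(1/alpha))."""
    return m * (alpha ** (-2.0 / m) - 1.0) / (2.0 * math.log(1.0 / alpha))


for m in (1, 2, 5, 20, 100, 1000):
    print(f"m = {m:5d}: ratio = {area_ratio(m, 0.05):.4f}")
```

In the geometric case $m=2$ the ratio reduces to $(\alpha ^{-1}-1)/\ln \alpha ^{-1},$ and the printed values decrease towards 1 as $m$ grows, in agreement with the limit noted above.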

Remark 2

If the distribution of Y is degenerate and $N_n/n\rightarrow 1$ in probability, as $n\rightarrow \infty ,$ then the results for samples of random size do not differ from those obtained when the sample size is non-random (Sect. 2).

References

Alama-Bućko M, Nagaev AV, Zaigraev A (2006) Asymptotic analysis of minimum volume confidence regions for location-scale families. Appl Math (Warszawa) 33:1–20
Bening VE, Korolev VYu (2005) On an application of the Student distribution in the theory of probability and mathematical statistics. Theory Probab Appl 49:377–391
Czarnowska A, Nagaev AV (2001) Confidence regions of minimal area for the scale-location parameter and their applications. Appl Math (Warszawa) 28:125–142
David HA, Nagaraja HN (2003) Order statistics. Wiley, New York, p 458
Einmahl JHJ, Mason DM (1992) Generalized quantile processes. Ann Stat 20:1062–1078
Korolev VYu (2000) Asymptotic properties of sample quantiles constructed from samples with random sizes. Theory Probab Appl 44:394–399
Masoom Ali M, Umbach D (1998) Optimal linear inference using selected order statistics in location-scale models. In: Handbook of statistics, vol 17, North-Holland, Amsterdam, pp 183–213
Mosteller F (1946) On some useful inefficient statistics. Ann Math Stat 17:377–408
Šidák Z (1967) Rectangular confidence regions for the means of multivariate normal distributions. J Am Stat Assoc 62(318):626–633
Zaigraev A, Alama-Bućko M (2013) On optimal choice of order statistics in large samples for the construction of confidence regions for the location and scale. Metrika 76:577–593
Zaigraev A, Alama-Bućko M (2017) Asymptotics of the optimal confidence region for shift and scale, based on two order statistics. In: Statistical methods of estimation and testing hypotheses, in Russian, Perm University, Perm, pp 49–65 (2006); translated in: J Math Sci 220(6):763–776
Zaigraev A, Alama-Bućko M (2018) Optimal choice of order statistics under confidence region estimation in case of large samples. Metrika 81:283–305

Acknowledgements

The author is grateful to the referee for useful corrections improving the paper.

Author information

Authors and Affiliations

Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Chopin Street, 12/18, 87-100, Toruń, Poland
Alexander Zaigraev

Corresponding author

Correspondence to Alexander Zaigraev.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zaigraev, A. A note on asymptotics of the risk function under confidence region estimation in case of large samples of random size. Metrika 87, 201–209 (2024). https://doi.org/10.1007/s00184-023-00910-8

Received: 10 October 2022
Accepted: 10 May 2023
Published: 24 May 2023
Issue Date: February 2024
DOI: https://doi.org/10.1007/s00184-023-00910-8
