Quotient of information matrices in comparison of linear experiments for quadratic estimation

Abstract The ordering of normal linear experiments with respect to quadratic estimation, introduced by Stępniak in [Ann. Inst. Statist. Math. A 49 (1997), 569-584], is extended here to experiments involving nuisance parameters. Typical experiments of this kind are induced by allocations of treatments in blocks. Our main tool, called the quotient of information matrices, may be of interest in its own right. It is known that any orthogonal allocation of treatments in blocks is optimal with respect to linear estimation of all treatment contrasts. We show that such an allocation is, however, not optimal for quadratic estimation.


Introduction
Any statistical experiment may be perceived as an information channel transforming a deterministic quantity (a parameter) into a random one (an observation) according to a design chosen by the experimenter. The primary aim of the statistician is to recover information about the parameter from the observation. However, the efficiency of this process depends not only on the statistical rule but also on the experimental design. Such a design, which may be identified with the experiment, is represented by a probabilistic structure.
When observations have normal distribution the entire statistical analysis is based on their linear and quadratic forms. Thus the properties of such forms should be taken into account in any reasonable choice of statistical experiment.
Comparison of linear experiments by linear forms has been intensively studied in the statistical literature. It is well known (see, for instance, [1][2][3][4][5][6]) that almost all criteria used for comparison of two linear experiments with respect to linear estimation reduce to the Loewner order between their information matrices, say M₁ and M₂. However, the comparison of normal linear experiments with respect to quadratic estimation is still at an initial stage, and we are looking for suitable tools.
It was revealed in Stępniak [7] that the relation "to be at least as good with respect to quadratic estimation" requires some knowledge of the matrix M₁M₂⁺, where the symbol ⁺ denotes the Moore-Penrose generalized inverse. We shall refer to this matrix as the quotient of M₁ by M₂. Properties of such a quotient may be of interest in themselves. It appears that the Loewner order may be expressed in terms of the quotient, but not vice versa.
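As a quick numerical illustration (ours, with arbitrary illustrative matrices, not taken from the paper), a quotient of one psd matrix by another can be computed with NumPy's Moore-Penrose pseudoinverse; note that such a quotient is in general not symmetric even though both arguments are:

```python
import numpy as np

# Two positive semidefinite "information" matrices (illustrative values).
M1 = np.array([[2.0, 1.0], [1.0, 2.0]])
M2 = np.array([[1.0, 0.0], [0.0, 3.0]])

# Quotient of M1 by M2: M1 times the Moore-Penrose inverse of M2.
Q = M1 @ np.linalg.pinv(M2)

# The quotient is generally NOT symmetric, even for symmetric psd arguments.
print(np.allclose(Q, Q.T))  # -> False for these matrices
```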
In this note we use the quotient of positive semidefinite matrices as the main tool in ordering normal linear experiments with respect to quadratic estimation. The orderings of linear experiments with respect to linear and to quadratic estimation are extended here to experiments involving nuisance parameters. Typical experiments of this kind are induced by allocations of treatments in blocks.
It is well known (see [8]) that any orthogonal allocation of treatments in blocks is optimal with respect to linear estimation of all treatment contrasts. We show that this allocation is, however, not optimal for quadratic estimation.

Definitions and known results
In this paper the standard vector-matrix notation is used. All vectors and matrices considered here have real entries. The space of all n × 1 vectors is denoted by Rⁿ. For any matrix M the symbols Mᵀ, R(M), N(M) and r(M) denote, respectively, its transpose, range (column space), kernel (null space) and rank. The symbol P_M stands for the orthogonal projector onto R(M), i.e. the square matrix P satisfying the conditions Px = x for x ∈ R(M) and Px = 0 for x ∈ N(Mᵀ). Moreover, if M is square then tr(M) denotes its trace, and the symbol M ≥ 0 means that M is symmetric and positive semidefinite (psd, for short).
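The defining properties of the projector P_M can be sketched numerically (our illustration; the matrix values are arbitrary), using P_M = M M⁺:

```python
import numpy as np

# A rank-deficient matrix M; P_M projects onto its column space R(M).
M = np.array([[1.0, 2.0],
              [0.0, 0.0],
              [1.0, 2.0]])  # rank 1

P = M @ np.linalg.pinv(M)   # orthogonal projector onto R(M)

x_in = M @ np.array([3.0, -1.0])     # a vector in R(M)
x_out = np.array([1.0, 0.0, -1.0])   # in N(M^T): M^T x_out = 0

print(np.allclose(P @ x_in, x_in))   # Px = x on R(M)
print(np.allclose(P @ x_out, 0.0))   # Px = 0 on N(M^T)
```

As expected, P is also symmetric and idempotent, which characterizes orthogonal projectors.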
Let x be a random vector with expectation E(x) = Aα and variance-covariance matrix σ²V, where A and V are known matrices while α = (α₁, ..., α_p)ᵀ and σ² > 0 are unknown parameters. In this situation we shall say that x is subject to the linear experiment L(Aα, σ²V). If V = I, then we say that the experiment is standard. If x is normally distributed then instead of L(Aα, σ²V) we shall use the symbol N(Aα, σ²V). Now let us consider two experiments L₁ = L(A₁α, σ²V) and L₂ = L(A₂α, σ²W) with the same parameters and with observation vectors x ∈ Rᵐ and y ∈ Rⁿ, respectively.
Definition 2.1 ([9]). Experiment L₁ is said to be at least as good as L₂ with respect to linear estimation [notation: L₁ ⊵ L₂] if for any parametric function ψ = cᵀα and for any estimator bᵀy there exists an estimator aᵀx with uniformly not greater squared risk. If L₁ ⊵ L₂ and L₂ ⊵ L₁ then we say that the experiments are equivalent for linear estimation.
The relation ⊵ may be expressed in terms of linear forms (see [8,9]). Namely, L₁ ⊵ L₂ if and only if for any b ∈ Rⁿ there exists a ∈ Rᵐ such that

E(aᵀx) = E(bᵀy) and var(aᵀx) ≤ var(bᵀy)    (1)

for all α and σ². It is worth noting that the relation L(A₁α, σ²V) ⊵ L(A₂α, σ²W) does not depend on whether σ² is known or not. Thus L₁ ⊵ L₂ if and only if L(A₁α, V) ⊵ L(A₂α, W).
Moreover, under the normality assumption, condition (1) may be expressed in the form: for any parametric function ψ and for any b ∈ Rⁿ there exists a ∈ Rᵐ such that aᵀx − ψ is stochastically not greater than bᵀy − ψ for all α and σ² (Sinha [10] and Stępniak [11]). Now consider two normal linear experiments N₁ = N(Aα, σ²V) and N₂ = N(Bα, σ²W) with observation vectors x ∈ Rᵐ and y ∈ Rⁿ. It is well known (cf. [12,13]) that such experiments are not comparable with respect to all possible statistical problems. Therefore we shall restrict our attention to quadratic estimation only.

Definition 2.2 ([7]). Experiment N₁ is said to be at least as good as N₂ with respect to quadratic estimation [notation: N₁ ⪰ N₂] if for any quadratic form yᵀGy there exists a quadratic form xᵀHx such that

E(xᵀHx) = E(yᵀGy) and var(xᵀHx) ≤ var(yᵀGy)
for all α and σ². If N₁ ⪰ N₂ and N₂ ⪰ N₁ then we say that the experiments are equivalent for quadratic estimation.
In the last definition the quadratic forms xᵀHx and yᵀGy play the role of potential unbiased estimators for parametric functions of the type ϕ(α, σ²) = cσ² + αᵀCα. It is known that the mean squared error of any linearly estimable parametric function ψ = ψ(α) in the experiment N₁ (or in N₂) has such a form (Stępniak [14]). The orderings ⊵ and ⪰ are invariant with respect to nonsingular linear transformations of both the parameter α and the observation vectors x and y ([7], Lemmas 2.1 and 2.2).
The main tool in the comparison of standard linear experiments is the information matrix M, defined as the Fisher information matrix AᵀA corresponding to the experiment N(Aα, I).
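A small sketch (ours; the design matrices are illustrative) of comparing two standard experiments through their information matrices AᵀA, where the Loewner order M₁ ≥ M₂ is checked via the eigenvalues of the difference:

```python
import numpy as np

# Design matrices of two standard experiments (illustrative values):
# A1 replicates the rows of A2, so it carries at least as much information.
A2 = np.array([[1.0, 0.0],
               [0.0, 1.0]])
A1 = np.vstack([A2, A2])     # each row observed twice

M1 = A1.T @ A1               # information matrix of the first experiment
M2 = A2.T @ A2               # information matrix of the second

# Loewner order M1 >= M2: the difference must be positive semidefinite,
# i.e. all eigenvalues of M1 - M2 are nonnegative.
diff_eigs = np.linalg.eigvalsh(M1 - M2)
print(np.all(diff_eigs >= -1e-12))   # -> True
```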
The relation ⊵ may be characterized by the following theorem.

Quotient of matrices in comparison of experiments
For given psd matrices T and U of the same order we shall refer to the expressions Q₁ = TU⁺, Q₂ = U⁺T, Q₃ = (U⁺)^{1/2}T(U⁺)^{1/2} and Q₄ = T^{1/2}U⁺T^{1/2} as versions of the quotient of T by U. We note that only Q₃ and Q₄ are always symmetric.
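A numerical sketch (ours, with illustrative matrices) of quotient versions of the type TU⁺, U⁺T and the symmetrized forms (U⁺)^{1/2}T(U⁺)^{1/2} and T^{1/2}U⁺T^{1/2}, confirming that only the latter two are symmetric in general:

```python
import numpy as np

def psd_sqrt(S):
    """Symmetric square root of a positive semidefinite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

T = np.array([[2.0, 1.0], [1.0, 2.0]])
U = np.array([[1.0, 0.0], [0.0, 4.0]])
Up = np.linalg.pinv(U)       # Moore-Penrose inverse of U

Q1 = T @ Up                              # T U^+
Q2 = Up @ T                              # U^+ T
Q3 = psd_sqrt(Up) @ T @ psd_sqrt(Up)     # (U^+)^(1/2) T (U^+)^(1/2)
Q4 = psd_sqrt(T) @ Up @ psd_sqrt(T)      # T^(1/2) U^+ T^(1/2)

# The symmetrized versions are symmetric by construction ...
print(np.allclose(Q3, Q3.T), np.allclose(Q4, Q4.T))  # -> True True
# ... while the one-sided products generally are not.
print(np.allclose(Q1, Q1.T), np.allclose(Q2, Q2.T))  # -> False False
```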
We shall start from basic properties of the quotients. Consider now a linear experiment with observation vector x such that E(x) = Aα + Bβ and var(x) = σ²I, with unknown parameters α ∈ Rᵖ, β ∈ Rᵏ and σ² > 0, where α (or α and σ²) is of interest while β is treated as a nuisance parameter. Such an experiment will be denoted by L(Aα + Bβ, σ²I) or (under the normality assumption) by N(Aα + Bβ, σ²I). We shall say that a statistic t = t(x) is invariant (with respect to β) if its first two moments exist and do not depend on β. It is evident that a linear form aᵀx is invariant in the experiment L(Aα + Bβ, σ²I) if and only if it depends on x only through (I − P_B)x. The same condition for invariance of a quadratic form xᵀHx follows from the well-known formula for the variance of quadratic forms in normal variables (cf. [16,17]).
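For convenience we recall the well-known formulas for the first two moments of a quadratic form in normal variables (standard facts, cf. [16,17]): for x ∼ N(μ, Σ) and symmetric H,

```latex
E\bigl(x^{\top} H x\bigr) = \operatorname{tr}(H\Sigma) + \mu^{\top} H \mu,
\qquad
\operatorname{var}\bigl(x^{\top} H x\bigr)
  = 2\operatorname{tr}\!\bigl((H\Sigma)^{2}\bigr) + 4\,\mu^{\top} H \Sigma H \mu .
```

With Σ = σ²I and μ = Aα + Bβ, requiring that both moments be free of β for all α and β leads to the condition HB = 0, i.e. xᵀHx depends on x only through (I − P_B)x.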

Definition 3.4. We shall say that L₁ is at least as good as L₂ w.r.t. invariant linear estimation if for any invariant statistic bᵀy there exists an invariant aᵀx such that E(aᵀx) = E(bᵀy) and var(aᵀx) ≤ var(bᵀy) for all α and σ². Similarly, we shall say that N₁ is at least as good as N₂ w.r.t. invariant quadratic estimation if for any invariant statistic yᵀGy there exists an invariant xᵀHx such that E(xᵀHx) = E(yᵀGy) and var(xᵀHx) ≤ var(yᵀGy) for all α and σ².
First we shall reduce the comparison of linear experiments with a nuisance parameter β to the same problem for usual linear experiments. To this aim we need the invariance condition in a more explicit form. Let x be the observation vector in a linear experiment L(Aα + Bβ, σ²I) or N(Aα + Bβ, σ²I), and let b₁, ..., b_{n−r}, where r = r(B), be an orthonormal basis of N(Bᵀ). Then I − P_B may be presented in the form I − P_B = KKᵀ, where K = (b₁, ..., b_{n−r}). For convenience the matrices M̃ᵢ = ÃᵢᵀÃᵢ, i = 1, 2, where Ãᵢ = KᵀAᵢ, will be called the reduced information matrices. We note that M̃ᵢ = Aᵢᵀ(I − P_B)Aᵢ. As a direct consequence of Theorems 2.3 and 2.4 we get the following lemmas.

The reduced information matrix (4), also called the C-matrix (see [18][19][20]), may be presented in the form C = diag(t₁, ..., t_v) − N diag(b₁⁻¹, ..., b_k⁻¹)Nᵀ, where N = (n_ij) is the incidence matrix of the design. One can verify that, for the orthogonal design, N diag(b₁⁻¹, ..., b_k⁻¹)Nᵀ = n⁻¹ttᵀ. Denote by D = D(t; b) the class of all possible allocations of v treatments with replications t₁, ..., t_v in k blocks of sizes b₁, ..., b_k, for v, k ≥ 2. Such a class may or may not contain an orthogonal design. If it does, then by Stępniak [8] this design is optimal in D w.r.t. invariant linear estimation, i.e. it is at least as good as any other design in the class.
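A numerical sketch (ours; a deliberately small balanced layout) of the C-matrix and of the orthogonality identity for the incidence matrix:

```python
import numpy as np

# Orthogonal allocation: v = 2 treatments in k = 2 blocks, n = 8 plots,
# replications t = (4, 4), block sizes b = (4, 4); n_ij = t_i * b_j / n.
t = np.array([4.0, 4.0])
b = np.array([4.0, 4.0])
n = t.sum()
N = np.outer(t, b) / n          # incidence matrix of the orthogonal design

# C-matrix: C = diag(t) - N diag(1/b) N^T.
C = np.diag(t) - N @ np.diag(1.0 / b) @ N.T

# For the orthogonal design, N diag(1/b) N^T equals t t^T / n.
print(np.allclose(N @ np.diag(1.0 / b) @ N.T, np.outer(t, t) / n))  # -> True
# Row sums of C are zero, reflecting that only contrasts are estimable.
print(np.allclose(C.sum(axis=1), 0.0))                              # -> True
```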
It is natural to ask whether the orthogonal design is also optimal w.r.t. invariant quadratic estimation. In the light of the results presented in Section 3 we are strongly convinced that the answer is negative, but for formal reasons we provide a rigorous proof of this fact. By Corollary 3.8 we only need to show that for any incidence matrix N = (n_ij) corresponding to the orthogonal design there exists an incidence matrix M = (m_ij) for which the condition of that corollary fails. By the way, we have demonstrated that, with reference to the orthogonal block design, the optimality w.r.t. linear estimation may be strengthened in the sense that the words "at least as good" may be replaced by "better".