On the Nile Problem by Sir Ronald Fisher

The Nile problem by Ronald Fisher may be interpreted as the problem of making statistical inference for a special curved exponential family when the minimal sufficient statistic is incomplete. The problem itself and its versions for general curved exponential families pose a mathematical-statistical challenge: studying the subalgebras of ancillary statistics within the $\sigma$-algebra of the (incomplete) minimal sufficient statistics and closely related questions of the structure of UMVUEs. In this paper a new method is developed that, in particular, proves that in the classical Nile problem no statistic subject to mild natural conditions is a UMVUE. The method almost solves an old problem of the existence of UMVUEs. The method is purely statistical (vs. analytical) and works for any family possessing an ancillary statistic. It complements an analytical method that uses only the first order ancillarity (and thus works when the existence of ancillary subalgebras is an open problem) and works for curved exponential families with polynomial constraints on the canonical parameters of which the Nile problem is a special case.


Introduction
The so-called Nile problem formulated by Fisher gave rise to interesting mathematical-statistical problems. The original statement of the problem, in Fisher's unique style, is in Fisher [11] (it is cited verbatim in Fisher [12], p. 122): "The agricultural land of a pre-dynastic Egyptian village is of unequal fertility. Given the height to which the Nile will rise, the fertility of every portion of it is known with exactitude, but the height of the flood affects different parts of the territory unequally. It is required to divide the area, between the several households of the village, so that the yields of the lots assigned to each shall be in pre-determined proportion, whatever may be the height to which the river rises."
Fisher himself ([12], p. 169) specified the problem as making statistical inference for a population with density
$$f(x, y; \theta) = e^{-(x\theta + y/\theta)}, \quad x > 0,\ y > 0, \tag{1}$$
with $\theta > 0$ as a parameter. If $((X_1, Y_1), \ldots, (X_n, Y_n))$ is a sample from population (1), the pair $(\bar X, \bar Y)$ of the sample means is an incomplete (minimal) sufficient statistic for $\theta$. Due to incompleteness, there might exist (and in this setting there actually exists) an ancillary statistic (i.e., a statistic whose distribution does not depend on the parameter) in the $\sigma$-algebra $\sigma(\bar X, \bar Y)$ generated by $(\bar X, \bar Y)$.
Due to incompleteness of the minimal sufficient statistic, the existence and construction of UMVUEs do not follow from the Rao-Blackwell and Lehmann-Scheffé theorems and become a nontrivial problem which requires a new approach. Nayak and Sinha [35] mentioned the existence of UMVUEs in the models (1) and (3) (see below) as an open problem.
Another interpretation of the Nile problem, due to Flatto and Shepp [13], is statistical inference on the correlation coefficient $\rho$ of a bivariate Gaussian vector with density
$$\varphi(x, y; \rho) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\}. \tag{2}$$
The minimal sufficient statistic for $\rho$ is $(X^2 + Y^2, XY)$. Note that the pair $(X^2 + Y^2, XY)$ is in one-to-one correspondence with the quadruple $((X, Y), (Y, X), (-X, -Y), (-Y, -X))$. If $(X_1, Y_1), \ldots, (X_n, Y_n)$ is a sample from (2), the minimal sufficient statistic for $\rho$ is
$$\left(\sum_{i=1}^n (X_i^2 + Y_i^2),\ \sum_{i=1}^n X_i Y_i\right).$$
The minimal sufficient statistic is again incomplete (even in the case $n = 1$), and the problem of the existence of an ancillary statistic in the $\sigma$-algebra it generates is open.

The following observation by Flatto and Shepp [13] solves a related, but different problem. Let $A$ be a set in $\mathbb{R}^2$ with finite Lebesgue measure, $\lambda(A) < \infty$. If $A$ is ancillary, i.e.,
$$\iint_A \varphi(x, y; \rho)\,dx\,dy = c,$$
a constant, then $\lambda(A) = 0$ (and thus $c = 0$). The condition $\lambda(A) < \infty$, which is essential in the proof, seems artificial from the statistical point of view. However, a first order ancillary statistic $H(X, Y)$ (i.e., such that $E_\rho H(X, Y) = \text{const}$) measurable with respect to $\sigma(X^2 + Y^2, XY)$ exists. Indeed, set $H(x, y) = \ldots$; then $E_\rho H(X, Y) = \text{const}$ since the marginal distributions of $X$ and $Y$ do not involve $\rho$. Though $\iint H(x, y)\,dx\,dy = \infty$, the finiteness of this integral is not required in the definition of first order ancillarity.

From general results on curved exponential families in Kagan and Palamodov [20,21] (see also the supplement in Linnik [34], and for another proof see Unni [39]) it follows that the only UMVUEs from a sample $((X_1, Y_1), \ldots, (X_n, Y_n))$ from (2) are constants.
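To make first order ancillarity concrete, here is one simple statistic with this property. It is offered only as an illustration under the density (2) and is not necessarily the $H$ used in the argument above. Since both marginals of (2) are standard normal, the expectation below does not involve $\rho$, and the statistic is a function of the minimal sufficient statistic:

```latex
\begin{align*}
H_0(x, y) &= x^2 + y^2, \\
E_\rho H_0(X, Y) &= E_\rho X^2 + E_\rho Y^2 = 1 + 1 = 2
  \quad \text{for all } \rho \in (-1, 1), \\
\iint_{\mathbb{R}^2} H_0(x, y)\,dx\,dy &= \infty .
\end{align*}
```

The distribution of $X^2 + Y^2$ itself does depend on $\rho$ (its variance is $4(1 + \rho^2)$), so $H_0$ is first order ancillary without being ancillary.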
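Note that from the analytical point of view, inference problems for a sample $(X_1, \ldots, X_n)$ from a normal population $N(\theta, c^2\theta^2)$ with density
$$f(x; \theta) = \frac{1}{\sqrt{2\pi}\,c\,\theta} \exp\left\{-\frac{(x - \theta)^2}{2c^2\theta^2}\right\}, \tag{3}$$
with $\theta > 0$ as a parameter and $c > 0$ known, are very close to the Nile problem.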
The assumption that the standard deviation is proportional to the mean seems reasonable in the setup of direct measurements. These problems were studied in a number of papers (see, e.g., Khan [30], Gleser and Healy [15], Hinkley [17]). The minimal sufficient statistic for $\theta$ is the pair $(\bar X, S)$ of the sample mean and standard deviation. The sufficient statistic is incomplete and there exists a convenient ancillary statistic in $\sigma(\bar X, S)$. Combining this with results from Rao [38] on the structure of UMVUEs and Kagan [19], Barra [3] and Bondesson [6] on sufficiency, we prove by purely statistical tools that the only UMVUEs are constants.

The statistical method works for any family $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$ of distributions on $(\mathcal{X}, \mathcal{A})$ possessing an ancillary subalgebra $\mathcal{B}$, i.e., such that $P_\theta(B) = \text{const}$ in $\theta$ for all $B \in \mathcal{B}$. An analytical proof using general results for curved exponential families can be found in Kagan and Palamodov [20,21,22] and the dissertation of Unni [39]. In Section 6 we discuss some examples that cannot be treated by the analytical method but yield to the statistical method.
The following observation is due to I. Pinelis (private communication). Let $X$ be distributed according to (3). Then the statistic $\mathbb{1}_{\{X > 0\}}$ is ancillary, so that $X$ is incomplete. He conjectured that if the parameter space is $\mathbb{R} \setminus \{0\}$, then $X$ is complete. If so, it is an interesting phenomenon.

Extrapolating from the above three different interpretations of the Nile problem, the following problem seems to be of general interest. Let $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$ be a family of probability distributions on a measurable space $(\mathcal{X}, \mathcal{A})$, and let $T : (\mathcal{X}, \mathcal{A}) \to (\mathcal{T}, \mathcal{C})$ be an incomplete sufficient statistic for $\theta$. Set $\tilde{\mathcal{A}} = T^{-1}\mathcal{C}$. Describe the $\tilde{\mathcal{A}}$-measurable ancillary statistics (i.e., ancillary statistics that are functions of $T$), if they exist.
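The ancillarity of the indicator can be checked directly. The sketch below assumes that (3) is the $N(\theta, c^2\theta^2)$ density with $\theta > 0$, as in the direct-measurement setup; the function name `prob_positive` and the value of `c` are hypothetical choices for illustration. It evaluates $P_\theta(X > 0)$ for several $\theta$ and compares the result with $\Phi(1/c)$.

```python
from statistics import NormalDist


def prob_positive(theta: float, c: float) -> float:
    """P(X > 0) for X ~ N(theta, (c*theta)^2) with theta > 0."""
    return 1.0 - NormalDist(mu=theta, sigma=c * theta).cdf(0.0)


c = 0.5  # hypothetical known coefficient of variation
thetas = (0.1, 1.0, 7.0, 250.0)
probs = [prob_positive(theta, c) for theta in thetas]

# Standardizing gives P(X > 0) = Phi(1/c), which is free of theta.
reference = NormalDist().cdf(1.0 / c)

for theta, p in zip(thetas, probs):
    print(f"theta = {theta:7.1f}   P(X > 0) = {p:.12f}")
print(f"Phi(1/c)            = {reference:.12f}")
```

Since the indicator is Bernoulli with a success probability that does not depend on $\theta$, its whole distribution is free of $\theta$.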

Sufficiency, Ancillarity and UMVUEs
Let $\mathcal{P} = \{P_\theta, \theta \in \Theta\}$ be a family of probability distributions, parameterized by a general parameter $\theta$, of a random element $X$ taking values in a measurable space $(\mathcal{X}, \mathcal{A})$. A subalgebra $\mathcal{B} \subset \mathcal{A}$ is called ancillary if $P_\theta(B) = \text{const}$ in $\theta$ for all $B \in \mathcal{B}$. A statistic $T(X)$ is called ancillary if the subalgebra it generates is ancillary. A statistic $T(X)$ taking values in $\mathbb{R}$ with $E_\theta |T(X)| < \infty$ is called first order ancillary if $E_\theta T(X) = \text{const}$ in $\theta$.

A well known theorem due to Basu [4,5] says that if $\mathcal{P}$ is a linked family, i.e., for any pair $\theta', \theta'' \in \Theta$ there exists a sequence $\theta_0, \theta_1, \ldots, \theta_n, \theta_{n+1}$ with $\theta_0 = \theta'$, $\theta_{n+1} = \theta''$ such that $P_{\theta_j}$, $P_{\theta_{j+1}}$ are not mutually singular (i.e., $P_{\theta_j}(A) = 1 \Rightarrow P_{\theta_{j+1}}(A) > 0$), then any subalgebra $\mathcal{B}$ which is $\mathcal{P}$-independent of a sufficient subalgebra $\tilde{\mathcal{A}}$ (i.e., $P_\theta(\tilde A \cap B) = P_\theta(\tilde A) P_\theta(B)$ for any $\tilde A \in \tilde{\mathcal{A}}$, $B \in \mathcal{B}$ and $\theta \in \Theta$) is ancillary. The straightforward converse of the Basu result is plainly false: there exist ancillary subalgebras within the algebra of sufficient statistics (see Examples 1, 2 below).
Note in passing that if $\mathcal{C}$ is a complete sufficient algebra, then any ancillary algebra is $\mathcal{P}$-independent of $\mathcal{C}$ (this result does not require $\mathcal{P}$ to be a linked family). In this case there is no nontrivial $\mathcal{C}$-measurable ancillary statistic.
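Example 1. Let $((X_1, Y_1), \ldots, (X_n, Y_n))$ be a sample from (1). The minimal sufficient statistic is $(\bar X, \bar Y)$, and one can easily see that $\bar X \bar Y$ is an ancillary statistic.

Example 2. Let $(X_1, \ldots, X_n)$ be a sample from $N(\theta, c^2\theta^2)$, i.e., from (3). The minimal sufficient statistic is $(\bar X, S)$, and one can check that the statistic $\bar X / S$ is ancillary.

However, if a subalgebra $\mathcal{C} \subset \mathcal{A}$ which is $\mathcal{P}$-independent of an ancillary subalgebra $\mathcal{B}$ is large enough, then $\mathcal{C}$ is sufficient for $\mathcal{P}$. This observation is due to Kagan [19] and, independently, Barra [3] and Bondesson [6].

Lemma 1. Suppose that a subalgebra $\mathcal{C}$ is $\mathcal{P}$-independent of an ancillary algebra $\mathcal{B}$ and together with $\mathcal{B}$ generates $\mathcal{A}$, i.e., $\mathcal{A}$ is the smallest $\sigma$-algebra that contains both $\mathcal{C}$ and $\mathcal{B}$, $\sigma(\mathcal{C}, \mathcal{B}) = \mathcal{A}$. Then $\mathcal{C}$ is sufficient for $\mathcal{P}$.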
For the sake of completeness, a short proof of Lemma 1 is given in the Appendix (the result was proved in Kagan [19], Barra [3] and Bondesson [6]).
Based on a paragraph in Fisher [12], p. 168, it is likely that Fisher's definition of an ancillary algebra $\mathcal{B}$ required the existence of a $\mathcal{P}$-independent complement $\mathcal{C}$, i.e., one with $\sigma(\mathcal{C}, \mathcal{B}) = \mathcal{A}$. If so, he knew that $\mathcal{C}$ is sufficient for $\mathcal{P}$.
Basu's [4,5] theorem was useful in the characterization of distributions by independence of statistics (see, e.g., Ferguson [9,10], Klebanov [29], Kagan [24] and an expository paper by Gather [14]). Here we want to demonstrate that combining Lemma 1 with Rao's result [38] on the structure of UMVUEs proves triviality of UMVUEs in the models (1) and (3). The proof, which is purely statistical, seems new and is of independent interest. An analytical method covering curved exponential families with polynomial constraints on the natural parameters was developed in Kagan and Palamodov [20,21] and simplified in Unni [39]. It is based on a result by Wijsman [40] on the existence of first order ancillary statistics for samples from curved exponential families with polynomial constraints on the natural parameters.
To state Rao's result, recall that if an observation $X \sim P_\theta$ with $\theta \in \Theta$ as a parameter, a statistic $g(X)$ with $E_\theta |g(X)|^2 < \infty$ is a UMVUE (of its expectation) if and only if $E_\theta\{g(X) U(X)\} = 0$, $\theta \in \Theta$, for every unbiased estimator of zero $U(X)$ with finite variance (unbiased estimators of zero are also called zero-mean statistics). Rao [38] observed that if a statistic $T = g(X)$ is a UMVUE, then, provided that $E_\theta |g(X) U(X)|^2 < \infty$, $T^2$ is also a UMVUE. Proceeding in the same way under the assumption $E_\theta |g(X)|^k < \infty$, $k = 1, 2, \ldots$, we observe that any polynomial in $T$ is a UMVUE. Assuming moreover that the polynomials in $T$ are complete in $L^2_\theta(T(X))$, the Hilbert space of functions $h(T)$ with $E_\theta |h(T)|^2 < \infty$, one gets that any statistic $h(T)$ with $E_\theta |h(T)|^2 < \infty$ is a UMVUE (actually, the UMVUE of $E_\theta h(T)$).
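In particular, if $S$ is an ancillary statistic, then for any bounded $\varphi$ the statistic $\varphi(S) - E\varphi(S)$ is zero-mean, so that $E_\theta\{h(T)\varphi(S)\} = E_\theta h(T) \cdot E\varphi(S)$; hence the $\sigma$-algebras $\sigma(T)$ and $\sigma(S)$ are independent for all $\theta \in \Theta$. If the pair $(T, S)$ determines the sample point $X$ or, equivalently, $\sigma(T, S) = \sigma(X) = \mathcal{A}$, then $T$ is sufficient for $\theta$ by virtue of Lemma 1.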
In the known examples (Lehmann and Scheffé [31], reproduced in Lehmann and Casella [32], p. 84, Bondesson [7], Kagan and Konikov [26]), the UMVUEs form a subalgebra E ⊂ A (actually, a subalgebra of the minimal sufficient subalgebra A) called the σ-algebra of UMVUEs, i.e., all E-measurable statistics with finite second moments and only they are UMVUEs.
As these examples demonstrate, the problem of describing the algebra of UMVUEs for a general family of distributions seems rather difficult. In these examples, the minimal sufficient statistic is trivial (i.e., coincides with A), while E is not.

Statistical Method and the Original Nile Problem
We shall start with a general result on UMVUEs. Let P = {P θ , θ ∈ Θ} be a family of distributions on (X, A), C ⊂ A an (incomplete) sufficient subalgebra, and B ⊂ C an ancillary subalgebra. In terms of statistics, X represents the data, T = T (X) is an incomplete (minimal) sufficient statistic generating C, and W = W (T ) is an ancillary statistic.
We present a new method for obtaining a strong necessary condition for a statistic to be UMVUE.
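Here are the conditions imposed on a statistic $g = g(T)$.
Condition 1. $E_\theta |g(T)|^k < \infty$ for all $k = 1, 2, \ldots$ and $\theta \in \Theta$.
Condition 2. For every $\theta \in \Theta$, the polynomials in $g(T)$ are dense in $L^2_\theta(g)$, the Hilbert space of functions $h(g(T))$ with $E_\theta |h(g(T))|^2 < \infty$.
Condition 3a. $\sigma(g(T))$ is a proper subalgebra of $\sigma(T)$.
Condition 3b. $\sigma(g(T), W(T)) = \sigma(T)$.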
Conditions 3a and 3b refer to the σ-algebras, one generated by g (T ) and the other by ancillary statistic W . Roughly speaking, Condition 3a means that g(T ) is not a one-to-one function, while the pair (g (T ) , W (T )) is. In real setups, an incomplete T is multidimensional, while g(T ) is a scalar valued statistic.
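Let $Q_\theta(u) = P_\theta\{g(T) \le u\}$ be the distribution function of $g(T)$. If for all $\theta \in \Theta$, $Q_\theta(u)$ is continuous in $u$ and the moment problem for $Q_\theta$ is determinate, i.e., $Q_\theta$ is the only distribution with the moments
$$\alpha_m(\theta) = \int u^m\,dQ_\theta(u) = E_\theta\{g^m(T)\}, \quad m = 0, 1, 2, \ldots,$$
then the polynomials in $u$ are dense in $L^2(Q_\theta)$ or, equivalently, polynomials in $g(T)$ are dense in $L^2_\theta(g)$. A sufficient condition for this is given by Carleman's classical criterion:
$$\sum_{m=1}^{\infty} \alpha_{2m}(\theta)^{-1/(2m)} = \infty.$$

Theorem 1. Let $g(T)$ be a statistic satisfying Conditions 1, 2, 3a, 3b. Then $g(T)$ is not a UMVUE.

Proof. Suppose that $g = g(T)$ is a UMVUE. Then for any zero-mean statistic $U$ with $E_\theta |U|^2 < \infty$,
$$E_\theta(gU) = 0, \quad \theta \in \Theta. \tag{8}$$
In particular, since $W$ is an ancillary statistic, (8) implies
$$E_\theta(g \mid W) = E_\theta(g), \quad \theta \in \Theta.$$
Turn now to $g^2$. For any bounded zero-mean statistic $U$, the statistic $U_1 = gU$ is, by virtue of (8) and Condition 1, also a zero-mean statistic with finite second moment, $E_\theta(U_1) = 0$, $E_\theta(|U_1|^2) < \infty$, $\theta \in \Theta$. Thus, from $g = g(T)$ being a UMVUE, it follows that $E_\theta(gU_1) = E_\theta(g^2 U) = 0$, $\theta \in \Theta$, implying
$$E_\theta(g^2 \mid W) = E_\theta(g^2), \quad \theta \in \Theta.$$
Proceeding in the same way, one can prove that
$$E_\theta(g^k \mid W) = E_\theta(g^k), \quad k = 1, 2, \ldots, \ \theta \in \Theta. \tag{11}$$
Notice again that the above arguments are essentially due to Rao [38].

Let now $h \in L^2_\theta(g)$ and let $p_m(g)$, $m = 1, 2, \ldots$, be a sequence of polynomials in $g$ with $E_\theta |h - p_m(g)|^2 \to 0$. Such a sequence exists due to Condition 2. Since $E_\theta(p_m(g) \mid W) = E_\theta\, p_m(g)$ by (11) and conditional expectation is continuous in $L^2$, one gets $E_\theta(h \mid W) = E_\theta h$. Therefore, any $h \in L^2_\theta(g)$ has a constant conditional expectation on $\sigma(W)$, i.e., $\sigma(g(T))$ and $\sigma(W)$ are independent for any $\theta \in \Theta$.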
By virtue of Lemma 1 and Condition 3b, the statistic $g(T)$ (or, equivalently, the $\sigma$-algebra $\sigma(g(T))$) is sufficient for $\theta$. Actually, $\sigma(g(T))$ is complete sufficient for $\theta$. Indeed, if $E_\theta\{U(g(T))\} = 0$ identically in $\theta$, then $U$ is an unbiased estimator of zero. As a function of $g(T)$, $U$ is a UMVUE. Plainly, the UMVUE of zero is zero, so that $P_\theta(U = 0) = 1$ for all $\theta \in \Theta$. But due to Condition 3a, $\sigma(g(T))$ is a proper subalgebra of the minimal sufficient $\sigma$-algebra $\sigma(T)$, which is a contradiction. Notice in conclusion that the trivial UMVUEs, $g(T) = \text{const}$, do not satisfy Condition 3b.
Let now $(X_1, Y_1), \ldots, (X_n, Y_n)$ be a sample from
$$f(x, y; \theta) = e^{-(x\theta + y/\theta)}, \quad x > 0,\ y > 0,$$
with $\theta > 0$ as a parameter. Inference from the above sample is what is usually referred to as the Nile problem by Ronald Fisher. Plainly, the vector $T = (\bar X, \bar Y)$ is the minimal sufficient statistic for $\theta$ and $W = \bar X \bar Y$ is an ancillary statistic. The minimal sufficient statistic is incomplete, and this makes the problem of existence and description of UMVUEs nontrivial.
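The ancillarity of $W = \bar X \bar Y$ can be seen numerically. The following is a minimal sketch, not part of the original argument: the helper names `w_statistic` and `simulate`, the sample size, the seed and the two values of $\theta$ are all hypothetical. Under (1), $X$ is exponential with rate $\theta$ and $Y$ is exponential with rate $1/\theta$, independent of $X$, so $\theta X$ and $Y/\theta$ are standard exponential. Reusing the same seed for two values of $\theta$ therefore reproduces the same values of $W$ up to rounding.

```python
import random


def w_statistic(n: int, theta: float, rng: random.Random) -> float:
    """Draw one sample of size n from the Nile model and return W = Xbar * Ybar."""
    xs = [rng.expovariate(theta) for _ in range(n)]        # X ~ Exp(rate theta)
    ys = [rng.expovariate(1.0 / theta) for _ in range(n)]  # Y ~ Exp(rate 1/theta)
    return (sum(xs) / n) * (sum(ys) / n)


def simulate(theta: float, n: int = 5, reps: int = 20000, seed: int = 12345):
    """Return reps independent copies of W for the given theta."""
    rng = random.Random(seed)
    return [w_statistic(n, theta, rng) for _ in range(reps)]


w_a = simulate(theta=0.3)
w_b = simulate(theta=4.0)

# Same seed, different theta: W coincides draw by draw up to rounding error.
max_rel_gap = max(abs(a - b) / a for a, b in zip(w_a, w_b))

# E(W) = E(Xbar) * E(Ybar) = (1/theta) * theta = 1 for every theta.
mean_a = sum(w_a) / len(w_a)
mean_b = sum(w_b) / len(w_b)

print(f"max relative gap between the two runs: {max_rel_gap:.2e}")
print(f"Monte Carlo mean of W: {mean_a:.4f} (theta=0.3), {mean_b:.4f} (theta=4.0)")
```

The draw-by-draw agreement is a consequence of the pivotal structure, which is stronger than agreement of a few moments and is exactly what ancillarity of $W$ asserts.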
Applied to the Nile problem, Theorem 1 proves that a statistic satisfying rather general "regularity type" conditions (Conditions 1, 2, 3a, b) is not a UMVUE (the existence of a nonconstant UMVUE is an open problem, according to Nayak and Sinha [35]).
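As a small worked illustration of how an ancillary statistic rules out a natural candidate, consider the unbiased estimator $\bar Y$ of $\theta$ in the Nile model. The computation below is our own sketch and uses only the standard fact that a finite-variance unbiased estimator is a UMVUE if and only if it is uncorrelated with every finite-variance zero-mean statistic, together with $E_\theta \bar X = 1/\theta$, $E_\theta \bar Y = \theta$, $\operatorname{Var}_\theta \bar Y = \theta^2/n$ and the independence of $\bar X$ and $\bar Y$:

```latex
\begin{align*}
U &= \bar X\,\bar Y - 1, &
E_\theta U &= E_\theta \bar X \cdot E_\theta \bar Y - 1
            = \frac{1}{\theta}\cdot\theta - 1 = 0, \\
&& E_\theta(\bar Y\,U) &= E_\theta \bar X \cdot E_\theta \bar Y^{2} - E_\theta \bar Y
            = \frac{1}{\theta}\,\theta^{2}\Bigl(1 + \frac{1}{n}\Bigr) - \theta
            = \frac{\theta}{n} \neq 0 .
\end{align*}
```

Thus $\bar Y$ is correlated with the zero-mean statistic $U = W - 1$ and cannot be a UMVUE of $\theta$.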
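Turn now to a natural class of estimators of $\theta$. A statistic $\tilde\theta(\bar X, \bar Y)$ is called an equivariant estimator of $\theta$ if
$$\tilde\theta(\bar X/\lambda, \lambda \bar Y) = \lambda\,\tilde\theta(\bar X, \bar Y) \quad \text{for any } \lambda > 0. \tag{12}$$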
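The defining relation can be checked mechanically for concrete estimators. The sketch below is illustrative only: `is_equivariant` and the test points are hypothetical, and the three candidates are $\sqrt{\bar Y/\bar X}$ (the maximizer of the log-likelihood $-n(\theta \bar X + \bar Y/\theta)$), $\bar Y$ and $1/\bar X$, plus one deliberately non-equivariant function.

```python
import math


def mle(xbar: float, ybar: float) -> float:
    """Maximizer of -n*(theta*xbar + ybar/theta) over theta > 0."""
    return math.sqrt(ybar / xbar)


def is_equivariant(est, xbar: float, ybar: float, lam: float) -> bool:
    """Check est(xbar/lam, lam*ybar) == lam * est(xbar, ybar) up to rounding."""
    lhs = est(xbar / lam, lam * ybar)
    rhs = lam * est(xbar, ybar)
    return math.isclose(lhs, rhs, rel_tol=1e-12)


lams = (0.1, 1.0, 3.0, 40.0)
xbar, ybar = 0.8, 2.5  # hypothetical values of the sample means

mle_ok = all(is_equivariant(mle, xbar, ybar, lam) for lam in lams)
ybar_ok = all(is_equivariant(lambda x, y: y, xbar, ybar, lam) for lam in lams)
inv_xbar_ok = all(is_equivariant(lambda x, y: 1.0 / x, xbar, ybar, lam) for lam in lams)

# A shifted estimator fails the relation for lam != 1.
shifted_ok = is_equivariant(lambda x, y: y + 1.0, xbar, ybar, 3.0)

print(mle_ok, ybar_ok, inv_xbar_ok, shifted_ok)
```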
The equivariant estimators in the Nile problem were studied in Kariya [28]. Plainly, an equivariant estimator can be written as
$$\tilde\theta = \bar Y\, h(\bar X \bar Y) \tag{13}$$
for some $h$. Here $\bar Y$ may be replaced with any statistic of degree of homogeneity one in the sense of (12), e.g., $\sqrt{\bar Y/\bar X}$ (the latter is the maximum likelihood estimator (MLE) of $\theta$, as noticed by Fisher himself), or $1/\bar X$. If (13) is a UMVUE, then (11) results in ... where $E_1$ is the expectation taken when $\theta = 1$ and ... If $\tilde\theta$ is a UMVUE, so is $\tilde\theta^2$ and thus $E_\theta(\tilde\theta^2 \mid W) = E_\theta(\tilde\theta^2)$, but

Problems Closely Related to the Nile Problem
The following setup of direct measurements, of interest in its own right, has the same basic features as the Nile problem: an incomplete minimal sufficient statistic and an ancillary statistic which is a function of the sufficient one. Let $(X_1, X_2, \ldots, X_n)$ be a sample from a normal population $N(\theta, c^2\theta^2)$ with $\theta > 0$ as a parameter and $c > 0$ known. In other words,
$$X_i = \theta + \varepsilon_i, \quad i = 1, \ldots, n,$$
where $\varepsilon_1, \ldots, \varepsilon_n$ are independent random variables distributed as $N(0, c^2\theta^2)$.

In the standard setups of direct measurements, the distribution function $F(x)$ of $\varepsilon_i$ (not necessarily normal) is assumed independent of $\theta$, so that $(X_1, X_2, \ldots, X_n)$ is a sample from $F(x - \theta)$ with a location parameter $\theta$. Estimation of a location parameter in small samples originated in Pitman [36]. Since then it has been thoroughly studied, especially for the quadratic loss function; see, e.g., the monographs Lehmann and Casella [32], Casella and Berger [8], Kagan [23], Zacks [41], the recent papers Kagan and Rao [25], Kagan et al. [27], and references therein. A special role of the normal distribution $\Phi(x)$ in estimation of a location parameter is due to the fact that in the class of distributions $F$ with a given variance $\sigma^2$, the Fisher information on $\theta$ contained in an observation $X \sim F(x - \theta)$ is minimal for $F = \Phi$. A closely related result is that under the quadratic loss function and $n \ge 3$, $\bar X$ is an admissible estimator of $\theta$ if and only if $F = \Phi$. The "if" part is due to Hodges and Lehmann [16] and the "only if" part is due to Kagan et al. [18].
The setup of small and large samples from $N(\theta, c^2\theta^2)$ was studied in a number of papers. Khan [30] found the best unbiased estimator of $\theta$ in the class of estimators linear in $\bar X$, $S$ and showed that it is asymptotically efficient. Gleser and Healy [15] proved admissibility of the best (scale-)equivariant estimator of $\theta$. Since it is different from the (also equivariant) MLE, the latter is inadmissible. See also Hinkley [17] and Kariya [28] for related results. According to Nayak and Sinha [35], the problem of existence of UMVUEs is open.
Since in samples from a normal population $\bar X$ and $S$ are independent, the setup of sampling from $N(\theta, c^2\theta^2)$ is very similar to the Nile problem: $T = (\bar X, S^2)$ is an incomplete sufficient statistic and $W = \bar X / S$ is an ancillary statistic.
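To show the latter, write
$$W = \frac{\bar X}{S} = \frac{(\bar X - \theta)/\theta + 1}{S/\theta}$$
and notice that the distributions of $(\bar X - \theta)/\theta$ and $S/\theta$ do not depend on $\theta$. A direct application of the method used in proving Theorem 1 proves the following result.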
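The same scaling argument can be illustrated numerically. This is a minimal sketch under the model $X_i = \theta(1 + c Z_i)$ with standard normal $Z_i$; the helper names, the seed, and the values of $c$, $n$ and $\theta$ are hypothetical. Because `random.gauss(mu, sigma)` returns `mu + z * sigma` for a standard normal `z`, reusing the seed gives samples that differ only by the factor $\theta$, and $\bar X / S$ is unchanged up to rounding.

```python
import math
import random


def ratio_statistic(xs):
    """Return W = Xbar / S, with S the sample standard deviation (divisor n - 1)."""
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return mean / s


def draw_sample(n: int, theta: float, c: float, seed: int):
    """Sample of size n from N(theta, (c*theta)^2) using a fixed seed."""
    rng = random.Random(seed)
    return [rng.gauss(theta, c * theta) for _ in range(n)]


c, n, seed = 0.2, 10, 2024  # hypothetical settings
w_values = [ratio_statistic(draw_sample(n, theta, c, seed))
            for theta in (0.5, 1.0, 30.0)]

# All three values agree up to floating-point rounding.
print(w_values)
```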
Theorem 2. Let $g(T)$ be a statistic satisfying Conditions 1, 2, 3a, 3b, where $T = (\bar X, S)$ is the incomplete sufficient statistic and $W = \bar X / S$ is the ancillary statistic.
Then $g(T)$ is not a UMVUE.

Analytical Method
We shall show now that Theorem 2 holds true without the (unnecessary) Conditions 1, 2, 3a, 3b, due to the fact that the family of normal distributions $N(\theta, c^2\theta^2)$ with $\theta$ as a parameter is a curved exponential family with a polynomial constraint on the natural parameters. It is a straightforward corollary of Lemma 2, whose proof is purely analytical. The idea of the proof goes back to Wijsman [40]. As one can easily see, the probability density function of the minimal sufficient statistic (based on a sample of size $n$), taken in the equivalent form $(u, v) = (\sum x_i, \sum x_i^2)$, is
$$p(u, v; \theta) = C(\theta) \exp\{\eta_1 u + \eta_2 v\}\, h(u, v),$$
where the explicit form of $h(u, v)$ does not matter, but what matters for our purpose is the constraint
$$\eta_1^2 + \frac{2}{c^2}\,\eta_2 = 0$$
on the natural parameters $\eta_1 = \dfrac{1}{c^2\theta}$, $\eta_2 = -\dfrac{1}{2c^2\theta^2}$. The structure of UMVUEs for samples from natural exponential families (NEFs) with polynomial constraints on the parameters was studied in Kagan and Palamodov [20,21] and Unni [39], where Lemma 2 was proved.

Bondesson [7] noticed that if $\mathcal{P} = \{P_{\theta, \eta}\}$ is a family of distributions of a random element $X \in (\mathcal{X}, \mathcal{A})$ parameterized by a "bivariate" parameter $(\theta, \eta)$, and a statistic $T = T(X)$, $T : (\mathcal{X}, \mathcal{A}) \to (\mathcal{T}, \mathcal{B})$, is complete sufficient for $\theta$ for any fixed value of $\eta$, then $\sigma(T)$ is an algebra of UMVUEs.
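A quick numerical sanity check of the polynomial constraint is sketched below. It reads the constraint as $\eta_1^2 + (2/c^2)\eta_2 = 0$ with $\eta_1 = 1/(c^2\theta)$ and $\eta_2 = -1/(2c^2\theta^2)$; the function names and the grid of parameter values are hypothetical.

```python
def natural_parameters(theta: float, c: float):
    """Natural parameters of N(theta, (c*theta)^2) for the statistics (sum x, sum x^2)."""
    eta1 = 1.0 / (c ** 2 * theta)
    eta2 = -1.0 / (2.0 * c ** 2 * theta ** 2)
    return eta1, eta2


def constraint_residual(theta: float, c: float) -> float:
    """Relative residual of eta1^2 + (2/c^2)*eta2, which should vanish."""
    eta1, eta2 = natural_parameters(theta, c)
    return abs(eta1 ** 2 + (2.0 / c ** 2) * eta2) / eta1 ** 2


residuals = [constraint_residual(theta, c)
             for theta in (0.2, 1.0, 5.0)
             for c in (0.5, 1.0, 3.0)]

# Every residual is zero up to floating-point rounding.
print(max(residuals))
```

As $\theta$ ranges over $(0, \infty)$, the point $(\eta_1, \eta_2)$ traces one branch of a parabola in the two-dimensional natural parameter space, which is what makes the family curved.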
Lemma 2 provides an analytical proof of non-existence of nontrivial UMVUEs from samples from populations (1), (2) and (3). All three densities belong to exponential families with polynomial constraints on the natural parameters. One can see that the reparametrization (17) does not exist in any of these cases.

Some Applications of the Statistical Method
The powerful analytical method originated in Wijsman [40] and developed in Kagan and Palamodov [20,21] is applicable to exponential families with polynomial constraints on the canonical parameters (sometimes referred to as curved exponential families). In this section examples of non-exponential families are presented where nonexistence of UMVUEs is proved by the statistical method.

Example 3. Let $(X_1, \ldots, X_n)$ be a sample from a uniform distribution $U(\theta - 1, \theta + 1)$ (plainly non-exponential) with $\theta \in \mathbb{R}$ as a parameter. The pair $(X_{(1)}, X_{(n)})$ of the minimum and maximum of the sample elements is an (incomplete) minimal sufficient statistic for the parameter $\theta$, while the range $W = X_{(n)} - X_{(1)}$ is an ancillary statistic. Were $g(X_{(1)}, X_{(n)})$ a UMVUE and $\sigma(g, W) = \sigma(X_{(1)}, X_{(n)})$, $g$ would have been a complete sufficient statistic for $\theta$, contradicting the minimality of the (bivariate) sufficient statistic $T = (X_{(1)}, X_{(n)})$. In particular, the Pitman estimator of $\theta$ under the quadratic loss is $(X_{(1)} + X_{(n)})/2$ and, though the best in the class of equivariant estimators, it is not a UMVUE.
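The location structure behind Example 3 is easy to see numerically. The sketch below is an illustration under stated assumptions: `range_and_midrange`, the seed, the sample size and the two values of $\theta$ are hypothetical. Since `random.uniform(a, b)` returns `a + (b - a) * u` for a uniform `u` on $[0, 1)$, reusing the seed shifts the whole sample by the change in $\theta$, so the range stays the same and the midrange shifts with $\theta$.

```python
import random


def range_and_midrange(n: int, theta: float, seed: int):
    """Sample from U(theta - 1, theta + 1); return (range, midrange)."""
    rng = random.Random(seed)
    xs = [rng.uniform(theta - 1.0, theta + 1.0) for _ in range(n)]
    lo, hi = min(xs), max(xs)
    return hi - lo, (lo + hi) / 2.0


n, seed = 8, 7  # hypothetical settings
range_0, mid_0 = range_and_midrange(n, 0.0, seed)
range_1, mid_1 = range_and_midrange(n, 100.0, seed)

# The range is unchanged (ancillary); the midrange moves by exactly the shift in theta.
print(f"range:    {range_0:.12f} vs {range_1:.12f}")
print(f"midrange - theta: {mid_0 - 0.0:.12f} vs {mid_1 - 100.0:.12f}")
```

The pair (midrange, range) is a one-to-one function of $(X_{(1)}, X_{(n)})$, which is the situation $\sigma(g, W) = \sigma(X_{(1)}, X_{(n)})$ considered in the example.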
In a similar way one can prove the non-existence of UMVUEs from a sample from $U(\theta, \lambda\theta)$ with $\lambda > 1$ known and $\theta > 0$ as a parameter.

Example 4. Let $(X_1, \ldots, X_n)$ be a sample from an arbitrary (not necessarily exponential) location parameter family $F(x - \theta)$ with a known (arbitrary) $F(x)$. Let $t_n = t_n(X_1, \ldots, X_n)$ be the Pitman estimator of $\theta$ for the quadratic loss. The statistic $t_n$ and the residuals $X_1 - X_n, \ldots, X_{n-1} - X_n$ together determine the sample point. Thus, if $t_n$ is a UMVUE, it is a function of a complete sufficient statistic. Hence, for samples from location parameter families we have a complete description of the UMVUEs: they are statistics depending on the data through the complete sufficient statistic.
An "all or nothing" alternative holds for the location parameter families: either all estimable functions of the parameter possess UMVUEs or none (except constants) does.

Example 5. Let $(X_1, \ldots, X_n)$ be a sample from an arbitrary location-scale parameter family $F((x - \theta)/\sigma)$ with $\theta \in \mathbb{R}$ and $\sigma \in \mathbb{R}_+$ as parameters. Suppose that a vector $(U_1, U_2) = (U_1(X_1, \ldots, X_n), U_2(X_1, \ldots, X_n))$ is a UMVUE, meaning that the covariance matrix $V_{\theta,\sigma}(U_1, U_2)$ of $(U_1, U_2)$ and the covariance matrix $V_{\theta,\sigma}(\tilde U_1, \tilde U_2)$ of any unbiased estimator $(\tilde U_1, \tilde U_2)$ of $(E_{\theta,\sigma} U_1, E_{\theta,\sigma} U_2)$ satisfy the inequality
$$V_{\theta,\sigma}(U_1, U_2) \le V_{\theta,\sigma}(\tilde U_1, \tilde U_2), \quad \theta \in \mathbb{R},\ \sigma \in \mathbb{R}_+, \tag{18}$$
in the standard sense ($A \le B \Leftrightarrow B - A$ is a positive semi-definite matrix). Note that (18) is stronger than the statement that $U_1$ and $U_2$ are separately UMVUEs. Plainly, (18) implies that a linear combination $c_1 U_1 + c_2 U_2$ with any constants $c_1, c_2$ is a UMVUE and thus is independent of the vector
$$W = \left(\frac{X_1 - \bar X}{S}, \ldots, \frac{X_n - \bar X}{S}\right)$$
of the standardized residuals. Independence of any linear combination $c_1 U_1 + c_2 U_2$ and $W$ is equivalent to independence of $(U_1, U_2)$ and $W$ (this is stronger than independence of $U_1$ and $W$, and of $U_2$ and $W$).

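Appendix A: Proof of Lemma 1
Proof. For any $C \in \mathcal{C}$, $B \in \mathcal{B}$,
$$P_\theta(C \cap B \mid \mathcal{C}) = E_\theta(\mathbb{1}_{C \cap B} \mid \mathcal{C}) = \mathbb{1}_C\, P(B),$$
due to ancillarity of $\mathcal{B}$ and $\mathcal{P}$-independence of $\mathcal{C}$ and $\mathcal{B}$. Similarly, for $A = \cup_{i=1}^n (C_i \cap B_i)$ with pairwise disjoint $C_1 \cap B_1, \ldots, C_n \cap B_n$, $C_i \in \mathcal{C}$, $B_i \in \mathcal{B}$, $i = 1, \ldots, n$,
$$P_\theta(A \mid \mathcal{C}) = \sum_{i=1}^n \mathbb{1}_{C_i} P(B_i) = P(A \mid \mathcal{C})$$
does not depend on $\theta$. Thus for any $A$ from the algebra $\mathcal{A}_0$ generated by the sets $C \cap B$, $C \in \mathcal{C}$, $B \in \mathcal{B}$, the conditional probability $P_\theta(A \mid \mathcal{C})$ does not depend on $\theta$. Since the values $P_\theta(A \cap C)$ for $A \in \mathcal{A}_0$ determine $P_\theta(A \cap C)$ for $A \in \mathcal{A}$ via monotone convergence, one has, for any $A \in \mathcal{A}$ and $C \in \mathcal{C}$,
$$P_\theta(A \cap C) = \int_C P(A \mid \mathcal{C})\,dP_\theta$$
with $P(A \mid \mathcal{C})$ not depending on $\theta$, proving sufficiency of $\mathcal{C}$ for $\mathcal{P}$.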