Inconsistency of χ² test for sparse categorical data under multinomial sampling

Simple conditions for the inconsistency of Pearson’s χ² test in the case of very sparse categorical data are given. The conditions illustrate the phenomenon of “reversed consistency”: the greater the deviation from the null hypothesis, the lower the power of the test.


Introduction
Statistical inference problems caused by sparsity of contingency tables are widely discussed in the literature. According to a rule of thumb, the expected (under the null hypothesis) frequencies in a contingency table are required to exceed 5 in the majority of its cells [6]. If this condition is violated, the χ² approximation of the distribution of Pearson's χ² test statistic may be inaccurate, and the contingency table is said to be sparse.
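The rule of thumb above can be checked mechanically. The following is a minimal Python sketch, not part of the original analysis: the function names are hypothetical, and "the majority of cells" is read here as "more than half of the cells", which is an assumption about how the rule is applied.

```python
import numpy as np

def fraction_of_adequate_cells(n_obs, p0, threshold=5.0):
    """Fraction of cells whose expected frequency n_obs * p0_i exceeds the threshold."""
    expected = n_obs * np.asarray(p0, dtype=float)
    return float(np.mean(expected > threshold))

def is_sparse(n_obs, p0, threshold=5.0):
    """Assumed reading of the rule: the table is sparse unless more than
    half of the cells have an expected frequency above the threshold."""
    return fraction_of_adequate_cells(n_obs, p0, threshold) <= 0.5

# 200 observations spread uniformly over 600 cells: every expected frequency is 1/3.
print(is_sparse(200, np.full(600, 1 / 600)))
# 200 observations spread uniformly over 10 cells: every expected frequency is 20.
print(is_sparse(200, np.full(10, 0.1)))
```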
There is a vast literature dealing with approximation problems resulting from sparsity; see, e.g., [1,7,8,2,3] and references therein. In this paper, it is shown that, for very sparse categorical data, the χ² test can become completely uninformative (inconsistent), and hence there is no point in approximating or adjusting its distribution. For the likelihood ratio test, analogous results are presented in [4] and [5].
In the next section we introduce notation, present some background and specify a sparsity condition. The inconsistency of Pearson's χ² test is proved in Section 3. A simple example and simulation results provided in the last section illustrate the inconsistency and "reversed consistency" phenomena for a finite sample.
We consider very sparse categorical data (contingency tables). Here this means that the number of cells n = n(N) and the distribution P = P(N) depend on the number of observations N, and that N = o(n). We shall also use additional (technical) conditions related to the sparseness; see Proposition 1.
Pearson's χ² statistic is defined by (1). Using the moment generating function, one can find the means and the variances of the χ² statistic. Here and in the sequel, E, D, and P (E_0, D_0, and P_0) denote, respectively, the expectation, the variance, and the probability for Y ∼ Multinomial_n(N, P) (respectively, Y ∼ Multinomial_n(N, P_0)).
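As an illustration of the statistic and its mean, a minimal Python sketch follows. It assumes the standard form of Pearson's statistic, the sum over cells of (y_i − N p_i^0)² / (N p_i^0), where y_i are the observed counts and p_i^0 the null cell probabilities; this notation and the function names are assumptions of the sketch, not taken from the text. The closed-form mean used in `chi2_mean` follows from the first two moments of the multinomial distribution and requires all p_i^0 to be positive.

```python
import numpy as np

def pearson_chi2(y, p0):
    """Pearson's statistic for counts y (last axis = cells) against null probabilities p0.
    The number of observations is taken to be the row sum of y."""
    y = np.asarray(y, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    expected = y.sum(axis=-1, keepdims=True) * p0
    return ((y - expected) ** 2 / expected).sum(axis=-1)

def chi2_mean(n_obs, p, p0):
    """Exact mean of the statistic when Y ~ Multinomial(n_obs, p):
    sum_i p_i (1 - p_i) / p0_i  +  n_obs * sum_i (p_i - p0_i)^2 / p0_i."""
    p = np.asarray(p, dtype=float)
    p0 = np.asarray(p0, dtype=float)
    return float(np.sum(p * (1 - p) / p0) + n_obs * np.sum((p - p0) ** 2 / p0))

# Monte Carlo check of the mean on a small, non-sparse toy table (illustrative values).
rng = np.random.default_rng(0)
p0 = np.array([0.5, 0.3, 0.2])
p = np.array([0.2, 0.3, 0.5])
y = rng.multinomial(50, p, size=20000)
print(pearson_chi2(y, p0).mean(), chi2_mean(50, p, p0))
```

Under the null hypothesis (p equal to p0) the formula reduces to the number of cells minus one, whatever the sample size.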

Inconsistency
In this section, the inconsistency of the χ² test is derived under additional conditions that are related to, and quite natural for, (very) sparse categorical data.
Otherwise, the test is called inconsistent.

Proposition 1. Suppose that
and the asymptotic relation is valid with D_N := D_0 χ² + D χ². Then the χ² test is inconsistent.
Proposition 1 shows that (5) is the key condition that determines the inconsistency of the χ² test. When P_0 is the uniform distribution, ∆ ≥ 0 for any P and hence condition (5) is not satisfied for any P. In the next section we present a simple example in which conditions (5) and (6) are fulfilled.
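The statement about the uniform null distribution can be checked by a short computation. The sketch below assumes that ∆ denotes the difference of means Eχ² − E_0χ² and that Pearson's statistic has its standard form with cell probabilities p_i under P and p_i^0 = 1/n under P_0; this notation is assumed here. The mean formula follows from the first two moments of the multinomial distribution, and the third line uses the identity Σ p_i² = 1/n + Σ (p_i − 1/n)².

```latex
\begin{align*}
\mathbf{E}\chi^2 &= \sum_{i=1}^{n} \frac{p_i(1-p_i)}{p_i^0}
   + N \sum_{i=1}^{n} \frac{(p_i-p_i^0)^2}{p_i^0},
   \qquad \mathbf{E}_0\chi^2 = n-1, \\
\Delta &= \mathbf{E}\chi^2 - \mathbf{E}_0\chi^2
   = n\Bigl(1-\sum_{i=1}^{n} p_i^2\Bigr)
     + N n \sum_{i=1}^{n}\Bigl(p_i-\frac{1}{n}\Bigr)^2 - (n-1) \\
 &= -\,n\sum_{i=1}^{n}\Bigl(p_i-\frac{1}{n}\Bigr)^2
     + N n \sum_{i=1}^{n}\Bigl(p_i-\frac{1}{n}\Bigr)^2
   = (N-1)\,n \sum_{i=1}^{n}\Bigl(p_i-\frac{1}{n}\Bigr)^2 \;\ge\; 0 .
\end{align*}
```

Under these assumptions, a negative ∆ is possible only when P_0 is non-uniform, so that cells with small null probabilities can be avoided by the alternative.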

Remark 1. By definition (5)
(10)
Since the second term in this expression is nonnegative, the requirement ∆ < 0 implies that the absolute value of the first term in (10) should dominate the second one.
In the simulations, the number of observations is N = 200 and the number of cells is n = 2m = 600. Two cases are considered: (a) q_0 = 0.2, q = 0 and (b) q_0 = 0.2, q = 0.1. The number of repetitions is set to 100. The histograms of the χ² statistic under the null hypothesis H_0 and the alternative H_1 are shown in Fig. 1. The figure clearly demonstrates the inconsistency of the χ² test. In case (a), the phenomenon of "reversed consistency" is observed: the values of the χ² statistic under the null hypothesis H_0 are significantly greater than its values under the alternative H_1 (the data under the alternative "fit" the null hypothesis better than the data under the null hypothesis itself), although the two distributions are evidently separable. Thus Pearson's χ² test, which rejects the null hypothesis for large values of the statistic, is completely uninformative in this case.
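A simulation of this kind can be sketched in Python as follows. The values N = 200, m = 300, q_0 = 0.2, q ∈ {0, 0.1} and the 100 repetitions are taken from the text. The design itself is an assumption of the sketch, since the definition of the example is not reproduced here: a hypothetical two-block layout in which the first m cells share the total mass 1 − q equally and the last m cells share the mass q equally. The empirical power at the Monte Carlo 0.95 quantile is likewise an illustrative addition, and all function names are hypothetical.

```python
import numpy as np

def two_block_probs(m, q):
    """Hypothetical design with 2m cells: the first m cells share mass 1 - q,
    the last m cells share mass q (assumed, not taken from the text)."""
    return np.concatenate([np.full(m, (1 - q) / m), np.full(m, q / m)])

def simulate_chi2(n_obs, p, p0, reps, rng):
    """Draw reps multinomial samples from p and return Pearson's statistic against p0."""
    y = rng.multinomial(n_obs, p, size=reps)
    expected = n_obs * p0
    return ((y - expected) ** 2 / expected).sum(axis=1)

rng = np.random.default_rng(0)
N, m, reps = 200, 300, 100
p0 = two_block_probs(m, 0.2)                      # null hypothesis, q_0 = 0.2
chi2_h0 = simulate_chi2(N, p0, p0, reps, rng)
critical = np.quantile(chi2_h0, 0.95)             # Monte Carlo critical value

for case, q in [("a", 0.0), ("b", 0.1)]:
    chi2_h1 = simulate_chi2(N, two_block_probs(m, q), p0, reps, rng)
    power = np.mean(chi2_h1 > critical)           # empirical rejection rate under H_1
    print(f"case ({case}): mean under H0 = {chi2_h0.mean():.1f}, "
          f"mean under H1 = {chi2_h1.mean():.1f}, empirical power = {power:.2f}")
```

In this assumed design the alternative moves mass away from the cells with small null probabilities, which are the cells that contribute most to the statistic under H_0; this is the mechanism by which the statistic can be smaller under the alternative than under the null hypothesis.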