Generalized Mantel-Haenszel procedures for 2 x J tables.

The generalization of the Mantel-Haenszel procedure to 2 x J (J > 2) tables is reviewed. Included are generalized Mantel-Haenszel tests, estimators for a common odds ratio, and a generalized Breslow-Day test for the homogeneity of odds ratios across strata. Environ Health Perspect 102(Suppl 8): 57-60 (1994)


Introduction
The Mantel-Haenszel procedure (1) is one of the sets of statistical methods most frequently employed in the analysis of epidemiologic data. It consists of estimation and testing of the odds ratio, presuming homogeneity of the odds ratios across strata. A method for testing the homogeneity of the odds ratios was developed by Breslow and Day (2) and has been added to the procedure [for example, SAS version 6.1 (3)]. The procedure applies only to 2 x 2 tables.
Logistic regression analysis, widely used in the analysis of epidemiologic data, employs the unconditional method of maximum likelihood. Occasionally the conditional method of maximum likelihood is also employed, in particular when the sample size is small. Compared with these methods, the Mantel-Haenszel procedure has the following characteristics. While the results of logistic regression analysis are sensitive to the mathematical model employed, the Mantel-Haenszel procedure is model-free; more precisely, the test statistic is nonparametric and the estimator is a moment estimate (4). Furthermore, the Mantel-Haenszel estimator is dually consistent: it is consistent when the number of tables is fixed and the sample size in each stratum is large (5), and it is also consistent when the sample size in each table is fixed and the number of tables is large (6). The unconditional maximum likelihood estimator (MLE) is consistent in the former situation but not in the latter (6-8). Thus, the Mantel-Haenszel estimator is more stable for sparse tables than the unconditional MLE. It has also been shown that when subject responses are correlated within tables, the Mantel-Haenszel estimator is consistent, while the conditional MLE is not (9).
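As a point of reference for the 2 x 2 case discussed above, the classical Mantel-Haenszel estimate of a common odds ratio can be sketched in a few lines of Python. The function name, the (a, b, c, d) cell layout, and the counts are illustrative assumptions and are not taken from the article.

```python
def mantel_haenszel_or(tables):
    """Classical Mantel-Haenszel common odds ratio for K 2 x 2 tables.

    Each table is a tuple (a, b, c, d), assumed here to mean:
      a = exposed cases,    b = unexposed cases,
      c = exposed controls, d = unexposed controls.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("estimator undefined: sum of b*c/n is zero")
    return num / den


# Hypothetical sparse strata (not data from the article).
strata = [(10, 5, 4, 8), (3, 2, 1, 6), (0, 1, 2, 7)]
print(mantel_haenszel_or(strata))
```

Because the estimate is a ratio of sums over strata, a stratum with an empty cell still contributes to one of the two sums, which is one informal way to see why the estimator tolerates sparse tables.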

Notation
We consider K 2 x J tables for which the cell frequencies and cell probabilities are shown in Table 1. Our primary interest is the relationship between the two classifications represented by the second (disease) and the third (exposure) indices; the first index denotes nuisance variables on which we are stratifying. Writing $p_{kij}$ for the cell probabilities, the j'th odds ratio taking the jth column as a base in the kth stratum will be denoted by

$$\psi_{kj'}^{(j)} = \frac{p_{k1j'}\, p_{k2j}}{p_{k1j}\, p_{k2j'}}, \qquad \psi_{kj} = \psi_{kj}^{(1)} = \frac{p_{k1j}\, p_{k21}}{p_{k11}\, p_{k2j}} \quad (j = 2, \ldots, J;\ k = 1, 2, \ldots, K).$$

The Mantel-Haenszel type testing and estimation presume the homogeneity of the odds ratios; more specifically,

$$\psi_{1j} = \psi_{2j} = \cdots = \psi_{Kj} \ (= \psi_j, \ \text{say}), \qquad [1]$$

and focus on the common odds ratios, $\psi_j$. It is easy to see that the odds ratios satisfy the following property:

$$\psi_{kj'}^{(j)} = \psi_{kj'} / \psi_{kj}, \qquad \text{where } \psi_{k1} = 1.$$
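The odds ratio of one exposure column against a base column, within a single stratum, can be illustrated numerically. The sketch below assumes that the first row holds the diseased subjects and that columns are 0-indexed; the function name and the cell probabilities are hypothetical. It also checks that an odds ratio against an arbitrary base column equals the ratio of the two odds ratios taken against the first column.

```python
def odds_ratio(p, j_prime, j):
    """Odds ratio of column j_prime against base column j in one stratum.

    p is a 2 x J table [[row 1 cells], [row 2 cells]]; row 1 is assumed
    to be the diseased group. Columns are 0-indexed in this sketch.
    """
    return (p[0][j_prime] * p[1][j]) / (p[0][j] * p[1][j_prime])


# Hypothetical cell probabilities for one stratum with J = 3 columns.
p = [[0.10, 0.15, 0.20],
     [0.30, 0.15, 0.10]]

# Odds ratios against the first column; psi[0] is 1 by construction.
psi = [odds_ratio(p, j, 0) for j in range(3)]

# Odds ratio of column 2 against base column 1, computed two ways.
direct = odds_ratio(p, 2, 1)
via_first_column = psi[2] / psi[1]
print(psi, direct, via_first_column)
```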

Testing Exposure and Disease Association
We note that no exposure and disease association is expressed by $\psi_{j'}^{(j)} = 1$ for any $j, j' = 1, 2, \ldots, J$, which is equivalent to $\psi_j = 1$, $j = 2, 3, \ldots, J$. ... which is the generalized Mantel-Haenszel estimator of Greenland (16) and Yanagawa and Fujii (14).

Yanagawa and Fujii (14) show that this statistic does not asymptotically follow a chi-square distribution under $H_0$ and that it gives anticonservative results. A modification of this statistic is given by Yanagawa and Fujii (14) so that it asymptotically follows a chi-square distribution with (K-1)(J-1) d.f. under $H_0$, together with a modification of the generalized Mantel-Haenszel estimator so that it tends to be asymptotically efficient. The algorithm for the computation is represented as follows: Step ...

Using SAS/IML (3) we made a computer program for the Mantel-Haenszel procedure for 2 x J tables described in this article. We wish to publish it elsewhere.
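To make the idea of a generalized Mantel-Haenszel estimator concrete, the following Python sketch computes a Mantel-Haenszel-type estimate of each common odds ratio against a base column. The pairwise form with the full stratum total as weight is an assumption made for illustration; it is not necessarily the exact estimator of references (14) and (16), and it is not the modified estimator. The function name and the counts are hypothetical.

```python
def generalized_mh(tables, base=0):
    """Pairwise Mantel-Haenszel-type odds ratio estimates for K 2 x J tables.

    tables: list of strata, each [[row 1 counts], [row 2 counts]], with
    row 1 assumed to be the diseased group. Returns one estimate per
    column, each taken against the column `base`.
    Assumed weighting: the total count of the whole stratum.
    """
    J = len(tables[0][0])
    num = [0.0] * J
    den = [0.0] * J
    for row1, row2 in tables:
        N = sum(row1) + sum(row2)
        if N == 0:
            continue  # skip empty strata
        for j in range(J):
            num[j] += row1[j] * row2[base] / N
            den[j] += row1[base] * row2[j] / N
    # An estimate is left undefined (NaN) when its denominator is zero.
    return [num[j] / den[j] if den[j] > 0 else float("nan") for j in range(J)]


# Hypothetical counts for two strata with J = 3 exposure levels.
toy = [[[5, 10, 3], [8, 4, 1]],
       [[2, 3, 2], [6, 1, 1]]]
print(generalized_mh(toy))
```

With J = 2 this reduces to the classical Mantel-Haenszel estimator for 2 x 2 tables, which is a convenient sanity check on the assumed form.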

An Example
As an illustration of the generalized Mantel-Haenszel procedure, we use part of a data set from the case-control study of esophageal cancer given in Breslow and Day (2). Suppose that our primary concern is the association of alcohol consumption and esophageal cancer among persons less than 55 years of age. Epidemiologists might prefer stratifying on the confounding variables (in this case age and tobacco consumption) and then, if the odds ratios are homogeneous throughout the strata, summarizing the information in each table by estimating and testing the common odds ratios. The method is intuitive and easy to comprehend, but from the point of view of analysis we have to deal with an abundance of small entries and empty cells. The data are summarized in eight 2 x 4 tables, shown in Table 2. The response categories are alcohol consumption of 0 to 39, 40 to 79, 80 to 119, and 120+ g/day. The results of analysis by the generalized Mantel-Haenszel procedure and those by a log-linear model are presented in Table 3. Alcohol consumption of 0 to 39 g/day has been taken as a base in the table. The tendency of the unconditional MLE from the log-linear model towards inflated values with sparse data is evident for the 120+ g/day category; in this case the generalized Mantel-Haenszel estimate is 121.80, and its improved version is 194.23. Sparse data have little influence on the generalized Mantel-Haenszel estimator (16). The improvement is essential for the homogeneity test to follow a chi-square distribution. The value of the corrected chi-square homogeneity test is 24.66, whereas $G^2$ = 26.19 and $X^2$ = 27.73. Although the difference is not as great as in the case of estimation, we can still see the impact of the sparse data on $G^2$, the unconditional likelihood ratio statistic, and on $X^2$, the Pearson chi-square.
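Testing for association across stratified 2 x J tables, as described in the example, can be sketched with the standard generalized Mantel-Haenszel (general association) statistic, which conditions on the margins of each table and is referred to a chi-square distribution with J-1 d.f. Whether this matches, term by term, the statistic behind Table 3 is an assumption. The function names are illustrative, and the counts are hypothetical; they are not the esophageal cancer data of Table 2.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular covariance matrix")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def generalized_mh_test(tables):
    """Generalized Mantel-Haenszel statistic for K 2 x J tables.

    tables: list of strata, each [[row 1 counts], [row 2 counts]].
    Returns (Q, degrees of freedom). The last column is dropped because
    the J deviations from expectation sum to zero within each stratum.
    """
    J = len(tables[0][0])
    m = J - 1
    d = [0.0] * m                      # summed observed minus expected
    V = [[0.0] * m for _ in range(m)]  # summed hypergeometric covariance
    for row1, row2 in tables:
        r1, r2 = sum(row1), sum(row2)
        N = r1 + r2
        if N < 2:
            continue  # the variance is undefined for such strata
        col = [row1[j] + row2[j] for j in range(J)]
        scale = r1 * r2 / (N * N * (N - 1))
        for j in range(m):
            d[j] += row1[j] - r1 * col[j] / N
            for l in range(m):
                diag = N * col[j] if j == l else 0.0
                V[j][l] += scale * (diag - col[j] * col[l])
    x = solve(V, d)
    Q = sum(d[j] * x[j] for j in range(m))
    return Q, m


# Hypothetical counts: two strata, J = 3 exposure levels.
toy_tables = [[[5, 3, 2], [8, 4, 1]],
              [[2, 2, 3], [6, 3, 1]]]
print(generalized_mh_test(toy_tables))
```

The sketch returns only the statistic and its degrees of freedom; a p-value would need a chi-square distribution function, which the Python standard library does not provide.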