Subsymmetry and asymmetry models for multiway square contingency tables with ordered categories

Abstract This paper proposes several models that describe the symmetry and asymmetry structure of each subdimension of a multiway square contingency table with ordered categories. A classical three-way categorical example is examined to illustrate the models. These models analyze the subsymmetric and asymmetric structure of the table.


Introduction
Square contingency tables with the same row and column categories occur frequently in the applied sciences. Such tables arise from tabulating repeated measurements of a categorical response variable. Examples include: subjects measured at two different points in time (e.g., responses before and after an experiment); the decisions of two experts on the same set of subjects (e.g., the grading of the same cancer tumors by two specialists); two similar units in a sample (e.g., the grades of vision of the left and right eyes); and matched-pair experiments (e.g., the social status of fathers and sons) [1]. Several models have been proposed for square contingency tables (see, for example, [2][3][4][5][6][7][8]), but the models of symmetry (S), quasi-symmetry (QS) and marginal homogeneity (MH) are classical and well known [9,10], and their application is straightforward. The QS model is less restrictive than the S model [11][12][13].
Consider an $R \times R$ square contingency table with the same row and column classifications. Let $p_{ij}$ denote the probability that an observation falls in the $i$th row and $j$th column of the table. Bowker [14] considered the symmetry (S) model for $R \times R$ tables, defined by $p_{ij} = p_{ji}$ ($i \neq j$). The S model implies that the probability that an observation falls in cell $(i, j)$ of the table is equal to the probability that it falls in cell $(j, i)$.
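The S model for a two-way table is commonly assessed with Bowker's chi-square statistic, which compares each pair of off-diagonal counts. The following is a minimal sketch of that statistic, not code from the paper; the function name and the 3x3 counts are hypothetical and serve only as an illustration.

```python
def bowker_statistic(table):
    """Return (X2, df) for Bowker's test of symmetry on a square table of counts."""
    r = len(table)
    if any(len(row) != r for row in table):
        raise ValueError("table must be square")
    x2 = 0.0
    df = 0
    for i in range(r):
        for j in range(i + 1, r):
            total = table[i][j] + table[j][i]
            if total > 0:  # a pair with no observations carries no information
                x2 += (table[i][j] - table[j][i]) ** 2 / total
                df += 1
    return x2, df


# Hypothetical 3x3 counts, for illustration only (not data from the paper).
counts = [
    [20, 10, 5],
    [6, 30, 8],
    [3, 4, 25],
]
x2, df = bowker_statistic(counts)
print(f"X2 = {x2:.4f} on {df} degrees of freedom")
```

Under the S model the statistic is approximately chi-square distributed with R(R-1)/2 degrees of freedom when all off-diagonal pairs are observed.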
A multiway contingency table is obtained when a sample of $n$ observations is cross-classified with respect to $T$ categorical variables having the same number of categories. Such tables are very common in panel studies and matched-pair designs. The symmetry model can also be defined in the multidimensional case.
Denote the $k$th categorical variable by $X_k$ ($k = 1, \ldots, T$) and consider an $R^T$ contingency table ($T \geq 3$). Let $p_{i_1 \ldots i_T}$ denote the probability that an observation falls in the $(i_1, \ldots, i_T)$th cell of the table.
For example, when $T = 3$, let $X$, $Y$ and $Z$ denote the row, column and layer variables. The S model can then be expressed as
$$p_{ijk} = p_{ikj} = p_{jik} = p_{jki} = p_{kij} = p_{kji} \quad (i, j, k = 1, \ldots, R).$$
The simplest possible model of interest is the model of complete independence, in which the joint distribution of the three variables is the product of the marginals. The corresponding hypothesis is
$$p_{ijk} = p_{i \cdot \cdot} \, p_{\cdot j \cdot} \, p_{\cdot \cdot k}.$$
The symmetry model for multiway tables is given in general as
$$p_{i_1 \ldots i_T} = p_{j_1 \ldots j_T}$$
for any permutation $(j_1, \ldots, j_T)$ of $(i_1, \ldots, i_T)$.
The common schemes for representing contingency tables are based on row, column and layer variables that are independent. In three-way contingency tables, the choice of predictor and control variables is of interest to many researchers. The purpose of this paper is to give some models that represent subsymmetry and asymmetry in multiway contingency tables. We concentrate on three-dimensional tables, which are a cross-classification of observations by the levels of three categorical variables. The models are defined in the subsymmetry and asymmetry context, taking the first variable as a control variable. The models below are often used to analyze three-dimensional tables.
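As a point of reference for the complete symmetry model in a three-way table, the sketch below computes fitted counts by averaging the observed counts over all orderings of the cell indices, which is the maximum likelihood fit under multinomial sampling, and then evaluates the likelihood ratio statistic. The function names and the 2x2x2 counts are hypothetical; this illustrates the baseline S model only, not the subsymmetry models proposed in this paper.

```python
from itertools import permutations
from math import log


def symmetry_fit(n):
    """Fitted counts under complete symmetry for an R x R x R table n[i][j][k]."""
    r = len(n)
    m = [[[0.0] * r for _ in range(r)] for _ in range(r)]
    for i in range(r):
        for j in range(r):
            for k in range(r):
                # All 6 orderings of (i, j, k); repeated indices simply repeat cells.
                cells = list(permutations((i, j, k)))
                m[i][j][k] = sum(n[a][b][c] for a, b, c in cells) / len(cells)
    return m


def g2(n, m):
    """Likelihood ratio statistic 2 * sum n * log(n / m) over cells with n > 0."""
    r = len(n)
    return 2.0 * sum(
        n[i][j][k] * log(n[i][j][k] / m[i][j][k])
        for i in range(r)
        for j in range(r)
        for k in range(r)
        if n[i][j][k] > 0
    )


# Hypothetical 2x2x2 counts, for illustration only (not data from the paper).
observed = [
    [[10, 4], [6, 3]],
    [[2, 5], [7, 12]],
]
fitted = symmetry_fit(observed)
print("G2 for complete symmetry:", round(g2(observed, fitted), 4))
```

Cells whose three indices are equal are fitted exactly, so only the off-diagonal cells contribute to the statistic.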

Subsymmetry and asymmetry models
We collect the triplet $(X, Y, Z)$ for each unit in a sample of $n$ units; the data can then be summarized as a three-dimensional table. Let $p_{ijk}$ be the probability of units having $X = i$, $Y = j$ and $Z = k$. In what follows, we define some models that represent subsymmetry and asymmetry. Model 1: Subsymmetry matrices are defined by each dimension as: The V matrix corresponds to the cells on the main diagonal for $X \times Y \times Z$.
The conditional factor variables are defined for the asymmetric associations as follows: Conditional symmetry matrix: Using these factors, we analyze the models by the GLM approach.
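A generic way to write such a GLM formulation, assuming Poisson (or multinomial) sampling with a log link, is sketched below. The design vector $\mathbf{x}_{ijk}$, the parameter vector $\boldsymbol{\lambda}$ and the observed counts $n_{ijk}$ are placeholder symbols introduced here for illustration; they are not the paper's notation for any particular model.

```latex
\log m_{ijk} = \mathbf{x}_{ijk}^{\top} \boldsymbol{\lambda},
\qquad
G^{2} = 2 \sum_{i,j,k} n_{ijk} \log \frac{n_{ijk}}{\hat{m}_{ijk}}
```

Here $m_{ijk}$ is the expected frequency of cell $(i, j, k)$, each model corresponds to a different set of factor columns in the design matrix, and $G^2$ is the likelihood ratio chi-square used to assess fit.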

Numerical example
The data in Table 1 are taken directly from Yamamoto et al. [15] and give the results for the treatment group only in randomized clinical trials conducted by a pharmaceutical company in anemic patients with cancer receiving chemotherapy. The response is the patient's hemoglobin (Hb) concentration at baseline (before treatment) and following 4 and 8 weeks of treatment. The Hb response is classified as ≥ 10 g/dl, 8-10 g/dl and < 8 g/dl. The reference ranges for hemoglobin concentration in adults are 14.0-17.5 g/dl for men and 12.3-15.3 g/dl for women.
[Table 1: Hb concentration at baseline, 4 weeks and 8 weeks, each classified as ≥ 10 g/dl, 8-10 g/dl or < 8 g/dl]
Models (1-10) proposed here attempt to describe the relationship between X, Y and Z, taking "Baseline" as the control.
An example design matrix, for Model (8), is given in Table 2.
Design matrices are generated for each model. Likelihood ratio chi-square values with the associated degrees of freedom, AIC and BIC are given in Table 3. Model comparison criteria, used in addition to the goodness-of-fit tests, give better information on which model represents the data best.
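The comparison criteria can be computed directly from each model's likelihood ratio chi-square. The sketch below assumes one convention that is common for contingency tables, AIC = G2 - 2 df and BIC = G2 - df ln(n); the paper does not state which definitions it uses, and the model labels and numbers are hypothetical rather than values from Table 3.

```python
from math import log


def information_criteria(g2, df, n):
    """AIC and BIC from a likelihood ratio chi-square (assumed convention).

    AIC = G2 - 2 * df and BIC = G2 - df * ln(n); smaller values are preferred.
    """
    aic = g2 - 2 * df
    bic = g2 - df * log(n)
    return aic, bic


# Hypothetical fits (G2, df) and sample size, for illustration only.
fits = {"Model A": (12.4, 10), "Model B": (9.8, 6)}
sample_size = 200

criteria = {name: information_criteria(g2, df, sample_size) for name, (g2, df) in fits.items()}
best_by_aic = min(criteria, key=lambda name: criteria[name][0])
best_by_bic = min(criteria, key=lambda name: criteria[name][1])
print(best_by_aic, best_by_bic)
```

With this convention a model is preferred when it achieves a small G2 while leaving many degrees of freedom, so the criteria reward parsimony as well as fit.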
The results show that all models fit the data well. The smallest values of both AIC and BIC are obtained for Model (8). Note that Model (8) and Model (9) are the conditional models that collapse the baseline variable. Recall that Model (8) is Correspondingly, denoting the expected frequencies by $m_{ijk}$, Model (8) is represented as In this model representation, "Baseline" is the control variable; therefore it is not included in the parameters.
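Model (8) tests a multiplicative hypothesis in which $p_{ijk}$ is the product of the effects $\beta_j$ and $\gamma_k$ with the $\psi$, $\theta$, $\eta$ and $\xi$ factors, and it is fitted to the frequencies of the $Y \times Z$ table. A subject whose hemoglobin level is ≥ 10 g/dl at baseline is 13.10 times more likely to be at ≥ 10 g/dl than at 8-10 g/dl at the consecutive 4- and 8-week measurements.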
The Hb concentration tends to decrease from baseline throughout the 8 weeks, since the maximum likelihood estimates are less than 1. Therefore, under Model (9), given that a patient's Hb concentration at 4 weeks is ≥ 10 g/dl, the odds that the patient's level is ≥ 10 g/dl rather than 8-10 g/dl are 14.76 times higher at baseline than at 8 weeks.
The odds ratios greater than one under Model (8) and Model (9) indicate that an Hb concentration of ≥ 10 g/dl is more likely to occur at baseline than after 4 and 8 weeks.

Conclusions
We considered subsymmetry models for multiway square contingency tables in which the main diagonal is not of interest. The models are established to analyze square multidimensional contingency tables with ordered categories. We see from the results that the models described here can be applied to a multiway table.
We applied the models to the patients' hemoglobin concentration data set to illustrate them. The response was the patient's hemoglobin (Hb) concentration at baseline (before treatment) and following 4 weeks and 8 weeks of treatment. The primary goal was to compare the baseline levels with those at the 4th and 8th weeks, taking the baseline as a layer variable. We were interested in how a patient's Hb concentration changes from baseline over time, and in particular in whether there is an asymmetric transition of those concentrations given their value at baseline. The advantage of the models proposed here is that they allow the conditional odds ratios to be analyzed as well as the parameter estimates. Extensions to k-way tables are straightforward.