Collapsed Double Symmetry Model and Its Decomposition for Square Contingency Tables

For a square contingency table with ordinal categories, one may want to analyze several collapsed tables obtained by combining adjacent categories of the original table. This paper proposes new models that indicate double symmetry, quasi double symmetry and marginal double symmetry for the collapsed square tables. It also gives a decomposition of the double symmetry model for collapsed tables. Two sets of occupational mobility data are analyzed using the new models.


Introduction
Consider square contingency tables having the same ordinal row and column classifications, for instance, Tables 1a and 2a. For these data, many observations tend to fall in (or near) the main diagonal cells. Thus, for such data, independence between the row and column variables often does not hold, so we are interested in symmetry instead of independence. Bowker (1948) proposed the symmetry (S) model, which indicates that the probabilities are symmetric with respect to the main diagonal of the table. Wall and Lienert (1976) proposed the point-symmetry (PS) model, which indicates that the probabilities are symmetric with respect to the center point of the table. Tomizawa (1985) proposed the double symmetry (DS) model, under which both the S and PS models hold. For other models of symmetry, see, for example, Caussinus (1965), Bishop, Fienberg and Holland (1975, Chap. 8), Agresti (2013, Chap. 11), and Tahata and Tomizawa (2014).
Table 1. Occupational status for Japanese father-daughter pairs; from Hashimoto (2003, p.144). Surviving panel labels and notes: $T^{(1,4)}$ table; status categories (1) Professional, technical, kindred white-collar managers, officials, (2) Clerical and sales white collar workers, (3) Craftsmen (blue collar workers), (4) Operatives, service workers, laborers excepting farm workers (blue collar workers); (b) Collapsed table ($T^{(1,3)}$ table).
For analyzing the data in Tables 1a and 2a, in some cases we would like to divide the occupational status into simpler categories, for example, "High", "Middle" and "Low". Table 1b is the collapsed table whose "High" category is "(1) capitalist" in Table 1a, whose "Middle" category is obtained by combining the "(2) new middle", "(3) working" and "(4) self-employed" categories in Table 1a, and whose "Low" category is "(5) farming" in Table 1a. In other words, Table 1b is obtained by collapsing Table 1a into the 3 × 3 table by using cut points after the first and fourth rows and after the first and fourth columns.
Yamamoto, Tahata and Tomizawa (2012) proposed some S models for collapsed square contingency tables, for example, the collapsed quasi-symmetry model. Yamamoto, Murakami and Tomizawa (2013) proposed some PS models for collapsed square contingency tables, for example, the collapsed quasi point-symmetry model. Tomizawa (1985) gave the theorem that the DS model holds if and only if both the quasi double symmetry (QDS) and marginal double symmetry (MDS) models hold. See the Appendix for the DS, QDS and MDS models.
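The collapsing operation described for Table 1b can be sketched in code. This is a minimal illustration, assuming cut points are given as 1-based positions "after row/column h" and "after row/column h′"; the function name `collapse` and the 5 × 5 example counts are hypothetical and are not the data of Table 1a.

```python
import numpy as np

def collapse(table, h, h_prime):
    """Collapse an r x r table into a 3 x 3 table.

    Cut points are placed after row/column h and after row/column h_prime
    (1-based), giving groups A = 1..h, B = h+1..h_prime, C = h_prime+1..r.
    """
    table = np.asarray(table)
    r = table.shape[0]
    if table.shape != (r, r):
        raise ValueError("table must be square")
    if not (1 <= h < h_prime <= r - 1):
        raise ValueError("need 1 <= h < h_prime <= r - 1")
    bounds = [0, h, h_prime, r]  # half-open index ranges of groups A, B, C
    collapsed = np.zeros((3, 3), dtype=table.dtype)
    for k in range(3):
        for l in range(3):
            block = table[bounds[k]:bounds[k + 1], bounds[l]:bounds[l + 1]]
            collapsed[k, l] = block.sum()
    return collapsed

# Hypothetical 5 x 5 counts (NOT the data of Table 1a).
example = np.arange(1, 26).reshape(5, 5)
# Cut points after the first and fourth rows and columns, as for Table 1b.
collapsed = collapse(example, 1, 4)
print(collapsed)
```

Collapsing only sums cells within blocks, so the grand total of the collapsed table equals that of the original table.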
In the present paper, Section 2 proposes some new models for the collapsed tables and gives the decomposition for the new model. These proposed models are different from existing models. Section 3 describes the goodness-of-fit test, and analyzes the data in Tables 1a and 2a using the new models. Section 4 provides remarks.

New Models and Decomposition
Consider the r × r contingency table having ordered categories. Let $p_{ij}$ denote the probability that an observation will fall in the (i, j)th cell of the r × r table (i = 1, ..., r; j = 1, ..., r).
Consider collapsing the r categories of an original table with ordered categories into 3 categories (say, groups A, B and C) by using cut points h and h′. We refer to such a collapsed 3 × 3 table as the $T^{(h,h')}$ table. We propose the collapsed double symmetry (CoDS), collapsed quasi double symmetry (CoQDS) and collapsed marginal double symmetry (CoMDS) models.
The CoDS model states that, for each collapsed $T^{(h,h')}$ table (h = 1, ..., [(r − 1)/2]), (i) the probability that both the row and column values of an observation are in group A equals the probability that both are in group C; (ii) the probability that the row and column values are in groups A and B, respectively, equals the probability that they are in groups B and A, respectively, and also equals the probability that they are in groups C and B, respectively, and the probability that they are in groups B and C, respectively; and (iii) the probability that the row and column values are in groups A and C, respectively, equals the probability that they are in groups C and A, respectively.
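Conditions (i) to (iii) can also be written in symbols. The notation here is an assumption introduced for illustration and may differ from the paper's own: for one collapsed table, $G_{kl}$ denotes the probability that the row value is in group k and the column value is in group l, with groups A, B and C numbered 1, 2 and 3.

```latex
\begin{align*}
\text{(i)}\quad   & G_{11} = G_{33}, \\
\text{(ii)}\quad  & G_{12} = G_{21} = G_{32} = G_{23}, \\
\text{(iii)}\quad & G_{13} = G_{31}.
\end{align*}
```

Under this notation the three conditions amount to $G_{kl} = G_{lk} = G_{4-k,\,4-l}$ for k, l = 1, 2, 3, that is, symmetry and point-symmetry of the 3 × 3 table, with $G_{22}$ left unrestricted.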

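First, we propose the CoDS model, defined by requiring conditions (i) to (iii) above in each collapsed $T^{(h,h')}$ table (h = 1, ..., [(r − 1)/2]).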
Next, we propose the CoQDS model, which states that there is a structure of QDS in each collapsed $T^{(h,h')}$ table, with cells indexed by k = 1, 2, 3 and l = 1, 2, 3, for h = 1, ..., [(r − 1)/2]. Note that the CoDS model is a special case of the CoQDS model, obtained by restricting its parameters. Also note that the CoQDS model is identical to the collapsed quasi point-symmetry model in Yamamoto et al. (2013), because the QDS model is identical to the quasi point-symmetry model only when the square contingency table is a 3 × 3 table. For the quasi point-symmetry model, see Yamamoto et al. (2013).
Finally, we propose the CoMDS model, which states that there is a structure of MDS in each collapsed $T^{(h,h')}$ table. In a similar manner to the explanation of the CoDS model, the CoMDS model indicates that for each collapsed $T^{(h,h')}$ table (h = 1, ..., [(r − 1)/2]), (i) the probability that the row value is in group A equals the probability that the row value is in group C, the probability that the column value is in group A, and the probability that the column value is in group C; and (ii) the probability that the row value is in group B equals the probability that the column value is in group B. Namely, for the original table, the probability that the row variable is h or below equals the probability that it is h′ or above, the probability that the column variable is h or below, and the probability that the column variable is h′ or above (h = 1, ..., [(r − 1)/2]).
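The marginal conditions (i) and (ii) can be checked numerically on a single collapsed 3 × 3 table. The sketch below is illustrative only: the function names and the example counts are hypothetical, not taken from the paper or its data. A double symmetry check is included for contrast, since a table can have an MDS structure without having a DS structure.

```python
import numpy as np

def has_mds_structure(G, tol=1e-9):
    """Check the marginal conditions (i) and (ii) on a 3 x 3 table G."""
    G = np.asarray(G, dtype=float)
    row, col = G.sum(axis=1), G.sum(axis=0)
    # (i) row A = row C = column A = column C
    end_groups = np.array([row[0], row[2], col[0], col[2]])
    cond_i = np.allclose(end_groups, end_groups[0], atol=tol)
    # (ii) row B = column B
    cond_ii = np.isclose(row[1], col[1], atol=tol)
    return bool(cond_i and cond_ii)

def has_ds_structure(G, tol=1e-9):
    """Check symmetry and point-symmetry of a 3 x 3 table G."""
    G = np.asarray(G, dtype=float)
    symmetric = np.allclose(G, G.T, atol=tol)
    # Reversing both axes maps cell (k, l) to its point-symmetric cell.
    point_symmetric = np.allclose(G, G[::-1, ::-1], atol=tol)
    return bool(symmetric and point_symmetric)

# Hypothetical counts: marginals are (6, 8, 6) for rows and columns,
# but the table is not symmetric about the main diagonal.
mds_only = np.array([[2, 3, 1],
                     [1, 4, 3],
                     [3, 1, 2]])
# Hypothetical counts satisfying both symmetry and point-symmetry.
ds_table = np.array([[2, 1, 3],
                     [1, 4, 1],
                     [3, 1, 2]])

print(has_mds_structure(mds_only), has_ds_structure(mds_only))
print(has_mds_structure(ds_table), has_ds_structure(ds_table))
```

Both checks are scale-invariant, so they can be applied to counts or to probabilities. They test exact equalities up to a tolerance and are not a substitute for the goodness-of-fit test.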
Applying the decomposition theorem of the DS model in Tomizawa (1985) (described in Section 1) to each collapsed 3 × 3 table, we obtain the following theorem.
Theorem 1. The CoDS model holds if and only if both the CoQDS and CoMDS models hold.

Goodness-of-fit Test
Assume that a random sample of fixed size n is cross-classified according to the categorical variables. The distribution of the cell counts {$n_{ij}$} is then the multinomial distribution specified by the sample size n and the population cell probabilities {$p_{ij}$}. We can obtain the maximum likelihood estimates (MLEs) of expected frequencies under the models by applying the Newton-Raphson method to the log-likelihood equations. The likelihood-ratio approach to testing models leads to the test statistic
$$G^2 = 2 \sum_{i=1}^{r} \sum_{j=1}^{r} n_{ij} \log\left(\frac{n_{ij}}{\hat{m}_{ij}}\right),$$
where $\hat{m}_{ij}$ is the MLE of the expected frequency $m_{ij}$ under the model. The number of degrees of freedom for the CoDS model is 5(r − 2)/2 when r is even and 5(r − 1)/2 when r is odd; that for the CoQDS model is r − 2 when r is even and r − 1 when r is odd; and that for the CoMDS model is 3(r − 2)/2 when r is even and 3(r − 1)/2 when r is odd. Note that the number of degrees of freedom for the CoDS model is equal to the sum of those for the CoQDS and CoMDS models.
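The degrees of freedom stated above and the likelihood-ratio statistic can be computed as follows. This is a sketch under stated assumptions: the function names are hypothetical, the fitted frequencies are taken as given (the Newton-Raphson fitting step is not implemented here), and cells with a zero observed count are treated as contributing zero to the sum.

```python
import numpy as np

def degrees_of_freedom(r):
    """Degrees of freedom of the three models for an r x r table.

    (r - 1) // 2 equals (r - 2)/2 for even r and (r - 1)/2 for odd r,
    which is the number of values taken by h.
    """
    n_collapsed = (r - 1) // 2
    return {"CoDS": 5 * n_collapsed,
            "CoQDS": 2 * n_collapsed,
            "CoMDS": 3 * n_collapsed}

def likelihood_ratio_statistic(observed, fitted):
    """Likelihood-ratio statistic: 2 * sum of n_ij * log(n_ij / m_hat_ij)."""
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    positive = observed > 0  # convention: 0 * log(0) = 0
    ratio = observed[positive] / fitted[positive]
    return 2.0 * float(np.sum(observed[positive] * np.log(ratio)))

for r in (4, 5):
    df = degrees_of_freedom(r)
    # The CoDS degrees of freedom are the sum of the other two.
    assert df["CoDS"] == df["CoQDS"] + df["CoMDS"]
    print(r, df)
```

The statistic is zero when the fitted frequencies reproduce the observed counts exactly and grows as the fit worsens.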

Analysis of Data in Table 1
Consider the data in Table 1a, taken from Hashimoto (2003, p.144), which cross-classify fathers' and their daughters' occupational status categories in Japan.

Table 2. Status of occupations for the husband's father and the wife's father; from Katz (1978). (The upper and lower parenthesized values are the MLEs of expected frequencies under the CoDS and CoMDS models, respectively.)

We point out that the CoDS model is totally different from the DS model, because the CoDS model indicates that there are DS structures in all collapsed tables. We note that the CoDS model holds if the DS model holds, but the converse does not necessarily hold.
Because the CoMDS model imposes an MDS structure on each collapsed $T^{(h,h')}$ table (h = 1, ..., [(r − 1)/2]), it is different from the MDS model. The MDS model indicates that the row and column marginal distributions are symmetric and point-symmetric with respect to the midpoint of the row and column categories. Note that the CoMDS model holds if the MDS model holds, but the converse holds only when r is odd.
Table 1b is obtained by collapsing Table 1a into the 3 × 3 table by using cut points after the first and fourth rows and columns.Table 1c is obtained by collapsing Table 1a into the 3 × 3