A nonparametric method for estimating interaction effect of age and period on mortality.

In this paper we introduce a new model and develop an estimation strategy for analyzing mortality data. The model has the specific structure $E[\log q_{ij}] = \mu + \alpha_i + \beta_j + \rho_{ij}$, subject to the linear restrictions $\sum_i \alpha_i/\sigma_i^2 = \sum_j \beta_j = \sum_i \rho_{ij}/\sigma_i^2 = \sum_j \rho_{ij} = 0$ for any $i$ and $j$; here, $q_{ij}$ denotes the mortality for the $j$th period category and $i$th age category, $\mu$ denotes the overall mean, $\alpha_i$ the $i$th age effect in antichronological order, $\beta_j$ the $j$th period effect, and $\rho_{ij}$ the general interaction effect; $\sigma_i^2$ is the common error variance of $\log q_{ij}$ for the $i$th age group. We propose a technique that combines ANOVA and nonparametric smoothing for estimating these parameters. The methods are illustrated with mortality data on rectum cancer in Japanese males and females between 1950 and 1986.


Introduction
An age-period-cohort (APC) model, or some modification of it, has often been used for analyzing mortality data. If the model is fitted correctly, it yields useful summaries of the data in terms of its parameters. However, there is a well-known difficulty in estimating the parameters, because the APC model lacks identifiability (1)(2)(3)(4): cohort is determined by age and period (cohort = period − age), so the three sets of effects cannot be separated without further assumptions. An arbitrary constraint on the parameters of the model is required to determine a unique estimate.
To date, various constraints have been proposed by many researchers (5). Holford (2) proposed that analysts concentrate their discussion only on estimable functions, such as the curvature component of each effect. Tango (4) analyzed Japanese mortality and detected interesting curvatures in the cohort effect, and Hirotsu (6) introduced a class of estimable components and discussed, in the context of the one-way analysis of variance, how to detect a systematic change in cohort effects without being affected by short-term fluctuation. This paper introduces a new age-period model that is free from the identifiability problem, and proposes a method of fitting the model to APC data by means of a nonparametric smoothing technique.

Statistical Model
We derive a new model by replacing the cohort-effect term in the ordinary APC model with a term of general age x period interaction. Thus, our model is expressed as
Model I: $\log q_{ij} = \mu + \alpha_i + \beta_j + \rho_{ij} + \varepsilon_{ij}$,
subject to the restrictions
$\sum_i \alpha_i/\sigma_i^2 = \sum_j \beta_j = \sum_i \rho_{ij}/\sigma_i^2 = \sum_j \rho_{ij} = 0$ (1)
for $j = 1, \ldots, p$ and $i = 1, \ldots, a$; here, $q_{ij}$ denotes the observed mortality rate for the $j$th period category and the $i$th age category, $\mu$ is the overall mean, $\alpha_i$ is the fixed effect of the $i$th age category, $\beta_j$ is the fixed effect of the $j$th period category, and $\rho_{ij}$ is the fixed interaction effect associated with the $i$th age category and the $j$th period category. The only random component is $\varepsilon_{ij}$, which is assumed to be independently distributed with mean $E[\varepsilon_{ij}] = 0$ and variance $\sigma_i^2$. When there exist parameters $\theta$ and $\xi$ such that $\rho_{ij} = \theta_i \xi_j$ for all $i$ and $j$, Model I becomes the APC model attributable to James and Segal (7) with no cohort-effect term. Furthermore, when there is no age x period interaction, Model I reduces to the popular two-factor age-period model (5). In this paper we call the reduced model the age-period-main-effect model, which is expressed as
Model II: $\log q_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$.
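A short parameter count, offered here as a check rather than as part of the original derivation, suggests why the restrictions in Eq. (1) leave Model I identifiable. With $a$ age categories and $p$ period categories there are $ap$ cell means. The restrictions remove one degree of freedom from the age effects, one from the period effects, and $a + p - 1$ from the interaction terms (the $a$ row restrictions and $p$ column restrictions share one redundancy, because both imply that the weighted double sum of the interaction terms is zero):

```latex
\underbrace{1}_{\mu}
+ \underbrace{(a-1)}_{\alpha_i}
+ \underbrace{(p-1)}_{\beta_j}
+ \underbrace{ap-(a+p-1)}_{\rho_{ij}}
= ap .
```

The number of free parameters therefore equals the number of cell means, so each parameter is a fixed linear function of the cell means and no further arbitrary constraint is needed.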
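The advantage of using Model I is that various types of age x period interaction, including the so-called co- […] will be referred to as Poisson variability in this paper.) Therefore, to obtain highly efficient estimates, we use a weighted least-squares technique rather than the ordinary one in estimating the unknown parameters in Model I. If weights proportional to the reciprocals of $\sigma_i^2$ are adopted, the least-squares solution is chosen to minimize the weighted residual sum of squares, which gives
$\mu^* = y_{\cdot\cdot}$, $\alpha_i^* = y_{i\cdot} - y_{\cdot\cdot}$, $\beta_j^* = y_{\cdot j} - y_{\cdot\cdot}$, (2)
where $y_{ij} = \log q_{ij}$, $y_{i\cdot} = \frac{1}{p}\sum_j y_{ij}$, $y_{\cdot j} = \sum_i c_i^* y_{ij}$, $y_{\cdot\cdot} = \frac{1}{p}\sum_i \sum_j c_i^* y_{ij}$, and $c_i^* = \{\sigma_i^2 \sum_k \sigma_k^{-2}\}^{-1}$. Note that the solution ($\mu^*$, $\alpha^*$, and $\beta^*$) given by Eq. (2) will also be derived from minimizing the weighted residual sum of squares from Model II,
$\sum_i \sum_j (y_{ij} - \mu - \alpha_i - \beta_j)^2 / \sigma_i^2$,
under the restriction of Eq. (1). When $\sigma_i^2$ is unknown, as in the ordinary situation, we substitute a suitable estimate $s_i^2$ for the unknown parameter $\sigma_i^2$ in the weights $c_i^*$ and deduce the approximate least-squares estimates of $\mu$, $\alpha$, and $\beta$ as follows:
$\hat{\mu} = y_{\cdot\cdot}$, $\hat{\alpha}_i = y_{i\cdot} - y_{\cdot\cdot}$, $\hat{\beta}_j = y_{\cdot j} - y_{\cdot\cdot}$, (3)
where now $y_{\cdot j} = \sum_i c_i y_{ij}$, $y_{\cdot\cdot} = \frac{1}{p}\sum_i c_i \sum_j y_{ij}$, and $c_i = \{s_i^2 \sum_k s_k^{-2}\}^{-1}$.
[Figure caption fragment: In these plots, "fully blackened" represents $A_{ij} = \mathrm{antilog}(\hat{\rho}_{ij}) \geq 1.2$.]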
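The estimation step can be sketched in code. The following Python fragment is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that $y_{ij} = \log q_{ij}$ is stored in an $a \times p$ array, that variance estimates $s_i^2$ are already available (the formula for them is not reproduced here), and that the estimates take the weighted-mean form $\hat{\mu} = y_{\cdot\cdot}$, $\hat{\alpha}_i = y_{i\cdot} - y_{\cdot\cdot}$, $\hat{\beta}_j = y_{\cdot j} - y_{\cdot\cdot}$ with weights $c_i$ proportional to $1/s_i^2$. The function name `fit_main_effects` and its arguments are hypothetical.

```python
import numpy as np

def fit_main_effects(y, s2):
    """Approximate weighted least-squares fit of the age-period main-effect model.

    y  : (a, p) array of log mortality rates (rows = age groups, columns = periods)
    s2 : (a,) array of estimated error variances s_i^2, one per age group
    Returns (mu, alpha, beta, r), where r holds the main-effect residuals r_ij.
    """
    y = np.asarray(y, dtype=float)
    s2 = np.asarray(s2, dtype=float)

    # Weights c_i proportional to 1 / s_i^2, normalised to sum to one.
    c = (1.0 / s2) / np.sum(1.0 / s2)

    y_row = y.mean(axis=1)   # y_{i.}: plain mean over periods within an age group
    y_col = c @ y            # y_{.j}: weighted mean over age groups within a period
    mu = y_col.mean()        # y_{..}: overall weighted mean

    alpha = y_row - mu       # age effects
    beta = y_col - mu        # period effects

    # Residuals from the main-effect model; these are the input to smoothing.
    r = y - mu - alpha[:, None] - beta[None, :]
    return mu, alpha, beta, r
```

By construction the returned values satisfy the restrictions with $s_i^2$ in place of $\sigma_i^2$: the weighted sum of the age effects and the plain sum of the period effects are zero, and the residuals sum to zero along each row and, with weights $c_i$, down each column.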

Illustration of Data Analysis
We now illustrate our method using real mortality data. The data concern rectum cancer or, more precisely, malignant neoplasm of the rectum, rectosigmoid junction, and anus. Mortality rates in Japanese males and females were obtained from the Japanese Vital Statistics List published from 1950 to 1986 (12). Figures 1 and 2 give the age-specific plots of the logarithmic mortality rates per 100,000 for the male and female data sets, respectively. These rates are based on 5-year age intervals and single-year period intervals. Figure 3 shows the sex-age-specific plots of the square root of $s_i^2$ calculated from Eq. (4). Briefly, a common pattern is observed in the plots for both sexes: the values of $s_i$ for the younger age groups are larger than those for the middle or advanced age groups. This trend is surely the result of Poisson variability in the observed number of deaths. Figures 4 and 5 give the plots of the estimated age effects $\hat{\alpha}_i$ and period effects $\hat{\beta}_j$ computed from Eq. (3).
[Figure caption: Age-specific plots of the square root of the mean square error and predicted values from Poisson variability on logarithmic mortality rates fitted to malignant neoplasms of the rectum, rectosigmoid junction, and anus in Japanese females between 1950 and 1986.]
According to the age-effect plots for both sexes, the effect increases with age and there is no noticeable difference between the effects for males and females. On the other hand, the period-effect plots show a clear peak around the 1970s for females, while no such peak exists for males. A two-dimensional plot visualizing the residuals from the age-period-main-effect model is given in Figure 6, which uses a contrast to represent the value of $r_{ij}$ computed from Eq. (5) for each $(i, j)$ cell. The rule for making the contrast for the $(i, j)$ cell in this plot is as follows.
According to each computed value of $A_{ij} = \mathrm{antilog}(r_{ij})$, we produced a cell pattern: "fully blackened" if $1.2 \le A_{ij}$, "double shaded" if $1.1 \le A_{ij} < 1.2$, "single shaded" if $1.05 \le A_{ij} < 1.1$, "dotted" if $0.95 < A_{ij} < 1.05$, and "whitened" if $A_{ij} \le 0.95$, for $i = 1, \ldots, a$; $j = 1, \ldots, p$. A brief outline of the structural age x period interaction pattern can be seen in this plot.
To remove the noisy fluctuation in the residuals, we apply nonparametric smoothing to the sequence $\{r_{ij} : j = 1, \ldots, p\}$ for each $i = 1, \ldots, a$. Figure 7 illustrates the result of nonparametric smoothing for the female data of two groups: 40 to 44 years of age and 75 to 79 years of age. After the smoothing step, we obtained a clearer interaction pattern, as shown in Figure 8, where the same plotting rule as in Figure 6 was adopted in making the contrast, replacing $r_{ij}$ by the smoothed value $\hat{\rho}_{ij}$. In Figure 8 we can detect two different kinds of age x period interaction: one seems to be related to a cohort effect and the other does not. The former is a noticeable cluster of high interaction around the birth cohort born in the 1960s. It is interesting that a strong resemblance exists between the estimated age x period interaction patterns for males and females, although the corresponding period-effect patterns are remarkably different, as shown in Figure 5.
Figure 9 shows a plot of the residuals $\hat{\varepsilon}_{ij} = r_{ij} - \hat{\rho}_{ij}$, where the same plotting rule is adopted as in Figure 6, replacing $r_{ij}$ by $\hat{\varepsilon}_{ij}$. No lack of fit of Model I is suggested by this plot. To inspect the goodness of fit further, we computed the mean square error from Model I for each age group. Figures 10 and 11 show the age-specific plots of the square root of these (approximate) mean square errors and the predicted values from Poisson variability for the data on males and females, respectively.
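The plotting rule and the smoothing step can be sketched as follows. This is an illustration under assumptions, not the procedure used for the figures: the antilog is taken to be the exponential (natural logarithms assumed), and a centered moving average with reflected edges stands in for the nonparametric smoother, whose exact form is not given in this section. The names `cell_patterns`, `smooth_rows`, and `window` are hypothetical.

```python
import numpy as np

# Pattern names ordered from the lowest to the highest range of A_ij.
PATTERNS = np.array(["whitened", "dotted", "single shaded",
                     "double shaded", "fully blackened"])

def cell_patterns(r):
    """Map residuals r_ij to cell patterns via A_ij = exp(r_ij)."""
    A = np.exp(np.asarray(r, dtype=float))
    # Count how many thresholds each A_ij reaches; the count indexes PATTERNS.
    code = ((A > 0.95).astype(int) + (A >= 1.05)
            + (A >= 1.1) + (A >= 1.2))
    return PATTERNS[code]

def smooth_rows(r, window=5):
    """Smooth each age group's residual sequence over periods.

    A centered moving average of odd length `window` is applied to every row;
    the series is reflected at both ends so the output keeps the input shape.
    """
    r = np.asarray(r, dtype=float)
    half = window // 2
    padded = np.pad(r, ((0, 0), (half, half)), mode="reflect")
    kernel = np.ones(window) / window
    return np.vstack([np.convolve(row, kernel, mode="valid") for row in padded])

# Example with made-up residuals for 3 age groups and 9 periods.
rng = np.random.default_rng(0)
r = rng.normal(scale=0.1, size=(3, 9))
rho_hat = smooth_rows(r)          # smoothed interaction estimates
eps_hat = r - rho_hat             # residuals from Model I
print(cell_patterns(rho_hat))
```

A moving average is only one possible choice; a running median or a kernel or spline smoother could be substituted in `smooth_rows` without changing the rest of the sketch.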
We can see from these plots that there is good agreement between the observed values and those predicted from Poisson variability, suggesting that no additional variation beyond Poisson variability exists in these data.
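The comparison with Poisson variability can be illustrated numerically. The sketch below assumes the usual delta-method reading, which may differ in detail from the definition used in the paper: if the number of deaths is Poisson with mean $\lambda$ and the population at risk is fixed, the variance of the log mortality rate is approximately $1/\lambda$, so its standard deviation is about $1/\sqrt{\lambda}$. The function names and the death counts are hypothetical.

```python
import numpy as np

def poisson_sd_log_rate(expected_deaths):
    """Delta-method standard deviation of a log mortality rate under Poisson counts."""
    return 1.0 / np.sqrt(np.asarray(expected_deaths, dtype=float))

def simulated_sd_log_rate(expected_deaths, population, n_sim=20000, seed=0):
    """Monte Carlo standard deviation of log(deaths / population)."""
    rng = np.random.default_rng(seed)
    deaths = rng.poisson(expected_deaths, size=n_sim)
    deaths = deaths[deaths > 0]   # the log rate is undefined for zero deaths
    return np.std(np.log(deaths / population), ddof=1)

# Made-up expected death counts: a small count (as in a young age group)
# and a large count (as in an older age group).
for lam in (25.0, 400.0):
    print(lam, poisson_sd_log_rate(lam), simulated_sd_log_rate(lam, population=1e6))
```

Under this reading, age groups with few deaths have a larger predicted standard deviation, which is in line with the larger values of $s_i$ reported for the younger age groups.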

Additional Remarks
In Model I we take age and period, rather than age and cohort, as the main effects. One reason for choosing the age-period-main-effect model is that it is easier to handle than the age-cohort-main-effect model in parameter estimation. For example, since only a limited amount of mortality data is available for extremely old or new cohorts, some treatment of missing values is inevitable when the age-cohort-main-effect model is adopted; such treatment is not required when the age-period-main-effect model is used. A more essential reason for our choice lies in our belief that recent improvements in medical or public health care, such as advances in developing effective medicines and the introduction of screening programs, are equally effective over all age groups. For such situations, we suppose that the age-period-main-effect model is more suitable than the age-cohort-main-effect model for describing mortality rates.