A Computational Study Assessing Maximum Likelihood and Noniterative Methods for Estimating the Linear-by-Linear Parameter for Ordinal Log-Linear Models

For ordinal log-linear models, the estimation of the parameter reflecting the linear-by-linear measure of association has long been a topic for the analysis of dependence for contingency tables. Typically, iterative procedures (including Newton’s method) are used to determine the maximum likelihood estimate of the parameter. Recently Beh and Farver (2009, ANZJS, 51, 335–352) show by way of example three reliable and accurate noniterative techniques that can be used to estimate the parameter and extended this study by examining their reliability computationally. This paper further investigates the reliability of the non-iterative procedures when compared with Newton’s method for estimating this association parameter and considers the impact of the sample size on the estimate.


The Ordinal Log-Linear Model
Consider an I × J two-way contingency table, N, where the (i, j)th cell entry is given by n i j for i = 1, . . ., I and j = 1, . . ., J. Let the grand total of N be n and the matrix of joint relative frequencies be P so that the (i, j)th cell entry is p i j = n i j /n and I i=1 J j=1 p i j = 1.Define the ith row marginal proportion by p i• = J j=1 p i j , and define the jth column marginal proportion as p • j = I i=1 p i j .When the row and column variables both consist of ordered categories, the (i, j)th cell may be modeled by considering the following log-linear model: ( The values u and v are the weighted average of the row and column scores assigned a priori to the categories of the row and column variables, respectively.The parameters α i and β j are constrained so that I i=1 α i = J j=1 β j = 1 and are treated as nuisance parameters.Thompson [1, pages 164-197] provides a description how one may use S-PLUS, and R, to obtain the ML estimates the parameters from this model.
An alternative model that may be considered is so that the (i, j)th cell frequency can be estimated by The parameter of interest in (3) is φ and is a measure of linear-by-linear association between the variables of the table.Thus, in this paper, μ, α i and β j (and hence α i and β j ) are treated as nuisance parameters, and can be estimated by ln p i• , respectively; see, for example, Beh and Davy [2].The importance of the interpretation of the parameter φ can be highlighted by considering that ln m i j m i+1, j+1 m i, j+1 m i+1, j = φ(u i+1 − u i ) v j+1 − v j . (5) Therefore, if the row scores (and column scores) are chosen to be consecutive integers, so that u i+1 −u i = 1 and v j+1 −v j = 1, then φ = ln m i j m i+1, j+1 m i, j+1 m i+1, j (6) and is the constant log-odds ratio of the contingency table.
An alternative approach to estimating φ involves a more direct estimation procedure and was proposed by Beh and Davy [2].They considered a moderately reliable and relatively mathematically simple means of estimating φ noniteratively.Beh and Farver [9] greatly improved upon this approach.By applying these estimation techniques to fourteen commonly analysed two-way contingency tables considered in the statistics literature, Beh and Farver [9] showed that noniterative procedures provide estimates of φ that are comparable, and in most cases statistically indistinguishable, to ML estimates using Newton's unidimensional method.Here we further elaborate on this empirical study by examining the relative stability of estimating φ noniteratively and by using Newton's unidimensional method.Also discussed is the impact the sample size has when using iterative and noniterative estimation procedures.

MLE (Newton's Unidimensional Method).
There are a number of iterative algorithms that may be used for obtaining the maximum likelihood estimate (MLE) of φ from model (3).Some procedures include iterative proportional fitting, as described by Agresti [5, pages 64-66], and Nelder and Wedderburn's [6] quasilikelihood approach.Another popular iterative approach, and one adopted in this paper, is Newton's unidimensional method.Refer to Goodman [3], Haberman [4,10], and Agresti [5] for further details on the method.Here a brief description of Newton's method will be provided although a more detailed account on the algorithm used in this paper is provided in Beh and Farver [9].
Suppose that n i j are independent Poisson random variables.At the (t + 1)th iteration of the algorithm, the MLE of φ from the tth iteration is updated such that i j (7) until | φ (t+1) NR − φ (t) NR | < ε, for some reasonable value of ε, say ε = 0.0001.Here, m (t)  i j is just (3) with φ (t) N in place of φ.When this stopping criterion is satisfied, the MLE of φ using Newton's method, φ N , is the final iterative estimate φ (t+1) N .In the study undertaken by Beh and Farver [9], an initial value of φ (0) NR = 0 was chosen and, as (2) indicates, coincides with independence between the two variables.The standard error of the MLE of φ is Beh and Farver [11] showed that the maximum standard error of the estimate of φ for a contingency table coincides with the hypothesis of independence (φ = 0) and may be calculated by SE(φ There are various adaptations of this algorithm that can be considered to accelerate the estimation of φ or to ensure the algorithm converges.Due to the substantial level of technical expertise required to implement these algorithms, we will consider the traditional unidimensional version of Newton's method.[2] considered various issues concerned with the noniterative estimation of φ for model (2).As part of their study, they derived a simple means of estimating the parameter by considering

Noniterative Methods. Originally, Beh and Davy
where σ 2 Beh and Farver [9] continued to examine these issues and proposed the following two noniterative estimates: where (10) is referred to as the Log NI estimate.Here, (11) involves using a first-order approximation of the logarithm function in (10) and is termed the Log NI1 estimate.By considering a second-order approximation of the logarithm in ( 10), Beh and Farver [11] proposed another estimate to improve upon that of ( 11) which is referred to as the Log NI2 estimate.The comprehensive computation study of Beh and Farver [11] demonstrated the reliability and stability of the Log NI estimate (10) in estimating the parameter φ directly.They also showed that for small φ values, or for small contingency tables, the Log NI estimate (10), Log NI1 estimate (11), and Log NI2 estimate (12) provide very good estimates when compared with the MLE calculated using Newton's method.The computational study given in the following section calculates the standard error of BDNI estimate (9), Log NI estimate (10), Log NI1 estimate (11), and Log NI2 estimate ( 12) using (8).

A Computational Study
Two key issues concerning the estimation of φ in (2) are considered in this section: (1) the stability of using the Log NI estimate (10) and Newton's method to estimate φ, (2) the impact of the sample size in estimating the parameter using these two methods.
The study of these two issues was made by adopting the simple procedure for randomly generating a contingency table of given dimension, sample size, n, and φ value described in Beh and Farver [11].The results reported in the following sections are based on randomly generating 1000 contingency tables using the procedure outlined in the appendix, with a fixed and predefined dimension and φ.For the first issue, samples sizes of n ≈ 1000 were considered, and varied from 10 to 10000 for investigations of the second issue.[11] demonstrated computationally the issues concerning the accuracy of Newton's method and four noniterative methods (the BDNI estimate (9), the Log NI estimate (10), the Log NI1 estimate (11), and the Log NI2 estimate ( 12)) for estimating φ.Understanding how stable and reliable these methods are in calculating φ is a very important aspect of such a study.In the present study, three different types of contingency tables were considered to represent the large variation of dimension and φ commonly encountered (see, e.g., Beh and Farver [9] who considered the estimation of φ for fourteen commonly analysed contingency tables found in the statistical literature).One thousand (1000) randomly generated 4 × 5 contingency tables with φ = 0.5, 5×5 tables with φ = 0.3 and 3 × 3 tables with φ = 0.8 were considered.In practice, φ may often be smaller (say φ < 0.3), and, as shown in Beh and Farver [11], the noniterative estimation methods compare very well with Newton's method of calculating the MLE of φ.Therefore, the stability of the noniterative methods for larger φ values was studied.To avoid any problems involved in using the logarithm function for zero-cell values, cells from the randomly generated contingency tables with a zero frequency are replaced with 0.05, a value which is negligibly small in comparison with the sample size of approximately 1000.Agresti [12, page 397] considers how one may treat zero-cell counts in a contingency table .3.1.1.1000 4 × 5 Tables-φ = 0.5.One thousand (1000) 4 × 5 contingency tables were generated with φ = 0.5.Figure 1 shows how stable and accurate each of the five methods were for estimating φ from (2).

Stability of the Estimates. Beh and Farver
The five plots on the first line of Figure 1 summarise the estimate of φ calculated (to four decimal places) using Newton's method, the BDNI estimate (9), the Log NI estimate (10), the Log NI1 estimate (11), and the Log NI2 estimate (12), respectively.These graphs clearly show that Newton's method produces extremely reliable and consistent estimates of the parameter.The iterative technique produced a mean estimate of 0.500182 with a mean standard error (calculated using expression (8)) of 0.01671.The method took, on average, 44 iterations to converge to four decimal places ranging from 20 to 82 iterations (median is 43).Of the four noniterative methods, method (10) provided accurate and reliable estimates of φ.This method produced stable estimates with mean of 0.50298 (mean SE of 0.0122) which is favorably comparable with the MLE.However, while the BDNI estimate (9), Log NI1 estimate (11), and Log NI2 estimate (12) produced fairly consistent estimates, they underestimated the value of φ.This is not surprising since these three estimation procedures require a first or second order approximation of the logarithm function present in the Log NI estimate (10).However, not surprisingly, the second-order logarithm approximation (whose graphical summary appears as the fifth plot of the first line of Figure 1) produced better estimates than either of the two firstorder logarithm approximation methods.The noniterative methods of the BDNI estimate (9), the Log NI1 estimate (11), and the Log NI2 estimate (12) produced mean estimates (and standard errors) of 0.35573 (0.01424), 0.40950 (0.01352), and 0.46108 (0.01279), respectively.Of the noniterative estimates, the mean standard error of the estimates using method (9) were lower than those estimates using an approximation of the logarithm function.One may also note that, as part of the analysis, the mean maximum standard error of the tables, calculated using SE(φ) = [σ I σ J √ n] −1 , was 0.01729.
The second row of Figure 1 summarises the (Wald test) P-value obtained by comparing each of the estimated values of φ with 0.5.They clearly show that the P values for the MLE of φ calculated using Newton's method were all very close to 1, indicating that there is very strong evidence that the MLE is statistically consistent with the true parameter value.Of the noniterative techniques, the estimates calculated using the Log NI estimate (10) had P values consistently above 0.5, with many of them being well above 0.8 indicating that, of all the 1000 tables generated, all the estimates were statistically indistinguishable with φ = 0.5.Of the three remaining noniterative methods, the original estimation procedure of Beh and Davy [2] (the BDNI estimate ( 9)) performed the worst and calculated P values less than 0.05.
Again, Newton's method of calculating the MLE of φ was more stable than any of the noniterative techniques.The mean MLE from the 1000 tables randomly generated was an impressive 0.30001 (0.01455) with the algorithm requiring between 14 and 45 iterations to converge to four decimal places (the median number of iterations observed was 23).However, the Log NI estimate (10) also proved to be an extremely reliable and consistent noniterative estimation procedure and achieved estimates whose P values were consistently well above 0.8.Here, the mean estimate using the Log NI estimate (10) is 0.30010 (0.01198) compared with 0.29083 (0.01214) for method (12), followed by a mean of estimate of 0.26797 (0.01249) for the Log NI1 estimate (11) and a relatively poor mean estimate of 0.24209 (0.01286) for the BDNI estimate (9).For the tables that were randomly generated, the mean maximum standard error of φ was 0.01498.
The P values of the noniterative methods, as summarised in Figure 2, attest to the accurate and consistent estimation of φ using the Log NI estimate (10), while the Log NI2 estimate (12) produced estimates that were nearly as reliable, although consistently underestimating (due to the approximation of the logarithm function used in the Log NI estimate (10)) φ.The BDNI estimate (9) and Log NI1 estimate (11) proved to be the least reliable due to both involving first order approximations of the natural logarithm function in the Log NI estimate (10), however their Wald P-values were all well above the nominal 0.05 level of significance, with all being at least 0.3.
Figure 3 shows that for the 1000 randomly generated contingency tables method (10) estimated φ with great accuracy and reliability, just as Newton's method did.The general level of accuracy of the Log NI estimate (10) is evident by noting that the mean of the estimates was 0.80164 (0.04765) with a mean maximum standard error of 0.05531, while the mean MLE was 0.80057 (0.05421).To calculate the MLE with this level of precision required a median of 34 iterations to converge to four decimal places, with a range of 20 to 155 iterations.The Log NI2 estimate (12) again produced relatively stable and accurate results, with a mean estimate of 0.78785 (0.04789) while the BDNI estimate (9) provided the worst estimates with a mean estimate of 0.68781 and a standard error of 0.04930.

Impact of Sample Size on the Estimate of φ.
The results in Section 3.1 clearly show that using method (10) provides as accurate and reliable means of estimating φ as those obtained using the more commonly adopted MLE procedures (Newton's unidimensional method here).However, for all of the randomly generated contingency tables, the sample size was approximately 1000.In this section, we examine the accuracy and reliability of the estimates obtained using Newton's method compared with the Log NI approach, method (10), for 1000 randomly generated contingency tables where the sample size varied from 10 to 100000.When calculating the MLE, the number of iterations needed for convergence stabilised.For example, for a 3 × 3 contingency table where φ = 0.3, one would expect Newton's method to estimate the MLE of the parameter (to an accuracy of four decimal places) at approximately 34 iterations, while for a 4 × 5 table 43 iterations would be expected.As expected the number of iterations required to ensure convergence depends on the magnitude of φ and the size of the contingency table being analysed.
As part of this study, three different sized contingency tables were considered: 3 × 3 contingency tables where φ = 0.8, 4 × 5 contingency tables where φ = 0.5, and 5 × 5 contingency tables where φ = 0.3.Table 1 summarises the mean estimated value of φ using Newton's method and the Log NI estimate (10) from 1000 randomly generated contingency tables.Table 1 also includes, when using Newton's method to find the MLE of φ, the median number and the range (minimum and maximum) of iterations needed for convergence to occur to four decimal places.It shows that, for the three different types of contingency tables considered, as the sample size increased, so too did the accuracy of both estimates.
Suppose that we now consider the impact of the sample size when adopting Newton's method and the Log NI estimate (10) for estimating φ in the ordinal log-linear model of (2).To monitor the relative accuracy of the estimates using these methods compared to the specified value of φ used to randomly generate the contingency tables, the quantity %Δ in Table 1 describes the percentage difference in the estimate using each of the procedures compared with the specified value of φ such that %Δ = 100 • |1 − φ/φ|.It can be seen that, for small sample sizes, both estimation procedures produce estimates that are equally poor; although, for the 1000 random 3 × 3 tables with φ = 0.8, Newton's method did substantially better at estimating φ than the Log NI estimate (10).However, as the sample size increases, both estimation procedures become comparable, getting to within 1% of the specified value of φ for sample sizes exceeding 500.
Table 1 shows that the Log NI estimate produces very consistent estimates of the underlying parameter and, in some cases involving larger sample sizes, performs better than its iterative counterpart.In such cases, this provides some compelling evidence for considering the computationally simple estimation procedure of noniterative estimation, especially the Log NI estimate (10).

Discussion
This paper has examined, through simulation studies, the accuracy and reliability of estimation techniques for φ of the ordinal log-linear model (2).Traditionally, the MLE of this parameter is determined iteratively, either through variations of iterative proportional fitting, Newton's method, or by some other way.However Beh and Davy [2] and Beh and Farver [9,11] developed, between them, four noniterative estimation techniques, and this paper has computationally studied their precision and consistency.While Beh and Farver [9,11] established that the Log NI estimate (10) provides a very accurate estimate of the parameter when compared with the estimate obtained using Newton's method, the simulation study performed in Section 3 has demonstrated that the Log NI estimate (10) also produces an estimate that is consistent with the true parameter value.We have also shown that, while they may not produce estimates that are as accurate as the Log NI estimate, the BDNI estimate (9), Log NI1 estimate (11), and the Log NI2 estimate (12) are very reliable.A computational study of the impact of sample size on the estimates was also examined, and it was shown that the Log NI estimate provided very consistent results when compared with the estimate using Newton's unidimensional method.
With the advent of faster computing power and processing, performing an algorithm that cycles through hundreds or even thousands of iterations is no longer a major issue.So the question that one may ask is why should noniterative estimation be considered?Certainly for those well versed in undertaking computationally intensive studies, or familiar with the breadth of computing facilities that are now available to perform statistical tasks, estimating the parameters from models like (2) using iterative algorithms is a relatively simple task.However, not all analysts are as well versed in with such tools.Therefore, simple, and direct estimation procedures such as the four noniterative estimates studied in this paper provide one with a means of estimation that is computationally simple, accurate, and stable.There are other factors that can make noniterative approaches appealing, and these stem from problems inherent with using computationally intensive algorithms.While it may occur infrequently for models like (2) that problems with convergence will fail to produce an estimate, or that poor starting values will lead to poor estimates, they can still occur.Direct estimation procedures, therefore, provide the analyst with an alternative strategy for parameter estimation.
The aim of this paper is not to demonstrate the superiority of noniterative estimation methods over the more traditional MLE approaches that adopt algorithms to perform estimation.Instead the aim of this paper, and its predecessors, is to highlight that there are alternative strategies that can be considered.There are, however, still many issues yet to be resolved for the noniterative estimation procedures studied here and elsewhere.For example, this paper has not proposed a strategy for calculating noniterative estimates of multiple parameters in an ordinal log-linear model that have a nonzero covariance structure.This issue, and others like it, will be left for future consideration.

Figure 1 :
Figure 1: Plots showing the stability of Newton's method for obtaining the MLE of φ and the four noniterative procedures in 1000 simulations of a 4 × 5 contingency table where φ = 0.5.

Figure 2 :Figure 3 :
Figure 2: Plots showing the stability of Newton's method for obtaining the MLE of φ and the four noniterative procedures in 1000 simulations of a 5 × 5 contingency table where φ = 0.3.

Table 1 :
Mean ML and Log NI estimates of φ for three different sized contingency tables with varying various sample sizes [iterations] give the median iterations taken for Newton's method to converge to four decimal places and the range of iterations observed.