Linear mixed models with marginally symmetric nonparametric random effects

https://doi.org/10.1016/j.csda.2016.05.005Get rights and content

Abstract

Linear mixed models (LMMs) are used as an important tool in the data analysis of repeated measures and longitudinal studies. The most common form of LMMs utilizes a normal distribution to model the random effects. Such assumptions can often lead to misspecification errors when the random effects are not normal. One approach to remedy the misspecification errors is to utilize a point-mass distribution to model the random effects; this is known as the nonparametric maximum likelihood-fitted (NPML) model. The NPML model is flexible but requires a large number of parameters to characterize the random-effects distribution. It is often natural to assume that the random-effects distribution be at least marginally symmetric. The marginally symmetric NPML (MSNPML) random-effects model is introduced, which assumes a marginally symmetric point-mass distribution for the random effects. Under the symmetry assumption, the MSNPML model utilizes half the number of parameters to characterize the same number of point masses as the NPML model; thus the model confers an advantage in economy and parsimony. An EM-type algorithm is presented for the maximum likelihood (ML) estimation of LMMs with MSNPML random effects; the algorithm is shown to monotonically increase the log-likelihood and is proven to be convergent to a stationary point of the log-likelihood function in the case of convergence. Furthermore, it is shown that the ML estimator is consistent and asymptotically normal under certain conditions, and the estimation of quantities such as the random-effects covariance matrix and individual a posteriori expectations is demonstrated. A simulation study is used to illustrate the gains in efficiency of the MSNPML model over the NPML model under the assumption of symmetry. A pair of real data applications are then used to demonstrate the manner in which the MSNPML model can be used to draw useful statistical inference.

Introduction

Linear mixed models (LMMs) are used as a fundamental tool for the statistical analysis of longitudinal data and data with repeated measurements; see McCulloch and Searle (2001, Ch. 6), Pinheiro and Bates (2000, Ch. 1), and Verbeke and Molenberghs (2000) for introductions on the topic. In the style of Laird and Ware (1982), an LMM can be characterized as follows.

Let Yj=(Yj1,,Yjnj)T be a vector of nj responses belonging to individual j, for j=1,,n, where n is the total number of individuals. Further, let each measurement Yjk, for k=1,,nj, be dependent upon covariate vectors xjkRp and zjkRq, and for each j, let BjRq be a vector of individual random effects arising from the a priori probability distribution FB(b) with density fB(b). We say that the data arises from an LMM if for each j and k, Yjk|(Bj=bj)=xjkTβ+zjkTbj+Ejk, where βRp is a vector of fixed effects and Ejk is a random error with probability density fE(e). Here, a superscript T indicates matrix transposition.

The main difficulty that arises in the use of LMMs is the evaluation and manipulation of the marginal density of yj, which has the general form fYj(yj)=Rq[k=1nkfE(yjkxjkTβzjkTbj)]fB(bj)dbj, and can often be quite complex due to the integration form.

The traditional approach in dealing with the complexities of (2) is to set the error density as fE(e)=ϕ(e;0,σ2), where ϕ(e;μ,σ2)=12πσ2exp[(eμ)22σ2] is the normal density function with mean μ and variance σ2, and to let fB(b) be a multivariate normal density. This approach results in fYj(yj) having the form of a multivariate normal density function, and allows for simple inference by maximum likelihood (ML) estimation via an expectation–maximization (EM) algorithm; see Laird and Ware (1982, Sec. 4) and McLachlan and Krishnan (2008, Sec. 5.9) for details.

It is well known that the estimation of the fixed effects β is robust to the specification of the random-effects distribution. However, this robustness does not extend to the characterization of the random effects in the case of misspecification. The robustness as well as the effects of misspecification are explored in Agresti et al. (2004), Butler and Louis (1992), and McCulloch and Neuhaus (2011). For instance, in all three articles, the authors note that the estimates for the fixed effects tended not to be influenced greatly by the choice of the random-effects model. However, Agresti et al. (2004) note that the usual normal model can be highly inefficient when the true random-effects model is polarizing, such as in the case of binary or mixture random-effects distributions. It is further remarked in McCulloch and Neuhaus (2011) that the estimation of random intercept coefficients can be biased by making incorrect assumptions regarding the shape of the underlying random-effects distribution.

Multiple strategies have been considered for remedying random-effects misspecification. For example, Pinheiro et al. (2001) and Song et al. (2007) considered t distributed random-effects and noise models, whereas Arellano-Valle et al. (2005), Lachos et al. (2010), and Ho and Lin (2010) considered the use of skew normal and t distributed random and noise models. Although a rich class, the use of such distributions often do not allow for simplification of the marginal density (2) and do not allow for enough flexibility to model random-effects distributions with multiple modes or deviations from bell-shaped curves.

Based on the nonparametric maximum likelihood (NPML) principle of Laird (1978), Aitkin (1999) and Butler and Louis (1992) suggested using the point-mass density fB(b)=i=1gπiδ(bλi) for the random effects, where δ(x) is the Dirac delta function, g1 is the number of point masses, and λiRq and πk>0 are the point-mass locations and weights, for i=1,..,g, respectively. To ensure that the total probability of the point masses adds up to unity and that the mean of fB(b) is zero, we also require the restrictions that i=1gπi=1 (which implies that πg=1i=1g1πi) and i=1gπiλi=0, where 0 is a zero vector of appropriate size. We shall refer to densities of the form (4) as NPML-fitted (NPML) densities.

In Agresti et al. (2004) it was shown that NPML densities offered improvements in efficiency for estimating the characteristics of the random-effects density such as the covariance and individual a posteriori estimates of the random effects (i.e. E(Bj|Yj=yj)) when compared to the use of parametric random-effects densities in situations, where the true random-effect densities deviated from the assumed parametric form. However, this improvement comes at a cost of modeling complexity, since for any given g and q, the number of parameters required for the specification and estimation of the NPML density is (g1)(q+1). This number can grow quickly if g or q are large.

We note that although unimodality or bell shape cannot be assumed in general, it may still be acceptable to assume symmetry in the random-effects distribution. Thus, we introduce the marginally symmetric NPML (MSMPML) random-effects density fB(b)=i=1gπi2[δ(bλi)+δ(b+λi)], where g and πi are as specified for (4) and no restrictions are made on λi as density (5) has zero mean by definition. For any g, density (5) comprises 2g point masses at λi and λi, each with weights πi/2. Therefore, in situations where marginal symmetry can be assumed, the MSNPML model doubles the number of point masses when compared to the MPML density, for the same number of parameters.

Aside from the aforementioned articles, our work shares commonalities with the mixture random-effects densities of Verbeke and Lesaffre (1996) and, by extension, the mixture of mixed effects models of Celeux et al. (2005) and Ng et al. (2006); in the aforementioned articles, mixtures of normal densities are used to model the random effects, rather than point-mass densities. When nj=N for all j, the MSNPML model corresponds to a symmetry-restricted and homoscedastic version of the repeated-measures linear mixture regression model of Grun and Leisch (2008). Furthermore, the use of parameter constraints to enforce marginal symmetry is similar to those used in Benaglia et al. (2009) and Chauveau and Hunter (2013). A kernel density approach to mixture modeling with symmetric components is presented in Chee and Wang (2013).

In this article, we shall consider the estimation of LMMs with the MSNPML random-effect density under the assumption of normal random error, using ML estimation via a multicycle expectation–conditional maximization (ECM) algorithm (Meng and Rubin, 1993). This algorithm is known to monotonically increase the log-likelihood and to lead to a stationary point of the said function in the case of convergence. Furthermore, we demonstrate how the covariance matrix and a posteriori expectations of the random effects can be estimated under model (5), and also consider various statistical matters such as consistency and the selection of the number of point masses. We then use numerical simulations to study the efficiency gains due to MSMPML over MPML, in the style of Agresti et al. (2004). A pair of example applications chosen from Pinheiro and Bates (2000) are subsequently used to demonstrate the inferential process of using MSMPML random effects.

The article shall proceed as follows. In Section  2, we shall discuss matters regarding ML estimation. In Section  3, statistical matters and computation of statistics are discussed. Results from simulations are then presented in Section  4, followed by example applications in Section  5. Conclusions are then drawn in Section  6.

Section snippets

Maximum likelihood estimation

Let Yj be characterized by (1) and let fE(e) and fB(b) have forms (3), (5), respectively. Under this model, the individual marginal density (2) is given by fYj(yj)=i=1gπi2[k=1nϕ(yjk;xjkTβ+zjkTλi,σ2)+k=1nϕ(yjk;xjkTβzjkTλi,σ2)] and subsequently, the log-likelihood function has the form ln(θ;y1,,yn)=j=1nlogfYj(yj), where θ=(βT,λT,πT,σ2)T is the vector of all model parameters. Here λ=(λ1T,,λgT)T, and π=(π1,,πg1)T. For brevity, we shall suppress the data dependence in ln(θ;y1,,yn) and

Consistency and asymptotic normality

Establishing consistency and asymptotic normality are fundamental to the process of drawing accurate inferences with the parameter estimate. In the case where all nj=N for some constant N2 (i.e. the balanced experiment setting, in the language of LMMs) we can observe that the density function (6) is simply that of a 2g component N-dimensional mixture of multivariate normal distributions. Thus, in this setting, consistency and asymptotic normality can be established via conventional means for

Numerical simulations

We now perform a set of numerical simulation studies to assess the performance of MSNPML versus the NPML random-effects model, when the random-effects density is marginally symmetric. Our simulations follows in the style of Agresti et al. (2004) and are conducted as follows.

All computation and random data generation in Sections  4 Numerical simulations, 5 Example applications are conducted within the R computing environment (R Core Team, 2013). Data generation of exponential, normal, and

Example applications

In this section, we demonstrate the use of the MSNPML LMM on a pair of data sets from the nlme R package (Pinheiro et al., 2014). The data sets that we use as examples are the Female subset of the Orthodont data set (Pinheiro and Bates, 2000, App. A.17) and the Oxboy data set (Pinheiro and Bates, 2000, App. A.19). The analyses are as follows.

Conclusions

In this article, we have introduced the MSNPML LMM for modeling random effects under the assumption of marginal symmetry. Under such conditions, the MSNPML model utilizes half the number of parameters to model the same number of point masses as the NPML model.

An ECM algorithm is presented for the ML estimation of the parameters of the MSNPML LMM. The algorithm is shown to exhibit monotonicity in the sequence of iterate likelihoods, as well as convergent to a stationary point of the

Acknowledgments

We thank the Associate Editor and the two Referees for their many helpful comments that have made this a much more thorough manuscript.

References (52)

  • D.D. Boos et al.

    Essential Statistical Inference: Theory and Methods

    (2013)
  • S.M. Butler et al.

    Random effects models with non-parametric priors

    Stat. Med.

    (1992)
  • Carnell, R., 2013. Triangle: provides the standard distribution functions for the triangle. URL:...
  • G. Celeux et al.

    Mixture of linear mixed models for clustering gene expression profiles from repeated microarray experiments

    Stat. Model.

    (2005)
  • D. Chauveau et al.

    ECM and MM algorithms for normal mixtures with constrained parameters. Tech. rep

    (2013)
  • C.-S. Chee et al.

    Estimation of finite mixtures with symmetric components

    Stat. Comput.

    (2013)
  • N. Depraetere et al.

    Order selection in finite mixtures of linear regressions: literature review and a simulation study

    Statist. Papers

    (2014)
  • D. Eddelbuettel

    Seamless R and C++ Integration with Rcpp

    (2013)
  • B. Efron et al.

    An Introduction to the Bootstrap

    (1993)
  • Gilbert, P., Varadhan, R., 2012. numDeriv: Accurate numerical derivatives. URL:...
  • B. Grun et al.

    Flexmix version 2: finite mixtures with concomitant variables and varying and constant parameters

    J. Stat. Softw.

    (2008)
  • H.J. Ho et al.

    Robust linear mixed models using the skew t distribution with application to schitzophrenia data

    Biom. J.

    (2010)
  • C. Keribin

    Consistent estimation of the order of mixture models

    Sankhyā Ser. A

    (2000)
  • S. Kotz et al.

    Beyond Beta: Other Continuous Families of Distributions with Bounded Support and Applications

    (2004)
  • V.H. Lachos et al.

    Likelihood based inference for skew-normal independent linear mixed models

    Statist. Sinica

    (2010)
  • N. Laird

    Nonparametric maximum likelihood estimation of a mixing distribution

    J. Amer. Statist. Assoc.

    (1978)
  • Cited by (2)

    • SURE: A method for decision-making under uncertainty

      2019, Expert Systems with Applications
      Citation Excerpt :

      There are different simulation methods based on triangular distributions which have been compared on factors such as speed, algorithmic code length, applicability and simplicity (Stein & Keblis, 2009b). Josselin and Maux (2017) and Nguyen and McLachlan (2016) both suggest the use of the rtriangle function in the triangle package for R (Carnell, 2017) for the generation of triangular random variates. Initially, this function was used to generate 500,000 decision tables using the data in Table 2.

    View full text