Linear mixed models with marginally symmetric nonparametric random effects

doi:10.1016/j.csda.2016.05.005

Computational Statistics & Data Analysis

Volume 103, November 2016, Pages 151-169

https://doi.org/10.1016/j.csda.2016.05.005 Get rights and content

Abstract

Linear mixed models (LMMs) are used as an important tool in the data analysis of repeated measures and longitudinal studies. The most common form of LMMs utilizes a normal distribution to model the random effects. Such assumptions can often lead to misspecification errors when the random effects are not normal. One approach to remedy the misspecification errors is to utilize a point-mass distribution to model the random effects; this is known as the nonparametric maximum likelihood-fitted (NPML) model. The NPML model is flexible but requires a large number of parameters to characterize the random-effects distribution. It is often natural to assume that the random-effects distribution be at least marginally symmetric. The marginally symmetric NPML (MSNPML) random-effects model is introduced, which assumes a marginally symmetric point-mass distribution for the random effects. Under the symmetry assumption, the MSNPML model utilizes half the number of parameters to characterize the same number of point masses as the NPML model; thus the model confers an advantage in economy and parsimony. An EM-type algorithm is presented for the maximum likelihood (ML) estimation of LMMs with MSNPML random effects; the algorithm is shown to monotonically increase the log-likelihood and is proven to be convergent to a stationary point of the log-likelihood function in the case of convergence. Furthermore, it is shown that the ML estimator is consistent and asymptotically normal under certain conditions, and the estimation of quantities such as the random-effects covariance matrix and individual a posteriori expectations is demonstrated. A simulation study is used to illustrate the gains in efficiency of the MSNPML model over the NPML model under the assumption of symmetry. A pair of real data applications are then used to demonstrate the manner in which the MSNPML model can be used to draw useful statistical inference.

Introduction

Linear mixed models (LMMs) are used as a fundamental tool for the statistical analysis of longitudinal data and data with repeated measurements; see McCulloch and Searle (2001, Ch. 6), Pinheiro and Bates (2000, Ch. 1), and Verbeke and Molenberghs (2000) for introductions on the topic. In the style of Laird and Ware (1982), an LMM can be characterized as follows.

Let $Y_{j} = {(Y_{j 1}, \dots, Y_{j n_{j}})}^{T}$ be a vector of $n_{j}$ responses belonging to individual $j$ , for $j = 1, \dots, n$ , where $n$ is the total number of individuals. Further, let each measurement $Y_{j k}$ , for $k = 1, \dots, n_{j}$ , be dependent upon covariate vectors $x_{j k} \in R^{p}$ and $z_{j k} \in R^{q}$ , and for each $j$ , let $B_{j} \in R^{q}$ be a vector of individual random effects arising from the a priori probability distribution $F_{B} (b)$ with density $f_{B} (b)$ . We say that the data arises from an LMM if for each $j$ and $k$ , $Y_{j k} | (B_{j} = b_{j}) = x_{j k}^{T} β + z_{j k}^{T} b_{j} + E_{j k},$ where $β \in R^{p}$ is a vector of fixed effects and $E_{j k}$ is a random error with probability density $f_{E} (e)$ . Here, a superscript $T$ indicates matrix transposition.

The main difficulty that arises in the use of LMMs is the evaluation and manipulation of the marginal density of $y_{j}$ , which has the general form $f_{Y_{j}} (y_{j}) = \int_{R^{q}} [\prod_{k = 1}^{n_{k}} f_{E} (y_{j k} - x_{j k}^{T} β - z_{j k}^{T} b_{j})] f_{B} (b_{j}) d b_{j},$ and can often be quite complex due to the integration form.

The traditional approach in dealing with the complexities of (2) is to set the error density as $f_{E} (e) = ϕ (e; 0, σ^{2}),$ where $ϕ (e; μ, σ^{2}) = \frac{1}{\sqrt{2 π σ^{2}}} exp [- \frac{{(e - μ)}^{2}}{2 σ^{2}}]$ is the normal density function with mean $μ$ and variance $σ^{2}$ , and to let $f_{B} (b)$ be a multivariate normal density. This approach results in $f_{Y_{j}} (y_{j})$ having the form of a multivariate normal density function, and allows for simple inference by maximum likelihood (ML) estimation via an expectation–maximization (EM) algorithm; see Laird and Ware (1982, Sec. 4) and McLachlan and Krishnan (2008, Sec. 5.9) for details.

It is well known that the estimation of the fixed effects $β$ is robust to the specification of the random-effects distribution. However, this robustness does not extend to the characterization of the random effects in the case of misspecification. The robustness as well as the effects of misspecification are explored in Agresti et al. (2004), Butler and Louis (1992), and McCulloch and Neuhaus (2011). For instance, in all three articles, the authors note that the estimates for the fixed effects tended not to be influenced greatly by the choice of the random-effects model. However, Agresti et al. (2004) note that the usual normal model can be highly inefficient when the true random-effects model is polarizing, such as in the case of binary or mixture random-effects distributions. It is further remarked in McCulloch and Neuhaus (2011) that the estimation of random intercept coefficients can be biased by making incorrect assumptions regarding the shape of the underlying random-effects distribution.

Multiple strategies have been considered for remedying random-effects misspecification. For example, Pinheiro et al. (2001) and Song et al. (2007) considered $t$ distributed random-effects and noise models, whereas Arellano-Valle et al. (2005), Lachos et al. (2010), and Ho and Lin (2010) considered the use of skew normal and $t$ distributed random and noise models. Although a rich class, the use of such distributions often do not allow for simplification of the marginal density (2) and do not allow for enough flexibility to model random-effects distributions with multiple modes or deviations from bell-shaped curves.

Based on the nonparametric maximum likelihood (NPML) principle of Laird (1978), Aitkin (1999) and Butler and Louis (1992) suggested using the point-mass density $f_{B} (b) = \sum_{i = 1}^{g} π_{i} δ (b - λ_{i})$ for the random effects, where $δ (x)$ is the Dirac delta function, $g \geq 1$ is the number of point masses, and $λ_{i} \in R^{q}$ and $π_{k} > 0$ are the point-mass locations and weights, for $i = 1, . ., g$ , respectively. To ensure that the total probability of the point masses adds up to unity and that the mean of $f_{B} (b)$ is zero, we also require the restrictions that $\sum_{i = 1}^{g} π_{i} = 1$ (which implies that $π_{g} = 1 - \sum_{i = 1}^{g - 1} π_{i}$ ) and $\sum_{i = 1}^{g} π_{i} λ_{i} = 0$ , where $0$ is a zero vector of appropriate size. We shall refer to densities of the form (4) as NPML-fitted (NPML) densities.

In Agresti et al. (2004) it was shown that NPML densities offered improvements in efficiency for estimating the characteristics of the random-effects density such as the covariance and individual a posteriori estimates of the random effects (i.e. $E (B_{j} | Y_{j} = y_{j})$ ) when compared to the use of parametric random-effects densities in situations, where the true random-effect densities deviated from the assumed parametric form. However, this improvement comes at a cost of modeling complexity, since for any given $g$ and $q$ , the number of parameters required for the specification and estimation of the NPML density is $(g - 1) (q + 1)$ . This number can grow quickly if $g$ or $q$ are large.

We note that although unimodality or bell shape cannot be assumed in general, it may still be acceptable to assume symmetry in the random-effects distribution. Thus, we introduce the marginally symmetric NPML (MSMPML) random-effects density $f_{B} (b) = \sum_{i = 1}^{g} \frac{π_{i}}{2} [δ (b - λ_{i}) + δ (b + λ_{i})],$ where $g$ and $π_{i}$ are as specified for (4) and no restrictions are made on $λ_{i}$ as density (5) has zero mean by definition. For any $g$ , density (5) comprises $2 g$ point masses at $λ_{i}$ and $- λ_{i}$ , each with weights $π_{i} / 2$ . Therefore, in situations where marginal symmetry can be assumed, the MSNPML model doubles the number of point masses when compared to the MPML density, for the same number of parameters.

Aside from the aforementioned articles, our work shares commonalities with the mixture random-effects densities of Verbeke and Lesaffre (1996) and, by extension, the mixture of mixed effects models of Celeux et al. (2005) and Ng et al. (2006); in the aforementioned articles, mixtures of normal densities are used to model the random effects, rather than point-mass densities. When $n_{j} = N$ for all $j$ , the MSNPML model corresponds to a symmetry-restricted and homoscedastic version of the repeated-measures linear mixture regression model of Grun and Leisch (2008). Furthermore, the use of parameter constraints to enforce marginal symmetry is similar to those used in Benaglia et al. (2009) and Chauveau and Hunter (2013). A kernel density approach to mixture modeling with symmetric components is presented in Chee and Wang (2013).

In this article, we shall consider the estimation of LMMs with the MSNPML random-effect density under the assumption of normal random error, using ML estimation via a multicycle expectation–conditional maximization (ECM) algorithm (Meng and Rubin, 1993). This algorithm is known to monotonically increase the log-likelihood and to lead to a stationary point of the said function in the case of convergence. Furthermore, we demonstrate how the covariance matrix and a posteriori expectations of the random effects can be estimated under model (5), and also consider various statistical matters such as consistency and the selection of the number of point masses. We then use numerical simulations to study the efficiency gains due to MSMPML over MPML, in the style of Agresti et al. (2004). A pair of example applications chosen from Pinheiro and Bates (2000) are subsequently used to demonstrate the inferential process of using MSMPML random effects.

The article shall proceed as follows. In Section 2, we shall discuss matters regarding ML estimation. In Section 3, statistical matters and computation of statistics are discussed. Results from simulations are then presented in Section 4, followed by example applications in Section 5. Conclusions are then drawn in Section 6.

Section snippets

Maximum likelihood estimation

Let $Y_{j}$ be characterized by (1) and let $f_{E} (e)$ and $f_{B} (b)$ have forms (3), (5), respectively. Under this model, the individual marginal density (2) is given by $f_{Y_{j}} (y_{j}) = \sum_{i = 1}^{g} \frac{π_{i}}{2} [\prod_{k = 1}^{n} ϕ (y_{j k}; x_{j k}^{T} β + z_{j k}^{T} λ_{i}, σ^{2}) + \prod_{k = 1}^{n} ϕ (y_{j k}; x_{j k}^{T} β - z_{j k}^{T} λ_{i}, σ^{2})]$ and subsequently, the log-likelihood function has the form $l_{n} (θ; y_{1}, \dots, y_{n}) = \sum_{j = 1}^{n} log f_{Y_{j}} (y_{j}),$ where $θ = {(β^{T}, λ^{T}, π^{T}, σ^{2})}^{T}$ is the vector of all model parameters. Here $λ = {(λ_{1}^{T}, \dots, λ_{g}^{T})}^{T}$ , and $π = {(π_{1}, \dots, π_{g - 1})}^{T}$ . For brevity, we shall suppress the data dependence in $l_{n} (θ; y_{1}, \dots, y_{n})$ and

Consistency and asymptotic normality

Establishing consistency and asymptotic normality are fundamental to the process of drawing accurate inferences with the parameter estimate. In the case where all $n_{j} = N$ for some constant $N \geq 2$ (i.e. the balanced experiment setting, in the language of LMMs) we can observe that the density function (6) is simply that of a $2 g$ component $N$ -dimensional mixture of multivariate normal distributions. Thus, in this setting, consistency and asymptotic normality can be established via conventional means for

Numerical simulations

We now perform a set of numerical simulation studies to assess the performance of MSNPML versus the NPML random-effects model, when the random-effects density is marginally symmetric. Our simulations follows in the style of Agresti et al. (2004) and are conducted as follows.

All computation and random data generation in Sections 4 Numerical simulations, 5 Example applications are conducted within the $R$ computing environment (R Core Team, 2013). Data generation of exponential, normal, and

Example applications

In this section, we demonstrate the use of the MSNPML LMM on a pair of data sets from the nlme R package (Pinheiro et al., 2014). The data sets that we use as examples are the Female subset of the Orthodont data set (Pinheiro and Bates, 2000, App. A.17) and the Oxboy data set (Pinheiro and Bates, 2000, App. A.19). The analyses are as follows.

Conclusions

In this article, we have introduced the MSNPML LMM for modeling random effects under the assumption of marginal symmetry. Under such conditions, the MSNPML model utilizes half the number of parameters to model the same number of point masses as the NPML model.

An ECM algorithm is presented for the ML estimation of the parameters of the MSNPML LMM. The algorithm is shown to exhibit monotonicity in the sequence of iterate likelihoods, as well as convergent to a stationary point of the

Acknowledgments

We thank the Associate Editor and the two Referees for their many helpful comments that have made this a much more thorough manuscript.

References (52)

A. Agresti et al.
Examples in which misspecification of a random effects distribution reduces efficiency, and possible remedies
Comput. Statist. Data Anal.
(2004)
N. Atienza et al.
On the consistency of MLE in finite mixture models of exponential families
J. Statist. Plann. Inference
(2007)
G. Galimberti et al.
A multivariate linear regression analysis using finite mixtures of t distributions
Comput. Statist. Data Anal.
(2014)
B. Grun et al.
Fitting finite mixtures of generalized linear regressions in R
Comput. Statist. Data Anal.
(2007)
H.D. Nguyen et al.
Laplace mixture of linear experts
Comput. Statist. Data Anal.
(2016)
M. Aitkin
A general maximum likelihood analysis of variance components in generalized linear models
Biometrics
(1999)
H. Akaike
A new look at the statistical model identification
IEEE Trans. Automat. Control
(1974)
T. Amemiya
Advanced Econometrics
(1985)
R.B. Arellano-Valle et al.
Skew-normal linear mixed models
J. Data Sci.
(2005)
T. Benaglia et al.
mixtools: An R package for analyzing finite mixture models
J. Stat. Softw.
(2009)

D.D. Boos et al.

Essential Statistical Inference: Theory and Methods

(2013)

S.M. Butler et al.

Random effects models with non-parametric priors

Stat. Med.

(1992)

Carnell, R., 2013. Triangle: provides the standard distribution functions for the triangle. URL:...

G. Celeux et al.

Mixture of linear mixed models for clustering gene expression profiles from repeated microarray experiments

Stat. Model.

(2005)

D. Chauveau et al.

ECM and MM algorithms for normal mixtures with constrained parameters. Tech. rep

(2013)

C.-S. Chee et al.

Estimation of finite mixtures with symmetric components

Stat. Comput.

(2013)

N. Depraetere et al.

Order selection in finite mixtures of linear regressions: literature review and a simulation study

Statist. Papers

(2014)

D. Eddelbuettel

Seamless R and C++ Integration with Rcpp

(2013)

B. Efron et al.

An Introduction to the Bootstrap

(1993)

Gilbert, P., Varadhan, R., 2012. numDeriv: Accurate numerical derivatives. URL:...

B. Grun et al.

Flexmix version 2: finite mixtures with concomitant variables and varying and constant parameters

J. Stat. Softw.

(2008)

H.J. Ho et al.

Robust linear mixed models using the skew t distribution with application to schitzophrenia data

Biom. J.

(2010)

C. Keribin

Consistent estimation of the order of mixture models

Sankhyā Ser. A

(2000)

S. Kotz et al.

Beyond Beta: Other Continuous Families of Distributions with Bounded Support and Applications

(2004)

V.H. Lachos et al.

Likelihood based inference for skew-normal independent linear mixed models

Statist. Sinica

(2010)

N. Laird

Nonparametric maximum likelihood estimation of a mixing distribution

J. Amer. Statist. Assoc.

(1978)

Cited by (2)

SURE: A method for decision-making under uncertainty
2019, Expert Systems with Applications
Citation Excerpt :
There are different simulation methods based on triangular distributions which have been compared on factors such as speed, algorithmic code length, applicability and simplicity (Stein & Keblis, 2009b). Josselin and Maux (2017) and Nguyen and McLachlan (2016) both suggest the use of the rtriangle function in the triangle package for R (Carnell, 2017) for the generation of triangular random variates. Initially, this function was used to generate 500,000 decision tables using the data in Table 2.
Managerial decision-making often involves the consideration of multiple criteria with high levels of uncertainty. Multi-attribute utility theory, a primary method proposed for decision-making under uncertainty, has been repeatedly shown to be difficult to use in practice. This paper presents a novel approach termed Simulated Uncertainty Range Evaluations (SURE) to aid decision makers in the presence of high levels of uncertainty. SURE has evolved from an existing method that has been applied extensively in the pharmaceutical and speciality chemical sectors involving uncertain decisions in whole process design. The new method utilises simulations based upon triangular distributions to create a plot which visualises the preferences and overlapping uncertainties of decision alternatives. It facilitates decision-makers to visualise the not-so-obvious uncertainties of decision alternatives. In a real-world case study for a large pharmaceutical company, SURE was compared to other widely-used methods for decision-making and was the only method that correctly identified the alternative eventually chosen by the company. The case study demonstrates that SURE can perform better than other existing methods for decision-making involving multiple criteria and uncertainty.
THE INFLUENCE OF MANAGEMENT CREATIVITY ON THE OPTIMALITY OF MANAGEMENT DECISIONS OVER TIME: AN INNOVATIVE ASPECT
2023, Journal of Eastern European and Central Asian Research

View full text

Linear mixed models with marginally symmetric nonparametric random effects

Abstract

Introduction

Section snippets

Maximum likelihood estimation

Consistency and asymptotic normality

Numerical simulations

Example applications

Conclusions

Acknowledgments

Comput. Statist. Data Anal.

J. Statist. Plann. Inference

Comput. Statist. Data Anal.

Comput. Statist. Data Anal.

Comput. Statist. Data Anal.

A general maximum likelihood analysis of variance components in generalized linear models

Biometrics

A new look at the statistical model identification

IEEE Trans. Automat. Control

Advanced Econometrics

Skew-normal linear mixed models

J. Data Sci.

mixtools: An R package for analyzing finite mixture models

J. Stat. Softw.

Essential Statistical Inference: Theory and Methods

Random effects models with non-parametric priors

Stat. Med.

Mixture of linear mixed models for clustering gene expression profiles from repeated microarray experiments

Stat. Model.

ECM and MM algorithms for normal mixtures with constrained parameters. Tech. rep

Estimation of finite mixtures with symmetric components

Stat. Comput.

Order selection in finite mixtures of linear regressions: literature review and a simulation study

Statist. Papers

Seamless R and C++ Integration with Rcpp

An Introduction to the Bootstrap

Flexmix version 2: finite mixtures with concomitant variables and varying and constant parameters

J. Stat. Softw.

Robust linear mixed models using the skew t distribution with application to schitzophrenia data

Biom. J.

Consistent estimation of the order of mixture models

Sankhyā Ser. A

Beyond Beta: Other Continuous Families of Distributions with Bounded Support and Applications

Likelihood based inference for skew-normal independent linear mixed models

Statist. Sinica

Nonparametric maximum likelihood estimation of a mixing distribution

J. Amer. Statist. Assoc.