A Comparison of Confidence Interval Methods of Fixed Effect in Nested Error Regression Model

Linear mixed-effects models are very popular and powerful tools in many scientific fields such as zoology, biology, and education.  Estimators of fixed effects do not only depend on the variances of error terms but they also depend on random terms in mixed-effect models. When the distributions of random effects are unknown or enough sample size cannot be obtained, standard methods may fail. This study aims to determine a promising confidence interval method among existing methods in terms of coverage probability of true value of parameter. Standard and parametric bootstrap-based confidence interval methods for nested error regression model were compared in the simulation study under small samples. It is observed that parametric bootstrap-based method provides better coverage rates for small intra-correlation and profile likelihood method usually provides better results for moderate and strong correlation.


Introduction
Linear mixed-effects model is called with many names in statistics literature.As Demidenko [1] said in his book, "big ideas have many names and applications".These names are such as mixed-effects model, linear mixed model, mixed linear model, hierarchical model, linear mixed-effects model, multilevel model.
Linear mixed-effects model (LMM) is derived from the fact that these models are linear in terms of parameters in the models and they are mixed because these models include independent terms that can be random and fixed.Mixed model as a terminological term was firstly introduced by Eisenhart [2], random and fixed effect terms are formally distinguished in his study.LMMs do not require the assumption of independent and identically distributed sample as required in the typical statistics models such as linear regression.More complex dataset structures such as nested, hierarchical, multilevel can be handled with LMM.Between and within units or clusters variations are included in LMM.In LMM, observations within each unit or cluster are dependent whereas observations between units or clusters are independent.This flexibility attribute gives LMMs appropriateness for using repeated measures, longitudinal, panel and spatial datasets.With the existence of random effects in the model, theoretical approach to estimation and inference of fixed effects become more complicated than standard linear models.
In order to make inference for fixed effect parameters of LMM, several confidence interval methods are suggested by researchers in statistics literature.Before constructing confidence interval, estimator of fixed effect and variance estimation of random effects must be found.For variance estimation, best known methods are ANOVA, maximum likelihood (ML), and restricted (or residual) maximum likelihood (REML) methods.For LMM, REML estimation method was first introduced by Patterson and Thompson [3].Theoretical explanation of these methods are broadly shown in Searle et al [4].
For a good performance of confidence interval, a researcher needs to have a good estimator of fixed effect that has a small mean square error (MSE) .After a careful determination of these estimators, pivot and its distribution must be found to obtain percentiles to construct a confidence interval.Kackar and Harville [5] proposed MSE estimators of fixed and random effects in LMM.They indicated that if true values of variances of fixed and random effects are replaced by estimated values, the MSEs of estimators of fixed and random effects increase in size.Kenward and Roger [6] also proposed MSE estimator by combining Kackar and Harville approximations and their method to avoid bias in small samples for restricted maximum likelihood estimator of fixed effect in LMMs.
Harville and Fenech [7] investigated confidence interval methods for a variance ratio in unbalanced mixed linear models.Savin [8] constructed confidence interval for common mean in one-way classification model.Hall and Maiti [9] studied parametric bootstrap methods to construct confidence interval in mixed-effects model.Staggs [10] proposed parametric bootstrap approach to construct confidence intervals for fixed effect parameters of several mixed-effects models.Burch [11] constructed confidence intervals for variance components in unbalanced one-way random effects model using non-normal distributions.Liu [12] compared some confidence interval methods for function of parameters in repeated measures degradation model.Even though, there are several comparison studies for confidence interval methods, our study attempts to cover all existing methods for fixed effect of LMM.
In this paper, main focus is to be nested error regression model which is a special form of LMMs, given by A canonical form of nested error regression model is shown as follows  is defined as . General form of mixed-effects model for a particular experimental unit i is shown in Eq.(3).
, u ( ) i R with the dimension m x m and sub D with the dimension q x q are covariance matrices of respectively error terms and random effects.p and q are respectively number of fixed effect and random effect terms in LMMs.i X is the design matrix for fixed effects and has the dimension m x p for experimental unit i .It specifies values of fixed effects corresponding to each parameter for each observation.When fixed effect is categorical effect, the values of zero and one are used to denote the absence and presence of effect categories.For covariate effects, the variable values themselves are used in i X .Similarly, i Z is the design matrix for random effects with the dimension of q m x for experimental unit i .i u is 1 q x vector of random effects for experimental unit i with zero mean and covariance matrix sub D .i  is 1 m x vector of error terms for experimental unit i with zero mean and covariance matrix i R .
In addition, we assume that cov( , ) 0 i i u   .The assumptions of LMM above imply that, marginally ~( , ), In small area models, nested error regression model is used to involve unit-level predictors.The main purpose of this paper is to compare performances of several confidence interval methods for fixed effect parameter of nested error regression model under the situations of small samples.

Confidence Interval Methods of Fixed Effect in Nested Error Regression Model
In many applications, researchers usually take fixed effect into focus in mixed-effects model.Many researchers suggest constructing confidence interval (CI) for making inference of fixed effect parameters in LMMs.In order to perform statistical inference in this paper, we used three common maximum likelihoodbased asymptotic statistics; they are Wald-type statistics and likelihood ratio statistics.Neyman and Pearson introduced likelihood ratio statistics [13].
Wald statistic was firstly proposed by Wald [14].All these statistics are known as the first-order asymptotic statistics.They are derived from functions of maximum likelihood estimators, Fisher information matrix (FIM), or a consistent estimator of FIM.These statistics and their application to confidence intervals are described in the next chapters.Most common method among them is the Wald-type methods of confidence interval.

Likelihood ratio test-based confidence interval method
Likelihood Ratio Test-based confidence interval is also known as profile likelihood confidence interval (PLCI) in statistics literature.Since profile likelihood confidence interval method does not require asymptotic normality of the estimator, they usually perform better than Wald-type confidence intervals for small to moderate sample sizes.Even though they do not require normality assumption of the estimators, they still depend on an asymptotic approximation of likelihood ratio test statistic that is believed to have chi-square distribution.Likelihood function of linear mixed model in Eq. ( 3) is written as below where C is a constant that does not depend on model parameters.This likelihood function expression of linear mixed model is given in detail by Pinheiro and Bates [15].The model of interest is nested error regression model in this study.Parameter vector of this model is  is an additional parameter vector for the model.

( , )
L   is the likelihood function and then profile likelihood function for 1 L  is the maximum likelihood function over the remaining parameters for each fixed value of 1 In order to obtain PLCI for parameter of interest, the likelihood ratio test of a two-sided hypothesis should be inverted.For a two-sided test of null hypothesis : , the likelihood ratio test statistic is the difference between log-likelihood of full model and reduced model: where 1  and  are the MLEs for the full model and 0  is the MLE of  for the reduced model with where the test is non-significant at the  level as shown below in the inequalities: where with one degree of freedom.

Wald-type confidence interval methods
In appropriate settings where the assumptions of mixed model are valid, Wald-type confidence interval is preferred for fixed effect parameter.This test is called Wald Test based on the large sample theory and maximum likelihood theory [14].MLEs have useful properties such as consistency and asymptotic normality.These properties are valid if certain conditions defined by Lehmann are hold [16].If these conditions are hold, Wald-type statistics are reliable to construct confidence interval.For sufficiently large n under these conditions, equation ( 4) indicates that †  is consistent estimator of †  and †  is "approximately distributed as normal" for sufficiently large n .Another crucial result is that By using this result, covariance matrix of MLEs is calculated.Usually, observed information matrix is used because obtaining expected information matrix may not be possible.Observed information matrix is shown as below, Wald statistic is from Eq.( 4) for sufficiently large n , with p degrees of freedom, Based on asymptotic normality of the maximum likelihood estimator, the distribution of the pivot has asymptotically standard normal distribution in large sample sizes and its percentiles can be obtained from standard normal distribution percentiles.
In linear mixed model context, FIM is written as negative expectation of the Hessian matrix.It is used for obtaining standard error of the MLE †  of †  .Hessian matrix is given below: where l is the log likelihood ,   Wald confidence interval written as percentile of standard normal distribution.

t-Naive confidence interval method
t-Naive confidence interval is a Wald-type method used for small sample sizes under the situation that asymptotic distribution of maximum likelihood estimator is not converged.In practice, the variance of maximum likelihood estimator of fixed effects from FIM is usually obtained.Hence, when the variance of the estimator is unknown, it is ideal to use Student-t distribution percentiles for pivot percentiles explained above.Given the degrees of freedom for the pivot statistic, an approximate (1 )   t-Naive confidence interval can be constructed using the formula 1 cutoff [17].Demidenko [1] indicated that one can compute confidence interval under this approach by taking variance estimator of fixed effect as the naïve approximation and taking n p  degrees of freedom (where p is number of fixed effect terms in the model) for the approximating t distribution.

Parametric bootstrap method
Bootstrap is a statistical concept firstly introduced by Efron [18].Diccio and Romano approached the bootstrap for constructing confidence interval [19,20].In order to obtain confidence interval and standard error of estimates for multilevel linear model, Goldstein [21] used bootstrap technique.Zeger and Liang [22] provided an example of parametric bootstrap using generalized estimating equations.Very broad bootstrap literature can be found by Chernick [23].Efron's idea about bootstrap is that he considers the original bootstrap method as a maximum likelihood approach in nonparametric framework.If F was assumed to have a parametric form such as Gaussian distribution, estimator of F would be Gaussian distribution with the MLE of  and 2  used for the respective unknown parameter of F .According to Fisher's MLE theory, resampling with replacement from parametric estimates of F provides bootstrap estimates that are also ML estimates.
In this paper, one of the most important parametric bootstrap confidence interval methods called as Bootstrap-t introduced by Efron in his monograph was used [24].
Among available bootstrap confidence interval methods, bootstrap-t usually provides the best balance between performance and cost when standard error of the estimate is convenient and accurate [10].Therefore, this method was taken into consideration to obtain parametric bootstrap confidence interval for fixed effect parameter of nested error regression model.Two parametric bootstrap confidence intervals were constructed: one of them is based on Kenward & Roger MSE approximation [6] and the other is based on conventional MSE obtained by FIM to see the effect of MSEs on the coverage rates of the same method.

Bootstrap -t confidence interval method
Bootstrap-t interval [25] is derived from bootstrap estimate of the sampling distribution of the pivot denotes the bootstrap pivot approximation and * q  denotes the  cutoff for the bootstrap distribution of * q .The probability approximation is shown below: An approximate bootstrap-t confidence interval for 1  is given by the interval below: Because double bootstrap is quite time consuming to obtain standard error of 1  as a second layer of bootstrap, two different analytic approximations of standard error were used.One of them is conventional standard error obtained by FIM explained in Section 2.1 and this bootstrap interval is called as BS-t in the simulation study.The other is adjusted estimator of the variance covariance matrix of the fixed effect estimators.This is known as Kenward-Roger approximation (BS-t-KR) and is less biased than the conventional estimator for small or moderate number of experimental units [6].The confidence interval based on Kenward-Roger approximation of standard error [6] with bootstrap-t procedure is shown below:

Simulation Study
Confidence interval methods are compared in terms of coverage probability of true parameter value.The datasets generated from one-way random effect ANOVA model, , are fitted to nested error regression model, Scores for a fixed predictor ij x are simulated from normal distribution, (0,100) N .It is important to indicate that the datasets were generated independently of the ij x , and the value of 1  in the generating model (one-way random effect ANOVA model) was taken as zero.For each dataset, three analytic confidence intervals for The values in the parameter vector 2 2  ( , , ) u     were chosen differently to examine their effects on the coverage probability.The summaries of these values are given in Table 1.

Simulation algorithm
The two-sided %95 confidence interval methods described in Section 2.1, 2.2, 2.3 were examined.For each method, combination of ( , , ) n m  , the coverage probabilities were obtained by the following steps below: 1. Simulate a dataset for each combination of factors from the simulation design.Construct Wald-type, profile likelihood and parametric bootstrap confidence intervals given in Section 2.1, 2.2, 2.3.
2. Using the one-way random effect model, parametric bootstrap samples were generated.Using the generated samples, bootstrap-t, bootstrap-t based on Kenward-Roger MSE approximation confidence intervals given in Section 2.4 were constructed.The number of bootstrap samples generated in the simulation study was chosen as 2000 B  as Efron and Tibshirani [25] suggested.
3. In step 2 and step 3, check the confidence intervals obtained if the true value of 1  was included, assign an indicator a value of 1; if not included, and then assign an indicator a value of 0.

Repeat
Step 1 to 4 for a total of 2000 N  times and compute the proportion of confidence intervals which contain the true value of 1 0   .

Simulation results
Coverage rates of confidence interval methods are summarized in Figure 1 to 3 for each combination of ( , , ) it can be said that t-naïve method provides usually better coverage rates among the other methods as seen in Figure 1(c).combination, profile likelihood method provides the highest coverage rates among five methods.Bootstrap-based methods provide similar results to each other and nominal level of alpha.In addition, they provide better results than Wald-type intervals for small sample sizes as shown in Figure 3.

Discussion and Conclusion
Even with small sample sizes of experimental unit or cluster, the parametric bootstrap interval method demonstrates a crucial alternative to standard methods of inference for fixed effects in the mixed model especially for small intra-class correlation coefficient.The bootstrap-t method based on Kenward-Roger MSE approximation represents the best coverage rates for small experimental unit sizes despite of its dependency of standard error estimates.
When the intra-class correlation coefficient is increased to 0.5, 0.75   , profile likelihood method generally provides the highest coverage rates for all combinations of n and m .For this reason, it is the most conservative method among the others in terms of type 1 error.The highest coverage rates of profile likelihood method may be the result of wide length of confidence intervals.Since average expected length of confidence interval methods were not reported, it is not ideal to state that profile likelihood method among them as the best method without the knowledge of average expected lengths for moderate and strong intra-correlation coefficients.
As a suggestion and future direction of this study, investigating and comparing nonparametric and parametric bootstrap interval approaches in terms of coverage probability rate and average expected length for different linear mixed models especially when normality assumption of random effects of the model cannot be verified can be the main focus.It is known that parametric bootstrap is limited to the normality assumption of random effects in the linear mixed model.For this situation, finding a promising confidence interval method as an alternative to standard methods of inference will be the main attention for the authors of the paper.Furthermore, a two-sided, 5% level test of hypothesis   , false rejection rate of null hypothesis can be obtained by the proportion of confidence intervals that do not include zero, or by one minus the observed coverage probability rate of the confidence intervals, equivalently.
Among five interval methods, Wald-type methods are the fastest.The profile likelihood is a more time consuming method than Wald-type methods.However, the most computation time is required by bootstrap-based methods which are the only disadvantage of this method.However, they are the most accurate methods for especially small to moderate sample sizes of small intra-class correlation coefficient.Even though Wald-type methods are the fastest, for the small sample sizes they may not provide accurate results of coverage rates.Also, it can be stated that as the number of observation within unit and the number of experimental unit are increased, Wald-type methods improve their coverage rates for all simulation settings.
All combination factors have a quiet effect on the coverage probability of the methods and especially sample sizes have a great impact on coverage probability rate of the interval methods.Our study focused on confidence interval methods of fixed effect in nested error regression model under small samples.Similar results in a linear mixed model are expected; however, this is too early to generalize the comment obtained through this study for all situations of linear mixed model.In order to make more generalized suggestions, the present study requires more careful attention and simulation as a future research.The goal of the next study is to present a method that performs well in a broad variety of settings and is easier to implement.The authors can supply a document containing R code to show and obtain these confidence interval methods reported in this article.

1 
of variance component parameters for the model in Eq.(2).Since we only take fixed effect parameter of nested error regression model into consideration, Wald confidence region is a CI for 1  , an approximate

For the combination of 0. 25   and 3 m
 for all n values, it is observed that parametric bootstrapbased confidence intervals methods provide the highest coverage rate among five interval methods and close to nominal level of alpha.BS-t-KR provides the highest coverage rate for small and moderate n values as seen in Figures1(a), 1(b), 1(c).For combination of 0.25   and 9 m  the same result is valid for all n values stated above.Especially for small n values, BS-t-KR has the highest coverage rate.For combination of 0

Figure 1 (
Figure 1(a).Coverage rate of confidence interval methods for the combination of 3 & 0.25 m   

Figure 1 (
Figure 1(b).Coverage rate of confidence interval methods for the combination of 9 & 0.25 m   

Figure 1 (
Figure 1(c).Coverage rate of confidence interval methods for the combination of 15 & 0.25 m   

Figure 2 (
Figure 2(a).Coverage rate of confidence interval methods for the combination of 3 & 0.5 m   

Figure 2 (
Figure 2(b).Coverage rate of confidence interval methods for the combination of 9 & 0.5 m   

Figure 2 (
Figure 2(c).Coverage rate of confidence interval methods for the combination of 15 & 0.5 m   

Figure 3 (
Figure 3(a).Coverage rate of confidence interval methods for the combination of 3 & 0.75 m   

Figure 3 (
Figure 3(b).Coverage rate of confidence interval methods for the combination of 9 & 0.75 m   

Figure 3 (
Figure 3(c).Coverage rate of confidence interval methods for the combination of 15 & 0.75 m   

H
is rejected if the interval does not include zero.Since the dataset of the simulation design is generated with 1 0 a response for j of unit i , n denotes the total number of experimental units or clusters, and m is the number of replications or observations within each unit or cluster.0

Table 1 .
Parameter values selection summary for simulation design.