Reparameterized Birnbaum-Saunders regression models with varying precision

We propose a methodology based on a reparameterized Birnbaum-Saunders regression model with varying precision, which generalizes the existing works in the literature on the topic. This methodology includes the estimation of model parameters, hypothesis tests for the precision parameter, a residual analysis and influence diagnostic tools. Simulation studies are conducted to evaluate its performance. We apply it to two real-world case-studies to show its potential with the R software. MSC 2010 subject classifications: Primary 62J12, 62J20; secondary 62F03.


Introduction
The Birnbaum-Saunders (BS) distribution has been widely studied and applied; see the seminal paper by Birnbaum and Saunders [2] and the books by Johnson et al. [12, pp. 651-663] and Leiva [13]. The BS distribution is skewed positively, has nice properties and a close relation with the normal distribution. It was derived in terms of shape and scale parameters, but the latter is also its median. The original parameterization of the BS distribution is useful in several settings, for example, when modeling biological, environmental and fatigue data; see Owen and Padgett [25], Owen [24], Qu and Xie [28], Villegas et al. [46], Ferreira et al. [8], Li et al. [20], Saulo et al. [36], Leiva et al. [15,14,19], Garcia-Papani et al. [10] and Marchant et al. [22,23].
Santos-Neto et al. [34] proposed several parameterizations for the BS distribution by using different arguments. One of them indexes the BS distribution by its mean and precision, which we name the reparameterized BS (RBS) distribution. As mentioned, the original BS parameterization is based on its shape and median and then the associated BS modeling is formulated by its median instead of its mean. However, in statistics, it is usual to model the mean. Thus, because the RBS distribution is parameterized by its mean, it can be used as a competitor of the normal distribution, but also of well-known asymmetrical distributions, such as gamma and lognormal. Therefore, the RBS distribution is useful in settings for which the original parameters are limited. For example, when modeling economic, financial and management data; see the works by Jin and Kawczak [11], Bhatti [1], Paula et al. [27], Leiva et al. [16,18,17], Santos-Neto et al. [35], Rojas et al. [32] and Wanke and Leiva [47], applications that, such as in the original parameterization, were conducted by international, transdisciplinary groups of researchers.
Note also that the original BS parameterization describes data based on a logarithmic transformation, inducting to an interpretation problem of the obtained results. Regression models are often concerned on the mean response and in its original scale, because there the interpretations become simpler. By using the RBS distribution, one can model the mean with no transformations similarly as in generalized linear models (GLM), but the BS and RBS distributions do not belong to the exponential family. However, a GLM type modeling based on the RBS distribution can also be carried out; see Leiva et al. [16]. Thus, the mean response is related to a linear predictor by one of several possible link functions, encompassing parameters to be estimated. Differently from all the existing BS regression models studied until now, the approach proposed by Leiva et al. [16] allows data to be modeled in their natural scale with a wide flexibility. In addition, the RBS distribution has properties that its competitor distributions of the exponential family do not have; see Subsection 2.1.
The RBS distribution has a precision parameter. Variability is often measured by dispersion parameters, but it can also be described by precision parameters, which are inversely proportional to the dispersion. Variability modeling has been widely discussed in the literature related to heteroscedasticity; see Van Keilegom and Wang [43] and Saumard [37]. For example, Cook and Weisberg [3] studied heteroscedastic normal models. Taylor and Verbyla [42] described jointly the location and dispersion parameters of the Student-t model. Lin et al. [21] considered tests for heteroscedasticity in Student-t regression models. Wu et al. [49] proposed a method to select variables describing the mean and dispersion of lognormal models. In the context of GLM, Smyth [39] defined sub-models to describe the mean and dispersion, whereas Smyth and Verbyla [40] proposed an extension of GLM allowing both the mean and dispersion to be modeled. Cysneiros et al. [5] considered heteroscedastic symmetric linear models and their diagnostics. However, there exists few works modeling heteroscedasticity by precision parameters. Ferrari et al. [7] considered beta regression models for which the precision parameter is not constant across data and described it as a function of explanatory variables (covariates). Simas et al. [38] assumed a non-linear regression structure for the precision parameter by using a beta distribution, whereas Rocha and Simas [31] derived local influence in this model. Paula [26] modeled simultaneously the mean and precision of the gamma distribution, and carried out diagnostics in double generalized linear models.
For BS regressions based on its original parameterization, Rieck and Nedelman [30] and Galea et al. [9] assumed that the corresponding shape parameter is homogeneous across data. Xie and Wei [50] proposed a test for homogeneity of this shape parameter. Heterogeneous BS log-linear and non-linear regressions with both shape and scale parameters modeled by covariates were studied by Qu and Xie [28], Li et al. [20] and Vanegas et al. [44]. For the RBS regression model proposed by Leiva et al. [16], it is assumed that the precision is constant across data. Modeling of precision based on the RBS distribution has not been studied.
The main objective of this paper is to propose an RBS regression model with precision varying, allowing heteroscedasticity to be described, extending the work by Leiva et al. [16]. The specific objectives are (i) to estimate the parameters with the maximum likelihood (ML) method; (ii) to introduce hypothesis tests for the precision parameter and evaluate their performance; (iii) to present four types of residuals for the RBS model and study their distributions by Monte Carlo (MC) simulations; (iv) to analyze the sensitivity of the ML estimators to perturbations by using local influence (LI) and generalized leverage (GL) methods [see 45]; and (v) to apply the obtained results to two real-world case-studies with the R software; see www.R-project.org and R-Team [29].
In Section 2, we formulate the RBS regression model with varying precision, estimate its parameters and discuss gradient (GR), likelihood ratio (LR), score (SC) and Wald (WA) tests for the precision parameter. In Section 3, we conduct a diagnostic analysis, including residuals, GL and LI under case-weight, response and covariate perturbations. In Section 4, we carry out MC simulation studies to evaluate the performance of the proposed hypothesis tests and the empirical distribution of the residuals. In Section 5, we illustrate the potential applications of the proposed methodology by means of two real-world case-studies. In Section 6, we provide our conclusions and possible future work.

RBS distribution
A random variable T follows a BS distribution with shape (ϕ > 0) and scale (ρ > 0) parameters, which is denoted by T ∼ BS(ϕ, ρ), if its probability density function (PDF) is given by , which is function only of δ. In addition, δ plays the role of a precision parameter in the sense that, for fixed μ, as δ increases, the corresponding variance decreases. If Y ∼ RBS(μ, δ), then its PDF and log-PDF are given respectively by In addition, the cumulative distribution function (CDF), quantile function (QF) and hazard rate of Y ∼ RBS(μ, δ) are defined respectively as , y > 0, where Φ and Q N are the N(0, 1) CDF and QF, respectively. The RBS model has the properties: (iii) as its QF has closed form, its median is [δ/{δ + 1}]μ; and (iv) its hazard rate has increasing, decreasing and upside-down shapes. All of these properties are shared by few distributions, in particular, several of them are not shared by exponential family distributions used in GLM. Moreover, note that, for modeling asymmetric data, the median might be more suitable than the mean. However, in the case of the RBS distribution, the median is proportional to its mean. Figure 1 presents the relation between the median and mean of the RBS distribution. From this figure, observe that, as the parameter δ increases, the two measures tend to be equal. For instance, from δ = 4, we have that the median is 80% of the mean. Generally, in real-world applications, the estimate of δ is large, thus providing a good approximation between the median and mean. In the application of the work by Leiva et al. [16], the estimate of δ is 49.65, which means that the median represents 98.03% of the mean. In the two data analyses of the present paper, the estimates of δ are 82.38 and 35.32, representing 98.80% and 97.25% of mean, respectively. Thus, in practice, our model has no relevant loss of information by modeling the mean instead of the median. In addition, for the RBS regression model, we have the advantage of not transforming the response. Furthermore, a GLM type method, as that used in the RBS model [see 16], provides more flexibility in the functional relationship between the mean response and the linear predictor. More properties of the RBS distribution can be found in Santos-Neto et al. [35].

RBS regression model with varying precision
Let Y = [Y 1 , . . . , Y n ] be a sample from an RBS population, that is, independent (IND) random variables but not independent identically distributed, such that, . . , n, and y = [y 1 , . . . , y n ] its observed value. The RBS regression model with varying precision can be written supposing that the corresponding mean and precision parameters satisfy, respectively, the functional relations where  .3) must be strictly monotone, positive and at least twice differentiable. Table 1 lists the most common link functions for g and h along with their first and second derivatives.

Estimation
The log-likelihood function for the parameter θ = [β , α ] obtained from (2.2), related to μ = [μ 1 , . . . , μ n ] and δ = [δ 1 , . . . , δ n ] for the class of models with link functions in (2.3), is given by where (ignoring the constant term) The [p + q] × 1 score vector with first derivatives is obtained from (2.4) aṡ Hessian matrix with second derivatives also is obtained from (2.4) as see details in Appendix A.2. The corresponding expected Fisher information matrix is given by where the elements of V , S and U are given in Equation (A.2). Also consider X as being another 2n × [p + q] augmented matrix of the form Therefore, from (2.6) and (2.5), the Fisher information matrix given in Equation (A.2) and its inverse can be rewritten as i(θ) =X WX and i(θ) − By using the Fisher scoring iterative procedure, the corresponding estimation algorithm for θ = [β , α ] is given by Note that θ (m+1) given in (2.7) has the form of a reweighted least square (LS) estimate, whereỹ is the modified response variable. We propose to use the Cole-Green algorithm to conduct with the iterative procedure defined in (2.7). This algorithm can be better for distributions with highly correlated parameter estimators and equivalent to the Fisher scoring algorithm, when we are in the fully parametric case; see details about the Cole-Green algorithm in [41].

Hypothesis testing
We want to test if the model precision is constant, that is, our hypotheses are H 0 : For large n, under H 0 and usual regularity conditions, the GR, LR, SC and WA statistics follow the χ 2 q−1 distribution. Thus, the above hypotheses can be tested with critical values based on this distribution.
We use θ for denoting the ML estimator of θ evaluated at H 0 (restricted case), or for any function or element of this vector, and θ for the ML estimator of θ under H A (unrestricted case).

The GR test
Its statistic to test H 0 versus H A is given by where Z 0 is composed by the elements of the q − 1 last columns of Z and y , δ are defined in Appendix A.1.

The LR test
Its statistics to test H 0 versus H A is defined as

The SC test
Its statistic to test H 0 versus H A can be expressed as

The WA test
Its statistic to test H 0 versus H A is obtained as

Residuals
We introduce residuals for the model with link function defined in (2.3) based on the standardized Pearson, score and quantile residuals given respectively by being the ML estimates of μ i and δ i , respectively. In addition, see details about the quantile residual in Dunn and Smyth [6]. For the score residual, see details in Appendix A.3. Moreover, we propose a deviance component type residual, assuming that the precision parameter has a structure of regression known, by where the sign function is defined as sign(x) = {−1, 0, 1}, for x < 0, x = 0 and x > 0, respectively. Residual are used to verify model adequacy, to test nonlinearity and to detect outliers, autocorrelation and heteroscedasticity, plotting them versus covariates individually, or versus μ i or η i .

Generalized leverage
GL of an estimator is defined in regression as a measure of the importance of individual cases, evaluating the influence of the observed response on its own estimated value. GL can be obtained in a general form as ∂ y/∂y = In the RBS regression model with varying precision, the GL matrix is given by Considering the results presented in (3.1), the GL given in (3.2) can be rewrit- is the GL considering only the vector β and I n is the n × n identity matrix.

Local influence
For the RBS model given in (2.3), let (θ|ω) be log-likelihood function corresponding to this model perturbed by ω. The perturbation vector ω belongs to a subset of R n and ω 0 is an n × 1 non-perturbation vector, such that (θ|ω 0 ) = (θ), for all θ. The likelihood displacement (LD) defined as LD(ω) = 2[ ( θ) − ( θ ω )], where θ ω denotes the ML estimate of θ upon the perturbed RBS model, can be used to assess the influence of the perturbation on the ML estimate. The normal curvature for θ in the direction vector l, with ||l|| = 1, Note that Δ must be evaluated at θ = θ and ω = ω 0 , for j = 1, . . . , p + 1 and i = 1, . . . , n. An LI diagnostic is generally based on index plots. For instance, the index graph of the eigenvector l max corresponding to the maximum eigenvalue of C lmax (θ) say, evaluated at θ = θ, can detect those cases that under small perturbations exercise a great influence on LD(ω). In addition to the direction vector of maximum normal curvature, l max say, another direction of interest is l i = e in , which corresponds to the direction of the case i, where e in is an n × 1 vector of zeros with a value equal to one at the ith position, that is, {e in , 1 ≤ i ≤ n} is the canonical basis of R n . In this case, the normal curvature is given by /n, are considered as potentially influential. This procedure is called the total LI of the case i. For the indicated scheme, Table 2 presents the respective perturbation matrix given in general by which must be evaluated at the non-perturbation vector ω 0 and at θ.
n×q are n × p and n × q matrices, respectively, of zeros except for the kth and lth columns, which only contains ones.

Hypothesis testing
We report MC simulations of size 5000 to compare the performance of the GR, LR, SC and WA statistics given in (2.9), (2.10), (2.11) and (2.12), respectively, by four R functions. We estimate the coefficients of the RBS regression model with varying precision defined in (2.3) using the gamlss function, contained in the R package of the same name. We implement the RBS distribution inside the distributions of this R package; see Stasinopoulos and Rigby [41]. To use the gamlss function, we introduce the RBS distribution defined in (2.1) in the same structure of the distributions defined in the gamlss.dist library. Based on the proposed methodology, we develop a set of computational routines in R, which form part of the rbs package; see Santos-Neto et al. [33]. To install this package, the code devtools::install github(''santosneto/RBS'') must be used. We consider λ = max δ i /min δ i , for i = 1, . . . , n, to measure the heterogeneity degree of the data. Note that if λ = 1, the precision is constant obtaining the model proposed by Leiva et al. [16].  Table 3, from where it is possible to note that the GR, LR and WA tests are markedly liberal, whereas the SC test is, in general, conservative. Note that, when δ = 20.1, n = 100 and γ = 0.01, the rejection rates are 0.96% (SC), 1.26% (GR), 1.38% (LR) and 1.76% (WA). As expected and for all the tests, these rates converge to the assumed levels as n increases. Note that the GR and SC tests present the best performances. Figure 2 displays empirical quantiles versus theoretical quantiles (QQ) plots of these statistics compared to the χ 2 q−1 distribution. From this figure, observe that the empirical distributions of the four studied test statistics agree very well with their asymptotic distributions. Second, we compute the rejection rates under the alternative hypotheses α 1 ∈ {−3.0, −2.0, −1.0} against α 0 = 1.5, which implies values of λ ≈ {2.6, 6.8, 18.4}. Table 4 presents the empirical power of the tests for different n and λ. Note that the GR and SC tests present smaller power values in relation to the LR and WA tests. For instance, when λ = 18.4, n = 50 and γ = 0.05, these powers are 99.20 (GR), 99.20 (LR), 98.62 (SC) and 99.44 (WA). As expected, the powers of the four tests increase with λ. For small to moderate n, the best performance is detected for the GR and SC tests, which are more powerful than the LR and WA tests. Hence, GR and SC tests may be recommended to test hypotheses on the precision parameter in RBS regression models. The GR test has a slight advantage over the SC test, because the GR statistic is simpler to calculate.

Residuals
Now, we report the results of MC simulations of size 5000 to study the empirical distributions of the residuals r p i , r s i , r d i and r q i for the RBS regression model with   We compute the mean (r), standard deviation (SD) and coefficients of skewness (CS) and kurtosis (CK), whose results are presented in Tables 5 and 6. Note that all of the residuals have mean approximately equal to zero and SD close to one. Observe also that the empirical distribution of r p i has positive asymmetry. Also, notice that r s i , r d i and r q i present a CS close to zero. The residuals r p i and r s i presents, generally, CK closer to three for both cases. Observe that r d i and r q i display in general a similar behavior. Figure 3 presents the QQ-plots with simulated envelopes of empirical quantiles versus theoretical quantiles of

Case-study description
Alfalfa is a high protein crop and appropriate feed for dairy cows. Then, rent for land planted with alfalfa in relation to rent for other agricultural uses should be higher in zones with a high density of dairy cows. However, these rents should be lower in zones where a fertilizer is required, due to further expenses. In this line, Weisberg [48, Problem 9.10, p. 208] reported a study on the variation in rent paid for agricultural land planted with alfalfa in Minnesota. One of the objectives of that study was to investigate how the rent for land planted with alfalfa crops in relation to rent for other agricultural uses is affected by the density of dairy cows and by the proportion of farmland used as pasture, when land requires a fertilizer to increase the productivity of alfalfa or not.

Previous studies on these data
To evaluate the objective of the case-study described in Subsection 5.1, data were collected for 67 counties of Minnesota in 1977; see Weisberg [48,Problem 9.10,p. 208]. These data are available from an R package named alr3 by the command data(landrent). Landrent data were analyzed by Taylor and Verbyla [42], Lin et al. [21] and Wu et al. [49]. Non-constant variance in a linear regression model can be diagnosed by residual plots. Taylor and Verbyla [42] discussed the joint modeling of location and dispersion parameters and derived a methodology to detect heteroscedasticity, when the response is Student-t distributed. Lin et al. [21] considered tests for heteroscedasticity in Student-t linear regression models. All of these studies and also Wu et al. [49] showed evidence of heteroscedasticity in landrent data, but this last recent work showed the convenience of modeling simultaneously the mean and dispersion of the lognormal distribution, which is an asymmetrical distribution, such as the RBS distribution is.

Variables to be modeled
The considered variables are: (i) the ratio between the average rent per acre planted with alfalfa and the corresponding average rent for other agricultural uses ('ratio'); (ii) the density of dairy cows in number per square mile ('density'); (iii) the proportion of farmland used as pasture ('proportion'); and (iv) if the fertilizer 'liming' is required to increase the productivity of alfalfa or not. We discard the covariate 'proportion' due to it is correlated with the covariate 'density' and concentrate on the group 'liming'. Note that the response variable 'ratio' is strongly correlated with 'density' indicating a linear relation between these two variables; see Figure 4 (1st panel left). Then, our study is concentrated on the response variable 'ratio' (Y ) and the covariate 'density' for the group 'liming' (X).

Estimation and model cheking
First, we consider the RBS model with fixed precision Y i ∼ RBS(μ i , δ) and link function μ i = β 0 + β 1 x, for i = 1, . . . , 33; see Figure 4 (1st panel left). The ML estimates of its parameters, with estimated asymptotic standard errors (SE) in parenthesis, are β 0 = 0.67810(0.0363), β 1 = 0.0166(0.0031) and δ = 82.3752(20.2798). From Figure 4 (1st panel center), observe that the assumption of the RBS distribution seems to be reasonable. From Figure 4 (1st panel right), note that the residual plot shows a pattern that indicates an evidence of a non-constant precision. Observing at Figure 4 (2nd panel left), note that the precision depends on the value x of the covariate. Then, based on this last figure, we propose the RBS model with varying precision 33. For the precision parameter, we choice the square root link function based on usual selection model criteria (AIC, BIC) and the tests for varying precision proposed in Subsection 2.5. For the RBS model with varying precision, the ML estimates of its parameters and estimated asymptotic SEs in parenthesis are shown in Table 7. Model assumptions are verified using a residual analysis based on landrent data provided in Figure 4 (2nd panel center), because unusual features are not observed. In addition, the independence assumption also is verified by the QQ-plot with simulated envelope and by the plot of residuals displayed in Figure 4 (2nd panel right), from which an outlying case (#33) is detected.

Influence local
Diagnostics for the RBS regression model with varying precision are in Figure  5. Specifically, this figure displays index plots of C i , from where cases #26 and #33 are detected as potentially influential. We investigate their impact on the model inference when they are removed. Then, we again estimate the model after removing cases #26 and #33. To evaluate the impact of the removed cases, we define the relative change (RC) by RC θ denote the ML estimate of θ j obtained after removing the case i, for j = 1, 2, 3 and i = 1, . . . , 33. The corresponding RCs are in Table 8. Inferential changes are not detected when the cases are removed. The case #26 is a county with a higher average rent per acre planted with alfalfa, even not being the density of dairy cows the largest one. The case #33 is a county with small average rent per acre planted with alfalfa, despite possessing a density of dairy cows larger than 25% of the counties. We have that the mean of the ratio between the average rent per acre planted with alfalfa and the corresponding average rent for other agricultural uses can be described by μ(x) = 0.7426 + 0.0121 x. In this model, 0.7547 is the estimated mean ratio when the dairy cow density per square mile is zero, whereas the estimated mean ratio increases in 0.0121 if we increase in one unit the dairy cow density per square mile.  [26]. One of the objectives of that study was to determine the ideal storing time.

Previous studies on these data
These data are available from an R package named ssym by the command data(Snacks). Snack data were analyzed by Paula [26], who discussed diagnostic methods in double generalized linear models. The author modeled simultaneously the mean and precision of the gamma distribution, which is an asymmetrical distribution, such as the RBS distribution is.

Variables to be modeled
The considered variables are: texture (Y ), snack type (S) and quantity of weeks (W ). The variable texture is compared across time among the five snack types. Paula [26] observed that the means and CVs seem to be different among the snacks, changing across weeks, with indication of quadratic tendencies for the means.

Estimation and model cheking
Based on the descriptive analysis carried out by Paula [26], we consider the RBS regression model with varying precision. Here, Y ijk denotes the texture corresponding to the unit k of the snack type i in the jth week, with i = 1(A), 2(B), 3(C), 4(D), 5(E), j = 2, 4, . . . , 18, 20, and k = 1, 2, . . . , 15. We fitted 54 different models (named Models 1, . . . , 54) with identity, logarithm and square root link functions for g and h.  Table 10. Model assumptions are verified using residuals in Figure 6 (right), because unusual features are not detected. In addition, the independence assumption also is verified by the QQ-plot with simulated envelope and by the residual plot displayed in Figure 6, from which the case #91 (type A snack) is detected as atypical.

Influence local
Diagnostics for the RBS regression model with varying precision for snack data are presented in Figure 7. This Figure displays C i versus weeks for θ under caseweight, response and covariate-X perturbation, from where cases #76 (type A), #91 (type A), #136 (type A), #346 (type C), #465 (type D) and #750 (type E) are detected as potentially influential. We investigate the impact on the model inference when they are removed, but their elimination does not change the inference for the mean nor for the precision. Figure 8 presents the predicted mean and precision values for the texture across weeks, for each snack type. From Figure 8 (left), note that the types A and C snacks have the largest mean values across weeks. From Figure 8 (right), note that the types D and E snacks have the largest precision for the texture. An objective of the case-study is to determine the ideal storing time, and looking at Figure 8 (left), it seems to be about 14 weeks.

Concluding remarks
In this paper, we have developed a methodology based on a reparameterized Birnbaum-Saunders regression model with varying precision, generalizing the existing works in the literature on the topic. We have dealt with the issues of estimating the model parameters and of performing hypothesis testing on the corresponding precision parameter. We have considered three classic statistics, such as likelihood ratio, score and Wald, and one recently proposed called gradient, to test varying precision in our model. In addition, we have conducted Monte Carlo simulations to study the finite-sample performance of these tests. The simulation results have indicated that the gradient and score tests are preferred, but the former one has an advantage because it is simpler to calculate. Also, we have presented fours types of residuals when the mean and precision are modeled simultaneously. Again using Monte Carlo simulations, we have evaluated their performances and determined their distributions empirically, which have resulted to follow a normal distribution reasonably. Furthermore, we have developed diagnostic tools for this new class of regression models based on generalized leverage and local influence methods, considering case-weight, covariate and response perturbations. These tools have shown to be useful for detecting influential cases. Moreover, we have carried out applications with two real-world case-studies, which have shown the potential of using the new methodology based on a reparameterized Birnbaum-Saunders regression model with varying precision.

A.1. Score vector
The elements of the score vector, obtained by differentiation of the log-likelihood function expressed in (2.4) with respect to θ, are given, for j = 1, . . . , p and r = 1, . . . , q, aṡ where

A.2. Hessian matrix
The corresponding Hessian matrix is given bÿ (A.1) The elements of the first block of the matrix¨ (θ),¨ (β) say, are obtained from the derivative ∂ 2 (θ)/∂β j ∂β k = n i=1 c i x ij x ik , which may be written in matrix form as¨ (β) = X CX, where C = c i δ n ij and c i = d The elements of the second block of the matrix¨ (θ),¨ (βα) say, are obtained from the derivative ∂ (θ)/∂β j ∂α r = In matrix form, we have¨ (βα) = X M Z, with M = m i δ n ij . Consequently, we obtain the third block of the matrix¨ (θ),¨ (αβ) say, since this is the transpose of the matrix¨ (βα). The elements of the last block of¨ (θ),¨ (α) say, are obtained from the derivative ∂ 2 (θ)/∂α l ∂α s = whereas its matrix form can be expressed as¨ (α) = Z W Z, with W = w i δ n ij . The inverse Hessian matrix is given bÿ where V = diag{E c1 , . . . , E cn }, S = diag{E m1 , . . . , E mn }, U = diag{E w1 , . . . , E wn }, with , which can be interpreted as the LS solution of the linear regression of z 1 on X with weight matrix V . By using this result, we propose a residual based on the solution of weighted LS of z 1 against X, which we call score residual and define as r = V 1/2 [ z 1