Abstract
In this paper, we present recent results on multivariate linear regression models in which the random errors follow multivariate skew scale mixtures of normal distributions. This class of distributions includes the scale mixtures of multivariate normal distributions as special cases and provides flexibility in capturing a wide variety of asymmetric behaviors. We implemented the ECM (Expectation/Conditional Maximization) algorithm and obtained closed-form expressions for all the parameter estimators of the proposed model. Building on the ECM algorithm, we developed influence diagnostics for detecting influential observations, in order to investigate the sensitivity of the maximum likelihood estimators. To examine the performance and usefulness of the proposed methodology, we present simulation studies and analyze a real dataset.
References
Akaike, H. (1974). A new look at the statistical model identification. IEEE Trans. Autom. Control. 19, 6, 716–723.
Andrews, D.F. and Mallows, C.L. (1974). Scale mixtures of normal distributions. J. R. Stat. Soc. Ser. B Methodol. 36, 99–102.
Atlas (2013). Atlas do Desenvolvimento Humano no Brasil. http://www.atlasbrasil.org.br/2013/pt/consulta/.
Azzalini, A. and Dalla-Valle, A. (1996). The multivariate skew-normal distribution. Biometrika 83, 715–726.
Branco, M.D. and Dey, D.K. (2001). A general class of multivariate skew elliptical distributions. J. Multivariate Anal. 79, 99–113.
Cabral, C.R.B., Lachos, V.H. and Zeller, C.B. (2014). Multivariate measurement error models using finite mixtures of skew-student t distributions. J. Multivariate Anal. 124, 179–198.
Chatterjee, S. and Hadi, A.S. (2006). Regression Analysis by Example, 4th edn. Wiley-Interscience, New York.
Cook, R.D. (1977). Detection of influential observation in linear regression. Technometrics 19, 15–18.
Cook, R.D. (1986). Assessment of local influence. J. R. Stat. Soc. Ser. B Methodol. 48, 133–169.
Cook, R.D. and Weisberg, S. (1999). Applied Regression Including Computing and Graphics. Wiley, New York.
Davidson, R. and MacKinnon, J.G. (2003). Econometric - Theory and Methods. Oxford University Press, New York.
Dempster, A.P., Laird, N.M. and Rubin, D.B. (1977). Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B Methodol. 39, 1, 1–38.
Draper, N.R. and Smith, H. (1998). Applied Regression Analysis, 3rd edn. Wiley, New York.
Ferreira, C.S., Bolfarine, H. and Lachos, V.H. (2011). Skew scale mixture of normal distributions: properties and estimation. Stat. Method. 8, 154–171.
Ferreira, C.S., Lachos, V.H. and Bolfarine, H. (2016). Likelihood-based inference for multivariate skew scale mixtures of normal distributions. AStA Adv. Stat. Anal. 100, 421–441.
Gugliano, A.A. (2017). Globalização contra-hegemônica e instituições participativas: características das cidades com orçamentos participativos no rio grande do sul. Ciênc. Sociais Unisinos 53, 309–316.
Harville, D.A. (1997). Matrix Algebra from a Statistician’s Perspective. Springer, New York.
Hu, W. (2005). Calibration of multivariate generalized hyperbolic distributions using the EM algorithm, with applications in risk management, portfolio optimization and portfolio credit risk. Ph.D. thesis. Florida State University.
Islam, M.Q., Yildirim, F. and Yazici, M. (2014). Inference in multivariate linear regression models with elliptically distributed errors. J. Appl. Stat. 41, 1746–1766.
Johnson, R.A. and Wichern, D.W. (2007). Applied Multivariate Statistical Analysis, 6th edn. Pearson Prentice Hall, Upper Saddle River.
Lachos, V.H., Bolfarine, H., Arellano-Valle, R.B. and Montenegro, L.C. (2007). Likelihood based inference for multivariate skew-normal regression models. Commun. Stat. Theory Methods 36, 1769–1786.
Lachos, V.H., Ghosh, P. and Arellano-Valle, R.B. (2010). Likelihood based inference for skew-normal independent linear mixed models. Stat. Sinica 20, 303–322.
Lange, K. and Sinsheimer, J.S. (1993). Normal/independent distributions and their applications in robust regression. J. Comput. Graph. Statist. 2, 175–198.
Lee, S.-Y., Lu, B. and Song, X.-Y. (2006). Assessing local influence for nonlinear structural equation models with ignorable missing data. Comput. Stat. Data Anal. 50, 1356–1377.
Lee, S.-Y. and Xu, L. (2004). Influence analyses of nonlinear mixed-effects models. Comput. Stat. Data Anal. 45, 321–341.
Lesaffre, E. and Verbeke, G. (1998). Local influence in linear mixed models. Biometrics 54, 570–582.
Lobo, C., Matos, R., Cardoso, L., Comini, L. and Pinto, G. (2015). Expanded commuting in the metropolitan region of Belo Horizonte: evidence for reverse commuting. Rev. Bras. Est. Pop. 32, 219–233.
Matos, L.A., Lachos, V.H., Lin, T.I. and Castro, L.M. (2019). Heavy-tailed longitudinal regression models for censored data: a robust parametric approach. Test 28, 844–878.
McNeil, A., Frey, R. and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques and Tools. Princeton University Press, Princeton.
Meng, X. and Rubin, D.B. (1993). Maximum likelihood estimation via the ECM algorithm: A general framework. Biometrika 80, 267–278.
Neter, J., Wasserman, W., Kutner, M. and Nachtsheim, C. (1996). Applied Linear Regression Models, 3rd edn. Richard D. Irwin, Chicago.
Powell, M.J.D. (1981). Approximation Theory and Methods. Cambridge University Press, Cambridge.
R Core Team (2019). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/.
Rao, C.R. (2002). Linear Statistical Inference and Its Applications, 2nd edn. Wiley-Interscience, New York.
Sahu, S.K., Dey, D.K. and Branco, M.D. (2003). A new class of multivariate skew distributions with applications to Bayesian regression models. Can. J. Stat. 31, 129–150.
Schwarz, G. (1978). Estimating the dimension of a model. Ann. Stat. 6, 461–464.
Seber, G.A.F. (1977). Linear Regression Analysis. Wiley, New York.
Silva, M.T. and Galvão, T.F. (2017). Uso de serviços de saúde entre adultos residentes na região metropolitana de manaus: inquérito de base populacional, 2015. Epidemiol. Serviços Saúde 26, 725–734.
Zeller, C.B., Lachos, V.H. and Vilca-Labra, F.E. (2011). Local influence analysis for regression models with scale mixtures of skew-normal distributions. J. Appl. Stat. 38, 343–368.
Zhu, H. and Lee, S. (2001). Local influence for incomplete data models. J. R. Stat. Soc. Ser. B Stat. Methodol. 63, 111–126.
Zhu, H., Lee, S., Wei, B. and Zhou, J. (2001). Case-deletion measures for models with incomplete data. Biometrika 88, 727–737.
Acknowledgements
Camila Borelli Zeller and Clécio da Silva Ferreira were supported by CNPq and FAPEMIG.
Appendices
Appendix A. E- and M-steps in the ECM Algorithm
First, we present a result on the exponential family whose corollaries allow the complete establishment of the E-step in the STN-RM and the SSN-RM. Then, we state the main weighted least squares approximation result from Powell (1981), which is useful for estimating the parameter τ in the STN-RM.
Proposition 2.
Let X be a random variable in the exponential family, that is, with pdf \(g(x|{\boldsymbol {\vartheta }})=\eta (x)\kappa ({\boldsymbol {\vartheta }})e^{{\sum }_{i=1}^{k}\omega _{i}({\boldsymbol {\vartheta }})t_{i}(x)}\) depending on 𝜗 = (𝜗1,…,𝜗m). Then,
Corollary 1.
Let \(X\sim \text {Gamma}(\alpha ,\beta )\). Then, by Proposition 2,
Corollary 2.
Let \(X{\sim }\text {TGamma}(\alpha ,\beta ;(0,1)),\) where TGamma(r,s;(a,b)) denotes the univariate Gamma distribution (Gamma(r,s)), truncated on the interval (a,b). Then, by Proposition 2,
Proposition 3.
Given \(q\in {\mathscr{B}}=\mathcal {C}([a, b];\mathbb {R})\), i.e., a continuous real function on the interval [a,b], and a weight function \( w\in \mathcal {C}([a, b];(0,\infty ))\), a function \(p^{*}\in \mathcal {A}\), where \(\mathcal {A}\) is a finite-dimensional subspace of \({\mathscr{B}}\), minimizes \({{\int \limits }_{a}^{b}} w(x)[q(x)-p(x)]^{2}dx=\left \langle q-p,q-p \right \rangle \) over \(p\in \mathcal {A}\), and is then called the best weighted least squares approximation for q within \(\mathcal {A}\), if, and only if, the approximation error e∗ = q − p∗ satisfies \(\left \langle e^{*},p \right \rangle =0 ,\ \forall p\in \mathcal {A}\).
Corollary 3.
If \(\mathcal {U}=\{p_{0},p_{1},\ldots ,p_{n}\} \) is an orthogonal basis for a subspace \(\mathcal {A}\subset {\mathscr{B}}=\mathcal {C}([a, b];\mathbb {R})\), then \(p_{n}^{*}=\sum \limits _{j=0}^{n} c_{j}p_{j}\) with \(c_{j}=\frac {\left \langle p_{j},q \right \rangle }{\left \|p_{j}\right \|^{2}}\) for each j ∈{0,1,…,n} is the best weighted least squares approximation for the function \(q\in {\mathscr{B}}\) within \(\mathcal {A}\).
Finally, we briefly describe how the above results are used in each of the considered models.
-
For the STN-RM \((U \sim Gamma(\tau /2,\tau /2)),\) we obtain the Q2 function given in (3.5), where \(\widehat {s}^{(k)}\) is defined in the expression (3.6) by applying Corollary 1. The derivative of this function with respect to ν contains the function \(e^{\eta (\nu )}=\ln {\left (\frac {\nu }{2}\right )}-{\Psi }\left (\frac {\nu }{2}\right )\). We apply a technique inspired by a method found in Powell (1981): the weighted least squares approximation of functions; see Proposition 3. This method has been used to approximate the function \(\eta (x)=\ln {\left (\ln {x}-{\Psi }(x)\right )}\) by a function of the form \(\xi (x)=c_{0}+c_{1}\ln {x}+c_{2}(\ln {x})^{2},\) with x = ν/2; see Corollary 3, considering the basis \(\mathcal {U}\) as the orthogonalization of \(\{1,\ln ,\ln ^{2}\}\). For that, we use the weight function \(w(\nu )=\frac {2}{\nu }e^{-\nu /2}\) to obtain, with an accuracy of \(10^{-4}\) in the norm constructed in Proposition 3, the approximation
$$ \eta(\nu) \approx c_{0}+c_{1}\ln{\left( \frac{\nu}{2}\right)}+c_{2}\ln{\left( \frac{\nu}{2}\right)}^{2}, $$where \( e^{c_{0}}\approx 0.5768854, c_{1}\approx -1.112673, c_{2}\approx 0.02783412 \). Thus, by replacing the new function in \(\frac {\partial Q_{2}}{\partial \nu }\) and solving its stationary equation, we find the expression (3.9).
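This approximation is easy to verify numerically. The sketch below is not part of the paper: the digamma function Ψ is approximated by a central difference of `math.lgamma`, and the evaluation points for ν are illustrative choices.

```python
import math

def digamma(x, h=1e-5):
    # Psi(x) approximated by a central difference of log-gamma
    # (accurate to roughly h**2 for moderate x).
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def eta(nu):
    # eta(nu) = ln( ln(nu/2) - Psi(nu/2) ), the function being approximated
    x = nu / 2.0
    return math.log(math.log(x) - digamma(x))

# Coefficients reported in the text (c0 is given through e^{c0} = 0.5768854).
c0 = math.log(0.5768854)
c1 = -1.112673
c2 = 0.02783412

def eta_approx(nu):
    # The fitted quadratic in ln(nu/2)
    lx = math.log(nu / 2.0)
    return c0 + c1 * lx + c2 * lx ** 2

for nu in (3.0, 4.0, 8.0, 15.0):
    print(nu, eta(nu), eta_approx(nu), abs(eta(nu) - eta_approx(nu)))
```

Since the weight \(w(\nu)=\frac{2}{\nu}e^{-\nu/2}\) downweights large ν, the pointwise error grows slowly in the tail while remaining very small near the bulk of the weight.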
-
For the SSN-RM \((U\sim Beta(\tau ,1)),\) we obtain
$$Q_{2}\left( \nu|{\boldsymbol{\widehat{\theta}}}^{(k)}\right)=\sum\limits_{i=1}^{n}Q_{2i}\left( \nu|{\boldsymbol{\widehat{\theta}}}^{(k)}\right)=n\ln{\nu}+(\nu-1)\sum\limits_{i=1}^{n}\widehat{lu_{i}}^{(k)},$$where \(\widehat {lu_{i}}^{(k)}\) is defined in the expression (3.7) by applying Corollary 2. Then a simple calculation shows that the solution of the stationary equation involving the derivative \(\frac {\partial Q_{2}}{\partial \nu }\) is (3.10).
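Explicitly, since each \(\widehat {lu_{i}}^{(k)}\) estimates \(E(\ln U_{i}|\cdot )\) with \(U_{i}\in (0,1)\), the sum below is negative and the stationary equation has the closed-form positive root

$$\frac{\partial Q_{2}}{\partial \nu}=\frac{n}{\nu}+\sum\limits_{i=1}^{n}\widehat{lu_{i}}^{(k)}=0 \quad\Longrightarrow\quad \widehat{\nu}=-\frac{n}{{\sum}_{i=1}^{n}\widehat{lu_{i}}^{(k)}}>0,$$

which should coincide with the expression referenced as (3.10).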
-
For the SCN-RM, where τ = (ν,γ)⊤ and U is a discrete variable assuming two states, such that h(u|τ) = νI{u=γ} + (1 − ν)I{u= 1}, the i th observation is drawn from one of two populations: \(\mathbf {Y}_{1} \sim ~ \text {SN}_{p}\left (\boldsymbol {\mu },\frac {\boldsymbol {\Sigma }}{\gamma },\frac {\boldsymbol {\lambda }}{\sqrt {\gamma }}\right )\) and \(\mathbf {Y}_{2} \sim ~ \text {SN}_{p}\left (\boldsymbol {\mu },\boldsymbol {\Sigma },\boldsymbol {\lambda }\right )\). The complete data can be represented as \(\mathbf {y}_{c}=\left (\mathbf {y}^{\top },\mathbf {v}_{1}^{\top },\mathbf {v}_{2}^{\top },\mathbf {t}^{\top }\right )^{\top },\) where \(\mathbf {v}_{j}=\left (v_{j1},\ldots ,v_{jn}\right )^{\top },\) with Vji = 1 if the i th observation is from the j th population and 0 otherwise, for i = 1,…,n and j = 1,2. It follows that the complete data likelihood for the i th observation is
$$ \begin{array}{@{}rcl@{}} f_{c}\left( \mathbf{y}_{i},v_{1i},v_{2i},t_{i}\right)&=&2\left[\nu\phi_{p}\left( \mathbf{Y}_{i}|\mathbf{X}_{i}\boldsymbol{\beta},\boldsymbol{\Sigma}/\gamma\right)\right]^{v_{1i}}\left[(1-\nu)\phi_{p}(\mathbf{Y}_{i}|\mathbf{X}_{i}\boldsymbol{\beta},\boldsymbol{\Sigma})\right]^{v_{2i}}\\ &&\times \phi_{1}\left( t_{i}|\boldsymbol{\lambda}^{\top}\boldsymbol{\Sigma}^{-1/2}(\mathbf{Y}_{i}-\mathbf{X}_{i}\boldsymbol{\beta}),1\right). \end{array} $$In this context, the Q2 function is given by
$$ {Q}_{2}\left( \nu,\gamma|{\boldsymbol{\widehat{\theta}}}^{(k)}\right)=\left[\ln{(\nu)}+\frac{p}{2}\ln{(\gamma)}\right]\sum\limits_{i=1}^{n}\widehat{v_{1i}}^{(k)}+\ln{(1-\nu)}\sum\limits_{i=1}^{n}\widehat{v_{2i}}^{(k)}-\frac{\gamma}{2}\sum\limits_{i=1}^{n}\widehat{v_{1i}}^{(k)}\widehat{d_{i}}^{(k)}, $$where \(\widehat {v_{1i}}^{(k)}\) and \(\widehat {v_{2i}}^{(k)}\) are defined in the expression (3.8). Then, by solving the stationary conditions of Q2 concerning ν and γ, we find the expressions given in (3.11).
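Solving these stationary conditions explicitly (using \(\widehat {v_{1i}}^{(k)}+\widehat {v_{2i}}^{(k)}=1\), since the two population indicators are complementary) gives

$$\widehat{\nu}=\frac{1}{n}\sum\limits_{i=1}^{n}\widehat{v_{1i}}^{(k)} \qquad\text{and}\qquad \widehat{\gamma}=\frac{p{\sum}_{i=1}^{n}\widehat{v_{1i}}^{(k)}}{{\sum}_{i=1}^{n}\widehat{v_{1i}}^{(k)}\widehat{d_{i}}^{(k)}},$$

which should coincide with the expressions in (3.11).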
Appendix B. Auxiliary Calculations for the Score Vector and the Information Matrix
Suppose that we have observations of n independent individuals, Y1,…, Yn, where \(\mathbf {Y}_{i}\sim {SSMN}_{p}\left (\mathbf {X}_{i}\boldsymbol {\beta },\boldsymbol {\Sigma }({\boldsymbol {\alpha }}),\boldsymbol {\lambda },\boldsymbol {\tau }\right )\) for i = 1,…,n. Then, the log-likelihood function for \(\boldsymbol {\theta }=\left (\boldsymbol {\beta }^{\top },{\boldsymbol {\alpha }}^{\top },\boldsymbol {\lambda }^{\top },\boldsymbol {\tau }^{\top }\right )^{\top }\in \mathbb {R}^{r},\) given the observed sample \(\mathbf {y}=\left (\mathbf {y}^{\top }_{1},\ldots ,\mathbf {y}^{\top }_{n}\right )^{\top },\) is of the form
$$\ell(\boldsymbol{\theta})=\sum\limits_{i=1}^{n}\ell_{i}(\boldsymbol{\theta}),$$
where \(\ell _{i}(\boldsymbol {\theta })=\ln {2}-\frac {p}{2}\ln {2\pi }-\frac {1}{2}{\Lambda }+\ln {K_{i}}+\ln {\Phi }(A_{i})\) with \(K_{i}=K_{i}(\boldsymbol {\theta })={\int \limits }^{\infty }_{0}u_{i}^{p/2}e^{-\frac {u_{i} d_{i}}{2}}h(u_{i}|\boldsymbol {\tau })du_{i}\). The score vector is given by \(\frac {\partial \ell (\boldsymbol {\theta })}{\partial \boldsymbol {\theta }}=\sum \limits _{i=1}^{n}\frac {\partial \ell _{i}(\boldsymbol {\theta })}{\partial \boldsymbol {\theta }},\) where \(\frac {\partial \ell _{i}(\boldsymbol {\theta })}{\partial \boldsymbol {\theta }}=\frac {\partial \ell _{i}}{\partial \boldsymbol {\theta }}=\left (\frac {\partial \ell _{i}}{\partial \boldsymbol {\beta }^{\top }},\frac {\partial \ell _{i}}{\partial {\boldsymbol {\alpha }}^{\top }},\frac {\partial \ell _{i}}{\partial \boldsymbol {\lambda }^{\top }},\frac {\partial \ell _{i}}{\partial \boldsymbol {\tau }^{\top }}\right )^{\top }\) with \(\frac {\partial \ell _{i}}{\partial \boldsymbol {\beta }}=-\frac {1}{2}EU_{\mathbf {y}_{i}}\frac {\partial d_{i}}{\partial \boldsymbol {\beta }}+W_{\Phi }(A_{i})\frac {\partial A_{i}}{\partial \boldsymbol {\beta }}, \frac {\partial \ell _{i}}{\partial \boldsymbol {\alpha }}=-\frac {1}{2}\frac {\partial {\Lambda }}{\partial \boldsymbol {\alpha }}-\frac {1}{2}EU_{\mathbf {y}_{i}}\frac {\partial d_{i}}{\partial \boldsymbol {\alpha }}+W_{\Phi }(A_{i})\frac {\partial A_{i}}{\partial \boldsymbol {\alpha }},\) \(\frac {\partial \ell _{i}}{\partial \boldsymbol {\lambda }}=W_{\Phi }(A_{i})\frac {\partial A_{i}}{\partial \boldsymbol {\lambda }}\quad \text {and}\) \( \quad \frac {\partial \ell _{i}}{\partial \boldsymbol {\tau }}=\frac {1}{K_{i}}\frac {\partial K_{i}}{\partial \boldsymbol {\tau }},\) such that WΦ(Ai) = ϕ1(Ai)/Φ(Ai) and \(EU_{\mathbf {y}_{i}}=E\left (U_{i}|\mathbf {Y}=\mathbf {y}_{i}\right )\). It follows that the matrix of the second derivatives concerning 𝜃 is given by
with
where \(W_{\Phi }^{\prime }(A_{i})=-W_{\Phi }(A_{i})(A_{i}+W_{\Phi }(A_{i}))\) and \(VU_{\mathbf {y}_{i}}=Var\left (U_{i}|\mathbf {Y}=\mathbf {y}_{i}\right )\). The first and second derivatives of Ki(𝜃) concerning 𝜃 are given by \(\frac {\partial K_{i}(\boldsymbol {\theta })}{\partial \boldsymbol {\theta }}=-\frac {1}{2} K_{i}(\boldsymbol {\theta }) \frac {\partial d_{i}}{\partial \boldsymbol {\theta }} EU_{\mathbf {y}_{i}}\) and \(\frac {\partial ^{2} K_{i}(\boldsymbol {\theta })}{\partial \boldsymbol {\theta } \partial \boldsymbol {\theta }^{\top }}=-\frac {K_{i}(\boldsymbol {\theta })}{2}\left [\frac {\partial ^{2} d_{i}}{\partial \boldsymbol {\theta } \partial \boldsymbol {\theta }^{\top }} EU_{\mathbf {y}_{i}}-\frac {1}{2}\frac {\partial d_{i}}{\partial \boldsymbol {\theta }}\frac {\partial d_{i}}{\partial \boldsymbol {\theta }^{\top }}EU^{2}_{\mathbf {y}_{i}} \right ]\), such that \(EU^{2}_{\mathbf {y}_{i}}=E\left ({U_{i}^{2}}|\mathbf {Y}=\mathbf {y}_{i}\right )\). The expected values \(E\left (U^{l}|\mathbf {Y}=\mathbf {y}\right )\) for the SN, STN, SSN and SCN distributions can be found in Ferreira et al. (2016); see Section 2.1. According to Harville (1997), consider α = vech(B), where Σ1/2 = B = B(α). Thus, the first and second derivatives of Λ, Ai and di can be obtained.
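For readers checking these pieces numerically, the following dependency-free sketch (not from the paper; the step size and test points are illustrative) verifies that \(W_{\Phi}(A)=\phi_{1}(A)/\Phi(A)\) is the derivative of \(\ln\Phi(A)\) and that \(W_{\Phi}^{\prime}(A)=-W_{\Phi}(A)(A+W_{\Phi}(A))\):

```python
import math

def Phi(a):
    # Standard normal cdf via the error function
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0)))

def phi(a):
    # Standard normal pdf
    return math.exp(-0.5 * a * a) / math.sqrt(2.0 * math.pi)

def W(a):
    # W_Phi(A) = phi_1(A) / Phi(A)
    return phi(a) / Phi(a)

h = 1e-6
for a in (-1.5, 0.0, 0.7, 2.0):
    # d/dA ln Phi(A) should equal W_Phi(A)
    d_lnPhi = (math.log(Phi(a + h)) - math.log(Phi(a - h))) / (2 * h)
    # d/dA W_Phi(A) should equal -W_Phi(A) * (A + W_Phi(A))
    d_W = (W(a + h) - W(a - h)) / (2 * h)
    print(a, d_lnPhi - W(a), d_W + W(a) * (a + W(a)))
```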
-
Λ
$$ \begin{array}{@{}rcl@{}} \frac{\partial{\Lambda}}{\partial\boldsymbol{\theta}}&=&\left( \mathbf{0},\frac{\partial{\Lambda}}{\partial{\boldsymbol{\alpha}}^{\top}},\mathbf{0},\mathbf{0}\right)^{\top}, \frac{\partial{\Lambda}}{\partial\boldsymbol{\alpha}}= \left[\frac{\partial{\Lambda}}{\partial\alpha_{j}}\right]_{j\in\{1,\ldots,p_{0}\}}=\left[2\text{tr}\left( \mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\right)\right]_{j\in\{1,\ldots,p_{0}\}},\\ \frac{\partial^{2}{\Lambda}}{\partial\boldsymbol{\theta}\partial\boldsymbol{\theta}^{\top}}&=&\begin{bmatrix} \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \frac{\partial^{2}{\Lambda}}{\partial{\boldsymbol{\alpha}}\partial{\boldsymbol{\alpha}}^{\top}} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{0} & \mathbf{0} & \mathbf{0} \end{bmatrix},\\ \frac{\partial^{2}{\Lambda}}{\partial\boldsymbol{\alpha}\partial\boldsymbol{\alpha}^{\top}}&=&\left[\frac{\partial^{2}{\Lambda}}{\partial\alpha_{j}\partial\alpha_{k}}\right]_{j,k\in\{1,\ldots,p_{0}\}}=\left[-2\text{tr}\left( \mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\right)\right]_{j,k\in\{1,\ldots,p_{0}\}}, \end{array} $$where \(\dot {\mathbf {B}}_{s}=\partial {\mathbf {B}(\boldsymbol {\alpha })}/\partial {\alpha _{s}}\) with s = 1,2,…,p0 = p(p + 1)/2 and ei = yi −Xiβ.
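As a sanity check on the first-derivative expression, the sketch below (not from the paper; the 2×2 choice of B and the step size are illustrative) compares \(2\,\text{tr}(\mathbf{B}^{-1}\dot{\mathbf{B}}_{j})\) with a finite difference of \({\Lambda}(\boldsymbol{\alpha})=\ln|\boldsymbol{\Sigma}|=2\ln|\mathbf{B}(\boldsymbol{\alpha})|\):

```python
import math

# Minimal 2x2 helpers so the check stays dependency-free.
def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inv2(M):
    d = det2(M)
    return [[M[1][1] / d, -M[0][1] / d], [-M[1][0] / d, M[0][0] / d]]

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def trace2(M):
    return M[0][0] + M[1][1]

def B_of_alpha(a):
    # alpha = vech(B) for a symmetric 2x2 factor B = Sigma^{1/2}
    return [[a[0], a[1]], [a[1], a[2]]]

def Lambda(a):
    # Lambda = ln|Sigma| = 2 ln|B|, since Sigma = B^2
    return 2.0 * math.log(det2(B_of_alpha(a)))

alpha = [2.0, 0.3, 1.5]                 # illustrative vech(B), B positive definite
Bdot = [[[1.0, 0.0], [0.0, 0.0]],
        [[0.0, 1.0], [1.0, 0.0]],
        [[0.0, 0.0], [0.0, 1.0]]]       # dB/d(alpha_j), j = 1, 2, 3
Binv = inv2(B_of_alpha(alpha))

h = 1e-6
for j in range(3):
    analytic = 2.0 * trace2(matmul2(Binv, Bdot[j]))
    ap = [alpha[i] + (h if i == j else 0.0) for i in range(3)]
    am = [alpha[i] - (h if i == j else 0.0) for i in range(3)]
    print(j, analytic, (Lambda(ap) - Lambda(am)) / (2 * h))
```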
-
Ai
$$ \begin{array}{@{}rcl@{}} \frac{\partial A_{i}}{\partial\boldsymbol{\beta}}&=&-\mathbf{X}_{i}^{\top}\mathbf{B}^{-1}\boldsymbol{\lambda}, \frac{\partial A_{i}}{\partial\alpha_{j}}=-\boldsymbol{\lambda}^{\top}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}\left( \mathbf{Y}_{i}-\mathbf{X}_{i}\boldsymbol{\beta}\right), \frac{\partial A_{i}}{\partial\boldsymbol{\lambda}}=\mathbf{B}^{-1}\mathbf{e}_{i}, \frac{\partial A_{i}}{\partial\boldsymbol{\tau}}=\mathbf{0},\\ \frac{\partial^{2}A_{i}}{\partial\boldsymbol{\beta}\partial\boldsymbol{\beta}^{\top}}&=&\mathbf{0}, \frac{\partial^{2}A_{i}}{\partial\boldsymbol{\beta}\partial\alpha_{j}}=\mathbf{X}_{i}^{\top}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}\boldsymbol{\lambda}, \frac{\partial^{2}A_{i}}{\partial\boldsymbol{\beta}\partial\boldsymbol{\lambda}^{\top}}=-\mathbf{X}_{i}^{\top}\mathbf{B}^{-1},\quad \frac{\partial^{2}A_{i}}{\partial\boldsymbol{\theta}\partial\boldsymbol{\tau}^{\top}}=\mathbf{0},\\ \frac{\partial^{2}A_{i}}{\partial\alpha_{j}\partial\alpha_{k}}&=&\boldsymbol{\lambda}^{\top}\mathbf{B}^{-1}\left( \dot{\mathbf{B}}_{k}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}+\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\right)\mathbf{B}^{-1}\mathbf{e}_{i},\\ \frac{\partial^{2}A_{i}}{\partial\alpha_{j}\partial\boldsymbol{\lambda}^{\top}}&=&-\mathbf{e}_{i}^{\top}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}, \frac{\partial^{2}A_{i}}{\partial\boldsymbol{\lambda}\partial\boldsymbol{\lambda}^{\top}}=\mathbf{0}. \end{array} $$ -
di
$$ \begin{array}{@{}rcl@{}} \frac{\partial d_{i}}{\partial\boldsymbol{\beta}}&=&-2\mathbf{X}_{i}^{\top}\mathbf{B}^{-2}\mathbf{e}_{i}, \frac{\partial d_{i}}{\partial\alpha_{j}}=-\mathbf{e}_{i}^{\top}\mathbf{B}^{-1}\left( \dot{\mathbf{B}}_{j}\mathbf{B}^{-1}+\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\right)\mathbf{B}^{-1}\mathbf{e}_{i}, \frac{\partial d_{i}}{\partial\boldsymbol{\lambda}}=\mathbf{0}, \frac{\partial d_{i}}{\partial\boldsymbol{\tau}}=\mathbf{0},\\ \frac{\partial^{2}d_{i}}{\partial\boldsymbol{\beta}\partial\boldsymbol{\beta}^{\top}}&=&2\mathbf{X}_{i}^{\top}\mathbf{B}^{-2}\mathbf{X}_{i}, \frac{\partial^{2}d_{i}}{\partial\boldsymbol{\beta}\partial\alpha_{j}}=2\mathbf{X}_{i}^{\top}\mathbf{B}^{-1}\left( \dot{\mathbf{B}}_{j}\mathbf{B}^{-1}+\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\right)\mathbf{B}^{-1}\mathbf{e}_{i},\\ \frac{\partial^{2}d_{i}}{\partial\boldsymbol{\theta}\partial\boldsymbol{\lambda}^{\top}}&=&\mathbf{0}, \frac{\partial^{2}d_{i}}{\partial\boldsymbol{\theta}\partial\boldsymbol{\tau}^{\top}}=\mathbf{0},\\ \frac{\partial^{2}d_{i}}{\partial\alpha_{j}\partial\alpha_{k}}&=&\mathbf{e}_{i}^{\top}\mathbf{B}^{-1}\left( \dot{\mathbf{B}}_{k}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}+\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\mathbf{B}^{-1}+\dot{\mathbf{B}}_{j}\mathbf{B}^{-2}\dot{\mathbf{B}}_{k}+\dot{\mathbf{B}}_{k}\mathbf{B}^{-2}\dot{\mathbf{B}}_{j}\right.\\ &&\left.+\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}+\mathbf{B}^{-1}\dot{\mathbf{B}}_{j}\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\right)\mathbf{B}^{-1}\mathbf{e}_{i}, \end{array} $$
and the derivatives of ℓi can be obtained for the STN-RM, the SSN-RM and the SCN-RM as follows.
-
STN-RM
$$ \begin{array}{@{}rcl@{}} \frac{\partial \ell_{i}}{\partial\tau}&=&\frac{1}{2}\left[\ln{\left( \frac{\tau}{\tau+d_{i}}\right)}+{\Psi}\left( \frac{\tau+p}{2}\right)-{\Psi}\left( \frac{\tau}{2}\right)+\frac{d_{i}-p}{\tau+d_{i}}\right],\\\frac{\partial^{2}\ell_{i}}{\partial\boldsymbol{\theta}_{1}\partial\tau}&=&\frac{1}{2}\frac{p-d_{i}}{(\tau+d_{i})^{2}}\frac{\partial d_{i}}{\partial\boldsymbol{\theta}_{1}}\quad\text{and}\\\frac{\partial^{2}\ell_{i}}{\partial\tau^{2}}&=&\frac{1}{4}\left[{\Psi}_{1}\left( \frac{\tau+p}{2}\right)-{\Psi}_{1}\left( \frac{\tau}{2}\right)+\frac{2(d_{i}+p\tau)}{\tau(\tau+d_{i})^{2}}\right], \end{array} $$where Ψ1(.) is the trigamma function.
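These expressions can be checked numerically: for the Gamma(τ/2,τ/2) mixing density, the integral \(K_{i}\) defined above evaluates in closed form, and \(\partial \ell _{i}/\partial \tau =\partial \ln K_{i}/\partial \tau \). The sketch below is an assumption-laden check, not from the paper: p and \(d_{i}\) are illustrative values, and the digamma function is approximated by a finite difference of `math.lgamma`.

```python
import math

def digamma(x, h=1e-5):
    # Psi(x) via a central difference of log-gamma
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2 * h)

p, d = 3, 2.7  # illustrative dimension and Mahalanobis-type distance d_i

def lnK(tau):
    # For Gamma(tau/2, tau/2) mixing, integrating u analytically gives
    # K = (tau/2)^{tau/2} / Gamma(tau/2) * Gamma((tau+p)/2) / ((tau+d)/2)^{(tau+p)/2}
    return ((tau / 2) * math.log(tau / 2) - math.lgamma(tau / 2)
            + math.lgamma((tau + p) / 2) - ((tau + p) / 2) * math.log((tau + d) / 2))

def score_tau(tau):
    # Closed form displayed in the text
    return 0.5 * (math.log(tau / (tau + d)) + digamma((tau + p) / 2)
                  - digamma(tau / 2) + (d - p) / (tau + d))

h = 1e-6
for tau in (2.0, 5.0, 11.0):
    numeric = (lnK(tau + h) - lnK(tau - h)) / (2 * h)
    print(tau, numeric, score_tau(tau))
```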
-
SSN-RM
$$ \begin{array}{@{}rcl@{}} \frac{\partial \ell_{i}}{\partial\tau}&=&\frac{1}{\tau}+\frac{I(1,1)}{I(1,0)}, \frac{\partial^{2}\ell_{i}}{\partial\boldsymbol{\theta}_{1}\partial\tau}=-\frac{1}{2}\left[\frac{I(0,1)}{I(1,0)}-\frac{I(1,1)I(0,0)}{I(1,0)^{2}}\right]\frac{\partial d_{i}}{\partial\boldsymbol{\theta}_{1}} \text{and} \\ \frac{\partial^{2}\ell_{i}}{\partial\tau^{2}}&=&\frac{I(1,2)}{I(1,0)}-\left[\frac{I(1,1)}{I(1,0)}\right]^{2}-\frac{1}{\tau^{2}}, \end{array} $$where \(I(a,b)={{\int \limits }_{0}^{1}}\ u^{\nu +p/2-a}(\ln {u})^{b} e^{-ud_{i}/2}du\).
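The integrals I(a,b) have no closed form in general, but they are one-dimensional and smooth on (0,1], so simple quadrature suffices in practice. A sketch (not from the paper; the grid size and parameter values are illustrative), with a sanity check against the exact value when \(d_{i}=0\):

```python
import math

def I(a, b, nu, p, d, n=20000):
    # I(a, b) = int_0^1 u^{nu + p/2 - a} (ln u)^b e^{-u d / 2} du,
    # by the composite trapezoid rule on (0, 1]; the integrand
    # vanishes as u -> 0 whenever nu + p/2 - a > 0.
    s = 0.0
    h = 1.0 / n
    for k in range(1, n + 1):
        u = k * h
        f = u ** (nu + p / 2 - a) * math.log(u) ** b * math.exp(-u * d / 2)
        s += f if k < n else f / 2  # half weight at the right endpoint
    return s * h

# Sanity checks with d = 0: the integral reduces to
#   b = 0: 1 / (nu + p/2 - a + 1),  b = 1: -1 / (nu + p/2 - a + 1)^2
print(I(1, 0, nu=2.0, p=3, d=0.0), 1.0 / 3.5)
print(I(1, 1, nu=2.0, p=3, d=0.0), -1.0 / 3.5 ** 2)
```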
-
SCN-RM, where τ = (ν,γ)⊤
$$ \begin{array}{@{}rcl@{}} \frac{\partial\ell_{i}}{\partial\nu}&=&\frac{\gamma^{p/2}e^{(1-\gamma) d_{i}/2}-1}{1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}},\quad \frac{\partial\ell_{i}}{\partial\gamma}=\frac{\nu}{2}\frac{(p-\gamma d_{i})\gamma^{p/2-1}e^{(1-\gamma) d_{i}/2}}{1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}},\\ \frac{\partial^{2}\ell_{i}}{\partial\boldsymbol{\theta}_{1}\partial\nu}&=&\frac{1}{2}\frac{(1-\gamma)\gamma^{p/2}e^{(1-\gamma)d_{i}/2}}{\left[1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}\right]^{2}}\frac{\partial d_{i}}{\partial\boldsymbol{\theta}_{1}},\\ \frac{\partial^{2}\ell_{i}}{\partial\boldsymbol{\theta}_{1}\partial\gamma}&=&\frac{\nu}{4}\frac{(1-\nu)\gamma^{p/2-1}[p-\gamma(2+p+d_{i})+\gamma^{2}d_{i}]e^{(1-\gamma)d_{i}/2}-2\nu\gamma^{p} e^{(1-\gamma) d_{i}}}{\left[1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}\right]^{2}}\frac{\partial d_{i}}{\partial\boldsymbol{\theta}_{1}},\\ \frac{\partial^{2}\ell_{i}}{\partial\nu^{2}}&=&-\frac{\left[\gamma^{p/2}e^{(1-\gamma) d_{i}/2}-1\right]^{2}}{\left[1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}\right]^{2}},\quad \frac{\partial^{2}\ell_{i}}{\partial\nu\partial\gamma}=\frac{1}{2}\frac{(p-\gamma d_{i})\gamma^{p/2-1}e^{(1-\gamma) d_{i}/2}}{\left[1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}\right]^{2}},\\ \frac{\partial^{2}\ell_{i}}{\partial\gamma^{2}}&=&\frac{\nu}{4}\frac{(1-\nu)\gamma^{p/2-2}[(p-\gamma d_{i})^{2}-2p]e^{(1-\gamma)d_{i}/2}-2p\nu\gamma^{p-2}e^{(1-\gamma) d_{i}}}{\left[1-\nu+\nu\gamma^{p/2}e^{(1-\gamma)d_{i}/2}\right]^{2}}. \end{array} $$
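The ν-score can be verified directly from \(K_{i}=\nu \gamma ^{p/2}e^{-\gamma d_{i}/2}+(1-\nu )e^{-d_{i}/2}\) (the mixing distribution has only the two atoms γ and 1), since \(\partial \ell _{i}/\partial \nu =\partial \ln K_{i}/\partial \nu \). A small numeric sketch with illustrative p and \(d_{i}\) (not from the paper):

```python
import math

p, d = 3, 2.1  # illustrative dimension and Mahalanobis-type distance d_i

def lnK(nu, gamma):
    # ln K_i for the two-point SCN mixing distribution:
    # K = nu * gamma^{p/2} e^{-gamma d/2} + (1 - nu) e^{-d/2}
    return math.log(nu * gamma ** (p / 2) * math.exp(-gamma * d / 2)
                    + (1 - nu) * math.exp(-d / 2))

def score_nu(nu, gamma):
    # Closed form displayed in the text (numerator and denominator
    # multiplied through by e^{d/2})
    E = gamma ** (p / 2) * math.exp((1 - gamma) * d / 2)
    return (E - 1) / (1 - nu + nu * E)

h = 1e-7
for nu, gamma in ((0.3, 0.5), (0.7, 0.2)):
    numeric = (lnK(nu + h, gamma) - lnK(nu - h, gamma)) / (2 * h)
    print(nu, gamma, numeric, score_nu(nu, gamma))
```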
Appendix C. Auxiliary Calculations for the Influence Measures
To obtain the diagnostic measures, it is necessary to compute \(\frac {\partial ^{2}Q}{\partial \boldsymbol {\theta }\partial \boldsymbol {\theta }^{\top }}\left (\!{\boldsymbol {\theta }}|{\boldsymbol {\widehat {\theta }}}\right )\) \(={\sum }_{i=1}^{n} \frac {\partial ^{2}Q_{i}}{\partial \boldsymbol {\theta }\partial \boldsymbol {\theta }^{\top }}\left ({\boldsymbol {\theta }}|{\boldsymbol {\widehat {\theta }}}\right )\). Hence, the Hessian matrix has elements given by
where \(A_{i}=A_{i}(\boldsymbol {\theta })=\boldsymbol {\lambda }^{\top }\boldsymbol {\Sigma }^{-1/2}\left (\mathbf {y}_{i}-\mathbf {X}_{i}\boldsymbol {\beta }\right )\). The terms \(\frac {\partial ^{2} Q_{i}}{\partial \boldsymbol {\tau }\partial \boldsymbol {\tau }^{\top }}\) can be readily evaluated for the STN-RM, the SSN-RM and the SCN-RM as follows.
-
STN-RM
$$ \frac{\partial^{2}Q_{i}}{\partial\tau^{2}}=\frac{1}{4}\left[\frac{2}{\tau}-{\varPsi}_{1}\left( \frac{\tau}{2}\right)\right],$$where Ψ1(⋅) is the trigamma function.
-
SSN-RM
$$\frac{\partial^{2}Q_{i}}{\partial\tau^{2}}=-\frac{1}{\tau^{2}}.$$ -
SCN-RM, where τ = (ν,γ)⊤
$$\frac{\partial^{2}Q_{i}}{\partial\nu^{2}}=-\left[\frac{\widehat{v_{1i}}}{\nu^{2}}+\frac{\widehat{v_{2i}}}{(1-\nu)^{2}}\right],\quad \frac{\partial^{2}Q_{i}}{\partial\nu\partial\gamma}=0 \quad\text{and} \quad \frac{\partial^{2}Q_{i}}{\partial\gamma^{2}}=-\frac{p\widehat{v_{1i}}}{2\gamma^{2}}.$$
In addition, the derivatives of Λ, di and Ai are given in Appendix B.
-
(i)
Case weight perturbation
$$ \begin{array}{@{}rcl@{}} \frac{\partial Q_{i}}{\partial\boldsymbol{\beta}}&=&-\frac{1}{2}\widehat{u_{i}}\frac{\partial d_{i}}{\partial\boldsymbol{\beta}}+\left( \widehat{t_{i}}-A_{i}\right)\frac{\partial A_{i}}{\partial\boldsymbol{\beta}},\quad \frac{\partial Q_{i}}{\partial\boldsymbol{\alpha}}=-\frac{1}{2}\frac{\partial{\Lambda}}{\partial\boldsymbol{\alpha}}-\frac{1}{2}\widehat{u_{i}}\frac{\partial d_{i}}{\partial\boldsymbol{\alpha}}+\left( \widehat{t_{i}}-A_{i}\right)\frac{\partial A_{i}}{\partial\boldsymbol{\alpha}},\\ \frac{\partial Q_{i}}{\partial\boldsymbol{\lambda}}&=&\left( \widehat{t_{i}}-A_{i}\right)\frac{\partial A_{i}}{\partial\boldsymbol{\lambda}} \quad \text{and} \quad \frac{\partial Q_{i}}{\partial\boldsymbol{\tau}} \end{array} $$
can be readily evaluated for the STN-RM, the SSN-RM and the SCN-RM as follows.
-
STN-RM
$$ \begin{array}{@{}rcl@{}} \frac{\partial Q_{i}}{\partial\tau}=\frac{1}{2}\left[\ln{\left( \frac{\tau}{2}\right)}-{\Psi}\left( \frac{\tau}{2}\right)+1-\widehat{u_{i}}+\widehat{lu_{i}}\right].\end{array} $$ -
SSN-RM
$$ \begin{array}{@{}rcl@{}} \frac{\partial Q_{i}}{\partial\tau}=\frac{1}{\tau}+\widehat{lu_{i}}. \end{array} $$ -
SCN-RM, where τ = (ν,γ)⊤
$$ \begin{array}{@{}rcl@{}}\frac{\partial Q_{i}}{\partial\nu}=\frac{\widehat{v_{1i}}}{\nu}-\frac{\widehat{v_{2i}}}{1-\nu}\quad \text{and} \quad \frac{\partial Q_{i}}{\partial\gamma}=\frac{1}{2\gamma}\widehat{v_{1i}}(p-\gamma d_{i}). \end{array} $$
-
(ii)
Response perturbation
$$ \begin{array}{@{}rcl@{}} \frac{\partial A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\omega}^{\top}}&=&\boldsymbol{\lambda}^{\top}\mathbf{B}^{-1}\mathbf{D}\mathbf{S}_{\mathbf{y}}\mathbf{c}_{i}^{\top}, \qquad \frac{\partial^{2}A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\lambda}\partial\boldsymbol{\omega}^{\top}}=\mathbf{B}^{-1}\mathbf{D}\mathbf{S}_{\mathbf{y}}\mathbf{c}_{i}^{\top},\\ \frac{\partial^{2}d_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\beta}\partial\boldsymbol{\omega}^{\top}}&=&-2\mathbf{X}_{i}^{\top}\mathbf{B}^{-2}\mathbf{D}\mathbf{S}_{\mathbf{y}}\mathbf{c}_{i}^{\top}, \qquad \frac{\partial^{2}A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\alpha_{k}\partial\boldsymbol{\omega}^{\top}}=-\boldsymbol{\lambda}^{\top}\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\mathbf{B}^{-1}\mathbf{D}\mathbf{S}_{\mathbf{y}}\mathbf{c}_{i}^{\top},\\ \frac{\partial^{2}d_{i}^{\boldsymbol{\omega}_{0}}}{\partial\alpha_{k}\partial\boldsymbol{\omega}^{\top}}&=&-2\mathbf{e}_{i}^{\top}\mathbf{B}^{-1}\left( \dot{\mathbf{B}}_{k}\mathbf{B}^{-1}+\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\right)\mathbf{B}^{-1}\mathbf{D}\mathbf{S}_{\mathbf{y}}\mathbf{c}_{i}^{\top}. \end{array} $$
-
(iii)
Explanatory perturbation
$$ \begin{array}{@{}rcl@{}} \frac{\partial A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\omega}^{\top}}&=&-\boldsymbol{\lambda}^{\top}\mathbf{B}^{-1}\mathbf{E}\mathbf{M}_{\mathbf{x}}\boldsymbol{\beta}\mathbf{c}_{i}^{\top}, \qquad \frac{\partial^{2}A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\beta}\partial\boldsymbol{\omega}^{\top}}=-\mathbf{E}\mathbf{M}_{\mathbf{x}}^{\top}\mathbf{B}^{-1}\boldsymbol{\lambda}\mathbf{c}_{i}^{\top},\\ \frac{\partial^{2}A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\alpha_{k}\partial\boldsymbol{\omega}^{\top}}&=&\boldsymbol{\lambda}^{\top}\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\mathbf{B}^{-1}\mathbf{E}\mathbf{M}_{\mathbf{x}}\boldsymbol{\beta}\mathbf{c}_{i}^{\top}, \qquad \frac{\partial^{2}A_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\lambda}\partial\boldsymbol{\omega}^{\top}}=-\mathbf{B}^{-1}\mathbf{E}\mathbf{M}_{\mathbf{x}}\boldsymbol{\beta}\mathbf{c}_{i}^{\top},\\ \frac{\partial^{2}d_{i}^{\boldsymbol{\omega}_{0}}}{\partial\boldsymbol{\beta}\partial\boldsymbol{\omega}^{\top}}&=&-2\left( \mathbf{E}\mathbf{M}_{\mathbf{x}}^{\top}\mathbf{B}^{-2}\mathbf{e}_{i}-\mathbf{X}_{i}^{\top}\mathbf{B}^{-2}\mathbf{E}\mathbf{M}_{\mathbf{x}}\boldsymbol{\beta}\right)\mathbf{c}_{i}^{\top},\\ \frac{\partial^{2}d_{i}^{\boldsymbol{\omega}_{0}}}{\partial\alpha_{k}\partial\boldsymbol{\omega}^{\top}}&=&2\mathbf{e}_{i}^{\top}\mathbf{B}^{-1}\left( \dot{\mathbf{B}}_{k}\mathbf{B}^{-1}+\mathbf{B}^{-1}\dot{\mathbf{B}}_{k}\right)\mathbf{B}^{-1}\mathbf{E}\mathbf{M}_{\mathbf{x}}\boldsymbol{\beta}\mathbf{c}_{i}^{\top}. \end{array} $$
Louredo, G.M.S., Zeller, C.B. & Ferreira, C.S. Estimation and Influence Diagnostics for the Multivariate Linear Regression Models with Skew Scale Mixtures of Normal Distributions. Sankhya B 84, 204–242 (2022). https://doi.org/10.1007/s13571-021-00257-y
Keywords
- Skew scale mixtures of normal distributions
- multivariate linear regression
- ECM algorithm
- global and local influence