
Spike and slab Bayesian sparse principal component analysis


Abstract

Sparse principal component analysis (SPCA) is a popular tool for dimensionality reduction in high-dimensional data. However, there is still a lack of theoretically justified Bayesian SPCA methods that also scale well computationally. A major challenge in Bayesian SPCA is choosing an appropriate prior for the loadings matrix, since the principal components are mutually orthogonal. We propose a novel parameter-expanded coordinate ascent variational inference (PX-CAVI) algorithm. The algorithm uses a spike and slab prior and employs parameter expansion to cope with the orthogonality constraint. Besides comparing with two popular SPCA approaches, we introduce the PX-EM algorithm, an EM analogue of PX-CAVI, as an additional benchmark. Extensive numerical simulations show that the PX-CAVI algorithm outperforms these alternatives. We also study the posterior contraction rate of the variational posterior, a novel contribution to the existing literature. The PX-CAVI algorithm is then applied to study a lung cancer gene expression dataset. The \(\textsf{R}\) package \(\textsf{VBsparsePCA}\), which implements the algorithm, is available on the Comprehensive R Archive Network (CRAN).
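For readers who wish to try the method, the sketch below simulates a jointly row-sparse spiked-covariance dataset and fits it with the CRAN package \(\textsf{VBsparsePCA}\). It assumes the package exposes a main routine named VBsparsePCA() taking the data matrix and the target rank; consult the package manual for the exact argument names and the structure of the returned object.

```r
## Minimal usage sketch. The interface of VBsparsePCA() (argument names `dat`
## and `r`, and the contents of the returned list) is assumed here; check the
## CRAN documentation before use.
# install.packages("VBsparsePCA")
library(VBsparsePCA)

set.seed(1)
n <- 100; p <- 200; r <- 2
U <- matrix(0, p, r)                         # loadings: only the first 20 rows are nonzero
U[1:20, ] <- matrix(rnorm(20 * r), 20, r)
U <- qr.Q(qr(U))                             # orthonormalise (the sparsity pattern is preserved)
X <- matrix(rnorm(n * r), n, r) %*% t(U) * 3 + matrix(rnorm(n * p), n, p)

fit <- VBsparsePCA(dat = X, r = r)           # PX-CAVI under the spike-and-slab prior
str(fit)                                     # inspect estimated loadings and inclusion probabilities
```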



Acknowledgements

We would like to warmly thank Drs. Ryan Martin and Botond Szabó for their helpful suggestions on an early version of this paper. Bo Ning gratefully acknowledges the funding support provided by NASA XRP 80NSSC18K0443. The authors would like to thank two anonymous reviewers and the Editors for their very constructive comments and efforts on this lengthy work, which greatly improved the quality of this paper.

Funding

The research of Ning was partially supported by NIH grant 1R21AI180492-01 and the Individual Research Grant at Texas A&M University.

Author information

Corresponding author

Correspondence to Ning Ning.

Ethics declarations

Conflict of interest

Ning Ning serves as an Associate Editor for Statistics and Computing.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 601 KB)

Appendix A: Derivation of (12)–(18)

First, we need the following result:

$$\begin{aligned}&\mathbb {E}_{w|\Theta ^{(t)}} \left[ \frac{1}{2\sigma ^2} \sum _{i=1}^n (X_{ij} - {{\widetilde{\beta }}}_j w_i)^2 \right] \nonumber \\&= \frac{1}{2\sigma ^2} \sum _{i=1}^n \left( X_{ij}^2 - 2X_{ij} {{\widetilde{\beta }}}_j {{\widetilde{\omega }}}_i + {{\widetilde{\beta }}}_j H_i {{\widetilde{\beta }}}_j' \right) , \end{aligned}$$
(36)

where \(H_i = {{\widetilde{\omega }}}_i{{\widetilde{\omega }}}_i' + {\widetilde{V}}_w\) and the expressions of \({{\widetilde{\omega }}}_i\) and \({\widetilde{V}}_w\) are given in (10).
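Identity (36) uses only the first two moments of \(w_i\): for any distribution with mean \({{\widetilde{\omega }}}_i\) and covariance \({\widetilde{V}}_w\), \(\mathbb {E}(X_{ij} - {{\widetilde{\beta }}}_j w_i)^2 = X_{ij}^2 - 2X_{ij}{{\widetilde{\beta }}}_j{{\widetilde{\omega }}}_i + {{\widetilde{\beta }}}_j H_i {{\widetilde{\beta }}}_j'\). The short \(\textsf{R}\) check below verifies this by Monte Carlo with a Gaussian draw for \(w_i\); all numerical values are arbitrary and purely illustrative.

```r
## Monte Carlo check of identity (36) for a single entry X_ij.
## E[(x - beta' w)^2] = x^2 - 2 x beta' wbar + beta' H beta,  with H = wbar wbar' + Vw.
set.seed(2)
r    <- 3
beta <- rnorm(r)                          # j-th row of the loadings matrix
wbar <- rnorm(r)                          # mean of w_i
A    <- matrix(rnorm(r * r), r, r)
Vw   <- crossprod(A) / r                  # covariance of w_i (positive definite)
x    <- 1.5                               # the observed entry X_ij

H   <- tcrossprod(wbar) + Vw
rhs <- x^2 - 2 * x * sum(beta * wbar) + drop(t(beta) %*% H %*% beta)

w.draws <- MASS::mvrnorm(1e5, mu = wbar, Sigma = Vw)   # requires MASS (shipped with R)
lhs <- mean((x - w.draws %*% beta)^2)
c(monte.carlo = lhs, closed.form = rhs)   # the two numbers agree up to simulation error
```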

Since the ELBO is a summation of \(p\) terms, we solve for \(u_j\) and \(M_j\) separately for each \(j\). As the posterior conditional on \(\gamma _j = 0\) is the point mass (Dirac measure) at zero, we only need to consider the case \(\gamma _j = 1\). This leads to minimizing the function

$$\begin{aligned}&\mathbb {E}_{{\widetilde{u}}_j, {\widetilde{M}}_j, z_j|\gamma _j = 1}\Bigg [ \frac{1}{2\sigma ^2} \sum _{i=1}^n \left( - 2X_{ij} {{\widetilde{\beta }}}_j {{\widetilde{\omega }}}_i + {{\widetilde{\beta }}}_j H_i {{\widetilde{\beta }}}_j' \right) \\&\qquad \qquad \qquad + \log \frac{N({\widetilde{u}}_j, \sigma ^2 {\widetilde{M}}_j)}{\kappa _j^\circ g( {{\widetilde{\beta }}}_j|\lambda _1)} \Bigg ]\\ {}&\quad = C - \frac{1}{\sigma ^2} \sum _{i=1}^n X_{ij} {\widetilde{u}}_j {{\widetilde{\omega }}}_i \\&\hspace{1cm}+ \frac{1}{2\sigma ^2} \sum _{i=1}^n \left( {\widetilde{u}}_j H_i {\widetilde{u}}_j' + {{\,\text {Tr}\,}}\left( \sigma ^2{\widetilde{M}}_jH_i\right) \right) \\ {}&\hspace{1cm} + \lambda _1 \sum _{k=1}^r f({\widetilde{u}}_{jk}, {\widetilde{M}}_{j,kk}), \end{aligned}$$

where \(\kappa _j^\circ = \int \pi (\gamma _j|\kappa ) d\Pi (\kappa )\). Then we take derivatives with respect to \({\widetilde{u}}_j\) and \({\widetilde{M}}_j\) and set them to zero to obtain (12) and (13). The solutions in (14) are obtained by replacing \(\lambda _1 \sum _{k=1}^r f({\widetilde{u}}_{jk}, {\widetilde{M}}_{j,kk})\) in the last display with \(\frac{\lambda _1}{2\sigma ^2} \left( {\widetilde{u}}_j {\widetilde{u}}_j' + \sigma ^2 {{\,\textrm{Tr}\,}}(M_j)\right) \).
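To make this concrete, the following sketch implements the ridge-type update obtained by setting the gradient of the displayed objective to zero in the Gaussian-slab case corresponding to (14): \({\widetilde{u}}_j = (\sum _i H_i + \lambda _1 I_r)^{-1}\sum _i X_{ij}{{\widetilde{\omega }}}_i\) and \({\widetilde{M}}_j = (\sum _i H_i + \lambda _1 I_r)^{-1}\). This is an illustrative re-derivation from the display above; the parameterization of (12)–(14) in the main text may differ, so treat the function below as a sketch rather than the package implementation.

```r
## Coordinate update implied by the objective above in the Gaussian-slab case
## of (14); the gradients in u_j and M_j are set to zero. Illustrative only.
update_uj_Mj <- function(Xj, wbar, Vw, lambda1) {
  # Xj:      length-n vector, the j-th column of X
  # wbar:    n x r matrix whose i-th row is the mean of w_i
  # Vw:      r x r covariance shared across the w_i
  # lambda1: slab tuning parameter
  n <- length(Xj); r <- ncol(wbar)
  sumH <- crossprod(wbar) + n * Vw              # sum_i H_i = sum_i wbar_i wbar_i' + n * Vw
  Ainv <- solve(sumH + lambda1 * diag(r))
  list(u = drop(Ainv %*% crossprod(wbar, Xj)),  # (sum_i H_i + lambda1 I)^{-1} sum_i X_ij wbar_i
       M = Ainv)                                # posterior covariance of beta_j is sigma^2 * M
}
```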

To derive (15), we have

$$\begin{aligned}&\mathbb {E}_P \left( \mathbb {E}_{w|\Theta ^{(t)}} \pi ({{\widetilde{\beta }}}_j, w, X) - \log q({{\widetilde{\beta }}}_j) \right) \nonumber \\&= C + \mathbb {E}_{{{\widetilde{\mu }}}_j, {\widetilde{M}}_j, z_j} \Bigg [ \frac{1}{2\sigma ^2} \sum _{i=1}^n \left( -2X_{ij} {{\widetilde{\beta }}}_j {{\widetilde{\omega }}}_i + {{\widetilde{\beta }}}_j H_i {{\widetilde{\beta }}}_j' \right) \nonumber \\&\quad + \mathbb {1}_{\{\gamma _j = 0\}}\log \frac{1 - z_j}{1-\kappa _j^\circ } \nonumber \\&\quad + \mathbb {1}_{\{\gamma _j = 1\}}\log \frac{z_j N({{\widetilde{\mu }}}_j, \sigma ^2 M_j)}{\kappa _j^\circ g({{\widetilde{\beta }}}_j|\lambda _1)} \Bigg ]\nonumber \\&= C+ (1-z_j) \log \frac{1-z_j}{1-\kappa _j^\circ }\nonumber \\&\quad +z_j \Bigg \{ \frac{1}{2\sigma ^2} \sum _{i=1}^n \left( {{\widetilde{\mu }}}_j H_i {{\widetilde{\mu }}}_j' +\sigma ^2 {{\,\text {Tr}\,}}({\widetilde{M}}_j H_i) - 2X_{ij} {{\widetilde{\mu }}}_j {{\widetilde{\omega }}}_i \right) \nonumber \\&\quad + r\log \left( \frac{\sqrt{2}}{\sqrt{\pi } \sigma \lambda _1}\right) - \frac{1}{2}\log \det ({\widetilde{M}}_j) - \frac{1}{2}\nonumber \\&\quad + \lambda _1 \sum _{k=1}^r f({{\widetilde{\mu }}}_{jk},\sigma ^2 {\widetilde{M}}_{j,kk}) + \log \frac{z_j}{\kappa _j^\circ } \Bigg \}. \end{aligned}$$
(37)

The solution of \({\widehat{h}}_j\) can be obtained by minimizing the last line of the above display with respect to \(z_j\). Similarly, (16) is obtained by minimizing the following expression with respect to \(z_j\):

$$\begin{aligned}&C+z_j\Bigg \{ \frac{1}{2\sigma ^2} \sum _{i=1}^n \left( {{\widetilde{\mu }}}_j H_i {{\widetilde{\mu }}}_j' +\sigma ^2 {{\,\text {Tr}\,}}({\widetilde{M}}_j H_i) - 2X_{ij} {{\widetilde{\mu }}}_j {{\widetilde{\omega }}}_i \right) \nonumber \\&- \frac{r\log \lambda _1 + 1}{2}- \frac{1}{2}\log \det ({\widetilde{M}}_j) + \frac{\lambda _1}{2\sigma ^2} \left( {\widetilde{u}}_j {\widetilde{u}}_j' + {{\,\text {Tr}\,}}(\sigma ^2 {\widetilde{M}}_j) \right) \nonumber \\&+ \log \frac{z_j}{\kappa _j^\circ }\Bigg \} + (1-z_j) \log \frac{1-z_j}{1-\kappa _j^\circ }. \end{aligned}$$
(38)
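Minimizing (38) over \(z_j\) gives a logistic-form solution: writing \(B_j\) for everything inside the braces except \(\log (z_j/\kappa _j^\circ )\), the first-order condition is \(B_j + \log \frac{z_j}{1-z_j} - \log \frac{\kappa _j^\circ }{1-\kappa _j^\circ } = 0\), so \(z_j = \textrm{expit}\big (\textrm{logit}(\kappa _j^\circ ) - B_j\big )\). The one-line sketch below encodes this form; the precise definition of \(B_j\) entering (16) should be read off the main text.

```r
## Inclusion-probability update implied by (38): logit(z_j) = logit(kappa_j) - B_j,
## where B_j collects the braced terms other than log(z_j / kappa_j).
update_zj <- function(Bj, kappaj) {
  plogis(qlogis(kappaj) - Bj)        # expit(logit(kappa_j) - B_j), a value in (0, 1)
}
update_zj(Bj = 2.3, kappaj = 0.1)    # a large residual/penalty term B_j drives z_j toward 0
```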

Last, to obtain (17), we first sum the expressions in (37) over all \(j = 1, \dots , p\). Next, we write down the explicit expression of the constant \(C\) that involves \(\sigma ^2\), i.e.,

$$\begin{aligned} pC_{\sigma ^2} = \frac{(np + 2\sigma _a +2)\log \sigma ^2}{2} + \frac{{{\,\textrm{Tr}\,}}(X'X) + 2\sigma _b}{2\sigma ^2}. \end{aligned}$$

Finally, we plug in the above expression and solve for \(\sigma ^2\). The solution (18) can be obtained similarly using (38).

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Ning, YC.B., Ning, N. Spike and slab Bayesian sparse principal component analysis. Stat Comput 34, 118 (2024). https://doi.org/10.1007/s11222-024-10430-8
