Poisson reduced-rank models with sparse loadings

Lee, Eun Ryung; Park, Seyoung

doi:10.1007/s42952-021-00106-8

Research Article
Published: 02 February 2021

Volume 50, pages 1079–1097, (2021)
Journal of the Korean Statistical Society

Eun Ryung Lee¹ & Seyoung Park¹

Abstract

High-dimensional Poisson reduced-rank models have been considered for statistical inference on low-dimensional locations of the individuals based on the observations of high-dimensional count vectors. In this study, we assume sparsity on a so-called loading matrix to enhance its interpretability. The sparsity assumption leads to the use of an \(L_1\) penalty for the estimation of the loading. We provide novel computational and theoretical analyses for the corresponding penalized Poisson maximum likelihood estimation. We establish theoretical convergence rates for the parameters under weak-dependence conditions; this implies consistency even in large-dimensional problems. To implement the proposed method, which involves several computational issues, including nonconvex log-likelihoods, the \(L_1\) penalty, and orthogonality constraints, we develop an iterative algorithm. Further, we propose a Bayesian information criterion (BIC)-based penalty parameter selection, which works well in the implementation. Numerical evidence is provided by real-data-based simulation analyses, and the proposed method is illustrated with an analysis of German party manifesto data.
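
To make the idea of an \(L_1\)-penalized Poisson likelihood concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a plain log-link, log E[count] = score · loading, with no intercept terms and no orthogonality constraints; the function names `soft_threshold` and `penalized_neg_loglik` and the argument layout are hypothetical choices made for illustration. The soft-thresholding operator is the standard proximal map of the \(L_1\) penalty, which is what sets small loadings exactly to zero.

```python
import math


def soft_threshold(x, lam):
    """Proximal operator of lam * |x|: shrink x toward zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def penalized_neg_loglik(counts, scores, loadings, lam):
    """L1-penalized Poisson negative log-likelihood, up to a constant.

    counts:   n x p nested list of nonnegative counts
    scores:   n x r nested list (low-dimensional locations of individuals)
    loadings: p x r nested list (the loading matrix assumed to be sparse)
    lam:      nonnegative penalty parameter

    Assumed model (illustrative only):
        log E[counts[i][j]] = sum_k scores[i][k] * loadings[j][k]
    """
    nll = 0.0
    for i, row in enumerate(counts):
        for j, x in enumerate(row):
            eta = sum(s * b for s, b in zip(scores[i], loadings[j]))
            # Poisson term exp(eta) - x * eta; the log(x!) constant is dropped.
            nll += math.exp(eta) - x * eta
    # The L1 penalty is applied to the loadings only.
    penalty = lam * sum(abs(b) for row in loadings for b in row)
    return nll + penalty
```

Under these assumptions, a larger `lam` increases the cost of every nonzero loading, so a minimizer would tend to have more exact zeros; a criterion such as BIC could then be compared across a grid of `lam` values.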

Acknowledgements

The authors, Eun Ryung Lee and Seyoung Park, contributed equally to the paper. This research was supported by a National Research Foundation of Korea grant funded by the Korean government (MSIP) (No. NRF-2019R1C1C1003805). Eun Ryung Lee was supported by a National Research Foundation of Korea grant funded by the Korean government (MSIT) (No. NRF-2019R1F1A1062795). Seyoung Park was supported by the Sungkyun Research Fund, Sungkyunkwan University, 2018.

Author information

Authors and Affiliations

Department of Statistics, Sungkyunkwan University, Seoul, South Korea
Eun Ryung Lee & Seyoung Park

Corresponding author

Correspondence to Eun Ryung Lee.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Lee, E.R., Park, S. Poisson reduced-rank models with sparse loadings. J. Korean Stat. Soc. 50, 1079–1097 (2021). https://doi.org/10.1007/s42952-021-00106-8

Received: 17 December 2020
Accepted: 19 January 2021
Published: 02 February 2021
Issue Date: December 2021
DOI: https://doi.org/10.1007/s42952-021-00106-8
