Abstract
Linear Dimensionality Reduction (LDR) methods have gained much attention over the last decades and have been used in data mining applications to reconstruct a given data matrix. The effectiveness of low-rank models in data science rests on the assumption that each row or column of the data matrix is associated with a bounded latent variable, and that the entries of the matrix are generated by applying a piecewise analytic function to these latent variables. Formally, LDR can be stated as an optimization problem, to which regularization terms are often added to enforce particular constraints emphasizing useful properties of the data. From this point of view, tuning the regularization hyperparameters (HPs), which control the weight of the additional constraints, is a problem better solved automatically than by trial and error. In this work, we focus on the role regularization HPs play in the Nonnegative Matrix Factorization (NMF) context and on how their correct choice affects the final results, providing a complete overview and new directions for a novel approach. Moreover, a novel bilevel formulation of regularization HP selection is proposed, which incorporates the HP choice directly into the unsupervised algorithm as part of the updating process.
N. Del Buono, F. Esposito and L. Selicato: all authors contributed equally to this work.
Notes
1. In bilevel programming, an outer optimization problem is solved subject to the optimality of an inner optimization problem.
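In symbols, a generic sketch of this scheme (using the notation of Notes 2 and 3; here \(F\) stands for an outer objective such as a validation error and \(L\) for the inner, regularized training objective — both names are assumptions for illustration): \(\min \limits _{\lambda \in \varLambda }{F(\omega ^{*}(\lambda ),\lambda )}\quad \text {s.t.}\quad \omega ^{*}(\lambda )\in \mathop {\mathrm {arg\,min}}\limits _{\omega \in \varOmega }{L(\omega ,\lambda )}\).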
2. \(\varLambda =\varLambda _1\times \dots \times \varLambda _d\) is the HP domain, where each set \(\varLambda _i\) can be real-valued (e.g., learning rate, regularization coefficient), integer-valued (e.g., number of layers), binary (e.g., whether to use early stopping), or categorical (e.g., choice of optimizer).
3. \(\omega \) can be a scalar, a vector, or a matrix. \(\varOmega =\varOmega _1 \times \dots \times \varOmega _n\) is the parameter domain, where each set \(\varOmega _j\) can be real-valued or integer-valued (e.g., weights in regression and classification, factors in matrix decompositions).
4. Note that the term “orthogonal” is to be understood as “soft-orthogonal”, indicating the orthogonality property of the columns or rows of the matrices W or H, respectively. With this clarification, the soft-orthogonal NMF problem can be defined as \(\min \limits _{W\ge 0,H\ge 0}{D_\beta (X,WH)}\qquad \text {s.t.}\quad W^\top W = I_r\quad \text {and/or}\quad HH^\top =I_r\).
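For illustration only, a minimal multiplicative-update sketch of soft-orthogonal NMF for the Frobenius case (\(\beta =2\)), enforcing \(HH^\top \approx I_r\) through a quadratic penalty with weight `lam` rather than as a hard constraint (the function name and all parameters are hypothetical, not the method of this paper):

```python
import numpy as np

def soft_orthogonal_nmf(X, r, lam=1.0, n_iter=500, seed=0):
    """Multiplicative updates for min ||X - WH||_F^2 + lam*||H H^T - I||_F^2, W,H >= 0.
    The penalty gradient 4*lam*(H H^T H - H) is split into its nonnegative
    parts: 2*lam*H joins the numerator, 2*lam*H H^T H the denominator,
    so the updates preserve nonnegativity."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r))
    H = rng.random((r, n))
    eps = 1e-10  # guards against division by zero
    for _ in range(n_iter):
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        H *= (W.T @ X + 2 * lam * H) / (W.T @ W @ H + 2 * lam * (H @ H.T) @ H + eps)
    return W, H

# Toy usage: the rows of H drift toward (soft) orthonormality as lam grows.
X = np.abs(np.random.default_rng(1).random((20, 15)))
W, H = soft_orthogonal_nmf(X, r=3, lam=0.5)
```

Larger values of `lam` trade reconstruction accuracy for orthogonality, which is exactly the HP-tuning dilemma discussed in the paper.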
5. The \(\ell _0\) norm is not truly a norm, since the property of positive homogeneity is not satisfied. Nevertheless, since it can be expressed in terms of the \(\ell _p\) norm as \(\left\| x\right\| _0 = \lim \limits _{p\rightarrow 0}{\left\| x\right\| _p^p}\), it is referred to as a “norm” in the literature.
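A quick numeric check of this limit (an illustrative sketch, not part of the paper): for a fixed vector, \(\left\| x\right\| _p^p=\sum _i |x_i|^p\) approaches the number of nonzero entries as \(p\rightarrow 0\), since each nonzero term \(|x_i|^p\) tends to 1.

```python
import numpy as np

x = np.array([0.0, 2.0, 0.0, -0.5, 3.0])
print("l0 'norm' (nonzero count):", np.count_nonzero(x))  # -> 3

# ||x||_p^p = sum_i |x_i|^p; zeros contribute 0, each nonzero |x_i|^p -> 1 as p -> 0
for p in (1.0, 0.5, 0.1, 0.001):
    print(f"p = {p:>5}: ||x||_p^p = {np.sum(np.abs(x) ** p):.4f}")
```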
6. The Hoyer sparsity measure is computed from the normalized ratio of the \(\ell _1\) and \(\ell _2\) norms.
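A minimal sketch of that computation (the function name is hypothetical), using the standard normalization \(\mathrm {sp}(x)=\big (\sqrt{n}-\left\| x\right\| _1/\left\| x\right\| _2\big )/\big (\sqrt{n}-1\big )\) so that the measure lies in \([0,1]\):

```python
import numpy as np

def hoyer_sparsity(x):
    """Hoyer sparsity in [0, 1]: 1 for a single-nonzero vector, 0 for a constant one."""
    x = np.asarray(x, dtype=float)
    n = x.size
    ratio = np.abs(x).sum() / np.linalg.norm(x)  # l1 / l2
    return (np.sqrt(n) - ratio) / (np.sqrt(n) - 1)

print(hoyer_sparsity([0, 0, 0, 5]))  # -> 1.0 (maximally sparse)
print(hoyer_sparsity([1, 1, 1, 1]))  # -> 0.0 (fully dense)
```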
7. For particular values of \(\beta \) and specific regularization functions.
Acknowledgments
This work was supported in part by the GNCS-INDAM (Gruppo Nazionale per il Calcolo Scientifico of Istituto Nazionale di Alta Matematica) Francesco Severi, P.le Aldo Moro, Roma, Italy. The author F.E. was funded by REFIN Project, grant number 363BB1F4, Reference project idea UNIBA027 “Un modello numerico-matematico basato su metodologie di algebra lineare e multilineare per l’analisi di dati genomici” (“A numerical-mathematical model based on linear and multilinear algebra methodologies for the analysis of genomic data”).
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Del Buono, N., Esposito, F., Selicato, L. (2022). Toward a New Approach for Tuning Regularization Hyperparameter in NMF. In: Nicosia, G., et al. (eds.) Machine Learning, Optimization, and Data Science. LOD 2021. Lecture Notes in Computer Science, vol. 13163. Springer, Cham. https://doi.org/10.1007/978-3-030-95467-3_36
DOI: https://doi.org/10.1007/978-3-030-95467-3_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95466-6
Online ISBN: 978-3-030-95467-3
eBook Packages: Computer Science, Computer Science (R0)