Adaptive hypergraph regularized logistic regression model for bioinformatic selection and classification

Applied Intelligence

Abstract

The classification of cancer using established biological knowledge has become increasingly prevalent, primarily due to the improved accuracy and enhanced biological interpretability this method offers for classification outcomes. Despite these advances, current cancer classification methods encounter challenges in maintaining the intricate structure of gene networks and leveraging the statistical information embedded within gene data. In this paper, we introduce an adaptive hypergraph regularized logistic regression model that capitalizes on established biological knowledge and statistical information within gene data. Specifically, our model integrates a hypergraph into the objective function, an innovation that preserves the complex gene network structure more effectively. Additionally, we implement adaptive penalties in the penalty term, which facilitates the targeted selection of disease-related genes based on gene weights. To further refine our model, we incorporate constraints on gene pairs with high statistical correlations within the penalty term, thereby minimizing the inclusion of redundant genes. We adopt the block coordinate descent algorithm to address the nonconvexity of our model. Through comparative experimentation with established methodologies on real datasets, our proposed model demonstrates marked improvement in classification accuracy and adept selection of genes pertinent to specific diseases.
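
The full formulation is not visible in this preview, so the following is only a minimal sketch of two ingredients the abstract names: a hypergraph regularizer and an adaptively weighted sparsity penalty on top of a logistic loss. It assumes the widely used normalized hypergraph Laplacian L = I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}, which may differ from the authors' construction. The function names, the toy incidence matrix, and the parameters `lam1` and `lam2` are hypothetical, and the correlation-based constraint on gene pairs and the block coordinate descent solver are deliberately left out.

```python
import numpy as np


def hypergraph_laplacian(H, w):
    """Normalized hypergraph Laplacian I - Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

    H : (n_genes, n_edges) 0/1 incidence matrix (gene i belongs to hyperedge e).
    w : (n_edges,) positive hyperedge weights.
    Assumes every hyperedge contains at least one gene.
    """
    H = np.asarray(H, dtype=float)
    w = np.asarray(w, dtype=float)
    dv = H @ w            # vertex degree: total weight of hyperedges containing each gene
    de = H.sum(axis=0)    # hyperedge degree: number of genes in each hyperedge
    # Genes that belong to no hyperedge get a zero scaling factor.
    safe_dv = np.where(dv > 0, dv, 1.0)
    dv_inv_sqrt = np.where(dv > 0, 1.0 / np.sqrt(safe_dv), 0.0)
    scaled_H = dv_inv_sqrt[:, None] * H
    theta = scaled_H @ np.diag(w / de) @ scaled_H.T
    return np.eye(H.shape[0]) - theta


def penalized_logistic_objective(beta, X, y, L, gene_weights, lam1, lam2):
    """Logistic loss + adaptive L1 penalty + hypergraph smoothness term.

    y holds labels in {0, 1}; gene_weights are per-gene penalty weights
    (illustrative stand-ins for the adaptive penalties in the abstract).
    """
    z = X @ beta
    # Mean negative log-likelihood, written with logaddexp for numerical stability.
    nll = np.mean(np.logaddexp(0.0, z) - y * z)
    sparsity = lam1 * np.sum(gene_weights * np.abs(beta))
    smoothness = lam2 * (beta @ L @ beta)
    return nll + sparsity + smoothness


if __name__ == "__main__":
    # Toy data: 3 genes, 2 hyperedges (hypothetical pathways sharing gene 1).
    H = np.array([[1, 0], [1, 1], [0, 1]])
    L = hypergraph_laplacian(H, np.ones(2))
    X = np.array([[0.5, -1.0, 2.0], [1.5, 0.3, -0.7]])
    y = np.array([0.0, 1.0])
    beta = np.array([0.2, 0.0, -0.1])
    print(penalized_logistic_objective(beta, X, y, L, np.ones(3), 0.1, 0.1))
```

The quadratic term `beta @ L @ beta` is small when genes sharing a hyperedge receive similar degree-scaled coefficients, which is one common way to encode group structure from pathway knowledge. The per-gene weights let the sparsity penalty shrink some coefficients more than others.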

Data availability and access

The data and code underlying this study are available on GitHub at https://github.com/AdaH-LR/AdaH.LR.

Acknowledgements

The authors would like to thank all the anonymous reviewers for their constructive advice.

Author information

Contributions

Conceptualization, Yong Jin and Huaibin Hou; methodology, Yong Jin and Huaibin Hou; validation, Huaibin Hou; formal analysis, Mian Qin and Zhen Zhang; investigation, Mian Qin and Wei Yang; data curation, Huaibin Hou and Wei Yang; writing—original draft preparation, Huaibin Hou; writing—review and editing, Yong Jin and Huaibin Hou; funding acquisition, Yong Jin. All the authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Mian Qin.

Ethics declarations

Ethical and informed consent for the data used

This article does not involve any studies with human participants or animals performed by any of the authors.

Competing interests

No conflicts of interest exist in the submission of this manuscript, and the manuscript has been approved by all the authors for publication.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jin, Y., Hou, H., Qin, M. et al. Adaptive hypergraph regularized logistic regression model for bioinformatic selection and classification. Appl Intell 54, 2349–2360 (2024). https://doi.org/10.1007/s10489-024-05304-5

  • DOI: https://doi.org/10.1007/s10489-024-05304-5