A Framework for the Automatic Combination and Evaluation of Gene Selection Methods

  • Conference paper
Practical Applications of Computational Biology and Bioinformatics, 12th International Conference (PACBB2018 2018)

Abstract

High-throughput RNA sequencing technologies produce large gene expression datasets whose analysis leads to a better understanding and treatment of diseases like cancer. The high dimensionality of the data poses challenges to its computational analysis, which are addressed by applying gene selection. Traditional gene selection methods are based on the data only. Integrative approaches, in turn, include curated biological information from external knowledge bases in the gene selection process, which improves the accuracy of the results and reduces computational complexity.

This paper presents a framework for comparing knowledge-based and computational gene selection. Moreover, it presents a novel integrative method that automatically combines both approaches. Results on a cancer dataset show that simple computational methods enriched by external knowledge can compete with complex computational techniques.
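
To make the idea of enriching a simple computational method with external knowledge more concrete, here is a minimal sketch. It is not the method from the paper, whose details are not given in this abstract. It assumes a variance filter as the computational score and a hypothetical per-gene relevance score in [0, 1] taken from some knowledge base; the function names, the `weight` parameter, and the linear combination are all illustrative assumptions.

```python
from statistics import pvariance


def variance_scores(expression):
    """Score each gene by the variance of its expression across samples.

    `expression` maps a gene name to a list of expression values.
    """
    return {gene: pvariance(values) for gene, values in expression.items()}


def min_max(scores):
    """Rescale scores to [0, 1]; all-equal scores map to 0.0."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {g: (s - lo) / span if span else 0.0 for g, s in scores.items()}


def select_genes(expression, knowledge_scores, k, weight=0.5):
    """Return the top-k genes by a weighted sum of both scores.

    `knowledge_scores` is a hypothetical mapping from gene name to a
    relevance score in [0, 1] (assumed, e.g. derived from a knowledge
    base); genes missing from it count as 0.0. `weight` is the share
    given to the knowledge score (assumed linear combination).
    """
    data_norm = min_max(variance_scores(expression))
    combined = {
        g: (1 - weight) * data_norm[g] + weight * knowledge_scores.get(g, 0.0)
        for g in data_norm
    }
    return sorted(combined, key=combined.get, reverse=True)[:k]


# Toy data: gene "B" varies most, gene "A" is best supported by knowledge.
expression = {"A": [1, 1, 1], "B": [0, 10, 0], "C": [2, 4, 6]}
knowledge = {"A": 1.0, "C": 0.9}
top_data_only = select_genes(expression, knowledge, k=2, weight=0.0)
top_knowledge_only = select_genes(expression, knowledge, k=2, weight=1.0)
```

With `weight=0.0` the ranking reduces to the plain variance filter, and with `weight=1.0` it reduces to a purely knowledge-based ranking, so the same function covers both ends of the comparison described above.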

Author information

Corresponding author

Correspondence to Cindy Perscheid.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Grasnick, B., Perscheid, C., Uflacker, M. (2019). A Framework for the Automatic Combination and Evaluation of Gene Selection Methods. In: Fdez-Riverola, F., Mohamad, M., Rocha, M., De Paz, J., González, P. (eds) Practical Applications of Computational Biology and Bioinformatics, 12th International Conference. PACBB2018 2018. Advances in Intelligent Systems and Computing, vol 803. Springer, Cham. https://doi.org/10.1007/978-3-319-98702-6_20
