DOI: 10.1145/1244002.1244101

Integrating gene ontology into discriminative powers of genes for feature selection in microarray data

Published: 11 March 2007

ABSTRACT

One of the main challenges in classifying microarray gene expression data is the small sample size relative to the large number of genes, so feature selection is an essential step for removing genes that are not relevant to the class labels. Most feature selection methods rely solely on expression values to determine the discriminative power of genes and to remove redundancy. However, owing to the characteristics of microarray technology, some values may not be measured accurately, which may reduce the effectiveness of these methods. To cope with this problem, we integrate Gene Ontology (GO) annotations into gene selection. The novelty of our work is that it evaluates genes based not only on their individual discriminative powers but also on the powers of the GO terms that annotate them. This strategy implicitly verifies the accuracy of the measurements and reduces redundancy. Experimental results on four public datasets demonstrate the effectiveness of the proposed method.
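The abstract does not give the scoring functions, so the following is only an illustrative sketch of the general idea of combining a gene's own discriminative power with the power of its annotating GO terms. Everything concrete here is an assumption rather than the paper's method: the signal-to-noise measure in `gene_score`, the use of the mean gene score as a GO term's power, the `alpha` weighting, the function names, and the placeholder term labels `GO:A` and `GO:B`.

```python
from statistics import mean, pstdev


def gene_score(values, labels):
    """Absolute signal-to-noise ratio between classes 0 and 1.

    This measure is an assumption chosen for illustration only.
    """
    a = [v for v, l in zip(values, labels) if l == 0]
    b = [v for v, l in zip(values, labels) if l == 1]
    spread = pstdev(a) + pstdev(b)
    # Small constant guards against division by zero for constant genes.
    return abs(mean(a) - mean(b)) / (spread + 1e-9)


def combined_scores(expr, labels, annotations, alpha=0.5):
    """Blend each gene's own score with the best score of its GO terms.

    expr:        dict gene -> list of expression values, one per sample
    labels:      list of 0/1 class labels, one per sample
    annotations: dict gene -> set of GO term identifiers
    alpha:       hypothetical weight on the gene's individual score
    """
    individual = {g: gene_score(v, labels) for g, v in expr.items()}

    # Collect the individual scores of all genes annotated by each term.
    by_term = {}
    for gene, terms in annotations.items():
        if gene in individual:
            for term in terms:
                by_term.setdefault(term, []).append(individual[gene])
    # Assumed definition: a term's power is the mean score of its genes.
    term_power = {t: mean(s) for t, s in by_term.items()}

    combined = {}
    for gene, score in individual.items():
        powers = [term_power[t] for t in annotations.get(gene, ())]
        # Genes without annotations fall back to their own score.
        support = max(powers) if powers else score
        combined[gene] = alpha * score + (1 - alpha) * support
    return combined


# Toy data: g1 and g2 separate the classes and share a term; g3 does not.
labels = [0, 0, 1, 1]
expr = {
    "g1": [1.0, 1.2, 3.0, 3.2],
    "g2": [1.1, 0.9, 2.9, 3.1],
    "g3": [2.0, 2.1, 2.0, 2.1],
}
annotations = {"g1": {"GO:A"}, "g2": {"GO:A"}, "g3": {"GO:B"}}
scores = combined_scores(expr, labels, annotations)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy setup a gene whose measurement is noisy but whose GO term is supported by other discriminative genes would be pulled up in the ranking, which is one way to read the abstract's claim that GO terms implicitly verify the measurements.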

Published in

SAC '07: Proceedings of the 2007 ACM symposium on Applied computing
March 2007
1688 pages
ISBN: 1595934804
DOI: 10.1145/1244002

Copyright © 2007 ACM

Publisher

Association for Computing Machinery
New York, NY, United States

Acceptance Rates

Overall Acceptance Rate: 1,650 of 6,669 submissions, 25%
