DOI: 10.1145/2811163.2811165
poster

Building Text-mining Framework for Gene-Phenotype Relation Extraction using Deep Leaning

Published: 22 October 2015

ABSTRACT

The scientific literature is a rich source of biological knowledge for information retrieval. However, the unstructured text of research articles makes that information difficult to access with computer-aided systems. Text mining is one solution: it transforms unstructured information in text into database content, and most text-mining approaches are based on machine learning models. Because these approaches require high-dimensional features, model performance depends heavily on feature selection. Choosing good features is usually difficult and labor-intensive, because feature extraction requires the prior knowledge and ingenuity of human experts. Here, we propose a novel framework that extracts biological relations from text using hierarchical text features, which enhance the effectiveness of the relation extraction model.

The proposed framework consists of two parts, node detection and edge detection, both using deep belief networks. Each part is based on hierarchical text features learned by a Gaussian-Bernoulli restricted Boltzmann machine (GBRBM). In this work, we performed a gene-cancer relation extraction task as a pilot study. The classification model was trained on both the GE09 corpus from the BioNLP'09 Shared Task and the CoMAGC corpus. The results show that our model outperformed other approaches based on handcrafted features. The evaluation results suggest that deep belief networks offer optimized and generalized hierarchical text features for large-scale text mining.
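The abstract does not give the architecture or training details, so the following is only a minimal sketch of how a GBRBM can learn binary hidden features from real-valued inputs. It assumes unit-variance visible units and one-step contrastive divergence (CD-1); the class name, layer sizes, learning rate, and the random stand-in data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GaussianBernoulliRBM:
    """Illustrative GBRBM: Gaussian visible units (unit variance), binary hidden units."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)  # visible biases
        self.c = np.zeros(n_hidden)   # hidden biases

    def hidden_probs(self, v):
        # p(h_j = 1 | v) = sigmoid(c_j + sum_i v_i W_ij)
        return sigmoid(v @ self.W + self.c)

    def visible_mean(self, h):
        # Mean of p(v | h) = N(b + W h, I)
        return h @ self.W.T + self.b

    def cd1_step(self, v0, lr=0.01):
        """One CD-1 update on a batch; returns the mean squared reconstruction error."""
        h0 = self.hidden_probs(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_mean(h0_sample)  # use the mean rather than sampling v
        h1 = self.hidden_probs(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b += lr * (v0 - v1).mean(axis=0)
        self.c += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))


# Random vectors stand in for real-valued text features (an assumption).
data_rng = np.random.default_rng(1)
X = data_rng.standard_normal((64, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize to match unit variance

rbm = GaussianBernoulliRBM(n_visible=20, n_hidden=8)
errors = [rbm.cd1_step(X) for _ in range(50)]
features = rbm.hidden_probs(X)  # learned hidden representation, shape (64, 8)
```

In a deep belief network, the hidden activations of one such layer would serve as the input to the next layer, and a classifier would be trained on the top-level features; how the authors stack and fine-tune the layers is not stated in the abstract.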


Published in

DTMBIO '15: Proceedings of the ACM Ninth International Workshop on Data and Text Mining in Biomedical Informatics
October 2015
40 pages
ISBN: 9781450337878
DOI: 10.1145/2811163

Copyright © 2015 Owner/Author

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery
New York, NY, United States


Acceptance Rates

Overall acceptance rate: 41 of 247 submissions, 17%
