Exploiting Gene Ontology to Conceptualize Biomedical Document Collections

  • Conference paper
The Semantic Web (ASWC 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5367)

Abstract

As biomedical science progresses, ontologies play an increasingly important role in making biomedical information easier to understand. Although much research, such as work on Gene Ontology annotation, uses ontologies to help users understand biomedical information, most of it does not focus on capturing gene-related terms and their relationships within biomedical document collections. Understanding the key gene-related terms and their semantic relationships is essential for comprehending the conceptual structure of a biomedical document collection and for avoiding information overload. To address this issue, we propose a novel approach called ‘GOClonto’ that automatically generates ontologies to conceptualize biomedical document collections. Based on GO (Gene Ontology), GOClonto extracts gene-related terms from biomedical text, applies latent semantic analysis to identify key gene-related terms, allocates documents according to those key terms, and uses GO to automatically generate a corpus-related gene ontology. The experimental results show that GOClonto is able to identify key gene-related terms. On a test biomedical document collection, GOClonto achieves a higher F-measure than other clustering algorithms. Moreover, the ontology generated by GOClonto exhibits a highly informative conceptual structure.
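
The abstract compares clustering quality by F-measure without stating the exact formula. As a minimal sketch, assuming the class-versus-cluster F-measure commonly used in the document clustering literature (an assumption, not confirmed by the abstract), the score for a flat clustering could be computed as follows. The function name `clustering_f_measure` and the sample labels are hypothetical and chosen only for illustration.

```python
from collections import Counter


def clustering_f_measure(true_labels, cluster_ids):
    """Overall F-measure of a flat clustering against reference classes.

    Assumed definition: for each reference class, take the best F1 score
    over all clusters, then average those scores weighted by class size.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("need one class label and one cluster id per document")
    n_docs = len(true_labels)
    if n_docs == 0:
        raise ValueError("empty document collection")

    class_sizes = Counter(true_labels)
    cluster_sizes = Counter(cluster_ids)
    # Number of documents shared by each (class, cluster) pair.
    overlap = Counter(zip(true_labels, cluster_ids))

    total = 0.0
    for cls, n_cls in class_sizes.items():
        best = 0.0
        for clu, n_clu in cluster_sizes.items():
            n_both = overlap[(cls, clu)]
            if n_both == 0:
                continue  # F1 is zero for disjoint class and cluster
            precision = n_both / n_clu
            recall = n_both / n_cls
            best = max(best, 2 * precision * recall / (precision + recall))
        total += (n_cls / n_docs) * best
    return total


# Hypothetical toy data: six documents, two reference topics, two clusters.
topics = ["apoptosis"] * 3 + ["transport"] * 3
clusters = [0, 0, 1, 1, 1, 1]
score = clustering_f_measure(topics, clusters)
print(round(score, 3))  # 0.829
```

Weighting by class size means that large reference topics dominate the score, which is one reason results under this definition are only comparable across algorithms when they are evaluated on the same collection.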

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zheng, HT., Borchert, C., Kim, HG. (2008). Exploiting Gene Ontology to Conceptualize Biomedical Document Collections. In: Domingue, J., Anutariya, C. (eds) The Semantic Web. ASWC 2008. Lecture Notes in Computer Science, vol 5367. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89704-0_26

  • DOI: https://doi.org/10.1007/978-3-540-89704-0_26

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89703-3

  • Online ISBN: 978-3-540-89704-0

  • eBook Packages: Computer Science, Computer Science (R0)
