DOI: 10.1145/1183535.1183539
Article

Towards applying text mining and natural language processing for biomedical ontology acquisition

Published: 10 November 2006

ABSTRACT

Text mining and natural language processing can be extended to knowledge acquisition and management for biomedical applications. In this paper, we describe how we applied natural language processing and text mining techniques to transcribed verbal descriptions of biomedical disease features given by retinal experts. The resulting feature-attribute pairs were then incorporated into the user interface of a collaborative ontology development tool. This tool, IDOCS, is being used in the biomedical domain to help retinal specialists reach consensus on a common ontology for describing age-related macular degeneration (AMD). We compare the results of traditional text mining and natural language processing techniques with a retinal specialist's analysis, and discuss how these techniques might be integrated into future biomedical ontology and user interface development.


Published in

TMBIO '06: Proceedings of the 1st international workshop on Text mining in bioinformatics
November 2006
78 pages
ISBN: 1595935266
DOI: 10.1145/1183535

Copyright © 2006 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
