Neuroanatomical term generation and comparison between two terminologies

  • Original Article
  • Neuroinformatics

Abstract

An approach and software tools are described for identifying and extracting compound terms (CTs), acronyms, and their associated contexts from the text that accompanies neuroanatomical atlases. A set of simple syntactic rules was appended to the output of a commercially available part-of-speech (POS) tagger (Qtag v 3.01) in order to extract CTs and their associated contexts from the atlas texts. This “hybrid” parser appears to be highly sensitive: it recognized 96% of the potentially germane neuroanatomical CTs and acronyms present in the cat and primate thalamic atlases.
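The idea of layering syntactic rules over tagger output can be illustrated with a minimal sketch. The abstract does not list the actual rules, so the single rule used here (a maximal run of adjectives and nouns that ends in a noun and spans at least two tokens) is an assumption, as are the Penn Treebank-style tags (`JJ`, `NN`), which may differ from the Qtag tagset, and the hypothetical function name `extract_compound_terms`.

```python
from typing import List, Tuple

Tagged = Tuple[str, str]  # (word, POS tag); tags are assumed Penn Treebank-style


def extract_compound_terms(tagged: List[Tagged]) -> List[str]:
    """Return maximal adjective/noun runs that end in a noun and span >= 2 tokens.

    Illustrative rule only; not the rule set used in the paper.
    """
    terms: List[str] = []
    run: List[Tagged] = []
    # A sentinel token with a non-matching tag flushes the final run.
    for word, tag in tagged + [("", "END")]:
        if tag.startswith("JJ") or tag.startswith("NN"):
            run.append((word, tag))
            continue
        # Trim trailing adjectives so the candidate term ends in a noun.
        while run and not run[-1][1].startswith("NN"):
            run.pop()
        if len(run) >= 2:
            terms.append(" ".join(w for w, _ in run))
        run = []
    return terms


sentence = [
    ("The", "DT"), ("ventral", "JJ"), ("posterior", "JJ"), ("nucleus", "NN"),
    ("projects", "VBZ"), ("to", "TO"), ("somatosensory", "JJ"),
    ("cortex", "NN"), (".", "."),
]
print(extract_compound_terms(sentence))
# → ['ventral posterior nucleus', 'somatosensory cortex']
```

In this sketch the tagger supplies the word classes and the rule only decides where a candidate term starts and stops, which is the division of labor the “hybrid” label suggests.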

Neuroanatomical CTs and acronyms in the cat and primate atlas texts were first compared using exact-term matching. Implementing string-matching algorithms significantly improved the identification of relevant terms and acronyms shared by the two domains. The End Gap Free string matcher identified 98% of CTs, and the Needleman-Wunsch (NW) string matcher matched 36% of acronyms between the two atlases.
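The difference between the two matchers named above can be sketched with a small dynamic-programming scorer. This is a textbook formulation, not the authors' implementation: the unit scores (match +1, mismatch -1, gap -1) and the hypothetical function name `align_score` are assumptions, since the abstract gives no scoring parameters. Needleman-Wunsch charges for every gap, while the end-gap-free variant leaves gaps at either end of the alignment unpenalized.

```python
def align_score(a: str, b: str, end_gap_free: bool = False,
                match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best alignment score of a and b (scoring values are assumed, not from the paper)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    if not end_gap_free:
        # Global (Needleman-Wunsch): leading gaps are charged.
        for i in range(1, n + 1):
            score[i][0] = i * gap
        for j in range(1, m + 1):
            score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    if end_gap_free:
        # Trailing gaps are free: take the best cell in the last row or column.
        return max(max(score[n]), max(row[m] for row in score))
    return score[n][m]


# An OCR-style substitution ("l" read as "1") costs one mismatch.
print(align_score("nucleus", "nuc1eus"))  # → 5

# A term contained in a longer term: global alignment pays for the overhang,
# end-gap-free alignment does not.
short, full = "posterior nucleus", "ventral posterior nucleus"
print(align_score(short, full))                     # → 9
print(align_score(short, full, end_gap_free=True))  # → 17
```

Under these assumed scores, the end-gap-free variant rewards a compound term that is embedded in a longer one, whereas global alignment suits short strings of similar length such as acronyms.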

Combining several simple grammatical and lexical rules with the POS tagger (the “hybrid” parser) (1) extracted complex neuroanatomical terms and acronyms from selected cat and primate thalamic atlases and (2) facilitated the semi-automated generation of a highly granular thalamic terminology. The implementation of string-matching algorithms (1) reconciled terminological errors introduced by the optical character recognition (OCR) software used to generate the neuroanatomical text and (2) increased the sensitivity of matching the neuroanatomical terms and acronyms that the “hybrid” parser generated for the two neuroanatomical domains.

Corresponding author

Correspondence to Fredric A. Gorin.

Cite this article

Srinivas, P.R., Gusfield, D., Mason, O. et al. Neuroanatomical term generation and comparison between two terminologies. Neuroinform 1, 177–192 (2003). https://doi.org/10.1007/s12021-003-0004-z

  • DOI: https://doi.org/10.1007/s12021-003-0004-z
