DOI: 10.5555/1289189.1289278

An adaptive approach to named entity extraction for meeting applications

Published: 24 March 2002

ABSTRACT

Named entity extraction has been intensively investigated in the past several years. Both statistical and rule-based approaches have achieved satisfactory performance on regular written and spoken language. However, when they are applied to highly informal or ungrammatical language, such as meeting speech, the performance of existing methods decreases significantly because of the many mismatches in language genre.

In this paper we propose an adaptive method of named entity extraction for meeting understanding. The method combines a statistical model trained on broadcast news data with a cache model built online for ambiguous words, computes the global-context name class probability of those words from their local-context name class probabilities, and integrates name-list information from meeting profiles. This fusion of supervised and unsupervised learning improves named entity extraction for meeting applications. When evaluated on manual meeting transcripts, the proposed method demonstrates a 26.07% improvement over the baseline model. Its performance is also comparable to that of a statistical model trained on a small annotated meeting corpus. We are currently applying the proposed method to automatic meeting transcripts.
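The abstract does not say how the global-context probability is computed from the local ones, so the following is only an illustrative sketch of one plausible reading: each occurrence of an ambiguous word contributes a local name class distribution to a cache, and the global distribution is the plain average over all cached occurrences. The class set, the averaging rule, the name-list override, and every identifier here (`NameClassCache`, `classify`, `NAME_CLASSES`) are assumptions made for illustration, not the paper's actual formulation.

```python
from collections import defaultdict

# Hypothetical name class inventory; the paper's real label set is not given in the abstract.
NAME_CLASSES = ("PERSON", "LOCATION", "ORGANIZATION", "OTHER")


class NameClassCache:
    """Online cache of local name class probabilities, keyed by lowercased word."""

    def __init__(self):
        # Running per-class probability sums and occurrence counts for each word.
        self._sums = defaultdict(lambda: dict.fromkeys(NAME_CLASSES, 0.0))
        self._counts = defaultdict(int)

    def add(self, word, local_probs):
        """Record the local-context distribution for one occurrence of `word`."""
        key = word.lower()
        for name_class in NAME_CLASSES:
            self._sums[key][name_class] += local_probs.get(name_class, 0.0)
        self._counts[key] += 1

    def global_probs(self, word):
        """Average the cached local distributions (assumed combination rule)."""
        key = word.lower()
        count = self._counts.get(key, 0)
        if count == 0:
            return None
        return {c: self._sums[key][c] / count for c in NAME_CLASSES}


def classify(word, cache, name_list=None):
    """Pick a class: name-list entry first (assumed priority), else cached argmax."""
    key = word.lower()
    if name_list and key in name_list:
        return name_list[key]
    probs = cache.global_probs(word)
    if probs is None:
        return "OTHER"
    return max(probs, key=probs.get)


cache = NameClassCache()
cache.add("Washington", {"PERSON": 0.6, "LOCATION": 0.4})
cache.add("Washington", {"PERSON": 0.1, "LOCATION": 0.9})
cache.add("washington", {"PERSON": 0.2, "LOCATION": 0.8})
print(classify("Washington", cache))  # LOCATION
```

In this sketch, a single misleading local context (the first occurrence favors PERSON) is outvoted by the other occurrences in the same meeting, which is the intuition behind using global context for ambiguous words. A weighted or confidence-based combination would fit the abstract's description equally well.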


Published in: HLT '02: Proceedings of the second international conference on Human Language Technology Research, March 2002, 436 pages

Publisher: Morgan Kaufmann Publishers Inc., San Francisco, CA, United States

Overall acceptance rate: 240 of 768 submissions, 31%