An adaptive approach to named entity extraction for meeting applications

Authors:
Fei Huang, Carnegie Mellon University, Pittsburgh, PA
Alex Waibel, Carnegie Mellon University, Pittsburgh, PA

HLT '02: Proceedings of the second international conference on Human Language Technology Research, March 2002, pages 165–170

Published: 24 March 2002

ABSTRACT

Named entity extraction has been intensively investigated in the past several years. Both statistical and rule-based approaches have achieved satisfactory performance on regular written and spoken language. However, when they are applied to highly informal or ungrammatical language, such as the language of meetings, their performance decreases significantly because of the many mismatches in language genre.

In this paper we propose an adaptive method of named entity extraction for meeting understanding. The method combines a statistical model trained on broadcast news data with a cache model built online for ambiguous words, computes the global-context name class probability of those words from their local-context name class probabilities, and integrates name-list information from meeting profiles. This fusion of supervised and unsupervised learning improves named entity extraction for meeting applications. When evaluated on manual meeting transcripts, the proposed method demonstrates a 26.07% improvement over the baseline model. Its performance is also comparable to that of a statistical model trained on a small annotated meeting corpus. We are currently applying the proposed method to automatic meeting transcripts.
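
The cache idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything specific here is an assumption made for illustration: the name class set, the use of a plain average to turn local-context probabilities into a global-context probability, the 0.5 confidence threshold, and the rule that a name list from the meeting profile overrides the model. The function names (`global_class_probs`, `relabel`) are hypothetical, and the local probabilities are assumed to come from some baseline tagger that is not shown.

```python
from collections import defaultdict

# Assumed name classes; the paper's actual class inventory is not stated in the abstract.
NAME_CLASSES = ("PERSON", "LOCATION", "ORGANIZATION", "NONE")


def global_class_probs(local_probs):
    """Average each word's per-occurrence (local-context) class distributions
    into one global-context distribution. Averaging is an assumption."""
    sums = defaultdict(lambda: dict.fromkeys(NAME_CLASSES, 0.0))
    counts = defaultdict(int)
    for word, dist in local_probs:
        counts[word] += 1
        for cls in NAME_CLASSES:
            sums[word][cls] += dist.get(cls, 0.0)
    return {w: {c: s / counts[w] for c, s in sums[w].items()} for w in sums}


def relabel(local_probs, name_list=None, threshold=0.5):
    """Label every occurrence of a word.

    Priority (all assumed): name list from the meeting profile, then the
    cached global class if it is confident enough, then the local best class.
    """
    name_list = name_list or {}
    cache = global_class_probs(local_probs)
    labels = []
    for word, dist in local_probs:
        if word in name_list:
            labels.append((word, name_list[word]))
            continue
        best_cls, best_p = max(cache[word].items(), key=lambda kv: kv[1])
        if best_p >= threshold:
            labels.append((word, best_cls))
        else:
            labels.append((word, max(dist.items(), key=lambda kv: kv[1])[0]))
    return labels


# Hypothetical local-context probabilities for three word occurrences.
observations = [
    ("waibel", {"PERSON": 0.9, "NONE": 0.1}),
    ("waibel", {"PERSON": 0.4, "NONE": 0.6}),   # ambiguous in isolation
    ("pittsburgh", {"LOCATION": 0.3, "NONE": 0.7}),
]
labels = relabel(observations, name_list={"pittsburgh": "LOCATION"})
print(labels)
```

In this toy input the second occurrence of "waibel" would be labeled NONE from its local context alone, but the cached global distribution favors PERSON, so both occurrences receive the same label. The name list entry decides "pittsburgh" regardless of the model scores.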

Published in
HLT '02: Proceedings of the second international conference on Human Language Technology Research
March 2002
436 pages
Conference Chair: Mitchell Marcus, University of Pennsylvania
Publisher: Morgan Kaufmann Publishers Inc., San Francisco, CA, United States

Author Tags: cache model, meeting application, named entity extraction

Qualifiers: Article

Overall Acceptance Rate: 240 of 768 submissions, 31%