Abstract
In this paper, we propose a named-entity recognition (NER) system that addresses two major limitations frequently discussed in the field. First, the system requires no human intervention such as manually labeling training data or creating gazetteers. Second, the system can handle more than the three classical named-entity types (person, location, and organization). We describe the system’s architecture and compare its performance with a supervised system. We experimentally evaluate the system on a standard corpus, with the three classical named-entity types, and also on a new corpus, with a new named-entity type (car brands).
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nadeau, D., Turney, P.D., Matwin, S. (2006). Unsupervised Named-Entity Recognition: Generating Gazetteers and Resolving Ambiguity. In: Lamontagne, L., Marchand, M. (eds) Advances in Artificial Intelligence. Canadian AI 2006. Lecture Notes in Computer Science, vol 4013. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11766247_23
DOI: https://doi.org/10.1007/11766247_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34628-9
Online ISBN: 978-3-540-34630-2
eBook Packages: Computer Science (R0)