Abstract
Many applications rely on named entity classification. Most named entity classification methods use machine learning, where the choice of features is critical to final performance. Existing approaches explore only features derived from the characteristics of the named entity itself or from its linguistic context. With the development of the Semantic Web, a large number of data sources have been published and connected across the Web as Linked Open Data (LOD). LOD provides rich a priori knowledge about entity types, which can be a valuable asset for named entity classification. In this paper, we explore the use of LOD to enhance named entity classification. Our method extracts information from LOD and builds a type knowledge base, which is used to score a (named entity string, type) pair. This score is then injected as one or more features into an existing classifier to improve its performance. We conducted a thorough experimental study and report the results, which confirm the effectiveness of the proposed method.
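The pipeline outlined in the abstract (build a type knowledge base, score a (named entity string, type) pair, inject the score as a feature) can be illustrated with a minimal Python sketch. The abstract does not specify how the score is computed, so the relative-frequency score below is an assumption, and the function names `build_type_kb`, `type_score`, and `add_lod_features` as well as the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def build_type_kb(label_type_pairs):
    """Count how often each normalized entity label is asserted with each type.

    label_type_pairs stands in for (label, type) statements extracted from LOD;
    the extraction step itself is not shown here.
    """
    kb = defaultdict(Counter)
    for label, entity_type in label_type_pairs:
        kb[label.strip().lower()][entity_type] += 1
    return kb


def type_score(kb, entity_string, entity_type):
    """Assumed scoring rule: relative frequency of the type for this string."""
    counts = kb.get(entity_string.strip().lower())
    if not counts:
        return 0.0  # string unknown to the knowledge base
    return counts[entity_type] / sum(counts.values())


def add_lod_features(base_features, kb, entity_string, candidate_types):
    """Append one knowledge-base score per candidate type to existing features."""
    scores = [type_score(kb, entity_string, t) for t in candidate_types]
    return list(base_features) + scores


# Toy data (hypothetical), standing in for statements harvested from LOD.
pairs = [
    ("Paris", "Location"),
    ("Paris", "Location"),
    ("Paris", "Person"),
    ("IBM", "Organization"),
]
kb = build_type_kb(pairs)
features = add_lod_features(
    [0.5, 1.0], kb, "paris", ["Person", "Location", "Organization"]
)
```

Here the existing classifier's feature vector `[0.5, 1.0]` is extended with one score per candidate type, so the downstream learner can weigh the LOD evidence alongside its original features.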
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ni, Y., Zhang, L., Qiu, Z., Wang, C. (2010). Enhancing the Open-Domain Classification of Named Entity Using Linked Open Data. In: Patel-Schneider, P.F., et al. The Semantic Web – ISWC 2010. ISWC 2010. Lecture Notes in Computer Science, vol 6496. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17746-0_36
DOI: https://doi.org/10.1007/978-3-642-17746-0_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17745-3
Online ISBN: 978-3-642-17746-0
eBook Packages: Computer Science (R0)