Towards Large-Scale Unsupervised Relation Extraction from the Web

Bonan Min, Shuming Shi, Ralph Grishman, Chin-Yew Lin

Source Title: International Journal on Semantic Web and Information Systems (IJSWIS) 8(3)

ISSN: 1552-6283 | EISSN: 1552-6291 | EISBN13: 9781466614871 | DOI: 10.4018/jswis.2012070101

MLA

Min, Bonan, et al. "Towards Large-Scale Unsupervised Relation Extraction from the Web." IJSWIS, vol. 8, no. 3, 2012, pp. 1-23. http://doi.org/10.4018/jswis.2012070101

APA

Min, B., Shi, S., Grishman, R., & Lin, C. (2012). Towards Large-Scale Unsupervised Relation Extraction from the Web. International Journal on Semantic Web and Information Systems (IJSWIS), 8(3), 1-23. http://doi.org/10.4018/jswis.2012070101

Chicago

Min, Bonan, et al. "Towards Large-Scale Unsupervised Relation Extraction from the Web," International Journal on Semantic Web and Information Systems (IJSWIS) 8, no. 3 (2012): 1-23. http://doi.org/10.4018/jswis.2012070101

Abstract

The Web contains an open-ended set of semantic relations, and discovering the significant relation types is very challenging. Unsupervised algorithms have been developed to extract relations from a corpus without knowing the relation types in advance, but most rely on tagging arguments of predefined types. One recently reported system is able to jointly extract relations and their argument semantic classes, taking a set of relation instances extracted by an open IE (Information Extraction) algorithm as input. However, it cannot handle polysemy of relation phrases and fails to group many similar (“synonymous”) relation instances because of the sparseness of features. In this paper, the authors present a novel unsupervised algorithm that provides a more general treatment of the polysemy and synonymy problems. The algorithm incorporates various knowledge sources, which the authors show to be very effective for unsupervised relation extraction. Moreover, it explicitly disambiguates polysemous relation phrases and groups synonymous ones. While maintaining approximately the same precision, the algorithm achieves a significant improvement in recall compared to the previous method. It is also very efficient. Experiments on a real-world dataset show that it can handle 14.7 million relation instances and extract a very large set of relations from the Web.
