Developing Position Structure-Based Framework for Chinese Entity Relation Extraction

Authors:
Peng Zhang (Robert Gordon University),
Wenjie Li (The Hong Kong Polytechnic University),
Yuexian Hou (Tianjin University),
Dawei Song (Robert Gordon University)

ACM Transactions on Asian Language Information Processing, Volume 10, Issue 3, Article No. 14, pp. 1–22
https://doi.org/10.1145/2002980.2002984

Published: 01 September 2011

Abstract

Relation extraction is the task of finding semantic relations between two entities in text, and is often cast as a classification problem. In contrast to the significant achievements for English, research progress in Chinese relation extraction has been relatively limited. In this article, we present a novel Chinese relation extraction framework based mainly on a 9-position structure. The design of this structure is motivated by the observation that relation types and subtypes are clearly connected to the position structures of the two entities. The 9-position structure can be captured with less effort than deep natural language processing requires, and it is effective in relieving the class imbalance problem, which often hurts classification performance. None of the features in our framework requires Chinese word segmentation, which has long limited the performance of Chinese language processing. We also use correction and inference mechanisms to further improve the classification results. Experiments on the ACE 2005 Chinese data set show that the 9-position structure feature provides strong support for Chinese relation extraction, and that the other strategies further improve performance.
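
The abstract does not spell out the nine positions, so the following is only a minimal sketch of the general idea that a positional feature for two entity mentions can be read directly from character offsets, with no word segmentation. The three coarse labels (`nested`, `adjacent`, `separated`), the extra `overlapping` fallback, the `Span` type, and the function name `coarse_position` are all assumptions made for illustration; they are not the article's 9-position scheme.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Half-open character span [start, end) of an entity mention (hypothetical type)."""
    start: int
    end: int


def coarse_position(a: Span, b: Span, text: str) -> str:
    """Return a coarse positional label for two entity mentions.

    Illustrative three-way scheme only; the article's nine positions
    are not defined in the abstract and are not reproduced here.
    """
    # One mention lies entirely inside the other.
    if (a.start <= b.start and b.end <= a.end) or (b.start <= a.start and a.end <= b.end):
        return "nested"

    # Order the mentions by where they begin in the text.
    first, second = sorted((a, b))

    # Partial overlap without containment (kept as a separate fallback label).
    if first.end > second.start:
        return "overlapping"

    # Assumption: "adjacent" means only whitespace (or nothing) lies between them.
    gap = text[first.end:second.start]
    return "adjacent" if gap.strip() == "" else "separated"


# Character offsets index directly into the unsegmented string.
sentence = "北京大学校长访问了上海"
university = Span(0, 4)   # 北京大学
city_inside = Span(0, 2)  # 北京
president = Span(4, 6)    # 校长
shanghai = Span(9, 11)    # 上海

print(coarse_position(university, city_inside, sentence))  # nested
print(coarse_position(university, president, sentence))    # adjacent
print(coarse_position(president, shanghai, sentence))      # separated
```

A label like this could serve as one categorical feature for a relation classifier; how the article actually refines it into nine positions, and how it combines it with other features, is described in the full text rather than the abstract.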

Index Terms

Computing methodologies
  Artificial intelligence
    Natural language processing
      Language resources

Published in

ACM Transactions on Asian Language Information Processing Volume 10, Issue 3
September 2011
114 pages
ISSN:1530-0226
EISSN:1558-3430
DOI:10.1145/2002980

Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 September 2011
- Accepted: 1 April 2011
- Revised: 1 February 2011
- Received: 1 November 2010
Author Tags
Chinese language
Entity relation extraction
imbalance class classification
position structure
Qualifiers
- research-article
- Research
- Refereed
Article Metrics
- Total Citations: 9
- Total Downloads: 396
- Downloads (Last 12 months): 6
- Downloads (Last 6 weeks): 1