Abstract
Discourse parsing is a challenging task that plays a critical role in discourse analysis. Since the release of the Rhetorical Structure Theory Discourse Treebank and the Penn Discourse Treebank, research on English discourse parsing has attracted increasing attention and achieved considerable success. Meanwhile, only preliminary research has been conducted on certain subtasks of discourse parsing for other languages, such as Chinese. In this article, we present an end-to-end Chinese discourse parser based on the Connective-Driven Dependency Tree (CDT) scheme. The parser consists of multiple components in a pipeline architecture: an elementary discourse unit (EDU) detector, a discourse relation recognizer, a discourse parse tree generator, and an attribution labeler. The attribution labeler determines two attributes (sense and centering) for every nonterminal node (i.e., discourse relation) in the discourse parse tree. Given free text, our parser detects all EDUs, generates the discourse parse tree bottom-up, and determines the sense and centering attributes of all nonterminal nodes by traversing the tree. We conduct a comprehensive evaluation on the Connective-Driven Dependency Treebank corpus from both component-wise and error-cascading perspectives, illustrating how each component performs in isolation and how the pipeline performs under error propagation. The evaluation shows that our end-to-end Chinese discourse parser achieves an overall F1 score of 20% with full automation.
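The pipeline described in the abstract can be pictured with a minimal sketch. This is not the authors' implementation: every function name, the punctuation-based EDU splitter, the length-based merge score, the binary merging, and the placeholder labels are assumptions made purely to show how the stages connect (EDU detection, bottom-up tree generation, then attribute labeling by tree traversal).

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A discourse tree node; a node with no children is an EDU."""
    text: str
    children: List["Node"] = field(default_factory=list)
    sense: Optional[str] = None      # set only on nonterminal nodes
    centering: Optional[str] = None  # set only on nonterminal nodes

    @property
    def is_edu(self) -> bool:
        return not self.children


def detect_edus(text: str) -> List[Node]:
    # Hypothetical stand-in for the EDU detector: split on clause punctuation.
    parts = re.split(r"[，。；！？,.;!?]", text)
    return [Node(p.strip()) for p in parts if p.strip()]


def relation_score(left: Node, right: Node) -> float:
    # Hypothetical stand-in for the relation recognizer: shorter spans merge first.
    return -float(len(left.text) + len(right.text))


def build_tree(edus: List[Node]) -> Node:
    # Bottom-up generation: repeatedly merge the best-scoring adjacent pair.
    # Binary merging is an assumption of this sketch.
    if not edus:
        raise ValueError("no EDUs detected")
    nodes = list(edus)
    while len(nodes) > 1:
        i = max(range(len(nodes) - 1),
                key=lambda k: relation_score(nodes[k], nodes[k + 1]))
        merged = Node(nodes[i].text + " " + nodes[i + 1].text,
                      [nodes[i], nodes[i + 1]])
        nodes[i:i + 2] = [merged]
    return nodes[0]


def label_attributes(node: Node) -> None:
    # Post-order traversal assigning both attributes to each nonterminal node.
    for child in node.children:
        label_attributes(child)
    if not node.is_edu:
        node.sense = "PLACEHOLDER_SENSE"          # a real labeler would predict this
        node.centering = "PLACEHOLDER_CENTERING"  # a real labeler would predict this


def parse(text: str) -> Node:
    tree = build_tree(detect_edus(text))
    label_attributes(tree)
    return tree
```

Calling `parse` on a short text returns the root `Node`. Because each stage consumes the previous stage's output, an early mistake (for example, a wrong EDU boundary) carries into the tree and its labels, which is the kind of error propagation the error-cascading evaluation is meant to measure.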