research-article

A CDT-Styled End-to-End Chinese Discourse Parser

Published: 13 July 2017

Abstract

Discourse parsing is a challenging task that plays a critical role in discourse analysis. Since the release of the Rhetorical Structure Theory Discourse Treebank and the Penn Discourse Treebank, research on English discourse parsing has attracted increasing attention and achieved considerable success. Meanwhile, only preliminary research on certain subtasks of discourse parsing has been conducted for other languages, such as Chinese. In this article, we present an end-to-end Chinese discourse parser based on the Connective-Driven Dependency Tree (CDT) scheme. The parser consists of several components in a pipeline architecture: an elementary discourse unit (EDU) detector, a discourse relation recognizer, a discourse parse tree generator, and an attribution labeler. The attribution labeler determines two attributions (sense and centering) for every nonterminal node (i.e., discourse relation) in the discourse parse tree. Given free text, the parser detects all EDUs, generates the discourse parse tree bottom-up, and determines the sense and centering attributions of all nonterminal nodes by traversing the tree. We conduct a comprehensive evaluation on the Connective-Driven Dependency Treebank corpus from both component-wise and error-cascading perspectives, to show how each component performs in isolation and how the pipeline performs under error propagation. The evaluation shows that our end-to-end Chinese discourse parser achieves an overall F1 score of 20% with full automation.
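
The pipeline described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: every name here (`detect_edus`, `relation_score`, `build_tree`, `label_attributions`, `parse`) is hypothetical, the punctuation-based EDU splitter and the length-based scorer are placeholders for the trained components, and merging is restricted to binary nodes for brevity, which the CDT scheme may not require. The sketch only shows the control flow: detect EDUs, build the tree bottom-up, then traverse it to fill in sense and centering.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class EDU:
    """Leaf node: one elementary discourse unit."""
    text: str


@dataclass
class Relation:
    """Nonterminal node: a discourse relation over its children."""
    children: List[Union["EDU", "Relation"]]
    sense: Optional[str] = None    # filled in by the attribution labeler
    center: Optional[int] = None   # index of the central child


Node = Union[EDU, Relation]


def detect_edus(text: str) -> List[EDU]:
    # Placeholder detector (assumption): split on common punctuation.
    parts = re.split(r"[，,。；;！!？?]", text)
    return [EDU(p.strip()) for p in parts if p.strip()]


def span_length(node: Node) -> int:
    if isinstance(node, EDU):
        return len(node.text)
    return sum(span_length(child) for child in node.children)


def relation_score(left: Node, right: Node) -> float:
    # Placeholder recognizer (assumption): prefer merging shorter spans.
    return -float(span_length(left) + span_length(right))


def build_tree(edus: List[EDU]) -> Node:
    # Bottom-up generation: repeatedly merge the best-scoring adjacent pair.
    if not edus:
        raise ValueError("no EDUs detected")
    nodes: List[Node] = list(edus)
    while len(nodes) > 1:
        best = max(range(len(nodes) - 1),
                   key=lambda i: relation_score(nodes[i], nodes[i + 1]))
        nodes[best:best + 2] = [Relation([nodes[best], nodes[best + 1]])]
    return nodes[0]


def label_attributions(node: Node) -> None:
    # Traverse the tree and label every nonterminal node.
    if isinstance(node, EDU):
        return
    for child in node.children:
        label_attributions(child)
    node.sense = "UNKNOWN"  # placeholder for a sense classifier
    node.center = 0         # placeholder for a centering classifier


def parse(text: str) -> Node:
    tree = build_tree(detect_edus(text))
    label_attributions(tree)
    return tree
```

Calling `parse` on a short comma-separated sentence returns a nested `Relation` whose leaves are the detected EDUs. In a real system each placeholder would be replaced by a trained model, and errors made by earlier stages would propagate to later ones, which is the effect the error-cascading evaluation measures.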

    • Published in

      ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 16, Issue 4
      December 2017
      146 pages
      ISSN: 2375-4699
      EISSN: 2375-4702
      DOI: 10.1145/3097269

      Copyright © 2017 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 13 July 2017
      • Accepted: 1 May 2017
      • Received: 1 January 2017

      Qualifiers

      • research-article
      • Research
      • Refereed
