DOI: 10.3115/1067737.1067739

Integrating cohesion and coherence for automatic summarization

Published: 12 April 2003

ABSTRACT

This paper presents the integration of cohesive properties of text with coherence relations, to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only a slight improvement in the resulting summaries and cannot beat a dummy baseline consisting of the first sentence of the document. Nevertheless, we argue that this approach relies on basic linguistic mechanisms and is therefore genre-independent.
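
The "dummy baseline" mentioned in the abstract, which takes the first sentence of the document as its summary, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `lead_baseline` and the regular-expression sentence splitter are assumptions made for the example.

```python
import re


def lead_baseline(document: str) -> str:
    """Return the first sentence of a document as its summary.

    Hypothetical helper: sentence boundaries are approximated as
    '.', '!' or '?' followed by whitespace, which is an assumption
    and will mis-split abbreviations such as 'Dr. Smith'.
    """
    text = document.strip()
    # Split once, at the first whitespace run that follows end punctuation.
    parts = re.split(r"(?<=[.!?])\s+", text, maxsplit=1)
    return parts[0]


if __name__ == "__main__":
    article = "The council approved the budget. Critics objected. A vote follows."
    print(lead_baseline(article))
```

In newspaper text the opening sentence usually states the main event, which may explain why such a simple baseline is hard to beat on that genre.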

Published in

EACL '03: Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2
April 2003
254 pages
ISBN: 1111567890

Publisher: Association for Computational Linguistics, United States


Overall acceptance rate: 100 of 360 submissions, 28%
