Integrating cohesion and coherence for automatic summarization

Authors:
Laura Alonso i Alemany, Universitat de Barcelona
Maria Fuentes Fort, Universitat de Girona

EACL '03: Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2, April 2003, Pages 1–8
https://doi.org/10.3115/1067737.1067739

Published: 12 April 2003

ABSTRACT

This paper presents the integration of cohesive properties of text with coherence relations to obtain an adequate representation of text for automatic summarization. A summarizer based on Lexical Chains is enhanced with rhetorical and argumentative structure obtained via Discourse Markers. When evaluated on a newspaper corpus, this integration yields only a slight improvement in the resulting summaries and cannot beat a dummy baseline consisting of the first sentence in the document. Nevertheless, we argue that this approach relies on basic linguistic mechanisms and is therefore genre-independent.


Published in
EACL '03: Proceedings of the tenth conference on European chapter of the Association for Computational Linguistics - Volume 2
April 2003, 254 pages
ISBN: 1111567890
Program Chairs: Ann Copestake (United Kingdom), Jan Hajic (Czech Republic)
Publisher: Association for Computational Linguistics, United States
Overall Acceptance Rate: 100 of 360 submissions, 28%