Automatic condensation of electronic publications by sentence selection

https://doi.org/10.1016/0306-4573(95)00052-I

Abstract

As electronic information access becomes the norm, and the variety of retrievable material increases, automatic methods of summarizing or condensing text will become critical. This paper describes a system that performs domain-independent automatic condensation of news from a large commercial news service encompassing 41 different publications. This system was evaluated against a system that condensed the same articles using only the first portion of the texts (the lead), up to the target length of the summaries. Three lengths of articles were evaluated for 250 documents by both systems, totalling 1500 suitability judgements in all. The outcome of perhaps the largest evaluation of human vs machine summarization performed to date was unexpected. The lead-based summaries outperformed the “intelligent” summaries significantly, achieving acceptability ratings of over 90%, compared to 74.4%. This paper briefly reviews the literature, details the implications of these results, and addresses the remaining hopes for content-based summarization. We expect the results presented here to be useful to other researchers currently investigating the viability of summarization through sentence selection heuristics.
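The lead baseline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `lead_summary`, the regex-based sentence splitter, the word-count budget, and the choice to stop at a sentence boundary are all assumptions made for illustration, since the abstract only says the lead is taken "up to the target length of the summaries".

```python
import re


def lead_summary(text: str, target_words: int) -> str:
    """Return the leading sentences of `text`, up to about `target_words` words.

    Hypothetical helper: the splitting rule and the word budget are
    assumptions, not details taken from the paper.
    """
    # Naive sentence splitter: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())

    chosen = []
    count = 0
    for sentence in sentences:
        n = len(sentence.split())
        # Stop before exceeding the budget, but always keep the first sentence.
        if chosen and count + n > target_words:
            break
        chosen.append(sentence)
        count += n
    return " ".join(chosen)


if __name__ == "__main__":
    article = "One two three. Four five. Six seven eight nine."
    print(lead_summary(article, 5))
```

A baseline of this kind uses no content analysis at all, which is what makes the reported result notable: it relies only on the convention that news articles put their most important material first.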

Cited by (240)

  • A semantic approach to extractive multi-document summarization: Applying sentence expansion for tuning of conceptual densities

    2020, Information Processing and Management
    Citation Excerpt:

    According to Shareghi & Hassanabadi (2008), human executes the following three steps to build a summary: 1) Understanding the content of the document, 2) Identifying the most important pieces of the information in the text, and 3) Writing up this information. Due to their challenges, there is little hope to automate the first and the third steps for any arbitrary texts (Brandow, Mitze, & Rau, 1995). Thus, most of the approaches attempt to automate the second step.

  • Various Diseases’ Prediction Based on Symptom by Using Machine Learning

    2023, Lecture Notes on Data Engineering and Communications Technologies

This paper was prepared while Lisa Rau was on an NSF Visiting Professorship for Women grant (NSF GER-9350134), hosted by the Computer and Information Sciences Department at the University of Pennsylvania.
