Training a selection function for extraction

Author:
Chin-Yew Lin

USC/Information Sciences Institute, 4676 Admiralty Way, Marina del Rey, CA

CIKM '99: Proceedings of the eighth international conference on Information and knowledge management, November 1999, Pages 55–62. https://doi.org/10.1145/319950.319957

Published: 1 November 1999

ABSTRACT

In this paper we compare the performance of several heuristics in generating informative generic/query-oriented extracts for newspaper articles, in order to learn how topic prominence affects the performance of each heuristic. We study how different query types affect the performance of each heuristic and discuss the possibility of using machine learning algorithms to automatically learn good functions for combining several heuristics. We also briefly describe the design, implementation, and performance of SUMMARIST, a multilingual text summarization system.
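
As an illustration only, the idea of a combination function over several heuristics can be sketched as a weighted sum of per-sentence heuristic scores, followed by selecting the top-scoring sentences as the extract. The heuristic names (`position`, `title_overlap`, `query_overlap`), the weights, the toy scores, and the function names below are hypothetical and are not taken from the paper or from SUMMARIST; in a trained setting the weights would be fitted from labeled extracts rather than set by hand.

```python
from typing import Dict, List

# Hypothetical heuristic names and hand-set weights, for illustration only.
WEIGHTS: Dict[str, float] = {
    "position": 0.5,
    "title_overlap": 0.3,
    "query_overlap": 0.2,
}


def combine(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted linear combination of the heuristic scores for one sentence."""
    return sum(weights[name] * scores.get(name, 0.0) for name in weights)


def select_extract(sentence_scores: List[Dict[str, float]],
                   weights: Dict[str, float], k: int) -> List[int]:
    """Return the indices of the k highest-scoring sentences, in document order."""
    ranked = sorted(range(len(sentence_scores)),
                    key=lambda i: combine(sentence_scores[i], weights),
                    reverse=True)
    return sorted(ranked[:k])


# Made-up heuristic scores for a four-sentence article.
toy = [
    {"position": 1.0, "title_overlap": 0.5, "query_overlap": 0.0},
    {"position": 0.5, "title_overlap": 0.0, "query_overlap": 0.0},
    {"position": 0.25, "title_overlap": 1.0, "query_overlap": 1.0},
    {"position": 0.0, "title_overlap": 0.0, "query_overlap": 0.5},
]
extract = select_extract(toy, WEIGHTS, k=2)
```

Under these assumed weights, a query-oriented extract could be approximated by raising the weight on `query_overlap`, while a generic extract would set it to zero; a learned combination function would replace the hand-set `WEIGHTS` with values estimated from training data.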

Index Terms

Training a selection function for extraction
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
    2. Search methodologies
      1. Heuristic function construction
2. Information systems
  1. Data management systems
    1. Database management system engines

Published in
CIKM '99: Proceedings of the eighth international conference on Information and knowledge management
November 1999
564 pages
ISBN: 1581131461
DOI: 10.1145/319950
Editor: Susan Gauch
Copyright © 1999 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
automated text summarization
summary evaluation
topic extraction