ABSTRACT
In this paper we compare the performance of several heuristics for generating informative generic and query-oriented extracts of newspaper articles, in order to learn how topic prominence affects the performance of each heuristic. We study how different query types affect each heuristic's performance and discuss the possibility of using machine learning algorithms to automatically learn good functions for combining several heuristics. We also briefly describe the design, implementation, and performance of SUMMARIST, a multilingual text summarization system.
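To make the idea of a combination function concrete, the following is a minimal sketch, not the method used in the paper or in SUMMARIST. It assumes two illustrative heuristics (sentence position and query-term overlap) and a weighted linear sum as the combination function. All function names, weights, and sample sentences are hypothetical; a learning algorithm would fit the weights from training data instead of setting them by hand.

```python
from typing import Callable, Dict, List

# A heuristic maps (sentence index, all sentences, query terms) to a score.
Heuristic = Callable[[int, List[str], List[str]], float]


def tokens(text: str) -> set:
    """Lowercased word set with surrounding punctuation stripped."""
    return {w.strip(".,;:!?").lower() for w in text.split()} - {""}


def position_score(index: int, sentences: List[str], query: List[str]) -> float:
    """Hypothetical position heuristic: earlier sentences score higher."""
    return 1.0 / (index + 1)


def query_overlap_score(index: int, sentences: List[str], query: List[str]) -> float:
    """Hypothetical query heuristic: fraction of query terms in the sentence."""
    query_terms = {q.lower() for q in query}
    if not query_terms:
        return 0.0  # generic extract: no query to match
    return len(tokens(sentences[index]) & query_terms) / len(query_terms)


HEURISTICS: Dict[str, Heuristic] = {
    "position": position_score,
    "query": query_overlap_score,
}


def combined_score(weights: Dict[str, float], index: int,
                   sentences: List[str], query: List[str]) -> float:
    """Assumed combination function: weighted linear sum of heuristic scores."""
    return sum(weights[name] * h(index, sentences, query)
               for name, h in HEURISTICS.items())


def extract(sentences: List[str], query: List[str],
            weights: Dict[str, float], k: int = 1) -> List[str]:
    """Return the k best-scoring sentences in their original order."""
    ranked = sorted(range(len(sentences)),
                    key=lambda i: (-combined_score(weights, i, sentences, query), i))
    return [sentences[i] for i in sorted(ranked[:k])]


# Hypothetical article and weights, for illustration only.
ARTICLE = [
    "The council met on Monday.",
    "Budget cuts dominated the debate.",
    "The weather was mild.",
]
generic_extract = extract(ARTICLE, [], {"position": 1.0, "query": 0.0})
query_extract = extract(ARTICLE, ["budget"], {"position": 0.2, "query": 1.0})
```

In this sketch, changing the weights shifts the extract between a position-driven (generic) selection and a query-driven one, which is the kind of trade-off a learned combination function would have to capture.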
Index Terms
- Training a selection function for extraction