The REG Summarization System with Question Reformulation at QA@INEX Track 2010

  • Conference paper
Comparative Evaluation of Focused Retrieval (INEX 2010)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6932))

Abstract

In this paper we present REG, a graph-based approach to a fundamental problem of Natural Language Processing: the automatic summarization of documents. The algorithm models a document as a graph in order to obtain weighted sentences. We applied this approach to the QA@INEX 2010 (question-answering) task. To do so, we extracted the terms and named entities from the queries, in order to obtain a list of terms and named entities related to the main topic of the question. Using this strategy, REG obtained good results in terms of performance (measured with the automatic evaluation system FRESA) and readability (measured by human evaluation), ranking among the seven best systems in the task.
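
The abstract describes the idea only at a high level, so the following is a minimal sketch of what graph-based sentence weighting with query terms could look like, not the REG algorithm as published. Everything in it is an assumption made for illustration: the function names (`tokenize`, `weight_sentences`, `summarize`), the tiny stopword list, the use of shared-word counts as edge weights, and the one-point bonus per query term.

```python
import re
from itertools import combinations

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "as", "for", "on", "with"}


def tokenize(sentence):
    """Return the set of lowercase words in a sentence, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def weight_sentences(sentences, query_terms=()):
    """Score each sentence by the summed weight of its graph edges.

    Vertices are sentences. An edge joins two sentences that share words,
    and its weight is the number of shared words. Each query term found in
    a sentence adds one point (an assumed way to bias toward the question).
    """
    bags = [tokenize(s) for s in sentences]
    scores = [0.0] * len(sentences)
    for i, j in combinations(range(len(sentences)), 2):
        shared = len(bags[i] & bags[j])
        scores[i] += shared
        scores[j] += shared
    query = {t.lower() for t in query_terms}
    for i, bag in enumerate(bags):
        scores[i] += len(bag & query)
    return scores


def summarize(sentences, k=2, query_terms=()):
    """Pick the k highest-scoring sentences and return them in document order."""
    scores = weight_sentences(sentences, query_terms)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    doc = [
        "Graphs model documents.",
        "Documents contain sentences.",
        "Sentences get weights from graphs.",
        "Bananas are yellow.",
    ]
    print(weight_sentences(doc))
    print(summarize(doc, k=2, query_terms=["graphs"]))
```

In this sketch a sentence that shares vocabulary with many others accumulates a high weight, while an off-topic sentence stays near zero; passing terms extracted from the question raises the weight of sentences that mention them.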



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vivaldi, J., da Cunha, I., Ramírez, J. (2011). The REG Summarization System with Question Reformulation at QA@INEX Track 2010. In: Geva, S., Kamps, J., Schenkel, R., Trotman, A. (eds) Comparative Evaluation of Focused Retrieval. INEX 2010. Lecture Notes in Computer Science, vol 6932. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23577-1_27

  • DOI: https://doi.org/10.1007/978-3-642-23577-1_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23576-4

  • Online ISBN: 978-3-642-23577-1

  • eBook Packages: Computer Science, Computer Science (R0)
