
Recent Advances on Neural Headline Generation

  • Survey
  • Published in Journal of Computer Science and Technology

Abstract

Recently, neural models have been proposed for headline generation that learn to map documents to headlines with recurrent neural networks. In this work, we give a detailed introduction to and comparison of existing work and recent improvements in neural headline generation, with particular attention to how encoders, decoders, and training strategies alter the overall performance of a headline generation system. Furthermore, we perform quantitative analysis of most existing neural headline generation systems and summarize several key factors that impact their performance. We also carry out detailed error analysis on typical neural headline generation systems to gain a more comprehensive understanding of their behavior. We hope our results and conclusions will benefit future research.
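The encoder-decoder architecture surveyed above can be illustrated with a minimal sketch (not any specific system from the survey): an RNN encoder reads the document token by token, and at each decoding step the decoder forms an attention-weighted context over the encoder states before emitting the next headline word. All dimensions, weight names, and the toy vocabulary below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, EMB, HID = 10, 8, 16

E = rng.normal(0, 0.1, (VOCAB, EMB))        # word embeddings
Wxh = rng.normal(0, 0.1, (EMB, HID))        # encoder input -> hidden
Whh = rng.normal(0, 0.1, (HID, HID))        # encoder hidden -> hidden
Wdh = rng.normal(0, 0.1, (EMB + HID, HID))  # decoder [input; context] -> hidden
Who = rng.normal(0, 0.1, (HID, VOCAB))      # decoder hidden -> vocab logits

def encode(tokens):
    """Run a simple tanh RNN over the document; return all hidden states."""
    h, states = np.zeros(HID), []
    for t in tokens:
        h = np.tanh(E[t] @ Wxh + h @ Whh)
        states.append(h)
    return np.stack(states)

def attend(states, h):
    """Dot-product attention: softmax-weight encoder states by similarity to h."""
    scores = states @ h
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ states                 # context vector

def greedy_decode(tokens, max_len=5, bos=0):
    """Greedily emit one headline token per step (no beam search)."""
    states = encode(tokens)
    h, y, out = states[-1], bos, []
    for _ in range(max_len):
        c = attend(states, h)
        h = np.tanh(np.concatenate([E[y], c]) @ Wdh)
        y = int(np.argmax(h @ Who))         # most likely next word
        out.append(y)
    return out

headline = greedy_decode([1, 2, 3, 4, 5])
print(headline)
```

Real systems replace the tanh recurrence with LSTM or GRU cells, train the weights by maximum likelihood (or the risk-based objectives discussed in the survey), and decode with beam search rather than greedy argmax.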





Author information


Corresponding author

Correspondence to Zhi-Yuan Liu.


About this article


Cite this article

Ayana, Shen, SQ., Lin, YK. et al. Recent Advances on Neural Headline Generation. J. Comput. Sci. Technol. 32, 768–784 (2017). https://doi.org/10.1007/s11390-017-1758-3

