Automatic Generation of Description for Images Using Recurrent Neural Network

  • Conference paper
Computing and Network Sustainability

Part of the book series: Lecture Notes in Networks and Systems ((LNNS,volume 75))

Abstract

In this paper, a generative long short-term memory (LSTM) model for producing descriptions of images is implemented. Automatically generating a description of the content of a given image is a fundamental problem in artificial intelligence, as it connects two different domains: computer vision and natural language processing. The solution proposed here uses deep learning, implemented in the Keras framework with TensorFlow as the backend; TensorFlow expresses a computation as a chained series of operations on a dataflow graph. The general technique is to feed the features of an image to the model, which generates a caption whose length is less than or equal to a predefined maximum caption length. The model is trained on the Flickr30k dataset, image features are extracted with InceptionV3, and the BLEU metric is used to measure the quality of the description generated for each image by the LSTM model.
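The two mechanics the abstract names — generating words until an end token or a predefined caption length is reached, and scoring the result with BLEU — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy `next_word` lookup stands in for the trained LSTM conditioned on InceptionV3 features, and `bleu1` computes only clipped unigram precision (the simplest component of BLEU, without the brevity penalty). All names here are illustrative assumptions.

```python
from collections import Counter

MAX_CAPTION_LEN = 6            # stands in for the predefined caption length
START, END = "<start>", "<end>"

# Toy "model": a fixed lookup from the current word to the next word.
# In the real system an LSTM conditioned on image features plays this role.
TOY_TRANSITIONS = {START: "a", "a": "dog", "dog": "runs", "runs": END}

def generate_caption(image_features, next_word=TOY_TRANSITIONS.get,
                     max_len=MAX_CAPTION_LEN):
    """Greedy decoding: emit words until END or max_len is reached."""
    words, current = [], START
    for _ in range(max_len):
        nxt = next_word(current)
        if nxt is None or nxt == END:
            break
        words.append(nxt)
        current = nxt
    return words

def bleu1(candidate, reference):
    """Clipped unigram precision, the BLEU-1 core from Papineni et al. (2002),
    shown here without the brevity penalty for simplicity."""
    cand, ref = Counter(candidate), Counter(reference)
    clipped = sum(min(n, ref[w]) for w, n in cand.items())
    return clipped / max(len(candidate), 1)

caption = generate_caption(image_features=None)
print(caption)                                        # ['a', 'dog', 'runs']
print(bleu1(caption, ["a", "dog", "runs", "fast"]))   # 1.0
```

Greedy decoding picks the single most likely next word at each step; the paper's bounded-length generation follows the same loop shape, with the LSTM's softmax output replacing the lookup table.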



Corresponding author

Correspondence to Savitri Patil.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Veena, G.S., Patil, S., Kumar, T.N.R. (2019). Automatic Generation of Description for Images Using Recurrent Neural Network. In: Peng, SL., Dey, N., Bundele, M. (eds) Computing and Network Sustainability. Lecture Notes in Networks and Systems, vol 75. Springer, Singapore. https://doi.org/10.1007/978-981-13-7150-9_44

  • DOI: https://doi.org/10.1007/978-981-13-7150-9_44

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-7149-3

  • Online ISBN: 978-981-13-7150-9

  • eBook Packages: Engineering, Engineering (R0)
