Abstract
In this paper, a generative long short-term memory (LSTM) model for producing descriptions of images is implemented. Automatically generating a description of the content of a given image is a fundamental problem in artificial intelligence, since it requires connecting two different domains: computer vision and natural language processing. The solution proposed here uses deep learning, implemented with the Keras framework running on a TensorFlow backend; TensorFlow is a numerical computation library that expresses a model as a graph of chained operations. The general technique is to feed the features of an image to the model, which generates a caption of length less than or equal to a predefined maximum. The Flickr30k dataset is used to train the model, and InceptionV3 is used to extract image features. The BLEU metric is used to measure how closely the description generated by the LSTM model matches the reference captions for that image.
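To make the evaluation step concrete, the following is a minimal sketch of the modified unigram precision that underlies BLEU-1 (the full BLEU score combines several n-gram orders and a brevity penalty, as defined by Papineni et al.). The function name `bleu1` and the single-reference interface are illustrative, not part of the paper's implementation.

```python
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """Modified unigram precision (BLEU-1 core) for one reference caption.

    Each candidate word is credited at most as many times as it
    appears in the reference ("clipping"), so repeating a word
    cannot inflate the score.
    """
    cand_words = candidate.split()
    if not cand_words:
        return 0.0
    ref_counts = Counter(reference.split())
    cand_counts = Counter(cand_words)
    # Clip each word's count by its count in the reference.
    clipped = sum(min(count, ref_counts[word])
                  for word, count in cand_counts.items())
    return clipped / len(cand_words)

# Generated caption vs. a reference caption:
# 4 of 5 candidate words ("a", "dog", "on", "grass") match -> 0.8
score = bleu1("a dog runs on grass", "a dog is running on the grass")
```

In practice one would use a library implementation (e.g. NLTK's `sentence_bleu`) that handles multiple references and higher-order n-grams; the sketch above only shows why clipping is needed.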
© 2019 Springer Nature Singapore Pte Ltd.
Veena, G.S., Patil, S., Kumar, T.N.R. (2019). Automatic Generation of Description for Images Using Recurrent Neural Network. In: Peng, SL., Dey, N., Bundele, M. (eds) Computing and Network Sustainability. Lecture Notes in Networks and Systems, vol 75. Springer, Singapore. https://doi.org/10.1007/978-981-13-7150-9_44
Print ISBN: 978-981-13-7149-3
Online ISBN: 978-981-13-7150-9