Abstract
In this paper, a generative long short-term memory (LSTM) model for producing descriptions of images is implemented. Automatically generating a description of the content of a given image is a fundamental problem in artificial intelligence, since it requires connecting two different domains: computer vision and natural language processing. The solution proposed here uses deep learning, implemented with the Keras framework running on a TensorFlow backend; TensorFlow is a numerical computation library that expresses a model as a graph of chained operations. The general technique is to feed the features of an image to the model, which generates a caption of length less than or equal to a predefined maximum. The Flickr30k dataset is used to train the model, and InceptionV3 is used to extract image features. The BLEU metric is used to measure how closely the description generated by the LSTM model matches the reference captions for that image.
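To make the evaluation step concrete, the following is a minimal sketch of the modified unigram precision that underlies BLEU-1 (the full BLEU score combines several n-gram orders and a brevity penalty, as defined by Papineni et al.). The function name `bleu1` and the single-reference interface are illustrative, not part of the paper's implementation.

```python
from collections import Counter

def bleu1(candidate: str, reference: str) -> float:
    """Modified unigram precision (BLEU-1 core) for one reference caption.

    Each candidate word is credited at most as many times as it
    appears in the reference ("clipping"), so repeating a word
    cannot inflate the score.
    """
    cand_words = candidate.split()
    if not cand_words:
        return 0.0
    ref_counts = Counter(reference.split())
    cand_counts = Counter(cand_words)
    # Clip each word's count by its count in the reference.
    clipped = sum(min(count, ref_counts[word])
                  for word, count in cand_counts.items())
    return clipped / len(cand_words)

# Generated caption vs. a reference caption:
# 4 of 5 candidate words ("a", "dog", "on", "grass") match -> 0.8
score = bleu1("a dog runs on grass", "a dog is running on the grass")
```

In practice one would use a library implementation (e.g. NLTK's `sentence_bleu`) that handles multiple references and higher-order n-grams; the sketch above only shows why clipping is needed.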
© 2019 Springer Nature Singapore Pte Ltd.
Veena, G.S., Patil, S., Kumar, T.N.R. (2019). Automatic Generation of Description for Images Using Recurrent Neural Network. In: Peng, SL., Dey, N., Bundele, M. (eds) Computing and Network Sustainability. Lecture Notes in Networks and Systems, vol 75. Springer, Singapore. https://doi.org/10.1007/978-981-13-7150-9_44
Print ISBN: 978-981-13-7149-3
Online ISBN: 978-981-13-7150-9