Abstract
Prepositional phrase attachments are known to be an important source of errors in parsing natural language. In some cases, purely syntactic features are insufficient for disambiguating a prepositional phrase attachment, whereas visual features could help. In this work, we are interested in the impact of integrating such features into a parsing system. We propose a correction-strategy pipeline for prepositional attachments that uses visual information, trained on a multimodal corpus of images and captions. The evaluation of the system shows that visual features make it possible, in some cases, to correct the errors of a parser. It also helps to identify the most difficult aspects of such an integration.
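To make the idea concrete, here is a minimal, hypothetical sketch (not the authors' actual pipeline) of correcting a parser's PP-attachment choice by reranking candidate governors with a visual-compatibility score. The `Candidate`, `visual_score`, and `rerank` names, the linear combination, and the toy detector output are all illustrative assumptions:

```python
# Hypothetical sketch: rerank candidate PP-attachment heads by combining
# the parser's confidence with a visual feature from an object detector.
from dataclasses import dataclass

@dataclass
class Candidate:
    head: str            # candidate governor of the prepositional phrase
    parser_score: float  # confidence assigned by the syntactic parser

def visual_score(head, pp_object, detections):
    """Toy visual feature: 1.0 if the detector found the (head, object)
    pair related in the image, else 0.0."""
    return 1.0 if (head, pp_object) in detections else 0.0

def rerank(candidates, pp_object, detections, alpha=0.5):
    """Pick the head maximizing a linear mix of parser and visual scores."""
    return max(
        candidates,
        key=lambda c: (1 - alpha) * c.parser_score
                      + alpha * visual_score(c.head, pp_object, detections),
    ).head

# "A man sees a woman with a telescope": the parser slightly prefers noun
# attachment, but the image links the seeing event to the telescope.
cands = [Candidate("woman", 0.6), Candidate("sees", 0.4)]
dets = {("sees", "telescope")}  # assumed detector output
print(rerank(cands, "telescope", dets))  # -> sees
```

With these numbers, the visual evidence outweighs the parser's small preference for noun attachment, so the verb attachment is selected; without the visual term (alpha = 0), the parser's original, incorrect choice would stand.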
The work of Leonor Becerra-Bonache was performed during her teaching leave granted by the CNRS (French National Center for Scientific Research) at the Laboratoire d'Informatique et Systèmes of Aix-Marseille University.
© 2019 Springer Nature Switzerland AG
Cite this paper
Delecraz, S., Becerra-Bonache, L., Nasr, A., Béchet, F., Favre, B. (2019). Visual Disambiguation of Prepositional Phrase Attachments: Multimodal Machine Learning for Syntactic Analysis Correction. In: Rojas, I., Joya, G., Catala, A. (eds.) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science, vol. 11506. Springer, Cham. https://doi.org/10.1007/978-3-030-20521-8_52
Print ISBN: 978-3-030-20520-1
Online ISBN: 978-3-030-20521-8