
Visual Disambiguation of Prepositional Phrase Attachments: Multimodal Machine Learning for Syntactic Analysis Correction

  • Conference paper
  • First Online: Advances in Computational Intelligence (IWANN 2019)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 11506)

Abstract

Prepositional phrase attachments are known to be an important source of errors in natural language parsing. In some cases, purely syntactic features cannot disambiguate a prepositional phrase attachment, whereas visual features could. In this work, we study the impact of integrating such features into a parsing system. We propose a correction pipeline for prepositional attachments that uses visual information, trained on a multimodal corpus of images and captions. The evaluation of the system shows that visual features make it possible, in certain cases, to correct the errors of a parser; it also helps identify the most difficult aspects of such an integration.
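The correction strategy described in the abstract can be pictured as a re-attachment step on top of the parser's output: for each ambiguous prepositional phrase (PP), a visual model scores the candidate attachment sites, and the parser's decision is overridden when the visual evidence is strong enough. The sketch below is a minimal illustration of that idea; every function and variable name is a hypothetical stand-in, not the authors' actual implementation.

```python
# Illustrative sketch of a PP-attachment correction pipeline: a parser
# proposes one head per ambiguous PP, and a visual scorer re-ranks the
# candidate heads. All names here are assumptions for illustration only.

def correct_pp_attachments(attachments, candidates, visual_score, threshold=0.5):
    """Return a copy of `attachments` in which each ambiguous PP is
    re-attached to the candidate head the visual model scores highest,
    provided that score clears `threshold`; otherwise the parser's
    original choice is kept."""
    corrected = dict(attachments)
    for pp, heads in candidates.items():
        scores = {head: visual_score(head, pp) for head in heads}
        best = max(scores, key=scores.get)
        if scores[best] >= threshold:
            corrected[pp] = best
    return corrected

# Toy example, in the spirit of the classic "man with a telescope"
# ambiguity: the parser attached the PP to the noun, but a (stubbed)
# visual scorer prefers the verb because the image would show the
# man looking *through* the telescope.
parser_output = {"with a telescope": "man"}
candidate_heads = {"with a telescope": ["man", "sees"]}

def stub_visual_score(head, pp):
    # Hand-written stand-in for a classifier over image regions.
    return 0.9 if head == "sees" else 0.2

print(correct_pp_attachments(parser_output, candidate_heads, stub_visual_score))
# -> {'with a telescope': 'sees'}
```

The threshold plays the role of a confidence gate: when the visual model is unsure, the parser's original attachment is left untouched, so the correction step can only change decisions it has positive evidence against.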

The work of Leonor Becerra-Bonache has been performed during her teaching leave granted by the CNRS (French National Center for Scientific Research) in Laboratoire d’Informatique et Systèmes of Aix-Marseille University.




Author information

Correspondence to Sebastien Delecraz.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Delecraz, S., Becerra-Bonache, L., Nasr, A., Bechet, F., Favre, B. (2019). Visual Disambiguation of Prepositional Phrase Attachments: Multimodal Machine Learning for Syntactic Analysis Correction. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science, vol. 11506. Springer, Cham. https://doi.org/10.1007/978-3-030-20521-8_52


  • DOI: https://doi.org/10.1007/978-3-030-20521-8_52

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20520-1

  • Online ISBN: 978-3-030-20521-8

  • eBook Packages: Computer Science (R0)
