Handling Non-compositionality in Multilingual CNLs

  • Conference paper
Controlled Natural Language (CNL 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8625)

Abstract

In this paper, we describe methods for handling multilingual non-compositional constructions in the Grammatical Framework (GF). Specifically, we look at methods to detect and extract non-compositional phrases from parallel texts, and we propose ways to handle such constructions in GF grammars. We expect these methods to enrich CNLs by providing more flexibility in the design of controlled languages. We consider two use cases of non-compositional constructions: a general-purpose method to detect and extract multilingual multiword expressions, and a procedure to identify nominal compounds in German. We evaluate the procedure for multiword expressions through a qualitative analysis of the results. For the experiments on nominal compounds, we incorporate the detected compounds into a full SMT pipeline and evaluate the impact of our method on the machine translation process.



Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Enache, R., Listenmaa, I., Kolachina, P. (2014). Handling Non-compositionality in Multilingual CNLs. In: Davis, B., Kaljurand, K., Kuhn, T. (eds) Controlled Natural Language. CNL 2014. Lecture Notes in Computer Science, vol 8625. Springer, Cham. https://doi.org/10.1007/978-3-319-10223-8_14

  • DOI: https://doi.org/10.1007/978-3-319-10223-8_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-10222-1

  • Online ISBN: 978-3-319-10223-8

  • eBook Packages: Computer Science, Computer Science (R0)
