A Comparison of Character and Word Embeddings in Bidirectional LSTMs for POS Tagging in Italian

Marulli, Fiammetta; Pota, Marco; Esposito, Massimo

doi:10.1007/978-3-319-92231-7_2

Conference paper
First Online: 12 June 2018

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 98)

Abstract

Word representations are mathematical objects that capture a word's meaning and grammatical properties in a machine-readable form. They map each word to an equivalence class of words that share similar properties. Word representations can be learned automatically by unsupervised algorithms that rely on the distributional hypothesis, which states that the meaning of a word is closely tied to its context, that is, to the words surrounding it. This established notion of context has recently been extended to cover both distributional and morphological features of a word, the latter expressed as character co-occurrences. The approach has shown very promising results, especially in NLP tasks such as POS tagging, where representing so-called Out-of-Vocabulary (OOV) words remains only partially solved. This work addresses the problem of representing OOV words in a POS tagging task for the Italian language. It investigates the potential benefits and drawbacks of a Bidirectional Long Short-Term Memory (bi-LSTM) network fed with a joint character and word embedding representation for POS tagging, including OOV words. Experiments were performed and discussed in terms of qualitative and quantitative indicators, which suggest some possible directions for future investigation.
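
To make the idea of a joint character and word representation concrete, the following is a minimal sketch of how each token's input vector could be built before it reaches a tagger. It is not the authors' implementation. The seeded random vectors stand in for learned embeddings, the average of character trigram vectors is a simplification swapped in for a learned character-level encoder, and all names and dimensions (`WORD_DIM`, `CHAR_DIM`, `joint_vector`, the toy vocabulary) are hypothetical. The bi-LSTM itself is not shown.

```python
import random
import zlib

WORD_DIM = 4  # hypothetical sizes, chosen only for illustration
CHAR_DIM = 3


def seeded_vector(key, dim):
    """Deterministic pseudo-random vector standing in for a learned embedding."""
    rng = random.Random(zlib.crc32(key.encode("utf-8")))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def char_trigrams(word):
    """Character trigrams of the word wrapped in boundary markers '<' and '>'."""
    padded = "<" + word + ">"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def char_vector(word):
    """Average of trigram vectors (assumption: a stand-in for a character encoder)."""
    grams = char_trigrams(word)
    if not grams:  # empty token: fall back to a zero vector
        return [0.0] * CHAR_DIM
    vecs = [seeded_vector("tri:" + g, CHAR_DIM) for g in grams]
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def joint_vector(word, vocab):
    """Concatenate the word-level vector with the character-level vector.

    OOV words all share one <UNK> word vector, so only the character part
    distinguishes them from one another.
    """
    key = word if word in vocab else "<UNK>"
    return seeded_vector("word:" + key, WORD_DIM) + char_vector(word)


vocab = {"il", "gatto", "mangia"}       # toy training vocabulary
sentence = ["il", "gattino", "mangia"]  # "gattino" is out of vocabulary
inputs = [joint_vector(w, vocab) for w in sentence]
```

In this sketch two different OOV words receive the same word-level part but different character-level parts, which is the property that lets a downstream tagger still draw on morphological cues such as Italian suffixes.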

Author information

Authors and Affiliations

Institute for High Performance Computing and Networking - National Research Council of Italy, Via Pietro Castellino 111, 80131, Naples, Italy
Fiammetta Marulli, Marco Pota & Massimo Esposito

Corresponding author

Correspondence to Fiammetta Marulli.

Editor information

Editors and Affiliations

Istituto di Calcolo e Reti ad Alte Prestazioni (ICAR), National Research Council, Roma, Italy
Giuseppe De Pietro
Istituto di Calcolo e Reti ad Alte Prestazioni (ICAR), National Research Council, Roma, Italy
Luigi Gallo
Bournemouth University, Poole, United Kingdom
Robert J. Howlett
Centre for Artificial Intelligence, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, New South Wales, Australia
Lakhmi C. Jain
Griffith Sciences - Centres and Institutes, Griffith University, South Brisbane, Queensland, Australia
Ljubo Vlacic

About this paper

Cite this paper

Marulli, F., Pota, M., Esposito, M. (2019). A Comparison of Character and Word Embeddings in Bidirectional LSTMs for POS Tagging in Italian. In: De Pietro, G., Gallo, L., Howlett, R., Jain, L., Vlacic, L. (eds) Intelligent Interactive Multimedia Systems and Services. KES-IIMSS-18 2018. Smart Innovation, Systems and Technologies, vol 98. Springer, Cham. https://doi.org/10.1007/978-3-319-92231-7_2

DOI: https://doi.org/10.1007/978-3-319-92231-7_2
Published: 12 June 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92230-0
Online ISBN: 978-3-319-92231-7
eBook Packages: Intelligent Technologies and Robotics (R0)