Abstract
A clustering-based language model is proposed for analyzing the performance of context-sensitive word embedding models that use contextual information. The construction of text readability and prediction models faces several shortcomings due to the complex nature and structure of language. The structure of language is complicated further by words that take on vastly different meanings and interpretations depending on the context in which they are used. This paper addresses this issue by first clustering sentences based on similarity and then performing word embedding separately on each cluster to obtain an enhanced outcome. This embeds the same word separately in each context, improving on the standard word embeddings produced by existing prediction-based models. Comparing the two approaches, our results show that clustering improves model performance and discriminates contextual information by sense, leading to a more accurate vector representation.
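The two-stage idea described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration (the helper names, the greedy bag-of-words clustering, and the similarity threshold are assumptions, not the paper's actual method): sentences are grouped by similarity, and each word is then tagged with its cluster id, so that a downstream prediction-based embedding model would learn a separate vector for the same word in each context cluster.

```python
# Hypothetical sketch: cluster sentences, then give each word a
# per-cluster identity so it receives a separate embedding per context.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_sentences(sentences, threshold=0.5):
    """Greedy single-pass clustering on bag-of-words cosine similarity."""
    clusters = []  # list of [centroid Counter, member indices]
    for i, sent in enumerate(sentences):
        bow = Counter(sent.lower().split())
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(bow, c[0])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([bow, [i]])
        else:
            best[0].update(bow)   # grow the centroid
            best[1].append(i)
    return clusters

def tag_words_by_cluster(sentences, clusters):
    """Suffix each word with its cluster id, so one surface word
    yields one embedding per context cluster when a model is trained."""
    tagged = []
    for cid, (_, members) in enumerate(clusters):
        for i in members:
            tagged.append([f"{w}_{cid}" for w in sentences[i].lower().split()])
    return tagged

sentences = [
    "the bank approved the loan",
    "the bank raised interest rates",
    "we sat on the river bank",
]
clusters = cluster_sentences(sentences)
corpus = tag_words_by_cluster(sentences, clusters)
# "bank" now appears as bank_0 (finance cluster) and bank_1 (river cluster),
# so training a prediction-based model on `corpus` gives it two vectors.
```

In the actual pipeline, a prediction-based model such as Word2Vec or fastText would then be trained on each cluster's sentences (or on the tagged corpus) rather than on the undifferentiated text.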
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Vimal Kumar, K., Ahuja, S., Choudary, S., Parwekar, P. (2022). Incorporating Contextual Information in Prediction Based Word Embedding Models. In: Rathore, V.S., Sharma, S.C., Tavares, J.M.R., Moreira, C., Surendiran, B. (eds) Rising Threats in Expert Applications and Solutions. Lecture Notes in Networks and Systems, vol 434. Springer, Singapore. https://doi.org/10.1007/978-981-19-1122-4_37
Print ISBN: 978-981-19-1121-7
Online ISBN: 978-981-19-1122-4
eBook Packages: Intelligent Technologies and Robotics (R0)