Abstract
A clustering-based language model is proposed for analyzing the performance of context-sensitive word embedding models that use contextual information. The construction of text readability and prediction models faces several shortcomings due to the complex nature and structure of language. The structure of language is complicated further by words that take on vastly different meanings and interpretations depending on the context in which they are used. This paper addresses this issue by first clustering sentences based on similarity and then performing word embedding separately on each cluster to obtain an enhanced outcome. This embeds the same word separately in each context, improving on the standard word embeddings produced by existing prediction-based models. Comparing the two approaches, our results show that clustering improves model performance and discriminates contextual information by sense, leading to a more accurate vector representation.
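The two-stage idea described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration (the helper names, the greedy bag-of-words clustering, and the similarity threshold are assumptions, not the paper's actual method): sentences are grouped by similarity, and each word is then tagged with its cluster id, so that a downstream prediction-based embedding model would learn a separate vector for the same word in each context cluster.

```python
# Hypothetical sketch: cluster sentences, then give each word a
# per-cluster identity so it receives a separate embedding per context.
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster_sentences(sentences, threshold=0.5):
    """Greedy single-pass clustering on bag-of-words cosine similarity."""
    clusters = []  # list of [centroid Counter, member indices]
    for i, sent in enumerate(sentences):
        bow = Counter(sent.lower().split())
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(bow, c[0])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([bow, [i]])
        else:
            best[0].update(bow)   # grow the centroid
            best[1].append(i)
    return clusters

def tag_words_by_cluster(sentences, clusters):
    """Suffix each word with its cluster id, so one surface word
    yields one embedding per context cluster when a model is trained."""
    tagged = []
    for cid, (_, members) in enumerate(clusters):
        for i in members:
            tagged.append([f"{w}_{cid}" for w in sentences[i].lower().split()])
    return tagged

sentences = [
    "the bank approved the loan",
    "the bank raised interest rates",
    "we sat on the river bank",
]
clusters = cluster_sentences(sentences)
corpus = tag_words_by_cluster(sentences, clusters)
# "bank" now appears as bank_0 (finance cluster) and bank_1 (river cluster),
# so training a prediction-based model on `corpus` gives it two vectors.
```

In the actual pipeline, a prediction-based model such as Word2Vec or fastText would then be trained on each cluster's sentences (or on the tagged corpus) rather than on the undifferentiated text.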
© 2022 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Vimal Kumar, K., Ahuja, S., Choudary, S., Parwekar, P. (2022). Incorporating Contextual Information in Prediction Based Word Embedding Models. In: Rathore, V.S., Sharma, S.C., Tavares, J.M.R., Moreira, C., Surendiran, B. (eds) Rising Threats in Expert Applications and Solutions. Lecture Notes in Networks and Systems, vol 434. Springer, Singapore. https://doi.org/10.1007/978-981-19-1122-4_37
Print ISBN: 978-981-19-1121-7
Online ISBN: 978-981-19-1122-4
eBook Packages: Intelligent Technologies and Robotics (R0)