Word-Context Attention for Text Representation

Abstract

We address the insufficient-context-pattern limitation of existing Word-Word Attention, which stems from its spatial-shared property. To this end, we propose the Word-Context Attention method, which uses item-wise filters to perform both temporal and spatial combinations. Specifically, the proposed method first compresses the global-scale left and right context words into fixed-length vectors, respectively. A group of word-specific filters is then learned to select features from the word and its context vectors. Finally, a non-linear transformation merges and activates the selected features. Since each word has its exclusive context filters and non-linear semantic transformations, the proposed method is spatial-specific and can thus generate flexible context patterns. Experimental comparisons demonstrate the feasibility of our model and its attractive computational performance.
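To make the mechanism concrete, here is a minimal PyTorch sketch of the idea described above. It is not the authors' implementation: cumulative averaging stands in for the paper's fixed-length context compression, the item-wise filters are modelled as per-word sigmoid gates looked up from embedding tables, and all names (`WordContextAttention`, `word_filter`, and so on) are hypothetical.

```python
import torch
import torch.nn as nn


class WordContextAttention(nn.Module):
    """Hedged sketch: word-specific (spatial-specific) context filtering."""

    def __init__(self, vocab_size: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        # One filter vector per vocabulary item for the word itself and for
        # its compressed left/right contexts (the "exclusive" filters).
        self.word_filter = nn.Embedding(vocab_size, dim)
        self.left_filter = nn.Embedding(vocab_size, dim)
        self.right_filter = nn.Embedding(vocab_size, dim)
        self.merge = nn.Linear(3 * dim, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len)
        x = self.embed(token_ids)                            # (B, T, D)
        B, T, D = x.shape
        # Compress the global left/right contexts of each position into
        # fixed-length vectors (cumulative means as a simple placeholder).
        counts = torch.arange(1, T + 1, device=x.device).view(1, T, 1)
        csum = x.cumsum(dim=1)
        left = torch.zeros_like(x)
        left[:, 1:] = csum[:, :-1] / counts[:, :-1]          # mean of words before t
        rsum = x.flip(1).cumsum(dim=1).flip(1)
        right = torch.zeros_like(x)
        right[:, :-1] = rsum[:, 1:] / counts.flip(1)[:, 1:]  # mean of words after t
        # Item-wise filters select features from the word and its contexts.
        w = torch.sigmoid(self.word_filter(token_ids)) * x
        l = torch.sigmoid(self.left_filter(token_ids)) * left
        r = torch.sigmoid(self.right_filter(token_ids)) * right
        # Non-linear transformation merges and activates the selected features.
        return torch.tanh(self.merge(torch.cat([w, l, r], dim=-1)))


if __name__ == "__main__":
    model = WordContextAttention(vocab_size=10000, dim=64)
    out = model(torch.randint(0, 10000, (2, 12)))
    print(out.shape)  # torch.Size([2, 12, 64])
```

The point the sketch tries to capture is spatial specificity: the filters applied to the left context, the right context, and the word itself depend on the word's identity rather than being shared across all positions, which is what allows flexible, word-dependent context patterns.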

Notes

  1. This dataset contains about 87.5K URLs, of which one third are flagged as spam and the rest are not. The dataset is available at https://www.kaggle.com/shivamb/spam-url-prediction.

  2. This dataset contains cleaned tweets from India on topics such as coronavirus, COVID-19, and lockdown. The tweets were collected between 23 March 2020 and 15 July 2020 and then labeled with four sentiment categories: fear, sad, anger, and joy. The dataset is available at https://www.kaggle.com/surajkum1198/twitterdata. (A minimal loading sketch for both datasets follows these notes.)

  3. https://www.kaggle.com/shoumikgoswami/annotated-gmb-corpus.
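Where useful, the two labelled datasets in Notes 1 and 2 can presumably be inspected as plain CSV files after download. The snippet below is only a loading sketch; the file names and column names (`label`, `sentiment`) are assumptions and may not match the actual Kaggle downloads.

```python
import pandas as pd

# Note 1: spam-URL data (~87.5K URLs, roughly one third flagged as spam).
# File and column names below are assumptions.
spam_df = pd.read_csv("spam_url_prediction.csv")
print(spam_df["label"].value_counts(normalize=True))   # expect ~1/3 spam

# Note 2: Indian COVID-19 tweets labelled fear / sad / anger / joy,
# collected between 23 March 2020 and 15 July 2020.
tweets_df = pd.read_csv("twitterdata.csv")
print(tweets_df["sentiment"].value_counts())            # four sentiment classes
```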

Acknowledgements

This work was partially supported by the National Key R&D Programs of China (2018YFC1603800, 2018YFC1603802, 2020YFA0908700, 2020YFA0908702), the National Natural Science Foundation of China (61772288, 61872115) and the Natural Science Foundation of Tianjin City (18JCZDJC30900).

Author information

Corresponding author

Correspondence to Chengkai Piao.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Piao, C., Wang, Y., Zhu, Y. et al. Word-Context Attention for Text Representation. Neural Process Lett 55, 11721–11738 (2023). https://doi.org/10.1007/s11063-023-11396-w
