Identical and Fraternal Twins: Fine-Grained Semantic Contrastive Learning of Sentence Representations

Xiao, Qingfa; Li, Shuangyin; Chen, Lei

doi:10.3233/FAIA230584

arXiv:2307.10932 (cs)

[Submitted on 20 Jul 2023 (v1), last revised 14 Sep 2023 (this version, v2)]

Abstract: Contrastive learning has significantly improved the unsupervised learning of sentence representations. The approach pulls an augmented positive instance toward its anchor instance to shape the desired embedding space. However, relying solely on the contrastive objective can yield sub-optimal results, because the objective cannot differentiate subtle semantic variations between positive pairs. Specifically, common data augmentation techniques frequently introduce semantic distortion, which creates a semantic margin between the two members of a positive pair. The InfoNCE loss function overlooks this margin and simply maximizes the similarity between positive pairs during training, so the trained model becomes insensitive to fine-grained semantic differences. In this paper, we introduce a novel Identical and Fraternal Twins of Contrastive Learning framework (named IFTCL), which can simultaneously adapt to the various positive pairs generated by different augmentation techniques. To overcome the sub-optimality issue, we propose a Twins Loss that preserves the innate margin during training and better exploits the potential of data augmentation. We also present proof-of-concept experiments, combined with the contrastive objective, that demonstrate the validity of the proposed Twins Loss. Furthermore, we propose a hippocampus queue mechanism that stores and reuses negative instances without additional computation, which further improves the efficiency and performance of IFTCL. We evaluate the IFTCL framework on nine semantic textual similarity tasks covering both English and Chinese datasets, and the experimental results show that IFTCL outperforms state-of-the-art methods.
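
For reference, the InfoNCE objective mentioned in the abstract can be sketched as follows. This is a generic in-batch formulation and not the paper's Twins Loss, whose form the abstract does not give. The function names, the temperature of 0.05, and the toy vectors are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss with in-batch negatives.

    anchors[i] and positives[i] form a positive pair; every
    positives[j] with j != i serves as a negative for anchors[i].
    The temperature value is an illustrative assumption.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        losses.append(log_denom - logits[i])
    return sum(losses) / len(losses)


if __name__ == "__main__":
    anchors = [[1.0, 0.0], [0.0, 1.0]]
    # Positives identical to their anchors (no augmentation distortion).
    aligned = [[1.0, 0.0], [0.0, 1.0]]
    # Positives shifted away from their anchors (toy "distorted" augmentations).
    distorted = [[0.8, 0.6], [0.6, 0.8]]
    print("aligned loss:  ", info_nce(anchors, aligned))
    print("distorted loss:", info_nce(anchors, distorted))
```

In this toy setting the loss is lower when each positive coincides with its anchor, so minimizing it keeps pushing positive pairs together regardless of how much the augmentation changed the meaning. That is the behaviour the abstract describes as overlooking the semantic margin.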

Comments:	This article has been accepted for publication at the European Conference on Artificial Intelligence (ECAI 2023). 9 pages, 4 figures
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2307.10932 [cs.CL]
	(or arXiv:2307.10932v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2307.10932
Related DOI:	https://doi.org/10.3233/FAIA230584

Submission history

From: Qingfa Xiao
[v1] Thu, 20 Jul 2023 15:02:42 UTC (565 KB)
[v2] Thu, 14 Sep 2023 06:09:34 UTC (804 KB)
