Not All Negatives are Equally Negative: Soft Contrastive Learning for Unsupervised Sentence Representations

Files

hdl_146800.pdf (2.25 MB)
  (Published version)

Date

2024

Authors

Zhuang, H.
Emma Zhang, W.
Yang, J.
Chen, W.
Sheng, Q.Z.

Type:

Conference paper

Citation

Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM 2024), 2024, pp. 3591-3601

Statement of Responsibility

Haojie Zhuang, Wei Emma Zhang, Jian Yang, Weitong Chen, Quan Z. Sheng

Conference Name

33rd ACM International Conference on Information and Knowledge Management (CIKM) (21 Oct 2024 - 25 Oct 2024 : Boise, Idaho, USA)

Abstract

Contrastive learning has been extensively studied for sentence representation learning, as it demonstrates effectiveness in various downstream applications: the same sentence with different dropout masks (or other augmentation methods) is treated as a positive pair, while other sentences in the same mini-batch serve as negative pairs. However, these methods mostly treat all negative examples equally and overlook the varying similarities between the negative examples and the anchors, and thus fail to capture the fine-grained semantic information of the sentences. To address this issue, we explicitly differentiate the negative examples by their similarities with the anchor, and propose a simple yet effective method, SoftCSE, that individualizes either the weight or the temperature of each negative pair in the standard InfoNCE loss according to the similarity between the negative example and the anchor. We further provide a theoretical analysis of our method to show why and how SoftCSE works, covering the optimal solution, gradient analysis, and the connection with other losses. Empirically, we conduct extensive experiments on semantic textual similarity (STS) and transfer (TR) tasks, as well as text retrieval and reranking, and observe significant performance improvements over strong baseline models.
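To make the idea concrete, the sketch below shows a soft-weighted variant of the InfoNCE loss in which each negative term is scaled by a weight derived from its cosine similarity to the anchor, so that more similar (harder) negatives contribute more to the loss. This is an illustrative re-implementation under assumed choices (the exponential weighting with scale `alpha`, the mean normalization, and the function name `soft_infonce` are hypothetical), not the paper's exact formulation:

```python
import numpy as np

def soft_infonce(anchor, positive, negatives, tau=0.05, alpha=1.0):
    """Soft-weighted InfoNCE sketch: negatives more similar to the anchor
    get a larger weight. `alpha` controls how sharply weights grow with
    similarity (alpha=0 recovers the standard, uniformly weighted loss)."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

    pos_sim = cos(anchor, positive)
    neg_sims = np.array([cos(anchor, n) for n in negatives])

    # Hypothetical weighting: exponential in similarity, normalized so the
    # average weight is 1 and the loss scale stays comparable to InfoNCE.
    weights = np.exp(alpha * neg_sims)
    weights = weights / weights.mean()

    numer = np.exp(pos_sim / tau)
    denom = numer + np.sum(weights * np.exp(neg_sims / tau))
    return -np.log(numer / denom)

anchor = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])
easy_neg = [np.array([0.0, 1.0])]   # nearly orthogonal to the anchor
hard_neg = [np.array([0.8, 0.2])]   # close to the anchor

loss_easy = soft_infonce(anchor, positive, easy_neg)
loss_hard = soft_infonce(anchor, positive, hard_neg)
```

As expected, the harder negative yields a larger loss, which is the behavior the weighting is designed to amplify relative to the uniform treatment of negatives in standard InfoNCE.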

Rights

© 2024 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution 4.0 International License.
