
Pcc-tuning: Breaking the Contrastive Learning Ceiling in Semantic Textual Similarity

Our paper is the first to propose and substantiate a theoretical upper bound on the performance of contrastive learning methods. In addition, Pcc-tuning is the first method to achieve Spearman's correlation scores above 87 on standard STS benchmarks, a significant advance for the field.
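STS benchmarks score a model by the Spearman correlation between its predicted sentence similarities and human gold ratings. A minimal illustration of the metric with `scipy` (toy numbers for demonstration only, not results from the paper):

```python
from scipy.stats import spearmanr

# Toy example: gold similarity ratings vs. a model's cosine similarities.
human_scores = [0.0, 1.2, 2.5, 3.8, 4.6]
model_scores = [0.10, 0.35, 0.42, 0.80, 0.91]

# Spearman's rho depends only on rank order, so a perfectly
# monotonic relationship scores 1.0 (i.e., 100 on the 0-100 scale).
rho, _ = spearmanr(human_scores, model_scores)
print(f"Spearman's rho: {rho:.2f}")  # → Spearman's rho: 1.00
```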

This paper has been accepted to EMNLP 2024 (Main Conference).


Results

(Figure: main results table)


Data


Checkpoints


Quick Start

  • Python Version: 3.9.18

  • Install Dependencies

    cd code
    pip install -r requirements.txt

  • Download SentEval

    cd SentEval/data/downstream/
    bash download_dataset.sh

  • Stage One

    cd code
    nohup torchrun --nproc_per_node=4 train.py > nohup.out &  # 4 x RTX 4090

  • Stage Two

    cd code
    nohup torchrun --nproc_per_node=4 tune.py > nohup.out &  # 4 x RTX 4090
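The commands above assume four GPUs. A sketch for a machine with a different GPU count (hypothetical: two GPUs; the same `train.py` and `tune.py` scripts are used, only `--nproc_per_node` changes):

```shell
cd code
# Stage one: adjust --nproc_per_node to the number of available GPUs.
nohup torchrun --nproc_per_node=2 train.py > stage1.out &
wait  # let stage one finish before starting stage two
# Stage two: fine-tuning on top of the stage-one checkpoint.
nohup torchrun --nproc_per_node=2 tune.py > stage2.out &
```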

Acknowledgement

  • Our code is based on PromptEOL

Related Work