This is a PyTorch implementation of the paper "Understanding Hard Negatives in Noise Contrastive Estimation" [1].
The experiments were run with Python 3.7.9, transformers 3.1.0, and PyTorch 1.7.1 using NVIDIA A100 (CUDA version 11.2).
Download the public zeshel data here [2].
Train the retriever:
python main_retriever.py --model [model saving path] --data_dir [zeshel data directory] --B 16 --gradient_accumulation_steps 2 --logging_steps 1000 --k 64 --epochs 4 --lr 0.00001 --num_cands 64 --type_cands mixed_negative --cands_ratio 0.5 --gpus 3,4,5,7 --type_model sum_max --num_mention_vecs 128 --num_entity_vecs 128 --store_en_hiddens --en_hidden_path [the path for saving all the entity embeddings] --entity_bsz 4096 --mention_bsz 200
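The retriever command above passes --type_cands mixed_negative with --cands_ratio 0.5 and --num_cands 64. As a rough illustration of what a mixed-negative scheme can look like, the sketch below fills a fraction of the negatives with the entities the model currently scores highest (hard negatives) and the rest with uniformly random entities. This reading of the flags is an assumption, and the function and parameter names (sample_mixed_negatives, hard_ratio) are hypothetical rather than taken from the repository.

```python
import heapq
import random


def sample_mixed_negatives(scores, gold, num_cands=64, hard_ratio=0.5, seed=None):
    """Return negative entity indices: top-scoring ones plus uniform random ones.

    scores: one model score per entity (higher means closer to the mention).
    gold: index of the correct entity; it is never returned as a negative.
    Assumption: hard_ratio plays the role of --cands_ratio (hypothetical mapping).
    """
    rng = random.Random(seed)
    negatives = [i for i in range(len(scores)) if i != gold]
    num_cands = min(num_cands, len(negatives))
    num_hard = int(num_cands * hard_ratio)

    # Hard negatives: the highest-scoring incorrect entities.
    hard = heapq.nlargest(num_hard, negatives, key=lambda i: scores[i])

    # Random negatives: drawn without replacement from the remaining entities.
    hard_set = set(hard)
    rest = [i for i in negatives if i not in hard_set]
    rand = rng.sample(rest, num_cands - num_hard)
    return hard + rand


if __name__ == "__main__":
    toy_scores = [float(i) for i in range(200)]
    picked = sample_mixed_negatives(toy_scores, gold=199, seed=0)
    print(len(picked), picked[:5])
```

With hard_ratio set to 0 the sketch reduces to purely random negatives, and with 1 to purely hard negatives, which makes it easy to compare the settings on toy scores.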
Save the retrieved candidates using the trained retriever:
python save_candidates.py --model [pretrained model path] --data_dir [Zeshel data directory] --pre_model Bert --type_model sum_max --num_mention_vecs 128 --num_entity_vecs 128 --entity_bsz 1024 --mention_bsz 200 --store_en_hiddens --en_hidden_path [the path for saving all the entity embeddings] --num_cands 64 --cands_dir [the directory for saving the candidates] --gpus 0
Train and evaluate the reranker on the saved candidates:
python main_reranker.py --model [model saving path] --data [zeshel data directory] --B 2 --gradient_accumulation_steps 2 --num_workers 2 --warmup_proportion 0.2 --epochs 3 --gpus 5 --lr 2e-5 --cands_dir [candidates file directory] --eval_method [micro or macro] --type_model full --type_bert [base/large] --inputmark [--fp16]
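The --eval_method flag in the reranker command takes micro or macro. Assuming it follows the usual convention, micro accuracy pools all mentions together, while macro accuracy first computes accuracy per world (domain) and then averages across worlds. The sketch below illustrates the difference on toy data; the function name and the (world, is_correct) record format are hypothetical and not the repository's.

```python
from collections import defaultdict


def micro_macro_accuracy(records):
    """Compute (micro, macro) accuracy from (world, is_correct) pairs."""
    per_world = defaultdict(lambda: [0, 0])  # world -> [num_correct, num_total]
    for world, is_correct in records:
        per_world[world][0] += int(is_correct)
        per_world[world][1] += 1

    total_correct = sum(c for c, _ in per_world.values())
    total = sum(n for _, n in per_world.values())

    micro = total_correct / total  # every mention counts equally
    macro = sum(c / n for c, n in per_world.values()) / len(per_world)  # every world counts equally
    return micro, macro


if __name__ == "__main__":
    toy = [("a", True), ("a", True), ("a", True), ("a", False),
           ("b", False), ("b", True)]
    print(micro_macro_accuracy(toy))
```

The two numbers diverge whenever worlds differ in size, so it helps to state which one is being reported when comparing results.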
[1] Understanding Hard Negatives in Noise Contrastive Estimation (Zhang and Stratos, 2021)
@article{zhang2021understanding,
title={Understanding Hard Negatives in Noise Contrastive Estimation},
author={Zhang, Wenzheng and Stratos, Karl},
journal={arXiv preprint arXiv:2104.06245},
year={2021}
}