/vp-pairwise

Primary language: Python. License: MIT.

Few-shot Learning + Interpolation for Classification in Low-resource Dialogue Systems

In this project, we implement a novel training framework to handle class imbalance in low-resource dialogue systems such as the Virtual Patient project. We combine the contrastive loss [1] with a 1-nearest-neighbor search to improve generalization for rare classes. We additionally pair this with a "mixup"-based [2] KL-divergence loss, a data-augmentation technique that also helps maintain performance on frequent classes.
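The ingredients named above can be illustrated with a minimal, framework-free sketch. It is not the repository's PyTorch implementation. The function names, the margin value, the use of Euclidean distance, and the choice to mix feature vectors (rather than some other representation) are all assumptions made for illustration. The contrastive loss follows the pairwise form in [1], and the interpolation follows the mixup formulation in [2].

```python
import math
import random


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_class, margin=1.0):
    """Pairwise contrastive loss in the style of Hadsell et al. [1].

    Same-class pairs are pulled together; different-class pairs are pushed
    apart until they are at least `margin` away (margin is an assumed value).
    """
    d = euclidean(u, v)
    if same_class:
        return 0.5 * d ** 2
    return 0.5 * max(0.0, margin - d) ** 2


def predict_1nn(query, train_embeddings, train_labels):
    """Label a query embedding with the label of its nearest training point."""
    best = min(range(len(train_embeddings)),
               key=lambda i: euclidean(query, train_embeddings[i]))
    return train_labels[best]


def mixup(x1, y1, x2, y2, lam):
    """Linearly interpolate two examples and their label distributions [2]."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)


if __name__ == "__main__":
    # In mixup, the mixing weight is drawn from Beta(alpha, alpha);
    # alpha = 0.4 here is an arbitrary illustrative value.
    lam = random.betavariate(0.4, 0.4)
    x_mix, y_mix = mixup([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], lam)
    # Compare the soft mixed label against a hypothetical model prediction.
    print(kl_divergence(y_mix, [0.5, 0.5]))
```

In this sketch, the contrastive term shapes the embedding space that `predict_1nn` searches, while the KL term compares a model's predicted distribution against the soft label produced by `mixup`. At scale, the nearest-neighbor step would use an indexed search such as FAISS instead of the linear scan shown here.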

We implement this with three underlying architectures:

  • Text-CNN [3]
  • Self-attention RNN [4]
  • BERT [5]

Requirements

All code was developed on Python 3.7. Additional requirements include:

  • pytorch >= 1.4.0
  • transformers >= 3.0.2
  • pretrained BERT model bert-base-uncased
  • FAISS toolkit for efficient nearest-neighbor search

Usage

  • For hyperparameter tuning, run: bash run_gs.sh. Logs are saved to the file specified by --validation-log.
  • For testing, run: bash run_test.sh.

Specify the path to the pretrained BERT model with --prebert-path. All other arguments are listed in the parser definition in util.py.
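For readers unfamiliar with how such flags are typically declared, here is a hypothetical sketch of an argparse parser exposing the two options mentioned above. It is not the actual contents of util.py. The function name `build_parser`, the help strings, and the defaults are assumptions, and the real parser defines additional arguments.

```python
import argparse


def build_parser():
    """Hypothetical parser with the two flags named in this README."""
    parser = argparse.ArgumentParser(
        description="Illustrative argument parser (not the real util.py).")
    # Path to the directory holding the pretrained BERT weights.
    parser.add_argument("--prebert-path", type=str, required=True,
                        help="path to the pretrained BERT model")
    # File that receives validation logs during hyperparameter tuning.
    parser.add_argument("--validation-log", type=str, default=None,
                        help="file in which validation logs are saved")
    return parser


if __name__ == "__main__":
    # argparse maps "--prebert-path" to the attribute `prebert_path`.
    args = build_parser().parse_args()
    print(args.prebert_path, args.validation_log)
```

With a parser of this shape, the shell scripts would pass the flags straight through to the Python entry point.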

References

[1] Raia Hadsell, Sumit Chopra, and Yann LeCun, “Dimensionality reduction by learning an invariant mapping,” in 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06). IEEE, 2006.

[2] Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, and David Lopez-Paz, “mixup: Beyond empirical risk minimization,” in ICLR 2018.

[3] Yoon Kim, “Convolutional neural networks for sentence classification,” in EMNLP 2014.

[4] Zhouhan Lin, Minwei Feng, Cicero Nogueira dos Santos, Mo Yu, Bing Xiang, Bowen Zhou, and Yoshua Bengio, “A structured self-attentive sentence embedding,” in ICLR 2017.

[5] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova, “BERT: Pre-training of deep bidirectional transformers for language understanding,” in NAACL 2019.

Citation

@article{sunder2020handling,
  title={Handling Class Imbalance in Low-Resource Dialogue Systems by Combining Few-Shot Classification and Interpolation},
  author={Sunder, Vishal and Fosler-Lussier, Eric},
  journal={ICASSP},
  year={2021}
}