
Contrastive Learning for Improving ASR Robustness in Spoken Language Understanding [ICASSP22]


Abstract

  • improves ASR robustness by contrastive pretraining of RoBERTa, pulling a clean transcript and its ASR hypothesis toward similar representations
  • fine-tunes with self-distillation to reduce the label noise caused by ASR errors

Method

Pretraining

[Figure: overview of the contrastive pretraining stage]
  • improves robustness using textual information only (a loss sketch follows this list)
    • prior works needed additional speech-related input features
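A minimal sketch of what such a text-only contrastive objective could look like, assuming an InfoNCE-style loss over (clean transcript, ASR transcript) embedding pairs with in-batch negatives; the function name, the `temperature` value, and the use of pooled sentence embeddings are illustrative assumptions, not the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def contrastive_pretrain_loss(clean_emb: torch.Tensor,
                              asr_emb: torch.Tensor,
                              temperature: float = 0.1) -> torch.Tensor:
    """InfoNCE-style loss pulling each clean-transcript embedding toward
    the embedding of its own ASR hypothesis, with the other ASR
    hypotheses in the batch acting as negatives.

    clean_emb, asr_emb: (batch, dim) sentence embeddings, e.g. RoBERTa
    pooled outputs for the manual transcript and its ASR output.
    """
    clean = F.normalize(clean_emb, dim=-1)
    asr = F.normalize(asr_emb, dim=-1)
    # (batch, batch) cosine-similarity matrix; the diagonal holds the positives
    logits = clean @ asr.t() / temperature
    targets = torch.arange(clean.size(0), device=clean.device)
    return F.cross_entropy(logits, targets)
```

Feeding the clean and ASR versions of each utterance through the same encoder and minimizing this loss pushes their representations together, which is the intuition behind gaining ASR robustness from text alone.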

Fine-tuning

[Figure: overview of the fine-tuning stage with self-distillation]
  • the fine-tuning loss combines hard, soft, and self-distillation losses (a sketch follows this list)
  • the self-distillation term is the KL divergence between the current and previous predictions for the same input
  • the soft loss is computed from similarity scores, which serve as soft targets
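A rough sketch of how the three terms could be combined, assuming the soft loss is a cross-entropy against similarity-derived soft targets and that self-distillation treats (detached) predictions saved from the previous epoch as the teacher; the weights `w_soft`/`w_self` and the KL direction are assumptions, not taken from the paper:

```python
import torch
import torch.nn.functional as F

def finetune_loss(logits: torch.Tensor,
                  labels: torch.Tensor,
                  prev_logits: torch.Tensor,
                  soft_targets: torch.Tensor,
                  w_soft: float = 1.0,
                  w_self: float = 1.0) -> torch.Tensor:
    """Hard CE + soft-target CE + self-distillation KL.

    logits:       (batch, classes) current model predictions
    labels:       (batch,) gold labels (possibly noisy w.r.t. the ASR text)
    prev_logits:  (batch, classes) predictions saved from the previous epoch
    soft_targets: (batch, classes) similarity-derived target distribution
    """
    log_p = F.log_softmax(logits, dim=-1)
    # hard loss: standard cross-entropy with the given labels
    hard = F.cross_entropy(logits, labels)
    # soft loss: cross-entropy against the soft target distribution
    soft = -(soft_targets * log_p).sum(dim=-1).mean()
    # self-distillation: KL(prev || current), previous predictions as teacher
    prev_p = F.softmax(prev_logits.detach(), dim=-1)
    self_dist = F.kl_div(log_p, prev_p, reduction="batchmean")
    return hard + w_soft * soft + w_self * self_dist
```

Because the previous-epoch predictions are detached, the self-distillation term only regularizes the current model toward its earlier behavior, which is how it can dampen the effect of labels made noisy by ASR errors.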