Name | Paper | Venue | Organization | Source | Pre-trained model |
---|---|---|---|---|---|
vq-APC | Vector-Quantized Autoregressive Predictive Coding | Interspeech 2020 | MIT | code | |
wav2vec 2.0 | wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | NeurIPS 2020 | Facebook AI Research | code | models |
CPC_audio | Unsupervised Pretraining Transfers Well Across Languages | ICASSP 2020 | Facebook AI Research | code | |
APC | An unsupervised autoregressive model for speech representation learning | Interspeech 2019 | MIT | | |
vq-wav2vec | vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations | ICLR 2020 | Facebook AI Research | code | models |
wav2vec | wav2vec: Unsupervised Pre-Training for Speech Recognition | Interspeech 2019 | Facebook AI Research | code | models |
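Several of the entries above (wav2vec, vq-wav2vec, CPC) train on a contrastive InfoNCE objective: score a predicted future representation against the true future frame and a set of negatives. A minimal NumPy sketch of that loss on toy random features (all names here are illustrative, not taken from any of the listed repos):

```python
import numpy as np

def info_nce(pred, pos, negs, temp=0.1):
    """InfoNCE loss for one prediction step.
    pred: (d,) predicted future representation
    pos:  (d,) true future representation (the positive)
    negs: (k, d) negatives sampled from other timesteps
    """
    cands = np.vstack([pos, negs])  # (k+1, d), positive at index 0
    # cosine similarity between the prediction and each candidate
    sims = cands @ pred / (np.linalg.norm(cands, axis=1) * np.linalg.norm(pred) + 1e-8)
    logits = sims / temp
    logits -= logits.max()  # numerical stability before softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])  # cross-entropy with the positive as the target

rng = np.random.default_rng(0)
d, k = 16, 8
pos = rng.normal(size=d)
# a prediction identical to the positive should be rewarded (low loss) ...
loss_easy = info_nce(pos, pos, rng.normal(size=(k, d)))
# ... while an unrelated prediction should be penalized (high loss)
loss_hard = info_nce(rng.normal(size=d), pos, rng.normal(size=(k, d)))
```

The real models compute this over batches of timesteps with learned encoders; this sketch only shows the shape of the objective.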
Name | Paper | Venue | Organization | Source | Pre-trained model |
---|---|---|---|---|---|
| | Self-training and Pre-training are Complementary for Speech Recognition | | Facebook AI Research | | |
Name | Paper | Venue | Organization | Source | Pre-trained model |
---|---|---|---|---|---|
| | Towards Semi-Supervised Semantics Understanding from Speech | | MIT/Amazon | | |
| | The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling | | | | |
C/A/MPC | Similarity Analysis of Self-Supervised Speech Representations | | MIT | | |
DAPC | Representation Learning for Sequence Data with Deep Autoencoding Predictive Components | | Cornell | | |
BertVideo | VideoBERT: A Joint Model for Video and Language Representation Learning | | | | |
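APC (listed above) trains a model to predict a frame a few steps ahead and keeps the hidden state as the representation. A minimal NumPy sketch of the objective, using a linear predictor fit by least squares and a squared-error variant of the loss for simplicity (the paper uses an RNN and L1 loss; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, k = 100, 8, 3  # frames, feature dim, prediction shift

# toy "speech" features: a smooth random walk, so nearby frames correlate
frames = np.cumsum(rng.normal(scale=0.1, size=(T, d)), axis=0)

# linear predictor W mapping frame t -> frame t+k, fit by least squares
X, Y = frames[:-k], frames[k:]
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

# APC-style objective: reconstruction error on the k-step-shifted frames
apc_mse = ((X @ W - Y) ** 2).mean()

# baseline: just copy the current frame forward; the fitted predictor
# can never do worse than this in mean squared error
copy_mse = ((X - Y) ** 2).mean()
```

The least-squares solution is optimal over all linear predictors, including the identity, so `apc_mse` is bounded above by `copy_mse`; the interesting quantity in APC is the representation learned while minimizing this loss, not the loss itself.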
- Hierarchical k-means
- APC + CPC + MPC (joint training / conditional training / fine-tuning)
- PI
- Short term action
- Metrics
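The hierarchical k-means note above presumably means clustering learned representations coarsely, then re-clustering within each coarse cluster to get finer discrete codes. A minimal two-level NumPy sketch under that assumption (function names are illustrative):

```python
import numpy as np

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means: returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    cent = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        dist = ((x[:, None, :] - cent[None]) ** 2).sum(-1)  # squared distances
        lab = dist.argmin(1)
        for j in range(k):
            if (lab == j).any():
                cent[j] = x[lab == j].mean(0)
    return cent, lab

def hierarchical_kmeans(x, k_top, k_sub):
    """Two-level clustering: coarse clusters, then k-means inside each."""
    _, top = kmeans(x, k_top)
    labels = np.zeros(len(x), dtype=int)
    for j in range(k_top):
        idx = np.where(top == j)[0]
        if len(idx) >= k_sub:
            _, sub = kmeans(x[idx], k_sub, seed=j + 1)
        else:
            sub = np.zeros(len(idx), dtype=int)
        labels[idx] = j * k_sub + sub  # flat code per leaf cluster
    return labels

rng = np.random.default_rng(1)
# toy data: two well-separated blobs of 50 points each
x = np.vstack([rng.normal(0, 0.1, (50, 4)), rng.normal(5, 0.1, (50, 4))])
codes = hierarchical_kmeans(x, k_top=2, k_sub=2)
```

In practice the same idea is applied to frame-level features from a pre-trained encoder, giving a discrete unit inventory with coarse-to-fine structure.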