This repository contains code for our ASRU 2019 paper titled "Acoustic model adaptation from raw waveforms with SincNet". The aim is to explore the adaptation of the SincNet layer (filter parameters and amplitudes) to speakers and domains.
The code is a little messy. I hope to clean it up soon, time permitting. If you have any questions or problems, please get in touch.
Much of the code is built on the work by Ondrej for Learning to Adapt.
This work is the result of a collaboration with my co-authors Ondrej Klejch, Erfan Loweimi, Peter Bell, and Steve Renals.
The code has been run with:
- Keras 2.2.2
- Tensorflow 1.10.0
- PyKaldi
- Kaldi
For training from scratch, see experiments/ami/train_sinc_40_flat_6epochs.sh. For speaker adaptation, see experiments/ami/adapt_pfstar_40_flat_speaker_lhuc0+sinc.sh. The layers to be adapted (LHUC0, LHUC1, LHUC0+Sinc, etc.) are determined by an argument to adapt_pfstar_40_flat_speaker.py. The above scripts assume an existing tri3 model of AMI (or a different dataset). They also look for pdf_counts in the main directory, which is equivalent to, e.g., tri3/final.occs.
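The LHUC options above refer to Learning Hidden Unit Contributions, which rescales hidden-unit outputs with speaker-dependent amplitudes. A minimal NumPy sketch of the idea follows, assuming the common 2*sigmoid parameterisation. The function name lhuc_scale and its arguments are hypothetical, and the layer in this repository may be implemented differently.

```python
import numpy as np


def lhuc_scale(hidden, r):
    """Rescale hidden activations with per-unit, speaker-dependent amplitudes.

    hidden: array of shape (frames, units), the output of one hidden layer.
    r:      array of shape (units,), the only parameters updated for a speaker.
    """
    # 2 * sigmoid(r) lies in (0, 2); r = 0 gives an amplitude of exactly 1,
    # so an unadapted speaker leaves the layer output unchanged.
    amplitude = 2.0 / (1.0 + np.exp(-r))
    return hidden * amplitude


# Hypothetical usage: 4 frames, 3 hidden units, one unit damped for a speaker.
hidden = np.ones((4, 3))
r = np.array([0.0, 0.0, -2.0])
adapted = lhuc_scale(hidden, r)
```

Because only r changes per speaker, the rest of the acoustic model can stay fixed during adaptation.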
For research using this work, please cite:
@inproceedings{Fainberg2019,
author={Joachim Fainberg and Ondřej Klejch and Erfan Loweimi and Peter Bell and Steve Renals},
title={{Acoustic Model Adaptation from Raw Waveforms with SincNet}},
booktitle={ASRU},
year=2019
}
Our work builds on a paper by Ravanelli and Bengio. They have a SincNet implementation for PyTorch.
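For readers new to SincNet, the filter parameters and amplitudes mentioned at the top can be illustrated with a minimal NumPy sketch. It is not taken from this repository or from the PyTorch implementation: the function name sinc_bandpass, its arguments, the Hamming window, the 16 kHz sample rate, and the scalar gain standing in for a filter amplitude are all assumptions made for illustration.

```python
import numpy as np


def sinc_bandpass(low_hz, band_hz, gain=1.0, kernel_size=129, sample_rate=16000):
    """Build one windowed band-pass FIR kernel from a low cutoff, a bandwidth and a gain."""
    # Cutoffs normalised to cycles per sample; abs() keeps them non-negative.
    f1 = abs(low_hz) / sample_rate
    f2 = f1 + abs(band_hz) / sample_rate
    # Sample positions centred on zero so the kernel is symmetric.
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # np.sinc(x) = sin(pi x) / (pi x), so 2 f sinc(2 f n) is an ideal low-pass
    # with cutoff f. The difference of two low-passes is a band-pass.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # Hamming window (an assumption here) to reduce ripple from truncation.
    return gain * kernel * np.hamming(kernel_size)


# Hypothetical usage: a filter passing roughly 1000 to 2000 Hz.
kernel = sinc_bandpass(low_hz=1000.0, band_hz=1000.0)
```

In this sketch each filter is described by three values (low_hz, band_hz and gain), which is what makes such a layer cheap to adapt compared with a standard convolutional layer that learns every tap.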