PyTorch version of single-channel speech enhancement with CLDNN

author: yxhu

thanks to Ke Wang and awni's repo

How to use it?

1. install requirements

sru

git clone https://github.com/asappresearch/sru.git
cd sru 
pip install -r requirements.txt
python setup.py install

pypesq

git clone https://github.com/vBaiCai/python-pesq.git
cd python-pesq
python setup.py install

pystoi

pip install pystoi

2. download dataset

Aishell-1

https://www.openslr.org/33/

MUSAN

https://www.openslr.org/17/

3. prepare train, dev and test data

find ${aishell_dir}/train -iname "*.wav" > train.lst
find ${aishell_dir}/dev -iname "*.wav" > dev.lst
find ${aishell_dir}/test -iname "*.wav" > test.lst
find ${musan}/test -iname "*.wav" > noise.lst # Attention: do not add musan/speech to noise.lst

bash tools/prepare_train.sh
bash tools/prepare_test.sh
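
The warning about keeping musan/speech out of noise.lst can also be enforced when the list is built. The following is a minimal sketch in Python and is not part of this repo: the helper name `build_noise_list`, its arguments, and the assumption that the speech files live in a directory literally named `speech` are all illustrative.

```python
import os


def build_noise_list(musan_root, out_path, skip_dirs=("speech",)):
    """Write every .wav path under musan_root to out_path, skipping skip_dirs.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    kept = []
    for dirpath, dirnames, filenames in os.walk(musan_root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = sorted(d for d in dirnames if d.lower() not in skip_dirs)
        for name in sorted(filenames):
            # Mirror the case-insensitive match of `find -iname "*.wav"`.
            if name.lower().endswith(".wav"):
                kept.append(os.path.join(dirpath, name))
    with open(out_path, "w") as f:
        for path in kept:
            f.write(path + "\n")
    return kept
```

Calling `build_noise_list("/path/to/musan", "noise.lst")` would then produce a list with the same one-path-per-line layout as the `find` commands above.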

4. train model

bash run_sruc.sh

5. eval model

bash eval.sh

update log:

2019-04-21

add clip_grad_norm in step/run_cldnn.py
fix memory leak bug in ./tools/dataset.py's collat_fn: return torch.tensor instead of numpy.array, which can lead to a memory leak
change mse in frames to mse in samples
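
Gradient-norm clipping rescales all gradients together when their combined L2 norm exceeds a threshold. In PyTorch this is done by `torch.nn.utils.clip_grad_norm_`; the pure-Python sketch below only illustrates the arithmetic. The function name `clip_by_global_norm` and any threshold value are assumptions, not taken from step/run_cldnn.py.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale gradient vectors so their joint L2 norm is at most max_norm.

    Illustrative only: `grads` is a list of lists of floats standing in for
    per-parameter gradients. Returns (clipped_grads, norm_before_clipping).
    """
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # The scale factor is capped at 1 so small gradients are left unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm
```

In a training loop the equivalent PyTorch call would sit between `loss.backward()` and `optimizer.step()`.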

2019-06-30

change the conv2d padding to behave like tf's 'same' padding
add SRU
add 1k6h train strategy: warmup
add eval
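
TensorFlow's 'same' padding chooses the padding so that the output length is ceil(input / stride), and puts any odd leftover on the trailing side. Below is a minimal sketch of that calculation for a single dimension, assuming dilation 1. `same_padding` is a hypothetical helper name; how this repo applies the result to its conv2d layers is not shown here.

```python
import math


def same_padding(in_size, kernel_size, stride=1):
    """Return (pad_before, pad_after) for TF-style 'same' padding, dilation 1."""
    out_size = math.ceil(in_size / stride)
    # Total padding needed so the last window still fits; never negative.
    total = max((out_size - 1) * stride + kernel_size - in_size, 0)
    before = total // 2
    after = total - before  # the extra element, if any, goes at the end
    return before, after
```

Because the two sides can differ, one way to use the result in PyTorch would be to pad the input explicitly with `torch.nn.functional.pad` and then run the convolution with no built-in padding.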

2019-09-20

update data preparation pipeline
add psa, psm
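
psa and psm are commonly used as abbreviations for phase-sensitive approximation and phase-sensitive mask. That reading is an assumption here, since the log does not spell them out. Under that assumption, a minimal NumPy sketch of the two quantities follows; the function names, the `eps` guard, and the [0, 1] clipping range are illustrative choices and not taken from this repo.

```python
import numpy as np


def phase_sensitive_mask(clean_spec, noisy_spec, eps=1e-8, clip=(0.0, 1.0)):
    """PSM = |S| / |Y| * cos(theta_S - theta_Y), clipped to `clip`.

    clean_spec and noisy_spec are complex STFT arrays of the same shape.
    """
    ratio = np.abs(clean_spec) / (np.abs(noisy_spec) + eps)
    phase_diff = np.angle(clean_spec) - np.angle(noisy_spec)
    return np.clip(ratio * np.cos(phase_diff), clip[0], clip[1])


def psa_loss(est_mask, clean_spec, noisy_spec):
    """PSA objective: MSE between the masked noisy magnitude and the
    clean magnitude scaled by cos(theta_S - theta_Y)."""
    phase_diff = np.angle(clean_spec) - np.angle(noisy_spec)
    target = np.abs(clean_spec) * np.cos(phase_diff)
    return float(np.mean((est_mask * np.abs(noisy_spec) - target) ** 2))
```

The mask is a training target computed from paired clean and noisy spectra, while the loss compares a network's estimated mask against the phase-corrected clean magnitude directly.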
