Unofficial implementation of Dual-Path Transformer Network (DPTNet) for speech separation (Interspeech 2020)
- Data pre-processing
- Training
- Inference
- Separate
-- in-dir: path to your WSJ0-2mix dataset directory (it must contain tr/cv/tt folders)
-- out-dir: directory where the JSON files with file information are saved (recommended to keep the default)
$ python preprocess.py --in-dir /data/min --out-dir data --sample-rate 8000
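As a sketch of what the pre-processing step produces, assuming the common Conv-TasNet-style format in which each JSON file holds a list of [wav_path, num_samples] pairs (the paths below are hypothetical and the exact layout may differ in this repo):

```python
import json

# Hypothetical entries for data/tr/mix.json: each item is
# [path_to_wav, number_of_samples] (assumed format, not verified
# against this repository).
entries = [
    ["/data/min/tr/mix/011a0101_0.wav", 32000],
    ["/data/min/tr/mix/011a0102_1.wav", 28160],
]

# Write the file information, then read it back.
with open("mix.json", "w") as f:
    json.dump(entries, f, indent=2)

with open("mix.json") as f:
    loaded = json.load(f)

# Total duration in seconds at the 8 kHz sample rate used above.
total_sec = sum(n for _, n in loaded) / 8000
print(total_sec)  # 7.52
```

Dataset loaders can then open these JSON files instead of re-scanning the wav directories on every run.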
$ python train.py
If you change the --out-dir option, you must also set --train_dir '{your_directory}/tr' --valid_dir '{your_directory}/cv':
$ python train.py --train_dir 'data/tr' --valid_dir 'data/cv'
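The dual-path idea at the heart of DPTNet splits the encoded sequence into overlapping chunks, so that transformers can alternate between intra-chunk (local) and inter-chunk (global) modeling. A minimal NumPy sketch of the 50%-overlap segmentation step (the chunk size K and padding scheme here are illustrative, not this repo's exact settings):

```python
import numpy as np

def segment(x, K):
    """Split a 1-D sequence into 50%-overlapping chunks of length K.

    Returns an array of shape (num_chunks, K). The input is
    zero-padded so chunking covers the whole sequence, mirroring
    the DPRNN/DPTNet-style layout (illustrative sketch only).
    """
    P = K // 2                      # hop size (50% overlap)
    x = np.asarray(x, dtype=float)
    # Pad to a multiple of the hop, plus a leading/trailing
    # half-chunk so edge samples are also covered twice.
    rest = (-len(x)) % P
    x = np.concatenate([np.zeros(P), x, np.zeros(rest + P)])
    chunks = [x[i:i + K] for i in range(0, len(x) - K + 1, P)]
    return np.stack(chunks)

seq = np.arange(1, 9)               # toy "encoded" sequence of length 8
chunks = segment(seq, K=4)
print(chunks.shape)                 # (5, 4)
```

After segmentation, an intra-chunk transformer runs along each row and an inter-chunk transformer runs across rows, which is what lets the model handle very long sequences efficiently.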
$ python evaluate.py --model_path 'exp/temp/temp_best.pth.tar'
If you change the --out-dir option, you must also set --data_dir '{your_directory}/tt':
$ python evaluate.py --data_dir 'data/tt' --model_path 'exp/temp/temp_best.pth.tar'
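Evaluation reports the SI-SNR improvement (SI-SNRi) over the unprocessed mixture. As a reference, here is a minimal NumPy sketch of the standard scale-invariant SNR definition and the improvement computation (this is the textbook metric, not code taken from this repo):

```python
import numpy as np

def si_snr(est, ref, eps=1e-8):
    """Scale-invariant SNR in dB between an estimate and a reference."""
    est = est - est.mean()
    ref = ref - ref.mean()
    # Project the estimate onto the reference; scaling the estimate
    # does not change the result (hence "scale-invariant").
    s_target = (est @ ref) / (ref @ ref + eps) * ref
    e_noise = est - s_target
    return 10 * np.log10((s_target @ s_target) / (e_noise @ e_noise + eps))

rng = np.random.default_rng(0)
s1 = rng.standard_normal(8000)      # target source
s2 = rng.standard_normal(8000)      # interfering source
mix = s1 + s2                       # SI-SNR of mix vs s1 is roughly 0 dB

# SI-SNRi: SI-SNR of the separated estimate minus SI-SNR of the raw
# mixture, both measured against the reference source. A perfect
# (up to scale) estimate gives a very large SI-SNR.
improvement = si_snr(2.0 * s1, s1) - si_snr(mix, s1)
```

The 19.84 dB figure quoted below is this improvement, averaged over the test-set mixtures with the best permutation of estimated sources.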
$ python separate.py --model_path 'exp/temp/temp_best.pth.tar'
If you change the --out-dir option, you must also set --mix_json '{your_directory}/tt/mix.json':
$ python separate.py --mix_json 'data/tt/mix.json' --model_path 'exp/temp/temp_best.pth.tar'
We achieve an SI-SNRi of 19.84 dB with L=4 (encoder kernel length).
The separated audio samples are saved in the result directory.