NOTE: This paper has been accepted by ICASSP 2024!
This repository provides examples of Sepformer (NASS) on Libri2Mix, based on SpeechBrain.
Once you have created your Python environment (Python 3.7+), you can simply type:
git clone https://github.com/TzuchengChang/NASS
cd NASS/speechbrain
pip install -r requirements.txt
pip install --editable .
pip install mir-eval
pip install pyloudnorm
In this paper, we propose a noise-aware speech separation (NASS) method, which aims to improve the speech quality of separated signals under noisy conditions. Specifically, NASS views background noise as an additional output and predicts it along with the other speakers in a mask-based manner. To denoise effectively, we introduce patch-wise contrastive learning (PCL) between noise and speaker representations from the decoder input and encoder output. The PCL loss aims to minimize the mutual information between the predicted noise and the other speakers at the multiple-patch level, suppressing the noise information in the separated signals. Experimental results show that NASS improves SI-SNRi or SDRi by 1 to 2 dB over DPRNN and Sepformer on the noisy WHAM! and LibriMix datasets, with a parameter increase of less than 0.1M.
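For readers unfamiliar with the SI-SNRi metric quoted above, the following is a minimal pure-Python sketch of the standard scale-invariant SNR definition and its improvement over the unprocessed mixture. It is not the evaluation code used in this repository; the function names `si_snr` and `si_snr_improvement` are illustrative assumptions.

```python
import math


def si_snr(estimate, target, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length sample sequences."""
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    n = len(target)
    # Remove the mean so that a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    tgt_mean = sum(target) / n
    est = [e - est_mean for e in estimate]
    tgt = [t - tgt_mean for t in target]
    # Project the estimate onto the target to obtain the scaled reference.
    dot = sum(e * t for e, t in zip(est, tgt))
    tgt_energy = sum(t * t for t in tgt)
    scale = dot / (tgt_energy + eps)
    proj_energy = scale * scale * tgt_energy
    # Everything not explained by the scaled target counts as error.
    res_energy = sum((e - scale * t) ** 2 for e, t in zip(est, tgt))
    return 10.0 * math.log10((proj_energy + eps) / (res_energy + eps))


def si_snr_improvement(estimate, mixture, target):
    """SI-SNRi: gain of the separated estimate over the raw mixture, in dB."""
    return si_snr(estimate, target) - si_snr(mixture, target)
```

Here `estimate` is a separated signal, `mixture` is the noisy input, and `target` is the clean reference for the same speaker. A positive return value from `si_snr_improvement` means the separated signal is closer to the reference than the mixture was.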
We also provide a real-world example: a recording of Ted Cruz mixed with WHAM! noise at -2 dB.
The results are from Sepformer (NASS) trained on Libri2Mix.
Mixture | Speaker 1 | Speaker 2 | Noise |
---|---|---|---|
Download | Download | Download | Download |
Step 1: Prepare the datasets. Please refer to the LibriMix repository.
Step 2: Modify the configurations.
Configuration files are saved in NASS/recipes/LibriMix/separation/hparams/
Step 3: Run the NASS method.
cd NASS/speechbrain/recipes/LibriMix/separation/
python train.py hparams/sepformer-libri2mix.yaml --data_folder /yourpath/Libri2Mix/
We also provide a YAML file for custom data. Make sure your custom folder is structured as follows:
| custom
| |-- train
| | |-- mixture
| | |-- noise
| | |-- source1
| | |-- source2
| | |-- source3
| |-- valid
| | |-- mixture
| | |-- noise
| | |-- source1
| | |-- source2
| | |-- source3
| |-- test
| | |-- mixture
| | |-- noise
| | |-- source1
| | |-- source2
| | |-- source3
python train.py hparams/sepformer-libri2mix-custom.yaml --data_folder /yourpath/custom/
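Before launching training on custom data, it may help to confirm that the folder matches the layout shown above. The following is a small sketch using only the standard library; the helper name `find_missing_dirs` is hypothetical and not part of this repository, and the list of expected subfolders is simply copied from the tree above.

```python
from pathlib import Path

# Expected layout, copied from the folder tree above.
SPLITS = ("train", "valid", "test")
SUBDIRS = ("mixture", "noise", "source1", "source2", "source3")


def find_missing_dirs(root, splits=SPLITS, subdirs=SUBDIRS):
    """Return the relative paths (e.g. 'valid/noise') that are not directories."""
    root = Path(root)
    return [
        (Path(split) / sub).as_posix()
        for split in splits
        for sub in subdirs
        if not (root / split / sub).is_dir()
    ]
```

For example, `find_missing_dirs("/yourpath/custom/")` returns an empty list when every expected subfolder exists, and otherwise lists the missing ones so they can be fixed before training starts.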
We provide a pretrained model on GitHub Releases.
To use it, download "results.zip" and unzip it to NASS/recipes/LibriMix/separation/
Then run the NASS method as described above.
Please cite our paper and star our repository.
@inproceedings{zhang2024noise,
title={Noise-Aware Speech Separation with Contrastive Learning},
author={Zhang, Zizheng and Chen, Chen and Chen, Hsin-Hung and Liu, Xiang and Hu, Yuchen and Chng, Eng Siong},
booktitle={ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
pages={1381--1385},
year={2024},
organization={IEEE}
}