Code for multi-channel source separation and dereverberation methods such as FastMNMF1, FastMNMF2, and AR-FastMNMF2.

Sound Source Separation

Tools for multi-channel sound source separation and dereverberation.

News

  • Ver2.1 has been released. The source separation methods are implemented with PyTorch (numpy and cupy are not required).
  • Other methods implemented in ver1.0, such as MNMF-DP and FastMNMF-DP, will be added in the future.

Method list

Source separation

  • FastMNMF1
  • FastMNMF2
  • ILRMA
  • MNMF (the PyTorch version is much slower than the cupy version on GPU)

Joint source separation and dereverberation

  • AR-FastMNMF2 (the PyTorch version is not ready yet)

Requirements

  • Tested on Python 3.8
  • Requirements for the numpy and cupy version in src are listed below:
numpy (1.19.2 was tested)
librosa
pysoundfile
tqdm

# optional packages
cupy # for GPU acceleration (9.4.0 was tested)
h5py # for saving the estimated parameters

You can install all the packages above with pip install -r src/requirements.txt

  • Requirements for the PyTorch version in src_torch are listed below:
torch
torchaudio
tqdm

# optional packages
h5py # for saving the estimated parameters

You can install all the packages above with pip install -r src_torch/requirements.txt
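Before running the PyTorch version, it can help to confirm which of the packages listed above are importable. The following is a minimal sketch, not part of the repository: the helper name `missing_packages` and the two lists are illustrative and simply mirror the src_torch requirements given above.

```python
import importlib.util

# Package names copied from the src_torch requirements listed above.
REQUIRED = ["torch", "torchaudio", "tqdm"]
OPTIONAL = ["h5py"]  # only needed for saving the estimated parameters


def missing_packages(names):
    """Return the names that cannot be found in the current environment.

    find_spec only locates the package; it does not import it.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required packages:", missing_packages(REQUIRED))
    print("missing optional packages:", missing_packages(OPTIONAL))
```

If the first list is not empty, installing from src_torch/requirements.txt as described above should resolve it.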

Usage

python3 FastMNMF2.py [input_filename] --gpu [gpu_id]
  • The input is the multichannel observed signal.
  • If gpu_id < 0, CPU is used, and cupy is not required.
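To run the command above on several recordings, one option is a small wrapper script. This is a sketch under assumptions: the helper names `build_command` and `run_separation` and the example file names are hypothetical, and the script is assumed to sit in the current working directory. Only the positional input file and the `--gpu` option shown above are used.

```python
import subprocess


def build_command(input_filename, gpu_id=-1, script="FastMNMF2.py"):
    """Build the documented command line as an argument list.

    gpu_id < 0 selects the CPU, following the usage notes above.
    """
    return ["python3", script, input_filename, "--gpu", str(gpu_id)]


def run_separation(input_filename, gpu_id=-1):
    """Run the separation script and raise if it exits with an error."""
    subprocess.run(build_command(input_filename, gpu_id), check=True)


if __name__ == "__main__":
    # Hypothetical file names; replace them with your own recordings.
    for wav in ["mixture_01.wav", "mixture_02.wav"]:
        print(" ".join(build_command(wav, gpu_id=-1)))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.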

Citation

If you use the code of FastMNMF1 or FastMNMF2 in your research project, please cite the following paper:

If you use the code of AR-FastMNMF2 in your research project, please cite the following paper:

Detail

  • The "n_bit" argument sets the numerical precision and must be 32 or 64 (32 -> float32 and complex64, 64 -> float64 and complex128; the default is 64).
  • n_bit=32 reduces computational cost and memory usage at the expense of separation performance. The performance is especially likely to degrade when the number of microphones (or the tap length in AR-based methods such as AR-FastMNMF2) is large.
  • When you use simulated signals without reverberation, the mixture SCM (spatial covariance matrix) is likely to be rank-deficient, so please add small noise to the simulated signals.
  • In MNMF.py, only n_bit=64 is available.
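The n_bit mapping and the advice on adding small noise can be illustrated with a short numpy sketch. It is not taken from the repository: the function names, the `noise_scale` value, and the toy instantaneous time-domain mixture are assumptions chosen only to show the idea (the actual SCM is computed in the time-frequency domain).

```python
import numpy as np


def dtypes_for_n_bit(n_bit):
    """Return (real dtype, complex dtype) for the documented n_bit values."""
    if n_bit == 32:
        return np.float32, np.complex64
    if n_bit == 64:
        return np.float64, np.complex128
    raise ValueError("n_bit must be 32 or 64")


def add_small_noise(signal, noise_scale=1e-6, seed=0):
    """Add white Gaussian noise scaled relative to the peak amplitude.

    Assumes a floating-point array; noise_scale is an illustrative value.
    """
    rng = np.random.default_rng(seed)
    peak = np.max(np.abs(signal))
    noise = rng.standard_normal(signal.shape) * noise_scale * peak
    return signal + noise.astype(signal.dtype)


# Toy example: 2 sources mixed instantaneously into 4 channels, so the
# clean mixture spans only a 2-dimensional subspace.
rng = np.random.default_rng(1)
sources = rng.standard_normal((2, 1000))
mixing = rng.standard_normal((4, 2))
clean = mixing @ sources
noisy = add_small_noise(clean)

rank_clean = np.linalg.matrix_rank(clean)
rank_noisy = np.linalg.matrix_rank(noisy)
```

In this toy case the clean mixture has fewer independent directions than channels, while the noisy one is full rank. An appropriate noise level for real data depends on the signals and should be chosen by the user.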