Webpage: https://sites.google.com/site/deeplearningsourceseparation/
- Training code: `codes/mir1k/train_mir1k_demo.m`
- Demo
  - Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat
  - Put the model at `codes/mir1k/demo` and go to the folder.
  - Run: `codes/mir1k/demo/run_test_single_model.m`
- Training code: `codes/timit/train_timit_demo.m` and `codes/timit/train_timit_demo_mini_clip.m`
- Demo
  - Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/timit_model_70.mat
  - Put the model at `codes/timit/demo` and go to the folder.
  - Run: `codes/timit/demo/run_test_single_model.m`
- Training code: `codes/TSP/train_TSP_demo_mini_clip.m`
- Demo
  - Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_RELU_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat
  - Put the model at `codes/TSP/demo` and go to the folder.
  - Run the demo code at `codes/TSP/demo/run_test_single_model.m`
- Demo
  - Download a trained model: http://www.ifp.illinois.edu/~huang146/DNN_separation/denoising_model_870.mat
  - Put the model at `codes/denoising/demo` and go to the folder.
  - Run the demo code at `codes/denoising/demo/run_test_single_model.m`
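
The download-and-place steps of the four demos above can be scripted. The following is a minimal Python sketch and is not part of the package: `demo_plan` and `fetch_model` are hypothetical helper names, and it assumes the script is run from the repository root and that the model URLs listed above are still reachable. The demo scripts themselves still have to be run in MATLAB.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL shared by all trained models listed in the demo steps.
BASE_URL = "http://www.ifp.illinois.edu/~huang146/DNN_separation/"

# task -> (model file name, demo folder), copied from the demo steps.
DEMOS = {
    "mir1k": ("model_400.mat", "codes/mir1k/demo"),
    "timit": ("timit_model_70.mat", "codes/timit/demo"),
    "TSP": (
        "TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_RELU"
        "_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat",
        "codes/TSP/demo",
    ),
    "denoising": ("denoising_model_870.mat", "codes/denoising/demo"),
}


def demo_plan(task, repo_root="."):
    """Return (download URL, destination path) for a task's trained model.

    Hypothetical helper: it only builds the URL and path, no network access.
    """
    filename, demo_dir = DEMOS[task]
    return BASE_URL + filename, Path(repo_root) / demo_dir / filename


def fetch_model(task, repo_root="."):
    """Download the trained model into its demo folder unless it already exists.

    Assumes the URL is reachable; urlretrieve raises an error otherwise.
    """
    url, dest = demo_plan(task, repo_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if not dest.exists():
        urlretrieve(url, str(dest))
    return dest
```

Calling `fetch_model("mir1k")` would place `model_400.mat` under `codes/mir1k/demo`, after which `run_test_single_model.m` can be run from that folder.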
- The package is modified from rnn-speech-denoising.
- The software depends on Mark Schmidt's minFunc package for convex optimization.
- Additionally, we have included Mark Hasegawa-Johnson's HTK write and read functions, which are used to handle the MFCC files.
- We use HTK (HCopy) for computing features (MFCC, logmel).
- We use signal processing functions from labrosa.
- We use the BSS Eval toolbox (versions 2.0 and 3.0) for evaluation.
- We use MIR-1K for the singing voice separation task.
- We use TSP for the speech separation task.
- To try the code on your own data, follow the mir1k or TSP settings: put your data into `codes/mir1k/Wavfile` or `codes/TSP/Data/` accordingly.
- Look at the unit test parameters in `codes/mir1k/train_mir1k_demo.m` and `codes/TSP/train_TSP_demo_mini_clip.m` (with minibatch L-BFGS and gradient clipping).
- Tune the parameters on the dev set and check the results.
- P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2136–2147, Dec. 2015.
- P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks," in International Society for Music Information Retrieval Conference (ISMIR), 2014.
- P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Deep Learning for Monaural Speech Separation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.

## License

Apache License. For commercial use, please contact me.