Dependencies

This package is adapted from rnn-speech-denoising. Reference: https://github.com/amaas/rnn-speech-denoising
The software depends on Mark Schmidt's minFunc package for unconstrained optimization. Reference: http://www.di.ens.fr/~mschmidt/Software/minFunc.html
Additionally, we have included Mark Hasegawa-Johnson's HTK write and read functions that are used to handle the MFCC files. Reference: http://www.isle.illinois.edu/sst/software/
We use HTK's HCopy tool to compute features (MFCC, logmel). Reference: http://htk.eng.cam.ac.uk/
We use signal processing functions from LabROSA. Reference: http://labrosa.ee.columbia.edu/
We use the BSS Eval toolbox, versions 2.0 and 3.0, for evaluation. Reference: http://bass-db.gforge.inria.fr/bss_eval/
We use the MIR-1K dataset for the singing voice separation task. Reference: https://sites.google.com/site/unvoicedsoundseparation/mir-1k

Getting Started

MIR-1K experiment:

training: codes/mir1k/train_mir1k_demo.m
testing: codes/mir1k/run_test_single_mode.m
trained model: download http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat and place it in codes/mir1k/model_demo
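
The download step can also be scripted from MATLAB. The following is a minimal sketch, not part of the package: it assumes MATLAB R2014b or later (for websave), that MATLAB's current directory is the one containing this README, and that the file keeps the name model_400.mat.

```matlab
% Sketch: fetch the trained MIR-1K model into codes/mir1k/model_demo.
% Assumption: the current directory is the one containing this README.
modelUrl = 'http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat';
modelDir = fullfile(pwd, 'codes', 'mir1k', 'model_demo');

% Create the target directory if it does not exist yet.
if ~exist(modelDir, 'dir')
    mkdir(modelDir);
end

% Download only if the file is not already present.
modelFile = fullfile(modelDir, 'model_400.mat');
if ~exist(modelFile, 'file')
    websave(modelFile, modelUrl);   % websave requires MATLAB R2014b or later
end
```

If websave is unavailable in your MATLAB version, downloading the file by hand and copying it into codes/mir1k/model_demo has the same effect.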

TIMIT experiment:

training: codes/timit/train_timit_demo.m

(Set baseDir to the path of the directory containing this README file.)
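
One way to find the value to use for baseDir is sketched below. It is only an illustration and assumes MATLAB's current directory is the one containing this README. Whether the script expects a trailing file separator on baseDir is not stated here, so check how baseDir is concatenated inside the script before copying the value in.

```matlab
% Sketch: work out a candidate value for baseDir and sanity-check it.
% Assumption: the current directory is the one containing this README.
baseDir = pwd;

% The TIMIT demo script should be reachable from baseDir.
demoScript = fullfile(baseDir, 'codes', 'timit', 'train_timit_demo.m');
if exist(demoScript, 'file') ~= 2
    warning('Demo script not found: %s (check baseDir)', demoScript);
end

% Print the path so it can be pasted into the script.
fprintf('Candidate value for baseDir: %s\n', baseDir);
```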

Your data:

To try the code on your own data, follow the MIR-1K setup: put your data into codes/mir1k/Wavfile, arranged the same way.
See the unit test parameters at the bottom of codes/mir1k/train_mir1k_demo.m.
Tune the parameters and check the results.

TODO

Add more unit tests, comments, and a TIMIT example.

Reference

P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks," in International Society for Music Information Retrieval Conference (ISMIR), 2014.

P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Deep Learning for Monaural Speech Separation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.
