This project addresses the speech separation problem with supervised learning: feed-forward neural networks are trained on various training targets. The STOI and PESQ metrics are implemented to evaluate speech intelligibility and speech quality, respectively.
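To make "training targets" concrete, here is a minimal sketch of two targets commonly used in supervised speech separation, the ideal ratio mask (IRM) and the ideal binary mask (IBM), computed for a single time-frequency unit. This README does not say which targets the project uses, so both choices, the function names, and the default values `beta=0.5` and `lc_db=0.0` are assumptions made for illustration only.

```python
import math


def ideal_ratio_mask(speech_power, noise_power, beta=0.5):
    """Assumed target: IRM = (S / (S + N)) ** beta for one time-frequency unit."""
    total = speech_power + noise_power
    if total == 0.0:
        return 0.0  # silent unit: nothing to keep
    return (speech_power / total) ** beta


def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Assumed target: IBM = 1 if the local SNR in dB exceeds lc_db, else 0."""
    if speech_power <= 0.0:
        return 0  # no speech energy in this unit
    if noise_power <= 0.0:
        return 1  # speech with no noise always passes
    snr_db = 10.0 * math.log10(speech_power / noise_power)
    return 1 if snr_db > lc_db else 0
```

In a setup like this, the network would take features of the noisy mixture as input and regress (IRM) or classify (IBM) the mask value for each time-frequency unit.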
Dataset: The experiments were run on the TIMIT dataset, which contains 8 different dialects, each with male and female speech. A separate noise dataset mimics real-world noises such as birds, keyboards, and ocean sounds.
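Noisy training mixtures are typically built by adding a noise recording to a clean utterance at a chosen signal-to-noise ratio. The dependency-free sketch below shows one way to do that. The function name `mix_at_snr`, its parameters, and the use of plain Python lists of samples are hypothetical and not taken from this project's code.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Return speech + scaled noise so that the mixture has the requested SNR in dB."""
    if not speech:
        raise ValueError("speech must not be empty")
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]  # trim noise to the utterance length

    # Average power of each signal.
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0.0:
        raise ValueError("noise is silent; cannot scale it to a target SNR")

    # Choose the scale so that speech_power / (scale**2 * noise_power) == 10 ** (snr_db / 10).
    scale = math.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + scale * n for s, n in zip(speech, noise)]
```

For example, `mix_at_snr(clean_samples, noise_samples, -5.0)` would produce a mixture in which the noise is 5 dB stronger than the speech; `clean_samples` and `noise_samples` stand for sample lists loaded elsewhere.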
Experimental settings: The models were trained on an 8-core CPU with 52 GB of memory. To speed up training, we used 2 x NVIDIA Tesla K80 GPUs with 12 GB of memory each on Google Compute Engine.
Programming Languages Used:
Python and MATLAB.
Prerequisite libraries to import and run the code:
Python 3 or later, TensorFlow, Keras, MATLAB 2016a or later, Lasagne
Note: The pretrained models are ready to be tested. Developers can reuse them to test the validation speech samples with TrainandTestModel.py.