Final project for Fall 2022 CS5783 at OSU

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

CS5783-FinalProject - Final project for Fall 2022 CS5783 at OSU

This project showcases the use of RNN-based and transformer-based models to denoise audio input samples.

Library requirements

To run the code, the 'librosa' library must be installed. It provides the audio manipulation functions used to create the training, validation, and test sets for the model. It can be installed with the following command:

pip3 install librosa

Other libraries used: numpy, tensorflow, soundfile, matplotlib, shutil, os, csv, and random. The last four (shutil, os, csv, and random) are part of the Python standard library and need no installation. The others can be installed with the same command structure:

pip3 install numpy
pip3 install tensorflow
pip3 install soundfile
pip3 install matplotlib
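
Before running the scripts, it can help to confirm that the third-party packages are importable. The following is a minimal sketch, not part of the project code; the helper name missing_modules and the REQUIRED list are illustrative and only mirror the packages named in this README.

```python
import importlib.util

# Third-party packages named in this README. Standard-library modules
# (shutil, os, csv, random) ship with Python and are not checked here.
REQUIRED = ["librosa", "numpy", "tensorflow", "soundfile", "matplotlib"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any package reported as missing can then be installed with pip3 as described above.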

Running the Python scripts from the src directory

Training the RNN-based model:

python3 main.py -t

Training the transformer-based model:

python3 main.py -t -m tran

Loading a previous RNN-based model:

python3 main.py -l ./backup/rnn.trial.1.bak

Loading a previous transformer-based model:

python3 main.py -l ./backup/tran.trial.1.bak -m tran
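
For readers who want to see how the flags in the commands above fit together, here is a hedged sketch of an argparse setup that accepts the same -t, -m, and -l options. It is not the project's actual main.py: the function name build_parser, the destination names, and the "rnn" default value are assumptions made for illustration (the README only shows that omitting -m selects the RNN-based model).

```python
import argparse


def build_parser():
    """Build a parser that mirrors the flags shown in this README (hypothetical)."""
    parser = argparse.ArgumentParser(description="Train or load a denoising model.")
    # -t: train a new model.
    parser.add_argument("-t", dest="train", action="store_true",
                        help="train a new model")
    # -m: model type. "tran" appears in the README; "rnn" is an assumed default name.
    parser.add_argument("-m", dest="model", choices=["rnn", "tran"], default="rnn",
                        help="model type to use")
    # -l: path to a previous model backup to load.
    parser.add_argument("-l", dest="load_path", default=None,
                        help="path to a previous model to load")
    return parser


if __name__ == "__main__":
    # Equivalent to: python3 main.py -l ./backup/tran.trial.1.bak -m tran
    args = build_parser().parse_args(["-l", "./backup/tran.trial.1.bak", "-m", "tran"])
    print(args.train, args.model, args.load_path)
```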

The output test data is stored in the outputs subdirectory of the src directory. The input audio track is labeled inputs, the model's prediction is labeled predicted, and the clean audio track is labeled actual. The MSE vs. epochs graph is stored in the src directory as rnn_msevsepochs.png for easy viewing.
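
As a reminder of what the MSE in that graph measures, the sketch below computes the mean squared error between a predicted and an actual sequence of samples. It uses plain Python lists to stay dependency-free and is only an illustration; the function name mean_squared_error is hypothetical, and the project itself may compute the metric differently (for example through tensorflow).

```python
def mean_squared_error(predicted, actual):
    """Average of the squared sample-wise differences between two sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(predicted) == 0:
        raise ValueError("sequences must not be empty")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)


if __name__ == "__main__":
    # Toy sample values, not real audio: only the last sample differs.
    print(mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

A lower value means the predicted track is closer, sample by sample, to the clean track.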

Note: due to time constraints, the transformer-based model is present in model.py but is not functional. The command-line flags for selecting it work as intended, but the model itself does not work correctly.

Contributors

1. Landon Burleson, Oklahoma State University ECE Department
2. Madhusti Dhasaradhan, Oklahoma State University ECE Department
3. Alex Sensintaffar, Oklahoma State University ECE Department