Audio Classification for the Urban Sound Datasets

Steps to train a model:

Download the Urban Sound datasets and unzip them into the raw_data folder
There should be 10 sub-folders in the raw_data folder
UrbanSound8K
UrbanSound
Resample the audio files (the script and directory names suggest a 16 kHz target)
python src/wav16000.py --raw_data_dir raw_data --data_16000_dir wav_16000
Perform an STFT on the audio files and save each file as a tensor
python src/audio_pth.py --data_16000_dir wav_16000 --data_dir data
Create train/eval/test manifest files
python src/manifest.py
Train
python src/train.py
The best model will be stored in ./log/<date_time>/best.model.pth
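For readers unfamiliar with the two preprocessing steps above (resampling and STFT), here is a minimal NumPy sketch of the idea. It is not the code in src/wav16000.py or src/audio_pth.py, which may well use a different library and different settings. The function names, the linear-interpolation resampler (chosen for brevity, with no anti-aliasing filter), and the values n_fft=512 and hop=160 are all assumptions made for illustration.

```python
import numpy as np


def resample_linear(signal, orig_sr, target_sr=16000):
    """Resample a 1-D signal by linear interpolation (no anti-aliasing filter)."""
    n_out = int(round(len(signal) * target_sr / orig_sr))
    # Time stamps (in seconds) of the input and output samples.
    t_in = np.arange(len(signal)) / orig_sr
    t_out = np.arange(n_out) / target_sr
    return np.interp(t_out, t_in, signal)


def stft_magnitude(signal, n_fft=512, hop=160):
    """Return a (n_fft // 2 + 1, n_frames) magnitude spectrogram.

    Assumes len(signal) >= n_fft; n_fft and hop are illustrative values.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack(
        [signal[i * hop:i * hop + n_fft] * window for i in range(n_frames)]
    )
    # Real-input FFT per frame, then transpose to (frequency, time).
    return np.abs(np.fft.rfft(frames, axis=1)).T


# One second of a 440 Hz tone at 44.1 kHz stands in for a real recording.
sr = 44100
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
tone_16k = resample_linear(tone, sr)
spec = stft_magnitude(tone_16k)
```

For real recordings, a resampler with a proper low-pass filter is preferable to linear interpolation, since downsampling without filtering can alias high-frequency content into the result.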
TODO
Correct the train/eval/test splits: AVOID COMMON PITFALLS
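One way to approach this TODO, assuming it refers to UrbanSound8K's predefined folds: the dataset ships with a metadata CSV whose fold column assigns each clip to one of 10 folds, and its documentation advises using those folds rather than reshuffling, because slices cut from the same original recording could otherwise land in both the training and the test set. The sketch below assigns clips to splits by fold. The function name split_by_fold and the choice of folds 9 and 10 for eval and test are hypothetical, and this is not the logic in src/manifest.py.

```python
def split_by_fold(rows, eval_fold=9, test_fold=10):
    """Assign each metadata row to train/eval/test using its predefined fold."""
    splits = {"train": [], "eval": [], "test": []}
    for row in rows:
        fold = int(row["fold"])
        if fold == test_fold:
            splits["test"].append(row["slice_file_name"])
        elif fold == eval_fold:
            splits["eval"].append(row["slice_file_name"])
        else:
            splits["train"].append(row["slice_file_name"])
    return splits


# Toy rows shaped like the UrbanSound8K metadata CSV (hypothetical values).
rows = [
    {"slice_file_name": "100-1-0-0.wav", "fold": "1"},
    {"slice_file_name": "100-1-0-1.wav", "fold": "1"},
    {"slice_file_name": "200-3-0-0.wav", "fold": "9"},
    {"slice_file_name": "300-5-0-0.wav", "fold": "10"},
]
splits = split_by_fold(rows)
```

With the real dataset, the rows could be read with csv.DictReader from the metadata CSV. A single fixed split like this is a simplification: the dataset authors recommend 10-fold cross-validation over the predefined folds, which amounts to rotating test_fold from 1 to 10 and averaging the results.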

liucong3/audio_classification