/audio_classification

Primary language: Python. License: MIT.

Audio Classification for the Urban Sound Datasets

Steps to train a model:

  • Download the Urban Sound datasets and unzip them into the raw_data folder
    There should be 10 sub-folders in the raw_data folder
    UrbanSound8K
    UrbanSound
  • Change the sample rate of the audio
    python src/wav16000.py --raw_data_dir raw_data --data_16000_dir wav_16000
  • Perform an STFT on the audio files and save each file as a tensor
    python src/audio_pth.py --data_16000_dir wav_16000 --data_dir data
  • Create train/eval/test manifest files
    python src/manifest.py
  • Train
    python src/train.py
    The best model will be stored in ./log/<date_time>/best.model.pth
  • TODO
    Correct the train/eval/test splits: AVOID COMMON PITFALLS
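
For the TODO above, one way to keep related clips out of different splits is to split by the dataset's predefined folds instead of shuffling individual clips. The sketch below is not taken from src/manifest.py. The helpers `split_by_fold` and `shared_sources`, the fold choices (9 for eval, 10 for test), and the sample rows are all hypothetical. It assumes UrbanSound8K's ten predefined folds and its `<fsID>-<classID>-<occurrenceID>-<sliceID>.wav` file naming, where the first field identifies the source recording.

```python
def split_by_fold(rows, eval_fold=9, test_fold=10):
    """Assign each clip to train/eval/test using its predefined fold.

    rows: iterable of (slice_file_name, fold) pairs, for example read from
    the dataset's metadata CSV. The default fold choices are illustrative.
    """
    splits = {"train": [], "eval": [], "test": []}
    for name, fold in rows:
        if fold == test_fold:
            splits["test"].append(name)
        elif fold == eval_fold:
            splits["eval"].append(name)
        else:
            splits["train"].append(name)
    return splits


def shared_sources(splits):
    """Return source IDs that appear in more than one split (ideally none)."""
    seen = {}      # source ID -> first split it was seen in
    leaked = set()
    for split_name, names in splits.items():
        for name in names:
            src = name.split("-")[0]  # fsID field of the file name
            if seen.setdefault(src, split_name) != split_name:
                leaked.add(src)
    return leaked


# Made-up rows for illustration only; these are not real dataset entries.
rows = [
    ("1001-3-0-0.wav", 1),
    ("1001-3-0-1.wav", 1),
    ("2002-5-0-0.wav", 9),
    ("3003-7-0-0.wav", 10),
]
splits = split_by_fold(rows)
print(splits)
print("sources shared across splits:", shared_sources(splits))
```

Running `shared_sources` on the generated manifests is a cheap sanity check: a non-empty result means slices of the same recording ended up in different splits, which tends to inflate eval and test accuracy.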
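
The STFT step above can be illustrated with a small NumPy sketch. This is not the code in src/audio_pth.py. The function `stft_magnitude`, the 512-sample Hann window, and the 256-sample hop are assumptions chosen for illustration, and the input is a synthetic 1 kHz tone sampled at 16000 Hz rather than a real clip.

```python
import numpy as np


def stft_magnitude(signal, n_fft=512, hop=256):
    """Return a (frames, n_fft // 2 + 1) magnitude spectrogram.

    Assumes len(signal) >= n_fft; trailing samples that do not fill a
    whole frame are dropped.
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames, one per row.
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] for i in range(n_frames)]
    )
    # Window each frame, then take the one-sided FFT along the frame axis.
    return np.abs(np.fft.rfft(frames * window, axis=1))


sr = 16000                           # sample rate after the resampling step
t = np.arange(sr) / sr               # one second of sample times
tone = np.sin(2 * np.pi * 1000 * t)  # synthetic 1 kHz test tone

spec = stft_magnitude(tone)
peak_bin = int(spec.mean(axis=0).argmax())
print(spec.shape)                    # (61, 257)
print(peak_bin * sr / 512)           # 1000.0
```

If the pipeline stores PyTorch tensors, as the .pth extension suggests, an array like `spec` could be converted with `torch.from_numpy` and written with `torch.save`.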