/ENV50-Sound-classification

Primary language: Jupyter Notebook. License: GNU General Public License v3.0 (GPL-3.0).

Environmental-sounds-UNIPD-2022

Dataset: ESC50 (50 classes, 2000 examples).

Preprocessing: MFCCs, chromagram, data augmentation (7 times the initial sample size).
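The README does not say which augmentations were applied, so the following is only a minimal sketch of how a sevenfold expansion could be produced: each clip is kept and six perturbed copies are added. The function name `augment_clip`, the specific perturbations (circular time shift, random gain, additive noise), their ranges, and the toy 440 Hz test signal are all assumptions for illustration, not the repository's actual pipeline.

```python
import numpy as np


def augment_clip(y, rng, n_variants=6):
    """Return the original clip plus n_variants perturbed copies.

    Hypothetical helper: the perturbations below are common audio
    augmentations, not necessarily the ones used in this project.
    """
    out = [y]
    for _ in range(n_variants):
        v = y.copy()
        # Random circular time shift of up to 10% of the clip length.
        max_shift = max(1, len(v) // 10)
        v = np.roll(v, int(rng.integers(-max_shift, max_shift + 1)))
        # Random gain between 0.8x and 1.2x.
        v = v * rng.uniform(0.8, 1.2)
        # Low-level additive Gaussian noise.
        v = v + rng.normal(0.0, 0.005, size=v.shape)
        out.append(v.astype(y.dtype))
    return out


rng = np.random.default_rng(0)
sr = 8000  # assumed sample rate for the toy signal only
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 440.0 * t).astype(np.float32)

variants = augment_clip(clip, rng)
n_total = 2000 * len(variants)  # 2000 original ESC50 examples
print(len(variants), n_total)
```

With one original and six variants per clip, 2000 examples become 14000, which matches a reading of "7 times the initial sample size" as the total size after augmentation.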

Figure: eda_image

Evaluation metrics: accuracy, estimated memory usage.

Architectures: CNN, RNN-SEQ2D, RNN431, RNN60-small, RNN60-LSTM, RNN60-GRU.
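How the estimated memory usage was computed is not stated here. As a rough illustration only, a weights-only lower bound can be sketched by multiplying the parameter count by the bytes per parameter. The function name `estimate_weight_memory_mb`, the float32 assumption (4 bytes per parameter), and the example parameter count are hypothetical; a fuller estimate would also count activations and optimizer state.

```python
def estimate_weight_memory_mb(n_params, bytes_per_param=4):
    """Weights-only memory estimate in megabytes (10**6 bytes).

    Assumes every parameter is stored at the same precision
    (float32 by default); ignores activations and optimizer state.
    """
    return n_params * bytes_per_param / 1_000_000


# Hypothetical parameter count, chosen only to illustrate the arithmetic.
example_mb = estimate_weight_memory_mb(8_000_000)
print(example_mb)  # 32.0
```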

Figure: RNN-1x60_8M_params

Best-performing models: RNN60-LSTM has the highest accuracy (89.50% with 261.8 MB); RNN60-small has the best accuracy at low memory usage (83.86% with 9.8 MB).
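To make the trade-off between the two models concrete, the reported figures can be compared directly. This is just arithmetic on the numbers above; the variable names are illustrative.

```python
# Reported results: (accuracy in %, estimated memory usage)
lstm_acc, lstm_mem = 89.50, 261.8    # RNN60-LSTM
small_acc, small_mem = 83.86, 9.8    # RNN60-small

# Accuracy given up by choosing the small model, in percentage points.
acc_gap = round(lstm_acc - small_acc, 2)

# How many times more memory the LSTM model is estimated to use.
mem_ratio = round(lstm_mem / small_mem, 1)

print(acc_gap, mem_ratio)
```

In other words, RNN60-small gives up about 5.6 percentage points of accuracy while using roughly 27 times less estimated memory.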