Sphinx2023_hackathon

Models

1. Genre prediction model

We trained our model on a dataset from Kaggle. We preprocess and resample the audio with torchaudio before converting the dataset to the Hugging Face Dataset format.
We then vectorise the data using Meta's wav2vec2-base-960h to extract features from the audio.
Test-train split = 80:20
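
The steps above could look roughly like the sketch below. It is not the project's actual code: the function names, the mono downmix, and the fixed seed are assumptions made for illustration. The 16 kHz target is the sample rate wav2vec2-base-960h was trained on, and `facebook/wav2vec2-base-960h` is the checkpoint's id on the Hugging Face Hub. The feature extractor here only prepares normalised model inputs; whether the project also takes hidden states from the model is not stated.

```python
import random

TARGET_SAMPLE_RATE = 16_000  # wav2vec2-base-960h was trained on 16 kHz audio
MODEL_ID = "facebook/wav2vec2-base-960h"  # Hub id of Meta's checkpoint


def load_resampled(path, target_sr=TARGET_SAMPLE_RATE):
    """Load an audio file, resample it to target_sr, and downmix to mono."""
    import torchaudio  # imported lazily so split_indices works without it

    waveform, sample_rate = torchaudio.load(path)  # shape: (channels, frames)
    if sample_rate != target_sr:
        waveform = torchaudio.functional.resample(waveform, sample_rate, target_sr)
    return waveform.mean(dim=0)  # mono downmix (an assumption, not from the README)


def extract_input_values(waveform, target_sr=TARGET_SAMPLE_RATE):
    """Turn a mono 1-D waveform tensor into normalised wav2vec2 model inputs."""
    from transformers import Wav2Vec2FeatureExtractor  # lazy import, as above

    extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
    inputs = extractor(waveform.numpy(), sampling_rate=target_sr, return_tensors="pt")
    return inputs.input_values  # shape: (1, num_samples)


def split_indices(num_examples, test_fraction=0.2, seed=42):
    """Shuffle example indices and return (train, test) for an 80:20 split."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    num_test = round(num_examples * test_fraction)
    return indices[num_test:], indices[:num_test]
```

If the data is already a Hugging Face `Dataset`, its built-in `train_test_split(test_size=0.2)` method does the same job as `split_indices`; the plain-Python helper is shown only to make the 80:20 split explicit.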
Our model classifies the audio input into 10 genre classes:
- blues
- classical
- country
- disco
- hip-hop
- jazz
- metal
- pop
- reggae
- rock
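
As a minimal sketch of how the 10 classes above might map to model outputs: the id order (alphabetical, as listed) and the helper names `softmax` and `predict_genre` are assumptions for illustration, since the project's real label mapping is not shown here.

```python
import math

# Genre labels in the order listed above; the id assignment is an assumption.
GENRES = [
    "blues", "classical", "country", "disco", "hip-hop",
    "jazz", "metal", "pop", "reggae", "rock",
]
LABEL2ID = {label: idx for idx, label in enumerate(GENRES)}
ID2LABEL = {idx: label for label, idx in LABEL2ID.items()}


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_genre(logits):
    """Return (genre, probability) for the highest-scoring class."""
    if len(logits) != len(GENRES):
        raise ValueError(f"expected {len(GENRES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]
```

A mapping like `ID2LABEL` and `LABEL2ID` can also be passed as `id2label` and `label2id` when loading a Hugging Face classification model, so predictions come back with readable genre names.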
Model metrics:
[Image: screenshot of model metrics, 2023-11-05]