- The aim is to build a machine learning model that classifies music samples into genres.
- The model takes an audio signal as input and predicts its genre.
- Automating music classification makes selecting songs quicker and less time-consuming.
- It can also surface useful information, such as popular genres, trends, and artists.
- We use the GTZAN music classification dataset, which consists of 30-second audio tracks in .wav format.
- The dataset contains 10 genres of music:
- Classical, Jazz, Country, Rock, Blues, Reggae, Metal, Disco, Hip-hop, Pop
- GTZAN Dataset
- To extract features from the audio files, we use the LibROSA library.
- LibROSA is a library designed for music and audio analysis; it provides the building blocks needed to create a music information retrieval system.
- LibROSA helps to visualize audio signals and perform feature extraction using different signal processing techniques.
- Signals
- The Fourier Transform
- Zero-Crossing Rate
- Spectral Centroid
- Spectral Rolloff
- Chroma Frequencies
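
To make three of the features listed above concrete, here is a minimal sketch that computes the zero-crossing rate, spectral centroid, and spectral rolloff by hand for a single frame. It is not the project's code: LibROSA provides `librosa.feature.zero_crossing_rate`, `librosa.feature.spectral_centroid`, `librosa.feature.spectral_rolloff`, and `librosa.feature.chroma_stft`, which compute these per frame over a whole track. The synthetic tone, the sample rate, and the 0.85 rolloff fraction below are assumptions chosen for illustration.

```python
import cmath
import math


def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


def magnitude_spectrum(signal):
    """Magnitudes of a naive DFT for bins 0..N/2 (O(N^2), fine for a demo)."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(signal, sample_rate):
    """Magnitude-weighted mean frequency in Hz."""
    mags = magnitude_spectrum(signal)
    n = len(signal)
    total = sum(mags)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / n * m for k, m in enumerate(mags)) / total


def spectral_rolloff(signal, sample_rate, roll_percent=0.85):
    """Lowest frequency below which roll_percent of the magnitude lies."""
    mags = magnitude_spectrum(signal)
    n = len(signal)
    threshold = roll_percent * sum(mags)
    running = 0.0
    for k, m in enumerate(mags):
        running += m
        if running >= threshold:
            return k * sample_rate / n
    return (len(mags) - 1) * sample_rate / n


# Assumed test signal: an 8 Hz sine sampled at 256 Hz for one second.
# The small phase offset keeps samples from landing exactly on zero.
SAMPLE_RATE = 256
tone = [math.sin(2 * math.pi * 8 * t / SAMPLE_RATE + 0.1) for t in range(256)]

zcr = zero_crossing_rate(tone)
centroid = spectral_centroid(tone, SAMPLE_RATE)
rolloff = spectral_rolloff(tone, SAMPLE_RATE)
print(f"zcr={zcr:.4f} centroid={centroid:.2f} Hz rolloff={rolloff:.2f} Hz")
```

For a pure tone the centroid and rolloff both sit at the tone's frequency. Real music spreads energy across many frequencies, which is what makes these values differ between genres.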
- Naive Bayes Classifier
- Decision Trees
- K Nearest Neighbors
- Neural Networks
We first create a function that takes a model as input, fits it to the training dataset, and then predicts the genres of the tracks in the test dataset. It then prints the confusion matrix along with the model's accuracy.
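
A minimal sketch of such a helper is shown below. It assumes only that the model has `fit` and `predict` methods, which is the interface scikit-learn estimators provide. The name `evaluate`, the `MajorityClassifier` stand-in, and the toy data are hypothetical and are not taken from the project.

```python
from collections import Counter


def evaluate(model, X_train, y_train, X_test, y_test):
    """Fit the model, predict the test set, print confusion matrix and accuracy."""
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)

    labels = sorted(set(y_train) | set(y_test))
    counts = Counter(zip(y_test, predictions))
    # Rows are true genres, columns are predicted genres.
    matrix = [[counts[(true, pred)] for pred in labels] for true in labels]

    accuracy = sum(t == p for t, p in zip(y_test, predictions)) / len(y_test)
    print("labels:", labels)
    for label, row in zip(labels, matrix):
        print(f"{label:>10} {row}")
    print(f"accuracy: {accuracy:.1%}")
    return matrix, accuracy


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


# Toy data standing in for extracted audio features (values are made up).
X_train = [[0.1], [0.2], [0.9]]
y_train = ["jazz", "jazz", "metal"]
X_test = [[0.15], [0.8]]
y_test = ["jazz", "metal"]

matrix, accuracy = evaluate(MajorityClassifier(), X_train, y_train, X_test, y_test)
```

Because the helper relies only on `fit` and `predict`, the same call can be repeated for each classifier being compared.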
Model | Accuracy |
---|---|
Naive Bayes Classifier | 51.9% |
Decision Trees | 64.4% |
K Nearest Neighbors | 80.5% |
Neural Networks | 67.1% |
KNN has the highest accuracy, so the final model is built on it. Using this model, we then compute the feature importance.
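
The text does not say how feature importance is computed, and KNN has no built-in importance scores. One model-agnostic option is permutation importance: shuffle one feature column at a time and measure how much the accuracy drops. The sketch below illustrates that idea under stated assumptions. The tiny `KNN` class, the function names, and the toy data are hypothetical. scikit-learn offers a comparable routine as `sklearn.inspection.permutation_importance`.

```python
import math
import random
from collections import Counter


class KNN:
    """Tiny k-nearest-neighbours classifier used only for this illustration."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = list(X), list(y)
        return self

    def predict(self, X):
        out = []
        for row in X:
            nearest = sorted(
                zip(self.X, self.y), key=lambda pair: math.dist(pair[0], row)
            )[: self.k]
            votes = Counter(label for _, label in nearest)
            out.append(votes.most_common(1)[0][0])
        return out


def accuracy(model, X, y):
    return sum(p == t for p, t in zip(model.predict(X), y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, shuffled, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Made-up data: feature 0 separates the two genres, feature 1 is constant.
X_train = [[0.0, 1.0], [0.1, 1.0], [0.2, 1.0], [0.3, 1.0],
           [10.0, 1.0], [10.1, 1.0], [10.2, 1.0], [10.3, 1.0]]
y_train = ["jazz"] * 4 + ["metal"] * 4
X_test = [[0.05, 1.0], [0.15, 1.0], [0.25, 1.0], [0.35, 1.0],
          [10.05, 1.0], [10.15, 1.0], [10.25, 1.0], [10.35, 1.0]]
y_test = ["jazz"] * 4 + ["metal"] * 4

model = KNN(k=3).fit(X_train, y_train)
importances = permutation_importance(model, X_test, y_test)
print(importances)
```

A feature whose shuffling leaves the accuracy unchanged, like the constant second column here, gets an importance of zero.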