This is a simple audio classification API built to classify whether the sound in an audio file comes from a cat or a dog.
Given a .wav audio file, the model predicts whether the sound belongs to a cat or a dog.
{
"predictions": {
"class": "dog",
"label": 1,
"probability": 1.0
},
"success": true
}
To start the server and begin classifying audio, first make sure you are in the server folder, then run the following commands:
- Creating a virtual environment (the activation path shown is for Windows)
virtualenv venv && .\venv\Scripts\activate.bat
- Installing packages
pip install -r requirements.txt
- Starting the server
python api/app.py
The server starts on the default port 3001, and you can then make API requests to it to classify audio.
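Before sending requests, it can help to confirm that something is actually listening on the port. The following is a minimal sketch using only the Python standard library; the helper name `is_server_up` is hypothetical, and the host and port defaults simply mirror the values mentioned above.

```python
import socket


def is_server_up(host: str = "127.0.0.1", port: int = 3001, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    This only checks that the port is open; it does not verify that the
    process behind it is the classification API.
    """
    try:
        # create_connection raises OSError (refused, timed out, ...) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `is_server_up()` after running `python api/app.py` should return `True` once the server has finished starting.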
The following table summarizes the metrics obtained after training the model for 15 epochs.
model name | model description | test accuracy | validation accuracy | train accuracy | test loss | validation loss | train loss |
---|---|---|---|---|---|---|---|
cats-dogs-sound-cnn.pt | CNN for audio classification of dog and cat sounds. | 90.7% | 90.7% | 93.5% | 0.621 | 0.218 | 0.209 |
The following is the classification report for the model on the test dataset.
# | precision | recall | f1-score | support |
---|---|---|---|---|
accuracy | - | - | 90% | 2305 |
macro avg | 91% | 90% | 90% | 2305 |
weighted avg | 92% | 89% | 90% | 2305 |
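To make the difference between the macro and weighted averages concrete, here is a small sketch that derives per-class precision, recall, and F1 from a confusion matrix. The function name `report` and the counts in the usage note are hypothetical and are not this model's results; the class order (cat first, dog second) is an assumption based on the `"label": 1` returned for dog.

```python
from typing import Dict, List, Tuple


def report(matrix: List[List[int]], labels: List[str]) -> Tuple[Dict, float, Dict, Dict]:
    """Compute per-class metrics, accuracy, macro and weighted averages.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(labels)
    total = sum(sum(row) for row in matrix)
    rows: Dict[str, Dict[str, float]] = {}
    for i, name in enumerate(labels):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        support = sum(matrix[i])                         # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows[name] = {"precision": precision, "recall": recall, "f1": f1, "support": support}

    accuracy = sum(matrix[i][i] for i in range(n)) / total
    keys = ("precision", "recall", "f1")
    # Macro average: every class counts equally.
    macro = {k: sum(r[k] for r in rows.values()) / n for k in keys}
    # Weighted average: every class counts in proportion to its support.
    weighted = {k: sum(r[k] * r["support"] for r in rows.values()) / total for k in keys}
    return rows, accuracy, macro, weighted
```

For example, `report([[45, 5], [10, 40]], ["cat", "dog"])` uses made-up counts for 100 samples; with equal support per class, the macro and weighted averages coincide, and they diverge as the classes become imbalanced.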
The following figure shows a confusion matrix for the classification model.
If you send a POST request to http://localhost:3001/classify with an audio file in the right format, the server returns the following JSON response to the client.
{
"predictions": {
"class": "dog",
"label": 1,
"probability": 1.0
},
"success": true
}
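A client will usually want to pull the individual fields out of this response. The sketch below shows one way to do that in Python; `parse_prediction` is a hypothetical helper, and treating a falsy `success` as an error is an assumption, since the shape of an unsuccessful response is not documented here.

```python
import json
from typing import Tuple


def parse_prediction(body: str) -> Tuple[str, int, float]:
    """Extract (class, label, probability) from a /classify JSON response."""
    data = json.loads(body)
    # Assumption: an unsuccessful request sets "success" to false.
    if not data.get("success"):
        raise ValueError("server reported an unsuccessful classification")
    prediction = data["predictions"]
    return prediction["class"], int(prediction["label"]), float(prediction["probability"])


sample = '{"predictions": {"class": "dog", "label": 1, "probability": 1.0}, "success": true}'
print(parse_prediction(sample))  # ('dog', 1, 1.0)
```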
Make sure that the audio file cat.wav is in the folder where you run the command; otherwise, provide an absolute or relative path to the audio file.
To make a curl POST request to http://localhost:3001/classify with the file cat.wav, run the following command.
# for cat
curl -X POST -F audio=@cat.wav http://127.0.0.1:3001/classify
# for dog
curl -X POST -F audio=@dog.wav http://127.0.0.1:3001/classify
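The same request can be sent from Python. The following is a rough sketch using only the standard library: it builds the multipart form body by hand and posts it under the `audio` field, as in the curl commands above. The helper names `build_multipart` and `classify` are hypothetical, and the `audio/wav` part content type is an assumption; the server may not require any particular value.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def build_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Return (body, content_type) for a single-file multipart/form-data request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"  # assumption: part type for .wav files
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def classify(path: str, url: str = "http://127.0.0.1:3001/classify") -> dict:
    """POST the audio file at `path` to the server and return the parsed JSON."""
    audio = Path(path)
    body, content_type = build_multipart("audio", audio.name, audio.read_bytes())
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `classify("cat.wav")` should return a dictionary with the same `predictions` and `success` keys shown in the JSON response.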
To make this request with Postman, do the following:
- Change the request method to POST at http://127.0.0.1:3001/classify
- Click on form-data
- Select the type file on the KEY attribute
- For the KEY, type audio and select the audio you want to predict under value
- Click send
If everything went well you will get the following response depending on the audio you have selected:
{
"predictions": { "class": "dog", "label": 1, "probability": 1.0 },
"success": true
}
To make this request from JavaScript using fetch:
- First, get the file input from the HTML
- Create a FormData object
- Make a POST request
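// Get the selected file from the file input element with id "input"
const input = document.getElementById("input").files[0];
// Build a multipart form body; the server expects the file under the "audio" key
const formData = new FormData();
formData.append("audio", input);
// POST the form to the classification endpoint and log the parsed JSON response
fetch("http://127.0.0.1:3001/classify", {
method: "POST",
body: formData,
})
.then((res) => res.json())
.then((data) => console.log(data));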
If everything went well, you will get the expected response:
{
"predictions": { "class": "dog", "label": 1, "probability": 1.0 },
"success": true
}
- All notebooks for training and saving the models are found in the notebooks folder of this repository.