
A template for your own music recognition machine learning project.


This repository is a template for a machine learning project for multi-class classification of music files.

It is designed so that you can try out different experimental conditions (model type, dataset creation, etc.) by editing the configuration file.

In addition, MLflow is used to track and manage the experimental results.

Note that this repository currently provides only basic functionality and will be expanded in the future.


The currently implemented approach is as follows.

  1. Clip an arbitrary time interval from a music file.
  2. Apply augmentation to the signal data.
    • Time stretch
    • Additive white noise
    • Pitch shift
    • Volume change
  3. Generate a Mel-spectrogram image from the signal data using three different parameter settings.
  4. Train an image classification model, such as ResNet, on the resulting images.
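
The clipping, augmentation, and Mel-spectrogram steps above can be sketched with NumPy alone. This is a minimal illustration, not the repository's actual code: every function name and default value below is hypothetical, and it assumes that the "three different parameters" are three FFT window sizes whose spectrograms are stacked as the three channels of one image. Time stretch and pitch shift are left out because they are usually delegated to an audio library (for example `librosa.effects.time_stretch` and `librosa.effects.pitch_shift`).

```python
import numpy as np


def clip_interval(signal, sr, start_sec, duration_sec):
    """Return the samples in [start_sec, start_sec + duration_sec)."""
    start = int(start_sec * sr)
    return signal[start:start + int(duration_sec * sr)]


def add_white_noise(signal, noise_level, rng):
    """Add Gaussian white noise scaled by noise_level."""
    return signal + noise_level * rng.standard_normal(len(signal))


def change_volume(signal, gain_db):
    """Scale the amplitude by a gain given in decibels."""
    return signal * 10.0 ** (gain_db / 20.0)


def hz_to_mel(freq_hz):
    """Convert Hz to mel (HTK-style formula)."""
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_bins)."""
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    lower, center, upper = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (freqs - lower) / (center - lower)
    falling = (upper - freqs) / (upper - center)
    return np.clip(np.minimum(rising, falling), 0.0, None)


def log_mel_spectrogram(signal, sr, n_fft, hop, n_mels):
    """Log-scaled mel spectrogram of shape (n_mels, n_frames)."""
    # Centre the frames by reflect-padding half a window on each side.
    padded = np.pad(signal, n_fft // 2, mode="reflect")
    n_frames = 1 + (len(padded) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = padded[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T
    return 10.0 * np.log10(mel + 1e-10)  # small offset avoids log(0)


def to_three_channel_image(signal, sr, n_ffts=(512, 1024, 2048), hop=256, n_mels=64):
    """Stack spectrograms from three window sizes into a (3, n_mels, n_frames) array.

    Assumption: a shared hop length and even window sizes keep the frame
    count identical across channels, so the arrays can be stacked directly.
    """
    channels = [log_mel_spectrogram(signal, sr, n, hop, n_mels) for n in n_ffts]
    return np.stack(channels, axis=0)
```

For example, a one-second clip could be passed through `add_white_noise(clip, 0.01, np.random.default_rng(0))` and `change_volume(clip, 6.0)` before `to_three_channel_image(clip, sr)`. The resulting array has the channel-first layout that image classifiers such as ResNet commonly expect.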


  • Windows 10 Home 64-bit
  • Anaconda

    $conda create -n {env_name} python=3.7.2
    $activate {env_name}
    $pip install -r requirements.txt

You can also build the environment with nvidia-docker on WSL2. See the Dockerfile for details.


Start Training

  1. Create a dataset CSV

  2. Modify settings.yaml

  3. Run the training script

    $python settings.yaml {experiment name}
  4. Confirm the results of the experiment.

    $mlflow ui
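
The format of the dataset CSV in step 1 is not specified here. As a rough illustration only, the sketch below assumes a hypothetical layout in which each class has its own sub-folder of audio files, and writes a CSV with hypothetical `path` and `label` columns. The function name, the column names, and the accepted extensions are all assumptions, so adapt them to whatever the training script actually reads.

```python
import csv
from pathlib import Path


def write_dataset_csv(audio_root, csv_path, extensions=(".wav", ".mp3")):
    """Write one (path, label) row per audio file and return the row count.

    Assumption: the label of a file is the name of its parent folder,
    e.g. audio_root/jazz/track01.wav gets the label "jazz".
    """
    rows = []
    for file in sorted(Path(audio_root).rglob("*")):
        if file.is_file() and file.suffix.lower() in extensions:
            rows.append({"path": str(file), "label": file.parent.name})

    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["path", "label"])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `write_dataset_csv("audio", "dataset.csv")` would then produce one row per matching file under the hypothetical `audio` folder.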

Start Realtime Application

  1. Modify deploy.yml

  2. Run the application script

    $python deploy.yml


Copyright © 2020 T_Sumida. Distributed under the MIT License.