
Qawwali audio dataset with a genre classifier

Primary language: Python. License: MIT.


An audio dataset for genre recognition of Qawwali



  • data: Collection of 72 Qawwali songs, each one minute long. Audio is mono with a sample rate of 44100 Hz.
  • metadata: JSON-formatted schema and metadata file describing the QawwalRang dataset (song name, artist, URL, offset and duration).
  • src: Two Python 3 programs: one to build (or rebuild) the dataset from the metadata file, and another that attempts to recognize Qawwali as the genre of a given song.
  • article: Documentation with motivation, description and results from this work
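
To make the metadata fields above concrete, here is a minimal sketch of how an offset and duration could map to a range of samples at 44100 Hz. The key names (`name`, `artist`, `url`, `offset`, `duration`), the example values, and the assumption that offset and duration are in seconds are all hypothetical; the real schema is the one shipped in the metadata folder.

```python
import json

SAMPLE_RATE_HZ = 44100  # sample rate stated in the dataset description

# Hypothetical entry: key names and values are illustrative assumptions,
# not copied from the real QawwalRang metadata file.
EXAMPLE_ENTRY_JSON = """
{
  "name": "example-song",
  "artist": "example-artist",
  "url": "https://example.com/example-song",
  "offset": 30,
  "duration": 60
}
"""


def sample_span(entry, sample_rate=SAMPLE_RATE_HZ):
    """Return (start, end) sample indices of the excerpt, end exclusive.

    Assumes 'offset' and 'duration' are expressed in seconds.
    """
    start = int(entry["offset"] * sample_rate)
    end = start + int(entry["duration"] * sample_rate)
    return start, end


entry = json.loads(EXAMPLE_ENTRY_JSON)
start, end = sample_span(entry)
print(f"excerpt covers samples {start}..{end} ({end - start} samples)")
```

Under these assumptions a one-minute mono excerpt holds 60 × 44100 = 2,646,000 samples, so the 72 songs amount to 72 minutes of audio.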


Building the dataset

usage: qdsb.py [-h] [--opath OFFLINE_PATH] [--info] datapath metadata

Qawwali dataset builder

positional arguments:
  datapath              Folder/directory path where qawali reference dataset
                        will be built
  metadata              Json metadata file describing reference qawali dataset

optional arguments:
  -h, --help            show this help message and exit
  --opath OFFLINE_PATH  Folder/directory to look for qawali songs. Alternate
                        to internet download
  --info                Asks the program to report dataset statistics
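
The help text describes `--opath` as an alternative to downloading from the internet. The sketch below shows one way such a choice could be made per song; it is not the logic of `qdsb.py` itself. The function name `resolve_source`, the entry keys, and the `.mp3` file naming are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

AUDIO_SUFFIX = ".mp3"  # assumption: the real builder may use another naming scheme


def resolve_source(entry, offline_path=None):
    """Return ("offline", path) if a local copy exists, else ("download", url)."""
    if offline_path is not None:
        candidate = Path(offline_path) / (entry["name"] + AUDIO_SUFFIX)
        if candidate.is_file():
            return "offline", str(candidate)
    # No usable local copy: fall back to the URL recorded in the metadata.
    return "download", entry["url"]


# Hypothetical metadata entry, used only to exercise the function.
entry = {"name": "example-song", "url": "https://example.com/example-song"}

with tempfile.TemporaryDirectory() as folder:
    print(resolve_source(entry, folder))  # no local file yet
    (Path(folder) / "example-song.mp3").touch()  # simulate an offline copy
    print(resolve_source(entry, folder))  # local file is now preferred
```

Preferring a local file when one exists keeps a rebuild reproducible even if a URL in the metadata stops working.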

Running the classifier

usage: qdetect.py [-h] [--reload] [--extract]
                  [--compare [genre features directory]]

Qawwali genre detection program

positional arguments:
  songs_dir             folder/directory containing songs to be evaluated

optional arguments:
  -h, --help            show this help message and exit
  --reload              reload data from songs (required at least once per
                        songs directory)
  --extract             extract suitable audio features from raw data
                        (required at least once)
  --compare [genre features directory]
                        generates classification results comparing qawali wtih
                        other genre
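
The square brackets around the `--compare` value indicate that the directory is optional. A minimal `argparse` sketch of an option with that shape follows; it is a reconstruction from the help text above, not the source of `qdetect.py`. The placeholder `HYPOTHETICAL_DEFAULT_DIR` is an assumption, since the help text does not say which directory is used when `--compare` is given without a value.

```python
import argparse

# Assumption: stands in for whatever qdetect.py uses when --compare has no value.
HYPOTHETICAL_DEFAULT_DIR = "genre-features"


def build_parser():
    parser = argparse.ArgumentParser(
        prog="qdetect.py", description="Qawwali genre detection program"
    )
    parser.add_argument("songs_dir", help="folder containing songs to be evaluated")
    parser.add_argument("--reload", action="store_true", help="reload data from songs")
    parser.add_argument("--extract", action="store_true", help="extract audio features")
    # nargs="?" makes the value optional: absent flag -> default,
    # bare flag -> const, flag with a value -> that value.
    parser.add_argument(
        "--compare", nargs="?", const=HYPOTHETICAL_DEFAULT_DIR, default=None
    )
    return parser


parser = build_parser()
first_run = parser.parse_args(["songs", "--reload", "--extract"])
bare_compare = parser.parse_args(["songs", "--compare"])
with_dir = parser.parse_args(["songs", "--compare", "other-genre-features"])
print(first_run.compare, bare_compare.compare, with_dir.compare)
```

In this sketch the songs directory is placed before `--compare`; written the other way round, `argparse` would take the directory as the value of `--compare` and then report the positional argument as missing.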


  • Artist map: artist distribution

  • Musical properties of the dataset: Thaat/Raag/Taal distribution

  • Qawwali recognition against the GTZAN dataset: genre recognition results