A-dataset-of-Indian-Folk-Songs

This repository provides audio samples (5-second chunks) for Indian folk song classification. The dataset covers folk songs from the following five regions of India:

  1. Assamese from Assam
  2. Uttarakhandi from Uttarakhand
  3. Kashmiri from Kashmir
  4. Kannada from Karnataka
  5. Marathi from Maharashtra

The objective of this repository is to give researchers a dataset on which to test their algorithms. The database can be used for (a) folk song classification, (b) musical instrument classification, and (c) language classification. For every task except the first, the classes have to be labeled manually.

Current Status

  • Audio in 5 Indian languages
  • 307 songs and 1807 audio clips

  • Language        Number of Songs    Number of Clips
    Assamese        141                798
    Uttarakhandi    29                 174
    Kashmiri        39                 241
    Kannada         63                 384
    Marathi         35                 210

Organization

Files are named in the following format:

{Language}{FileNumber}chunk({chunkIndex}).wav

Example: Assamese100chunk(0).wav
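Because the language is embedded in each file name, labels for folk song classification can be recovered from the names alone. The following is a minimal sketch, assuming every file follows the naming format above exactly. The helper names `parse_clip_name` and `count_clips` are hypothetical and are not part of this repository.

```python
import re
from collections import Counter

# Matches {Language}{FileNumber}chunk({chunkIndex}).wav
# (assumption: the language part contains letters only).
CLIP_NAME = re.compile(
    r"^(?P<language>[A-Za-z]+)(?P<file_number>\d+)"
    r"chunk\((?P<chunk_index>\d+)\)\.wav$"
)


def parse_clip_name(name):
    """Split a clip file name into (language, file number, chunk index)."""
    match = CLIP_NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected clip name: {name!r}")
    return (
        match.group("language"),
        int(match.group("file_number")),
        int(match.group("chunk_index")),
    )


def count_clips(names):
    """Count clips per language for an iterable of clip file names."""
    return Counter(parse_clip_name(name)[0] for name in names)


if __name__ == "__main__":
    print(parse_clip_name("Assamese100chunk(0).wav"))  # ('Assamese', 100, 0)
```

The file number identifies the source song, so the same helper can also be used to keep all chunks of one song together, for example when splitting the data for evaluation.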

Metadata

metadata.py contains metadata about the song languages and their regions of India.

Included Utilities

featurevector.py: Generates the mean, median, and standard deviation of the 19 features of each audio file and writes them to a CSV file.

melspectogram.py: Converts audio data into mel spectrogram visualizations for all categories.

mfcc_visual.py: Converts the MFCC audio features into a visual representation.

spectogram.py: Converts audio data into spectrograms, which are often used as a preprocessing step.
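The summary step described for featurevector.py can be sketched as follows. This is not the code in featurevector.py: it assumes each feature has already been extracted as a list of per-frame values, it uses the population standard deviation (`statistics.pstdev`) as an assumption, and the function names `summarize_features` and `rows_to_csv` as well as the feature names `rms` and `zcr` are hypothetical placeholders for the 19 features.

```python
import csv
import io
import statistics


def summarize_features(frames_by_feature):
    """Reduce per-frame values of each feature to mean, median and std."""
    row = {}
    for name, values in frames_by_feature.items():
        row[f"{name}_mean"] = statistics.mean(values)
        row[f"{name}_median"] = statistics.median(values)
        # Population standard deviation; an assumption of this sketch.
        row[f"{name}_std"] = statistics.pstdev(values)
    return row


def rows_to_csv(rows):
    """Serialize a non-empty list of row dictionaries to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Made-up per-frame values, for illustration only.
    frames = {"rms": [1.0, 2.0, 3.0], "zcr": [0.5, 0.5, 1.0, 2.0]}
    row = {"file": "Assamese100chunk(0).wav", **summarize_features(frames)}
    print(rows_to_csv([row]))
```

Each audio file contributes one row, so 19 features would yield 57 summary columns plus the file name.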

Acknowledgement

This database was collected from various publicly available websites and other online resources.

Usage

This database may be used for any academic or research purpose.

Citation

If you use this database in your research, kindly cite the following paper, accepted at FRSM 2020 (citation details will be available after publication).