A-dataset-of-Indian-Folk-Songs
This repository provides audio samples (5-second chunks) for Indian folk song classification. Folk songs from the following five regions of India are used in this research.
- Assamese from Assam
- Uttarakhandi from Uttarakhand
- Kashmiri from Kashmir
- Kannada from Karnataka
- Marathi from Maharashtra
The objective of this repository is to give researchers a dataset on which to test their algorithms. The dataset can be used for (a) folk song classification, (b) musical instrument classification, and (c) language classification. Except for the first case, classes have to be manually labeled.
Current Status
- Audio in 5 Indian Languages
- 307 songs and 1807 audio clips
Language | Number of Songs | Number of Clips |
---|---|---|
Assamese | 141 | 798 |
Uttarakhandi | 29 | 174 |
Kashmiri | 39 | 241 |
Kannada | 63 | 384 |
Marathi | 35 | 210 |
Organization
Files are named in the following format: {Language}{FileNumber}chunk({chunkIndex}).wav
Example: Assamese100chunk(0).wav
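The naming format above can be parsed with a regular expression. The following is a minimal sketch, not code from this repository: `CLIP_NAME`, `ClipName`, `parse_clip_name`, and `group_by_song` are hypothetical names, and it assumes that the language part contains only letters and that FileNumber identifies one song within a language.

```python
import re
from collections import defaultdict
from typing import NamedTuple, Optional

# Pattern for names such as "Assamese100chunk(0).wav":
# letters (language), digits (file number), "chunk(", digits (chunk index), ").wav".
# Assumption: the language part contains letters only.
CLIP_NAME = re.compile(
    r"^(?P<language>[A-Za-z]+)(?P<file_number>\d+)chunk\((?P<chunk_index>\d+)\)\.wav$"
)


class ClipName(NamedTuple):
    language: str
    file_number: int
    chunk_index: int


def parse_clip_name(filename: str) -> Optional[ClipName]:
    """Return the parsed fields, or None if the name does not match the format."""
    match = CLIP_NAME.match(filename)
    if match is None:
        return None
    return ClipName(
        match["language"], int(match["file_number"]), int(match["chunk_index"])
    )


def group_by_song(filenames):
    """Map (language, file_number) to the sorted chunk indices found for that song.

    Assumption: FileNumber identifies one song within a language.
    """
    songs = defaultdict(list)
    for name in filenames:
        clip = parse_clip_name(name)
        if clip is not None:  # skip files that do not follow the naming format
            songs[(clip.language, clip.file_number)].append(clip.chunk_index)
    return {song: sorted(chunks) for song, chunks in songs.items()}
```

For example, `parse_clip_name("Assamese100chunk(0).wav")` yields the language `Assamese`, file number `100`, and chunk index `0`. The language field can then serve directly as the label for the folk song classification task.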
Metadata
metadata.py contains metadata about the song languages and their regions of India.
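As an illustration of what such a language-to-region lookup might look like, here is a small sketch built only from the region list and the table in this README. It does not reproduce the contents of metadata.py; `LanguageInfo` and `LANGUAGES` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguageInfo:
    region: str
    songs: int
    clips: int


# Values copied from the region list and the table in this README.
# LANGUAGES and LanguageInfo are hypothetical names, not taken from metadata.py.
LANGUAGES = {
    "Assamese": LanguageInfo("Assam", 141, 798),
    "Uttarakhandi": LanguageInfo("Uttarakhand", 29, 174),
    "Kashmiri": LanguageInfo("Kashmir", 39, 241),
    "Kannada": LanguageInfo("Karnataka", 63, 384),
    "Marathi": LanguageInfo("Maharashtra", 35, 210),
}

# Totals across all five languages, for comparison with the Current Status figures.
total_songs = sum(info.songs for info in LANGUAGES.values())
total_clips = sum(info.clips for info in LANGUAGES.values())
```

The per-language figures sum to the 307 songs and 1807 clips reported under Current Status.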
Included Utilities
- featurevector.py: Generates the mean, median, and standard deviation of the 19 features of each audio file and writes them to a CSV file.
- melspectogram.py: Converts audio data into mel spectrogram visualisations for all categories.
- mfcc_visual.py: Converts the MFCC audio features into a visual representation.
- spectogram.py: Converts audio data into spectrograms, which are often used as a preprocessing step.
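The summary step described for featurevector.py (mean, median, and standard deviation per feature) can be sketched with the Python standard library alone. This is an illustrative sketch, not the script's actual code: it assumes each clip has already been turned into a list of per-frame feature rows, it does not say which 19 features are extracted, and it uses the population standard deviation, which may differ from the script's choice. The names `summarize_features`, `csv_header`, and `write_feature_csv` are hypothetical.

```python
import csv
import statistics


def summarize_features(frames):
    """Collapse per-frame feature rows into one flat list:
    all means, then all medians, then all standard deviations.

    frames: a sequence of rows, one per frame, each holding one value per feature.
    """
    columns = list(zip(*frames))  # one tuple of values per feature
    means = [statistics.fmean(col) for col in columns]
    medians = [statistics.median(col) for col in columns]
    # Assumption: population standard deviation (divides by N, not N - 1).
    stds = [statistics.pstdev(col) for col in columns]
    return means + medians + stds


def csv_header(n_features):
    """Column names in the same order that summarize_features emits values."""
    return ["filename"] + [
        f"{stat}_{i}" for stat in ("mean", "median", "std") for i in range(n_features)
    ]


def write_feature_csv(path, rows, n_features):
    """Write one CSV row per clip: the file name followed by its summary vector.

    rows: an iterable of (filename, per-frame feature rows) pairs.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(csv_header(n_features))
        for filename, frames in rows:
            writer.writerow([filename, *summarize_features(frames)])
```

With 19 features, each clip produces 57 summary values (19 means, 19 medians, 19 standard deviations) plus the file name column.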