A dataset of screaming vocal samples.

EVA: The Extreme Vocals Archive Dataset

This is my final project for the MIR (music information retrieval) course at NTHU. I decided to build a dataset of screaming vocal samples because this kind of vocal has received little attention in the MIR field. The dataset may serve as a starting point for future research.

Raw audio

I picked the 16 songs listed below from The 'Mixing Secrets' Free Multitrack Download Library and used Audacity to slice their vocal tracks into short clips, ending up with 565 samples.

The Apprehended: 'Still Flyin'
Cnoc An Tursa: 'Bannockburn'
The Complainiacs: 'Etc'
Dark Ride: 'Burning Bridges'
Dark Ride: 'Hammer Down'
Dark Ride: 'Piece Of Me'
Death Of A Romantic: 'The Well'
Decypher: 'Unseen'
Headwound Harry: 'XXXV'
Hollow Ground: 'Ill Fate'
Hollow Ground: 'Left Blind'
Last Legacy: 'Who's Who In Hell'
Storm Of Particles: 'Of Ice And Hopeless Fate'
Timboz: 'Pony'
Titanium: 'Haunted Age'
Wall Of Death: 'Femme'

I named each clip using initials, so the first clip from the track 43_Vox18.wav in The Apprehended - Still Flyin is named TA_SF_43_Vox18_01.wav, and so on. I can't provide the raw audio, but the files in mark_time contain the start and end time of each clip, so you can use them to recreate the clips manually.
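The naming rule and the mark_time idea can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used (the clips were cut by hand in Audacity). It assumes each mark_time file is plain text with one clip per line, start time then end time in seconds, separated by whitespace; the real layout may differ. The helper names (`initials`, `clip_name`, `read_mark_times`, `slice_track`) are hypothetical, and the standard `wave` module only handles PCM WAV files.

```python
import wave
from pathlib import Path


def initials(name):
    """First letter of each word, upper-cased: 'Still Flyin' -> 'SF'."""
    return "".join(word[0].upper() for word in name.split())


def clip_name(artist, song, track_stem, index):
    """Build a clip name such as TA_SF_43_Vox18_01.wav."""
    return f"{initials(artist)}_{initials(song)}_{track_stem}_{index:02d}.wav"


def read_mark_times(path):
    """Read (start, end) pairs in seconds, one clip per line.

    Assumed format: start and end separated by whitespace;
    any extra columns (e.g. a label) are ignored.
    """
    pairs = []
    for line in Path(path).read_text().splitlines():
        fields = line.split()
        if len(fields) >= 2:
            pairs.append((float(fields[0]), float(fields[1])))
    return pairs


def slice_track(track_path, marks, artist, song, out_dir):
    """Cut one vocal track into clips and return the written paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(track_path).stem
    written = []
    with wave.open(str(track_path), "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        for i, (start, end) in enumerate(marks, start=1):
            # Jump to the start frame and read the clip's frames.
            src.setpos(int(start * rate))
            frames = src.readframes(int((end - start) * rate))
            out_path = out_dir / clip_name(artist, song, stem, i)
            with wave.open(str(out_path), "wb") as dst:
                # Same channels, sample width and rate as the source;
                # the frame count in the header is fixed on close.
                dst.setparams(params)
                dst.writeframes(frames)
            written.append(out_path)
    return written
```

For example, `clip_name("The Apprehended", "Still Flyin", "43_Vox18", 1)` gives `TA_SF_43_Vox18_01.wav`. Whether every song's initials were formed exactly this way is not stated here, so check the names in mark_time before relying on it.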

Data

Code

Requires pandas, sklearn, matplotlib and librosa to be installed.

  1. get_features: extracts features from the audio (used to create features.csv). Most of the code is borrowed from FMA.
  2. analysis: shows the mel spectrograms of some samples. Requires the raw audio files.
  3. baseline: a baseline SVM model for vocal technique recognition.
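To give a rough idea of what an SVM baseline of this kind looks like, here is a minimal scikit-learn sketch. It is not the code in baseline: the features are random synthetic data standing in for features.csv, the three class names are placeholders, and the kernel, `C` and split ratio are arbitrary choices made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the feature table: 3 placeholder classes,
# 60 samples each, 20 features, with class means shifted apart.
rng = np.random.default_rng(0)
n_per_class, n_features = 60, 20
X = np.vstack([
    rng.normal(loc=3.0 * k, scale=1.0, size=(n_per_class, n_features))
    for k in range(3)
])
y = np.repeat(["technique_a", "technique_b", "technique_c"], n_per_class)

# Hold out a stratified test set so every class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Standardize the features, then fit an RBF-kernel SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.3f}")
```

Scaling before the SVM matters when features have very different ranges, which is typical for audio features. To try this on the real data, replace `X` and `y` with the feature columns and labels from features.csv.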

Melspectrogram of some samples

Samples distribution after dimension reduction using LDA
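A projection like this can be produced with scikit-learn's LinearDiscriminantAnalysis. The sketch below uses synthetic data and placeholder class names rather than the real features, so it only shows the mechanics; it is not the code that produced the figure.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Synthetic stand-in for the feature table (placeholder classes).
rng = np.random.default_rng(0)
n_per_class, n_features = 60, 20
X = np.vstack([
    rng.normal(loc=3.0 * k, scale=1.0, size=(n_per_class, n_features))
    for k in range(3)
])
y = np.repeat(["technique_a", "technique_b", "technique_c"], n_per_class)

# LDA can give at most (number of classes - 1) components,
# so 3 classes allow a 2-D projection.
lda = LinearDiscriminantAnalysis(n_components=2)
X_2d = lda.fit_transform(X, y)

print(X_2d.shape)
```

Each row of `X_2d` is one sample in the reduced space; plotting the two columns against each other with matplotlib, colored by `y`, gives a scatter plot of the sample distribution.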

Environment

  • Ubuntu 16.04 LTS
  • Python 3.5.2 (using PyCharm 2018.1.4)

I also wrote a report in Chinese about how the dataset was made.