/philharmonia-dataset

PyTorch Dataset for collection of 14,000 Philharmonia sound samples

Primary LanguagePythonMIT LicenseMIT

philharmonia-dataset

This is a PyTorch Dataset implementation for 14,000 sound samples of the Philharmonia Orchestra, retrieved from their website

Installing

Clone the repo and install using pip (and install ffmpeg)

apt-get install ffmpeg
git clone https://github.com/hugofloresgarcia/philharmonia-dataset
cd philharmonia-dataset && pip install -e .

Usage (Python 3):

from philharmonia_dataset import PhilharmoniaSet

# create a dataset object
dataset = PhilharmoniaDataset(root='./data/philharmonia', 
							 download=True, 
							 sample_rate=48000,)

During the first run, calling PhilharmoniaDataset will download the audio files from here and convert mp3 files to wav, for faster loading. This will take approximately 5-10 minutes.

sample output

dataset[0]

{
	audio (np.ndarray): audio array with shape (channels, samples)
	one_hot (np.ndarray): one hot encoding of label
	instrument (str): instrument name
	articulation (str): playing articulation (e.g 'pizz-normal') for pizzicato
	dynamic (str): playing dynamic (e.g. 'forte')
	pitch (str): pitch (e.g. 'B5'). If instrument is unpitched, will return 'nan'. 
}

Labels

each example is assigned one of the following labels:

Index Label
0 banjo
1 bass-clarinet
2 bassoon
3 cello
4 clarinet
5 contrabassoon
6 double-bass
7 english-horn
8 flute
9 french-horn
10 guitar
11 mandolin
12 oboe
13 saxophone
14 trombone
15 trumpet
16 tuba
17 viola
18 violin

for example, the one_hot encoding of clarinet would be

[0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

(index number 4 is 1, while all other indices are 0)