AUDIOMATE

Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a generic way. This should ease the use of audio datasets for example for machine learning tasks.

import audiomate
from audiomate.corpus import io

# Download a dataset
esc_downloader = io.ESC50Downloader()
esc_downloader.download('/local/path')

# Load and work with the dataset
esc50 = audiomate.Corpus.load('/local/path', reader='esc-50')

# e.g. Read the audio signal and the label of specific sample/utterance
utterance = esc50.utterances['1-100032-A-0']
samples = utterance.read_samples()
label = utterance.label_lists[audiomate.corpus.LL_SOUND_CLASS][0].value

Furthermore it provides tools for interacting with datasets (validation, splitting, subsets, merge, filter), extracting features, feeding samples for training ML models and more.

Currently supported datasets:

Currently supported formats:

Indirectly supported datasets (Details):

Spoken Wikipedia Corpora

Installation

pip install audiomate

Install the latest development version:

pip install git+https://github.com/ynop/audiomate.git

Development

Prerequisites

A supported version of Python 3

It's recommended to use a virtual environment when developing audiomate. To create one, execute the following command in the project's root directory:

python -m venv .

To install audiomate and all it's dependencies, execute:

pip install -e .

Running the test suite

pip install -e .[dev]
python setup.py test

With PyCharm you might have to change the default test runner. Otherwise, it might only suggest to use nose. To do so, go to File > Settings > Tools > Python Integrated Tools (on the Mac it's PyCharm > Preferences > Settings > Tools > Python Integrated Tools) and change the test runner to py.test.

Benchmarks

In order to check the runtime of specific parts, pytest-benchmark is used. Benchmarks are normal test functions, but call the benchmark fixture for the code under test.

To run benchmarks:

# Run all
pytest bench

# Specific benchmark
pytest bench/corpus/test_merge_corpus.py

To compare between different runs:

pytest-benchmark compare

Editing the Documentation

The documentation is written in reStructuredText and transformed into various output formats with the help of Sphinx.

To generate the documentation, execute:

pip install -e .[dev]
cd docs
make html

The generated files are written to docs/_build/html.

Versions

Versions is handled using bump2version. To bump the version:

bump2version [major,minor,patch,release,num]

comodoro/audiomate