This repository contains research and experiments aimed at producing sparse, interpretable representations of audio. Ideally, that sparsity and interpretability also yields representations that can be manipulated at the level of granularity musicians are accustomed to.
Current directions include:

- simpler, linear sparse decompositions, such as matching pursuit
- perceptually motivated loss functions, inspired by Mallat's scattering transform and the Auditory Image Model
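As a rough illustration of the first direction, matching pursuit greedily decomposes a signal over a fixed dictionary by repeatedly selecting the atom most correlated with the residual. The sketch below is illustrative only (it assumes a dictionary of unit-norm atoms stored as rows) and is not necessarily how this repository implements it:

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_iter=32):
    """Greedy sparse decomposition: repeatedly pick the dictionary
    atom most correlated with the residual and subtract it out.

    dictionary: (n_atoms, n_samples) array of unit-norm atoms (rows).
    Returns a list of (atom_index, coefficient) pairs and the residual.
    """
    residual = np.asarray(signal, dtype=np.float64).copy()
    atoms = []
    for _ in range(n_iter):
        # Correlation of every atom with the current residual.
        correlations = dictionary @ residual
        best = int(np.argmax(np.abs(correlations)))
        coeff = correlations[best]
        # Remove the selected atom's contribution from the residual.
        residual -= coeff * dictionary[best]
        atoms.append((best, coeff))
    return atoms, residual
```

The sparse code is just the list of (index, coefficient) pairs; summing `coeff * dictionary[index]` over them reconstructs the approximation.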
Configuration is read from the following environment variables:

```
AUDIO_PATH=
PORT=9999
IMPULSE_RESPONSE_PATH=
S3_BUCKET=
```
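One plausible way these variables are consumed at runtime is via `os.environ` (a sketch; the repository's actual loading code may differ, e.g. it might use a `.env` loader):

```python
import os

# Mirror the environment variables documented above, with the
# documented default for PORT. These names are assumptions taken
# from the variable list, not a guarantee of the repo's internals.
AUDIO_PATH = os.environ.get('AUDIO_PATH', '')
PORT = int(os.environ.get('PORT', '9999'))
IMPULSE_RESPONSE_PATH = os.environ.get('IMPULSE_RESPONSE_PATH', '')
S3_BUCKET = os.environ.get('S3_BUCKET', '')
```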
Download the MusicNet dataset and extract it to a location on your local machine, then set the `AUDIO_PATH` environment variable to point at the extracted `musicnet/train_data` directory, wherever that may be on your machine.
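A quick sanity check that the path is set correctly is to count the audio files it contains (a hypothetical helper; this assumes the MusicNet training data is stored as `.wav` files):

```python
import os

def count_audio_files(path, ext='.wav'):
    """Count files with the given extension in a directory, as a
    sanity check that AUDIO_PATH points at musicnet/train_data."""
    return sum(1 for f in os.listdir(path) if f.endswith(ext))
```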
Room impulse responses to support convolution-based reverb can be downloaded here. Then set the `IMPULSE_RESPONSE_PATH` environment variable to point at the directory on your local machine that contains the impulse response audio files.
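Convolution-based reverb itself amounts to convolving the dry signal with a room impulse response, typically via the FFT for speed. A minimal sketch (not necessarily this repo's implementation):

```python
import numpy as np

def convolve_reverb(dry, impulse_response):
    """Apply convolution-based reverb by FFT-convolving the dry
    signal with a room impulse response (full linear convolution)."""
    n = len(dry) + len(impulse_response) - 1
    # Zero-padding to n makes circular convolution equal linear convolution.
    wet = np.fft.irfft(np.fft.rfft(dry, n) * np.fft.rfft(impulse_response, n), n)
    return wet
```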
If you'd like to try out some of the models I've trained locally, you can set `S3_BUCKET` to `matching-pursuit-trained-models`.