TIME SCALE MODIFICATION OF AUDIO USING NON-NEGATIVE MATRIX FACTORIZATION

This repository contains the code to reproduce the method presented in:

Roma, G., Green, O. & Tremblay, P. A. Time scale modification of audio using non-negative matrix factorization. Proceedings of the 22nd International Conference on Digital Audio Effects (DAFX 2019).

Requirements:

Usage

The main script is nmf_tsm.py:

python nmf_tsm.py input_file_name.wav stretch_factor nmf_rank <t1> <t2> <t3>

The first three arguments are mandatory; <t1>, <t2> and <t3> are optional. If nmf_rank is 0, the rank is estimated automatically via singular value decomposition. The default values can also be changed in the first lines of the script, and the envelope preservation option can be switched on or off in the code (lock_active). For more information, please see the paper.
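If the script needs to be called from other Python code (for example, to process several files), the command line above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names build_command and run_nmf_tsm are hypothetical, and the sketch assumes nmf_tsm.py is in the current working directory. It passes arguments exactly in the order shown above; what the script writes as output is not covered here.

```python
import subprocess
import sys


def build_command(input_file, stretch_factor, nmf_rank=0, *extra, script="nmf_tsm.py"):
    """Assemble the argument list for nmf_tsm.py (hypothetical helper).

    `extra` holds the optional trailing arguments (<t1> <t2> <t3>);
    they are passed through unchanged, in the order given.
    """
    cmd = [sys.executable, script, str(input_file), str(stretch_factor), str(nmf_rank)]
    cmd.extend(str(value) for value in extra)
    return cmd


def run_nmf_tsm(input_file, stretch_factor, nmf_rank=0, *extra):
    """Run the script and raise CalledProcessError if it exits with an error."""
    cmd = build_command(input_file, stretch_factor, nmf_rank, *extra)
    return subprocess.run(cmd, check=True)
```

Calling run_nmf_tsm("input_file_name.wav", 1.5, 0) would then correspond to running the script with a stretch factor of 1.5 and automatic rank estimation.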

Other methods

Some classic time scale modification methods are included for comparison in classic_tsm.py (ported from the MATLAB TSM Toolbox). For example, to apply WSOLA with a stretch factor of 1.8, assuming the signal has been read into the NumPy array x:

y = wsola(x, 1.8, Hs = 512, window = signal.hann(2048, sym = False))
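To make the parameters of the WSOLA call above more concrete, the sketch below relates the stretch factor to the synthesis hop Hs and to the length of the result. It assumes the Python port follows the TSM Toolbox convention, where the stretch factor is the ratio of synthesis hop to analysis hop and a factor above 1 lengthens the signal; this is an assumption about classic_tsm.py, not something verified against its code. The function names are hypothetical.

```python
import math


def analysis_hop(synthesis_hop, stretch_factor):
    """Average analysis hop in samples, assuming stretch_factor = Hs / Ha."""
    if stretch_factor <= 0:
        raise ValueError("stretch_factor must be positive")
    return synthesis_hop / stretch_factor


def expected_output_length(num_samples, stretch_factor):
    """Approximate output length in samples for a constant stretch factor.

    Assumes the output is ceil(stretch_factor * input length), as in the
    TSM Toolbox; the port may differ by a few samples.
    """
    if stretch_factor <= 0:
        raise ValueError("stretch_factor must be positive")
    return math.ceil(num_samples * stretch_factor)
```

Under these assumptions, Hs = 512 with a stretch factor of 1.8 corresponds to an average analysis hop of roughly 284 samples, so successive analysis frames are read closer together than they are written.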