/griffin_lim

Implementation of the Griffin and Lim algorithm to recover an audio signal from a magnitude-only spectrogram.

Primary LanguagePythonBSD 3-Clause "New" or "Revised" LicenseBSD-3-Clause

griffin_lim

Python implementation of the Griffin and Lim algorithm to recover an audio signal from a magnitude-only spectrogram. This project is forked from https://github.com/bkvogel/griffin_lim and adds simpler tools for converting back and forth from wav to melscale spectrogram (db-scale).

Description

This is a python implementation of Griffin and Lim's algorithm to recover an audio signal given only the magnitude of its Short-Time Fourier Transform (STFT), also known as the spectrogram. The Griffin and Lim method is described in the paper:

Griffin D. and Lim J. (1984). "Signal Estimation from Modified Short-Time Fourier Transform". IEEE Transactions on Acoustics, Speech and Signal Processing. 32 (2): 236–243. doi:10.1109/TASSP.1984.1164317

This is an iterative algorithm that attempts to find the signal having an STFT such that the magnitude part is as close as possible to the modified spectrogram.

The Griffin and Lim algorithm can be useful in an audio processing system where an audio signal is transformed to a spectrogram which is then modified or in which an algorithm generates a synthetic spectrogram that we would like to "invert" into an audio signal.

Requirements

Requires Python 3 (tested with Anaconda Python 3.6 distribution)

Usage

python3 build-melspec-from-wav.py --in_file file.wav --sample_rate_hz 16000 --fft_size 512 --overlap_ratio 3 --mel_bin_count 128 --max_freq_hz 5000 --pad_length 24000

Generates melspec from file.wav the following files:

  • file_spectrogram_bw.png - spectrogram as a 16-bit bw png
  • file_spectrogram_color.png - spectrogram as a 3 channels 8-bit color png
  • file_params.txt - params used for the conversion, plus amplitude necessary to denormalize amplitude

--overlap_ratio 3 means 1/3 overlap between 2 windows. With the above parameters, the png generated are square png 128x128 pixels.

python3 build-wav-from-melspec.py --in_file file_spectrogram_bw.png --param_file file_params.txt --out_file file_rebuild.wav --iterations 1000

Applies 1000 iterations of Griffin-Lim parameters then denormalize the signal using parameters from file_params.txt to restore original wav file. in_file can either be black and white and color png.

License

FreeBSD license.