Python implementation of the Griffin and Lim algorithm to recover an audio signal from a magnitude-only spectrogram. This project is forked from https://github.com/bkvogel/griffin_lim and adds simpler tools for converting back and forth from wav to melscale spectrogram (db-scale).
This is a python implementation of Griffin and Lim's algorithm to recover an audio signal given only the magnitude of its Short-Time Fourier Transform (STFT), also known as the spectrogram. The Griffin and Lim method is described in the paper:
Griffin D. and Lim J. (1984). "Signal Estimation from Modified Short-Time Fourier Transform". IEEE Transactions on Acoustics, Speech and Signal Processing. 32 (2): 236–243. doi:10.1109/TASSP.1984.1164317
This is an iterative algorithm that attempts to find the signal having an STFT such that the magnitude part is as close as possible to the modified spectrogram.
The Griffin and Lim algorithm can be useful in an audio processing system where an audio signal is transformed to a spectrogram which is then modified or in which an algorithm generates a synthetic spectrogram that we would like to "invert" into an audio signal.
Requires Python 3 (tested with Anaconda Python 3.6 distribution)
python3 build-melspec-from-wav.py --in_file file.wav --sample_rate_hz 16000 --fft_size 512 --overlap_ratio 3 --mel_bin_count 128 --max_freq_hz 5000 --pad_length 24000
Generates melspec from file.wav the following files:
- file_spectrogram_bw.png - spectrogram as a 16-bit bw png
- file_spectrogram_color.png - spectrogram as a 3 channels 8-bit color png
- file_params.txt - params used for the conversion, plus amplitude necessary to denormalize amplitude
--overlap_ratio 3
means 1/3 overlap between 2 windows. With the above parameters, the png generated are square png 128x128 pixels.
python3 build-wav-from-melspec.py --in_file file_spectrogram_bw.png --param_file file_params.txt --out_file file_rebuild.wav --iterations 1000
Applies 1000 iterations of Griffin-Lim parameters then denormalize the signal using parameters from file_params.txt
to restore original wav file. in_file
can either be black and white and color png.
FreeBSD license.