Loudness normalization using EBU R128
Hi!
Have you considered adding EBU R128 normalization?
For example, something similar to the implementation linked below, which, however, requires ffmpeg as a dependency:
https://github.com/slhck/ffmpeg-normalize#ebu-r128-normalization
Hi @oplatek,
EBU R128 uses ITU-R BS.1770 as its loudness measurement algorithm, so normalizing with pyloudnorm should produce results very similar to ffmpeg.
Did you have a specific use case in mind? Currently pyloudnorm only measures integrated loudness, but EBU R128 also includes short-term and momentary loudness. Was that what you were referring to?
My use case is comparing relatively short text-to-speech (TTS) or voice conversion (VC) samples, where the source speaker and recording condition are converted to a clean target speaker's voice.
The samples are typically 2-14 s long, with normally distributed lengths.
I noticed that RMS normalization is sensitive to background noise, e.g. for VC from noisy source conditions to clean target conditions.
Since I want to compare noisy and clean utterances side by side, I want them normalized to the same perceived loudness.
In general, I think that peak loudness normalization works best.
I asked about EBU R128 normalization because some other studies have used it, and it also uses peak normalization.
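To illustrate the RMS sensitivity mentioned above, here is a small standard-library sketch. Everything in it is a made-up stand-in: a 220 Hz tone followed by silence plays the role of speech, Gaussian noise plays the role of the noisy condition, and the -25 dB target is arbitrary. It only shows how the gain chosen by RMS normalization shifts once noise is added.

```python
import math
import random

def rms_db(x):
    """RMS level in dB relative to full scale (1.0)."""
    return 10.0 * math.log10(sum(v * v for v in x) / len(x))

def gain_to_rms(x, target_db):
    """Gain in dB that brings the RMS level of x to target_db."""
    return target_db - rms_db(x)

random.seed(0)
rate = 16000

# Stand-in for speech: 1 s of tone followed by 1 s of silence.
speech = [0.1 * math.sin(2 * math.pi * 220 * n / rate) for n in range(rate)]
speech += [0.0] * rate

# Stand-in for a noisy recording: the same "speech" plus background noise
# whose average power is about equal to that of the speech.
noise = [random.gauss(0.0, 0.05) for _ in range(len(speech))]
noisy = [s + n for s, n in zip(speech, noise)]

target_db = -25.0  # arbitrary target level
clean_gain = gain_to_rms(speech, target_db)
noisy_gain = gain_to_rms(noisy, target_db)

# The noise adds energy, so the noisy file receives less gain and its
# speech component ends up quieter than in the clean file.
print(f"gain for clean: {clean_gain:.2f} dB")
print(f"gain for noisy: {noisy_gain:.2f} dB")
print(f"speech level mismatch: {clean_gain - noisy_gain:.2f} dB")
```

With equal speech and noise power the mismatch comes out at roughly 3 dB, so the same utterance would sound quieter in the noisy version after RMS normalization.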