dynamic mode: Change in integrated loudness shouldn’t result in a true peak which exceeds the target TP
mifi opened this issue · 12 comments
Thanks for this awesome tool! I was having trouble finding info about the loudnorm filter in ffmpeg, but this repo is a wealth of knowledge.
I've been hit by the somewhat awkward implementation in ffmpeg where, if the target LRA is lower than the measured LRA, it switches to "dynamic" mode, causing the audio file to turn completely quiet. Luckily you already have a solution for that: --keep-lra-above-loudness-range-target
Now reading the https://ffmpeg.org/ffmpeg-filters.html#loudnorm I see that:
... the change in integrated loudness shouldn’t result in a true peak which exceeds the target TP. If any of these conditions aren’t met, normalization mode will revert to dynamic.
So I'm wondering: have any of you thought about the possibility of dynamic mode getting accidentally triggered by this condition, and how to prevent that?
Good point, I guess what you mean is that the target integrated loudness should be capped to max(loudness target, measured_I + measured_LRA)?
I can add that as a safeguard, checking with @richardpl if that would suffice.
To be honest, I don't know the filter code well enough to give a definitive answer here.
Not sure exactly what ffmpeg means because I'm not super into audio terminology, but yes my worry is that we will accidentally trigger "dynamic" mode if some condition is met. Maybe we could look at ffmpeg source code too, and just mirror what it does
We actually use ffmpeg under the hood!
The loudnorm filter will do whatever it does according to the description that you linked to. The ffmpeg-normalize wrapper simply adds a bunch of convenience functions and options like the one to keep the LRA above the threshold.
I'll see if I can look at the source code to verify that there might be a problem with too large LRA values.
There is already an option to set a custom, non-default LRA, thus ensuring linear processing in the 2nd pass. But if you use an unrealistic target TP (too small a value), it may not do linear processing at all.
I found this code:
if ((offset_tp <= s->target_tp) && (s->measured_lra <= s->target_lra)) {
    s->frame_type = LINEAR_MODE;
    s->offset = offset;
}
But I'm not sure what offset_tp and s->target_tp mean.
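For what it's worth, here is how I read that part of the filter source, written as a Python sketch rather than a definitive answer. It assumes that offset is the fixed gain needed to reach the target (target I minus measured I) and that offset_tp is the measured true peak after that gain is applied, with s->target_tp being the TP option passed to the filter. The function name and parameters are mine, not ffmpeg's.

```python
def would_stay_linear(target_i, target_tp, target_lra,
                      measured_i, measured_tp, measured_lra):
    """Mirror (as an assumption) the linear-mode check in loudnorm.

    Values are in LUFS / dBTP / LU, as reported by a first pass.
    """
    # Fixed gain in dB needed to move the measured loudness to the target.
    offset = target_i - measured_i
    # True peak the file would have after applying that fixed gain.
    offset_tp = measured_tp + offset
    # Linear mode needs the shifted peak at or under the ceiling and the
    # measured loudness range within the target range.
    return offset_tp <= target_tp and measured_lra <= target_lra


# Quiet file with plenty of headroom: a fixed gain is enough.
print(would_stay_linear(-23, -2, 7, measured_i=-30, measured_tp=-12, measured_lra=5))
# Pushing the same file to I=-5 would lift the peak to +13 dBTP.
print(would_stay_linear(-5, -2, 7, measured_i=-30, measured_tp=-12, measured_lra=5))
```

The measured values in the two calls are invented for illustration. The point is only that the true-peak condition depends on how much gain is requested, not just on the file itself.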
I tried to run a normalization using I=-5 (max allowed loudness) and tp=-2 (default value), making sure to set LRA to the measured LRA to prevent dynamic mode being triggered by a target LRA that is too low. The output still sounds like the volume is being dynamically adjusted (it seems to go up and down). Not sure, but maybe I triggered dynamic mode. It's a pity that ffmpeg doesn't print any warning when dynamic mode gets enabled.
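For reference, a second pass like the one described above could be assembled roughly as follows. This is only a sketch: the helper name, the layout of the stats dict, and the measured numbers are made up for illustration, while the option names (I, TP, LRA, measured_I, measured_TP, measured_LRA, measured_thresh, linear, print_format) are the ones listed in the loudnorm documentation.

```python
def second_pass_filter(target_i, target_tp, stats):
    """Build a loudnorm filter string for the second pass.

    `stats` holds first-pass measurements (hypothetical dict layout).
    The target LRA is set to the measured LRA so that the LRA
    condition cannot be the reason for a fallback to dynamic mode.
    """
    opts = {
        "I": target_i,
        "TP": target_tp,
        "LRA": stats["lra"],
        "measured_I": stats["i"],
        "measured_TP": stats["tp"],
        "measured_LRA": stats["lra"],
        "measured_thresh": stats["thresh"],
        "linear": "true",
        "print_format": "json",
    }
    # loudnorm takes its options as colon-separated key=value pairs.
    return "loudnorm=" + ":".join(f"{k}={v}" for k, v in opts.items())


# Invented first-pass measurements, for illustration only.
stats = {"i": -27.6, "tp": -4.5, "lra": 9.0, "thresh": -38.0}
print(second_pass_filter(-5, -2, stats))
```

Note that this only takes care of the LRA condition. The true-peak condition can still push the filter into dynamic mode, and loudnorm accepts LRA only within a limited range, so a measured value outside it would need clamping.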
Linear mode is a simple fixed-gain volume knob. If you set the loudness to extremes together with TP, it cannot do linear processing, because that is mathematically impossible in such a case. Dynamic mode is printed at the end of processing, but sure, it should give a warning at the start. (Though you can guess which mode is currently in use from the filter's processing speed.)
Thanks for your comments, @richardpl!
I would agree that a warning printed at the beginning would be most useful. I don't think users will be able to tell by the processing speed.
If you ask me, I think when linear=true is explicitly specified, but the loudnorm filter cannot achieve linear processing, crashing would be even better than printing a warning and reverting to dynamic mode, but that's a breaking change.
Yes, same problem here. A lot of my music library got nuked because of this :\
Sorry to hear that you have had some issues with your collection. Please note the entry in the FAQ on this: https://github.com/slhck/ffmpeg-normalize#should-i-use-this-to-normalize-my-music-collection — for music you want a ReplayGain-like algorithm.
I realize that a "set and forget" approach would be desirable, but it conflicts with the inner workings of the filter and the current behavior.
What I could imagine is adding a --linear option that forces linear processing and exits with an error if it can't be done. That requires determining beforehand whether linear or dynamic processing will be used, which is not perfectly feasible and is error-prone.
That said, with fe96734 you already get much clearer warnings about reversion to dynamic processing happening.
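Short of predicting the mode beforehand, it can at least be detected after the fact. The sketch below is not how ffmpeg-normalize works internally; it only assumes that loudnorm was run with print_format=json, in which case the summary it prints contains a "normalization_type" field set to "linear" or "dynamic". The function names are hypothetical.

```python
import json
import re


def normalization_type(ffmpeg_stderr):
    """Return the "normalization_type" from loudnorm's JSON summary.

    Assumes the summary is the last {...} block in the captured
    stderr text. Returns None if no such block is found.
    """
    blocks = re.findall(r"\{[^{}]*\}", ffmpeg_stderr)
    if not blocks:
        return None
    return json.loads(blocks[-1]).get("normalization_type")


def require_linear(ffmpeg_stderr):
    """Raise instead of silently accepting dynamic normalization."""
    mode = normalization_type(ffmpeg_stderr)
    if mode != "linear":
        raise RuntimeError(f"loudnorm used {mode!r} mode, expected 'linear'")


# Shortened, invented example of what the filter prints to stderr.
sample = '''[Parsed_loudnorm_0 @ 0x0]
{
    "input_i" : "-27.61",
    "input_tp" : "-4.47",
    "normalization_type" : "dynamic",
    "target_offset" : "0.58"
}
'''
print(normalization_type(sample))
```

Calling require_linear(sample) on this text raises a RuntimeError, which is roughly the "fail instead of falling back" behavior discussed above, only applied after the second pass has already run.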
No worries about my library, I think I should have a backup of it somewhere on a pendrive.
I have talked with a friend who does music production and I think I know the reason for this now. If you are asking to normalize a quieter song to a higher level, the filter obviously has to add gain to it. However, there may be no headroom in the track, so linearly adding a few dB of gain is not possible, since the audio would clip. Therefore the only possibility is to add that gain plus a limiter at the end, which is what the dynamic processing does. So for example I have a track that is at -12 LUFS and I want to normalize it to -10 LUFS; that's a +2 dB gain. I also set the max true peak to -0.1 dB to prevent clipping, but the audio does not have that 2 dB of headroom, so linear gain is not possible.
The trick is to never add loudness, only to remove, so setting the target loudness to a low value (for example to -23 by default) should do it.
This explanation may be a little wrong, because I'm still trying to understand it all.
Yes, that is the explanation for why dynamic processing is needed, and why it may deteriorate quality (through limiting).
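The headroom argument can be put into numbers. The sketch below assumes that linear mode applies one fixed gain in dB and that the only constraint considered is the true-peak ceiling (the LRA condition is ignored here). The function name is made up, and the measured peak of -1.0 dBTP is an assumption, since the example above does not state one.

```python
def max_linear_target(measured_i, measured_tp, target_tp):
    """Highest integrated loudness reachable with a fixed gain only."""
    # dB of gain available before the peak hits the ceiling.
    headroom = target_tp - measured_tp
    return measured_i + headroom


# -12 LUFS track, ceiling at -0.1 dBTP, peak assumed at -1.0 dBTP.
limit = max_linear_target(measured_i=-12.0, measured_tp=-1.0, target_tp=-0.1)
print(f"linear gain can reach at most {limit:.1f} LUFS")
```

With these assumed numbers there is 0.9 dB of headroom, so a fixed gain can reach about -11.1 LUFS at most, and the requested -10 LUFS is out of reach without limiting. A target below the measured loudness always fits, as long as the measured peak is already under the ceiling, which matches the "only remove loudness" rule of thumb.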