Hardware-accurate EAX reverb emulation
ThreeDeeJay opened this issue · 9 comments
It's a common complaint that emulated EAX just doesn't match hardware EAX (like on an X-Fi sound card). e.g. OpenAL Soft vs X-Fi
And to help with that, there's a person in our server with a Windows XP rig with an X-Fi Titanium Fatal1ty Champion, who's willing to record samples of the EAX presets for reference.
So would recording a Dirac delta impulse response with each reverb preset in something like RightMark 3D Sound or EFXShow be enough? Or would we also need something else to compensate for OpenAL Soft's different reverb method?
To clarify, OpenAL Soft doesn't "emulate" EAX. It implements it, using the effects and filters it already had in software. Other hardware implementations could also sound different, though OpenAL Soft is particularly different since the mixing bus in general (likely) works on different principles than older hardware, using 3D ambisonics, along with the reverb processor itself being made from scratch.
It's difficult to know if a recording would help. It's certainly possible there's something wrong in the reverb implementation, but there isn't really a reference implementation for what exactly it should sound like given the various parameters (e.g. what exactly Density=0 and/or Diffusion=0.5 is intended to sound like). By the sounds of it, not even Creative is very consistent with how the parameters affect the sound on different cards and drivers. It may help highlight where things are obviously very off, as long as the related listener and source/buffer properties are also known, but it will always sound different to some degree.
I see. What if we recorded a couple different X-Fi sound cards' EAX reverb presets (perhaps there's an app to set Density=0/Diffusion=0.5?) to see if there's a significant inconsistency, then try to match the average (or newest hardware) acoustically, or at least just in volume?
Alternatively, perhaps we should stop trying to replicate Creative's reverb and instead go for a more realistic, modern approach.
Like, have you considered implementing support for Binaural Room Impulse Responses to replace reverb presets by loading external files specified in the config, kinda like we do with HRTFs, or is that too convoluted? (pun intended)
There are many BRIRs available in formats similar to HRTFs (WAV, and some are even 3D like SOFA, and 6DOF):
https://www.sofaconventions.org/mediawiki/index.php/Files#Room_impulse_responses_databases_(DRIRs,_SRIRs,_BRIRs)
https://www.kvraudio.com/forum/viewtopic.php?t=107337&start=645#p8553062
https://github.com/ShanonPearce/ASH-Listening-Set/blob/main/BRIRs/Rooms.csv
https://www.airwindows.com/airwindows-impulses/
https://freetousesounds.bandcamp.com/track/ir-impulse-response-cave-suite-shower-room-cappadocia-turkey-19232
Though I'm not sure if it'd replace the user's HRTF or if there's a way to remove HRTF/extract only the reverb, sort of like the opposite of diffuse-field equalization in case that's not done already. 🤔
I suppose some kind of "just ideas from 2005" mode could be helpful (absolutely not because sounding like a Creative card is better or preferable, but because development here might require side-by-side effect comparisons for games that tended to be written against the implementation rather than the specification).
But that aside, this issue seems kinda pointless, because eventually it would always come down to discussing each specific problem one at a time.
> I see. What if we recorded a couple different X-Fi sound cards' EAX reverb presets (perhaps there's an app to set Density=0/Diffusion=0.5?) to see if there's a significant inconsistency, then try to match the average (or newest hardware) acoustically, or at least just in volume?
That would basically be the idea. As it is, I don't think there's intended to be a reference, as long as it sounds "good enough". For instance, the docs say about the diffusion parameter:
> It is set by default to 1.0, which provides the highest density. Reducing diffusion gives the reverberation a more “grainy” character that is especially noticeable with percussive sound sources. If you set a diffusion value of 0.0, the later reverberation sounds like a succession of distinct echoes.
"Highest density" isn't a measurable metric. What's highest for one may not be for another, and there isn't a specific point where the time between successive repetitions of a sound become perceived as distinct echoes. Density is even worse:
> Reverb Modal Density controls the coloration of the late reverb. Lowering the value adds more coloration to the late reverb.
So really it comes down to whether the properties used sound appropriate for the intended environment.
> Alternatively, perhaps we should stop trying to replicate Creative's reverb and instead go for a more realistic, modern approach.
> Like, have you considered implementing support for Binaural Room Impulse Responses to replace reverb presets by loading external files specified in the config, kinda like we do with HRTFs, or is that too convoluted? (pun intended)
Too impractical. You wouldn't be able to store all the IRs for the potential combination of properties (you can see all the presets in efx-presets.h, and that's a non-exhaustive list of environments that games may want to use). Also, modulation can't be expressed with an impulse response, you can't speed up or slow down audio with it. Convolution reverb is also more CPU-intensive, which would cause problems for multi-environment modeling and smooth transitions when changing the properties.
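To make the cost and the "fixed response" points concrete, here is a minimal sketch of direct convolution with a made-up toy impulse response. It is not how OpenAL Soft or any real convolution engine is written (those typically use FFT-based partitioning, which is much cheaper than this naive loop); the `ir` values and the 2-second length are assumptions chosen purely for illustration.

```python
def convolve(signal, impulse_response):
    """Direct time-domain convolution: len(signal) * len(impulse_response) multiply-adds."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out

# Hypothetical toy IR: a direct hit followed by a few decaying echoes.
ir = [1.0, 0.0, 0.5, 0.0, 0.25, 0.125]

# A unit (Dirac) impulse simply reproduces the IR. The IR is one frozen
# snapshot of the room, so nothing in it can vary over time the way a
# modulated reverb does.
dirac = [1.0, 0.0, 0.0, 0.0]
response = convolve(dirac, ir)

# Multiply-adds per second, per channel, for a 2-second IR at 44.1 kHz
# if done naively like the loop above.
rate = 44100
naive_macs_per_second = rate * (2 * rate)
```

Every distinct combination of reverb properties would need its own `ir`, which is the storage problem described above.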
OpenAL Soft does have an in-progress extension for convolution reverb, though (and I'd recommend B-Format IRs so it's full 3D and can then be output separately with HRTF, surround sound, or plain stereo). I'm just not sure how or if it should deal with things like rotation and focus and 'direct' output (vs virtualized channels), and if there's a way to better handle changing the impulse response without being so abrupt.
Hmm, so for practical purposes, how would we record reverb presets in a way we could at least objectively find the ballpark difference in volume between X-Fi and OpenAL Soft? Like is there an app that can play just the reverb without the direct sound so we can then compare the recorded waveform to find the difference in decibels (or LUFS?) Or would recording both the reverb with the direct sound still be good enough?
And if we do find a pattern/constant-ish like, say, +6 dB for X preset (or all of them?), would it make sense for OpenAL Soft to use a boost like that by default or is the volume set to a certain level for a specific reason and we should just stick to boosting/attenuating each game separately? Because OpenAL Soft's reverb tends to be noticeably quieter than X-Fi in general, although there are exceptions, which I guess might just be bugged implementations or maybe just not "calibrated" for a balanced Goldilocks "just right" amount of reverb 🤔
> Hmm, so for practical purposes, how would we record reverb presets in a way we could at least objectively find the ballpark difference in volume between X-Fi and OpenAL Soft? Like is there an app that can play just the reverb without the direct sound so we can then compare the recorded waveform to find the difference in decibels (or LUFS?) Or would recording both the reverb with the direct sound still be good enough?
If you're using a Dirac impulse, both the direct sound and reverb would be better. It helps to show how much the reverb reduced the sound from its original volume. The direct sound is useful to account for any overall volume difference. I'd suggest using either plain (non-HRTF) stereo, or at least ambi1 HRTF mode, to avoid any potential difference caused by the ambisonic decode (that the reverb will go through) compared to a direct HRTF filter. Avoid using 7.1 or 5.1 unless you're recording in that same format (don't have it downmix surround sound to stereo).
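One way to use the direct sound as a reference is sketched below: measure the energy of the reverb tail relative to the direct impulse in each recording, then compare those ratios. The function name, the `guard` window, and the sample data are all assumptions for illustration; this is not a method anyone in the thread has confirmed using.

```python
import math

def reverb_to_direct_db(samples, guard=32):
    """Energy of the reverb tail relative to the direct impulse, in dB.

    samples: mono floats in [-1.0, 1.0] from a recording of one Dirac impulse.
    guard:   hypothetical half-width (in samples) of the window counted as
             direct sound around the largest peak.
    """
    peak_index = max(range(len(samples)), key=lambda i: abs(samples[i]))
    start = max(0, peak_index - guard)
    end = peak_index + guard + 1
    direct_energy = sum(s * s for s in samples[start:end])
    tail_energy = sum(s * s for s in samples[end:])
    if direct_energy == 0.0 or tail_energy == 0.0:
        raise ValueError("need a non-silent impulse and a non-silent tail")
    return 10.0 * math.log10(tail_energy / direct_energy)

# Synthetic stand-ins for two recordings of the same preset (made-up data).
card = [0.0] * 10 + [1.0] + [0.0] * 40 + [0.10] * 100
soft = [0.0] * 10 + [0.8] + [0.0] * 40 + [0.04] * 100

# Difference between the two ratios: how much quieter or louder one reverb
# is, independent of each recording's overall volume.
offset_db = reverb_to_direct_db(soft) - reverb_to_direct_db(card)
```

Because each ratio is normalized by its own direct sound, the two recordings don't need to be captured at the same master volume. A single number like this says nothing about echo density or decay shape, so it only complements looking at the waveform.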
> And if we do find a pattern/constant-ish like, say, +6 dB for X preset (or all of them?), would it make sense for OpenAL Soft to use a boost like that by default or is the volume set to a certain level for a specific reason and we should just stick to boosting/attenuating each game separately? Because OpenAL Soft's reverb tends to be noticeably quieter than X-Fi in general, although there are exceptions, which I guess might just be bugged implementations or maybe just not "calibrated" for a balanced Goldilocks "just right" amount of reverb 🤔
OpenAL Soft largely just applies the gains as specified in the properties. Some extra attenuation is applied to account for the given echo density, which was added since it seemed that's what the hardware used for the original comparisons did. If only some presets seem to have a different volume between X-Fi and OpenAL Soft, it will depend on what presets are affected and if there's any correlation with certain property values. If they're all consistently off by similar amounts, a manual adjustment may need to be added (unless the card is known to have too-loud or too-quiet reverb).
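For a sense of scale, the arithmetic behind a constant offset such as the +6 dB floated earlier can be sketched as follows. The helper names are hypothetical; the only assumptions are the standard amplitude relation of 20·log10 and that EAX-style levels are given in millibels (hundredths of a decibel).

```python
import math

def db_to_gain(db):
    """Convert a level change in decibels to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)

def gain_to_db(gain):
    """Convert a linear amplitude factor to decibels."""
    return 20.0 * math.log10(gain)

def millibels_to_gain(mb):
    """Convert millibels (1 mB = 0.01 dB) to a linear amplitude factor."""
    return db_to_gain(mb / 100.0)

boost = db_to_gain(6.0)              # just under 2x amplitude
halved = gain_to_db(0.5)             # about -6.02 dB
room = millibels_to_gain(-1000)      # -10 dB, roughly 0.316 linear
```

So a consistent 6 dB gap would mean one implementation's reverb has about half the amplitude of the other's, which is a large enough difference that it should show up clearly in a waveform comparison.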
Here's the recording done by the user with the X-Fi XP rig:
https://mega.nz/file/Mh1BATLJ#senjfl7bYwI1QZIUHRLDINA-lKV4aVBqSB74yVhsz58
There's a dirac pop without EAX for reference near the end.
I was planning to replicate that in (DSOAL +) OpenAL Soft though tbh I'm not sure how to measure the perceived loudness of a dirac with reverb, since the peak amplitude would obviously belong to the direct sound. So do you have any suggestions on how to go about it? 🤔
My plan was to look at the waveform in Audacity or something. Using the direct sound as a reference, I'd look at the density and distribution of peaks and compare to OpenAL Soft's. FWIW, it's best if the Dirac impulse source is the same sample rate as the device output, which is the same sample rate it's recorded at (to avoid resampling and keep the original sample as undisturbed as possible, a single amplitude peak around 0s). I'm not too concerned with the perceived loudness, since even Creative's drivers seem wildly different and OpenAL Soft's levels seem okay with various presets.
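A test clip of that kind (one non-zero sample, everything else zero, at a chosen sample rate) can be generated with a few lines. This is a minimal sketch using Python's standard `wave` module; the function name, file name, clip length, lead-in, and the half-scale amplitude (picked to leave headroom against clipping) are all assumptions, not settings anyone in the thread specified.

```python
import struct
import wave

def write_dirac_wav(path, sample_rate=44100, seconds=5.0, lead_in=0.5, amplitude=0.5):
    """Write a 16-bit mono WAV that is silent except for one sample.

    Returns the index of the non-zero sample.
    """
    total = int(sample_rate * seconds)
    impulse_at = int(sample_rate * lead_in)
    peak = int(round(amplitude * 32767))
    frames = bytearray(total * 2)                    # 2 bytes per sample, all zero
    struct.pack_into("<h", frames, impulse_at * 2, peak)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return impulse_at

# Example (hypothetical file name): a 44.1 kHz clip to match a 44.1 kHz device.
# write_dirac_wav("dirac_44100.wav", sample_rate=44100)
```

The trailing silence gives the reverb tail room to decay before the clip ends; the sample rate should be changed to whatever the playback device and the recorder actually run at.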
Hmm, now that you mention it, we did try to specify a higher sample rate, but couldn't find the option in Windows XP or the Creative control panel. Perhaps the sample rate is controlled by the app? The Dirac used was 96kHz, which is supported by the sound card, but maybe RightMark 3D Sound is forcing 44.1kHz because that's the sample rate of the original test clip. The recording was done in Bandicam with the option to record the Stereo Mix in lossless PCM keeping the audio format, so another possibility is that the audio is getting resampled during the input > output loopback.
So I guess it'd just need to be re-recorded with a 44.1kHz dirac, and the volume set to 100 or lower if there's clipping at 100, right?
Anything else that should be amended from that recording above?
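As a quick sanity check on a re-recorded file, something like the following sketch could report the file's sample rate and whether any sample hits full scale. It assumes the recording has first been exported to a 16-bit PCM WAV, and the function name is hypothetical. Note that it only shows the rate of the final file; it cannot tell whether resampling happened earlier in the playback or loopback chain.

```python
import struct
import wave

def inspect_wav(path):
    """Report sample rate, channel count, peak value, and likely clipping."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
        channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    if width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    peak = max((abs(s) for s in samples), default=0)
    return {
        "sample_rate": rate,
        "channels": channels,
        "peak": peak,
        # Samples sitting at full scale usually mean the signal clipped.
        "clipped": peak >= 32767,
    }
```

If `clipped` comes back true with the volume at 100, that would support lowering the volume for the next recording.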