fakufaku/fast_bss_eval

Compatibility with "bsseval_sources_version"

RicherMans opened this issue · 5 comments

Hey there,
thanks for your great and super useful work!

I just stumbled upon a small problem related to the windowing question in #10: I'd like to use your package as a replacement for museval.
When evaluating with fast_bss_eval.bss_eval_sources(ref, est) and museval.evaluate(ref, est), I get different values for the SDR.
After some checking, I found that the main cause is the bsseval_sources_version parameter, which is set to False for museval.evaluate but produces exactly the same results as fast_bss_eval.bss_eval_sources if set to True.

My question is: Is there some kind of parameter, or any suggestion on how to change the output of fast_bss_eval, so that the results are somewhat similar?

Thanks in advance!

Hi @RicherMans, if my understanding is correct, the original bss_eval toolbox contained two main functions:

  • bss_eval_sources, which compares single-channel reference and estimated sources
  • bss_eval_images, which compares multichannel reference and estimated sources

At the moment, fast_bss_eval only implements the former, and I am not completely sure how to implement the latter yet.
Are you evaluating multichannel signals? Also, what is the order of magnitude of the SDR differences you obtained?

Wow thanks for the quick reply @fakufaku !

> bss_eval_images, that compares multichannel reference and estimated sources
> At the moment, fast_bss_eval only implements the former... And I am not completely sure how to implement it yet.
> Are you evaluating multichannel signals? Also, what is the order of differences of SDR obtained?

So I tested the differences between the two packages for single-channel evaluation, using this code snippet:

import fast_bss_eval
import museval
import torch

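CHUNKS = 1
CHANNELS = 1
# Two independent random signals squashed into (-1, 1), one second each at 44.1 kHz
x = torch.randn(CHANNELS, CHUNKS * 44100).tanh().numpy()
y = torch.randn(CHANNELS, CHUNKS * 44100).tanh().numpy()

# museval expects (references, estimates) shaped (n_sources, n_samples, n_channels)
# and returns SDR, ISR, SIR, SAR
sdr_mus, mus_isr, mus_sir, mus_sar = museval.evaluate(
    y.reshape(CHUNKS, -1, CHANNELS),
    x.reshape(CHUNKS, -1, CHANNELS))
print(f"MUSEVAL: {sdr_mus=}, {mus_sir=}, {mus_sar=}")

# fast_bss_eval expects (ref, est) shaped (..., n_channels, n_samples)
# and returns SDR, SIR, SAR when compute_permutation=False
sdr_fast, sir_fast, sar_fast = fast_bss_eval.bss_eval_sources(
    y.reshape(CHUNKS, CHANNELS, -1),
    x.reshape(CHUNKS, CHANNELS, -1),
    compute_permutation=False,
)
print(f"FASTBSS: {sdr_fast=}, {sir_fast=}, {sar_fast=}")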

and obtained the following results:

MUSEVAL: sdr_mus=array([[-2.98003]]), mus_sir=array([[inf]]), mus_sar=array([[-19.47029572]])
FASTBSS: sdr_fast=array([[-19.470297]], dtype=float32), sir_fast=array([[inf]], dtype=float32), sar_fast=array([[-19.470297]], dtype=float32)

While the SIR and SAR are identical in both packages, the difference in SDR is noticeable (-2.98 vs. -19.47).

@RicherMans after taking a closer look at the museval implementation, I think I understand what the difference is. I think it would not be too hard to fix for single-channel recordings; however, the multichannel version, which would be the full bss_eval_images, might take a little more time.
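
A likely source of the gap, stated here as an assumption rather than something confirmed in this thread: the two modes use different SDR definitions. The bss_eval_images-style SDR compares the reference directly with the estimate, while the bss_eval_sources-style SDR first projects the estimate onto delayed copies of the reference (a short FIR filter) and compares that projection with the residual. The sketch below illustrates both for a single channel in plain NumPy. The function names, the filter length of 16, and the signal length are hypothetical choices for illustration; this is not the code of either package.

```python
import numpy as np


def sdr_images_style(ref, est):
    """SDR as reference energy over the energy of (estimate - reference)."""
    err = est - ref
    return 10 * np.log10(np.sum(ref**2) / np.sum(err**2))


def sdr_sources_style(ref, est, filt_len=16):
    """SDR after allowing a short FIR filter on the reference.

    The estimate is projected (least squares) onto delayed copies of the
    reference; the projection is the target, the rest is distortion.
    """
    n = ref.shape[0]
    # Each column is the reference delayed by k samples, zero padded.
    A = np.zeros((n + filt_len - 1, filt_len))
    for k in range(filt_len):
        A[k:k + n, k] = ref
    est_pad = np.concatenate([est, np.zeros(filt_len - 1)])
    h, *_ = np.linalg.lstsq(A, est_pad, rcond=None)
    target = A @ h
    resid = est_pad - target
    return 10 * np.log10(np.sum(target**2) / np.sum(resid**2))


# Two independent random signals, similar in spirit to the snippet above
rng = np.random.default_rng(0)
ref = np.tanh(rng.standard_normal(8000))
est = np.tanh(rng.standard_normal(8000))

sdr_img = sdr_images_style(ref, est)
sdr_src = sdr_sources_style(ref, est)
print(f"images-style SDR:  {sdr_img:.2f} dB")
print(f"sources-style SDR: {sdr_src:.2f} dB")
```

For two unrelated signals of equal power, the first ratio is close to 1/2 (about -3 dB), whereas the second depends on how much of the estimate a short filter applied to the reference can explain, which is very little for independent noise. That pattern would be consistent with the values reported above.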

@fakufaku thanks a lot for the effort! I guess the multichannel version is somewhat irrelevant, at least for my use case.
Also, as you might have pointed out, for a quick evaluation it wouldn't matter much as long as the results are roughly in the same ballpark; one could still use the original museval evaluator for a proper evaluation.

But your package is much more convenient for checking whether, say, training is working.
Thanks a lot again!
And sorry I don't have a clue about signal processing, so I can't help with the implementation 😮‍💨

Don't worry :) I could try to create a bss_eval_images function restricted to single-channel measurements for now. I can't guarantee I will do it very quickly, but I'll put it on my todo list!