mjhydri/BeatNet

What is the difference between the three feature extraction models?

980202006 opened this issue · 7 comments

Hi, great work!
I noticed the 'model' parameter.
I tested the three models on the same wav file, but they give different results.

from BeatNet.BeatNet import BeatNet
import numpy as np

def get_bpm(inp):
    # Each output row is [beat time in seconds, beat position];
    # estimate the tempo from the mean inter-beat interval.
    begin = inp[0][0]
    durations = []
    for line in inp[1:]:
        durations.append(line[0] - begin)
        begin = line[0]
    return 60 / np.mean(durations)

for i in range(1, 4):
    estimator = BeatNet(i, mode='offline', inference_model='PF', thread=False)
    output = estimator.process("bpm_tes1.wav")
    print(i, get_bpm(output))

Thanks for your interest. Regarding your question, we refer you to the original paper, but here is a brief answer: to report the model's performance on different unseen datasets, three models were trained, each with one dataset left out, which is why they produce different results. Since the default model (model 1) saw more training data, we recommend it for arbitrary music pieces. However, depending on the music genre, the other models may outperform it.

Also, please note that for 'offline' usage we recommend the 'DBN' inference model, which is non-causal and leverages future data in addition to the current data to infer the beats/downbeats. For the 'online', 'realtime', and 'streaming' modes, where future data is not available, particle filtering ('PF') is required as the inference model.
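For example, a minimal offline sketch using the recommended 'DBN' inference model could look like the following (this assumes the same process() call and output format as the snippet above; "bpm_tes1.wav" is just a placeholder file name):

from BeatNet.BeatNet import BeatNet
import numpy as np

# Offline, non-causal inference with the DBN model (recommended for offline use).
estimator = BeatNet(1, mode='offline', inference_model='DBN', thread=False)

# The output is assumed to be rows of [beat time in seconds, beat position].
output = np.asarray(estimator.process("bpm_tes1.wav"))
tempo = 60 / np.mean(np.diff(output[:, 0]))
print("Estimated tempo (BPM):", tempo)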

Thank you!

You are welcome!

Can you make the training code public? I would like to try it on my own dataset.

Of course! Developers who wish to access the training code can request it by email, and we will send the material to them directly.

Thank you!

Anytime!