Getting more than one fingerprint for each offset time?
meikidd opened this issue · 5 comments
Hi, this module is awesome! I am brand new to audio fingerprinting, and this module gives me a good opportunity to learn how to code.
But I found that I get more than one fingerprint for each offset time. Is that normal?
Sample Output:
time=34 fingerprint=238707
time=34 fingerprint=216179
time=34 fingerprint=347159
time=34 fingerprint=478295
time=43 fingerprint=601676
time=43 fingerprint=929303
time=43 fingerprint=1060439
time=45 fingerprint=132398
time=45 fingerprint=1049879
time=45 fingerprint=2360577
I am confused about how to use these hashes to search the database for matching hashes.
For example, if I have a record in the database with time=43 fingerprint=601676, does that mean the sample hits this db record?
Or does it require a full match of time=43 fingerprint=601676, time=43 fingerprint=929303, and time=43 fingerprint=1060439?
Hi, thanks!
Yes, it is normal to (sometimes) have more than one fingerprint for each offset time.
You can see this in the Readme pictures: blue points are often linked to several gray lines, and you get one fingerprint per gray line.
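To illustrate why one offset time can carry several fingerprints, here is a minimal Python sketch of peak pairing. It is not this module's actual code: the function name `pair_peaks`, the `fan_out` parameter, the bit packing of the hash, and the choice to report the anchor peak's time are all assumptions made for illustration.

```python
def pair_peaks(peaks, fan_out=3):
    """Pair each spectral peak with up to `fan_out` later peaks.

    `peaks` is a list of (time, frequency_bin) tuples sorted by time.
    Each pair (one "gray line") yields one fingerprint, reported at the
    time of the anchor peak (an assumption for this sketch).
    """
    fingerprints = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1 : i + 1 + fan_out]:
            dt = t2 - t1
            # Hypothetical packing of (f1, f2, dt) into one integer;
            # the real module may combine these values differently.
            h = (f1 << 16) | (f2 << 8) | dt
            fingerprints.append((t1, h))
    return fingerprints


peaks = [(34, 10), (36, 20), (37, 30), (40, 40)]
for t, h in pair_peaks(peaks):
    print(f"time={t} fingerprint={h}")
```

With these four made-up peaks, the peak at time 34 is paired with three later peaks, so three fingerprints share `time=34`.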
A good strategy for deciding whether the sample is a hit is to count the proportion of matching fingerprints for a given time offset between the reference track and the one you are analyzing.
I will publish code on this topic soon.
Good luck!
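The offset-counting strategy described above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the code the maintainer refers to: the names `build_index` and `best_match`, the in-memory dictionary standing in for the database, and the track names are all hypothetical.

```python
from collections import Counter, defaultdict


def build_index(reference_tracks):
    """Map each fingerprint to the (track_id, time) pairs where it occurs.

    `reference_tracks` is a dict: track_id -> list of (time, fingerprint).
    """
    index = defaultdict(list)
    for track_id, prints in reference_tracks.items():
        for t, h in prints:
            index[h].append((track_id, t))
    return index


def best_match(index, sample_prints):
    """Vote on (track_id, time offset) and return the most voted pair.

    Returns (track_id, offset, proportion), where proportion is the share
    of sample fingerprints that agree on that offset, or None if no
    fingerprint matched at all.
    """
    votes = Counter()
    for t_sample, h in sample_prints:
        for track_id, t_ref in index.get(h, ()):
            # Matches from the same track line up on a constant offset.
            votes[(track_id, t_ref - t_sample)] += 1
    if not votes:
        return None
    (track_id, offset), count = votes.most_common(1)[0]
    return track_id, offset, count / len(sample_prints)


# Fingerprints taken from the sample output in the question.
sample = [
    (34, 238707), (34, 216179), (34, 347159), (34, 478295),
    (43, 601676), (43, 929303), (43, 1060439),
    (45, 132398), (45, 1049879), (45, 2360577),
]
# Hypothetical database: song_a contains the sample 100 time units later,
# song_b only happens to share a single hash.
references = {
    "song_a": [(t + 100, h) for t, h in sample],
    "song_b": [(43, 601676)],
}
print(best_match(build_index(references), sample))
```

In this sketch a single shared hash, such as the one in `song_b`, counts as only one vote. A track is treated as a hit when a large share of the sample's fingerprints agree on the same offset, so neither one matching record nor an exact match of every fingerprint at a given time is required.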
"I will publish code on this topic soon."
Do you mean the implementation for section 2.3 in An Industrial-Strength Audio Search Algorithm?
Great to hear that, I am also looking for an open source module for that.
Yes, the link to the paper is relevant.
You can also look at this Python code to understand what you need to do: https://github.com/worldveil/dejavu/blob/7f53f2ab6896b38cfd54cc396e2326a98b957d07/dejavu/__init__.py#L119
Awesome, thank you very much!!
For the record, here is how I implemented fingerprint search:
https://github.com/adblockradio/adblockradio/blob/master/predictor-db/hotlist.js#L127