FastFilter/xorfilter

Golang Interface in place for Source64 and Hash64?

jtejido opened this issue · 3 comments

Hi,

First, Thank you for your paper and implementation/s!

I'm wondering if it's possible to re-write this that would accept a more idiomatic 'Go' way? say, accepting Source64 and Hash64 instead of using SplitMix64 and Murmur (with that being said, I'm wondering if the random and hash functions can also be changed, say using Mersenne or XoRoShiRo generators and/or T1ha or Metro for the hash). I'm not sure if the literature mentions anything specific random generators but I know you specifically chose Murmur for the hash (in which case, Hash64 interface isn't necessary).

Contributions are certainly invited to make this easier for Go users.

I'd be interested in better understanding what you find non-idiomatic in

filter,error := xorfilter.Populate(keys) // keys is of type []uint64

Is there anything non-idiomatic about the type []uint64?

We could allow users to select families of hash functions... but note that they must be randomized hash functions whereas Hash64 is meant for cryptographic hash functions (e.g., sha256). Though it is the same word (hash), it is a totally different concept. If you hash an object with sha256, you must always get the same hash value. We use randomized hash functions where we want different runs to give you different values.

If your input comes in the form of a Source64, you can create a slice, fill it up with the values you want to filter upon. I see no benefit for the xorfilter library to support that, you can do it in your own code. Note that in xorfilter, we need to know how many you have. (This is true of all probabilistic filters since they have a capacity that is determined by the input size.)

Thanks for the prompt reply.

Sorry, I'd have to rewrite my question. It's about accepting Source64 and Hash64 as parameter for xor8 struct to make use of other random generator and hash functions other than the splitmix64 and murmur64 that's already buit-in. But I understand and we can close this one now.

Again thanks!

Ah. Yes. I misunderstood the Source64 point.

It would certainly be possible to make it pratical for users to specify their own RNG.