I based this on a Go implementation by nfisher
(linked below). It's not complete yet and doesn't even pass its test
harness, so don't use it.
-
n
- The number of expected insertions. Note this isn't unique values; it's all insertion operations -
δ
- The false positive rate as a fraction like1/64
or1/512
-
p
- The number of bits from the hash function which is used to actually look up values. This is computed based on the desired false positive rateδ
and the insertion countn
thusly:p = log2(n/δ)
-
q
- The number of bits ofp
which are used as the quotient to determine the home slot where a given hash value should go -
r
- The bits ofp
left over after theq
bits have been used to find the home slot. This remainder is used to distinguish between multiple values that have the same quotient. Because of the probabilistic nature of this data structure, the worst case is that two hashesh(x)
andh(y)
are disambiguated entirely based on the lowr
bits of their hashes, and thus the worst case false positive rate is2^-r
.So for
r = 6
, the false positive rate is1/(2^6)
which is1/64
-
BITS_PER_SLOT
- In the code sometimes this is used forr
. The code uses conditional compilation to choose either compile-time or run-time implementation. For values8
,16
,32
, and64
there are accelerated implementations that use the corresponding integer types.
- Go implementation that's a bit more readable
- Morning Paper explainer for the actual paper
- The actual paper
rust-cqf
is dual-licensed under the Apache 2.0 and MIT licenses.