erikdubbelboer/redis-lua-scaling-bloom-filter

Lua sub bugs

Closed this issue · 1 comments

I was using this as a basis for a redis bloom filter (though I ended up just roughly borrowing some code and stripping out the scaling filter expansion). Testing consistently showed 10x more false positives than expected. I noticed the hash handling code was broken:

local h = { }
h[0] = tonumber(string.sub(hash, 0 , 8 ), 16)
h[1] = tonumber(string.sub(hash, 8 , 16), 16)
h[2] = tonumber(string.sub(hash, 16, 24), 16)
h[3] = tonumber(string.sub(hash, 24, 32), 16)

See http://www.lua.org/pil/20.html: Lua strings are indexed from 1 and sub() is inclusive at both ends, so the ranges above overlap by one character. When I fixed this, things worked (no false negatives and the right amount of false positives). E.g.:

local h = { }
h[0] = tonumber(string.sub(hash, 1 , 8 ), 16)
h[1] = tonumber(string.sub(hash, 9 , 16), 16)
h[2] = tonumber(string.sub(hash, 17, 24), 16)
h[3] = tonumber(string.sub(hash, 25, 32), 16)
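To make the difference concrete, here is a minimal sketch that runs both sets of ranges against a made-up 32-character hex string. The `hash` value and the `buggy`/`fixed` table names are assumptions for illustration only and are not taken from the repository.

```lua
-- Hypothetical 32-character hex digest, used only for illustration.
local hash = "0123456789abcdef0123456789abcdef"

-- Original ranges: string.sub treats a start index of 0 as 1, and both
-- ends are inclusive, so every slice after the first is 9 characters
-- long and shares one character with the previous slice.
local buggy = {
  string.sub(hash, 0, 8),   -- "01234567"  (8 chars)
  string.sub(hash, 8, 16),  -- "789abcdef" (9 chars)
  string.sub(hash, 16, 24), -- "f01234567" (9 chars)
  string.sub(hash, 24, 32), -- "789abcdef" (9 chars)
}

-- Corrected ranges: four non-overlapping 8-character slices.
local fixed = {
  string.sub(hash, 1, 8),   -- "01234567"
  string.sub(hash, 9, 16),  -- "89abcdef"
  string.sub(hash, 17, 24), -- "01234567"
  string.sub(hash, 25, 32), -- "89abcdef"
}

for i = 1, 4 do
  print(i, buggy[i], #buggy[i], fixed[i], #fixed[i])
end
```

With the original ranges, three of the four values are parsed from 9 hex digits rather than 8, so they can exceed 32 bits and are not independent of their neighbours. That plausibly contributes to the extra false positives, though the exact effect depends on how the values are used afterwards.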

Also maybe the line that sets bits could use ceil() instead of floor()? Not a huge deal though.

That's a very good observation. I'm still wondering why my tests weren't reporting these 10x more false positives. Maybe it doesn't happen in all cases. Anyway, I fixed it now.