HashRing.add_node error - gb_trees.insert_1/4 {:key_exists}
weixiyen opened this issue · 11 comments
I tried creating a hash ring with nodes 1 -> 1024, and noticed that it would error out at 805 every time.
For example, you can test with this:
0..804 |> Enum.reduce(HashRing.new, fn (n, r) -> IO.inspect(n); r |> HashRing.add_node(n) end) # => works
0..805 |> Enum.reduce(HashRing.new, fn (n, r) -> IO.inspect(n); r |> HashRing.add_node(n) end) # => breaks
Error output
** (ErlangError) erlang error: {:key_exists, 2854800734}
(stdlib) gb_trees.erl:318: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:297: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:297: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:280: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:297: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:280: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:297: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:280: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:297: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:280: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:297: :gb_trees.insert_1/4
(stdlib) gb_trees.erl:277: :gb_trees.insert/3
(libring) lib/ring.ex:104: anonymous fn/3 in HashRing.add_node/3
(elixir) lib/enum.ex:1785: Enum.reduce_range_inc/4
I'll take a look tonight hopefully - it looks like there is a hash collision, but I'll need to investigate. As a workaround in the meantime, I would suggest adding nodes as a tuple like {:node, i}, where i is the integer value, as that doesn't seem to have the same problem. Let me know what version of Erlang you are seeing this on if it still happens; perhaps something has changed that I'll need to account for. Thanks for reporting!
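For reference, a minimal sketch of that workaround (HashRing.key_to_node/2 is libring's lookup function; the key and the example result below are purely illustrative):

ring =
  0..1024
  |> Enum.reduce(HashRing.new, fn n, r ->
    # Wrapping the integer in a tagged tuple changes the shape of the hashed term
    HashRing.add_node(r, {:node, n})
  end)

# Keys then resolve to the wrapped node terms:
HashRing.key_to_node(ring, "some-key")
# => {:node, 512} (illustrative output)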
I've pushed a fix to master for this. I'm planning on doing some additional testing prior to release to ensure key distribution is still uniform (assuming uniform weights), but the fix looks good so far in my preliminary testing.
Just an FYI, that fix is not correct; I'm working on the full fix right now.
thanks!
This is now fixed in master, can you confirm things look good on your end before I publish 1.2.0?
Hi there, it still collides, just at a larger number:
0..2000 |> Enum.reduce(HashRing.new, fn (n, r) -> IO.inspect(n); r |> HashRing.add_node(n) end)
I don't think it's an integer-specific issue, as it eventually collides for cases like {:node, i} too.
This change does currently solve my problem for 1000+, though using {:node, i} now only gets up to 185 before a collision:
0..2000 |> Enum.reduce(HashRing.new, fn (n, r) -> IO.inspect(n); r |> HashRing.add_node({:node, n}) end)
For version 1.1, I was able to get above 1000 by using {:_node, n}, so that's what I've been doing thus far.
Hmmm, yeah, the design of the hash ring is such that it inserts many shards per node added - 128 by default - so if you insert 2k nodes, you are actually inserting 256k elements into the ring, and because of the similar structure of the nodes (it's always a tuple of {node, shard}), the likelihood of collision is very high. The other factor in the design is that we need keys to be distributed uniformly across the range expressed by the nodes - this is why I was using phash2 to begin with. The alternative is that I could simply generate a bignum from the hashed object, but then distribution is completely broken; phash2 handles distributing the values nicely across a pretty reasonable range (2^32 - 1). I'm looking to see if I can modify the way the nodes are hashed such that the chance of collisions is decreased. That said, I'm wondering if you can tell me a bit more about your use case - there may be a better way of handling it, but I don't know much about how you are using libring at the moment.
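As a rough back-of-the-envelope illustration of that birthday problem - this hashes bare {node, shard} tuples with :erlang.phash2/2 directly, which is not necessarily how libring hashes them internally:

# 2000 nodes x 128 shards = ~256k terms hashed into a 2^32 range
hashes =
  for n <- 0..1999, s <- 0..127 do
    :erlang.phash2({n, s}, 4_294_967_296)
  end

total = length(hashes)
distinct = hashes |> Enum.uniq() |> length()
# With ~256k values in a 2^32 space, a handful of collisions is expected
IO.puts("#{total - distinct} collisions among #{total} hashes")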
@weixiyen I've found a solution and my tests look good - as an example, I created a ring with n up to 20,000. I've published 1.2.1 with the fixes.
Awesome, confirmed. It still maintains its ring-creation speed advantage over the other libraries.
So, looking through the code, some nodes will simply never match, correct? That's a perfectly acceptable solution for my use case, but I just wanted to make sure.
Not exactly - the collisions occur for a specific shard, so it means one less shard on the ring for that node, but ultimately that doesn't really affect the key distribution at all, and each node will always be represented on the ring even if collisions occur.
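If anyone wants to sanity-check that claim, a quick sketch (node and key names are arbitrary; Enum.frequencies/1 requires Elixir 1.10+):

ring =
  Enum.reduce(0..99, HashRing.new, fn n, r ->
    HashRing.add_node(r, {:node, n})
  end)

# Map a batch of keys to nodes and count how many land on each node
counts =
  1..100_000
  |> Enum.map(&HashRing.key_to_node(ring, "key-#{&1}"))
  |> Enum.frequencies()

# Min and max keys-per-node should be reasonably close if distribution holds
IO.inspect(Enum.min_max_by(counts, fn {_node, c} -> c end))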