Netflix/Priam

How to update tokens after adding nodes to cluster

Opened this issue · 11 comments

Hi,

I tried to find an answer in the archives, but only found an issue noting that the "doubling the cluster" page in the wiki is empty.
We're currently evaluating Priam for our C* clusters at Xively, and I'm wondering what the easiest way is to add new nodes to a running cluster, and especially how to do this properly in a multi-hundred-node cluster.

We work with manual token assignment (not vnodes, because Priam doesn't support them), so after adding a new node I have to calculate new tokens and move nodes one by one. The quickest way seems to be calling Priam's /move endpoint on each node and then doing the same with /cleanup.
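To make the token arithmetic concrete, here is a minimal sketch of what I mean. It assumes a RandomPartitioner-style token space of [0, 2**127) and a simple "existing nodes take the first slots" assignment; `balanced_tokens` and `plan_moves` are names made up for illustration, not Priam's own token logic.

```python
# Sketch only: evenly spaced tokens for a ring without vnodes.
# Assumption: RandomPartitioner-style token space [0, 2**127).
# This is not Priam's token calculation, which may differ (e.g. offsets).

RING_SIZE = 2**127


def balanced_tokens(node_count, ring_size=RING_SIZE):
    """Return node_count tokens spaced evenly around the ring."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    return [i * ring_size // node_count for i in range(node_count)]


def plan_moves(current_tokens, new_node_count):
    """Pair each existing token with its slot in the resized ring.

    Existing nodes (in ring order) take the first slots of the new ring;
    the added nodes take the remaining slots. Only nodes whose token
    actually changes are returned, as (old_token, new_token) pairs.
    """
    target = balanced_tokens(new_node_count)
    pairs = zip(sorted(current_tokens), target)
    return [(old, new) for old, new in pairs if old != new]


if __name__ == "__main__":
    # Growing a balanced 4-node ring to 5 nodes.
    for old, new in plan_moves(balanced_tokens(4), 5):
        print(f"move node at {old} -> {new}")
```

Each (old, new) pair would then be fed to /move on the corresponding node, followed by /cleanup once the moves are done.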

Is there a better way implemented in Priam doing the same?

Question B still stands: how do I double the cluster size with Priam? Token assignment is a bit easier in that case.

Thanks,
Andor

Hi Andor, will engage my colleagues at Netflix and get back to you.

Regards

Hi Vinh,
Do you have an answer for this?

I believe InstanceIdentity should be updated with the new token in SimpleDB by the /move endpoint, but according to the latest sources this is not the case.

Hi Vinh,
Do you have an update on this thread?
Thanks,
Andor

Hi Andy, sorry for the delay. I will try to spend time to research and answer your questions.
-Vinh

Any luck on this? Currently we have to update SimpleDB manually.

Hi BlueDragonX,

Yes that's still the case. The idea is that you should always double the cluster with Priam rather than adding nodes one-by-one.

Andor

That seems like a poor reason to cripple this feature. Why have a move endpoint at all, then, if that's the thinking?

Just to provide more context.
Why not add nodes instead of doubling? Doubling the cluster ensures the data stays balanced.
Why not use vnodes? vnodes have their advantages, but for Netflix they have one critical flaw: a token range can go offline. Specifically, the replicas of a range may end up residing in the same zone, so losing that zone takes the token range offline.
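The point about doubling can be illustrated with a small sketch. It assumes evenly spaced tokens in a RandomPartitioner-style space of [0, 2**127); the helper names are hypothetical and this is not Priam's own calculation. It counts how many existing tokens are still valid in the resized, balanced ring.

```python
# Sketch only: compare doubling a ring with adding a single node.
# Assumption: evenly spaced tokens in a [0, 2**127) token space.

RING_SIZE = 2**127


def balanced_tokens(node_count, ring_size=RING_SIZE):
    """Return node_count tokens spaced evenly around the ring."""
    return [i * ring_size // node_count for i in range(node_count)]


def tokens_kept(old_count, new_count):
    """Count tokens of the old balanced ring that also appear in the new one."""
    return len(set(balanced_tokens(old_count)) & set(balanced_tokens(new_count)))


for old, new in [(4, 8), (4, 5)]:
    kept = tokens_kept(old, new)
    print(f"{old} -> {new} nodes: {kept} of {old} existing tokens unchanged")
# prints:
# 4 -> 8 nodes: 4 of 4 existing tokens unchanged
# 4 -> 5 nodes: 1 of 4 existing tokens unchanged
```

When doubling, every new token bisects an existing range, so no existing node has to move. When adding a single node, almost every existing node needs a new token to restore balance.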

I know why doubling is better. That doesn't address the issue of move not behaving the way it should. Either implement it correctly or remove it.

That aside, in my experience, using the double ring endpoint does not result in a balanced cluster. As a result, I have to move nodes. I thought this was by design, but based on this discussion that doesn't seem to be the case.

No, the double-ring endpoint works properly for us (Xively), so you might have misconfigured something or have a wrong ASG setup.
The /move endpoint, however, should be extended to update SimpleDB, because currently it only forwards the request to nodetool.
It would be nice if someone could create a PR for that.
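
Until then, the manual SimpleDB update mentioned above could look roughly like the sketch below. It uses boto3's SimpleDB client (`put_attributes`). The domain name InstanceIdentity comes from this thread, but the attribute name `token`, the item name and the region are assumptions for illustration; check them against your own SimpleDB data before running anything.

```python
# Sketch only: record a node's new token in SimpleDB after a manual move.
# Assumptions (verify against your data): the token lives in an attribute
# called "token" in the "InstanceIdentity" domain; the item name is
# whatever key your Priam deployment uses for that node.


def build_token_update(item_name, new_token, domain="InstanceIdentity",
                       token_attribute="token"):
    """Build keyword arguments for a SimpleDB PutAttributes call."""
    return {
        "DomainName": domain,
        "ItemName": item_name,
        "Attributes": [
            # Replace=True overwrites the old value instead of adding a second one.
            {"Name": token_attribute, "Value": str(new_token), "Replace": True}
        ],
    }


def update_token(item_name, new_token, region="us-east-1"):
    """Send the update; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3 installed

    sdb = boto3.client("sdb", region_name=region)
    sdb.put_attributes(**build_token_update(item_name, new_token))
```

Keeping the request builder separate from the call makes it easy to print and review the payload for each node before touching SimpleDB.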