Add REPLICATION_FACTOR support
erez-rabih opened this issue · 10 comments
Hi,
I saw there are two modes: consistent-hashing and broadcast. How can I set the replication factor so that a given metric consistently arrives at two graphite backends?
Replication factor doesn't exist, but I could (and probably should) add it. Let me review this.
Thanks for the fast reply.
Also, I have been thinking about using polymur in production. How stable is it in your experience?
It's routed almost the entire production metrics traffic at FireEye for over a year, and I've also heard from some pretty well known companies that have begun using it (although I didn't gather at what scale). From a stability standpoint, it's production-worthy and doesn't have any known/open stability-related bugs. The open items are mostly just features.
Nice.
I would definitely switch my carbon-relays to polymur once replication factor is implemented.
Looks like a great project.
Renamed and will use this for issue tracking. Notes for development:
With replication, get_nodes is called repeatedly during key lookup until a set of REPLICATION_FACTOR (server, instance) tuples is gathered. These are the routing targets.
Initial idea would be to specify the replication factor as a flag*, e.g. polymur -destinations="10.0.5.20:2003…" -replication-factor=2 for the equivalent of a REPLICATION_FACTOR = 2 config. Unspecified should default to 1 to be backwards compatible with existing configuration.
*Replication factor has to be applied to the whole pool, so per-destination settings don't make sense.
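The lookup described in these notes can be sketched roughly as below. This is only an illustration of the general technique (walk a consistent-hash ring from the key's position until REPLICATION_FACTOR distinct destinations are collected), not polymur's actual code. The names ring, newRing, getNodes and hash32, the FNV hash, the virtual-node count and every address other than 10.0.5.20:2003 are assumptions made for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a toy consistent-hash ring (hypothetical type, for illustration).
// Each destination owns several points (virtual nodes) on a 32-bit circle.
type ring struct {
	points []uint32          // sorted hash positions
	owner  map[uint32]string // position -> destination
	dests  int               // number of distinct destinations
}

// hash32 maps a string to a ring position. FNV-1a is an arbitrary choice here.
func hash32(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// newRing places vnodes points per destination on the ring.
func newRing(dests []string, vnodes int) *ring {
	r := &ring{owner: make(map[uint32]string)}
	seen := make(map[string]bool)
	for _, d := range dests {
		if seen[d] {
			continue // ignore duplicate destinations
		}
		seen[d] = true
		for i := 0; i < vnodes; i++ {
			p := hash32(fmt.Sprintf("%s#%d", d, i))
			if _, taken := r.owner[p]; taken {
				continue // position collision: the first owner keeps it
			}
			r.owner[p] = d
			r.points = append(r.points, p)
		}
	}
	r.dests = len(seen)
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// getNodes walks clockwise from the key's position and returns the first
// rf distinct destinations. rf is clamped to [1, number of destinations].
func (r *ring) getNodes(key string, rf int) []string {
	if rf < 1 {
		rf = 1
	}
	if rf > r.dests {
		rf = r.dests
	}
	k := hash32(key)
	start := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= k })
	picked := make(map[string]bool)
	var out []string
	for i := 0; i < len(r.points) && len(out) < rf; i++ {
		d := r.owner[r.points[(start+i)%len(r.points)]]
		if !picked[d] {
			picked[d] = true
			out = append(out, d)
		}
	}
	return out
}

func main() {
	// Only 10.0.5.20:2003 appears in the thread; the others are made up.
	r := newRing([]string{"10.0.5.20:2003", "10.0.5.21:2003", "10.0.5.22:2003"}, 100)
	fmt.Println(r.getNodes("servers.web01.cpu.user", 2))
}
```

With a replication factor of 1 this degenerates to ordinary consistent hashing, which matches the backwards-compatible default proposed above. The same key always yields the same ordered set of targets as long as the destination list is unchanged.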
I think replication factor should be an independent flag as it has no relation to a specific destination.
Also, I see no use case for different replication factors on different destinations, so there's no reason to attach an RF (replication factor) to a specific host:port.
Yeah, I just realized what I was doing and updated :)
Also, RF should only be taken into account when consistent hashing is used since broadcast implicitly means RF = # Destinations
Or if we really want to be smart about this - broadcast is just a specific case in which RF = #Destinations but I don't know the project well enough to decide if that's how you would like to implement this.
It should just be ignored in broadcast, since that's basically what broadcast is (send a copy of all metrics to all destinations in the list). Will probably just add a startup note that lets users know if broadcast is being used and a replication-factor is set, it's being ignored / has no effect.
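A minimal sketch of how that startup behaviour could look, using Go's standard flag package. Only -destinations is taken from this thread. The -distribution and -replication-factor flag names, the mode strings, the parseArgs helper and the wording of the note are assumptions for illustration, not polymur's real interface.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// config holds the parsed settings (hypothetical type for this sketch).
type config struct {
	destinations []string
	distribution string
	rf           int // effective replication factor
}

// parseArgs parses the flags and returns the config plus an optional
// startup note. In broadcast mode the replication factor is ignored and
// the effective value is the number of destinations.
func parseArgs(args []string) (config, string, error) {
	fs := flag.NewFlagSet("relay-sketch", flag.ContinueOnError)
	dests := fs.String("destinations", "", "comma-delimited list of ip:port destinations")
	dist := fs.String("distribution", "broadcast", "broadcast or consistent-hashing (assumed names)")
	rf := fs.Int("replication-factor", 1, "copies of each metric in consistent-hashing mode (assumed flag)")
	if err := fs.Parse(args); err != nil {
		return config{}, "", err
	}

	var list []string
	for _, d := range strings.Split(*dests, ",") {
		if d = strings.TrimSpace(d); d != "" {
			list = append(list, d)
		}
	}
	if len(list) == 0 {
		return config{}, "", fmt.Errorf("no destinations specified")
	}

	cfg := config{destinations: list, distribution: *dist, rf: *rf}
	note := ""
	switch *dist {
	case "broadcast":
		// Broadcast already sends every metric to every destination.
		cfg.rf = len(list)
		if *rf != 1 {
			note = "replication-factor is ignored in broadcast mode"
		}
	case "consistent-hashing":
		if *rf < 1 || *rf > len(list) {
			return config{}, "", fmt.Errorf("replication-factor must be between 1 and %d", len(list))
		}
	default:
		return config{}, "", fmt.Errorf("unknown distribution %q", *dist)
	}
	return cfg, note, nil
}

func main() {
	// Sample arguments; 10.0.5.21:2003 is a made-up second destination.
	cfg, note, err := parseArgs([]string{
		"-destinations=10.0.5.20:2003,10.0.5.21:2003",
		"-distribution=broadcast",
		"-replication-factor=2",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	if note != "" {
		fmt.Println("note:", note)
	}
	fmt.Println("effective replication factor:", cfg.rf)
}
```

Treating the setting as a no-op with a note, rather than a hard error, keeps existing broadcast configurations starting up unchanged.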