Ability to specify several entrypoints to the cluster
Hi there, @artix75!
First of all, this project looks like it could eliminate many pain points with using a Redis cluster for us, super exciting! Thanks a lot for working on this.
I've taken it for a test drive with our Redis cluster deployment and noticed a few things that would make it easier to use. For context, we run our Redis clusters in a container scheduler (think Kubernetes, HashiCorp Nomad, etc.), which means individual nodes will move around a lot.
- Being able to specify more than a single entrypoint to the cluster would allow a two-tier deployment with individually scalable Redis & proxy groups. Additionally, the proxy would not fail to start if the one node it was assigned to just moved/crashed. As part of this, it would be nice to be able to read servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and for everybody else running in Docker, I imagine)
- Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP + rereading the config file?). IMO the ideal behaviour would be
- If any nodes were added but I have healthy instances to talk to -> no action required
- If any nodes were added and all of the nodes that I know about are dead/gone -> connect to these new nodes
That'd allow us to not restart the proxy whenever one of the allocations moves (a rough sketch of this reload behaviour follows after this list).
- The last one is a potential bug for which I'm working on a solid repro case: as part of cluster bootstrapping, we do a CLUSTER RESET HARD. If the proxy connected to the node prior to doing this, it will not pick up the nodes that we start learning about afterwards: the cluster can be healthy, but requests sent to the proxy will fail. Take this with a grain of salt - a few assumptions here, still investigating.
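To make the reload behaviour I have in mind a bit more concrete, here's a rough Python sketch of the decision logic. All names (`load_entrypoints`, `is_healthy`, etc.) are made up for illustration; none of this exists in redis-cluster-proxy, which is written in C.

```python
# Hypothetical sketch of the SIGHUP-driven reload behaviour described above.
# Every name here is illustrative; none of this is redis-cluster-proxy code.
import signal

CONFIG_PATH = "/path/to/proxy.conf"
known_nodes = []          # entrypoints the proxy currently knows about
reload_requested = False

def load_entrypoints(path):
    """Read 'cluster host:port' lines from the config file."""
    nodes = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2 and parts[0] == "cluster":
                nodes.append(parts[1])
    return nodes

def is_healthy(node):
    # Stub: a real implementation would PING the node over an existing connection.
    return False

def on_sighup(signum, frame):
    global reload_requested
    reload_requested = True

signal.signal(signal.SIGHUP, on_sighup)

def maybe_reload():
    """Re-read entrypoints after SIGHUP; only reconnect if everything we knew is gone."""
    global known_nodes, reload_requested
    if not reload_requested:
        return
    reload_requested = False
    new_nodes = load_entrypoints(CONFIG_PATH)
    if any(is_healthy(n) for n in known_nodes):
        # Healthy instances remain -> just remember the new list, no action required.
        known_nodes = new_nodes
    else:
        # All known nodes are dead/gone -> adopt the new list and reconnect
        # (e.g. try the new nodes in order until one answers).
        known_nodes = new_nodes
```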
I'd be happy to send pull requests for any of the points above if they align with your vision for this project. Additionally, do you have a published roadmap/next planned work items so we could start contributing a bit? :)
Wanted to chime in on this as we are looking at using redis-cluster-proxy in the future ourselves.
I've modified a Lua client to support a dynamic list of Redis servers. When the client starts, it needs to reach at least one healthy Redis node. Once a connection is established, the client creates an internal list of the servers reported by CLUSTER SLOTS. If the client is for any reason unable to communicate with the cluster, it will try to contact all servers in the internal list to get a working connection.
If nodes are added or removed, the internal list is updated, as ASK/MOVED replies force the client to refresh its slots information.
The modifications I made to the Lua client are here: steve0511/resty-redis-cluster@master...toredash:fix/dynamic-serv-list
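In case a Python illustration is easier to follow than the Lua diff, the core idea is roughly this. The helpers `discover_nodes` and `pick_working_node` are just stand-ins for whatever your client library exposes; this is not the actual resty-redis-cluster code.

```python
# Rough, self-contained sketch of the fallback strategy described above.
# discover_nodes() is a stub standing in for a real CLUSTER SLOTS call.

def discover_nodes(node):
    """Stub: a real client would send CLUSTER SLOTS to `node` and return
    every host:port that appears in the reply."""
    return [node]

def pick_working_node(known_nodes):
    """Walk the internal node list until one answers, then refresh the list
    from the node we reached."""
    for node in known_nodes:
        try:
            nodes = discover_nodes(node)
            return node, nodes
        except ConnectionError:
            continue
    raise ConnectionError("no reachable cluster node in the internal list")

# Usage: start from the configured seed nodes, then keep the refreshed list
# around so later failures can fall back to any node in it.
seeds = ["10.0.0.1:7000", "10.0.0.2:7000"]
node, known = pick_working_node(seeds)
```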
> I've taken it for a test drive with our Redis cluster deployment and noticed a few things that would make it easier to use. For context, we run our Redis clusters in a container scheduler (think Kubernetes, HashiCorp Nomad, etc.), which means individual nodes will move around a lot.
> - Being able to specify more than a single entrypoint to the cluster would allow a two-tier deployment with individually scalable Redis & proxy groups. Additionally, the proxy would not fail to start if the one node it was assigned to just moved/crashed. As part of this, it would be nice to be able to read servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and for everybody else running in Docker, I imagine)
Could this be solved by using a Service in e.g. k8s? That way you would always hit a node that is reporting as healthy. This seems like something that could be solved with DNS.
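For example (assuming the proxy accepts a hostname as the cluster entrypoint - I haven't verified that - and using a made-up Kubernetes Service name), the config could simply point at the Service:

cluster redis-cluster.default.svc.cluster.local:6379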
> - Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP + rereading the config file?). IMO the ideal behaviour would be
> - If any nodes were added but I have healthy instances to talk to -> no action required
> - If any nodes were added and all of the nodes that I know about are dead/gone -> connect to these new nodes
> That'd allow us to not restart the proxy whenever one of the allocations moves.
See my notes above about one approach in Lua
Using DNS is an interesting idea, and it would for sure solve the "please give me any node which is living right now" problem; as such it would be a step up and allow us to run a two-tiered deployment.
However, it lacks the flexibility that explicitly listing all (or a subset of) IPs would provide, e.g.:
- Unless the proxy continuously re-evaluates the DNS name, this would not allow us to pick up new nodes on the fly (to help in e.g. situations where the entire cluster goes down and comes back within a few minutes - rack/pod/DC outages happen); a small sketch of what such re-resolution could look like follows after this list. We could bounce the proxy in these scenarios, but that doesn't seem ideal.
- If the proxy resolves the DNS name to just a single IP, it could hit an instance which just failed and then fail to start up.
- It'd be preferable for the proxy to have a list of all nodes to go through in failure scenarios, and there's a practical limit to the number of records per DNS name depending on what software is being used - I, for example, could not fit one record per Redis instance into a DNS name for a few of our clusters. To be fair, this is an extreme edge case, but it does happen.
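Just to illustrate what I mean by re-evaluating the name, something along these lines would do (pure illustration: the service name is made up and this is not existing proxy behaviour):

```python
# Purely illustrative: poll DNS and notice when the set of entrypoint IPs changes.
import socket
import time

def resolve_all(name, port):
    """Return every address currently behind the DNS name."""
    infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}

def watch_entrypoints(name, port, interval=30):
    """Report whenever the resolved set of IPs changes."""
    known = set()
    while True:
        current = resolve_all(name, port)
        if current != known:
            print(f"entrypoints changed: added={current - known}, removed={known - current}")
            known = current
        time.sleep(interval)

# watch_entrypoints("redis-cluster.example.internal", 7000)  # hypothetical service name
```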
Thank you for posting your solution for the lua client, that behaviour looks great and would solve all of my concerns. :)
@JanBerktold Hi, thank you for your reports and suggestions.
As for the multiple entrypoints, it's a cool feature that's worth implementing in the near future, IMHO.
I'll try to answer the other suggestions:
> - ...it would be nice to be able to read servers from the config file as opposed to a mandatory command-line argument, as that's a little easier to set up in our context (and for everybody else running in Docker, I imagine)
You can already specify the endpoint in the config file. Just launch the proxy specifying the config file:
./redis-cluster-proxy -c /path/to/proxy.conf
Inside the config file, you can set the endpoint this way:
cluster 127.0.0.1:7000
In the latest commit (unstable branch), you can also find an example config file (proxy.conf).
> - Really cool would also be the ability to reload those entrypoints (maybe by listening for SIGHUP + rereading the config file?). IMO the ideal behaviour would be
> - If any nodes were added but I have healthy instances to talk to -> no action required
> - If any nodes were added and all of the nodes that I know about are dead/gone -> connect to these new nodes
> That'd allow us to not restart the proxy whenever one of the allocations moves.
> - The last one is a potential bug for which I'm working on a solid repro case: as part of cluster bootstrapping, we do a CLUSTER RESET HARD. If the proxy connected to the node prior to doing this, it will not pick up the nodes that we start learning about afterwards: the cluster can be healthy, but requests sent to the proxy will fail. Take this with a grain of salt - a few assumptions here, still investigating.
Currently, the proxy automatically reconfigures its internal cluster representation following ASK or MOVED replies (in the coming days I'll implement handlers for other cluster-related errors); however, it always uses the endpoint specified via the command line or config file.
Implementing multiple endpoints and considering every master node as a potential entrypoint (also after a cluster reconfiguration, i.e. with added nodes) could perhaps address the issue.
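To illustrate the kind of reconfiguration involved: ASK/MOVED errors carry the address of the node that owns (or is importing) the slot, so every node learned this way could be treated as a further potential entrypoint. A tiny Python illustration of parsing such a redirect (the proxy itself is written in C, so this is only a sketch of the idea):

```python
# Illustration only: learn new nodes from ASK/MOVED redirect errors.
# The real proxy is written in C; this just shows the idea.

def parse_redirect(error_message):
    """Parse 'MOVED <slot> <host:port>' or 'ASK <slot> <host:port>'.
    Example: 'MOVED 3999 127.0.0.1:6381' -> (3999, '127.0.0.1:6381')."""
    kind, slot, address = error_message.split()
    if kind not in ("MOVED", "ASK"):
        raise ValueError(f"not a redirect error: {error_message!r}")
    return int(slot), address

known_entrypoints = {"127.0.0.1:7000"}
slot, address = parse_redirect("MOVED 3999 127.0.0.1:6381")
known_entrypoints.add(address)   # the redirect target becomes another potential entrypoint
```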
> I'd be happy to send pull requests for any of the points above if they align with your vision for this project. Additionally, do you have a published roadmap/next planned work items so we could start contributing a bit? :)
You're free to send PRs if you want; a roadmap will be available after the first RC1 (which is planned for the end of this month).
Thank you again :)