qubole/rubix

Avoid caching presto worker nodes

Opened this issue · 2 comments

After a Presto worker node is deregistered from the coordinator because it is no longer responsive, we've noticed that it still receives cache read requests for a period of time. This can lead to internal errors for the query and/or slower response times. Instead, since the RubiX Presto cluster manager has a handle to the Presto node manager, it can get the worker nodes directly from it, sorting and adjusting the node index only when the nodes have changed.
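A rough sketch of the idea, to make the proposal concrete. This is not the actual RubiX code: `LiveNodeIndex` is a hypothetical class, and the `Supplier<Set<String>>` is only a stand-in for whatever the Presto node manager exposes for listing active workers. It asks for the live worker set on every lookup and re-sorts only when membership has changed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch, not RubiX's actual cluster manager.
class LiveNodeIndex {
    // Assumption: stand-in for the Presto node manager; returns the
    // hostnames of the workers currently registered with the coordinator.
    private final Supplier<Set<String>> activeWorkers;

    // Sorted view of the last observed worker set.
    private List<String> sortedNodes = Collections.emptyList();

    // Counts how often the sorted list was rebuilt (for illustration only).
    private int rebuildCount = 0;

    LiveNodeIndex(Supplier<Set<String>> activeWorkers) {
        this.activeWorkers = activeWorkers;
    }

    // Returns the sorted worker list, rebuilding it only when the set of
    // live workers differs from the one seen on the previous call.
    synchronized List<String> getNodes() {
        Set<String> current = activeWorkers.get();
        boolean unchanged = current.size() == sortedNodes.size()
                && current.containsAll(sortedNodes);
        if (!unchanged) {
            List<String> rebuilt = new ArrayList<>(current);
            Collections.sort(rebuilt);
            sortedNodes = Collections.unmodifiableList(rebuilt);
            rebuildCount++;
        }
        return sortedNodes;
    }

    // Index of a host in the sorted list, or -1 if it is no longer live.
    synchronized int indexOf(String host) {
        return getNodes().indexOf(host);
    }

    synchronized int getRebuildCount() {
        return rebuildCount;
    }
}
```

With something like this, a deregistered worker drops out of the list on the next lookup instead of lingering until a cached node list expires. The cost is one call to the node manager per lookup, which I'm assuming is cheap since it is in-process.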

@stagraqubole - I can take this one if you'd like.