Investigation needed, odd Redis behavior
marsupialtail opened this issue · 0 comments
Currently we have to call self.PFT.get right after self.PFT.set to make sure the PFT table has been updated before invoking the Ray RPCs, which cause the workers to read the PFT table.
Without the extra self.PFT.get, I observe workers failing to initialize properly. Adding a delay immediately after self.PFT.set also works, which leads me to suspect a Redis timing issue.
In theory this should not happen: self.PFT.set should block in the client, i.e. it should return only after Redis has applied the update, so the Redis state should be correct before the RPCs are launched.
This is very odd. Although it doesn't pose a problem right now, we should understand this behavior.
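For reference, here is a minimal sketch of the set-then-get workaround described above. The names `set_and_confirm` and `InMemoryClient` are hypothetical and not part of the codebase. The sketch only assumes the client exposes `set(key, value)` and `get(key)` the way a redis-py client does. The actual class behind self.PFT is not shown in this issue, so treat this as an illustration rather than the real implementation.

```python
import time


class InMemoryClient:
    """Hypothetical stand-in for a Redis client; only set/get are modelled."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value
        return True

    def get(self, key):
        # Returns None for a missing key, like a Redis GET on an absent key.
        return self._data.get(key)


def set_and_confirm(client, key, value, retries=5, delay=0.01):
    """Write `key`, then read it back until the written value is visible.

    Returns True once a read returns `value`, or False if it never does
    within `retries` attempts. Assumption: `get` returns the value in the
    same type that was written (with redis-py this means passing bytes,
    or creating the client with decode_responses=True).
    """
    client.set(key, value)
    for _ in range(retries):
        if client.get(key) == value:
            return True
        time.sleep(delay)  # back off briefly before the next read
    return False
```

Calling something like `set_and_confirm(client, "PFT", payload)` before launching the Ray RPCs, and logging when more than one read is needed, might also help tell whether the write is genuinely not visible yet or whether the extra call merely adds enough delay.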