bwlewis/doRedis

Handle redis state separately from global redis connection

Opened this issue · 3 comments

If your queue is on one redis server but you want to conduct analyses based on data from another redis server (say one operating as an LRU cache), then it would seem like the use of doRedis would be difficult and/or impose a lot of overhead and responsibility for switching between the two redisConnect states. Can the redis connection used by doRedis be isolated somehow to not interfere with other calls to redisConnect?

Sorry for the latency. What about:

library(doRedis)
x = redisConnect(host="server 1", returnRef=TRUE)   # first redis connection
registerDoRedis(host="server 2", closeExisting=FALSE)  # doRedis-specific 2nd redis connection
redisSetContext(x)

There is almost no overhead there to manage multiple open redis connections, but I do agree that there is some responsibility for switching between them.
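One way to reduce that switching responsibility is a small wrapper that activates a connection, runs some code, and then restores whichever connection was active before. The sketch below is only an illustration: `with_redis_context` is a hypothetical helper, not part of rredis or doRedis, and it assumes rredis provides `redisGetContext()` alongside `redisSetContext()`. The demonstration uses mock getter/setter functions in place of real connections so that it runs without a Redis server.

```r
# Hypothetical helper (not part of rredis or doRedis): evaluate `expr` with
# `ctx` as the active connection, then restore the previously active one.
# The defaults assume rredis exports redisGetContext() and redisSetContext().
with_redis_context <- function(ctx, expr,
                               get = rredis::redisGetContext,
                               set = rredis::redisSetContext) {
  previous <- get()                    # remember the currently active context
  on.exit(set(previous), add = TRUE)   # restore it even if `expr` fails
  set(ctx)                             # switch to the requested context
  force(expr)                          # `expr` is lazy, so it runs only now
}

# Mock stand-ins for the global connection state (illustration only).
state <- new.env()
assign("active", "server 2", envir = state)
mock_get <- function() get("active", envir = state)
mock_set <- function(ctx) assign("active", ctx, envir = state)

# Inside the call the active context is "server 1"; afterwards it is
# "server 2" again.
seen <- with_redis_context("server 1", mock_get(),
                           get = mock_get, set = mock_set)
```

With real connections, the intended usage would be along the lines of `with_redis_context(x, redisGet("key"))`, where `x` is the reference returned by `redisConnect(..., returnRef=TRUE)`.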

Let me know if you think that there's a better way. I guess one approach would be to follow the DBI approach and just include a connection argument in every function. Do you think that would be better?

I think the overhead I was thinking about was really better characterized as 'responsibility'. I also wasn't sure that rebuilding the connection to the task management node at the end of the work would be sufficient to keep doRedis working as it should. I'm glad to hear authoritatively that rebuilding the connection should be enough. I thought it hadn't worked when I tried it, but I now suspect that my earlier attempts were befuddled by the issue with forks and IO we discussed earlier, or that I had neglected the closeExisting argument.

As for a 'better' way, I think it is probably a matter of style/preference. It seems to me that R's 'usual' mode is to not have global side effects from function calls, but that 'usual' mode is violated by what redisConnect(..., returnRef=FALSE) does. I have grown to like DBI-style calls (db connection object in the function call) and RcppRedis/redux-style calls, where one uses functions from the connection object. However, it would seem unkind to users to upset the apple cart on an existing package, so I don't really advocate for a change in that.

Vis-à-vis doRedis, it might be a kindness to users to have the workers manage their own connection outside of the global connection that redisConnect builds by default. However, I imagine that I'm in a rare edge case in wanting to talk to different redis servers within the same script.

I think this is a good idea. Thinking of ways to implement it now...