Handling lots of reads and writes with Rol and Redis
I'm using Rol and Redis as the primary data store for stats I'm calculating from the Stack Exchange data dump.
Currently, the flow is:
Get a user, update any properties in place for that user, and then move on to the next user. Because I'm parsing XML files and each row can represent a different user, I end up retrieving the same user thousands of times through Rol.
This works well enough, but after a few hours of running I end up getting a timeout from Redis:
Unhandled Exception: System.TimeoutException: Timeout performing HGET /IPostStats/photo.stackexchange.com-918603, inst: 1, mgr: ExecuteSelect, err: never, queue: 2, qu: 0, qs: 2, qc: 0, wr: 0, wq: 0, in: 0, ar: 0, IOCP: (Busy=0,Free=1000,Min=8,Max=1000), WORKER: (Busy=0,Free=32767,Min=8,Max=32767), clientName: DOR
at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 1927
at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\RedisBase.cs:line 80
at StackExchange.Redis.RedisDatabase.HashGet(RedisKey key, RedisValue hashField, CommandFlags flags) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\RedisDatabase.cs:line 118
at Rol.RedisOperations.GetHashValue[TKey,TValue](Store store, RedisKey hashName, TKey field)
at StackStats.Stats.Extensions.UserExtensions.GetAverageAnswersPerDay(IUser user) in c:\projects\shift\src\StatPersistence\Stats\Extensions\UserExtensions.cs:line 28
at XmlProcessor.UserProcessor.Process(XElement element, String siteName) in c:\projects\shift\src\XmlProcessor\PostProcessor.cs:line 53
at XmlProcessor.DocumentProcessor.processFile(IProcessor processor, XmlReader reader, String siteName) in c:\projects\shift\src\XmlProcessor\DocumentProcessor.cs:line 76
at XmlProcessor.DocumentProcessor.ProcessXmlDocument(String xmlFile) in c:\projects\shift\src\XmlProcessor\DocumentProcessor.cs:line 37
at StatRunner.Program.Main(String[] args) in c:\projects\shift\src\StatRunner\Program.cs:line 28
According to StackExchange.Redis issue #83, I should increase the timeout used when talking to Redis (to give it more time to write data to disk before sending it more operations).
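For reference, here is a minimal sketch of raising the synchronous timeout when the StackExchange.Redis connection is created directly. The endpoint, the `RedisConnection` class name, and the 5000 ms value are placeholders of mine, and I have not confirmed how Rol creates its connection or whether it overrides this setting.

```csharp
using StackExchange.Redis;

// Hypothetical helper; not part of Rol or StackExchange.Redis.
public static class RedisConnection
{
    public static ConnectionMultiplexer Create()
    {
        // Placeholder endpoint; replace with the real server address.
        var options = ConfigurationOptions.Parse("localhost:6379");

        // Time allowed for synchronous operations such as HGET, in milliseconds.
        // 5000 is an illustrative value, not a recommendation.
        options.SyncTimeout = 5000;

        return ConnectionMultiplexer.Connect(options);
    }
}
```

The same option can also be given in the connection string, for example `localhost:6379,syncTimeout=5000`. A longer timeout only hides the symptom if the real cost is the number of round trips, so it is probably best combined with fewer reads and writes.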
My question here has two parts. First, does Rol have any defaults or preferences for those Redis settings? Second, it appears Rol currently retrieves items lazily and writes them as each property is set; is there any way to change that behavior so it retrieves on load and writes at a predetermined point?
That's not necessarily the solution; I'm just looking for what would be a better practice here.
After speaking with JSONPCares on Twitter, I am going to do two things:
- Implement caching
- Divorce the Rol objects from the objects I process: read once at load, cache in memory, and write when I'm done. This should result in fewer reads from Redis and fewer writes to Redis than writing on every property setter. It also helps me make sure the operations are atomic.
Right now it appears that Rol's writes are not atomic: you can write a property and have it saved to Redis, then try to write the next property and have it fail, leaving the object in an inconsistent state. This is fine, and a good design choice for what Rol's purpose is. I'm just abusing Rol because I'm lazy and it's awesome.
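The plan above could look roughly like the sketch below, which talks to StackExchange.Redis directly instead of going through Rol. The `UserStats` and `UserStatsCache` types, the `stats:user:` key layout, and the field names are all hypothetical; they are not Rol's types or Rol's storage format. Each user is read with one HGETALL on first access, kept in a dictionary, and written back inside a single MULTI/EXEC transaction so the queued writes are applied together.

```csharp
using System.Collections.Generic;
using System.Linq;
using StackExchange.Redis;

// Hypothetical plain object used while processing; not a Rol interface.
public sealed class UserStats
{
    public long AnswerCount;
    public long QuestionCount;
}

public sealed class UserStatsCache
{
    private readonly IDatabase _db;
    private readonly Dictionary<string, UserStats> _cache =
        new Dictionary<string, UserStats>();

    public UserStatsCache(IDatabase db) { _db = db; }

    // Hypothetical key layout; Rol uses its own naming scheme.
    private static RedisKey KeyFor(string userId) => "stats:user:" + userId;

    // Returns the cached object, loading it from Redis only on first access.
    public UserStats Get(string userId)
    {
        if (_cache.TryGetValue(userId, out var stats)) return stats;

        // One HGETALL per user instead of one HGET per property read.
        var fields = _db.HashGetAll(KeyFor(userId))
                        .ToDictionary(e => (string)e.Name, e => e.Value);

        stats = new UserStats
        {
            AnswerCount = fields.TryGetValue("answers", out var a) ? (long)a : 0,
            QuestionCount = fields.TryGetValue("questions", out var q) ? (long)q : 0,
        };
        _cache[userId] = stats;
        return stats;
    }

    // Writes every cached user back in one transaction.
    // Returns false if the transaction was not committed.
    public bool Flush()
    {
        var tran = _db.CreateTransaction();
        foreach (var pair in _cache)
        {
            // Commands are queued here and only sent when Execute() runs.
            _ = tran.HashSetAsync(KeyFor(pair.Key), new[]
            {
                new HashEntry("answers", pair.Value.AnswerCount),
                new HashEntry("questions", pair.Value.QuestionCount),
            });
        }
        return tran.Execute();
    }
}
```

Usage would be something like `var cache = new UserStatsCache(redis.GetDatabase());`, then `cache.Get(id).AnswerCount++` for each XML row, and `cache.Flush()` once the file is done. For a full data dump, one transaction covering every user may be too large, so flushing in batches (per file, or every N users) is probably a better trade-off, at the cost of atomicity only within each batch.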