JasonPunyon/Rol

Handling lots of reads and writes with Rol and Redis

I'm using Rol and Redis as the primary data store for stats I'm calculating from the Stack Exchange data dump.

Currently, the flow is:

Get a user, update that user's properties in place, and then move on to the next user. Because I'm parsing XML files, and each row can belong to a different user, I end up retrieving the same user thousands of times through Rol.

This works well enough, but after a few hours of running I get a timeout from Redis:

Unhandled Exception: System.TimeoutException: Timeout performing HGET /IPostStats/photo.stackexchange.com-918603, inst: 1, mgr: ExecuteSelect, err: never, queue: 2, qu: 0, qs: 2, qc: 0, wr: 0, wq: 0, in: 0, ar: 0, IOCP: (Busy=0,Free=1000,Min=8,Max=1000), WORKER: (Busy=0,Free=32767,Min=8,Max=32767), clientName: DOR
   at StackExchange.Redis.ConnectionMultiplexer.ExecuteSyncImpl[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\ConnectionMultiplexer.cs:line 1927
   at StackExchange.Redis.RedisBase.ExecuteSync[T](Message message, ResultProcessor`1 processor, ServerEndPoint server) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\RedisBase.cs:line 80
   at StackExchange.Redis.RedisDatabase.HashGet(RedisKey key, RedisValue hashField, CommandFlags flags) in c:\TeamCity\buildAgent\work\3ae0647004edff78\StackExchange.Redis\StackExchange\Redis\RedisDatabase.cs:line 118
   at Rol.RedisOperations.GetHashValue[TKey,TValue](Store store, RedisKey hashName, TKey field)
   at StackStats.Stats.Extensions.UserExtensions.GetAverageAnswersPerDay(IUser user) in c:\projects\shift\src\StatPersistence\Stats\Extensions\UserExtensions.cs:line 28
   at XmlProcessor.UserProcessor.Process(XElement element, String siteName) in c:\projects\shift\src\XmlProcessor\PostProcessor.cs:line 53
   at XmlProcessor.DocumentProcessor.processFile(IProcessor processor, XmlReader reader, String siteName) in c:\projects\shift\src\XmlProcessor\DocumentProcessor.cs:line 76
   at XmlProcessor.DocumentProcessor.ProcessXmlDocument(String xmlFile) in c:\projects\shift\src\XmlProcessor\DocumentProcessor.cs:line 37
   at StatRunner.Program.Main(String[] args) in c:\projects\shift\src\StatRunner\Program.cs:line 28

According to StackExchange.Redis issue #83, I should increase the timeout when working with Redis (to give it more time to write data to disk before handing it more operations).

I have two questions. Does Rol have any defaults or preferences for those Redis settings? And since Rol currently appears to retrieve items lazily and write them on every set, is there any way to change that behavior so it retrieves on load and writes at a predetermined point?
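For reference, here is a minimal sketch of raising the synchronous timeout on the StackExchange.Redis side. The endpoint `localhost:6379` and the 10000 ms value are placeholders I picked for illustration, and how the resulting multiplexer gets handed to Rol is not something this thread establishes.

```csharp
using StackExchange.Redis;

// "localhost:6379" is a placeholder endpoint; 10000 ms is an arbitrary example value.
var options = ConfigurationOptions.Parse("localhost:6379");

// Milliseconds allowed for synchronous operations (such as the HGET in the trace above).
options.SyncTimeout = 10000;

var redis = ConnectionMultiplexer.Connect(options);
```

The same setting can also be written into the connection string as `syncTimeout=10000`.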

That's not necessarily the solution; I'm just looking for what would be a better practice here.

After speaking with JSONPCares on Twitter, I am going to do two things:

  • Implement caching
  • Divorce the Rol objects from the objects I process: read once at load, cache in memory, and write when I'm done. This should mean fewer reads from Redis and fewer writes to Redis than writing on every property setter. It also helps me make sure the operations are atomic.
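Roughly what I have in mind for the two points above, as a sketch only. `UserStats`, `WriteBackCache`, and the `load`/`save` delegates are hypothetical names I made up; the delegates stand in for whatever reads from and writes to Rol, so nothing here depends on Rol's actual API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical plain object, detached from any Rol interface.
public sealed class UserStats
{
    public int AnswerCount { get; set; }
    public int DaysActive { get; set; }
}

// Loads each key at most once, keeps it in memory, and writes back only on Flush.
public sealed class WriteBackCache
{
    private readonly Dictionary<string, UserStats> _items = new Dictionary<string, UserStats>();
    private readonly Func<string, UserStats> _load;   // assumed: reads one user from the store
    private readonly Action<string, UserStats> _save; // assumed: writes one user to the store

    public WriteBackCache(Func<string, UserStats> load, Action<string, UserStats> save)
    {
        _load = load;
        _save = save;
    }

    public UserStats Get(string key)
    {
        // Only the first request for a key reaches the store.
        if (!_items.TryGetValue(key, out var stats))
        {
            stats = _load(key);
            _items[key] = stats;
        }
        return stats;
    }

    public void Flush()
    {
        // One write per cached user, at a point the caller chooses.
        foreach (var pair in _items)
        {
            _save(pair.Key, pair.Value);
        }
    }
}
```

The processing loop would call `Get` for every XML row and mutate the returned object in memory, then call `Flush` once per file (or per batch). The trade-off is that anything still unflushed is lost if the process dies midway.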

Right now it appears that Rol's writes are not atomic: you can write a property and have it saved to Redis, then try to write the next property and have it fail, leaving the object in an inconsistent state. This is fine, and a good design choice for what Rol's purpose is. I'm just abusing Rol because I'm lazy and it's awesome.
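For the write side, one option when talking to StackExchange.Redis directly is to send every field of an object in a single hash write, since Redis executes a single command atomically. This is only a sketch: the endpoint, the key `stats:user:123`, and the field names are made up, and Rol's real hash layout may well be different.

```csharp
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379"); // placeholder endpoint
IDatabase db = redis.GetDatabase();

// Hypothetical key and field names; Rol's real hash layout may differ.
var fields = new[]
{
    new HashEntry("AnswerCount", 42),
    new HashEntry("DaysActive", 7),
};

// One command carries every field, so the hash is never left half-updated.
db.HashSet("stats:user:123", fields);
```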