bdeggleston/kickboxer

work out how to GC old instances


Currently, command instances are kept around forever, which will become a problem pretty quickly.

Giving an instance a time-to-die after it's been executed would be pretty simple.

some things to consider:

  • if a replica is removed from the cluster for some period of time and then rejoins, it will need to learn about every instance it missed while it was gone, without exception. Missed messages could be considered a separate problem, and solved by something like hinted handoff. However, unlike Cassandra's hinted handoff, where differences can be resolved by comparing rows, a replica can't tolerate missing even a single instance. Once the instances are gone from the other replicas, there's no way to get them back.

One way to prevent lost instances would be to require confirmation that every other replica has seen the committed version of a given instance before deleting it locally. This would add some network overhead, since replicas would constantly be telling each other that they've seen committed instance xyz. Plus, if a replica goes down for a long time, you have an unbounded number of instances you're keeping around just in case it comes back. There could be a cutoff point, though, where a node that has been gone for a specified period of time is reset and its data is streamed back to it.
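A rough sketch of what that bookkeeping might look like, in Python just for illustration (this is not kickboxer code, and `AckTracker`, `record_ack`, `can_delete` and `drop_replica` are made-up names). An instance only becomes deletable once every replica still being waited on has confirmed it, and a replica that passes the cutoff is dropped from the set because it will be reset and streamed to anyway.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AckTracker:
    """Hypothetical tracker of which replicas confirmed each committed instance."""

    replicas: Set[str]
    acks: Dict[str, Set[str]] = field(default_factory=dict)

    def record_ack(self, instance_id: str, replica: str) -> None:
        # Ignore confirmations from nodes we are no longer waiting on.
        if replica not in self.replicas:
            return
        self.acks.setdefault(instance_id, set()).add(replica)

    def can_delete(self, instance_id: str) -> bool:
        # Deletable only when the confirmed set covers every tracked replica.
        return self.acks.get(instance_id, set()) >= self.replicas

    def drop_replica(self, replica: str) -> None:
        # Replica passed the cutoff: it will be reset, so stop waiting on it.
        self.replicas.discard(replica)


tracker = AckTracker(replicas={"a", "b", "c"})
tracker.record_ack("xyz", "a")
tracker.record_ack("xyz", "b")
waiting_on_c = not tracker.can_delete("xyz")  # "c" has not confirmed yet
tracker.drop_replica("c")                     # "c" was gone too long
deletable_after_cutoff = tracker.can_delete("xyz")
```

The acks map is the unbounded part: it grows for as long as any tracked replica stays silent, which is why the cutoff matters.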

Although once you start doing that, you may as well have a time-to-die time. If that matched the replica reset time, that would be ok. Keeping around an hour or so of queries probably wouldn't be the end of the world, as long as you're not keeping them in memory.
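For comparison, the time-to-die version is much smaller. Again this is only an illustrative Python sketch with made-up names (`REPLICA_RESET_SECONDS`, `expired_instances`), and the one hour value is just the "hour or so" guess from above, not a tuned number. It assumes each instance records the time it was executed, and that the time-to-die equals the replica reset time.

```python
from typing import Dict, List

# Assumption: time-to-die matches the replica reset time, roughly an hour.
REPLICA_RESET_SECONDS = 60 * 60


def expired_instances(
    executed_at: Dict[str, float],
    now: float,
    ttl: float = REPLICA_RESET_SECONDS,
) -> List[str]:
    """Return ids of instances executed at least `ttl` seconds before `now`."""
    return sorted(iid for iid, ts in executed_at.items() if now - ts >= ttl)


# Timestamps are seconds on whatever clock the replica uses.
executed = {"old": 0.0, "recent": 3000.0}
to_delete = expired_instances(executed, now=3600.0)
```

Any replica that has been gone longer than the time-to-die gets reset rather than caught up, so nothing deleted by this sweep can still be needed.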