exalearn/colmena

Autoeviction for Task Inputs


We don't auto-evict data from ProxyStore for the inputs of completed tasks, which can lead to excessive memory or disk use.

We could...

  • Auto-evict the inputs when the compute worker reads them, which will break task restarting
  • Auto-evict the inputs on the worker after the task completes
  • ...
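
A rough sketch of the second option, to make the restart concern concrete. `ToyStore` and `run_task` are hypothetical stand-ins written for this example (a dict-backed store, not the ProxyStore API or Colmena's worker code); the only point is that eviction happens after the function returns, so a failed task still has its inputs available for a retry.

```python
import itertools


class ToyStore:
    """Hypothetical in-memory stand-in for an object store (not ProxyStore)."""

    def __init__(self):
        self._data = {}
        self._ids = itertools.count()

    def put(self, obj):
        key = f"key-{next(self._ids)}"
        self._data[key] = obj
        return key

    def get(self, key):
        return self._data[key]

    def evict(self, key):
        self._data.pop(key, None)

    def exists(self, key):
        return key in self._data


def run_task(store, fn, input_keys):
    """Resolve inputs, run the task, and evict inputs only on success."""
    args = [store.get(key) for key in input_keys]
    result = fn(*args)  # if this raises, nothing below runs and inputs survive
    for key in input_keys:
        store.evict(key)
    return result
```

Evicting on read would instead drop the data before `fn` runs, which is why a restarted task would find nothing to resolve.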

Some other ideas:

  • Add a callback to Result.__del__ that evicts all the associated proxies. Not my favourite idea, because one could pickle a Result and send it to another process; the copy on the original process would then be garbage collected and evict the data while the other process still needs it.
  • Add a callback to the Future of the task result.
    • Would address the task restarting problem.
    • The decision to evict the inputs could be tied to the keep_inputs flag. This would mean that input eviction is done on the task server, and value eviction is done when the value is accessed by the Thinker.
    • We would need to make it clear in some way that if keep_inputs=True then managing input proxies is up to the user (and potentially add a convenience callback like Result.cleanup() or something to clean up those associated proxies).
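
To illustrate the Future-callback idea, here is a minimal sketch using `concurrent.futures` from the standard library. `ToyStore`, `evict_inputs`, and `submit` are hypothetical names made up for this example and do not reflect the Colmena task server or the ProxyStore API; only `keep_inputs` comes from the discussion above, and its exact semantics here are an assumption.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor
from functools import partial


class ToyStore:
    """Hypothetical in-memory stand-in for an object store (not ProxyStore)."""

    def __init__(self):
        self._data = {}
        self._ids = itertools.count()

    def put(self, obj):
        key = f"key-{next(self._ids)}"
        self._data[key] = obj
        return key

    def get(self, key):
        return self._data[key]

    def evict(self, key):
        self._data.pop(key, None)

    def exists(self, key):
        return key in self._data


def evict_inputs(store, input_keys, future):
    """Done-callback: evict inputs only if the task finished successfully."""
    if future.cancelled() or future.exception() is not None:
        return  # keep the inputs so the task can be restarted
    for key in input_keys:
        store.evict(key)


def submit(pool, store, fn, input_keys, keep_inputs=False):
    """Submit a task; attach the eviction callback unless keep_inputs is set."""
    future = pool.submit(lambda: fn(*(store.get(k) for k in input_keys)))
    if not keep_inputs:
        future.add_done_callback(partial(evict_inputs, store, input_keys))
    return future
```

Because the callback runs on the submitting side and only after the Future resolves, a failed or cancelled task keeps its inputs. With `keep_inputs=True` no callback is attached, so cleanup is left to the user, which is where something like the proposed Result.cleanup() would fit.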