jmejia8/Metaheuristics.jl

Save intermediate populations to file to enable restarting

jdossgollin opened this issue · 2 comments

This is a feature request, and I'm not sure whether you're interested in including it in your package.

For optimizing heavy-ish functions, sometimes computer stuff happens. It would be great to have an easy API for saving intermediate populations to file so that if computation is interrupted, it's possible to pick up from about where it left off.

I think this should be pretty easy to do and I'm happy to try to help out with a first stab PR (may take a while, next few weeks are busy, and will probably need some additional work) but wanted to check in on what that API and implementation might look like.

API: this seems like it might go in Options(). I'm not sure what the keyword arguments should be named, but it seems like there should be an option to save results every N iterations (defaults to Inf/missing/similar that gives current behavior). Additionally, I think making the user define the filename to use for caching would make sense.

Implementation: Every N iterations, use JLD2 to save the result from optimize. optimize would need to check whether a saved file already exists; to load from that file, an approach similar to the one given in https://jmejia8.github.io/Metaheuristics.jl/stable/examples/#Providing-Initial-Solutions should work.
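To make the idea concrete, here is a minimal sketch of the save-every-N-iterations loop. It is not Metaheuristics.jl code: `random_search`, `save_every` and `checkpoint` are hypothetical names, the optimizer is a toy random search, and Julia's standard-library Serialization is swapped in for JLD2 so the snippet has no external dependencies.

```julia
using Random, Serialization

# Toy objective; stands in for an expensive function.
sphere(x) = sum(abs2, x)

# Hypothetical helper (not part of Metaheuristics.jl): a random search
# that writes its population to `checkpoint` every `save_every` iterations.
function random_search(f, dim; iterations = 100, popsize = 20,
                       save_every = 10, checkpoint = "checkpoint.jls",
                       rng = Random.default_rng())
    if isfile(checkpoint)
        # Resume from the saved iteration counter and population.
        start, population = deserialize(checkpoint)
    else
        start, population = 0, [randn(rng, dim) for _ in 1:popsize]
    end
    for iter in (start + 1):iterations
        # Perturb each individual and keep the better of old and new.
        for i in eachindex(population)
            candidate = population[i] .+ 0.1 .* randn(rng, dim)
            if f(candidate) < f(population[i])
                population[i] = candidate
            end
        end
        if iter % save_every == 0
            # Write to a temporary file first so that an interruption
            # during the write cannot corrupt the previous checkpoint.
            tmp = checkpoint * ".tmp"
            serialize(tmp, (iter, population))
            mv(tmp, checkpoint; force = true)
        end
    end
    return population
end
```

Calling `random_search(sphere, 3; checkpoint = "run.jls")` a second time after an interruption would pick up from the last saved iteration instead of starting over.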

Any implementation of this should probably coordinate with #39

Hi! Sure any kind of contribution is appreciated.

An easy way to save the current status of an optimization process would be great; however, restarting the optimization (after an unexpected stop) requires some consideration:

  1. Some algorithms adapt their parameters automatically, so we need to save those adapted parameters.
  2. Is it necessary to save information about the optimization problem as well?
  3. How do we restore the random number generator when the user has fixed the seed?
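One possible answer to points 1 and 3, sketched under assumptions: store the adapted parameters and a copy of the whole generator (not just the seed) alongside the population. The `Checkpoint` struct and its fields are hypothetical, Serialization from the standard library is used in place of BSON or JLD2, and `Xoshiro` assumes Julia 1.7 or later.

```julia
using Random, Serialization

# Hypothetical container for everything needed to resume a run.
struct Checkpoint
    iteration::Int
    population::Vector{Vector{Float64}}
    parameters::Dict{Symbol,Float64}   # adapted algorithm parameters
    rng::Xoshiro                       # full generator state, not just the seed
end

rng = Xoshiro(42)                      # seed fixed by the user
population = [rand(rng, 2) for _ in 1:5]

# copy(rng) snapshots the generator at this point of the run.
state = Checkpoint(10, population, Dict(:step_size => 0.05), copy(rng))

path = tempname()
serialize(path, state)

expected = rand(rng, 3)                # what the uninterrupted run draws next
restored = deserialize(path)
resumed = rand(restored.rng, 3)        # what the restarted run draws next
```

Because the saved generator carries its full internal state, `resumed` and `expected` should be the same numbers, so a restarted run would continue the same random stream rather than replaying it from the seed.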

I would prefer to use BSON because it is lighter and faster than JLD2, as far as I remember.

Regarding your idea, I have two possible implementations for you:

1. Using Options:

method = ECA(options = Options(save_every = 10)) # to save every 10 generations/iterations
optimize(f, bounds, method)

A possible autogenerated file structure:

├── saving
│   ├── ECA
│   │   ├── data-04-dec-2022-at-09-47-56.bson
├── optimizing-heavy-ish-functions.jl

The BSON file would store a dictionary containing everything held in method. For this, we would probably need to implement a recursive struct-to-Dict conversion.
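A recursive struct-to-Dict conversion could look roughly like the sketch below. `to_dict`, `ToyParameters` and `ToyMethod` are hypothetical names used only for illustration; real optimizer structs may hold fields (functions, RNGs) that need special handling.

```julia
# Hypothetical helper: recursively convert a struct into nested Dicts.
# Plain values are returned unchanged.
to_dict(x::Union{Number,AbstractString,Symbol,Nothing}) = x
# Containers are converted element by element.
to_dict(x::AbstractArray) = map(to_dict, x)
to_dict(x::Tuple) = map(to_dict, x)
to_dict(x::AbstractDict) = Dict(k => to_dict(v) for (k, v) in x)
# Fallback: any other struct becomes a Dict keyed by its field names.
function to_dict(x::T) where {T}
    isstructtype(T) || return x
    return Dict{Symbol,Any}(name => to_dict(getfield(x, name))
                            for name in fieldnames(T))
end

# Toy types standing in for an optimizer and its parameters.
struct ToyParameters
    step_size::Float64
    weights::Vector{Float64}
end

struct ToyMethod
    name::String
    parameters::ToyParameters
end

d = to_dict(ToyMethod("toy", ToyParameters(0.5, [1.0, 2.0])))
```

Here `d[:parameters]` is itself a Dict, so the result contains only dictionaries, arrays and plain values.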

To restart, the API would look for a saved file in saving/ECA/, so the code in optimizing-heavy-ish-functions.jl would not need to change.


2. Wrapping the optimizers:

The idea here is to define a wrapper BackUp(method) that saves everything in method. For example:

solver = BackUp(ECA(), every = 10, at = "my-back-up-file.bson")
optimize(f, bounds, solver)

Here, the API would check whether the file given in at (my-back-up-file.bson) already exists and, if so, try to restart the optimization process from it. If the file does not exist, ECA starts as usual.

Names for the possible wrapper: "BackUp", "Save", "StoreTraces", ...
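As a rough illustration of the wrapper idea, the sketch below wraps a toy solver rather than a real Metaheuristics.jl algorithm. `ToySolver`, `step!` and `run!` are hypothetical, `BackUp` only mirrors the proposed `every`/`at` keywords, and Serialization from the standard library stands in for BSON.

```julia
using Serialization

# Toy solver used only for illustration: halves its value at every step.
mutable struct ToySolver
    iteration::Int
    value::Float64
end
step!(s::ToySolver) = (s.iteration += 1; s.value /= 2; s)

# Hypothetical wrapper mirroring the BackUp(method, every = ..., at = ...) idea.
struct BackUp{S}
    solver::S
    every::Int
    at::String
end
BackUp(solver; every = 10, at = "backup.jls") = BackUp(solver, every, at)

function run!(b::BackUp; iterations = 100)
    # Restart from the backup file if it exists, otherwise start as usual.
    solver = isfile(b.at) ? deserialize(b.at) : b.solver
    while solver.iteration < iterations
        step!(solver)
        # Save the whole solver state every `b.every` iterations.
        solver.iteration % b.every == 0 && serialize(b.at, solver)
    end
    return solver
end
```

With this shape the user's script stays the same across restarts: `run!(BackUp(ToySolver(0, 1.0), every = 5, at = "toy.jls"))` either starts fresh or continues from whatever is stored in `toy.jls`.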