google/orbax

Custom strategy of keeping checkpoints

Karina1997 opened this issue · 1 comment

Hello team,

I set up checkpointing every 10 steps.

But I need to keep the most recent checkpoint whose step is a multiple of 1000 and the most recent checkpoint whose step is a multiple of 100, together with the 2 most recent checkpoints from the regular 10-step schedule.

For example:
At step 2575 I want to have checkpoints from steps 2000 (the latest multiple of 1000), 2500 (the latest multiple of 100), and 2570 and 2560 (the 2 most recent multiples of 10).

Could you please tell me how I can implement this custom logic?

It is probably easiest to subclass CheckpointManager and override the should_save method. That will let you customize the saving behavior on particular steps.
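
The retention rule from the question can be sketched separately from Orbax. The snippet below is plain Python and does not call any Orbax API. The function names and parameters (select_steps_to_keep, select_steps_to_delete, n_recent, milestones) are hypothetical and chosen only for illustration. It assumes the 2 "recent" checkpoints are simply the 2 highest existing steps, so a milestone step can also count as one of them. How to hook this into a CheckpointManager subclass depends on the Orbax version and is not shown.

```python
from typing import Iterable, List, Sequence


def select_steps_to_keep(
    existing_steps: Iterable[int],
    n_recent: int = 2,                       # hypothetical parameter
    milestones: Sequence[int] = (100, 1000),  # hypothetical parameter
) -> List[int]:
    """Return the sorted subset of existing_steps that should be retained."""
    steps = sorted(set(existing_steps))
    # Keep the n_recent highest steps (guard against steps[-0:] returning all).
    keep = set(steps[-n_recent:]) if n_recent > 0 else set()
    # For each milestone period, keep the latest step that is a multiple of it.
    for period in milestones:
        multiples = [s for s in steps if s % period == 0]
        if multiples:
            keep.add(multiples[-1])
    return sorted(keep)


def select_steps_to_delete(existing_steps: Iterable[int], **kwargs) -> List[int]:
    """Return the sorted steps that are not selected by select_steps_to_keep."""
    steps = set(existing_steps)
    return sorted(steps - set(select_steps_to_keep(steps, **kwargs)))


if __name__ == "__main__":
    # Simulate saving every 10 steps and pruning after each save.
    kept: List[int] = []
    for step in range(10, 2576, 10):
        kept = select_steps_to_keep(kept + [step])
    print(kept)
```

Pruning after every save is enough here, because the latest multiple of each milestone is never dropped until a newer multiple has been saved.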