_validate_snapshot_available() failing although torchsnapshot is available
Describe the bug
When running my code with torchtnt and the TorchSnapshotSaver (torchsnapshot_saver.py), I get the following error after construction of the class:
RuntimeError: TorchSnapshotSaver support requires torchsnapshot. Please make sure ``torchsnapshot`` is installed. Installation: https://github.com/pytorch/torchsnapshot#install
The error is raised by _validate_snapshot_available().
However, torchsnapshot can be imported.
Versions
I tried installing torchsnapshot and torchtnt from conda, from PyPI, and directly from the GitHub repos. I get the same error every time.
I also ran into this.
It seems that torchsnapshot_saver.py is importing override_max_per_rank_io_concurrency from torchsnapshot.knobs, which is only available on the main branch and not in the 0.1.0 release.
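To confirm this diagnosis locally, a small check along the following lines may help. It is only a sketch: check_symbol is a hypothetical helper written for this thread, not part of torchtnt or torchsnapshot. It reports whether a module can be imported and, optionally, whether it exposes a given name.

```python
import importlib
from typing import Optional


def check_symbol(module_name: str, attr: Optional[str] = None) -> str:
    """Report whether `module_name` imports and, if given, exposes `attr`.

    Hypothetical diagnostic helper; not part of torchtnt or torchsnapshot.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # Covers a missing package as well as a missing submodule.
        return f"cannot import {module_name}: {exc}"
    if attr is not None and not hasattr(module, attr):
        return f"{module_name} imported, but it has no attribute {attr!r}"
    return "ok"


if __name__ == "__main__":
    # The top-level package, which the report says imports fine.
    print(check_symbol("torchsnapshot"))
    # The symbol that torchsnapshot_saver.py is reported to import.
    print(
        check_symbol(
            "torchsnapshot.knobs", "override_max_per_rank_io_concurrency"
        )
    )
```

If the first check prints "ok" and the second does not, the installed torchsnapshot is present but predates the knobs symbol, which would match the misleading "requires torchsnapshot" error.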
Perhaps the simplest solution is to release another version of torchsnapshot and constrain torchtnt to depend on that.
Edit: In the short term, installing torchsnapshot with pip install --pre torchsnapshot-nightly worked for me.