
Training loop broken

ferdinandl007 opened this issue · 2 comments

When I try to run the model, for example with:
python train.py spacetimeformer toy2 --run_name spatiotemporal_toy2 \
    --d_model 100 --d_ff 400 --enc_layers 2 --dec_layers 2 \
    --gpus 0 --batch_size 32 --start_token_len 4 --n_heads 4 \
    --grad_clip_norm 1 --early_stopping --trials 1

Training crashes immediately.

pytorch_lightning.utilities.exceptions.MisconfigurationException: You are trying to self.log() but it is not managed by

I had the same issue. I assume your version of pytorch-lightning is too new. I downgraded with
pip install pytorch-lightning==1.5
and that fixed the issue.
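
For anyone who wants to check their environment before downgrading, here is a minimal sketch that reads the installed pytorch-lightning version and compares it with 1.5. The helper names (parse_major_minor, is_known_good, check_lightning) are hypothetical and not part of this repo. Treating everything up to 1.5.x as fine is an assumption based only on the downgrade above; the thread does not say which release introduced the breaking change.

```python
import re
from importlib import metadata
from typing import Tuple

# Assumption: 1.5 is the version reported to work in this thread.
KNOWN_GOOD: Tuple[int, int] = (1, 5)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.5.0' or '1.5.0rc1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_known_good(version: str, known_good: Tuple[int, int] = KNOWN_GOOD) -> bool:
    """True if the version is at or below the release reported to work."""
    return parse_major_minor(version) <= known_good


def check_lightning() -> str:
    """Report whether the installed pytorch-lightning is at or below 1.5."""
    try:
        installed = metadata.version("pytorch-lightning")
    except metadata.PackageNotFoundError:
        return "pytorch-lightning is not installed"
    if is_known_good(installed):
        return f"pytorch-lightning {installed}: at or below 1.5"
    return (
        f"pytorch-lightning {installed}: newer than 1.5, "
        "consider `pip install pytorch-lightning==1.5`"
    )


if __name__ == "__main__":
    print(check_lightning())
```

Running the script before train.py shows whether the downgrade is likely to be needed.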

Thanks @AlfredT15, yeah, this is a Lightning versioning issue. The logging system seems to change often, especially with data-parallel training. I think I've already had to fix it in the public version since I originally wrote the code last year.

I am doing some follow-up work on this project, and I'll make sure that when that code comes out it's compatible with the latest PyTorch Lightning.