error in train with ptlflow_demo_train.ipynb
Cyrus1993 opened this issue · 2 comments
Hi, when I run this code in Colab, I get the following error. Please advise; I did not change anything.
!python train.py raft_small \
  --gpus 1 \
  --train_dataset overfit-sintel \
  --val_dataset none \
  --train_batch_size 1 \
  --max_epochs 100 \
  --lr 1e-3
ERROR: torch_scatter not found. CSV requires torch_scatter library to run. Check instructions at: https://github.com/rusty1s/pytorch_scatter
Global seed set to 1234
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
01/02/2022 08:49:01 - WARNING: --train_crop_size is not set. It will be set as (432, 1024).
01/02/2022 08:49:01 - INFO: Loading 1 samples from Sintel_clean dataset.
Traceback (most recent call last):
File "train.py", line 152, in <module>
train(args)
File "train.py", line 111, in train
trainer.fit(model)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 732, in fit
self._fit_impl, model, train_dataloaders, val_dataloaders, datamodule, ckpt_path
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 682, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 768, in _fit_impl
results = self._run(model, ckpt_path=ckpt_path)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1155, in _run
self.strategy.setup(self)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/single_device.py", line 76, in setup
super().setup(trainer)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 118, in setup
self.setup_optimizers(trainer)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/strategies/strategy.py", line 108, in setup_optimizers
self.lightning_module
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/core/optimizer.py", line 174, in _init_optimizers_and_lr_schedulers
optim_conf = model.trainer._call_lightning_module_hook("configure_optimizers", pl_module=model)
File "/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/trainer.py", line 1535, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/ptlflow/models/base_model/base_model.py", line 358, in configure_optimizers
optimizer, self.args.lr, total_steps=self.args.max_steps, pct_start=0.05, cycle_momentum=False, anneal_strategy='linear')
File "/usr/local/lib/python3.7/dist-packages/torch/optim/lr_scheduler.py", line 1452, in __init__
raise ValueError("Expected positive integer total_steps, but got {}".format(total_steps))
ValueError: Expected positive integer total_steps, but got -1
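For context, here is an illustrative sketch (not the actual ptlflow code) of the failure mode and a defensive fix: newer PyTorch Lightning versions default `Trainer.max_steps` to `-1` instead of `None`, and passing that straight to `OneCycleLR(total_steps=...)` triggers exactly this `ValueError`. The helper name `resolve_total_steps` is hypothetical.

```python
# Sketch of the bug and a guard against it. Assumption: the scheduler needs a
# positive total_steps, but PyTorch Lightning now reports max_steps as -1 when
# the user only sets --max_epochs.
def resolve_total_steps(max_steps, max_epochs, steps_per_epoch):
    """Return a positive step count suitable for OneCycleLR's total_steps."""
    if max_steps is not None and max_steps > 0:
        return max_steps
    # Fall back to deriving the count from the epoch-based settings.
    return max_epochs * steps_per_epoch

# With the command above (--max_epochs 100, 1 training sample, batch size 1):
print(resolve_total_steps(-1, 100, 1))  # 100, instead of the invalid -1
```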
Hi, thank you for reporting this error.
This was caused by a change in the default value of max_steps in newer versions of PyTorch Lightning. The code in this repo was already updated, but the version on PyPI was not.
I just pushed a new version to PyPI, and the Colab example should work now. If it doesn't, please check whether the installed ptlflow version is 0.2.5. If it isn't, you can try forcing the new version with pip install --upgrade ptlflow.
I hope that helps.
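To confirm the upgrade took effect in a Colab cell, a small helper like the following can check the installed version (this is an illustrative snippet, not part of ptlflow; it assumes plain "X.Y.Z" version strings):

```python
# Check whether an installed package meets a minimum version, e.g. to verify
# that `pip install --upgrade ptlflow` actually pulled in 0.2.5 or later.
from importlib.metadata import version, PackageNotFoundError

def at_least(package, minimum):
    try:
        installed = version(package)
    except PackageNotFoundError:
        return False
    # Simple numeric tuple comparison; fine for plain "X.Y.Z" versions.
    return tuple(map(int, installed.split("."))) >= tuple(map(int, minimum.split(".")))

# Usage in Colab: at_least("ptlflow", "0.2.5") should return True after the upgrade.
```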
Hi, excellent. It works perfectly now.
Thank you so much for developing this awesome repository.