QData/spacetimeformer

`configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}


When running `python ./spacetimeformer/train.py spacetimeformer mnist --embed_method spatio-temporal --local_self_attn full --local_cross_attn full --global_self_attn full --global_cross_attn full --run_name mnist_spatiotemporal --context_points 10 --gpus 0`
I get this error:

Traceback (most recent call last):
  File "./spacetimeformer/train.py", line 869, in <module>
    main(args)
  File "./spacetimeformer/train.py", line 849, in main
    trainer.fit(forecaster, datamodule=data_module)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 771, in fit
    self._call_and_handle_interrupt(
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 724, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 812, in _fit_impl
    results = self._run(model, ckpt_path=self.ckpt_path)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1218, in _run
    self.strategy.setup(self)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/strategies/dp.py", line 70, in setup
    super().setup(trainer)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 139, in setup
    self.setup_optimizers(trainer)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 128, in setup_optimizers
    self.optimizers, self.lr_scheduler_configs, self.optimizer_frequencies = _init_optimizers_and_lr_schedulers(
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 203, in _init_optimizers_and_lr_schedulers
    _configure_schedulers_automatic_opt(lr_schedulers, monitor)
  File "/home/pdy265/.conda/envs/spacetimeformer/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 318, in _configure_schedulers_automatic_opt
    raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: `configure_optimizers` must include a monitor when a `ReduceLROnPlateau` scheduler is used. For example: {"optimizer": optimizer, "lr_scheduler": scheduler, "monitor": "metric_to_track"}

How should I deal with it?
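For reference, the exception message itself spells out the return format Lightning expects: when a `ReduceLROnPlateau`-style scheduler is used, `configure_optimizers` must pair it with a `monitor` key. A minimal, generic sketch (the metric name "val/loss" is only a placeholder for whatever metric the model actually logs), before the project-specific fix further down:

def configure_optimizers(self):
    optimizer = torch.optim.AdamW(self.parameters(), lr=1e-3)
    # Any plateau-style scheduler needs a monitored metric.
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, patience=3)
    return {
        "optimizer": optimizer,
        # "monitor" tells Lightning which logged metric drives the scheduler.
        "lr_scheduler": {"scheduler": scheduler, "monitor": "val/loss"},
    }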

I am facing the same problem, and the environment seems to have the packages at the proper versions:

absl-py 2.1.0
aiohttp 3.9.3
aiosignal 1.3.1
antlr4-python3-runtime 4.9.3
appdirs 1.4.4
async-timeout 4.0.3
attrs 23.2.0
axial-positional-embedding 0.2.1
cachetools 5.3.3
certifi 2024.2.2
cftime 1.6.3
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
cmdstanpy 0.9.68
colorama 0.4.6
contourpy 1.1.1
convertdate 2.4.0
cycler 0.12.1
Cython 3.0.10
docker-pycreds 0.4.0
einops 0.7.0
filelock 3.13.3
fonttools 4.50.0
frozenlist 1.4.1
fsspec 2024.3.1
gitdb 4.0.11
GitPython 3.1.43
google-auth 2.29.0
google-auth-oauthlib 1.0.0
grpcio 1.62.1
idna 3.6
importlib-metadata 7.1.0
importlib-resources 6.4.0
Jinja2 3.1.3
joblib 1.3.2
kiwisolver 1.4.5
local-attention 1.9.0
Markdown 3.6
MarkupSafe 2.1.5
matplotlib 3.7.5
mpmath 1.3.0
multidict 6.0.5
netCDF4 1.6.5
networkx 3.1
numpy 1.24.4
nystrom-attention 0.0.11
oauthlib 3.2.2
omegaconf 2.3.0
opencv-python 4.9.0.80
opt-einsum 3.3.0
packaging 24.0
pandas 2.0.3
performer-pytorch 1.1.4
pillow 10.3.0
pip 21.1.1
protobuf 4.25.3
psutil 5.9.8
pyasn1 0.6.0
pyasn1-modules 0.4.0
pyDeprecate 0.3.2
PyMeeus 0.5.12
pyparsing 3.1.2
pystan 2.19.1.1
python-dateutil 2.9.0.post0
pytorch-lightning 1.6.0
pytz 2024.1
PyYAML 6.0.1
requests 2.31.0
requests-oauthlib 2.0.0
rsa 4.9
scikit-learn 1.3.2
scipy 1.10.1
seaborn 0.13.2
sentry-sdk 1.44.1
setproctitle 1.3.3
setuptools 56.0.0
six 1.16.0
smmap 5.0.1
spacetimeformer 1.5.0
sympy 1.12
tensorboard 2.14.0
tensorboard-data-server 0.7.2
threadpoolctl 3.4.0
torch 1.11.0+cu113
torchaudio 0.11.0+cu113
torchmetrics 0.5.1
torchvision 0.12.0+cu113
tqdm 4.66.2
typing-extensions 4.10.0
tzdata 2024.1
ujson 5.9.0
urllib3 2.2.1
wandb 0.16.6
werkzeug 3.0.2
wheel 0.43.0
yarl 1.9.4
zipp 3.18.1

I have managed to solve the problem based on the answer provided in #79 (comment)

I made some changes to that answer, since the variables should be defined on `self`. The fix is to modify `spacetimeformer_model.py` (under `spacetimeformer/spacetimeformer_model`) so that the `configure_optimizers` function now looks like this:

def configure_optimizers(self):
    # Keep the optimizer and scheduler as attributes on self so they can be
    # referenced elsewhere in the module.
    self.optimizer = torch.optim.AdamW(
        self.parameters(),
        lr=self.base_lr,
        weight_decay=self.l2_coeff,
    )
    self.scheduler = stf.lr_scheduler.WarmupReduceLROnPlateau(
        self.optimizer,
        init_lr=self.init_lr,
        peak_lr=self.base_lr,
        warmup_steps=self.warmup_steps,
        patience=3,
        factor=self.decay_factor,
    )

    # Metric the plateau scheduler should watch.
    monitor = "val/loss"

    # Return the scheduler as a dict with a "monitor" key, as the
    # MisconfigurationException asks for.
    return {
        "optimizer": self.optimizer,
        "lr_scheduler": {
            "scheduler": self.scheduler,
            "monitor": monitor,
        },
    }
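
One caveat with this fix (my understanding, not verified against the repo): the "monitor" value has to match a metric the module actually logs during validation, otherwise Lightning will complain that the monitored metric is not available. Roughly, somewhere in the validation step there should be a logging call along these lines (hypothetical illustration; the real call lives in the forecaster base class):

# log the validation loss under the same key used as "monitor"
self.log("val/loss", loss, on_epoch=True)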