mk-minchul/AdaFace

AttributeError: 'MultiStepLR' object has no attribute 'get_epoch_values'

martinenkoEduard opened this issue · 3 comments

AdaFace with the following property
self.m 0.4
self.h 0.333
self.s 64.0
self.t_alpha 0.01
Global seed set to 42
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py:441: LightningDeprecationWarning: Setting Trainer(gpus=1) is deprecated in v1.7 and will be removed in v2.0. Please use Trainer(accelerator='gpu', devices=1) instead.
rank_zero_deprecation(
Using 16bit native Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Trainer(limit_train_batches=1.0) was configured so 100% of the batches per epoch will be used..
Trainer(val_check_interval=1.0) was configured so validation will run at the end of the training epoch..
start training
making validation data memfile
[rank: 0] Global seed set to 42
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1

distributed_backend=nccl
All distributed processes registered. Starting with 1 processes

creating train dataset
record file length 490623
creating val dataset
laoding validation data memfile
laoding validation data memfile
laoding validation data memfile
laoding validation data memfile
laoding validation data memfile
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

  | Name               | Type             | Params
--------------------------------------------------
0 | model              | Backbone         | 43.6 M
1 | head               | AdaFace          | 5.4 M
2 | cross_entropy_loss | CrossEntropyLoss | 0
--------------------------------------------------
49.0 M    Trainable params
0         Non-trainable params
49.0 M    Total params
97.997    Total estimated model params size (MB)
Sanity Checking DataLoader 0: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:01<00:00, 9.94it/s]
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:233: UserWarning: You called self.log('agedb_30_num_val_samples', ...) in your validation_epoch_end but the value needs to be floating point. Converting it to torch.float32.
warning_cache.warn(
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:233: UserWarning: You called self.log('epoch', ...) in your validation_epoch_end but the value needs to be floating point. Converting it to torch.float32.
warning_cache.warn(
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:537: PossibleUserWarning: It is recommended to use self.log('agedb_30_val_acc', ..., sync_dist=True) when logging on epoch level in distributed setting to accumulate the metric across devices.
warning_cache.warn(
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:537: PossibleUserWarning: It is recommended to use self.log('agedb_30_best_threshold', ..., sync_dist=True) when logging on epoch level in distributed setting to accumulate the metric across devices.
warning_cache.warn(
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:537: PossibleUserWarning: It is recommended to use self.log('agedb_30_num_val_samples', ..., sync_dist=True) when logging on epoch level in distributed setting to accumulate the metric across devices.
warning_cache.warn(
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:537: PossibleUserWarning: It is recommended to use self.log('val_acc', ..., sync_dist=True) when logging on epoch level in distributed setting to accumulate the metric across devices.
warning_cache.warn(
/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/result.py:537: PossibleUserWarning: It is recommended to use self.log('epoch', ..., sync_dist=True) when logging on epoch level in distributed setting to accumulate the metric across devices.
warning_cache.warn(
Epoch 0: 0%| | 0/8635 [00:00<?, ?it/s]
Traceback (most recent call last):
File "main.py", line 109, in
main(args)
File "main.py", line 81, in main
trainer.fit(trainer_mod, data_mod)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 603, in fit
call._call_and_handle_interrupt(
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/call.py", line 36, in _call_and_handle_interrupt
return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 90, in launch
return function(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 645, in _fit_impl
self._run(model, ckpt_path=self.ckpt_path)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1098, in _run
results = self._run_stage()
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1177, in _run_stage
self._run_train()
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1200, in _run_train
self.fit_loop.run()
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/fit_loop.py", line 267, in advance
self._outputs = self.epoch_loop.run(self._data_fetcher)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/epoch/training_epoch_loop.py", line 214, in advance
batch_output = self.batch_loop.run(kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/batch/training_batch_loop.py", line 88, in advance
outputs = self.optimizer_loop.run(optimizers, kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/loop.py", line 199, in run
self.advance(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 200, in advance
result = self._run_optimization(kwargs, self._optimizers[self.optim_progress.optimizer_position])
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 247, in _run_optimization
self._optimizer_step(optimizer, opt_idx, kwargs.get("batch_idx", 0), closure)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 357, in _optimizer_step
self.trainer._call_lightning_module_hook(
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1342, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/core/module.py", line 1661, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/core/optimizer.py", line 169, in step
step_output = self._strategy.optimizer_step(self._optimizer, self._optimizer_idx, closure, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 281, in optimizer_step
optimizer_output = super().optimizer_step(optimizer, opt_idx, closure, model, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/strategies/strategy.py", line 234, in optimizer_step
return self.precision_plugin.optimizer_step(
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/plugins/precision/native_amp.py", line 85, in optimizer_step
closure_result = closure()
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 147, in call
self._result = self.closure(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 133, in closure
step_output = self._step_fn()
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/loops/optimization/optimizer_loop.py", line 406, in _training_step
training_step_output = self.trainer._call_strategy_hook("training_step", *kwargs.values())
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1480, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/strategies/ddp.py", line 352, in training_step
return self.model(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1156, in forward
output = self._run_ddp_forward(*inputs, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1110, in _run_ddp_forward
return module_to_run(*inputs[0], **kwargs[0]) # type: ignore[index]
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/administrator/anaconda3/envs/py388/lib/python3.8/site-packages/pytorch_lightning/overrides/base.py", line 98, in forward
output = self._forward_module.training_step(*inputs, **kwargs)
File "/home/administrator/adaface/AdaFace/train_val.py", line 78, in training_step
lr = self.get_current_lr()
File "/home/administrator/adaface/AdaFace/train_val.py", line 60, in get_current_lr
lr = scheduler.get_epoch_values(self.current_epoch)[0]
AttributeError: 'MultiStepLR' object has no attribute 'get_epoch_values'
Epoch 0: 0%| | 0/8635 [00:00<?, ?it/s]

Is this a pytorch-lightning error?

The same error occurs for me.

Use lr = scheduler.get_last_lr()[0] instead of lr = scheduler.get_epoch_values(self.current_epoch)[0]; it worked for me :)
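
For anyone applying the fix, here is a minimal sketch of what the patched get_current_lr() in train_val.py can look like. Only the get_last_lr() swap comes from this thread; the scheduler accessor via self.trainer.lr_scheduler_configs is an assumption for PyTorch Lightning >= 1.6 and may differ from the repo's actual code:

```python
# Hypothetical sketch, not the repo's exact method. Assumes PyTorch Lightning
# >= 1.6, where the Trainer exposes lr_scheduler_configs (older versions expose
# self.trainer.lr_schedulers instead).
def get_current_lr(self):
    scheduler = self.trainer.lr_scheduler_configs[0].scheduler

    # Old line (torch.optim.lr_scheduler.MultiStepLR has no get_epoch_values):
    # lr = scheduler.get_epoch_values(self.current_epoch)[0]

    # Replacement: get_last_lr() returns one learning rate per param group.
    lr = scheduler.get_last_lr()[0]
    return lr
```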

Yup. get_epoch_values() is deprecated
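
If you want to verify this locally, here is a quick standalone check in plain PyTorch (nothing AdaFace-specific; the milestone values are arbitrary) showing that MultiStepLR exposes get_last_lr() but not get_epoch_values():

```python
import torch

# Tiny dummy parameter so the optimizer has something to manage.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=0.1)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[8, 14], gamma=0.1)

print(hasattr(scheduler, "get_epoch_values"))  # False -> the AttributeError above
print(scheduler.get_last_lr())                 # [0.1] before any milestone is reached
```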