Jabb0/FastFlow3D

Accelerator='ddp' is an invalid accelerator name

Closed this issue · 6 comments

Hello again! I get an error when I try to run train.py: accelerator='ddp' is reported as an invalid accelerator name. The full error message is at the end of this issue.
My environment is:
CUDA 11.3
Python 3.10.8
PyTorch 1.12.1
PyTorch Lightning 1.8.3

I also tried the following environment and hit the same problem:
CUDA 11.3
Python 3.8.13
PyTorch 1.10.0
PyTorch Lightning 1.7.7

Can you kindly offer some suggestions? Thanks a lot and looking forward to your reply!

~/FastFlow3D-main$ python train.py --accelerator='ddp' --batch_size=16 --gpus=4 --num_workers=16 --learning_rate=0.0001 --disable_ddp_unused_check=True
No weights and biases API key set. Using tensorboard instead!
Disabling unused parameter check for DDP
Traceback (most recent call last):
File "/home/fjy/FastFlow3D-main/train.py", line 286, in <module>
cli()
File "/home/fjy/FastFlow3D-main/train.py", line 263, in cli
trainer = pl.Trainer.from_argparse_args(args,
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1917, in from_argparse_args
return from_argparse_args(cls, args, **kwargs)
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 66, in from_argparse_args
return cls(**trainer_kwargs)
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/utilities/argparse.py", line 340, in insert_env_defaults
return fn(self, **kwargs)
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 408, in __init__
self._accelerator_connector = AcceleratorConnector(
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 192, in __init__
self._check_config_and_set_final_flags(
File "/home/fjy/anaconda3/envs/fastflow/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 291, in _check_config_and_set_final_flags
raise ValueError(
ValueError: You selected an invalid accelerator name: accelerator='ddp'. Available names are: cpu, cuda, hpu, ipu, mps, tpu.
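
For context on the error above: Lightning 1.7 and 1.8 treat the hardware (`accelerator`) and the distribution method (`strategy`) as separate Trainer arguments, so a value such as 'ddp' belongs in `strategy`. Below is a minimal sketch of that split. It assumes GPU training, and the helper `split_legacy_accelerator` and its short list of legacy names are hypothetical (they are not part of Lightning or this repo).

```python
# Hypothetical helper, not part of Lightning or FastFlow3D.
# Names that older Lightning releases accepted as `accelerator` but that
# newer releases expect in `strategy` (a deliberately short, assumed list).
LEGACY_STRATEGY_NAMES = {"dp", "ddp", "ddp_spawn"}


def split_legacy_accelerator(name, hardware="gpu"):
    """Return Trainer keyword arguments for an old-style accelerator value."""
    if name in LEGACY_STRATEGY_NAMES:
        # Old style: accelerator="ddp"
        # New style: accelerator="gpu", strategy="ddp"
        return {"accelerator": hardware, "strategy": name}
    # Anything else (e.g. "cpu" or "gpu") is already a hardware name.
    return {"accelerator": name}


print(split_legacy_accelerator("ddp"))
```

On the command line used above, this would correspond to passing `--accelerator='gpu' --strategy='ddp'` rather than `--accelerator='ddp'`, provided train.py forwards both arguments to the Trainer.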

Jabb0 commented

Yeah, I think you're right! This is my first time using PyTorch Lightning, and I'm confused about how the training parameters are passed. Some essential settings, such as max_epochs and weights_save_path, are not set in run.sh or train.py. Is that related to the PyTorch Lightning version, or are these training settings configured elsewhere? Looking forward to your reply, thanks a lot!

Also, after setting accelerator='gpu' and strategy='ddp' in the trainer, a new error shows up (included at the end of this comment). Since the PyTorch Lightning version is newer, I think the plugin setting needs to be updated as well, but I'm not sure which plugin type to choose. Looking forward to your guidance, thank you!

~/FastFlow3D-main$ python train.py --accelerator='gpu' --batch_size=16 --gpus=4 --num_workers=16 --learning_rate=0.0001 --disable_ddp_unused_check=True
No weights and biases API key set. Using tensorboard instead!
Disabling unused parameter check for GPU
Traceback (most recent call last):
File "train.py", line 287, in <module>
cli()
File "train.py", line 264, in cli
trainer = pl.Trainer.from_argparse_args(args,
File "/home/fjy/anaconda3/envs/pvraft/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 2449, in from_argparse_args
return from_argparse_args(cls, args, **kwargs)
File "/home/fjy/anaconda3/envs/pvraft/lib/python3.8/site-packages/pytorch_lightning/utilities/argparse.py", line 72, in from_argparse_args
return cls(**trainer_kwargs)
File "/home/fjy/anaconda3/envs/pvraft/lib/python3.8/site-packages/pytorch_lightning/utilities/argparse.py", line 345, in insert_env_defaults
return fn(self, **kwargs)
File "/home/fjy/anaconda3/envs/pvraft/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 433, in __init__
self._accelerator_connector = AcceleratorConnector(
File "/home/fjy/anaconda3/envs/pvraft/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 193, in __init__
self._check_config_and_set_final_flags(
File "/home/fjy/anaconda3/envs/pvraft/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/accelerator_connector.py", line 327, in _check_config_and_set_final_flags
raise MisconfigurationException(
pytorch_lightning.utilities.exceptions.MisconfigurationException: Found invalid type for plugin <pytorch_lightning.strategies.ddp.DDPStrategy object at 0x7f4d4c1d08b0>. Expected one of: PrecisionPlugin, CheckpointIO, ClusterEnviroment, or LayerSync.
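
The message above indicates that a `DDPStrategy` object reached the Trainer through `plugins`, which these Lightning versions reserve for the plugin types listed in the error. One possible way around it, sketched below, is to select DDP through `strategy` and leave `plugins` unset. The helper `ddp_trainer_kwargs` is hypothetical. As far as I know, `"ddp_find_unused_parameters_false"` is a strategy alias registered in Lightning 1.6 to 1.8, but treat that as an assumption and check it against the installed version.

```python
def ddp_trainer_kwargs(num_gpus, disable_unused_check=True):
    """Build keyword arguments for pl.Trainer (hypothetical helper)."""
    # Assumed alias: "ddp_find_unused_parameters_false" selects DDP with
    # find_unused_parameters=False; plain "ddp" keeps Lightning's default.
    strategy = "ddp_find_unused_parameters_false" if disable_unused_check else "ddp"
    return {
        "accelerator": "gpu",  # hardware type
        "devices": num_gpus,   # used here in place of the older `gpus` argument
        "strategy": strategy,  # distribution method, passed here rather than via `plugins`
    }


kwargs = ddp_trainer_kwargs(num_gpus=4)
print(kwargs)

# With Lightning installed, these could be passed on as:
#   import pytorch_lightning as pl
#   trainer = pl.Trainer(**kwargs)
```

If train.py currently builds a DDP object and hands it to `plugins=`, that argument would need to be dropped (or moved to `strategy=`) for this to take effect.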

Jabb0 commented

I will look into it, thanks for your suggestions! 👍