taichi-dev/taichi-nerfs

pytorch_lightning Strategy error

tlightsky opened this issue · 2 comments

(taichi-nerfs) E:\source\py\taichi-nerfs>python train.py --root_dir ./Synthetic_NeRF/Lego --exp_name Lego --perf --num_epochs 20 --batch_size 8192 --lr 1e-2 --no_save_test --gui
[Taichi] version 1.6.0, llvm 15.0.1, commit 98574106, win, python 3.8.16
[Taichi] Starting on arch=cuda
[W 04/05/23 13:48:29.986 66204] [memory_pool.cpp:taichi::lang::MemoryPool::MemoryPool@43] Missing CUDA implementation
GridEncoding: Nmin=16 b=1.31951 F=2 T=2^19 L=16
per_level_scale:  1.3195079107728942
offset_:  5710032
total_hash_size:  11420064
Traceback (most recent call last):
  File "train.py", line 291, in <module>
    trainer = Trainer(
  File "E:\Users\wangc\miniconda3\envs\taichi-nerfs\lib\site-packages\pytorch_lightning\utilities\argparse.py", line 69, in insert_env_defaults
    return fn(self, **kwargs)
  File "E:\Users\wangc\miniconda3\envs\taichi-nerfs\lib\site-packages\pytorch_lightning\trainer\trainer.py", line 393, in __init__
    self._accelerator_connector = _AcceleratorConnector(
  File "E:\Users\wangc\miniconda3\envs\taichi-nerfs\lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py", line 140, in __init__
    self._check_config_and_set_final_flags(
  File "E:\Users\wangc\miniconda3\envs\taichi-nerfs\lib\site-packages\pytorch_lightning\trainer\connectors\accelerator_connector.py", line 206, in _check_config_and_set_final_flags
    raise ValueError(
ValueError: You selected an invalid strategy name: `strategy=None`. It must be either a string or an instance of `pytorch_lightning.strategies.Strategy`. Example choices: auto, ddp, ddp_spawn, deepspeed, ... Find a complete list of options in our documentation at https://lightning.ai

Fixed after commenting out the strategy=None argument, so the Trainer uses its default strategy (auto):

    trainer = Trainer(
        max_epochs=hparams.num_epochs,
        check_val_every_n_epoch=hparams.num_epochs,
        callbacks=callbacks,
        logger=None,
        enable_model_summary=False,
        accelerator='gpu',
        devices=1,
        # strategy=None,
        num_sanity_val_steps=0,
        precision=16,
    )
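
An alternative to commenting the line out, as a minimal sketch: collect the Trainer arguments in a dict and include strategy only when a value was actually chosen. The traceback above shows that the installed pytorch_lightning rejects strategy=None, so leaving the key out lets the Trainer use its own default. The helper name build_trainer_kwargs and its parameters are hypothetical and not part of train.py.

```python
from typing import Any, Dict, Optional


def build_trainer_kwargs(num_epochs: int,
                         strategy: Optional[str] = None) -> Dict[str, Any]:
    """Collect Trainer keyword arguments, omitting `strategy` when unset.

    Hypothetical helper for illustration; not part of train.py.
    """
    kwargs: Dict[str, Any] = {
        "max_epochs": num_epochs,
        "check_val_every_n_epoch": num_epochs,
        "logger": None,
        "enable_model_summary": False,
        "accelerator": "gpu",
        "devices": 1,
        "num_sanity_val_steps": 0,
        "precision": 16,
    }
    # Pass `strategy` only when one was selected, so the Trainer never
    # receives an explicit None and falls back to its own default.
    if strategy is not None:
        kwargs["strategy"] = strategy
    return kwargs


# With pytorch_lightning installed, the Trainer could then be built as:
#   trainer = Trainer(callbacks=callbacks,
#                     **build_trainer_kwargs(hparams.num_epochs))
print(build_trainer_kwargs(num_epochs=20))
```

Calling build_trainer_kwargs(20, strategy="ddp") would add the key back for a run that selects a strategy explicitly.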

Thanks to @Asxcvbn, with PR #32, the issue should now be resolved!